denotes the exponential polynomial. For the previous example, therefore,

[first-order kernel expression omitted in source]

with w₁(t, τ) = sin[ω(t − τ)] sin ωτ. The higher-order kernels can be computed in the same way after decomposing each rational power series into partial fractions.
Noncommutative Padé-Type Approximants

Assume that the functions f_{ik} : ℝⁿ → ℝ of Equation 55.48 are C^∞, with

f_{ik}(x¹, …, xⁿ) = Σ_{j₁,…,jₙ ≥ 0} a_{j₁,…,jₙ} (x¹)^{j₁} ⋯ (xⁿ)^{jₙ}.

Let ȳ denote an equilibrium point of the system (55.48) and let x_{j₁,…,jₙ}
55.3.4 Approximation Abilities of Volterra Series

Analysis of Responses of Systems
denote the monomial, or new state, (x¹)^{j₁} ⋯ (xⁿ)^{jₙ}, with
The rational power series g_{(p)} may be seen as a noncommutative Padé-type approximant for the forced differential system (55.48), which generalizes the notion of Padé-type approximant obtained in [4]. Using algebraic computing, these approximants may be derived explicitly [38]. These algebraic tools for the first time enable one to derive the Volterra kernels. Other techniques have recently been introduced in order to compute approximate solutions to the response of nonlinear control systems. The method in [34] is based on the combinatorial notion of L-species. Links between this combinatorial method and the Fliess algebraic setting have been studied in [33] and [35]. Another approach using automata representations [41] is proposed in [23]. The Volterra series (55.52) terminating with the term involving the pth kernel is called a Volterra series of length p. In the following we summarize some properties of an input-output map having a finite Volterra series. The main question is how to characterize a state-space representation (55.51) that admits a finite Volterra series representation [8]. This study led in particular to the introduction of a large class of approximating systems having a solvable, but not necessarily nilpotent, Lie algebra [9] (see also [24]).
j₁ + ⋯ + jₙ ≤ p.
Then the Brockett bilinear system
In this part we show how to compute the response of nonlinear systems to typical inputs. We assume here m = 1. This method, based on the use of the formal representation of the Volterra kernels (55.64), is also easily implementable on a computer using formal languages [38]. These algebraic tools allow us to derive exponential polynomial expressions, depending explicitly on time, for the truncated Volterra series associated with the response [31], [27], and therefore lead to a finer analysis than purely numerical results.
55.3. VOLTERRA AND FLIESS SERIES EXPANSIONS FOR NONLINEAR SYSTEMS

To continue our use of algebraic tools, let us introduce the Laplace-Borel transform associated with a given analytic function input
Consider two generating power series of the form (55.3.4),
and
Its Laplace-Borel transform is
For example, the Borel transform of
where p and q ∈ ℕ, the indices i₁, i₂, …, i_p ∈ {0, 1}, j₁, j₂, …, j_q ∈ {0, 1}, and a_i, b_j ∈ ℂ. The shuffle product of these expressions is given by induction on the length
is given by
Before turning to the algebraic computation of the first terms of the response to typical inputs, let us introduce a new operation on formal power series, the shuffle product. Given two formal power series,
The shuffle product of two formal power series g₁ and g₂ is given by

g₁ ⧢ g₂ = Σ_{w₁,w₂ ∈ Z*} (g₁, w₁)(g₂, w₂) w₁ ⧢ w₂,
where the shuffle product of two words is defined recursively by

1 ⧢ z = z ⧢ 1 = z,
zw ⧢ z′w′ = z[w ⧢ z′w′] + z′[zw ⧢ w′],  ∀w, w′ ∈ Z*, ∀z, z′ ∈ Z.
See [30] for case-study examples and some other rules for computing directly the stationary response to harmonic inputs or the response to a Dirac function, and see [14] for the algebraic computation of the response to white noise inputs. This effective computation of the rational power series g and of the response to typical inputs has been applied to the analysis of nonlinear electronic circuits [2] and to the study of semiconductor lasers [18].
Volterra series expansions, combined with some multiple integral identities [10], have been used in order to study control variations for the output of nonlinear systems. This analysis [32], [42], [43] demonstrates links between the classical Hamiltonian formalism and the Lie algebra associated with the nonlinear control problem. To be more precise, let us consider the control system
This operation consists of shuffling all the letters of the two words while keeping the order of the letters within each word. For instance, ab ⧢ c = abc + acb + cab.
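Since the shuffle is defined by a simple recursion, it is easy to experiment with on a computer. The sketch below (an illustration, not from the text) represents words as strings and a formal sum of words as a list of words with multiplicities kept:

```python
def shuffle(w1: str, w2: str) -> list[str]:
    """Shuffle product of two words, following the recursion
    zw sh z'w' = z[w sh z'w'] + z'[zw sh w'], with 1 sh w = w sh 1 = w.
    A formal sum of words is represented as a list (multiplicities kept)."""
    if not w1:
        return [w2]
    if not w2:
        return [w1]
    return ([w1[0] + u for u in shuffle(w1[1:], w2)] +
            [w2[0] + u for u in shuffle(w1, w2[1:])])

# Shuffling "ab" with "c" keeps a before b in every term:
print(shuffle("ab", "c"))   # ['abc', 'acb', 'cab']
```

Counted with multiplicity, the number of terms in w₁ ⧢ w₂ is the binomial coefficient C(|w₁| + |w₂|, |w₁|).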
It has been shown that the Laplace-Borel transform of Equation 55.62, for a given input u(t) with Laplace-Borel transform g_u, is obtained by substituting, from the right, each variable z₁ by the operator z₀[g_u ⧢ ·]. Therefore, in order to apply this result, we need to know how to compute the shuffle product of algebraic expressions of the form
where f : ℝⁿ × ℝᵐ → ℝⁿ is a smooth mapping. Let h : ℝⁿ → ℝ be a smooth function, let y(t, x₀) = e^{tL_f} x₀
be the free response of the system and let x(t, x₀, u) be the solution relative to the control u, u being an integrable function taking values in some given bounded open set U ⊂ ℝᵐ. An example of a control problem is the following: find necessary conditions such that h(y(T)) = min h(x(T, x₀, u)). Let w₀ : [0, T] × ℝⁿ → ℝ be defined as follows,
where i₁, i₂, …, iₙ ∈ {0, 1}. This computation is very simple, as it amounts to adding some singularities. For instance,
It is easy to see that the map λ : [0, T] → (ℝⁿ)* given by
(55.66)
THE CONTROL HANDBOOK
55.3.5 Other Approximations: Application to Motion Planning
is the solution of the adjoint equation

−λ̇(t) = λ(t) (∂f/∂x)(y(t), 0)
Let us consider a control system
and λ(T) = dh(y(T)). A first-order necessary condition is provided by the application of the Maximum Principle: if y(t) satisfies (55.66) for t ∈ [0, T], then
ad_{f₀} f_i w₀(t, y(t)) = 0 for t ∈ [0, T]

(55.67)
and the matrix
is a non-negative matrix for t ∈ [0, T], where
and
The reference trajectory y is said to be extremal if it satisfies Equation 55.67, and singular if it is extremal and all the terms in the matrix (55.68) vanish. If y is singular and satisfies Equation 55.66, then it can be shown (see for instance [32]) that if there exists s ≥ 1 such that, for t ∈ [0, T] and i, j = 1, …, m,
[ad^k_{f₀} f_j, ad_{f₀} f_i] w₀(t, y(t)) = 0 for k = 0, 1, …, s − 1,
then
[ad^{k₁}_{f₀} f_j, ad^{k₂}_{f₀} f_i] w₀(t, y(t)) = 0 for k₁, k₂ ≥ 0 with k₁ + k₂ = 0, …, 2s
and the matrix
is a symmetric non-negative matrix for t ∈ [0, T]. As a dual problem, sufficient conditions for local controllability have been derived using Volterra series expansions (see for instance [3]).
Search for Limit Cycles and Bifurcation Analysis

The Hopf bifurcation theorem deals with the appearance and growth of a limit cycle as a parameter is varied in a nonlinear system. Several authors have given rigorous proofs using various mathematical tools: series expansions, the center manifold theorem, harmonic balance, Floquet theory, or Lyapunov methods. Using Volterra series, [47] provided a conceptual simplification of the Hopf proof. In many problems the calculations involved are simplified, leading to practical advantages as well as theoretical ones.
The exact motion planning problem is the following: given two state vectors p and q, find an input function u(t) = (u₁(t), u₂(t), …, u_m(t)) that drives the state vector exactly from p to q. In order to solve this problem, several expansions for the solutions, intimately linked with the Fliess series, are used. When the vector fields f_i, i = 1, …, m, are real analytic and complete, and such that the generated control Lie algebra is everywhere of full rank and nilpotent, the authors of [46], [26] described a complete solution of the previous control problem using a P. Hall basis. Let A(z₁, z₂, …, z_m) denote the algebra of noncommutative polynomials in z₁, z₂, …, z_m and let L(z₁, z₂, …, z_m) denote the Lie subalgebra of A(z₁, z₂, …, z_m) generated by z₁, z₂, …, z_m with the Lie bracket defined by [P, Q] = PQ − QP. The elements of L(z₁, z₂, …, z_m) are known as Lie polynomials in z₁, z₂, …, z_m. Let F_m be the set of formal Lie monomials in z₁, z₂, …, z_m. A P. Hall basis of L(z₁, z₂, …, z_m) is a totally ordered subset (B, <) of F_m such that: the z_i belong to B; if A, B ∈ B and degree(A) < degree(B), then A < B; and if P ∈ F_m and P is not one of the z_i, then P ∈ B if and only if P = [A, B] with A, B ∈ B, A
< B, and either B is one of the z_j or B = [C, D] with C ≤ A. The solution of the motion planning problem can then be written as a product of exponentials of the basis elements.
For example, in the case k = 3, m = 2, we may choose z₃ = [z₁, z₂], z₄ = [z₁, [z₁, z₂]], and z₅ = [z₂, [z₁, z₂]]. For this
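As an illustration (not from the text), such Lie monomials can be encoded on a computer as nested pairs with generators as leaves; the degree of a monomial is then the number of leaves:

```python
# Generators encoded as ints, brackets [A, B] as pairs (A, B).
z1, z2 = 1, 2
z3 = (z1, z2)        # [z1, z2]
z4 = (z1, (z1, z2))  # [z1, [z1, z2]]
z5 = (z2, (z1, z2))  # [z2, [z1, z2]]

def degree(p) -> int:
    """Number of generator occurrences in a Lie monomial."""
    if isinstance(p, int):
        return 1
    a, b = p
    return degree(a) + degree(b)

print([degree(z) for z in (z1, z2, z3, z4, z5)])  # [1, 1, 2, 3, 3]
```

This matches the choice above: the basis elements of degree at most k = 3 for m = 2 generators.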
choice, the functions h_j, j = 1, …, r are computed by solving
with h_j(0) = 0, j = 1, …, 5. In order to take into account systems with drift, for the system (55.69) where u₁(t) may be identically equal to 1, a different basis, called the Lyndon basis, has been used in [21]. Without going into the details, for the previous case, it is shown that the solution is given by the following exponential product expansion
with here z₃ = [z₂, z₁], z₄ = [z₂, [z₂, z₁]], z₅ = [z₁, [z₂, z₁]], f_j = E_f(z_j) for j = 3, …, 5, and
References

[1] d'Alessandro, P., Isidori, A., and Ruberti, A., Realizations and structure theory of bilinear dynamical systems, SIAM J. Control, 12, 517-535, 1974.
[2] Baccar, S., Lamnabhi-Lagarrigue, F., and Salembier, G., Utilisation du calcul formel pour la modélisation et la simulation des circuits électroniques faiblement non linéaires, Annales des Télécommunications, 46, 282-288, 1991.
[3] Bianchini, R.M. and Stefani, G., A high order maximum principle and controllability, 1991.
[4] Brezinski, C., Padé-type approximants and general orthogonal polynomials, ISNM, 50, Birkhäuser, 1980.
[5] Brockett, R.W., Nonlinear systems and differential geometry, Proceedings IEEE, 64, 61-72, 1976.
[6] Brockett, R.W., Volterra series and geometric control theory, Automatica, 12, 167-176, 1976; Brockett, R.W. and Gilbert, E.G., An addendum to Volterra series and geometric control theory, Automatica, 12, 635, 1976.
[7] Chen, K.T., Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula, Ann. Math., 65, 163-178, 1957.
[8] Crouch, P.E., Dynamical realizations of finite Volterra series, SIAM J. Control and Optimization, 19, 177-202, 1981.
[9] Crouch, P.E., Solvable approximations to control systems, SIAM J. Control and Optimization, 22(1), 40-54, 1984.
[10] Crouch, P.E. and Lamnabhi-Lagarrigue, F., Algebraic and multiple integral identities, Acta Applicandae Mathematicae, 15, 235-274, 1989.
[11] Feynman, R.P., An operator calculus having applications in quantum electrodynamics, Physical Review, 84, 108-128, 1951.
[12] Fliess, M., Fonctionnelles causales non linéaires et indéterminées non commutatives, Bull. Soc. Math. France, 109, 3-40, 1981.
[13] Fliess, M., Lamnabhi, M., and Lamnabhi-Lagarrigue, F., An algebraic approach to nonlinear functional expansions, IEEE CS, 30, 550-570, 1983.
[14] Fliess, M. and Lamnabhi-Lagarrigue, F., Application of a new functional expansion to the cubic anharmonic oscillator, J. Math. Physics, 23, 495-502, 1982.
[15] Gilbert, E.G., Functional expansions for the response of nonlinear differential systems, IEEE AC, 22, 909-921, 1977.
[16] Gröbner, W., Die Lie-Reihen und ihre Anwendungen (2nd edition), VEB Deutscher Verlag der Wissenschaften, Berlin, 1967.
[17] Grunenfelder, L., Algebraic aspects of control systems and realizations, J. Algebra, 165, 446-464, 1994.
[18] Hassine, L., Toffano, Z., Lamnabhi-Lagarrigue, F., Destrez, A., and Birocheau, C., Volterra functional series expansions for semiconductor lasers under modulation, IEEE J. Quantum Electron., 30, 918-928, 1994; Hassine, L., Toffano, Z., Lamnabhi-Lagarrigue, F., Joindot, I., and Destrez, A., Volterra functional series for noise in semiconductor lasers, IEEE J. Quantum Electron., 30, 2534-2546, 1994.
[19] Hoang Ngoc Minh, V. and Jacob, G., Evaluation transform and its implementation in MACSYMA, in New Trends in Systems Theory, G. Conte, A.M. Perdon, and B. Wyman, Eds., Progress in Systems and Control Theory, Birkhäuser, Boston, 1991.
[20] Jacob, G., Algebraic methods and computer algebra for nonlinear systems' study, Proc. IMACS Symposium MCTS, 1991.
[21] Jacob, G., Motion planning by piecewise constant or polynomial inputs, Proc. NOLCOS'92, 1992.
[22] Jacob, G. and Lamnabhi-Lagarrigue, F., Algebraic computing in control, Lect. Notes Contr. Inform. Sc., 165, Springer-Verlag, 1991.
[23] Hespel, C.
and Jacob, G., Truncated bilinear approximants: Carleman, finite Volterra, Padé-type, geometric and structural automata, in Algebraic Computing in Control, Lect. Notes Contr. Inform. Sc., G. Jacob and F. Lamnabhi-Lagarrigue, Eds., Springer-Verlag, 165, 1991.
[24] Jakubczyk, B. and Kaskosz, B., Realizability of Volterra series with constant kernels, Nonlinear Analysis, Theory, Methods and Applications, 5, 167-183, 1981.
[25] Krener, A.J., Local approximations of control systems, J. Differential Equations, 19, 125-133, 1975.
[26] Lafferriere, G. and Sussmann, H.J., Motion planning for controllable systems without drift: a preliminary report, Report SYCON-90-04, Rutgers Center for Systems and Control, June 1990.
[27] Lamnabhi, M., A new symbolic calculus for the response of nonlinear systems, Systems and Control Letters, 2, 156-162, 1982.
[28] Lamnabhi, M., Functional analysis of nonlinear circuits: a generating power series approach, IEE Proceedings, 133, 375-384, 1986.
[29] Lamnabhi-Lagarrigue, F., Séries de Volterra et commande optimale singulière, Thèse d'état, Université Paris-Sud, Orsay, 1985.
[30] Lamnabhi-Lagarrigue, F., Analyse des systèmes non linéaires, Ed. Hermès, 1994.
[31] Lamnabhi-Lagarrigue, F. and Lamnabhi, M., Détermination algébrique des noyaux de Volterra associés à certains systèmes non linéaires, Ricerche di Automatica, 10, 17-26, 1979.
[32] Lamnabhi-Lagarrigue, F. and Stefani, G., Singular optimal control problems: on the necessary conditions for optimality, SIAM J. Control Optim., 28, 823-840, 1990.
[33] Lamnabhi-Lagarrigue, F., Leroux, P., and Viennot, X.G., Combinatorial approximations of Volterra series by bilinear systems, in Analyse des Systèmes Contrôlés, B. Bonnard, B. Bride, J.P. Gauthier, and I. Kupka, Eds., Progress in Systems and Control Theory, Birkhäuser, 1991, 304-315.
[34] Leroux, P. and Viennot, X.G., A combinatorial approach to nonlinear functional expansions, Theoretical Computer Science, 79, 179-183, 1991.
[35] Leroux, P., Martin, A., and Viennot, X.G., Computing iterated derivatives along trajectories of nonlinear systems, NOLCOS'92, M. Fliess, Ed., Bordeaux, 1992.
[36] Lesiak, C. and Krener, A.J., The existence and uniqueness of Volterra series for nonlinear systems, IEEE AC, 23, 1090-1095, 1978.
[37] Magnus, W., On the exponential solution of differential equations for a linear operator, Comm. Pure Appl. Math., 7, 649-673, 1954.
[38] Martin, A., Calcul d'approximations de la solution d'un système non linéaire utilisant le logiciel SCRATCHPAD, in Algebraic Computing in Control, Lect. Notes Contr. Inform. Sc., G. Jacob and F. Lamnabhi-Lagarrigue, Eds., Springer-Verlag, 165, 1991.
[39] Rugh, W.J., Nonlinear System Theory, The Johns Hopkins University Press, Baltimore, MD, 1981.
[40] Schetzen, M., The Volterra and Wiener Theories of Nonlinear Systems, John Wiley & Sons, New York, 1980.
[41] Schützenberger, M.P., On the definition of a family of automata, Information and Control, 4, 245-270, 1961.
[42] Stefani, G., Volterra approximations, NOLCOS'90, Nantes, 1990.
[43] Stefani, G. and Zezza, P., A new type of sufficient optimality conditions for a nonlinear constrained optimal control problem, NOLCOS'92, Bordeaux, 1992.
[44] Sontag, E.D., Bilinear realizability is equivalent to existence of a singular affine differential input/output equation, Syst. Control Lett., 11, 181-187, 1988.
[45] Sussmann, H.J., Lie brackets and local controllability: a sufficient condition for scalar-input systems, SIAM J. Control Optimiz., 21, 686-713, 1983.
[46] Sussmann, H.J., Local controllability and motion planning for some classes of systems with drift, Proc. 30th IEEE CDC, 1110-1113, 1991.
[47] Tang, Y.-S., Mees, A.I., and Chua, L.O., Hopf bifurcation via Volterra series, IEEE AC, 28(1), 42-53, 1983.
[48] Volterra, V., Theory of Functionals (translated from Spanish), Blackie, London, 1930 (reprinted by Dover, New York, 1959).
[49] Wang, Y. and Sontag, E.D., Generating series and nonlinear systems: Analytic aspects, local realizability, and input-output representations, Forum Math., 4, 299-322, 1992; Algebraic differential equations and rational control systems, SIAM J. Control Optimiz.
[50] Wei, J. and Norman, E., On global representations of the solutions of linear differential equations as a product of exponentials, Proc. Am. Math. Soc., 15, 327-334, 1964.
[51] Wiener, N., Nonlinear Problems in Random Theory, John Wiley & Sons, New York, 1958.
Stability

Hassan K. Khalil
Michigan State University
56.1 Lyapunov Stability ........................................... 889
A.R. Teel
Introduction • Stability of Equilibrium Points • Examples and Applications • Defining Terms
Department of Electrical Engineering, University of Minnesota
T.T. Georgiou
Department of Electrical Engineering, University of Minnesota
L. Praly
Centre Automatique et Systèmes, École des Mines de Paris
E. Sontag Department of Mathematics, Rutgers University
Reference ........................................................ 894
Further Reading .................................................. 894
56.2 Input-Output Stability ...................................... 895
Introduction • Systems and Stability • Practical Conditions and Examples
Defining Terms ................................................... 906
References ....................................................... 907
Further Reading .................................................. 907
56.1 Lyapunov Stability

Hassan K. Khalil, Michigan State University
56.1.1 Introduction

Stability theory plays a central role in systems theory and engineering. Generally speaking, stability theory deals with the behavior of a system over a long time period. There are several ways to characterize stability. For example, we may characterize stability from an input-output viewpoint by requiring the output of a system to be "well behaved" in some sense whenever the input is well behaved. Alternatively, we may characterize stability by studying the asymptotic behavior of the state of the system near steady-state solutions, like equilibrium points or periodic orbits. In this chapter we deal with stability of equilibrium points, usually characterized in the sense of Lyapunov. An equilibrium point is stable if all solutions starting at nearby points stay nearby; otherwise it is unstable. It is asymptotically stable if all solutions starting at nearby points not only stay nearby, but also tend to the equilibrium point as time approaches infinity. These notions are made precise later on. We introduce Lyapunov's method for determining the stability of equilibrium points. Lyapunov laid the foundation of this method over a century ago, but of course the method as we use it today is the result of intensive research efforts by many engineers and applied mathematicians. There are attractive features of Lyapunov's method, which include a solid theoretical foundation, the ability to conclude stability without knowledge of the solution (no extensive simulation effort), and an analytical framework that makes it possible to study the effect of model perturbations and to design feedback control. Its main drawback is the need to search for an auxiliary function that satisfies certain conditions. The existence of such a function cannot necessarily be asserted from a particular given model.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
56.1.2 Stability of Equilibrium Points

We consider a nonlinear system represented by the state model

ẋ = f(x)
(56.1)
where the components of the n-dimensional vector f(x) are locally Lipschitz functions of x, defined for all x in a domain D ⊂ ℝⁿ. A function f(x) is locally Lipschitz at a point x₀ if it satisfies the Lipschitz condition ‖f(x) − f(y)‖ ≤ L‖x − y‖ for all x, y in some neighborhood of x₀, where L is a positive constant and ‖·‖ denotes the Euclidean norm of an n-dimensional vector; that is, ‖x‖ = √(x₁² + x₂² + ⋯ + xₙ²). The Lipschitz condition guarantees that Equation 56.1 has a unique solution that satisfies the initial condition x(0) = x₀. Suppose x̄ ∈ D is an equilibrium point of Equation 56.1; that is, f(x̄) = 0. Whenever the state of the system starts at x̄, it will remain at x̄ for all future time. Our goal is to characterize and study the stability of x̄. For convenience, we take x̄ = 0. There is no loss of generality in doing this because any equilibrium point x̄ can be shifted to
the origin via the change of variables y = x − x̄. Therefore, we shall always assume that f(0) = 0, and study the stability of the origin x = 0. The equilibrium point x = 0 of Equation 56.1 is stable if, for each ε > 0, there is δ = δ(ε) > 0 such that ‖x(0)‖ < δ implies that ‖x(t)‖ < ε for all t ≥ 0. It is said to be asymptotically stable if it is stable and δ can be chosen such that ‖x(0)‖ < δ implies that x(t) approaches the origin as t tends to infinity. When the origin is asymptotically stable, the region of attraction (also called region of asymptotic stability, domain of attraction, or basin) is defined as the set of all points x such that the solution of Equation 56.1 that starts from x at time t = 0 approaches the origin as t tends to ∞. When the region of attraction is the whole space ℝⁿ, we say that the origin is globally asymptotically stable. The ε–δ requirement for stability takes a challenge-answer form. To demonstrate that the origin is stable, then, for any value of ε that a challenger may care to designate, we must produce a value of δ, possibly dependent on ε, such that a trajectory starting in a δ neighborhood of the origin will never leave the ε neighborhood.

Lyapunov Method

In 1892, Lyapunov introduced a method to determine the stability properties of an equilibrium point without solving the state equation. Let V(x) be a continuously differentiable scalar function defined in a domain D ⊂ ℝⁿ that contains the origin. A function V(x) is said to be positive definite if V(0) = 0 and V(x) > 0 for x ≠ 0. It is said to be positive semidefinite if V(x) ≥ 0 for all x. A function V(x) is said to be negative definite or negative semidefinite if −V(x) is positive definite or positive semidefinite, respectively. The derivative of V along the trajectories of Equation 56.1 is given by V̇(x) = Σᵢ₌₁ⁿ (∂V/∂xᵢ) fᵢ(x) = (∂V/∂x) f(x), where ∂V/∂x is a row vector whose ith component is ∂V/∂xᵢ.
Lyapunov's stability theorem states that the origin is stable if there is a continuously differentiable positive definite function V(x) such that V̇(x) is negative semidefinite, and it is asymptotically stable if V̇(x) is negative definite. A function V(x) satisfying the conditions for stability is called a Lyapunov function. The surface V(x) = c, for some c > 0, is called a Lyapunov surface or a level surface. Using Lyapunov surfaces, Figure 56.1 makes the theorem intuitively clear. It shows Lyapunov surfaces for decreasing constants c₃ > c₂ > c₁ > 0. The condition V̇ ≤ 0 implies that when a trajectory crosses a Lyapunov surface V(x) = c, it moves inside the set Ω_c = {V(x) ≤ c} and can never come out again, since V̇ ≤ 0 on the boundary V(x) = c. When V̇ < 0, the trajectory moves from one Lyapunov surface to an inner Lyapunov surface with a smaller c. As c decreases, the Lyapunov surface V(x) = c shrinks to the origin, showing that the trajectory approaches the origin as time progresses. If we know only that V̇ ≤ 0, we cannot be sure that the trajectory will approach the origin, but we can conclude that the origin is stable, since the trajectory can be contained inside any ε neighborhood of the origin by requiring the initial state x(0) to lie inside a Lyapunov surface contained in
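Before attempting a proof, the two sign conditions can be probed numerically. The sketch below (an illustrative system and candidate, not from the text) samples points and checks the sign of V and of V̇ = (∂V/∂x) f; such sampling can only falsify the conditions, never establish them:

```python
import numpy as np

def vdot(f, grad_v, x):
    """Derivative of V along trajectories of x' = f(x): V'(x) = (dV/dx) f(x)."""
    return grad_v(x) @ f(x)

# Illustrative system and Lyapunov candidate (assumed example):
def f(x):
    return np.array([-x[0] + x[1], -x[0] - x[1]])

def v(x):
    return x @ x          # V(x) = x1^2 + x2^2

def grad_v(x):
    return 2.0 * x

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(1000, 2))
pts = pts[np.linalg.norm(pts, axis=1) > 1e-6]   # exclude the origin

# A sampling check can only falsify the Lyapunov conditions, never prove them.
assert all(v(x) > 0 for x in pts)                 # V positive definite away from 0
assert all(vdot(f, grad_v, x) < 0 for x in pts)   # V' negative definite
print("no violation found on", len(pts), "samples")
```

For this example the check always passes, since V̇(x) = −2(x₁² + x₂²) analytically.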
Figure 56.1
Level surfaces of a Lyapunov function.
that neighborhood. When V̇(x) is only negative semidefinite, we may still be able to conclude asymptotic stability of the origin if we can show that no solution can stay forever in the set {V̇(x) = 0}, other than the trivial solution x(t) ≡ 0. Under this condition, V(x(t)) must decrease toward 0, and consequently x(t) → 0 as t → ∞. This extension of the basic theorem is usually referred to as the
invariance principle. Lyapunov functions can be used to estimate the region of attraction of an asymptotically stable origin, that is, to find sets contained in the region of attraction. If there is a Lyapunov function that satisfies the conditions of asymptotic stability over a domain D, and if Ω_c = {V(x) ≤ c} is bounded and contained in D, then every trajectory starting in Ω_c remains in Ω_c and approaches the origin as t → ∞. Thus, Ω_c is an estimate of the region of attraction. If the Lyapunov function V(x) is radially unbounded (that is, ‖x‖ → ∞ implies that V(x) → ∞), then any point x ∈ ℝⁿ can be included in the interior of a bounded set Ω_c. Therefore, the origin is globally asymptotically stable if there is a continuously differentiable, radially unbounded function V(x) such that for all x ∈ ℝⁿ, V(x) is positive definite and V̇(x) is either negative definite or negative semidefinite, but no solution can stay forever in the set {V̇(x) = 0} other than the trivial solution
x(t) ≡ 0. Lyapunov's method is a very powerful tool for studying the stability of equilibrium points. However, there are two drawbacks of the method that we should be aware of. First, there is no systematic method for finding a Lyapunov function for a given system. Second, the conditions of the method are only sufficient; they are not necessary. Failure of a Lyapunov function candidate to satisfy the conditions for stability or asymptotic stability does not mean that the origin is not stable or asymptotically stable.
Linear Systems The linear time-invariant system
has an equilibrium point at the origin. The equilibrium point is isolated if and only if det(A) ≠ 0. Stability properties of the origin can be characterized by the locations of the eigenvalues of the matrix A. Recall from linear system theory that the solution
of Equation 56.2 for a given initial state x(0) is given by x(t) = exp(At)x(0), and that for any matrix A there is a nonsingular matrix P (possibly complex) that transforms A into its Jordan form; that is,
P⁻¹AP = J = block diag[J₁, J₂, …, J_r]

where Jᵢ is the Jordan block associated with the eigenvalue λᵢ of A. Therefore,
Equation 56.4 is a linear algebraic equation that can be solved by rearranging it in the form Mx = y, where x and y are defined by stacking the elements of P and Q in vectors. Almost all commercial software programs for control systems include commands for solving the Lyapunov equation.
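The rearrangement into Mx = y can be made concrete with Kronecker products: stacking the columns of P turns AᵀP + PA = −Q into (I ⊗ Aᵀ + Aᵀ ⊗ I) vec(P) = −vec(Q). A minimal sketch (the matrix A below is an arbitrary Hurwitz example, not from the text):

```python
import numpy as np

def solve_lyapunov(a: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Solve A^T P + P A = -Q by vectorization: with vec() denoting column
    stacking, the equation becomes (I kron A^T + A^T kron I) vec(P) = -vec(Q)."""
    n = a.shape[0]
    eye = np.eye(n)
    m = np.kron(eye, a.T) + np.kron(a.T, eye)
    vec_p = np.linalg.solve(m, -q.flatten(order="F"))
    return vec_p.reshape((n, n), order="F")

a = np.array([[0.0, 1.0], [-2.0, -3.0]])   # Hurwitz: eigenvalues -1 and -2
p = solve_lyapunov(a, np.eye(2))
print(np.allclose(a.T @ p + p @ a, -np.eye(2)))   # True
print(np.all(np.linalg.eigvalsh(p) > 0))          # True: P is positive definite
```

Since A here is Hurwitz and Q = I is positive definite, the computed P comes out symmetric positive definite, as the theory predicts.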
Linearization

Consider the nonlinear system (Equation 56.1) and suppose that f(x) is continuously differentiable for all x ∈ D ⊂ ℝⁿ. The Jacobian matrix ∂f/∂x is an n × n matrix whose (i, j) element is ∂fᵢ/∂xⱼ. Let A be the Jacobian matrix evaluated at the origin x = 0. By the mean value theorem, it can be shown that f(x) = Ax + g(x), where ‖g(x)‖/‖x‖ → 0 as ‖x‖ → 0. Suppose A is Hurwitz, and let P be the solution of the Lyapunov Equation 56.4 for some positive definite Q. Taking V(x) = xᵀPx, it can be shown that V̇(x) is negative definite in a neighborhood of the origin. Hence, the origin is asymptotically stable if all the eigenvalues of A have negative real parts. Using some advanced results of Lyapunov stability, it can also be shown that the origin is unstable if one (or more) of the eigenvalues of A has a positive real part. This provides us with a simple procedure for determining stability of the origin of a nonlinear system by calculating the eigenvalues of its linearization about the origin. Note, however, that linearization fails when Reλᵢ ≤ 0 for all i, with Reλᵢ = 0 for some i.
(56.3)

where mᵢ is the order of the Jordan block associated with the eigenvalue λᵢ. If one of the eigenvalues of A is in the open right complex half-plane, the corresponding exponential term exp(λᵢt) in Equation 56.3 will grow unbounded as t → ∞. Therefore, to guarantee stability we must restrict the eigenvalues to lie in the closed left complex half-plane. But those eigenvalues on the imaginary axis (if any) could give rise to unbounded terms if the order of the associated Jordan block is higher than one, due to the term t^{k−1} in Equation 56.3. Therefore, we must restrict eigenvalues on the imaginary axis to have Jordan blocks of order one. For asymptotic stability of the origin, exp(At) must approach 0 as t → ∞. From Equation 56.3, this is the case if and only if Reλᵢ < 0, ∀i. When all eigenvalues of A satisfy Reλᵢ < 0, A is called a Hurwitz matrix or a stability matrix. The origin of Equation 56.2 is asymptotically stable if and only if A is Hurwitz. Asymptotic stability of the origin can also be investigated using Lyapunov's method. Consider a quadratic Lyapunov function candidate V(x) = xᵀPx, where P is a real symmetric positive definite matrix. The derivative of V along the trajectories of the linear system of Equation 56.2 is given by
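The eigenvalue test is a one-liner with a numerical linear algebra library; a small sketch (the example matrices are illustrative, not from the text):

```python
import numpy as np

def is_hurwitz(a: np.ndarray) -> bool:
    """A is Hurwitz iff every eigenvalue has negative real part; the origin of
    x' = Ax is then asymptotically stable."""
    return bool(np.all(np.linalg.eigvals(a).real < 0))

print(is_hurwitz(np.array([[0.0, 1.0], [-2.0, -3.0]])))  # True (eigs -1, -2)
# Double eigenvalue at 0 with a 2x2 Jordan block: exp(At) grows like t.
print(is_hurwitz(np.array([[0.0, 1.0], [0.0, 0.0]])))    # False
```

The second matrix illustrates the Jordan-block caveat above: both eigenvalues sit on the imaginary axis but the block has order two, so the origin is not stable.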
Nonautonomous Systems

Equation 56.1 is called autonomous because f does not depend on t. The more general system
ẋ = f(t, x)
(56.5)
is called nonautonomous. Lyapunov's method applies to nonautonomous systems. In this case, we may allow the Lyapunov function candidate V to depend on t. Let V(t, x) be a continuously differentiable function defined for all t ≥ 0 and all x ∈ D. The derivative of V along the trajectories of Equation 56.5 is given by V̇(t, x) = ∂V/∂t + (∂V/∂x) f(t, x). If there are positive definite functions W₁(x), W₂(x), and W₃(x) such that
where Q is a symmetric matrix defined by
If Q is positive definite, we can conclude that the origin is asymptotically stable; that is, Reλᵢ < 0 for all eigenvalues of A. Suppose we start by choosing Q as a real symmetric positive definite matrix, and then solve Equation 56.4 for P. If Equation 56.4 has a positive definite solution, then again we can conclude that the origin is asymptotically stable. Equation 56.4 is called the Lyapunov equation. It turns out that A is Hurwitz if and only if for any given positive definite symmetric matrix Q there exists a positive definite symmetric matrix P that satisfies the Lyapunov Equation 56.4. Moreover, if A is Hurwitz, then P is the unique solution of Equation 56.4.
for all t ≥ 0 and all x ∈ D, then the origin is uniformly asymptotically stable, where "uniformly" indicates that the ε–δ definition of stability and the convergence of x(t) to zero are independent of the initial time t₀. Such a uniformity annotation is not needed with autonomous systems, since the solution of an autonomous state equation starting at time t₀ depends only on the difference t − t₀, which is not the case for nonautonomous systems. If inequalities (56.6) and (56.7) hold globally and W₁(x) is radially unbounded, then the origin is globally uniformly asymptotically stable.
THE CONTROL HANDBOOK
56.1.3 Examples and Applications
definite, because V̇(x) = 0 for x2 = 0 irrespective of the value of x1. Therefore, we cannot conclude asymptotic stability using Lyapunov's stability theorem. Here comes the role of the invariance principle. Consider the set {V̇(x) = 0} = {x2 = 0}. Suppose that a solution of the state equation stays forever in this set. Then
Examples
EXAMPLE 56.1: A simple pendulum moving in a vertical plane can be modeled by the second-order differential equation

ml θ̈ = −mg sin θ − kl θ̇
(56.8)
where l is the length of the rod, m is the mass of the bob, θ is the angle subtended by the rod and the vertical line through the pivot point, g is the acceleration due to gravity, and k is a coefficient of friction. Taking x1 = θ and x2 = θ̇ as the state variables, we obtain the state equation
where a = g/l > 0 and b = k/m ≥ 0. The case b = 0 is an idealized frictionless pendulum. To find the equilibrium points, we set ẋ1 = ẋ2 = 0 and solve for x1 and x2. The first equation gives x2 = 0 and the second one gives sin x1 = 0. Thus, the equilibrium points are located at (nπ, 0), for n = 0, ±1, ±2, .... From the physical description of the pendulum it is clear that the pendulum has only two equilibrium positions, corresponding to the equilibrium points (0, 0) and (π, 0). Other equilibrium points are repetitions of these two positions, which correspond to the number of full swings the pendulum would make before it rests at one of the two equilibrium positions. Let us use Lyapunov's method to study the stability of the equilibrium point at the origin. As a Lyapunov function candidate, we use the energy of the pendulum, which is defined as the sum of its potential and kinetic energies, namely, V(x) =
∫₀^{x1} a sin y dy + (1/2) x2²
Hence, on the segment −π < x1 < π of the x2 = 0 line, the system can maintain the V̇(x) = 0 condition only at the origin x = 0. Noting that the solution is confined to a set {V(x) ≤ c} and, for sufficiently small c, {V(x) ≤ c} ⊂ {−π < x1 < π}, we conclude that no solution can stay forever in the set {V(x) ≤ c} ∩ {x2 = 0} other than the trivial solution x(t) = 0. Hence, the origin is asymptotically stable. We can also estimate the region of attraction by the set {V(x) ≤ c}, where c < min_{|x1|=π} V(x) = 2a is chosen such that V(x) = c is a closed contour contained in the strip {−π < x1 < π}. The pendulum equation has two equilibrium points, at (0, 0) and (π, 0). Let us investigate stability of both points using linearization. The Jacobian matrix is given by
To determine stability of the origin we evaluate the Jacobian matrix at x = 0 .
A = [ 0  1 ; −a  −b ]
The eigenvalues of A are λ1,2 = (−b ± √(b² − 4a))/2. For all positive values of a and b, the eigenvalues satisfy Re λi < 0. Hence, the equilibrium point at the origin is asymptotically stable. In the absence of friction (k = 0), both eigenvalues are on the imaginary axis. In this case we cannot determine stability of the origin through linearization. We have seen before that in this case the origin is a stable equilibrium point, as determined by an energy Lyapunov function. To determine stability of the equilibrium point at (π, 0), we evaluate the Jacobian matrix at this point. This is equivalent to performing the change of variables z1 = x1 − π, z2 = x2 to shift the equilibrium to the origin, and then evaluating the Jacobian [∂f/∂z] at z = 0.
The reference of the potential energy is chosen such that V(0) = 0. The function V(x) is positive definite over the domain −2π < x1 < 2π. The derivative of V(x) along the trajectories of the system is given by
When friction is neglected (b = 0), V̇(x) = 0 and we can conclude that the origin is stable. Moreover, V(x) is constant during the motion of the system. Since V(x) = c forms a closed contour around x = 0 for small c > 0, we see that the trajectory will be confined to one such contour and will not approach the origin. Hence, the origin is not asymptotically stable. On the other hand, in the case with friction (b > 0), V̇(x) = −b x2² ≤ 0 is negative semidefinite and we can conclude that the origin is stable. Notice that V̇(x) is only negative semidefinite and not negative
The eigenvalues of Ã are λ1,2 = (−b ± √(b² + 4a))/2. For all a > 0 and b ≥ 0, there is one eigenvalue in the open right half-plane. Hence, the equilibrium point at (π, 0) is unstable.
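These two linearizations are easy to check numerically. The sketch below assumes NumPy and the illustrative parameter values a = 1, b = 0.5 (the text leaves a and b symbolic).

```python
import numpy as np

# Illustrative parameter values: a = g/l = 1, b = k/m = 0.5.
a, b = 1.0, 0.5

# Jacobian of x1' = x2, x2' = -a sin(x1) - b x2:
#   df/dx = [[0, 1], [-a cos(x1), -b]].
def jacobian(x1):
    return np.array([[0.0, 1.0], [-a * np.cos(x1), -b]])

# At the origin: lambda_{1,2} = (-b +/- sqrt(b^2 - 4a))/2, both with Re < 0.
eig_origin = np.linalg.eigvals(jacobian(0.0))
assert max(ev.real for ev in eig_origin) < 0     # asymptotically stable

# At (pi, 0): lambda_{1,2} = (-b +/- sqrt(b^2 + 4a))/2, one root positive.
eig_inverted = np.linalg.eigvals(jacobian(np.pi))
assert max(ev.real for ev in eig_inverted) > 0   # unstable
```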
56.1. LYAPUNOV STABILITY
EXAMPLE 56.2: Consider the system
The right-hand side of V̇(x) is written as the sum of two terms. The first term, −||x||², is the contribution of the linear part Ax, while the second term is the contribution of the nonlinear term g(x) = f(x) − Ax. Using the inequalities |x1 − 2x2| ≤ √5 ||x|| and |x1 x2| ≤ ||x||²/2, we see that V̇(x) satisfies the inequality V̇(x) ≤ −||x||² + (√5/2)||x||⁴. Hence V̇(x) is negative definite in the region {||x|| < r}, where r² = 2/√5. We would like to choose a positive constant c such that {V(x) ≤ c} is a subset of this region. Since x^T P x ≥ λmin(P)||x||², we can choose c < λmin(P) r². Using λmin(P) ≥ 0.69, we choose c = 0.615 < 0.69 (2/√5) = 0.617. The set {V(x) ≤ 0.615} is an estimate of the region of attraction.
where g1(·) and g2(·) are locally Lipschitz and satisfy
and ∫₀^y g1(z) dz → ∞ as |y| → ∞. The system has an isolated equilibrium point at the origin. The equation of this system can be viewed as a generalized pendulum equation with g2(x2) as the friction term. Therefore, a Lyapunov function candidate may be taken as the energy-like function V(x) = ∫₀^{x1} g1(y) dy + (1/2) x2², which is positive definite in R² and radially unbounded. The derivative of V(x) along the trajectories of the system is given by
Thus, V̇(x) is negative semidefinite. Note that V̇(x) = 0 implies x2 g2(x2) = 0, which implies x2 = 0. Therefore, the only solution that can stay forever in the set {x ∈ R² | x2 = 0} is the trivial solution x(t) = 0. Thus, the origin is globally asymptotically stable.
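The invariance-principle argument can be spot-checked by simulation for the damped pendulum of Example 56.1, which is a special case of this generalized form. The sketch below uses a hand-rolled RK4 integrator and the illustrative values a = 1, b = 0.5 (assumptions, not from the text): the energy is nonincreasing along the trajectory and the state approaches the origin.

```python
import math

# Damped pendulum of Example 56.1: x1' = x2, x2' = -a sin(x1) - b x2.
# The values a = g/l = 1 and b = k/m = 0.5 are illustrative assumptions.
a, b = 1.0, 0.5

def f(x):
    return (x[1], -a * math.sin(x[0]) - b * x[1])

def V(x):
    # Energy function V = integral_0^{x1} a sin(y) dy + x2^2/2
    #                   = a(1 - cos x1) + x2^2/2.
    return a * (1.0 - math.cos(x[0])) + 0.5 * x[1] ** 2

def rk4_step(x, h):
    k1 = f(x)
    k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
    k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
    k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
    return (x[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

# Start inside {V(x) <= c} with c < 2a, so the level set is a closed contour.
x = (2.0, 0.0)                      # V(x) is about 1.42 < 2a = 2
energies = [V(x)]
for _ in range(6000):               # integrate for 60 time units
    x = rk4_step(x, 0.01)
    energies.append(V(x))

# Vdot = -b x2^2 <= 0: energy is nonincreasing (up to integration error),
# and by the invariance argument the state approaches the origin.
assert all(e2 <= e1 + 1e-8 for e1, e2 in zip(energies, energies[1:]))
assert abs(x[0]) < 1e-3 and abs(x[1]) < 1e-3
```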
EXAMPLE 56.3:
EXAMPLE 56.4: Consider the nonautonomous system
where g(t) is continuously differentiable and satisfies 0 ≤ g(t) ≤ k and ġ(t) ≤ g(t) for all t ≥ 0. The system has an equilibrium point at the origin. Consider a Lyapunov function candidate V(t, x) = x1² + [1 + g(t)] x2². The function V satisfies the inequalities
The derivative of V along the trajectories of the system is given by

V̇ = −2x1² + 2x1x2 − [2 + 2g(t) − ġ(t)] x2²
The second-order system
has a unique equilibrium point at the origin. Linearization at the origin yields the matrix
which is Hurwitz. Hence, the origin is asymptotically stable. A Lyapunov function for the system can be found by solving the Lyapunov equation P A + A^T P = −I for P. The unique solution is the positive definite matrix
Using the bound on ġ(t), we have 2 + 2g(t) − ġ(t) ≥ 2 + 2g(t) − g(t) ≥ 2. Therefore,
The matrix Q is positive definite. Hence, the origin is uniformly asymptotically stable. Since all inequalities are satisfied globally and x1² + x2² is radially unbounded, the origin is globally uniformly asymptotically stable.
Feedback Stabilization Consider the nonlinear system
The quadratic function V(x) = x^T P x is a Lyapunov function for the system in a certain neighborhood of the origin. Let us use this function to estimate the region of attraction. The function V(x) is positive definite for all x. We need to determine a domain D about the origin where V̇(x) is negative definite and a set Ω_c ⊂ D, which is bounded. We are interested in the largest set Ω_c that we can determine, that is, the largest value for the constant c, because Ω_c will be our estimate of the region of attraction. The derivative of V(x) along the trajectories of the system is given by
where x is an n-dimensional state, u is an m-dimensional control input, and y is a p-dimensional measured output. Suppose f and h are continuously differentiable in the domain of interest, and f(0, 0) = 0 and h(0) = 0, so that the origin is an open-loop equilibrium point. Suppose we want to design an output feedback control to stabilize the origin; that is, to make the origin an asymptotically stable equilibrium point of the closed-loop system. We can pursue this design problem via linearization.
The linearization of the system about the point (x = 0, u = 0) is given by
Locally Lipschitz function: A function f(x) is locally Lipschitz at a point if it satisfies the Lipschitz condition in a neighborhood of that point. Lyapunov equation: A linear algebraic matrix equation of the form P A + A^T P = −Q, where A and Q are real square matrices and Q is symmetric and positive definite. The equation has a (unique) positive definite matrix solution P if and only if A is Hurwitz. Lyapunov function: A scalar positive definite function of the state whose derivative along the trajectories of the system is negative semidefinite. Lyapunov surface: A set of the form V(x) = c, where V(x) is a Lyapunov function and c is a positive constant. Negative (semi-) definite function: A scalar function of a vector argument V(x) is negative (semi-) definite if it vanishes at the origin and is negative (nonpositive) for all points in the neighborhood of the origin, excluding the origin. Negative (semi-) definite matrix: A square symmetric real matrix P is negative (semi-) definite if the quadratic form V(x) = x^T P x is a negative (semi-) definite function. Positive (semi-) definite function: A scalar function of a vector argument V(x) is positive (semi-) definite if it vanishes at the origin and is positive (nonnegative) for all points in the neighborhood of the origin, excluding the origin.
where
Assuming that (A, B) is stabilizable and (A, C) is detectable (that is, uncontrollable and unobservable eigenvalues, if any, have negative real parts), we can design a dynamic output feedback controller
ż = F z + G y
u = H z + K y
where z is a q-dimensional vector, such that the closed-loop matrix
is Hurwitz. When the feedback control of Equation 56.13 is applied to the nonlinear system, it results in a system of order (n + q), whose linearization at the origin is the matrix A. Hence, the origin of the closed-loop system is asymptotically stable. By solving the Lyapunov equation P A + A^T P = −Q for some positive definite matrix Q, we can use V = X^T P X, where X = [x^T z^T]^T, as a Lyapunov function for the closed-loop system, and we can estimate the region of attraction of the origin, as illustrated in Example 56.3.
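Assembling the closed-loop matrix and verifying that it is Hurwitz can be sketched as follows. The plant and controller matrices below are hypothetical scalar examples chosen only for illustration (they are not from the text); the closed-loop structure follows from substituting y = Cx and u = Hz + Ky into the linearized dynamics.

```python
import numpy as np

# Hypothetical first-order plant x' = A x + B u, y = C x (open-loop unstable),
# with a hypothetical first-order controller z' = F z + G y, u = H z + K y.
# All numerical values below are illustrative assumptions.
A, B, C = np.array([[1.0]]), np.array([[1.0]]), np.array([[1.0]])
F, G = np.array([[-1.0]]), np.array([[1.0]])
H, K = np.array([[0.0]]), np.array([[-3.0]])

# Closed loop: x' = (A + B K C) x + B H z,  z' = G C x + F z.
A_cl = np.block([[A + B @ K @ C, B @ H],
                 [G @ C, F]])

# A_cl Hurwitz implies the origin of the nonlinear closed-loop system is
# (locally) asymptotically stable.
assert max(ev.real for ev in np.linalg.eigvals(A_cl)) < 0
```

For a real design, F, G, H, K would come from observer/state-feedback synthesis (pole placement or LQG); here they are fixed by hand so the check is self-contained.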
56.1.4 Defining Terms Asymptotically stable equilibrium: A stable equilibrium point with the additional feature that all trajectories starting at nearby points approach the equilibrium point as time approaches infinity. Equilibrium point: A constant solution x* of ẋ = f(t, x). For an autonomous system ẋ = f(x), equilibrium points are the real roots of the equation 0 = f(x). Globally asymptotically stable equilibrium: A stable equilibrium point with the additional feature that all trajectories approach the equilibrium point as time approaches infinity. Hurwitz matrix: A square real matrix is Hurwitz if all its eigenvalues have negative real parts. Linearization: Approximation of the nonlinear state equation in the vicinity of an equilibrium point by a linear state equation, obtained by dropping second- and higher-order terms of the Taylor expansion (about the equilibrium point) of the right-hand side function. Lipschitz condition: A condition imposed on a function f(x) to ensure that it has a finite slope. For a vector-valued function, it takes the form ||f(x) − f(y)|| ≤ L||x − y|| for some positive constant L.
Positive (semi-) definite matrix: A square symmetric real matrix P is positive (semi-) definite if the quadratic form V(x) = x^T P x is a positive (semi-) definite function. Region of attraction: The set of all points with the property that the trajectories starting at these points asymptotically approach the origin. Stable equilibrium: An equilibrium point where all solutions starting at nearby points stay nearby.
Reference [1] Khalil, H.K., Nonlinear Systems, Macmillan, New York, 1992.
Further Reading Our presentation of Lyapunov stability is based on the textbook by H.K. Khalil (see [1]). For further information on Lyapunov stability, the reader is referred to Chapters 3, 4, and 5 of Khalil's book. Chapters 3 and 4 give a detailed treatment of the material covered in this chapter. They also address important topics, like the use of the center manifold theorem when linearization fails, and the use of Lyapunov's method when there is uncertainty in the state model. Chapter 5 illustrates the application of Lyapunov's method to control problems like the design of robust and adaptive control. Other engineering textbooks where Lyapunov stability is emphasized include Vidyasagar, M. 1993. Nonlinear Systems Analysis, 2nd ed.; Slotine, J.-J. and Li, W. 1991. Applied Nonlinear Control; Atherton, D.P. 1981. Stability of Nonlinear Systems; and Hsu, J.C. and Meyer, A.U. 1968. Modern Control Principles and Application. For a deeper look into the theoretical foundation of Lyapunov stability, there are excellent references, which include Rouche, N., Habets, P., and Laloy, M. 1977. Stability Theory by Lyapunov's Direct Method; Hahn, W. 1967. Stability of Motion; and Krasovskii, N.N. 1963. Stability of Motion. Periodic journals on control theory and systems often include articles where Lyapunov's method is used in system analysis or control design. Examples are the IEEE Transactions on Automatic Control and the IFAC journal Automatica.
56.2 Input-Output Stability A.R. Teel, Department of Electrical Engineering, University of Minnesota; T.T. Georgiou, Department of Electrical Engineering, University of Minnesota; L. Praly, Centre Automatique et Systèmes, École des Mines de Paris; E. Sontag, Department of Mathematics, Rutgers University
input-output point of view is still an area of ongoing and intensive research. The strength of input-output stability theory is that it provides a method for anticipating the qualitative behavior of a feedback system with only rough information about the feedback components. This, in turn, leads to notions of robustness of feedback stability and motivates many of the recent developments in modern control theory.
56.2.2 Systems and Stability Throughout our discussion of input-output stability, a signal is a "reasonable" (e.g., piecewise continuous) function defined on a finite or semi-infinite time interval, i.e., an interval of the form [0, T) where T is either a strictly positive real number or infinity. In general, a signal is vector-valued; its components typically represent actuator and sensor values. A dynamical system is an object which produces an output signal for each input signal. To discuss stability of dynamical systems, we introduce the concept of a norm function, denoted || · ||, which captures the "size" of signals defined on the semi-infinite time interval. The significant properties of a norm function are that 1) the norm of a signal is zero if the signal is identically zero, and is a strictly positive number otherwise, 2) scaling a signal results in a corresponding scaling of the norm, and 3) the triangle inequality holds, i.e., ||u1 + u2|| ≤ ||u1|| + ||u2||. Examples of norm functions are the p-norms. For any real number p ≥ 1, the p-norm is defined by
where | · | represents the standard Euclidean norm, i.e., |u| = (Σ_{i=1}^n u_i²)^{1/2}.
56.2.1 Introduction A common task for an engineer is to design a system that reacts to stimuli in some specific and desirable way. One way to characterize appropriate behavior is through the formalism of input-output stability. In this setting a notion of well-behaved input and output signals is made precise and the question is posed: do well-behaved stimuli (inputs) produce well-behaved responses (outputs)? General input-output stability analysis has its roots in the development of the electronic feedback amplifier of H.S. Black in 1927 and the subsequent development of classical feedback design tools for linear systems by H. Nyquist and H.W. Bode in the 1930s and 1940s, all at Bell Telephone Laboratories. These latter tools focused on determining input-output stability of linear feedback systems from the characteristics of the feedback components. Generalizations to nonlinear systems were made by several researchers in the late 1950s and early 1960s. The most notable contributions were those of G. Zames, then at M.I.T., I.W. Sandberg at Bell Telephone Laboratories, and V.M. Popov. Indeed, much of this chapter is based on the foundational ideas found in [5], [7] and [10], with additional insights drawn from [6]. A thorough understanding of nonlinear systems from an
For p = ∞, we define
The ∞-norm is useful when amplitude constraints are imposed on a problem, and the 2-norm is of more interest in the context of energy constraints. The norm of a signal may very well be infinite. We will typically be interested in measuring signals which may only be defined on finite time intervals, or in measuring truncated versions of signals. To that end, given a signal u defined on [0, T) and a strictly positive real number τ, we use u_τ to denote the truncated signal generated by extending u onto [0, ∞) by defining u(t) = 0 for t ≥ T, if necessary, and then truncating, i.e., u_τ is equal to the (extended) signal on the interval [0, τ] and is equal to zero on the interval (τ, ∞). Informally, a system is stable in the input-output sense if small input signals produce correspondingly small output signals. To make this concept precise, we need a way to quantify the dependence of the norm of the output on the norm of the input applied to the system. To that end, we define a gain function as a function from the nonnegative real
numbers to the nonnegative real numbers which is continuous, nondecreasing, and zero when its argument is zero. For notational convenience we will say that the "value" of a gain function at ∞ is ∞. A dynamical system is stable (with respect to the norm || · ||) if there is a gain function γ which gives a bound on the norm of truncated output signals as a function of the norm of truncated input signals, i.e.,
In the very special case when the gain function is linear, i.e., there is at most an amplification by a constant factor, the dynamical system is finite gain stable. The notions of finite gain stability and closely related variants are central to much of classical input-output stability theory, but in recent years much progress has been made in understanding the role of more general (nonlinear) gains in system analysis. The focus of this chapter will be on the stability analysis of interconnected dynamical systems as described in Figure 56.2. The composite system in Figure 56.2 will be called a well-defined interconnection if it is a dynamical system with
(d1, d2) as input and (y1, y2) as output, i.e., given an arbitrary input signal (d1, d2), a signal (y1, y2) exists so that, for the dynamical system Σ1, the input d1 + y2 produces the output y1 and, for the dynamical system Σ2, the input d2 + y1 produces the output y2. To see that not every interconnection is well-defined, consider the case where both Σ1 and Σ2 are the identity mappings. In this case, the only input signals (d1, d2) for which an output (y1, y2)
can be found are those for which d1 + d2 = 0. The dynamical systems which make up a well-defined interconnection will be called its feedback components. For stability of well-defined interconnections, it is not necessary for either of the feedback components to be stable, nor is it sufficient for both of the feedback components to be stable. On the other hand, necessary and sufficient conditions for stability of a well-defined interconnection can be expressed in terms of the set of all possible input-output pairs for each feedback component. To be explicit, following are some definitions. For a given dynamical system Σ with input signals u and output signals y, the set of its ordered input-output pairs (u, y) is referred to as the graph of the dynamical system and is denoted G_Σ. When the input and output are exchanged in the ordered pair, i.e., (y, u), the set is referred to as the inverse graph of the system and is denoted G_Σ^I. Note that, for the system in Figure 56.2, the inverse graph of Σ2 and the graph of Σ1 lie in the same Cartesian product space, called the ambient space. We will use as norm on the ambient space the sum of the norms of the coordinates. The basic observation regarding input-output stability for a well-defined interconnection says, in informal terms, that if a signal in the inverse graph of Σ2 is near any signal in the graph of Σ1 then it must be small. To formalize this notion, we need the concept of the distance to the graph of Σ1 from signals x in
Figure 56.2 Standard feedback configuration.
the ambient space. This (truncated) distance is defined by

d_τ(x, G_Σ1) := inf_{z ∈ G_Σ1} ||(x − z)_τ||.   (56.16)
THEOREM 56.1 Graph separation theorem: A well-defined interconnection is stable if, and only if, a gain function γ exists which gives a bound on the norm of truncated signals in the inverse graph of Σ2 as a function of the truncated distance from the signals to the graph of Σ1, i.e.,
In the special case where γ is a linear function, the well-defined interconnection is finite gain stable.
The idea behind this observation can be understood by considering the signals that arise in the closed loop which belong to the inverse graph of Σ2, i.e., the signals (y2, y1 + d2). (Stability with these signals taken as output is equivalent to stability with the original outputs.) Notice that, for the system in Figure 56.2, signals in the graph of Σ1 have the form (y2 + d1, y1). Consequently, signals x ∈ G_Σ2^I and z ∈ G_Σ1 which satisfy the feedback equations also satisfy
x − z = (−d1, d2)   (56.18)

and ||(x − z)_τ|| = ||(d1, d2)_τ||   (56.19)
for truncations within the interval of definition. If there are signals x in the inverse graph of Σ2 with large truncated norm but small truncated distance to the graph of Σ1, i.e., there exist some z ∈ G_Σ1 and τ > 0 such that ||(x − z)_τ|| is small, then we can choose (d1, d2) to satisfy Equation 56.18, giving, according to Equation 56.19, a small input which produces a large output. This contradicts our definition of stability. Conversely, if there is no z which is close to x, then only large inputs can produce large x signals and thus the system is stable. The distance observation presented above is the unifying idea behind the input-output stability criteria applied in practice. However, the observation is rarely applied directly because of the difficulties involved in exactly characterizing the graph of a dynamical system and measuring distances. Instead, various simpler conditions have been developed which constrain the graphs of the feedback components to guarantee that the graph of
Σ1 and the inverse graph of Σ2 are sufficiently separated. There are many such sufficient conditions, and, in the remainder of this chapter, we will describe a few of them.
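Before turning to these practical conditions, the truncated p-norms used in the definitions above can be made concrete for sampled signals. A minimal sketch (the signals and time grid below are illustrative assumptions):

```python
import math

def p_norm(u, h, p):
    """p-norm of a sampled scalar signal u with sample spacing h."""
    if p == math.inf:
        return max(abs(v) for v in u)
    return (h * sum(abs(v) ** p for v in u)) ** (1.0 / p)

def truncate(u, h, tau):
    """The truncation u_tau: zero the (extended) signal for t > tau."""
    return [v if k * h <= tau else 0.0 for k, v in enumerate(u)]

h = 0.001
t = [k * h for k in range(10000)]            # grid on [0, 10)
u1 = [math.exp(-s) for s in t]               # decaying signal
u2 = [math.sin(s) for s in t]

# Property 3 (triangle inequality) for the 2-norm.
lhs = p_norm([a + b for a, b in zip(u1, u2)], h, 2)
assert lhs <= p_norm(u1, h, 2) + p_norm(u2, h, 2) + 1e-12

# Truncation never increases a norm.
assert p_norm(truncate(u1, h, 1.0), h, 2) <= p_norm(u1, h, 2)

# The 2-norm of exp(-t) on [0, inf) is sqrt(1/2); the Riemann sum approximates it.
assert abs(p_norm(u1, h, 2) - math.sqrt(0.5)) < 1e-2
```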
56.2.3 Practical Conditions and Examples The Classical Small Gain Theorem One of the most commonly used sufficient conditions for graph separation constrains the graphs of the feedback components by assuming that each feedback component is finite gain stable. Then, the appropriate graphs will be separated if the product of the coefficients of the linear gain functions is sufficiently small. For this reason, the result based on this type of constraint has come to be known as the small gain theorem.
THEOREM 56.2 Small gain theorem: If each feedback component is finite gain stable and the product of the gains (the coefficients of the linear gain functions) is less than one, then the well-defined interconnection is finite gain stable. Figure 56.3 provides the intuition for the result. If we were to draw an analogy between a dynamical system and a static map whose graph is a set of points in the plane, the graph of Σ1 would be constrained to the darkly shaded conic region by the finite gain stability assumption. Likewise, the inverse graph of Σ2 would be constrained to the lightly shaded region. The fact that the product of the gains is less than one guarantees the positive aperture between the two regions and, in turn, that the graphs are separated sufficiently.
single-output (SISO) finite gain stable system modeled by a real, rational transfer function G(s), the smallest possible coefficient for the stability gain function with respect to the 2-norm is given by

γ := sup_ω |G(jω)|.   (56.20)
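Equation 56.20 can be approximated by gridding the frequency axis. The sketch below uses the illustrative stable transfer function G(s) = 1/(s + 1) (an assumption, not from the text), whose peak gain is 1, attained at ω = 0, and then applies the small gain condition to a hypothetical loop:

```python
def gain_2norm(G, w_max=1e4, n=20000):
    """Estimate the 2-norm gain coefficient sup_w |G(jw)| on a frequency grid."""
    best = abs(G(0j))
    for k in range(1, n + 1):
        best = max(best, abs(G(1j * w_max * k / n)))
    return best

# Stable SISO example: G(s) = 1/(s + 1); |G(jw)| = 1/sqrt(1 + w^2) peaks at w = 0.
G = lambda s: 1.0 / (s + 1.0)
gamma1 = gain_2norm(G)
assert abs(gamma1 - 1.0) < 1e-9

# Small gain check for a loop of G with a second component H(s) = 0.5/(s + 1):
# gamma1 * gamma2 = 0.5 < 1, so the interconnection is finite gain stable.
H = lambda s: 0.5 / (s + 1.0)
gamma2 = gain_2norm(H)
assert gamma1 * gamma2 < 1.0
```

A grid search only lower-bounds the supremum; for rational G the peak can also be located exactly from the Bode plot, as noted in the text.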
For multi-input, multi-output systems, the magnitude in Equation 56.20 is replaced by the maximum singular value. In either case, this can be established using Parseval's theorem. For SISO systems, the quantity in Equation 56.20 can be obtained from a quick examination of the Bode plot or Nyquist plot for the transfer function. If the Nyquist plot of a stable SISO transfer function lies inside a circle of radius γ centered at the origin, then the coefficient of the 2-norm gain function for the system is less than or equal to γ. More generally, consider a dynamical system that can be represented by a finite dimensional ordinary differential equation with zero initial state:
ẋ = f(x, u),  y = h(x, u),  x(0) = 0.   (56.21)
Suppose that f has globally bounded partial derivatives and that positive real numbers ℓ1 and ℓ2 exist so that

|h(x, u)| ≤ ℓ1 |x| + ℓ2 |u|.   (56.22)
Under these conditions, if the trajectories of the unforced system with nonzero initial conditions,

ẋ = f(x, 0),  x(0) = x0,   (56.23)
satisfy

|x(t)| ≤ k exp(−λt) |x0|,   (56.24)
for some positive real numbers k and λ and any x0 ∈ Rⁿ, then the system (Equation 56.21) is finite gain stable in any of the p-norms. This can be established using Lyapunov function arguments that apply to the system (Equation 56.23). The details can be found in the textbooks on nonlinear systems mentioned later in Section 56.10.
EXAMPLE 56.5: Consider a nonlinear control system modeled by an ordinary differential equation with state x ∈ Rⁿ, input v ∈ Rᵐ, and disturbance d1 ∈ Rᵐ:

Figure 56.3 Classical small gain theorem.
To apply the small gain theorem, we need a way to verify that the feedback components are finite gain stable (with respect to a particular norm:) and to determine their gains. In particular, any linear dynamical system that can be represented with a real, rational transfer function G (s) is finite gain stable in any of the pnorms if, and only if, all of the poles of the transfer function have negative real parts. A popular norm to work with is the 2-norm. It is associated with the energy of a signal. For a single-input,
ẋ = f(x, v + d1).   (56.25)
Suppose that f has globally bounded partial derivatives and that a control v = α(x) can be found, also with a globally bounded partial derivative, so that the trajectories of the system

ẋ = f(x, α(x)),  x(0) = x0,   (56.26)

satisfy the bound

|x(t)| ≤ k exp(−λt) |x0|
for some positive real numbers k and λ and for all x0 ∈ Rⁿ. As mentioned above, for any function h satisfying the type of bound in Equation 56.22, this implies that the system
the 2-norm gain of Σ2 is εγ2. We conclude from the small gain theorem that, if

ε < 1/(γ1 γ2),   (56.36)
then the composite system (Equations 56.31-56.33), with inputs d1 and d2 and outputs y1 = A⁻¹Bα̇(x) and y2 = Cζ, is finite gain stable. The closed-loop system ẋ = f(x, α(x) + d1) has finite 2-norm gain from input d1 to output y. We consider the output
y := α̇ = (∂α/∂x) f(x, α(x) + d1),   (56.29)

which satisfies the type of bound in Equation 56.22 because α and f both have globally bounded partial derivatives. We will show, using the small gain theorem, that disturbances d1 with finite 2-norm continue to produce outputs y with finite 2-norm even when the actual input v to the process is generated by the following fast dynamic version of the commanded input α(x):
v
= =
Az
+ B(a(r)) + d2 , z(0) = -Ap'
The Classical Passivity Theorem Another very popular condition used to guarantee graph separation is given in thepassivitytheorern. For the most straightforward passivity result, thenumber ofinput channels must equal the number ofoutput channels for each feedbackcomponent. We then identify the relative location of the graphs of the feedback components using a condition involving the integral of the product of the input and the output signals. This operation is known as the inner product, denoted (., .). In particular, for two signals u and y of the same dimension defined on the semi-infinite interval,
Ba(x(0))
Cz.
(56.30) Here, ε is a small positive parameter, the eigenvalues of A all have strictly negative real part (thus A is invertible), and −CA⁻¹B = I. This system may represent unmodeled actuator dynamics. To see the stability result, we will consider the composite system in the coordinates x and ζ = z + A⁻¹Bα(x). Using the notation from Figure 56.2,
and
with the interconnection conditions u1 = y2 + d1 and u2 = y1 + ε⁻¹ d2.   (56.33)
Of course, if the system is finite gain stable with the inputs d1 and ε⁻¹d2, then it is also finite gain stable with the inputs d1 and d2. We have already discussed that the system Σ1 in Equation 56.31 has finite 2-norm gain, say γ1. Now consider the system Σ2 in Equation 56.32. It can be represented with the transfer function
Note that (u, y) = (y, u) and (u, u) = 11~11:. A dynamical system is passive if, for each input-output pair (u, y) and each r > 0, (u,, y,) 2 0. The terminology used here comes from the special case where the input and output are a voltage and a current, respectively, and the energy absorbed by the dynamical system, which is the inner product of the input and output, is nonnegative. Again by analogy to a static map whose graph lies in the plane, passivity o f a dynamical system can be viewed as the condition that the graph is constrained to the darkly shaded region in Figure 56.4, i.e., the first and third quadrants ofthe plane. This graph and the inverse graph of asecond system would be separated if, for example, the inverse graph of the second system were constrained to the lightly shaded region in Figure 56.4, i.e., the second and fourth quadrants but bounded away from the horizontal andvertical axes by an increasing and unbounded distance. But, this is the same as asking that the graph of the second system followed by the scaling '-1: i.e., all pairs (u, -y), be constrained to the first and third quadrants, again bounded away from the axes by an increasing and unbounded distance, as in Figure 56.5a. For classical passivity theorems, this region is given a linear boundary as in Figure 56.5b. Notice that, for points (u,, yo) in the plane, if u, . yo 2 c(u: y2) then (u,, yo) is in the first or third quadrant, and (6)-' lu, 1 > I yo1 r E luo 1 as in Figure 56.5b. This leads to the following stronger version of passivity. A dynamical system is input and output strictly passive if a strictly positive real number E exists so that, for each input-output pair (u, y ) a n d e a c h r > 0, ( ~ T ~ Y 2 TE (ll~sll: ) II~rll:). There are intermediate versions ofpassivity which are also useful. These correspond to asking for an irgreasing and unbounded distance from either the horizontal axis o r the vertical axis but not both. 
For example, a dynamical system is input strictly passive if a strictly positive real number ε exists so that, for each input-output pair (u, y) and each τ > 0, ⟨u_τ, y_τ⟩ ≥ ε||u_τ||₂². Similarly, a dynamical system is output strictly passive if a strictly
Identifying G(s) = C(sI − A)⁻¹B, we see that, if

γ2 := sup_ω σ_max(G(jω)),   (56.35)
respectively, passive and input and output strictly passive, then the well-defined interconnection is finite gain stable in the 2-norm. To apply this theorem, we need a way to verify that the (possibly scaled) feedback components are appropriately passive. For stable SISO systems with real, rational transfer function G(s), it again follows from Parseval's theorem that, if
Figure 56.4
General passivity-based interconnection.
for all real values of ω, then the system is passive. If the quantity Re G(jω) is positive and uniformly bounded away from zero for all real values of ω, then the linear system is input and output strictly passive. Similarly, if ε > 0 exists so that, for all real values of ω,

Re G(jω − ε) ≥ 0,    (56.38)

then the linear system is output strictly passive. So, for SISO systems modeled with real, rational transfer functions, passivity and the various forms of strict passivity can again be easily checked by means of a graphical approach such as a Nyquist plot.
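The graphical test just stated can be sketched numerically by sampling Re G(jω) on a frequency grid. The transfer function below is an illustrative choice of ours, not one from the text:

```python
import numpy as np

def G(s):
    # Illustrative stable SISO transfer function (an assumption): G(s) = (s + 2)/(s + 1)
    return (s + 2) / (s + 1)

w = np.logspace(-3, 3, 2000)        # frequency grid (rad/s)
re = np.real(G(1j * w))

# Re G(jw) = (2 + w^2)/(1 + w^2) >= 1 for all w, i.e., uniformly bounded
# away from zero, so by the criterion above this system is input and
# output strictly passive.
print(re.min())
```

In practice the same check is done by inspecting whether the Nyquist plot stays in the closed right-half plane (passivity) or strictly inside it (strict passivity).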
Figure 56.5
Different notions of input and output strict passivity.
positive real number ε exists so that, for each input-output pair (u, y) and each τ > 0, (u_τ, y_τ) ≥ ε‖y_τ‖₂². It is worth noting that input and output strict passivity is equivalent to input strict passivity plus finite gain stability. This can be shown with standard manipulations of the inner product. Also, the reader is warned that all three types of strict passivity mentioned above are frequently called "strict passivity" in the literature. Again by thinking of a graph of a system as a set of points in the plane, output strict passivity is the condition that the graph is constrained to the darkly shaded region in Figure 56.6, i.e., the first and third quadrants with an increasing and unbounded distance from the vertical axis. To complement such a graph, consider a second dynamical system which, when followed by the scaling '-1', is also output strictly passive. Such a system has a graph (without the '-1' scaling) constrained to the second and fourth quadrants with an increasing and unbounded distance from the vertical axis. In other words, its inverse graph is constrained to the lightly shaded region of Figure 56.6, i.e., to the second and fourth quadrants but with an increasing and unbounded distance from the horizontal axis. The conclusions that we can then draw, using the graph separation theorem, are summarized in the following passivity theorem.
Figure 56.6
More generally, for any dynamical system that can be modeled with a smooth, finite dimensional ordinary differential equation,

ẋ = f(x) + g(x)u,    y = h(x),

if a strictly positive real number ε exists and a nonnegative function V : Rⁿ → R≥0 with V(0) = 0 satisfying

(∂V/∂x)(x) f(x) ≤ −ε hᵀ(x)h(x)
and
THEOREM 56.3  Passivity theorem: If one dynamical system and the other dynamical system followed by the scaling '-1' are both input strictly passive, OR both output strictly passive, OR
Interconnection of output strictly passive systems.
(∂V/∂x)(x) g(x) = hᵀ(x),
then the system is output strictly passive. With ε = 0, the system is passive. Both of these results are established by integrating the derivative of V over the semi-infinite interval.
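The integration argument can be sketched as follows (a reconstruction under the assumption that the state model is ẋ = f(x) + g(x)u, y = h(x) with x(0) = 0, and that V satisfies the two displayed conditions):

```latex
u^T y \;=\; \frac{\partial V}{\partial x}\,g(x)\,u
      \;=\; \dot V - \frac{\partial V}{\partial x}\,f(x)
      \;\ge\; \dot V + \varepsilon\, y^T y ,
\qquad\text{so}\qquad
\int_0^\tau u^T y \, dt \;\ge\; V(x(\tau)) - V(0) + \varepsilon \|y_\tau\|_2^2
      \;\ge\; \varepsilon \|y_\tau\|_2^2 ,
```

which is exactly the output strict passivity inequality; setting ε = 0 gives plain passivity.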
EXAMPLE 56.6: (This example is prompted by the work in Berghuis and Nijmeijer, Syst. Control Lett., 1993, 21, 289-295.) Consider a "completely controlled dissipative Euler-Lagrange" system with generalized "forces" F, generalized coordinates q, uniformly positive definite "inertia" matrix I(q), Rayleigh dissipation function R(q̇) and, say positive, potential V(q) starting from the position q_d. Let the dynamics of the system be given by the Euler-Lagrange-Rayleigh equations,
where L is the Lagrangian
Along the solution of Equation 56.42,
THE CONTROL HANDBOOK

and, integrating, for each τ,
Since H ≥ 0, H(0) = 0 and Equation 56.47 holds
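For reference, the standard Euler-Lagrange-Rayleigh form and energy function this example refers to can be sketched as follows (standard classical mechanics, not a reconstruction of the exact displayed equations):

```latex
\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q}
  + \frac{\partial \mathcal{R}}{\partial \dot q} = F,
\qquad
L(q,\dot q) = \tfrac12\,\dot q^{T} I(q)\,\dot q - V(q),
\qquad
H(q,\dot q) = \tfrac12\,\dot q^{T} I(q)\,\dot q + V(q),
```

and along solutions the energy balance reads Ḣ = q̇ᵀF − q̇ᵀ(∂R/∂q̇), which is the inequality integrated above.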
Using the notation from Figure 56.2, let Σ₁ be the system (Equations 56.42 and 56.49) with input F_m and output q̇. Let Σ₂ be any system that, when followed by the scaling '-1', is output strictly passive. Then, according to the passivity theorem, the composite feedback system as given in Figure 56.2 is finite gain stable using the 2-norm. One possibility for Σ₂ is minus the identity mapping. However, there is interest in choosing Σ₂ followed by the scaling '-1' as a linear, output strictly passive compensator which, in addition, has no direct feed-through term. The reason is that if d₂ in Figure 56.2 is identically zero, we can implement Σ₂ with measurement only of q and without q̇. In general,
and the system G(s)s is implementable if G(s) has no direct feed-through term. To design an output strictly passive linear system without direct feed-through, let A be a matrix having all eigenvalues with strictly negative real parts so that, by a well-known result in linear systems theory, a positive definite matrix P exists satisfying
Then, for any B matrix of appropriate dimensions, the system modeled by the transfer function,
We will suppose that δ ≥ 0 exists so that
followed by the scaling '-1', is output strictly passive. To see this, consider a state-space realization
Now let V_d be a function so that the modified potential has a global minimum at q = q_d, and let the generalized "force"
d/dt (xᵀPx) = xᵀ(AᵀP + PA)x + 2xᵀPBu = −xᵀx + 2xᵀPBu    (56.58)

            = −xᵀx + 2yᵀu.    (56.59)
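The Lyapunov-equation construction described above can be sketched numerically with the standard solver; A and B below are arbitrary illustrative choices, not values from the text:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical Hurwitz A (eigenvalues -1, -2) and an arbitrary B.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])

# Solve A^T P + P A = -I for P (positive definite since A is Hurwitz).
P = solve_continuous_lyapunov(A.T, -np.eye(2))

# Verify: P is symmetric positive definite, so V(x) = x^T P x qualifies
# as a storage function and d/dt(x^T P x) = -x^T x + 2 x^T P B u.
eigs = np.linalg.eigvalsh((P + P.T) / 2)
print(bool(eigs.min() > 0))
```

Any such P then defines the output y = BᵀPx used in the passivity computation above.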
But, with Equation 56.57, for some strictly positive real number c,

c yᵀy ≤ xᵀx.    (56.60)

We can see that the system (Equation 56.42) combined with (Equation 56.49), having input F_m and output q̇, is output strictly passive by integrating the derivative of the defined Hamiltonian.
So, integrating Equation 56.59 and with P positive definite, for all τ,

(y_τ, u_τ) ≥ ε‖y_τ‖₂².    (56.61)

As a point of interest, one could verify that

G(s)s = −BᵀPA(sI − A)⁻¹B − BᵀPB.    (56.62)
Simple Nonlinear Separation Theorems  In this section we illustrate how allowing regions with nonlinear boundaries in the small gain and passivity contexts may be useful. First we need a class of functions to describe nonlinear boundaries. A proper separation function is a function from the nonnegative real numbers to the nonnegative real numbers which is continuous, zero at zero, strictly increasing and unbounded. The main difference between a gain function and a proper separation function is that the latter is invertible, and the inverse is another proper separation function.  Nonlinear passivity  We will briefly discuss a definition of nonlinear input and output strict passivity. To our knowledge, this idea has not been used much in the literature. The notion replaces the linear boundaries in the input and output strict passivity definition by nonlinear boundaries as in Figure 56.5a. A dynamical system is nonlinearly input and output strictly passive if a proper separation function ρ exists so that, for each input-output pair (u, y) and each τ > 0, (u_τ, y_τ) ≥ ‖u_τ‖₂ ρ(‖u_τ‖₂) + ‖y_τ‖₂ ρ(‖y_τ‖₂). (Note that in the classical definition of strict passivity, ρ(ξ) = εξ for all ξ ≥ 0.)
which is a nondecreasing function of t. So,
Now we can define
which is a proper separation function, so that
Finally, note that
so that

THEOREM 56.4  Nonlinear passivity theorem: If one dynamical system is passive and the other dynamical system followed by the scaling '-1' is nonlinearly input and output strictly passive, then the well-defined interconnection is stable using the 2-norm.
(−y₂τ, u₂τ) ≥ ‖u₂τ‖₂ ρ(‖u₂τ‖₂) + ‖y₂τ‖₂ ρ(‖y₂τ‖₂).
(56.71)

The conclusion that we can then draw from the nonlinear passivity theorem is that the interconnection of these two systems,
EXAMPLE 56.7:
Let
be a single integrator system,
This system is passive because
Let Σ₂ be a system which scales the instantaneous value of the input according to the energy of the input:
is stable when measuring input (d₁, d₂) and output (y₁, y₂) using the 2-norm.  Nonlinear small gain  Just as with passivity, the idea behind the small gain theorem does not require the use of linear boundaries. Consider a well-defined interconnection where each feedback component is stable but not necessarily finite gain stable. Let γ₁ be a stability gain function for Σ₁ and let γ₂ be a stability gain function for Σ₂. Then the graph separation condition will be satisfied if the distance between the curves (ξ, γ₁(ξ)) and (γ₂(ξ), ξ) grows without bound as in Figure 56.7. This is equivalent to asking whether it is possible to add to the curve (ξ, γ₁(ξ)) in the vertical direction and to the curve (γ₂(ξ), ξ) in the horizontal direction, by an increasing and unbounded amount, to obtain new curves (ξ, γ₁(ξ) + ρ(ξ)) and (γ₂(ξ) + ρ(ξ), ξ), where ρ is a proper separation function, so that the modified first curve is never above the modified second curve. If this is possible, we will say that the composition of the functions γ₁ and γ₂ is a strict contraction. To say that a curve (ξ, γ₁(ξ)) is never above a second curve (γ₂(ξ), ξ) is equivalent to saying that γ₁(γ₂(ξ)) ≤ ξ or γ₂(γ₁(ξ)) ≤ ξ for all ξ ≥ 0. (Equivalently, we
This system followed by the scaling '-1' is nonlinearly strictly passive. To see this, first note that
will write γ₁ ∘ γ₂ ≤ Id or γ₂ ∘ γ₁ ≤ Id.) So, requiring that the composition of γ₁ and γ₂ is a strict contraction is equivalent to requiring that a proper separation function ρ exists so that (γ₁ + ρ) ∘ (γ₂ + ρ) ≤ Id (equivalently, (γ₂ + ρ) ∘ (γ₁ + ρ) ≤ Id). This condition was made precise in [3]. (See also [2].) Note that it is not enough to add to just one curve because it is possible for the vertical or horizontal distance to grow without bound while the total distance remains bounded. Finally, note that, if the gain functions are linear, the condition is the same as the condition that the product of the gains is less than one.
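The strict-contraction condition can be checked numerically on a grid. The gain functions and the candidate proper separation function below are illustrative assumptions of ours:

```python
import numpy as np

# Illustrative gain functions: gamma1(s) = 0.5*s, gamma2(s) = s/(1+s),
# and a candidate proper separation function rho(s) = 0.1*s.
gamma1 = lambda s: 0.5 * s
gamma2 = lambda s: s / (1.0 + s)
rho    = lambda s: 0.1 * s

s = np.linspace(0.0, 100.0, 10001)
inner = gamma2(s) + rho(s)            # (gamma2 + rho)(s)
outer = gamma1(inner) + rho(inner)    # ((gamma1 + rho) o (gamma2 + rho))(s)

# (gamma1 + rho) o (gamma2 + rho) <= Id on the grid, so the composition
# is a strict contraction and the small gain condition holds.
print(bool(np.all(outer <= s + 1e-12)))
```

For linear gains γ₁(s) = k₁s, γ₂(s) = k₂s, the same check reduces to the classical product condition k₁k₂ < 1.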
Also, because h is continuous and zero at zero, gain functions γₓ and γᵤ exist so that
Given all of these functions, a stability gain function can be computed as
For more details, the reader is directed to [8].
THEOREM 56.5 Nonlinear small gain theorem: If each feedback component is stable (with gain functions yl and y2) and the composition of thegains is a strict contraction, then the well-defined interconnection is stable.
EXAMPLE 56.8:
Consider the composite system,
where x ∈ Rⁿ, z ∈ R, the eigenvalues of A all have strictly negative real part, ε is a small parameter, and sat(s) = sgn(s) min{|s|, 1}. This composite system is a well-defined interconnection of the subsystems
and
Figure 56.7
Nonlinear small gain theorem.
To apply the nonlinear small gain theorem, we need a way to verify that the feedback components are stable. To date, the most common setting for using the nonlinear small gain theorem is when measuring the input and output using the ∞-norm. For a nonlinear system which can be represented by a smooth, ordinary differential equation,

ẋ = f(x, u),    y = h(x, u),    x(0) = 0,    (56.73)
where h(0, 0) = 0, the system is stable (with respect to the ∞-norm) if there exist a positive definite and radially unbounded function V : Rⁿ → R≥0, a proper separation function, and a gain function so that
Since V is positive definite and radially unbounded, additional proper separation functions g and G exist so that
A gain function for the Σ₁ system is the product of the ∞-gain for the linear system, which we will call γ₁, with the function sat(·), i.e., for the system Σ₁,

‖y‖∞ ≤ γ₁ sat(‖u‖∞).    (56.82)
For the system Σ₂,
The distance between the curves (ξ, γ₁ sat(ξ)) and (|ε|(exp(ξ) − 1), ξ) must grow without bound. Graphically, one can see that a necessary and sufficient condition for this is that
General Conic Regions  There are many different ways to partition the ambient space to establish the graph separation condition in Equation 56.17. So far we have looked at only two very specific sufficient conditions, the small gain theorem and the passivity theorem. The general idea in these theorems is to constrain signals in the graph of Σ₁ within some conic region, and signals in the inverse graph of Σ₂ outside of this conic region. Conic regions more general than those used for the small gain and passivity theorems can be generated by using operators on the input-output pairs of the feedback components. Let C and R be operators on truncated ordered pairs in the ambient space, and let γ be a gain function. We say that the graph of Σ₁ is inside CONE(C, R, γ) if, for each (u, y) =: z belonging to the graph of Σ₁,
On the other hand, we say that the inverse graph of Σ₂ is strictly outside CONE(C, R, γ) if a proper separation function ρ exists so that, for each (y, u) =: x belonging to the inverse graph of Σ₂,

‖C(x_τ)‖ ≥ γ ∘ (Id + ρ)(‖R(x_τ)‖) + ρ(‖x_τ‖), for all τ.    (56.86)
follows from the nonlinear conic sector theorem using the 2-norm and taking
Suppose φ is a memoryless nonlinearity which satisfies

|φ(u, t) + cu| ≤ |ru|  for all t, u.    (56.89)
Graphically, the constraint on φ is shown in Figure 56.8. (In the case shown, c > r > 0.) We will use the notation SECTOR[−(c + r), −(c − r)] for the memoryless nonlinearity. It is also clear that the graph of φ lies in the CONE(C, R, γ) with C, R, γ defined in Equation 56.88. For a linear, time invariant, finite dimensional SISO system, whether its inverse graph is strictly outside of this cone can be determined by examining the Nyquist plot of its transfer function. The condition on the Nyquist plot is expressed relative to a disk D_{c,r} in the complex plane centered on the real axis and passing through the points on the real axis with real parts −1/(c + r) and −1/(c − r), as shown in Figure 56.9.
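Sector membership per Equation 56.89 is easy to test pointwise. The nonlinearity and the values of c and r below are illustrative assumptions of ours, not taken from the text:

```python
import numpy as np

# Arbitrary sector parameters and a hypothetical memoryless nonlinearity.
c, r = 1.0, 0.5
phi = lambda u: -c * u + 0.5 * r * u * np.cos(u)

u = np.linspace(-20.0, 20.0, 4001)
# |phi(u) + c*u| = 0.5*r*|u*cos(u)| <= r*|u|, so phi belongs to
# SECTOR[-(c + r), -(c - r)] = SECTOR[-1.5, -0.5].
inside = np.abs(phi(u) + c * u) <= r * np.abs(u) + 1e-12
print(bool(inside.all()))
```

Any time-varying φ(u, t) can be checked the same way by sampling t as well.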
We will only consider the case where the maps C and R are incrementally stable, i.e., a gain function γ exists so that, for each x₁ and x₂ in the ambient space and all τ,
In this case, the following result holds.
THEOREM 56.6  Nonlinear conic sector theorem: If the graph of Σ₁ is inside CONE(C, R, γ), and the inverse graph of Σ₂ is strictly outside CONE(C, R, γ), then the well-defined interconnection is stable.

Figure 56.8
When γ and ρ are linear functions, the well-defined interconnection is finite gain stable. The small gain and passivity theorems we have discussed can be interpreted in the framework of the nonlinear conic sector theorem. For example, for the nonlinear small gain theorem, the operator C is a projection onto the second coordinate in the ambient space, and R is a projection onto the first coordinate; γ is the gain function γ₁, and the small gain condition guarantees that the inverse graph of Σ₂ is strictly outside of the cone specified by this C, R and γ. In the remaining subsections, we will discuss other useful choices for the operators C and R.  The classical conic sector (circle) theorem  For linear SISO systems connected to memoryless nonlinearities, there is an additional classical result, known as the circle theorem, which
Instantaneous sector.
THEOREM 56.7  Circle theorem: Let r ≥ 0, and consider a well-defined interconnection of a memoryless nonlinearity belonging to SECTOR[−(c + r), −(c − r)] with a SISO system having a real, rational transfer function G(s). If
1. r > c, G(s) is stable and the Nyquist plot of G(s) lies in the interior of the disc D_{c,r}, or
2. r = c, G(s) is stable and the Nyquist plot of G(s) is bounded away from and to the right of the vertical line passing through the real axis at the value −1/(c + r), or
3. r < c, the Nyquist plot of G(s) (with Nyquist path indented into the right-half plane) is outside of and
But
Figure 56.9
A disc in the complex plane.
bounded away from the disc D_{c,r}, and the number of times the plot encircles this disc in the counterclockwise direction is equal to the number of poles of G(s) with strictly positive real parts, then the interconnection is finite gain stable.
Setting the latter expression to zero defines the boundary of the disc D_{c,r}. Since the expression is positive outside of this disc, it follows that γ_N < r⁻¹. Returning to the calculation initiated in Equation 56.92, note that γ_N < r⁻¹ implies that a strictly positive real number ε exists so that

(1 − εγ_D) γ_N⁻¹ ≥ r + 2ε.    (56.96)
So, case 1 is similar to the small gain theorem, and case 2 is similar to the passivity theorem. We will now explain case 3 in more detail. Let n(s) and d(s) represent, respectively, the numerator and denominator polynomials of G(s). Since the point (−1/c, 0) is inside the disc D_{c,r}, it follows, from the assumption of the theorem together with the well-known Nyquist stability condition, that all of the roots of the polynomial d(s) + cn(s) have negative real parts. Then y = G(s)u = N(s)D(s)⁻¹u where
D(s) := d(s)/(d(s) + cn(s))  and  N(s) := n(s)/(d(s) + cn(s)),    (56.90)
and, by taking z = D(s)⁻¹u, we can describe all of the possible input-output pairs as

(u, y) = (D(s)z, N(s)z).    (56.91)
Notice that D(s) + cN(s) = 1, so that

‖u + cy‖₂ = ‖z‖₂.    (56.92)
To put a lower bound on this expression in terms of ‖u‖₂ and ‖y‖₂, to show that the graph is strictly outside of the cone defined in Equation 56.88, we will need the 2-norm gains for systems modeled by the transfer functions N(s) and D(s). We will use the symbols γ_N and γ_D for these gains. The condition of the circle theorem guarantees that γ_N < r⁻¹. To see this, note that
implying

‖u + cy‖₂ ≥ (r + ε)‖y‖₂ + ε(‖u‖₂ + ‖y‖₂).    (56.97)
We conclude that the inverse graph of the linear system is strictly outside of the CONE(C, R, γ) as defined in Equation 56.88. Note, incidentally, that N(s) is the closed loop transfer function from d₁ to y₁ for the special case where the memoryless nonlinearity satisfies φ(u) = −cu. This suggests another way of determining stability: first make a preliminary loop transformation with the feedback −cu, changing the original linear system into the system with transfer function N(s) and changing the nonlinearity into a new nonlinearity φ̃ satisfying |φ̃(u, t)| ≤ r|u|. Then apply the classical small gain theorem to the resulting feedback system.
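The loop-transformation idea can be sketched numerically: form N(s) = G(s)/(1 + cG(s)) and estimate its 2-norm gain sup_ω |N(jω)| on a frequency grid. The plant G(s) = 1/(s + 1) and the value c = 1 below are illustrative assumptions of ours:

```python
import numpy as np

# Illustrative plant G(s) = 1/(s + 1) and center value c = 1.
# After the preliminary feedback -c*u, the linear part becomes
# N(s) = G(s)/(1 + c*G(s)) = 1/(s + 2).
c = 1.0
w = np.logspace(-4, 4, 4000)
G = 1.0 / (1j * w + 1.0)
N = G / (1.0 + c * G)

gain_N = np.abs(N).max()      # approximates sup_w |N(jw)| (about 0.5 here)
print(gain_N)
```

By the classical small gain theorem, the transformed loop is then finite gain stable against any memoryless nonlinearity with |φ̃(u, t)| ≤ r|u| provided r < 1/gain_N.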
EXAMPLE 56.9: Let
G(s) = 175 / ((s − 1)(s + 4)²).    (56.98)
The Nyquist plot of G(s) is shown in Figure 56.10. Because G(s) has one pole with positive real part, only the third condition of the circle theorem can apply. A disc centered at −8.1 on the real axis and with radius 2.2 can be placed inside the left loop of the Nyquist plot. Such a disc corresponds to the values c = 0.293 and r = 0.079. Because the Nyquist plot encircles this disc once in the counterclockwise direction, it follows that the standard feedback connection with the feedback components G(s) and a memoryless nonlinearity constrained to the SECTOR[−0.372, −0.214] is stable using the 2-norm.
for some strictly positive real number δ. The last inequality follows from the fact that D₂ and N₂ are finite gain stable.
EXAMPLE 56.10: (This example is drawn from the work in Potvin, M.-J., Jeswiet, J., and Piedboeuf, J.-C., Trans. NAMRI/SME, 1994, XXII, pp. 373-377.) Let Σ₁ represent the fractional Voigt-Kelvin model for the relation between stress and strain in structures displaying plasticity. For suitable values of Young's modulus, damping magnitude, and order of derivative for the strain, the transfer function of Σ₁ is
g₁(s) = 1/(1 + s^α).

Figure 56.10  The Nyquist plot for G(s) in Example 56.9.
Coprime fractions  Typical input-output stability results based on stable coprime fractions are corollaries of the conic sector theorem. For example, suppose both Σ₁ and Σ₂ are modeled by transfer functions G₁(s) and G₂(s). Moreover, assume stable (in any p-norm) transfer functions N₁, D₁, Ñ₁, D̃₁, N₂ and D₂ exist so that D₁, D₂ and D̃₁ are invertible, and
Let C(u, y) = D̃₁(s)y − Ñ₁(s)u, which is incrementally stable in any p-norm, let R(u, y) = 0, and let γ = 0. Then, the graph of Σ₁ is inside and the inverse graph of Σ₂ is strictly outside CONE(C, R, γ) and thus the feedback loop is finite gain stable in any p-norm. To verify these claims about the properties of the graphs, first recognize that the graph of Σᵢ can be represented as
where z represents any reasonable signal. Then, for signals in the graph of Σ₁,

C(u_τ, y_τ) = D̃₁(s)N₁(s)z_τ − Ñ₁(s)D₁(s)z_τ = 0.
Integral feedback control, g₂(s) = 1/s, may be used for asymptotic tracking. Here

N₁(s) = 1/(s + 1),
It can be shown that these fractions are stable linear operators, and thereby incrementally stable in the 2-norm. (This fact is equivalent to proving nominal stability and can be shown using Nyquist theory.) Moreover, it is easy to see that D₁D₂ − N₁N₂ = 1, so that the feedback loop is stable and finite gain stable.  Robustness of stability and the gap metric  It is clear from the original graph separation theorem that, if a well-defined interconnection is stable, i.e., the appropriate graphs are separated in distance, then modifications of the feedback components will not destroy stability if the modified graphs are close to the original graphs. Given two systems Σ₁ and Σ, define δ⃗(Σ₁, Σ) = α if α is the smallest number for which
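An identity of the form D₁D₂ − N₁N₂ = 1 can be checked symbolically. The sketch below uses an illustrative plant and controller of our own choosing (a rational plant, not the fractional example above):

```python
import sympy as sp

s = sp.symbols('s')
# Illustrative plant G1(s) = 1/(s - 1) written with stable fractions:
# N1 = 1/(s + 1), D1 = (s - 1)/(s + 1); static controller G2 = -2,
# i.e. N2 = -2, D2 = 1 (all choices are ours, for illustration).
N1 = 1 / (s + 1)
D1 = (s - 1) / (s + 1)
N2, D2 = sp.Integer(-2), sp.Integer(1)

# Bezout-type identity: D1*D2 - N1*N2 simplifies to 1.
bezout = sp.simplify(D1 * D2 - N1 * N2)
print(bezout)
```

Since the identity holds with stable fractions, the corresponding feedback loop is finite gain stable by the coprime-fraction argument above.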
The quantity δ⃗(·, ·) is called the "directed gap" between the two systems and characterizes basic neighborhoods where stability as well as closed-loop properties are preserved under small perturbations from the nominal system Σ₁ to a nearby system Σ. More specifically, if the interconnection of (Σ₁, Σ₂) is finite gain stable, we define the gain β_{Σ₁,Σ₂} as the smallest real number so that
(56.101)
Conversely, for signals in the inverse graph of Σ₂,

If Σ is such that
then the interconnection of (Σ, Σ₂) is also finite gain stable. As a special case, let Σ, Σ₁, Σ₂ represent linear systems acting on finite energy signals. Further, assume that stable transfer functions N, D exist where D is invertible, G₁ = ND⁻¹, and N and
D are normalized so that Dᵀ(−s)D(s) + Nᵀ(−s)N(s) = Id. Then, the class of systems in a ball with radius γ > 0, measured in the directed gap, is given by CONE(C, R, γ), where R = Id and
where Π₊ designates the truncation of the Laplace transform of finite energy signals to the part with poles in the left half plane. At the same time, if β_{Σ₁,Σ₂} < 1/γ, then it can be shown that Σ₂ is strictly outside the cone CONE(C, R, γ) and, therefore, stability of the interconnection of Σ with Σ₂ is guaranteed for any Σ inside CONE(C, R, γ). Given Σ and Σ₁, the computation of the directed gap reduces to a standard H∞-optimization problem (see [1]). Also, given Σ₁, the computation of a controller Σ₂, which stabilizes a maximal cone around Σ₁, reduces to a standard H∞-optimization problem ([1]) and forms the basis of the H∞ loop-shaping procedure for linear systems introduced in [4]. A second key result which prompted introducing the gap metric is the claim that the behavior of the feedback interconnection of Σ and Σ₂ is "similar" to that of the interconnection of Σ₁ and Σ₂ if, and only if, the distance between Σ and Σ₁, measured using the gap metric, is small (i.e., Σ lies within a "small aperture" cone around Σ₁). The "gap" function is defined as
to "symmetrize" the distance function δ⃗(·, ·) with respect to the order of the arguments. Then, the above claim can be stated more precisely as follows: for each ε > 0, a ζ(ε) > 0 exists so that

δ(Σ₁, Σ) < ζ(ε)  ⟹  ‖x − x₁‖_τ ≤ ε‖d‖_τ,

where d is an arbitrary signal in the ambient space and x (resp. x₁) represents the response of the feedback interconnection of (Σ, Σ₂) (resp. (Σ₁, Σ₂)). Conversely, if ‖x − x₁‖_τ < ε‖d‖_τ for all d and τ, then δ(Σ₁, Σ) ≤ ε.
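The normalization condition quoted above can be verified symbolically for a simple scalar plant. The plant G(s) = 1/(s + 1) and the factor denominator s + √2 below are illustrative choices of ours:

```python
import sympy as sp

s = sp.symbols('s')
a = sp.sqrt(2)
# Candidate normalized coprime factors of G(s) = 1/(s + 1) (our choice):
# N(s) = 1/(s + a), D(s) = (s + 1)/(s + a), with a = sqrt(2).
N = 1 / (s + a)
D = (s + 1) / (s + a)

# Scalar normalization condition: D(-s)D(s) + N(-s)N(s) = 1.
lhs = sp.simplify(D.subs(s, -s) * D + N.subs(s, -s) * N)
print(lhs)
```

The choice a = √2 is what makes the identity hold: D(−s)D(s) + N(−s)N(s) = (2 − s²)/(a² − s²), which equals 1 exactly when a² = 2.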
each input-output pair, the norm of the output is bounded by the norm of the input times the constant.
Gain function: a function from the nonnegative real numbers to the nonnegative real numbers which is continuous, nondecreasing and zero when its argument is zero; used to characterize stability; see Section 56.2.2; some form of the symbol γ is usually used.
Graph (of a dynamical system): the set of ordered input-output pairs (u, y).
Inner product: defined for signals of the same dimension defined on the semi-infinite interval; the integral from zero to infinity of the component-wise product of the two signals.
Inside (or strictly outside) CONE(C, R, γ): used to characterize the graph or inverse graph of a system; determined by whether or not signals in the graph or inverse graph satisfy certain inequalities involving the operators C and R and the gain function γ; see Equations 56.85 and 56.86; used in the conic sector theorem; see Section 56.8.
Inverse graph (of a dynamical system): the set of ordered output-input pairs (y, u).
Norm function (‖·‖): used to measure the size of signals defined on the semi-infinite interval; examples are the p-norms, p ∈ [1, ∞] (see Equations 56.13 and 56.14).
Parseval's theorem: used to make connections between properties of graphs for SISO systems modeled with real, rational transfer functions and frequency domain characteristics of their transfer functions; Parseval's theorem relates the inner product of signals to their Fourier transforms if they exist. For example, it states that, if two scalar signals u and y, assumed to be zero for negative values of time, have Fourier transforms û(jω) and ŷ(jω) then
Defining Terms
Ambient space: the Cartesian product space containing the inverse graph of Σ₂ and the graph of Σ₁.
Distance (from a signal to a set): measured using a norm function; the infimum, over all signals in the set, of the norm of the difference between the signal and a signal in the set; see Equation 56.16; used to characterize necessary and sufficient conditions for input-output stability; see Section 56.2.2.
Dynamical system: an object which produces an output signal for each input signal.
Feedback components: the dynamical systems which make up a well-defined interconnection.
Finite gain stable system: a dynamical system is finite gain stable if a nonnegative constant exists so that, for
Passive: terminology resulting from electrical network theory; a dynamical system is passive if the inner product of each input-output pair is nonnegative.
Proper separation function: a function from the nonnegative real numbers to the nonnegative real numbers which is continuous, zero at zero, strictly increasing and unbounded; such functions are invertible on the nonnegative real numbers; used to characterize nonlinear separation theorems; see Section 56.6; some form of the symbol ρ is usually used.
Semi-infinite interval: the time interval [0, ∞).
Signal: a "reasonable" vector-valued function defined on a finite or semi-infinite time interval; by "reasonable"
we mean piecewise continuous or measurable.
SISO systems: an abbreviation for single input, single output systems.
Stable system: a dynamical system is stable if a gain function exists so that, for each input-output pair, the norm of the output is bounded by the gain function evaluated at the norm of the input.
Strict contraction: the composition of two gain functions γ₁ and γ₂ is a strict contraction if a proper separation function ρ exists so that (γ₁ + ρ) ∘ (γ₂ + ρ) ≤ Id, where Id(ξ) = ξ and γ₁ ∘ γ₂(ξ) = γ₁(γ₂(ξ)). Graphically, this is equivalent to the curve (ξ, γ₁(ξ) + ρ(ξ)) never being above the curve (γ₂(ξ) + ρ(ξ), ξ). This concept is used to state the nonlinear small gain theorem. See Section 56.7.
Strictly passive: We have used various notions of strictly passive including input-, output-, input and output-, and nonlinear input and output-strictly passive. All notions strengthen the requirement that the inner product of the input-output pairs be positive by requiring a positive lower bound that depends on the 2-norm of the input and/or output. See Section 56.5.
Truncated signal: a signal defined on the semi-infinite interval which is derived from another signal (not necessarily defined on the semi-infinite interval) by first appending zeros to extend the signal onto the semi-infinite interval and then keeping the first part of the signal and setting the rest of the signal to zero. Used to measure the size of finite portions of signals.
Well-defined interconnection: an interconnection of two dynamical systems in the configuration of Figure 56.2 which results in another dynamical system, i.e., one in which an output signal is produced for each input signal.
References
[1] Georgiou, T.T. and Smith, M.C., Optimal robustness in the gap metric, IEEE Trans. Auto. Control, 35, 673-686, 1990.
[2] Jiang, Z.P., Teel, A.R., and Praly, L., Small-gain theorem for ISS systems and applications, Math. Control, Sign., Syst., 7(2), 95-120, 1995.
[3] Mareels, I.M.Y. and Hill, D.J., Monotone stability of nonlinear feedback systems, J. Math. Syst., Est. Control, 2(3), 275-291, 1992.
[4] McFarlane, D.C. and Glover, K., Robust controller design using normalized coprime factor plant descriptions, Lecture Notes in Control and Information Sciences, Springer-Verlag, vol. 138, 1989.
[5] Popov, V.M., Absolute stability of nonlinear systems of automatic control, Auto. Remote Control, 22, 857-875, 1961.
[6] Safonov, M., Stability and Robustness of Multivariable Feedback Systems, The MIT Press, Cambridge, MA, 1980.
[7] Sandberg, I.W., On the L2-boundedness of solutions of nonlinear functional equations, Bell Sys. Tech. J., 43, 1581-1599, 1964.
[8] Sontag, E., Smooth stabilization implies coprime factorization, IEEE Trans. Auto. Control, 34, 435-443, 1989.
[9] Zames, G., On the input-output stability of time-varying nonlinear feedback systems, Part I: Conditions using concepts of loop gain, conicity, and positivity, IEEE Trans. Auto. Control, 11, 228-238, 1966.
[10] Zames, G., On the input-output stability of time-varying nonlinear feedback systems, Part II: Conditions involving circles in the frequency plane and sector nonlinearities, IEEE Trans. Auto. Control, 11, 465-476, 1966.
Further Reading
As mentioned at the outset, the material presented in this chapter is based on the results in [6], [7], [9], and [10]. In the latter, a more general feedback interconnection structure is considered where nonzero initial conditions can also be considered as inputs.
Other excellent references on input-output stability include The Analysis of Feedback Systems, 1971, by J.C. Willems and Feedback Systems: Input-Output Properties, 1975, by C. Desoer and M. Vidyasagar. A nice text addressing the factorization method in linear systems control design is Control Systems Synthesis: A Factorization Approach, 1985, by M. Vidyasagar. A treatment of input-output stability for linear, infinite dimensional systems can be found in Chapter 6 of Nonlinear Systems Analysis, 1993, by M. Vidyasagar. That chapter also discusses many of the connections between input-output stability and state-space (Lyapunov) stability. Another excellent reference is Nonlinear Systems, 1992, by H. Khalil. There are results similar to the circle theorem that we have not discussed. They go under the heading of "multiplier" results and apply to feedback loops with a linear element and a memoryless, nonlinear element with extra restrictions such as time invariance and constrained slope. Special cases are the well-known Popov and off-axis circle criterion. Some of these results can be recovered using the general conic sector theorem although we have not taken the time to do this. Other results, like the Popov criterion, impose extra smoothness conditions on the external inputs which are not found in the standard problem. References for these problems are Hyperstability of Control Systems, 1973, V.M. Popov, the English translation of a book originally published in 1966, and Frequency Domain Criteria for Absolute Stability, 1973, by K.S. Narendra and J.H. Taylor. Another topic closely related to these multiplier results is the structured small gain theorem for linear systems which
leads to much of the μ-synthesis control design methodology. This is described, for example, in μ-Analysis and Synthesis Toolbox, 1991, by G. Balas, J. Doyle, K. Glover, A. Packard and R. Smith. There are many advanced topics concerning input-output stability that we have not addressed. These include the study of small-signal stability, well-posedness of feedback loops, and control design based on input-output stability principles. Many articles on these topics frequently appear in control and systems theory journals such as IEEE Transactions on Automatic Control, Automatica, International Journal of Control, Systems and Control Letters, and Mathematics of Control, Signals, and Systems, to name a few.
Design Methods

Alberto Isidori
Maria Domenica Di Benedetto
Christopher I. Byrnes
Randy A. Freeman
Petar V. Kokotović
R. A. DeCarlo
S. H. Żak
S. V. Drakunov
Eyad H. Abed
Hua O. Wang
Alberto Tesi
J. Baillieul
B. Lehman
Miroslav Krstić
Kevin M. Passino
Stephen Yurkovich
Jay A. Farrell
57.1 Feedback Linearization of Nonlinear Systems ..................... 909
57.2 Nonlinear Zero Dynamics ........................................... 917
57.3 Nonlinear Output Regulation ....................................... 922
57.4 Lyapunov Design ................................................... 932
57.5 Variable Structure, Sliding-Mode Controller Design ............... 941
57.6 Control of Bifurcations and Chaos ................................. 951
57.7 Open-Loop Control Using Oscillatory Inputs ....................... 967
57.8 Adaptive Nonlinear Control ........................................ 980
57.9 Intelligent Control ................................................ 994
57.10 Fuzzy Control .................................................... 1001
57.11 Neural Control ................................................... 1017
57.1 Feedback Linearization of Nonlinear Systems
Alberto Isidori, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Rome, and Department of Systems Sciences and Mathematics, Washington University, St. Louis, MO
Maria Domenica Di Benedetto, Dipartimento di Ingegneria Elettrica, Università de L'Aquila, Monteluco di Roio (L'Aquila)
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
57.1.1 The Problem of Feedback Linearization

A basic problem in control theory is how to use feedback in order to modify the original internal dynamics of a controlled plant so as to achieve some prescribed behavior. In particular, feedback may be used for the purpose of imposing, on the associated closed-loop system, the (unforced) behavior of some prescribed autonomous linear system. When the plant is modeled as a linear time-invariant system, this is known as the problem of pole placement, while, in the more general case of a nonlinear model, this is known as the problem of feedback linearization (see [1], [2], [3], [4]). The purpose of this chapter is to present some of the basic features of the theory of feedback linearization. Consider a plant modeled by nonlinear differential equations
of the form

ẋ = f(x) + g(x)u    (57.1)
y = h(x)    (57.2)

having internal state x = (x1, x2, ..., xn) ∈ Rⁿ, control input u ∈ R, and measured output y ∈ R. The functions f(x), g(x), and h(x) are nonlinear functions of their arguments that are assumed to be differentiable a sufficient number of times. Changes in the description and in the behavior of this system will be investigated under two types of transformations: (1) changes of coordinates in the state space and (2) static state feedback control laws, i.e., memoryless state feedback laws.

In the case of a linear system,

ẋ = Ax + Bu    (57.3)
y = Cx    (57.4)

a static state feedback control law takes the form

u = Fx + Gv

in which v represents a new control input and F and G are matrices of appropriate dimensions. Moreover, only linear changes of coordinates are usually considered. This corresponds to the substitution of the original state vector x with a new vector z related to x by a transformation of the form

z = Tx

where T is a nonsingular matrix. Accordingly, the original description of the system of Equations 57.3 and 57.4 is replaced by a new description

ż = Āz + B̄u
y = C̄z

in which Ā = TAT⁻¹, B̄ = TB, C̄ = CT⁻¹.

In the case of a nonlinear system, a static state feedback control law is a control law of the form

u = α(x) + β(x)v    (57.6)

where v represents a new control input and β(x) is assumed to be nonzero for all x. Moreover, nonlinear changes of coordinates are considered, i.e., transformations of the form

z = Φ(x)    (57.7)

where z is the new state vector and Φ(x) represents an (n-vector-valued) function of n variables with the following properties:

1. Φ(x) is invertible; i.e., there exists a function Φ⁻¹(z) such that Φ⁻¹(Φ(x)) = x and Φ(Φ⁻¹(z)) = z for all x ∈ Rⁿ and all z ∈ Rⁿ.
2. Φ(x) and Φ⁻¹(z) are both smooth mappings; i.e., continuous partial derivatives of any order exist for both mappings.

A transformation of this type is called a global diffeomorphism. The first property is needed to guarantee the invertibility of the transformation, yielding the original state vector as

x = Φ⁻¹(z)

while the second one guarantees that the description of the system in the new coordinates is still a smooth one. Sometimes a transformation possessing both these properties and defined for all x is hard to find, and the properties in question are difficult to check. Thus, in most cases, transformations defined only in a neighborhood of a given point are of interest. A transformation of this type is called a local diffeomorphism. To check whether a given transformation is a local diffeomorphism, the following result is very useful.

PROPOSITION 57.1 Suppose Φ(x) is a smooth function defined on some subset U of Rⁿ. Suppose the Jacobian matrix ∂Φ/∂x is nonsingular at a point x = x⁰. Then, for some suitable open subset U⁰ of U containing x⁰, Φ(x) defines a local diffeomorphism between U⁰ and its image Φ(U⁰).

The effect of a change of coordinates on the description of a nonlinear system can be analyzed as follows. Set

z(t) = Φ(x(t))

and differentiate both sides with respect to time to yield

ż(t) = (∂Φ/∂x)(f(x(t)) + g(x(t))u(t)).

Then, expressing x(t) as Φ⁻¹(z(t)), one obtains

ż(t) = f̄(z(t)) + ḡ(z(t))u(t)

where

f̄(z) = [(∂Φ/∂x) f(x)]_{x=Φ⁻¹(z)},    ḡ(z) = [(∂Φ/∂x) g(x)]_{x=Φ⁻¹(z)}.

The latter are the formulas relating the new description of the system to the original one.

Given the nonlinear system of Equation 57.1, the problem of feedback linearization consists of finding, if possible, a change of coordinates of the form of Equation 57.7 and a static state feedback of the form of Equation 57.6 such that the composed dynamics of Equations 57.1 and 57.6, namely the system

ẋ = f(x) + g(x)α(x) + g(x)β(x)v,

expressed in the new coordinates z, is the linear and controllable system

ż1 = z2
ż2 = z3
...
ż_{n−1} = zn
żn = v.
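The Jacobian test of Proposition 57.1 is straightforward to carry out numerically. The sketch below is our own illustration, not an example from the text: the map Φ is made up. It approximates ∂Φ/∂x by central differences and checks that the determinant is nonzero at the origin, so Φ qualifies as a local diffeomorphism there.

```python
import math

def jacobian(phi, x, eps=1e-6):
    """Approximate the Jacobian matrix of phi at x by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = phi(xp), phi(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def det2(J):
    """Determinant of a 2x2 matrix."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]

# A made-up candidate change of coordinates:
#   z1 = x1 + x2**2,  z2 = exp(x2) - 1
phi = lambda x: [x[0] + x[1] ** 2, math.exp(x[1]) - 1.0]

J = jacobian(phi, [0.0, 0.0])
print(det2(J))  # nonzero => phi is a local diffeomorphism near the origin
```

Here the Jacobian at the origin is the identity, so the determinant is 1 and the test passes; at a point where the determinant vanished, only a local analysis around a different point would be possible.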
57.1.2 Normal Forms of Single-Input Single-Output Systems
Single-input single-output nonlinear systems can be locally given, by means of a suitable change of coordinates in the state space, a "normal form" of special interest, in which several important properties can be put in evidence and which is useful in solving the problem of feedback linearization. In this section, methods for obtaining this normal form are presented.

Given a real-valued function of x = (x1, ..., xn)

h(x) = h(x1, ..., xn)

and an (n-vector)-valued function of x

f(x) = (f1(x), ..., fn(x)),

we define a new real-valued function of x, denoted L_f h(x), in the following way

L_f h(x) = (∂h/∂x) f(x).

The new function L_f h(x) thus defined is sometimes called the derivative of h(x) along f(x). Repeated use of this operation is possible. Thus, for instance, by differentiating h(x) first along f(x) and then along g(x), we may construct the function

L_g L_f h(x) = (∂(L_f h)/∂x) g(x)

or, differentiating k times h(x) along f(x), we may construct a function recursively defined as

L_f^k h(x) = (∂(L_f^{k−1} h)/∂x) f(x).

With the help of this operation, we introduce the notion of relative degree of a system.

DEFINITION 57.1 The single-input single-output nonlinear system

ẋ = f(x) + g(x)u
y = h(x)

has relative degree r at x⁰ if:

1. L_g L_f^k h(x) = 0 for all x in a neighborhood of x⁰ and all k < r − 1;
2. L_g L_f^{r−1} h(x⁰) ≠ 0.

EXAMPLE 57.1: Consider the system of Equations 57.9 and 57.10. For this system we have L_g h(x) = 0 and L_g L_f h(x) ≠ 0. Thus, the system has relative degree 2 at any point x⁰. However, if the output function were changed, for instance, to one for which L_g h(x) = 1, the system would have relative degree 1 at any point x⁰.

The notion of relative degree has the following interesting interpretation. Suppose the system at some time t⁰ is in the state x(t⁰) = x⁰. Calculate the value of y(t), the output of the system, and of its derivatives with respect to time, y^(k)(t), for k = 1, 2, ..., at t = t⁰, to obtain

y^(1)(t) = (∂h/∂x)(f(x(t)) + g(x(t))u(t)) = L_f h(x(t)) + L_g h(x(t))u(t).

If the relative degree r is larger than 1, for all t such that x(t) is near x⁰, i.e., for all t near t = t⁰, we have L_g h(x(t)) = 0 and therefore y^(1)(t) = L_f h(x(t)).
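The Lie derivatives above lend themselves to a direct numerical check. The sketch below is our own illustration with a made-up pendulum-like system, f = (x2, −sin x1), g = (0, 1), h = x1, not one of the systems in the text. It approximates L_g h and L_g L_f h by nested finite differences; L_g h ≈ 0 together with L_g L_f h ≈ 1 indicates relative degree 2 at the point tested.

```python
import math

def lie(F, h, x, eps=1e-5):
    """Directional derivative of scalar h along vector field F at x: (dh/dx) F(x)."""
    n = len(x)
    grad = []
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((h(xp) - h(xm)) / (2 * eps))
    return sum(gi * Fi for gi, Fi in zip(grad, F(x)))

# Hypothetical example system (not from the text): a pendulum-like model
f = lambda x: [x[1], -math.sin(x[0])]
g = lambda x: [0.0, 1.0]
h = lambda x: x[0]

x0 = [0.3, -0.2]
Lgh = lie(g, h, x0)                          # should vanish
LgLfh = lie(g, lambda x: lie(f, h, x), x0)   # should be nonzero
print(Lgh, LgLfh)  # approx 0 and 1 -> relative degree 2
```

The same nested construction extends to L_g L_f^k h for higher k, so the relative degree can be probed point by point before any symbolic work is attempted.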
Again, if the relative degree is larger than 2, for all t near t = t⁰ we have L_g L_f h(x(t)) = 0 and

y^(2)(t) = L_f² h(x(t)).

Continuing in this way, we get

y^(k)(t) = L_f^k h(x(t)) for all k < r and all t near t = t⁰,

and

y^(r)(t⁰) = L_f^r h(x⁰) + L_g L_f^{r−1} h(x⁰) u(t⁰).

Thus, r is exactly the number of times y(t) has to be differentiated at t = t⁰ for u(t⁰) to appear explicitly.

The above calculations suggest that the functions h(x), L_f h(x), ..., L_f^{r−1} h(x) have a special importance. As a matter of fact, it is possible to show that they can be used to define, at least partially, a local coordinate transformation near x⁰, where x⁰ is a point such that L_g L_f^{r−1} h(x⁰) ≠ 0. This is formally expressed in the following statement.

PROPOSITION 57.2 Let the system of Equations 57.1 and 57.2 be given and let r be its relative degree at x = x⁰. Then r ≤ n. Set

φ1(x) = h(x)
φ2(x) = L_f h(x)
...
φr(x) = L_f^{r−1} h(x).

If r is strictly less than n, it is always possible to find n − r additional functions φ_{r+1}(x), ..., φn(x) such that the mapping

Φ(x) = (φ1(x), ..., φn(x))

has a Jacobian matrix that is nonsingular at x⁰ and therefore qualifies as a local coordinates transformation in a neighborhood of x⁰. The value at x⁰ of these additional functions can be fixed arbitrarily. Moreover, it is always possible to choose φ_{r+1}(x), ..., φn(x) in such a way that

L_g φ_i(x) = 0 for all r + 1 ≤ i ≤ n and all x around x⁰.

The description of the system in the new coordinates z_i = φ_i(x), 1 ≤ i ≤ n, can be derived very easily. From the previous calculations, we obtain for z1, ..., z_{r−1}

dz_i/dt = (∂φ_i/∂x)(dx/dt) = L_f^i h(x(t)) = φ_{i+1}(x(t)) = z_{i+1}(t);

in particular,

dz_{r−1}/dt = (∂φ_{r−1}/∂x)(dx/dt) = L_f^{r−1} h(x(t)) = φr(x(t)) = zr(t).

For zr we obtain

dzr/dt = L_f^r h(x(t)) + L_g L_f^{r−1} h(x(t)) u(t).

On the right-hand side of this equation we must now replace x(t) by its expression as a function of z(t), i.e., x(t) = Φ⁻¹(z(t)). Thus, set

a(z) = L_g L_f^{r−1} h(Φ⁻¹(z))
b(z) = L_f^r h(Φ⁻¹(z))

to obtain

dzr/dt = b(z(t)) + a(z(t)) u(t).

As far as the remaining coordinates are concerned, we cannot expect any special structure for the corresponding equations. If φ_{r+1}(x), ..., φn(x) have been chosen so that L_g φ_i(x) = 0, then

dz_i/dt = L_f φ_i(x(t)), r + 1 ≤ i ≤ n.

Setting q_i(z) = L_f φ_i(Φ⁻¹(z)) for r + 1 ≤ i ≤ n, the latter can be rewritten as

dz_i/dt = q_i(z(t)).

Thus, in summary, the state space description of the system in the new coordinates is as follows:

ż1 = z2
...
ż_{r−1} = zr
żr = b(z) + a(z)u
ż_{r+1} = q_{r+1}(z)
...
żn = qn(z).

In addition to these equations, one has to specify how the output of the system is related to the new state variables. Since y = h(x), it is immediately seen that

y = z1.

The equations thus defined are said to be in normal form. Note that at the point z⁰ = Φ(x⁰), a(z⁰) ≠ 0 by definition. Thus, the coefficient a(z) is nonzero for all z in a neighborhood of z⁰.
EXAMPLE 57.2: Consider the system of Equations 57.9 and 57.10. In order to find the normal form, we set

z1 = φ1(x) = h(x) = x3
z2 = φ2(x) = L_f h(x) = −x2.

We now have to find a function φ3(x) that completes the coordinate transformation and is such that L_g φ3(x) = 0. The function

φ3(x) = 1 + x1 − exp(x2)

satisfies the equation above. The transformation z = Φ(x) defined by the functions φ1(x), φ2(x), and φ3(x) has a Jacobian matrix which is nonsingular for all x, and Φ(0) = 0. Hence, z = Φ(x) defines a global change of coordinates. The inverse transformation x = Φ⁻¹(z) is readily computed, and in the new coordinates the system is described by equations in normal form. These equations are globally valid because the coordinate transformation we considered is global.
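The defining property L_g φ3 = 0 is easy to test numerically. In the sketch below we reuse the function φ3(x) = 1 + x1 − exp(x2) from the example, but pair it with a made-up input vector field g(x) = (exp(x2), 1, 0) chosen by us to satisfy the condition; the g of Equations 57.9 and 57.10 is not reproduced here. The finite-difference gradient of φ3 turns out to be orthogonal to g at every sample point.

```python
import math

def grad(h, x, eps=1e-6):
    """Finite-difference gradient of a scalar function h at x."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        out.append((h(xp) - h(xm)) / (2 * eps))
    return out

phi3 = lambda x: 1.0 + x[0] - math.exp(x[1])

# Hypothetical input vector field chosen so that Lg(phi3) = 0:
g = lambda x: [math.exp(x[1]), 1.0, 0.0]

for x in ([0.0, 0.0, 0.0], [0.5, -1.0, 2.0], [-2.0, 0.7, 0.1]):
    Lg_phi3 = sum(a * b for a, b in zip(grad(phi3, x), g(x)))
    print(abs(Lg_phi3) < 1e-6)  # True at every test point
```

In practice one would solve the partial differential equation (∂φ3/∂x) g(x) = 0 symbolically; a numeric check like this is only a sanity test of a candidate solution.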
57.1.3 Conditions for Exact Linearization via Feedback

In this section, conditions and constructive procedures are given for a single-input single-output nonlinear system to be transformed into a linear and controllable system via change of coordinates in the state space and static state feedback. The discussion is based on the normal form developed in the previous section.

Consider a nonlinear system having, at some point x = x⁰, relative degree equal to the dimension of the state space, i.e., r = n. In this case, the change of coordinates required to construct the normal form is given by the function h(x) and its first n − 1 derivatives along f(x), i.e.,

Φ(x) = ( h(x), L_f h(x), ..., L_f^{n−1} h(x) ).    (57.11)

No additional functions are needed to complete the transformation. In the new coordinates

z_i = φ_i(x) = L_f^{i−1} h(x), 1 ≤ i ≤ n,

the system is described by equations of the form

ż1 = z2
...
ż_{n−1} = zn
żn = b(z) + a(z)u

where z = (z1, z2, ..., zn). Recall that at the point z⁰ = Φ(x⁰), and hence at all z in a neighborhood of z⁰, the function a(z) is nonzero. Choose now the following state feedback control law

u = (1/a(z)) (−b(z) + v)    (57.12)

which indeed exists and is well defined in a neighborhood of z⁰. The resulting closed-loop system is governed by the equations

ż1 = z2
...
ż_{n−1} = zn
żn = v,

i.e., it is linear and controllable. Thus we conclude that any nonlinear system with relative degree n at some point x⁰ can be transformed into a system that is linear and controllable by means of (1) a local change of coordinates and (2) a local static state feedback.

The two transformations used in order to obtain the linear form can be interchanged: one can first apply a feedback and then change the coordinates in the state space, without altering the result. The feedback needed to achieve this purpose is exactly the feedback of Equation 57.12, but now expressed in the x coordinates. Comparing this with the expressions for a(z) and b(z) given in the previous section, one immediately realizes that this feedback, expressed in terms of the functions f(x), g(x), h(x) which characterize the original system, has the form

u = (1 / (L_g L_f^{n−1} h(x))) (−L_f^n h(x) + v).    (57.18)

An easy calculation shows that the feedback of Equation 57.18, together with the same change of coordinates used so far (Equation 57.11), exactly yields the same linear and controllable system.

If x⁰ is an equilibrium point for the original nonlinear system, i.e., if f(x⁰) = 0, and if also h(x⁰) = 0, then

φ1(x⁰) = h(x⁰) = 0

and

φ_i(x⁰) = L_f^{i−1} h(x⁰) = 0 for all 2 ≤ i ≤ n,

so that z⁰ = Φ(x⁰) = 0. Note that a condition like h(x⁰) = 0 can always be satisfied by means of a suitable translation of the origin of the output space. Thus, we conclude that, if x⁰ is an equilibrium point for the original system, and this system has relative degree n at x⁰, there is a feedback control law (defined in a neighborhood of x⁰) and a coordinate transformation (also defined in a neighborhood of x⁰) that change the system into a linear and controllable one, defined in a neighborhood of the origin.

New feedback controls can be imposed on the linear system thus obtained; for example,

v = Kz,

where K = (k1 k2 ... kn) can be chosen to meet some given control specifications, e.g., to assign a specific set of eigenvalues or to satisfy an optimality criterion. Recalling the expression of the z_i's as functions of x, the feedback in question can be rewritten as

v = k1 h(x) + k2 L_f h(x) + ... + kn L_f^{n−1} h(x),

i.e., in the form of a nonlinear feedback from the state x of the original description of the system.

Up to this point of the presentation, the existence of an "output" function h(x) relative to which the system of Equations 57.1 and 57.2 has relative degree exactly equal to n (at x⁰) has been key in making it possible to transform the system into a linear and controllable one. Now, if such a function h(x) is not available beforehand, either because the actual output of the system does not satisfy the conditions required to have relative degree n or simply because no specific output is defined for the given system, the question arises whether it is possible to find an appropriate h(x) that makes this linearization possible. This question is answered in the remaining part of this section.

Clearly, the problem consists of finding a function h(x) satisfying the conditions

L_g h(x) = L_g L_f h(x) = ... = L_g L_f^{n−2} h(x) = 0

for all x near x⁰, and

L_g L_f^{n−1} h(x⁰) ≠ 0.
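The feedback of Equation 57.18 is easy to demonstrate in simulation. The sketch below is our own toy example, not from the text: the pendulum-like system ẋ1 = x2, ẋ2 = −sin x1 + u with h = x1, for which n = r = 2, L_f²h = −sin x1 and L_g L_f h = 1. Equation 57.18 then reads u = sin x1 + v, and with the additional linear feedback v = −2z1 − 2z2 the linearized double integrator is stabilized; under forward-Euler integration the state decays toward the origin.

```python
import math

# Toy system (hypothetical, n = r = 2): x1' = x2, x2' = -sin(x1) + u, y = x1.
# Here Lf^2 h = -sin(x1) and Lg Lf h = 1, so Equation 57.18 gives
#   u = sin(x1) + v,  with linearizing coordinates z1 = x1, z2 = x2.
def simulate(x, T=10.0, dt=1e-3):
    for _ in range(int(T / dt)):
        z1, z2 = x[0], x[1]           # linearizing coordinates
        v = -2.0 * z1 - 2.0 * z2      # pole placement on the double integrator
        u = math.sin(x[0]) + v        # cancels the nonlinearity
        x = [x[0] + dt * x[1], x[1] + dt * (-math.sin(x[0]) + u)]
    return x

x_final = simulate([1.0, 0.0])
print(x_final)  # close to [0, 0]
```

The closed loop in z coordinates is exactly z̈ = −2z − 2ż, a stable linear system, regardless of the initial condition within the domain of validity of the feedback.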
We shall see that these conditions can be transformed into a partial differential equation for h(x), for which conditions for the existence of solutions, as well as constructive integration procedures, are well known. In order to express this, we need to introduce another type of differential operation. Given two (n-vector)-valued functions of x = (x1, ..., xn), f(x) and g(x), we define a new (n-vector)-valued function of x, denoted [f, g](x), in the following way

[f, g](x) = (∂g/∂x) f(x) − (∂f/∂x) g(x)

where ∂g/∂x and ∂f/∂x are the Jacobian matrices of g(x) and f(x), respectively. The new function thus defined is called the Lie product or Lie bracket of f(x) and g(x). The Lie product can be used repeatedly. Whenever a function g(x) is "Lie-multiplied" several times by a function f(x), the following notation is frequently used:

ad_f g(x) = [f, g](x)
ad_f² g(x) = [f, [f, g]](x)
...
ad_f^k g(x) = [f, ad_f^{k−1} g](x).

We shall see now that the conditions a function h(x) must obey in order to be eligible as "output" of a system with relative degree n can be re-expressed in a form involving the gradient of h(x) and a certain number of the repeated Lie products of f(x) and g(x). For, note that, since

L_{[f,g]} h(x) = L_f L_g h(x) − L_g L_f h(x),

if L_g h(x) = 0, the two conditions L_{[f,g]} h(x) = 0 and L_g L_f h(x) = 0 are equivalent. Using this property repeatedly, one can conclude that a system has relative degree n at x⁰ if and only if

L_g h(x) = L_{ad_f g} h(x) = ... = L_{ad_f^{n−2} g} h(x) = 0

for all x near x⁰, and

L_{ad_f^{n−1} g} h(x⁰) ≠ 0.

Keeping in mind the definition of the derivative of h(x) along a given (n-vector)-valued function, the first set of conditions can be rewritten in the following form

(∂h/∂x) ( g(x)  ad_f g(x)  ...  ad_f^{n−2} g(x) ) = 0.    (57.19)

This partial differential equation for h(x) has important properties. Indeed, if a function h(x) exists such that Equation 57.19 holds for all x near x⁰, and

L_{ad_f^{n−1} g} h(x⁰) ≠ 0,

then necessarily the n vectors

g(x⁰), ad_f g(x⁰), ..., ad_f^{n−1} g(x⁰)

must be linearly independent. So, in particular, the matrix

( g(x)  ad_f g(x)  ...  ad_f^{n−2} g(x) )

has rank n − 1. The conditions for the existence of solutions to a partial differential equation of the form of Equation 57.19, where the matrix has full rank, are given by the well-known Frobenius theorem.

THEOREM 57.1 Consider a partial differential equation of the form

(∂h/∂x) ( X1(x)  X2(x)  ...  Xk(x) ) = 0,

in which X1(x), ..., Xk(x) are (n-vector)-valued functions of x. Suppose the matrix

( X1(x)  ...  Xk(x) )

has rank k at the point x = x⁰. There exist n − k real-valued functions of x, say h1(x), ..., h_{n−k}(x), defined in a neighborhood of x⁰, that are solutions of the given partial differential equation and are such that the Jacobian matrix of ( h1(x), ..., h_{n−k}(x) ) has rank n − k at x = x⁰, if and only if, for each pair of integers (i, j), 1 ≤ i, j ≤ k, the matrix

( X1(x)  ...  Xk(x)  [X_i, X_j](x) )

has rank k for all x in a neighborhood of x⁰.

REMARK 57.1 A set of k (n-vector)-valued functions {X1(x), ..., Xk(x)}, such that the matrix ( X1(x) ... Xk(x) ) has rank k at the point x = x⁰, is said to be involutive near x⁰ if, for each pair of integers (i, j), 1 ≤ i, j ≤ k, the matrix ( X1(x) ... Xk(x) [X_i, X_j](x) ) still has rank k for all x in a neighborhood of x⁰. Using this terminology, the necessary and sufficient condition indicated in the previous theorem can be simply referred to as the involutivity of the set {X1(x), ..., Xk(x)}.

The arguments developed thus far can be summarized formally as follows.

PROPOSITION 57.3 Consider a system

ẋ = f(x) + g(x)u.

There exists an "output" function h(x) for which the system has relative degree n at a point x⁰ if and only if the following conditions are satisfied:

1. The matrix ( g(x⁰)  ad_f g(x⁰)  ...  ad_f^{n−1} g(x⁰) ) has rank n.
2. The set {g(x), ad_f g(x), ..., ad_f^{n−2} g(x)} is involutive near x⁰.

In view of the results illustrated at the beginning of the section, it is now possible to conclude that conditions 1 and 2 listed in this statement are necessary and sufficient conditions for the existence of a state feedback and of a change of coordinates transforming, at least locally around the point x⁰, a given nonlinear system of the form ẋ = f(x) + g(x)u into a linear and controllable one.

REMARK 57.2 For a nonlinear system whose state space has dimension n = 2, condition 2 is always satisfied, since [g, g](x) = 0. Hence, by the above result, any nonlinear system whose state space has dimension n = 2 can be transformed into a linear system, via state feedback and change of coordinates, around a point x⁰ if and only if the matrix

( g(x⁰)  ad_f g(x⁰) )

has rank 2. If this is the case, the vector g(x⁰) is nonzero and it is always possible to find a function h(x) = h(x1, x2), defined locally around x⁰, such that

(∂h/∂x) g(x) = 0

for all x near x⁰.
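The Lie bracket and the rank test of Remark 57.2 can be approximated numerically. The sketch below is a made-up two-dimensional example of ours (where condition 2 is automatic): it forms [f, g] from finite-difference Jacobians and checks that ( g(x⁰)  ad_f g(x⁰) ) has rank 2 via a nonzero determinant.

```python
import math

def jac(F, x, eps=1e-5):
    """Finite-difference Jacobian of a vector field F at x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        Fp, Fm = F(xp), F(xm)
        for i in range(n):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * eps)
    return J

def bracket(f, g, x):
    """Lie bracket [f, g](x) = (dg/dx) f(x) - (df/dx) g(x)."""
    Jf, Jg, fx, gx = jac(f, x), jac(g, x), f(x), g(x)
    n = len(x)
    return [sum(Jg[i][j] * fx[j] - Jf[i][j] * gx[j] for j in range(n))
            for i in range(n)]

# Hypothetical planar system: f = (x2, -sin x1), g = (0, 1)
f = lambda x: [x[1], -math.sin(x[0])]
g = lambda x: [0.0, 1.0]

x0 = [0.0, 0.0]
adfg = bracket(f, g, x0)
det = g(x0)[0] * adfg[1] - g(x0)[1] * adfg[0]
print(adfg, det)  # adfg is approximately [-1, 0] and det is 1: rank 2 holds
```

For higher-dimensional systems, the same `bracket` routine can be iterated to build ad_f^k g and to test the involutivity of {g, ad_f g, ..., ad_f^{n−2} g} through numerical rank computations on sampled points.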
If a nonlinear system of the form of Equations 57.1 and 57.2 having relative degree strictly less than n meets requirements 1 and 2 of the previous proposition, there exists a different "output" function, say k(x), with respect to which the system has relative degree exactly n. Starting from this new function, it is possible to construct a feedback u = α(x) + β(x)v and a change of coordinates z = Φ(x) that transform the system

ẋ = f(x) + g(x)u

into a linear and controllable one. However, in general, the real output of the system expressed in the new coordinates

y = h(Φ⁻¹(z))

is still a nonlinear function of the state z. Then the question arises whether there exist a feedback and a change of coordinates transforming the entire description of the system, output function included, into a linear and controllable one. The appropriate conditions should include the previous ones, with some additional constraints arising from the need to linearize the output map. For the sake of completeness, a possible way of stating these conditions is given hereafter.

PROPOSITION 57.4 Let the system of Equations 57.1 and 57.2 be given and let r be its relative degree at x = x⁰. There exist a static state feedback and a change of coordinates, defined locally around x⁰, so that the system (output function included) is transformed into a linear and controllable one if and only if the following conditions are satisfied:

1. The matrix ( g(x⁰)  ad_f g(x⁰)  ...  ad_f^{n−1} g(x⁰) ) has rank n.
2. The (n-vector)-valued functions defined as

f̃(x) = f(x) − g(x) (L_f^r h(x) / (L_g L_f^{r−1} h(x)))
g̃(x) = g(x) (1 / (L_g L_f^{r−1} h(x)))

are such that

[ad_f̃^i g̃, ad_f̃^j g̃] = 0

for all pairs (i, j) such that 0 ≤ i, j ≤ n.

EXAMPLE 57.3: Consider the system of Equations 57.9 and 57.10. In order to see if this system can be transformed into a linear and controllable system via static state feedback and change of coordinates, we have to check conditions 1 and 2 of Proposition 57.3. We first compute ad_f g(x) and ad_f² g(x). The matrix

( g(x)  ad_f g(x)  ad_f² g(x) )

has rank 3 at all points x where its determinant, an analytic function of x, is different from zero. Hence, condition 1 is satisfied almost everywhere. Note that at the point x = 0 the matrix has rank 2, and this shows that condition 1 is not satisfied at the origin. Computing the product [g, ad_f g](x), one can then see that the matrix

( g(x)  ad_f g(x)  [g, ad_f g](x) )

has rank 2 only at the points x for which its determinant, again an analytic function of x, is zero; this set of points has measure zero. Hence, condition 2 is not satisfied at any point x of the state space. In summary, the system of Equations 57.9 and 57.10 satisfies condition 1 almost everywhere but does not satisfy condition 2. Hence, it is not locally feedback linearizable.

EXAMPLE 57.4: Consider a system of the form of Equations 57.1 and 57.2. This system has relative degree 1 at all x, since L_g h(x) = exp(x1). It is easily checked that conditions 1 and 2 of Proposition 57.3 are satisfied. Hence, there exists a function h̄(x) for which the system has relative degree 3. This function has to satisfy a partial differential equation of the form of Equation 57.19, and a solution to this equation can be found. The system can then be transformed into a linear and controllable one by means of the corresponding static state feedback and coordinates transformation. The original output of the system, y = x2, is, however, a nonlinear function of z. To determine whether the entire system, output function included, can be transformed into a linear and controllable one, condition 2 of Proposition 57.4 should be checked. Since L_f h(x) = 0, we have f̃(x) = f(x) and g̃(x) = g(x)/L_g h(x), and further calculations yield the repeated Lie brackets ad_f̃^i g̃(x). One can check that [ad_f̃^i g̃, ad_f̃^j g̃] ≠ 0 for some pair (i, j). Hence, condition 2 of Proposition 57.4 is not satisfied. Therefore, the system with its output cannot be transformed into a linear and controllable one.

References

[1] Jakubczyk, B. and Respondek, W., On linearization of control systems, Bull. Acad. Polonaise Sci. Ser. Sci. Math., 28, 517–522, 1980.
[2] Su, R., On the linear equivalents of nonlinear systems, Syst. Control Lett., 2, 48–52, 1982.
[3] Isidori, A., Krener, A.J., Gori-Giorgi, C., and Monaco, S., Nonlinear decoupling via feedback: a differential geometric approach, IEEE Trans. Autom. Control, 26, 331–345, 1981.
[4] Hunt, L.R., Su, R., and Meyer, G., Design for multi-input systems, in Differential Geometric Control Theory, Brockett, R.W., Millman, R.S., and Sussmann, H.J., Eds., Birkhäuser, 1983, 268–298.
[5] Isidori, A., Nonlinear Control Systems, 2nd ed., Springer-Verlag, 1989.

57.2 Nonlinear Zero Dynamics

Alberto Isidori, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Rome, and Department of Systems Sciences and Mathematics, Washington University, St. Louis, MO
Christopher I. Byrnes, Department of Systems Sciences and Mathematics, Washington University, St. Louis, MO

57.2.1 Input-Output Feedback Linearization

Consider a nonlinear single-input single-output system, described by equations of the form

ẋ = f(x) + g(x)u
y = h(x),

and suppose x = 0 is an equilibrium of the vector field f(x), i.e., f(0) = 0, and h(0) = 0. Assume also that this system has relative degree r < n at x = 0. Then there is a neighborhood U of x = 0 in Rⁿ and a local change of coordinates z = Φ(x) defined on U [and satisfying Φ(0) = 0] such that, in the new coordinates, the system is described by equations of the form (see Chapter 57.1 for details)

ż1 = z2
...
ż_{r−1} = zr
żr = b(z) + a(z)u
ż_{r+1} = q_{r+1}(z)
...
żn = qn(z)
y = z1.    (57.21)

Equation 57.21, which describes the system in the new coordinates, can be more conveniently represented as follows. Set

ξ = (z1, z2, ..., zr) and η = (z_{r+1}, ..., zn),

and recall that, in particular,

y = z1.

Moreover, define

a(ξ, η) = a(z), b(ξ, η) = b(z), q(ξ, η) = (q_{r+1}(z), ..., qn(z)).
Then, Equation 57.21 reduces to equations of the form

ξ̇ = Aξ + B(b(ξ, η) + a(ξ, η)u)
η̇ = q(ξ, η)
y = Cξ,    (57.22)

in which the matrices A, B, C describe a chain of r integrators with output equal to the first component of ξ.
Suppose now the input u to the system of Equation 57.22 is chosen as

u = (1 / a(ξ, η)) (−b(ξ, η) + v).    (57.23)
This feedback law yields a closed-loop system that is described by equations of the form

ξ̇ = Aξ + Bv
η̇ = q(ξ, η)
y = Cξ.    (57.24)
This system clearly appears decomposed into a linear subsystem, of dimension r, which is the only one responsible for the input-output behavior, and a possibly nonlinear subsystem, of dimension n − r, whose behavior does not affect the output. In other words, this feedback law has changed the original system so as to obtain a new system whose input-output behavior coincides with that of a linear (controllable and observable) system of dimension r having transfer function

H(s) = 1/s^r.
REMARK 57.3 To interpret the role played by the feedback law of Equation 57.23, it is instructive to examine the effect produced by a feedback of this kind on a linear system. In this case, the system of Equations 57.22 is modeled by equations of the form

ξ̇ = Aξ + B(Rξ + Sη + Ku)
η̇ = Pξ + Qη

in which R and S are row vectors, of suitable dimensions, of real numbers; K is a nonzero real number; and P and Q are matrices, of suitable dimensions, of real numbers. The feedback of Equation 57.23 is a feedback of the form

u = −(R/K)ξ − (S/K)η + (1/K)v.    (57.25)

A feedback of this type indeed modifies the eigenvalues of the system on which it is imposed. Since, from the previous analysis, it is known that the transfer function of the resulting closed-loop system has no zeros and r poles at s = 0, it can be concluded that the effect of the feedback of Equation 57.25 is such as to place r eigenvalues at s = 0 and the remaining n − r eigenvalues exactly where the n − r zeros of the transfer function of the open-loop system are located. The corresponding closed-loop system, having n − r eigenvalues coinciding with its n − r zeros, is unobservable, and its minimal realization has a transfer function that has no zeros and r poles at s = 0.

57.2.2 The Zero Dynamics

In this section we discuss an important concept that in many instances plays a role exactly similar to that of the "zeros" of the transfer function in a linear system. Given a single-input single-output system, having relative degree r < n at x = 0, represented by equations of the form of Equation 57.22, consider the following problem, which is sometimes called the Problem of Zeroing the Output: find, if any, pairs consisting of an initial state x⁰ and of an input function u⁰(·), defined for all t in a neighborhood of t = 0, such that the corresponding output y(t) of the system is identically zero for all t in a neighborhood of t = 0. Of course, the interest is to find all such pairs (x⁰, u⁰) and not simply the trivial pair x⁰ = 0, u⁰ = 0 (corresponding to the situation in which the system is initially at rest and no input is applied).

Recalling that, in the normal form of Equation 57.22,

y(t) = ξ1(t),

we observe that the constraint y(t) = 0 for all t implies

ξ1(t) = ξ2(t) = ... = ξr(t) = 0,

that is, ξ(t) = 0 for all t. In other words, if the output of the system is identically zero, its state necessarily respects the constraint ξ(t) = 0 for all t. In addition, the input u(t) must necessarily be the unique solution of the equation

0 = b(0, η(t)) + a(0, η(t))u(t)

[recall that a(0, η(t)) ≠ 0 if η(t) is close to 0]. As far as the variable η(t) is concerned, it is clear that, ξ(t) being identically zero, its behavior is governed by the differential equation

η̇(t) = q(0, η(t)).

From this analysis it is possible to conclude the following. In order to have the output y(t) of the system identically zero, necessarily the initial state must be such that ξ(0) = 0, whereas η(0) = η⁰ can be chosen arbitrarily. According to the value of η⁰, the input must be set equal to the following function

u(t) = −b(0, η(t)) / a(0, η(t)),

where η(t) denotes the solution of the differential equation

η̇(t) = q(0, η(t))    (57.26)

with initial condition η(0) = η⁰. Note also that for each set of initial conditions (ξ, η) = (0, η⁰) the input thus defined is the unique input capable of keeping y(t) identically zero. The dynamics of Equation 57.26 correspond to the dynamics describing the "internal" behavior of the system when input and initial conditions have been chosen in such a way as to constrain the output to remain identically zero. These dynamics, which are rather important in many instances, are called the zero dynamics of the system.
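The construction above can be checked by simulation. The sketch below uses a made-up normal-form system of ours with r = 1, n = 2: ξ̇ = ξ + η + u (so b = ξ + η, a = 1) and η̇ = −η + ξ. Starting from ξ(0) = 0 and applying the output-zeroing input u(t) = −b(0, η(t)) keeps y = ξ identically zero, while η evolves under the zero dynamics η̇ = −η.

```python
# Hypothetical normal-form system (r = 1, n = 2), not from the text:
#   xi'  = xi + eta + u      (b(xi, eta) = xi + eta, a = 1)
#   eta' = -eta + xi         (zero dynamics: eta' = -eta)
#   y    = xi
dt, T = 1e-4, 5.0
xi, eta = 0.0, 1.0           # xi(0) = 0 is required for y ≡ 0
y_max = 0.0
for _ in range(int(T / dt)):
    u = -(xi + eta)          # the unique output-zeroing input: u = -b/a
    xi, eta = xi + dt * (xi + eta + u), eta + dt * (-eta + xi)
    y_max = max(y_max, abs(xi))
print(y_max, eta)  # y stays 0; eta decays roughly like exp(-t)
```

Starting instead from ξ(0) ≠ 0, no input can keep the output at zero, which mirrors the necessity of the constraint ξ(0) = 0 derived above.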
The previous analysis interprets the trajectories of the (n − r)-dimensional system

η̇ = q(0, η)    (57.27)
as "open-loop" trajectories of the system, when the latter is forced (by appropriate choice of input and initial condition) to constrain the output to be identically zero. However, the trajectories of Equation 57.27 can also be interpreted as autonomous trajectories of an appropriate "closed-loop system." In fact, consider again a system in the normal form of Equation 57.22 and suppose the feedback control law of Equation 57.23 is imposed, under which the input-output behavior becomes identical with that of a linear system. The corresponding closed-loop system thus obtained is described by Equations 57.24. If the linear subsystem is initially at rest and no input is applied, then y(t) = 0 for all values of t, and the corresponding internal dynamics of the whole (closed-loop) system are exactly those of Equation 57.27, namely, the zero dynamics of the open-loop system.

REMARK 57.4 In a linear system, the dynamics of Equation 57.27 are determined by the zeros of the transfer function of the system itself. In fact, consider a linear system having relative degree r and let

H(s) = K (b0 + b1 s + ... + b_{n−r−1} s^{n−r−1} + s^{n−r}) / (a0 + a1 s + ... + a_{n−1} s^{n−1} + s^n)
H(s) = (b0 + b1 s + ... + b_{n-r} s^{n-r}) / (a0 + a1 s + ... + a_{n-1} s^{n-1} + s^n)

denote its transfer function. Suppose the numerator and denominator polynomials are relatively prime and consider a minimal realization of H(s)

ẋ = Ax + Bu
y = Cx

with A in companion form (last row -a0, -a1, ..., -a_{n-1}), B = col(0, ..., 0, 1), and C = (b0 b1 ... b_{n-r} 0 ... 0). The realization in question can easily be reduced to the form of Equation 57.21. For the ξ coordinates one has to take

ξi = C A^{i-1} x,    i = 1, ..., r,

so that, in particular,

ξr = C A^{r-1} x = b0 xr + b1 x_{r+1} + ... + b_{n-r-1} x_{n-1} + b_{n-r} xn,

while for the η coordinates it is possible to choose

η1 = x1,  η2 = x2,  ...,  η_{n-r} = x_{n-r}.

In the new coordinates we obtain equations in normal form, which, because of the linearity of the system, have the following structure

ξ̇ = Aξ + B(Rξ + Sη + Ku)
η̇ = Pξ + Qη

where R and S are row vectors and P and Q are matrices of suitable dimensions. The zero dynamics of this system, according to our previous definition, are those of

η̇ = Qη.

The particular choice of the last n - r new coordinates (i.e., of the elements of η) entails a particularly simple structure for the matrices P and Q. As a matter of fact, it is easily checked that Q is the companion matrix

Q = ( 0            1            0    ...    0
      0            0            1    ...    0
      ...
      -b0/b_{n-r}  -b1/b_{n-r}  ...  -b_{n-r-1}/b_{n-r} ).

From the particular form of this matrix, it is clear that the eigenvalues of Q coincide with the zeros of the numerator polynomial of H(s), i.e., with the zeros of the transfer function. Thus, it is concluded that in a linear system the zero dynamics are linear dynamics with eigenvalues coinciding with the zeros of the transfer function of the system.

These arguments also show that the linear approximation, at η = 0, of the zero dynamics of a system coincides with the zero dynamics of the linear approximation of the system at x = 0. In order to see this, consider for f(x), g(x) and h(x) expansions of the form

f(x) = Ax + f2(x)
g(x) = B + g1(x)
h(x) = Cx + h2(x)

where

A = [∂f/∂x]_{x=0},   B = g(0),   C = [∂h/∂x]_{x=0}.

An easy calculation shows, by induction, that

L_f^k h(x) = C A^k x + dk(x)

where dk(x) is a function such that [∂dk/∂x]_{x=0} = 0. From this, one deduces that

L_g L_f^k h(0) = C A^k B,

i.e., that the relative degree of the linear approximation of the system at x = 0 is exactly r.
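Both facts just derived, the Markov-parameter characterization of the relative degree of the linear approximation and the coincidence of the eigenvalues of Q with the transfer-function zeros, can be checked numerically. A sketch for a hypothetical transfer function (all coefficients invented for illustration):

```python
import numpy as np

# Hypothetical H(s) = (2 + 3s + s^2) / (1 + 2s + 3s^2 + 4s^3 + s^4):
# n = 4, numerator degree n - r = 2, hence relative degree r = 2.
num = [2.0, 3.0, 1.0]        # b0, b1, b2
den = [1.0, 2.0, 3.0, 4.0]   # a0, a1, a2, a3 (the leading s^4 is monic)
n, r = 4, 2

# Companion-form minimal realization x' = Ax + Bu, y = Cx
A = np.zeros((n, n)); A[:-1, 1:] = np.eye(n - 1); A[-1, :] = -np.array(den)
B = np.zeros(n); B[-1] = 1.0
C = np.array(num + [0.0] * (n - len(num)))

# Relative degree: CB = ... = CA^(r-2)B = 0 and CA^(r-1)B != 0
markov = [C @ np.linalg.matrix_power(A, k) @ B for k in range(n)]

# Q from the normal form: companion matrix of the (normalized) numerator
m = n - r
Q = np.zeros((m, m)); Q[:-1, 1:] = np.eye(m - 1); Q[-1, :] = -np.array(num[:m]) / num[m]
eig_Q = np.linalg.eigvals(Q)
zeros_H = np.roots(num[::-1])  # zeros of H(s)
print(sorted(eig_Q.real), sorted(zeros_H.real))  # both approximately [-2, -1]
```

For this numerator s^2 + 3s + 2 the zero dynamics eigenvalues are the zeros -1 and -2, as the remark predicts.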
THE CONTROL HANDBOOK

From this fact, it is concluded that taking the linear approximation of equations in normal form, based on expansions of the form

b(ξ, η) = Rξ + Sη + b2(ξ, η)
a(ξ, η) = K + a1(ξ, η)
q(ξ, η) = Pξ + Qη + q2(ξ, η)

yields a linear system in normal form. Thus, the Jacobian matrix

Q = [∂q/∂η]_{(0,0)}

which describes the linear approximation at η = 0 of the zero dynamics of the original nonlinear system, has eigenvalues that coincide with the zeros of the transfer function of the linear approximation of the system at x = 0.

57.2.3 Local Stabilization of Nonlinear Minimum-Phase Systems

In analogy with the case of linear systems, which are traditionally said to be "minimum phase" when all their transmission zeros have negative real part, nonlinear systems (of the form of Equation 57.20) whose zero dynamics (Equation 57.27) have a locally (globally) asymptotically stable equilibrium at η = 0 are also called locally (globally) minimum-phase systems. As in the case of linear systems, minimum-phase nonlinear systems can be asymptotically stabilized via state feedback. We discuss first the case of local stabilization. Consider again a system in normal form of Equation 57.22 and impose a feedback of the form

u = (1 / a(ξ, η)) (-b(ξ, η) - c0 ξ1 - c1 ξ2 - ... - c_{r-1} ξr)    (57.28)

where c0, c1, ..., c_{r-1} are real numbers. This choice of feedback yields a closed-loop system of the form

ξ̇ = (A + BK)ξ
η̇ = q(ξ, η)    (57.29)

with K = (-c0 -c1 ... -c_{r-1}). In particular, the matrix A + BK has a characteristic polynomial

p(s) = c0 + c1 s + ... + c_{r-1} s^{r-1} + s^r.

From this form of the equations describing the closed-loop system we deduce the following interesting property.

PROPOSITION 57.5 Suppose the equilibrium η = 0 of the zero dynamics of the system is locally asymptotically stable and all the roots of the polynomial p(s) have negative real part. Then the feedback law of Equation 57.28 locally asymptotically stabilizes the equilibrium (ξ, η) = (0, 0).

This is a consequence of the fact that the closed-loop system has a triangular form. According to a well-known property of systems in triangular form, since by assumption the subsystem

η̇ = q(0, η)

has a locally asymptotically stable equilibrium at η = 0 and the subsystem

ξ̇ = (A + BK)ξ

has a (globally) asymptotically stable equilibrium at ξ = 0, the equilibrium (ξ, η) = (0, 0) of the entire system is locally asymptotically stable. Note that the matrix

Q = [∂q/∂η]_{(0,0)}

characterizes the linear approximation of the zero dynamics at η = 0. If this matrix had all its eigenvalues in the left complex half-plane, then the result stated in Proposition 57.5 would have been a trivial consequence of the Principle of Stability in the First Approximation, because the linear approximation of Equation 57.29 has the form

ξ̇ = (A + BK)ξ
η̇ = Pξ + Qη.

However, Proposition 57.5 establishes a stronger result, because it relies only upon the assumption that η = 0 is simply an asymptotically stable equilibrium of the zero dynamics of the system, and this (as is well known) does not necessarily require, for a nonlinear dynamics, asymptotic stability of the linear approximation (i.e., all eigenvalues of Q having negative real part). In other words, the result in question may also hold in the presence of some eigenvalue of Q with zero real part. In order to design the stabilizing control law there is no need to know explicitly the expression of the system in normal form, but only to know the fact that the system has a zero dynamics with a locally asymptotically stable equilibrium at η = 0. Recalling how the ξ coordinates and the functions a(ξ, η) and b(ξ, η) are related to the original description of the system, it is easily seen that, in the original coordinates, the stabilizing control law assumes the form

u = (1 / (L_g L_f^{r-1} h(x))) (-L_f^r h(x) - c0 h(x) - c1 L_f h(x) - ... - c_{r-1} L_f^{r-1} h(x))
which is particularly interesting because it is expressed in terms of quantities that can be immediately calculated from the original data. If an output function is not defined, the zero dynamics are not defined as well. However, it may happen that one is able to design
a suitable dummy output whose associated zero dynamics have an asymptotically stable equilibrium. In this case, a control law of the form discussed before will guarantee asymptotic stability. This procedure is illustrated in the following simple example, taken from [5].
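The law of Equation 57.28 is easy to exercise numerically. Here is a minimal sketch on a hypothetical minimum-phase plant with relative degree 2 and one-dimensional zero dynamics; the dynamics, gains, and forward-Euler integration are all invented for illustration and are not from the text.

```python
# Hypothetical minimum-phase plant in normal form (invented for illustration):
#   x1' = x2,  x2' = x1**3 + u,  z' = -z + x1,  y = x1
# (relative degree 2; zero dynamics z' = -z, asymptotically stable).
# Law in the spirit of Equation 57.28, using L_f^2 h = x1**3, L_g L_f h = 1:
#   u = -x1**3 - c0*x1 - c1*x2,  so p(s) = s**2 + c1*s + c0 = (s + 1)**2.
c0, c1 = 1.0, 2.0
dt, steps = 1e-3, 20000
x1, x2, z = 1.0, 0.0, 1.0

for _ in range(steps):
    u = -x1**3 - c0 * x1 - c1 * x2
    x1, x2, z = x1 + dt * x2, x2 + dt * (x1**3 + u), z + dt * (-z + x1)

print(abs(x1) + abs(x2) + abs(z))  # small: (x1, x2, z) -> (0, 0, 0)
```

The feedback renders the (x1, x2) subsystem linear with both poles at -1, and the internal state z then decays because the zero dynamics are asymptotically stable, exactly the mechanism of Proposition 57.5.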
EXAMPLE 57.5: Consider the system

ẋ1 = x1^2 x2^3
ẋ2 = x2 + u

whose linear approximation at x = 0 has an uncontrollable mode corresponding to the eigenvalue λ = 0. Suppose one is able to find a function γ(x1) such that

ẋ1 = x1^2 [γ(x1)]^3

is asymptotically stable at x1 = 0. Then, setting

y = h(x) = γ(x1) - x2

a system with an asymptotically stable zero dynamics is obtained. As a matter of fact, we know that the zero dynamics are those induced by the constraint y(t) = 0 for all t. This, in the present case, requires that x1 and x2 respect the constraint

γ(x1) - x2 = 0.

Thus, the zero dynamics evolve exactly according to

ẋ1 = x1^2 [γ(x1)]^3

and the system can be locally stabilized by means of the procedure discussed above. A suitable choice of γ(x1) will be, e.g.,

γ(x1) = -x1.

Accordingly, a locally stabilizing feedback is the one given by

u = a(x) = (1 / L_g h(x)) (-L_f h(x) - c h(x)) = -c x1 - (1 + c) x2 - x1^2 x2^3

with c > 0.

57.2.4 Global Stabilization of Nonlinear Minimum-Phase Systems

In this section we consider a special class of nonlinear systems that can be globally asymptotically stabilized via state feedback. The systems in question are systems that can be transformed, by means of a globally defined change of coordinates and/or feedback, into a system having this special normal form

ż = f0(z, ξ1)
ξ̇1 = ξ2
...
ξ̇_{r-1} = ξr
ξ̇r = u.    (57.30)

REMARK 57.5 Note that a system in the normal form of Equation 57.21, considered in the previous sections, can indeed be changed, via feedback, into a system of the form of Equation 57.31. Moreover, if the normal form of Equation 57.21 is globally defined, so also is the feedback yielding the (globally defined) normal form of Equation 57.31. The form of Equation 57.30 is a special case of Equation 57.31, the one in which the function q(ξ, η) depends only on the component ξ1 of the vector ξ. In Equation 57.30, for consistency with the notations more frequently used in the literature on global stabilization, the vector z replaces the vector η of Equation 57.31 and the places of z and ξ are interchanged.

In order to describe how systems of the form of Equation 57.30 can be globally stabilized, we begin with the analysis of the (very simple) case in which r = 1. For convenience of the reader, we recall that a smooth function V : Rn -> R is said to be positive definite if V(0) = 0 and V(x) > 0 for x ≠ 0, and proper if, for any a ∈ R, the set V^{-1}([0, a]) = {x ∈ Rn : 0 ≤ V(x) ≤ a} is compact. Consider a system described by equations of the form

ż = f(z, ξ)
ξ̇ = u    (57.32)

in which (z, ξ) ∈ Rn × R, and f(0, 0) = 0. Suppose the subsystem

ż = f(z, 0)

has a globally asymptotically stable equilibrium at z = 0. Then, in view of a converse Lyapunov theorem, there exists a smooth positive definite and proper function V(z) such that (∂V/∂z) f(z, 0) is negative for each nonzero z. Using this property, it is easy to show that the system of Equation 57.32 can be globally asymptotically stabilized. In fact, observe that the function f(z, ξ) can be put in the form

f(z, ξ) = f(z, 0) + p(z, ξ)ξ    (57.33)

where p(z, ξ) is a smooth function. For this it suffices to observe that the difference

f(z, ξ) - f(z, 0)

is a smooth function vanishing at ξ = 0, and to express f(z, ξ) as in Equation 57.33 with

p(z, ξ) = ∫_0^1 (∂f/∂ξ)(z, sξ) ds.

Now consider the positive definite and proper function

W(z, ξ) = V(z) + (1/2) ξ^2
and observe that

Ẇ = (∂V/∂z) f(z, 0) + (∂V/∂z) p(z, ξ)ξ + ξu.

Choosing

u = u(z, ξ) = -ξ - (∂V/∂z) p(z, ξ)    (57.35)

yields

Ẇ = (∂V/∂z) f(z, 0) - ξ^2

which is negative for all nonzero (z, ξ). By the direct Lyapunov theorem, it is concluded that the system

ż = f(z, ξ)
ξ̇ = u(z, ξ)

has a globally asymptotically stable equilibrium at (z, ξ) = (0, 0). In other words, it has been shown that, if ż = f(z, 0) has a globally asymptotically stable equilibrium at z = 0, then the equilibrium (z, ξ) = (0, 0) of the system of Equation 57.32 can be rendered globally asymptotically stable by means of a smooth feedback law u = u(z, ξ). The result thus proven can be easily extended by showing that, for the purpose of stabilizing the equilibrium (z, ξ) = (0, 0) of Equation 57.32, it suffices to assume that the equilibrium z = 0 of ż = f(z, ξ) is stabilizable by means of a smooth law ξ = v*(z).

LEMMA 57.1 Consider a system described by equations of the form of Equation 57.32. Suppose there exists a smooth real-valued function ξ = v*(z), with v*(0) = 0, and a smooth real-valued function V(z), which is positive definite and proper, such that

(∂V/∂z) f(z, v*(z)) < 0

for all nonzero z. Then, there exists a smooth static feedback law u = u(z, ξ), with u(0, 0) = 0, and a smooth real-valued function W(z, ξ), which is positive definite and proper, such that

(∂W/∂z) f(z, ξ) + (∂W/∂ξ) u(z, ξ) < 0

for all nonzero (z, ξ).

In fact, it suffices to consider the (globally defined) change of variables

y = ξ - v*(z)

which transforms Equation 57.32 into

ż = f(z, y + v*(z))
ẏ = u - (∂v*/∂z) f(z, y + v*(z))

and observe that the feedback law

u = u' + (∂v*/∂z) f(z, y + v*(z))

changes the latter into a system satisfying the hypotheses that are at the basis of the previous construction. Using repeatedly the property indicated in Lemma 57.1, it is straightforward to derive the following stabilization result about a system in the form of Equation 57.30.

THEOREM 57.2 Consider a system of the form of Equation 57.30. Suppose there exists a smooth real-valued function

ξ1 = v*(z)

with v*(0) = 0, and a smooth real-valued function V(z), which is positive definite and proper, such that

(∂V/∂z) f0(z, v*(z)) < 0

for all nonzero z. Then, there exists a smooth static feedback law

u = u(z, ξ1, ..., ξr)

with u(0, 0, ..., 0) = 0, which globally asymptotically stabilizes the equilibrium (z, ξ1, ..., ξr) = (0, 0, ..., 0) of the corresponding closed-loop system.

Of course, a special case in which the result of Theorem 57.2 holds is when v*(z) = 0, i.e., when ż = f0(z, 0) has a globally asymptotically stable equilibrium at z = 0. This is the case of a system whose zero dynamics have a globally asymptotically stable equilibrium at z = 0, i.e., the case of a globally minimum-phase system. The stabilization procedure outlined above is illustrated in the following example, taken from [1].

EXAMPLE 57.6:
Consider the problem of globally asymptotically stabilizing the equilibrium (x1, x2, x3) = (0, 0, 0) of the nonlinear system
ẋ1 = x2^3
ẋ2 = x3
ẋ3 = u.    (57.37)
To this end, observe that a "dummy output" of the form

y = x3 - v*(x1, x2)

yields a system having relative degree r = 1 at each x ∈ R^3 and two-dimensional zero dynamics. The latter, i.e., the dynamics obtained by imposing on Equation 57.37 the constraint y = 0, are those of the autonomous system

ẋ1 = x2^3
ẋ2 = v*(x1, x2).    (57.38)
From the discussion above we know that, if it is possible to find a function v*(x1, x2) that globally asymptotically stabilizes the equilibrium (x1, x2) = (0, 0) of Equation 57.38, then there exists an input u(x1, x2, x3) that globally asymptotically stabilizes the equilibrium (x1, x2, x3) = (0, 0, 0) of Equation 57.37. It is easy to check that the function

v*(x1, x2) = -x1 - x2

accomplishes this task. In fact, consider the system

ẋ1 = x2^3
ẋ2 = -x1 - x2    (57.39)

and choose a candidate Lyapunov function

V(x1, x2) = (1/2) x1^2 + (1/4) x2^4

which yields

V̇ = x1 x2^3 + x2^3 (-x1 - x2) = -x2^4.

This function is nonpositive for all (x1, x2) and zero only at the points where x2 = 0. Since no nontrivial trajectory of Equation 57.39 is contained in the set

{(x1, x2) : x2 = 0},

by LaSalle's invariance principle it is concluded that the equilibrium (x1, x2) = (0, 0) of Equation 57.39 is globally asymptotically stable. In order to obtain the input function that globally stabilizes the equilibrium (x1, x2, x3) = (0, 0, 0) of Equation 57.37, it is necessary to use the construction indicated in the proof of Lemma 57.1. In fact, consider the change of variables

y = x3 - v*(x1, x2)

which transforms Equation 57.37 into

ẋ1 = x2^3
ẋ2 = y + v*(x1, x2)
ẏ = u - (∂v*/∂x1) x2^3 - (∂v*/∂x2)(y + v*(x1, x2)).

Choosing a preliminary feedback

u = u' + (∂v*/∂x1) x2^3 + (∂v*/∂x2)(y + v*(x1, x2))

yields

ẋ1 = x2^3
ẋ2 = y + v*(x1, x2)
ẏ = u'    (57.41)

which has exactly the form of Equation 57.32, namely

ż = f(z, y)
ẏ = u'

with z = (x1, x2), and ż = f(z, 0) has a globally asymptotically stable equilibrium at z = 0. As a consequence, this system can be globally asymptotically stabilized by means of a feedback law u' = u'(z, ξ) of the form of Equation 57.35.

References

[1] Byrnes, C.I. and Isidori, A., New results and examples in nonlinear feedback stabilization, Syst. Control Lett., 12, 437-442, 1989.
[2] Tsinias, J., Sufficient Lyapunov-like conditions for stabilization, Math. Control Signals Syst., 2, 424-440, 1989.
[3] Byrnes, C.I. and Isidori, A., Asymptotic stabilization of minimum phase nonlinear systems, IEEE Trans. Autom. Control, 36, 1122-1137, 1991.
[4] Byrnes, C.I. and Isidori, A., On the attitude stabilization of a rigid spacecraft, Automatica, 27, 87-96, 1991.
[5] Isidori, A., Nonlinear Control Systems, 2nd ed., Springer-Verlag, 1989.

57.3 Nonlinear Output Regulation

Alberto Isidori, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Rome, and Department of Systems Sciences and Mathematics, Washington University, St. Louis, MO

57.3.1 The Problem of Output Regulation

A classical problem in control theory is to impose, via feedback, a prescribed steady-state response to every external command in a given family. This may include, for instance, the problem of having the output y(·) of a controlled plant asymptotically track any prescribed reference signal y_ref(·) in a certain class of functions of time, as well as the problem of having y(·) asymptotically reject any undesired disturbance w(·) in a certain class of disturbances. In both cases, the issue is to force the so-called tracking error, i.e., the difference between the reference output and the actual output, to be a function of time that decays to zero as time tends to infinity, for every reference output and every undesired disturbance ranging over prespecified families of functions.
The problem in question can be characterized as follows. Consider a nonlinear system modeled by equations of the form

ẋ = f(x, w, u)
e = h(x, w).    (57.42)

The first equation of Equation 57.42 describes the dynamics of a plant, whose state x is defined in a neighborhood U of the origin in Rn, with control input u ∈ Rm and subject to a set of exogenous input variables w ∈ Rr, which includes disturbances (to be rejected) and/or references (to be tracked). The second equation defines an error variable e ∈ Rm, which is expressed as a function of the state x and of the exogenous input w. For the sake of mathematical simplicity, and also because in this way a large number of relevant practical situations can be covered, it is assumed that the family of the exogenous inputs w(·) that affect the plant, and for which asymptotic decay of the error is to be achieved, is the family of all functions of time that are solutions of a (possibly nonlinear) homogeneous differential equation

ẇ = s(w)    (57.43)

with initial condition w(0) ranging on some neighborhood W of the origin of Rr. This system, which is viewed as a mathematical model of a "generator" of all possible exogenous input functions, is called the exosystem. It is assumed that f(x, w, u), h(x, w), s(w) are smooth functions. Moreover, it is also assumed that f(0, 0, 0) = 0, s(0) = 0, h(0, 0) = 0. Thus, for u = 0, the composite system of Equations 57.42 and 57.43 has an equilibrium state (x, w) = (0, 0) yielding zero error. The control action to Equation 57.42 is to be provided by a feedback controller that processes the information received from the plant in order to generate the appropriate control input. The structure of the controller usually depends on the amount of information available for feedback. The most favorable situation, from the point of view of feedback design, occurs when the set of measured variables includes all the components of the state x of the plant and of the exogenous input w.
In this case, it is said that the controller is provided with full information and can be constructed as a memoryless system, whose output u is a function of the states x and w of the plant and of the exosystem, respectively,

u = α(x, w).    (57.44)

The interconnection of Equation 57.42 and Equation 57.44 yields a closed-loop system described by the equations

ẋ = f(x, w, α(x, w))
ẇ = s(w).    (57.45)

In particular, it is assumed that α(0, 0) = 0, so that the closed-loop system of Equation 57.45 has an equilibrium at (x, w) = (0, 0). A more realistic, and rather common, situation is the one in which only the components of the error e are available for measurement. In this case, it is said that the controller is provided
with error feedback, and it can be useful to synthesize the control signal by means of a dynamical nonlinear system, modeled by equations of the form

ξ̇ = η(ξ, e)
u = θ(ξ)    (57.46)

with internal state ξ defined in a neighborhood E of the origin in Rν. The interconnection of Equation 57.42 and Equation 57.46 yields, in this case, a closed-loop system characterized by the equations

ẋ = f(x, w, θ(ξ))
ξ̇ = η(ξ, h(x, w))
ẇ = s(w).    (57.47)

Again, it is assumed that η(0, 0) = 0 and θ(0) = 0, so that the triplet (x, ξ, w) = (0, 0, 0) is an equilibrium of the closed-loop system of Equation 57.47. The purpose of the control is to obtain a closed-loop system in which, for every exogenous input w(·) (in the prescribed family) and every initial state (in some neighborhood of the origin), the output e(·) decays to zero as time tends to infinity. When this is the case, the closed-loop system is said to have the property of output regulation. The property of output regulation is particularly meaningful when the exogenous inputs are "persistent" in time, as is the case for any periodic (and bounded) function. In these cases, in fact, the system may exhibit a "steady-state response" that is itself a persistent function of time, and whose characteristics depend entirely on the specific input imposed on the system and not on the state in which the system was at the initial time. To ensure that the exogenous inputs generated by an exosystem of the form of Equation 57.43 are bounded, it suffices to assume that the point w = 0 is a stable equilibrium (in the ordinary sense of Lyapunov) of the vector field s(w) and to choose the initial condition at time t = 0 in some appropriate neighborhood W0 ⊂ W of the origin. To guarantee that the inputs are persistent in time (that is, to exclude the possibility that some input decays to zero as time tends to infinity), it is convenient to assume that every point w of W0 is Poisson stable. We recall that a point w0 is said to be Poisson stable if the flow Φt^s(w0) of the vector field s(w) is defined for all t ∈ R and, for each neighborhood U0 of w0 and for each real number T > 0, there exists a time t1 > T such that Φt1^s(w0) ∈ U0, and a time t2 < -T such that Φt2^s(w0) ∈ U0. In other words, a point w0 is Poisson stable if the trajectory w(t) that originates in w0 passes arbitrarily close to w0 for arbitrarily large times, in both the forward and backward directions.
Thus, it is clear that if every point w0 of W0 is Poisson stable, no trajectory of Equation 57.43 can decay to zero as time tends to infinity. In what follows, we assume that the vector field s(w) has the two properties indicated above; namely, that the point w = 0 is
(See, e.g., [9] for the definitions of vector field and flow of a vector field.)
a stable equilibrium (in the ordinary sense) and there exists an open neighborhood of the point w = 0 in which every point is Poisson stable. For convenience, these two properties together will be referred to as the property of neutral stability.

REMARK 57.6 Note that the hypothesis of neutral stability implies that the matrix

S = [∂s/∂w]_{w=0}

which characterizes the linear approximation of the vector field s(w) at w = 0, has all its eigenvalues on the imaginary axis.
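A simple neutrally stable exosystem (a hypothetical instance, chosen here for illustration) is the harmonic generator; the sketch below checks both properties: purely imaginary spectrum of the linear approximation, and Poisson stability via the periodicity of the flow.

```python
import numpy as np

# Hypothetical neutrally stable exosystem: the harmonic generator
# w1' = a*w2, w2' = -a*w1, whose outputs are of the form M*sin(a*t + phi).
a = 2.0
S = np.array([[0.0, a], [-a, 0.0]])
eigs = np.linalg.eigvals(S)  # purely imaginary: +/- a*j, as Remark 57.6 requires

# Poisson stability: the flow is a rotation with period 2*pi/a, so every
# trajectory returns arbitrarily close to (here: exactly to) its start point.
def flow(t, w):
    c, s = np.cos(a * t), np.sin(a * t)
    return np.array([c * w[0] + s * w[1], -s * w[0] + c * w[1]])

w0 = np.array([1.0, 0.5])
T = 2.0 * np.pi / a
print(eigs, np.linalg.norm(flow(25 * T, w0) - w0))  # return error near zero
```

Every point of this exosystem is Poisson stable and w = 0 is a stable (but not asymptotically stable) equilibrium, so no generated input decays to zero.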
If the exosystem is neutrally stable and the closed-loop system

ẋ = f(x, 0, α(x, 0))    (57.48)

is asymptotically stable in the first approximation, then the response of the composite system of Equation 57.45 from any initial state [x(0), w(0)] in a suitable neighborhood of the origin (0, 0) converges, as t tends to ∞, to a well-defined steady-state response, which is independent of x(0) and depends only on w(0). If this steady-state response is such that the associated tracking error is identically zero, then the closed-loop system of Equation 57.45 has the required property of output regulation. This motivates the following definition.

Full Information Output Regulation Problem. Given a nonlinear system of the form of Equation 57.42 and a neutrally stable exosystem modeled by Equation 57.43, find, if possible, a mapping α(x, w) such that:

(S)FI: the equilibrium x = 0 of Equation 57.48 is asymptotically stable in the first approximation.
(R)FI: there exists a neighborhood V ⊂ U × W of (0, 0) such that, for each initial condition [x(0), w(0)] ∈ V, the solution of Equation 57.45 satisfies

lim_{t→∞} h(x(t), w(t)) = 0.

Again, if the exosystem is neutrally stable and the closed-loop system

ẋ = f(x, 0, θ(ξ))
ξ̇ = η(ξ, h(x, 0))    (57.49)

is asymptotically stable in the first approximation, then the response of the composite system of Equation 57.47 from any initial state [x(0), ξ(0), w(0)] in a suitable neighborhood of the origin (0, 0, 0) converges, as t tends to ∞, to a well-defined steady-state response, which is independent of [x(0), ξ(0)] and depends only on w(0). If this steady-state response is such that the associated tracking error is identically zero, then the closed-loop system of Equation 57.47 has the required property of output regulation.

Error Feedback Output Regulation Problem. Given a nonlinear system of the form of Equation 57.42 and a neutrally stable exosystem modeled by Equation 57.43, find, if possible, an integer ν and two mappings θ(ξ) and η(ξ, e), such that:

(S)EF: the equilibrium (x, ξ) = (0, 0) of Equation 57.49 is asymptotically stable in the first approximation.
(R)EF: there exists a neighborhood V ⊂ U × E × W of (0, 0, 0) such that, for each initial condition [x(0), ξ(0), w(0)] ∈ V, the solution of Equation 57.47 satisfies

lim_{t→∞} h(x(t), w(t)) = 0.

Since one of the specifications in the problem of output regulation is that of achieving stability in the first approximation, it is clear that the properties of stabilizability and detectability of the linear approximation of the controlled plant at the equilibrium (x, w, u) = (0, 0, 0) play a determinant role in the solution of this problem. For notational convenience, observe that the closed-loop system of Equation 57.45 can be written in the form

ẋ = (A + BK)x + (P + BL)w + φ(x, w)
ẇ = Sw + ψ(w)

where φ(x, w) and ψ(w) vanish at the origin with their first-order derivatives, and A, B, P, K, L, S are matrices defined by

A = [∂f/∂x],  B = [∂f/∂u],  P = [∂f/∂w],  K = [∂α/∂x],  L = [∂α/∂w],  S = [∂s/∂w],

all evaluated at the equilibrium (x, w, u) = (0, 0, 0).
On the other hand, the closed-loop system of Equation 57.47 can be written in the form

ẋ = Ax + BHξ + Pw + φ(x, ξ, w)
ξ̇ = Fξ + GCx + GQw + χ(x, ξ, w)
ẇ = Sw + ψ(w)

where φ(x, ξ, w), χ(x, ξ, w) and ψ(w) vanish at the origin with their first-order derivatives, and C, Q, F, H, G are matrices defined by

C = [∂h/∂x],  Q = [∂h/∂w],  F = [∂η/∂ξ],  H = [∂θ/∂ξ],  G = [∂η/∂e],

all evaluated at the origin.
Using this notation, it is immediately realized that the requirement (S)FI is the requirement that all the eigenvalues of the Jacobian matrix of Equation 57.48 at x = 0,

A + BK,

have negative real part, whereas (S)EF is the requirement that all the eigenvalues of the Jacobian matrix of Equation 57.49 at (x, ξ) = (0, 0),

( A    BH
  GC   F  ),

have negative real part. From the theory of linear systems, it is then easy to conclude that:

1. (S)FI can be achieved only if the pair of matrices (A, B) is stabilizable [i.e., there exists K such that all the eigenvalues of (A + BK) have negative real part].
2. (S)EF can be achieved only if the pair of matrices (A, B) is stabilizable and the pair of matrices (C, A) is detectable [i.e., there exists G such that all the eigenvalues of (A + GC) have negative real part].

These properties of the linear approximation of the plant of Equation 57.42 at (x, w, u) = (0, 0, 0) are indeed necessary conditions for the solvability of a problem of output regulation.
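The two rank conditions can be tested numerically with the standard PBH criterion: (A, B) is stabilizable if and only if [A - λI  B] has full row rank at every eigenvalue λ of A with nonnegative real part, and detectability is the dual property. A sketch on invented matrices:

```python
import numpy as np

def stabilizable(A, B, tol=1e-9):
    # PBH test restricted to the eigenvalues with nonnegative real part
    n = A.shape[0]
    return all(np.linalg.matrix_rank(np.hstack([A - lam * np.eye(n), B]), tol) == n
               for lam in np.linalg.eigvals(A) if lam.real >= -tol)

def detectable(C, A, tol=1e-9):
    return stabilizable(A.conj().T, C.conj().T, tol)  # duality

# Hypothetical plant linearization: one unstable mode, both tests pass.
A = np.array([[1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
print(stabilizable(A, B), detectable(C, A))  # True True
```

The same helpers applied to the composite Jacobian above would confirm whether a stabilizing K (respectively, output-injection gain G) can exist at all.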
57.3.2 Output Regulation in the Case of Full Information

In this section, we show how the problem of output regulation via full information can be solved.
THEOREM 57.3 The full information output regulation problem (Section 57.3.1) is solvable if and only if the pair (A, B) is stabilizable and there exist mappings x = π(w) and u = c(w), with π(0) = 0 and c(0) = 0, both defined in a neighborhood W0 ⊂ W of the origin in Rr, satisfying the conditions

(∂π/∂w) s(w) = f(π(w), w, c(w))
0 = h(π(w), w)    (57.52)

for all w ∈ W0.
To see that the conditions of Theorem 57.3 are sufficient, observe that, by hypothesis, there exists a matrix K such that (A + BK) has eigenvalues with negative real part. Suppose the conditions of Equation 57.52 are satisfied for some π(w) and c(w), and define a feedback law in the following way

α(x, w) = c(w) + K(x - π(w)).    (57.53)

It is not difficult to see that this is a solution of the full information output regulation problem. In fact, this choice clearly satisfies the requirement (S)FI because α(x, 0) = Kx. Moreover, by construction, the function α(x, w) is such that

f(π(w), w, α(π(w), w)) = (∂π/∂w) s(w).
This identity shows that the graph of the mapping x = π(w) is an invariant (and locally exponentially attractive) manifold for Equation 57.45. In particular, there exist real numbers M > 0 and a > 0 such that

||x(t) - π(w(t))|| ≤ M e^{-at} ||x(0) - π(w(0))||

for all t ≥ 0 and for every sufficiently small [x(0), w(0)]. Thus, since the error map e = h(x, w) is zero on the graph of the mapping x = π(w), the condition (R)FI is satisfied.
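When the plant is linear, the conditions of Equation 57.52 become matrix equations in π(w) = Πw and c(w) = Γw, and the feedback of Equation 57.53 can be constructed numerically. A sketch for a hypothetical second-order plant asked to track the first component of a harmonic exosystem (all matrices are invented for illustration):

```python
import numpy as np

# Plant: x' = Ax + Bu + Pw, exosystem: w' = Sw, error: e = Cx + Qw (= x1 - w1)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
S = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])
Q = np.array([[-1.0, 0.0]])
P = np.zeros((2, 2))

# Solve  Pi S = A Pi + B Gam + P,  0 = C Pi + Q  for Pi (2x2), Gam (1x2)
# by stacking the unknowns [vec(Pi); vec(Gam)] into one linear system.
n, m, r = 2, 1, 2
I = np.eye
M1 = np.hstack([np.kron(S.T, I(n)) - np.kron(I(r), A), -np.kron(I(r), B)])
M2 = np.hstack([np.kron(I(r), C), np.zeros((r, m * r))])
sol = np.linalg.solve(np.vstack([M1, M2]),
                      np.concatenate([P.flatten(order="F"), -Q.flatten(order="F")]))
Pi = sol[: n * r].reshape((n, r), order="F")
Gam = sol[n * r:].reshape((m, r), order="F")

# With u = Gam w + K(x - Pi w) and A already Hurwitz (take K = 0),
# the tracking error x1 - w1 decays along the closed loop.
x, w, dt = np.array([0.5, -0.2]), np.array([1.0, 0.0]), 1e-3
for _ in range(20000):
    x = x + dt * (A @ x + B.flatten() * (Gam @ w))
    w = w + dt * (S @ w)
print(Pi, Gam, abs(x[0] - w[0]))  # Gam = [[1., 3.]], error near zero
```

For these numbers the solution is Π = I and Γ = (1  3): on the manifold x = Πw the plant reproduces the sinusoid exactly, and the K-term (here unnecessary) would supply the stabilizing component discussed above.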
REMARK 57.7 The result expressed by Theorem 57.3 can be interpreted as follows. Observe that, if the identities of Equation 57.52 hold, the graph of the mapping x = π(w) is an invariant manifold for the composite system

ẋ = f(x, w, u)
ẇ = s(w)

and the error map e = h(x, w) is zero at each point of this manifold. From this interpretation, it is easy to observe that, for any initial (namely, at time t = 0) state w* of the exosystem, i.e., for any exogenous input w*(·) solving Equation 57.43 from w(0) = w*, if the plant is in the initial state x* = π(w*) and the input is equal to

u*(t) = c(w*(t))

then e(t) = 0 for all t ≥ 0. In other words, the control input generated by the autonomous system

ẇ = s(w)
u = c(w)    (57.54)

is precisely the control required to impose, for any exogenous input, a response producing an identically zero error, provided that the initial condition of the plant is appropriately set [namely, at x* = π(w*)]. The question of whether such a response is actually the steady-state response [that is, whether the error converges to zero as time tends to infinity when the initial condition of the plant is other than x* = π(w*)] depends indeed on the asymptotic properties of the equilibrium x = 0 of f(x, 0, 0). If the equilibrium at x = 0 is not stable in the first approximation, then in order to achieve the required steady-state response, the control law must also include a stabilizing component, as in the case of the control law α(x, w) indicated above. Under this control law, the composite system

ẋ = f(x, w, α(x, w))
ẇ = s(w)

still has an invariant manifold of the form x = π(w), but the latter is now locally exponentially attractive. In this configuration, from any initial condition in a neighborhood of the origin, the response of the closed-loop system
to any exogenous input w*(·) converges towards the response of the open-loop system

ẋ = f(x, w, u)

produced by the same exogenous input w*(·), by the control input u*(·) = c(w*(·)), with initial condition x* = π(w*).

REMARK 57.8 If the controlled plant is a linear system, the conditions of Equation 57.52 reduce to linear matrix equations. In this case, the system in question can be written in the form

ẋ = Ax + Pw + Bu
e = Cx + Qw

and, if the mappings x = π(w) and u = c(w) are put in the form

π(w) = Πw + π̃(w)
c(w) = Γw + c̃(w)

the conditions of Equation 57.52 have a solution if and only if the linear matrix equations

ΠS = AΠ + BΓ + P
0 = CΠ + Q    (57.55)

are solved by some Π and Γ. Note that, if this is the case, the mappings π(w) and c(w) that solve Equation 57.52 are actually linear mappings [i.e., π(w) = Πw and c(w) = Γw].

We describe now how the existence conditions of Equation 57.52 can be tested in the particular case in which m = 1 (one-dimensional control input and one-dimensional error) and the equations of Equation 57.42 assume the form

ẋ = f(x) + g(x)u
e = h(x) + p(w)

which corresponds to the case of a single-input single-output system whose output is required to track any reference trajectory produced by

ẇ = s(w)
y_ref = -p(w).

We also assume that the triplet {f(x), g(x), h(x)} has relative degree r at x = 0 so that coordinate transformation to a normal form is possible. In the new coordinates, the system in question assumes the form

ż1 = z2
...
ż_{r-1} = zr
żr = b(z, η) + a(z, η)u
η̇ = q(z, η)
e = z1 + p(w).

In order to check whether the conditions of Equation 57.52 can be solved, it is convenient to set

π(w) = col(k(w), λ(w))

with

k(w) = col(k1(w), ..., kr(w)).

In this case, the equations in question reduce to

(∂ki/∂w) s(w) = k_{i+1}(w),    i = 1, ..., r - 1,
(∂kr/∂w) s(w) = b(k(w), λ(w)) + a(k(w), λ(w)) c(w),
(∂λ/∂w) s(w) = q(k(w), λ(w)),
0 = k1(w) + p(w).

The last one of these, together with the first r - 1, yields immediately

ki(w) = -L_s^{i-1} p(w)    (57.56)

for all 1 ≤ i ≤ r. The r-th equation can be solved by

c(w) = (L_s kr(w) - b(k(w), λ(w))) / a(k(w), λ(w))    (57.57)

and, therefore, we can conclude that the solvability of the conditions of Equation 57.52 is, in this case, equivalent to the solvability of

(∂λ/∂w) s(w) = q(k(w), λ(w))    (57.58)

for some mapping η = λ(w). We can therefore conclude that the full information output regulation problem is solvable if and only if the pair (A, B) is stabilizable and Equation 57.58 can be solved by some mapping λ(w), with λ(0) = 0. An application of these results is illustrated in the following example, taken from [9].
EXAMPLE 57.7: Consider the system, already in normal form,

ẋ1 = x2
ẋ2 = u
η̇ = η + x1 + x2^2
y = x1.

(See, e.g., Chapter 57.1 for the definitions of relative degree and normal form.)
and suppose it is desired to asymptotically track any reference output of the form
where a is a fixed (positive) number, and M, 4 arbitrary parameters. In this case, any desired reference output can be imagined as the output of an exosystem defined by
and therefore we could try to solve the problem via the theory developed in this section, i.e., posing a full information output regulation problem. Following the procedure illustrated above, one has to set
is asymptotically decaying to zero, and so is the error e ( t ) ,which in this case is exactly equal to x l . In fact, the variables tl, t2,t3 satisfy
57.3.3 Output Regulation in the Case of Error Feedback In this section, it will be shown that the existence of a solution of the problem of output regulation in the case of error feedback depends, among other things, on a particular property of the autonomous system of Equation 57.54, which, as we have seen, may be thought of as a generator of those input functions that produce responses yielding zero error. In order to describe this property, we need first to introduce the notion of immersion of a system into another system (see [2]). Consider a pair of smooth autonomous systems with outputs
and then search for asolution h(w w2)ofthe partial differential Equation 57.58, i.e., and j=f(%), An elementary calculation shows that this equation can be solved by a complete polynomial of second degree, i.e.,
Once A(wl, w2) has been calculated, from the previous theory it follows that the mapping
and the function
are solutions of the conditions of Equation 57.52. In particular, a solution of the regulator problem is provided by
ẋ = f(x),  y = h(x)    and    x̃̇ = f̃(x̃),  y = h̃(x̃)
defined on two different state spaces, X and X̃, but having the same output space Y = ℝ^m. Assume, as usual, f(0) = 0, h(0) = 0 and f̃(0) = 0, h̃(0) = 0, and let the two systems in question be denoted, for convenience, by {X, f, h} and {X̃, f̃, h̃}, respectively. System {X, f, h} is said to be immersed into system {X̃, f̃, h̃} if there exists a (continuously differentiable) mapping τ: X → X̃ satisfying τ(0) = 0 and such that
for all x ∈ X. It is easy to realize that the two conditions indicated in this definition express nothing else than the property that any output response generated by {X, f, h} is also an output response of {X̃, f̃, h̃}. In fact, the first condition implies that the flows Φ_t^f(x)
and Φ_t^f̃(x̃) of the two vector fields f and f̃ (which are τ-related) satisfy
in which K = (c1 c2 c3) is any matrix that places the eigenvalues of
A + BK

in the left complex half-plane. As expected, the difference
τ(Φ_t^f(x)) = Φ_t^f̃(τ(x))

for all x ∈ X and all t ≥ 0, from which the second condition yields
for all x ∈ X and all t ≥ 0, thus showing that the output response produced by {X, f, h}, when its initial state is any x ∈ X, is a response that can also be produced by {X̃, f̃, h̃}, if the latter is set in the initial state τ(x) ∈ X̃.
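The two defining conditions can be checked numerically on a toy pair of scalar systems. The systems, the map τ, and the test point below are illustrative assumptions, not taken from the text: f(x) = −x with h(x) = x², immersed into f̃(z) = −2z with h̃(z) = z via τ(x) = x².

```python
import math

# Illustrative sketch (not from the text): {R, f, h} with f(x) = -x, h(x) = x^2
# is immersed into {R, f~, h~} with f~(z) = -2z, h~(z) = z via tau(x) = x^2.
# Closed-form flows: Phi_t^f(x) = x e^{-t},  Phi_t^{f~}(z) = z e^{-2t}.

def flow_f(x, t):
    return x * math.exp(-t)          # flow of f(x) = -x

def flow_ft(z, t):
    return z * math.exp(-2.0 * t)    # flow of f~(z) = -2z

tau = lambda x: x * x                # candidate immersion, tau(0) = 0
h = lambda x: x * x                  # output map of the first system
ht = lambda z: z                     # output map of the second system

x, t = 1.7, 0.9
lhs = tau(flow_f(x, t))              # tau(Phi_t^f(x))
rhs = flow_ft(tau(x), t)             # Phi_t^{f~}(tau(x))
print(abs(lhs - rhs), h(x) - ht(tau(x)))
```

Both printed quantities are zero up to rounding: the flows are τ-related and h = h̃ ∘ τ, so every output response of the first system is reproduced by the second when the latter is initialized at τ(x).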
57.3. NONLINEAR OUTPUT REGULATION

The notion of immersion is relevant because sometimes
{X̃, f̃, h̃} may have some special property that {X, f, h} does not have. For example, any linear system can always be immersed into an observable linear system, and a similar thing occurs, under appropriate hypotheses, also in the case of a nonlinear system. Or, for instance, one may wish to have a nonlinear system immersed into a linear system, if possible. The notion of immersion is important for the solution of the problem of output regulation in the case of error feedback because the possibility of having the autonomous system of Equation 57.54 immersed into a system with special properties is actually a necessary and sufficient condition for the existence of such a solution. In fact, the following result holds.

THEOREM 57.4 The error feedback output regulation problem (Section 57.3.1) is solvable if and only if there exist mappings x = π(w) and u = c(w), with π(0) = 0 and c(0) = 0, both defined in a neighborhood W⁰ ⊂ W of the origin in ℝ^r, satisfying the conditions

(∂π/∂w) s(w) = f(π(w), w, c(w)),    (57.60)
0 = h(π(w), w),
for all w ∈ W⁰ and such that the autonomous system with output
{W⁰, s, c} is immersed into a system
ξ̇ = φ(ξ),
u = γ(ξ),
It is easy to see that the controller thus defined solves the problem of output regulation. In fact, it is immediately seen that all the eigenvalues of the Jacobian matrix of the vector field
at (x, ξ0, ξ1) = (0, 0, 0), which has the form
have negative real part. Moreover, by hypothesis, there exist mappings x = π(w), u = c(w), and ξ1 = τ(w) such that the conditions of Equation 57.60 hold and
(∂τ/∂w) s(w) = φ(τ(w)),
c(w) = γ(τ(w)).
This shows that the graph of the mapping
defined on a neighborhood Ξ⁰ of the origin in ℝ^ν, in which φ(0) = 0 and γ(0) = 0, and the two matrices
Φ = [∂φ/∂ξ]_{ξ=0}
have negative real part. Now, consider the controller
(57.61)
is an invariant (and locally exponentially attractive) manifold for the corresponding closed-loop system of Equation 57.47 and this shows, as in the case of full information, that the condition (R)EF is satisfied.
(57.62)
REMARK 57.9
Γ = [∂γ/∂ξ]_{ξ=0}
are such that the pair
The controller of Equation 57.64 consists of the parallel connection of two subsystems: the subcontroller
is stabilizable for some choice of the matrix N, and the pair (57.63)
is detectable. To see that the conditions indicated in Theorem 57.4 are sufficient, choose N so that Equation 57.62 is stabilizable. Then, observe that, as a consequence of the hypotheses on Equations 57.62 and 57.63, the triplet
ξ̇1 = φ(ξ1) + Ne,
u = γ(ξ1),    (57.65)
and the subcontroller
The role of the second subcontroller, which is a linear system, is nothing else than that of stabilizing in the first approximation the interconnection

ẋ = f(x, w, γ(ξ1) + u),
is stabilizable and detectable. Choose K, L, M so that all the eigenvalues of
that is, the interconnection of the controlled plant and the first subcontroller. The role of the first subcontroller, on the other
hand, is that of producing an input that generates the desired steady-state response. As a matter of fact, the identities
(which hold by construction) show that the submanifold
achieved if Equation 57.67 is a linear system, but it may not be possible in general. If Equation 57.67 is immersed into a linear detectable system, then the previous construction shows that the error feedback output regulation problem is solvable by a linear controller. Fortunately, there are simple conditions to test whether a given nonlinear system can be immersed into a linear system, and the use of these conditions may yield interesting and powerful corollaries of Theorem 57.4.
is an invariant manifold of the composite system
i.e., of the closed-loop system driven by the exosystem, and on this manifold the error map e = h(x, w) is zero. The role of the subcontroller of Equation 57.65 is that of producing, for each initial condition in M_c, an input that keeps the trajectory of this composite system evolving on M_c (and thereby producing a response for which the error is zero). For this reason, the subcontroller of Equation 57.65 is sometimes referred to as an internal model of the generator of exogenous inputs. The role of the subcontroller of Equation 57.66 is that of rendering M_c locally exponentially attractive, so that every motion starting in a sufficiently small neighborhood of the equilibrium (x, ξ0, ξ1, w) = (0, 0, 0, 0) exponentially converges towards the desired steady-state response. The statement of Theorem 57.4 essentially says that the problem of output regulation in the case of error feedback is solvable if and only if it is possible to find a mapping c(w) that renders the identities of Equation 57.60 satisfied for some π(w) and, moreover, is such that the autonomous system with output
COROLLARY 57.1 The error feedback output regulation problem is solvable by means of a linear controller if the pair (A, B) is stabilizable, the pair (C, A) is detectable, there exist mappings x = π(w) and u = c(w), with π(0) = 0 and c(0) = 0, both defined in a neighborhood W⁰ ⊂ W of the origin, satisfying the conditions (of Equation 57.60) and such that, for some set of q real numbers a0, a1, …, a_{q−1},

L_s^q c(w) = a0 c(w) + a1 L_s c(w) + … + a_{q−1} L_s^{q−1} c(w),    (57.68)
and, moreover, the matrix
is nonsingular for every λ that is a root of the polynomial
having nonnegative real part.
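For intuition, the observable linear pair into which {W⁰, s, c} is immersed can be written down directly from the coefficients a0, …, a_{q−1} of Equation 57.68, via the map τ(w) = (c, L_s c, …, L_s^{q−1} c). The sketch below is an illustration of this companion-form construction, not code from the text; the harmonic example c with L_s²c = −a²c (so q = 2, a0 = −a², a1 = 0, here with a = 2) is an assumed test case.

```python
import numpy as np

# Sketch: from L_s^q c = a0 c + a1 L_s c + ... + a_{q-1} L_s^{q-1} c, the map
# tau(w) = (c, L_s c, ..., L_s^{q-1} c) immerses {W0, s, c} into the observable
# linear pair (Phi, Gamma) in companion form below.

def immersion_pair(a):
    """a = [a0, ..., a_{q-1}] from Equation 57.68."""
    q = len(a)
    Phi = np.zeros((q, q))
    Phi[:-1, 1:] = np.eye(q - 1)      # xi_i' = xi_{i+1} (shift structure)
    Phi[-1, :] = a                    # xi_q' = a0 xi_1 + ... + a_{q-1} xi_q
    Gamma = np.zeros(q)
    Gamma[0] = 1.0                    # u = Gamma xi reads the first coordinate
    return Phi, Gamma

# Harmonic example: on the exosystem w1' = a w2, w2' = -a w1 with a = 2,
# c(w) satisfies L_s^2 c = -4 c, i.e., q = 2, a0 = -4, a1 = 0.
Phi, Gamma = immersion_pair([-4.0, 0.0])
O = np.vstack([Gamma, Gamma @ Phi])   # observability matrix of (Gamma, Phi)
print(np.poly(Phi), np.linalg.matrix_rank(O))
```

The characteristic (and minimal) polynomial of Φ is λ² + 4, whose roots ±2j are exactly the exosystem modes, and (Γ, Φ) is observable, as the construction in the proof requires.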
PROOF 57.1 The condition given by Equation 57.68 implies that {W⁰, s, c} is immersed into a linear observable system. In particular, it is very easy to check that {W⁰, s, c} is immersed into the linear system

ξ̇ = Φξ,
u = Γξ,

in which
satisfies a certain number of special conditions, which are expressed as properties of the linear approximation of an "auxiliary" system into which the latter is required to be immersed. It is important to observe that the condition that the pair of Equations 57.62 is stabilizable implies the condition that the pair (A, B) is stabilizable; and, similarly, the condition that the pair of Equations 57.63 is detectable implies the condition that the pair (C, A) is detectable. Thus, the conditions of Theorem 57.4 include, as expected, the trivial necessary conditions mentioned earlier for the fulfillment of (S)EF. Moreover, the condition that the pair of Equations 57.63 is detectable also implies the condition that the pair (Γ, Φ) is detectable. Therefore, a necessary condition for the solution of the problem of output regulation in the case of error feedback is that, for some c(w) satisfying Equation 57.60, the autonomous system with outputs of Equation 57.67 is immersed into a system whose linear approximation at the equilibrium ξ = 0 is detectable. This can always be
Φ = diag(Φ̂, …, Φ̂),
Γ = diag(Γ̂, …, Γ̂),

and
Note that the minimal polynomial of Φ̂ is equal to p(λ). After having chosen a matrix N such that the pair (Φ, N) is stabilizable, for instance N = diag(N̂, …, N̂) with
N̂ = col(0, 0, …, 0, 1),
using a standard result in linear system theory, one can conclude that the remaining conditions of Theorem 57.4 hold because the matrix of Equation 57.69 is nonsingular for every λ that is an eigenvalue of Φ.
so long as μ stays in some open neighborhood P of the origin in the parameter space. Moreover, this controller will be such that lim_{t→∞} e(t) = 0 for every [x(0), ξ(0), w^a(0)] in a neighborhood of the origin. Since
57.3.4 Structurally Stable Regulation

Suppose now that the mathematical model of the controlled plant depends on a vector μ ∈ ℝ^p of parameters, which are assumed to be fixed, but whose actual value is not known.
Without loss of generality, we suppose μ = 0 to be the nominal value of the parameter μ and, for consistency with the analysis developed earlier, we assume f(x, w, u, μ) and h(x, w, μ) to be smooth functions of their arguments. Moreover, it is assumed also that f(0, 0, 0, μ) = 0 and h(0, 0, μ) = 0 for each value of
the controller in question trivially yields the required property of output regulation for any plant of the family of Equation 57.70, so long as μ stays in some open neighborhood P′ of the origin in the parameter space. The conditions provided in Theorem 57.4 or in Corollary 57.1 can easily be translated into necessary and sufficient conditions for the existence of solutions of the problem of structurally stable regulation. For instance, as far as the latter is concerned, the following result holds. Set
μ.
The structurally stable output regulation problem is to find a controller, described by equations of the form of Equation 57.46, that solves the error feedback output regulation problem for each value of μ in some neighborhood P of μ = 0 in ℝ^p. The solution of the problem in question is easily provided by the results illustrated in the previous section. In fact, it suffices to look at w and μ as if they were components of an "augmented" exogenous input
and also observe that, because of the special form of the vector field s^a(w^a),
which is generated by the "augmented" exosystem
and the derivative of any function λ(w, μ) along s^a(w^a) reduces to

L_{s^a} λ(w, μ) = (∂λ(w, μ)/∂w) s(w).

For convenience, the latter will be indicated simply as
and regard the family of plants modeled by Equation 57.70 as a single plant of the form of Equation 57.42, modeled by equations of the form

ẋ = f^a(x, w^a, u).

It is easy to realize that a controller that solves the problem of output regulation for the plant thus defined also solves the problem of structurally stable output regulation for the family of Equation 57.70. In fact, by construction, this controller will stabilize in the first approximation the equilibrium (x, ξ) = (0, 0)
that is, the equilibrium (x, ξ) = (0, 0) of
for μ = 0. Since the property of stability in the first approximation is not destroyed by small parameter variations, the controller in question stabilizes any plant of the family of Equation 57.70,
COROLLARY 57.2 The structurally stable output regulation problem is solvable by means of a linear controller if the pair [A(0), B(0)] is stabilizable, the pair [C(0), A(0)] is detectable, there exist mappings x = π^a(w, μ) and u = c^a(w, μ), with π^a(0, μ) = 0 and c^a(0, μ) = 0, both defined in a neighborhood W⁰ × P ⊂ W × ℝ^p of the origin, satisfying the conditions
(57.71)

and such that, for some set of q real numbers a0, a1, …, a_{q−1},
for all (w, μ) ∈ W⁰ × P, and, moreover, the matrix
is nonsingular for every λ that is a root of the polynomial
having nonnegative real part.
It is interesting to observe that Corollary 57.2 contains, as particular cases, a number of results about structurally stable regulation of linear and nonlinear systems. In a linear system, the solutions π^a(w, μ) and c^a(w, μ) of the regulator equations (Equation 57.71) are linear in w and, therefore, for a linear exosystem ẇ = Sw, the condition of Equation 57.72 is indeed satisfied. The roots of p(λ) coincide with those of S (possibly with different multiplicity), and thus the matrix of Equation 57.73 is required to be nonsingular for every eigenvalue λ of S. Note also that this condition guarantees the existence of solutions of the regulator equations (Equation 57.71) for each value of μ. In the case of nonlinear systems and constant exogenous inputs, the condition of Equation 57.72 is trivially satisfied with q = 1 and a0 = 0. Finally, another case in which the condition of Equation 57.72 is satisfied is that of a nonlinear system in which the solutions π^a(w, μ) and c^a(w, μ) of the regulator equations (Equation 57.71) are such that c^a(w, μ) is a polynomial in w whose degree does not exceed a fixed number independent of μ, and the exosystem is any arbitrary linear system.
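The constant-input case (q = 1, a0 = 0) deserves a concrete sketch: the internal model then reduces to a pure integrator driven by the error, i.e., the familiar integral-action servo. The first-order plant and the gains below are illustrative assumptions, not taken from the text.

```python
# Sketch: constant exogenous inputs satisfy Equation 57.72 with q = 1, a0 = 0,
# so the internal model is an integrator.  Illustrative first-order plant:
#   x' = -x + u + w,  e = x,  with an unknown constant disturbance w.
dt, w = 1e-3, 0.7
x, xi = 0.0, 0.0                    # plant state and internal-model state
for _ in range(int(30.0 / dt)):
    e = x                           # regulated error
    u = -2.0 * x - xi               # stabilizing feedback + integrator state
    xi += dt * e                    # integrator (internal model) driven by e
    x += dt * (-x + u + w)
print(abs(x), xi)
```

The closed-loop matrix [[−3, −1], [1, 0]] is Hurwitz (characteristic polynomial λ² + 3λ + 1), and at the equilibrium x = 0 and ξ = w, so the integrator state absorbs the unknown constant and e(t) → 0 for every w.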
References

[1] Davison, E.J., The robust control of a servomechanism problem, IEEE Trans. Autom. Control, 21, 25–34, 1976.
[2] Fliess, M., Finite dimensional observation spaces for nonlinear systems, in Feedback Control of Linear and Nonlinear Systems, Hinrichsen, D. and Isidori, A., Eds., Springer-Verlag, 1982, 73–77.
[3] Francis, B.A., The linear multivariable regulator problem, SIAM J. Control Optimiz., 14, 486–505, 1976.
[4] Francis, B.A. and Wonham, W.M., The internal model principle of control theory, Automatica, 12, 457–465, 1977.
[5] Hepburn, J.S.A. and Wonham, W.M., Error feedback and internal model on differentiable manifolds, IEEE Trans. Autom. Control, 29, 397–403, 1984.
[6] Hepburn, J.S.A. and Wonham, W.M., Structurally stable nonlinear regulation with step inputs, Math. Syst. Theory, 17, 319–333, 1984.
[7] Huang, J. and Lin, C.F., On a robust nonlinear servomechanism problem, in Proc. 30th IEEE Conf. Decision Control, Brighton, England, December 1991, 2529–2530.
[8] Huang, J. and Rugh, W.J., On the nonlinear multivariable servomechanism problem, Automatica, 26, 963–972, 1990.
[9] Isidori, A., Nonlinear Control Systems, 2nd ed., Springer-Verlag, 1989.
[10] Isidori, A. and Byrnes, C.I., Output regulation of nonlinear systems, IEEE Trans. Autom. Control, 35, 131–140, 1990.
57.4 Lyapunov Design

Randy A. Freeman, University of California, Santa Barbara
Petar V. Kokotović, University of California, Santa Barbara
57.4.1 Introduction

Lyapunov functions represent the primary tool for the stability analysis of nonlinear systems. They verify the stability of a given trajectory, and they also provide an estimate of its region of attraction. The purpose of this text is to illustrate the utility of Lyapunov functions in the synthesis of nonlinear control systems. We will focus on recursive state-feedback design methods which guarantee robust stability for systems with uncertain nonlinearities. Lyapunov design is used in many other contexts, such as dynamic feedback, output feedback, gain assignment, estimation, and adaptive control, but such topics are beyond the scope of this chapter. Given a state-space model of a plant, the Lyapunov design strategy is conceptually straightforward and consists of two main steps:

1. Construct a candidate Lyapunov function V for the closed-loop system.
2. Construct a controller which renders its derivative V̇ negative for all admissible uncertainties.
Such a controller design guarantees, by standard Lyapunov theorems, the robust stability of the closed-loop system. The difficulty lies in the first step, because only carefully constructed Lyapunov functions can lead to success in the second step. In other words, for an arbitrary Lyapunov function candidate V, it is likely that no controller can render V̇ negative in the entire region of interest. Those select candidates which do lead to success in the second step are called control Lyapunov functions. Our first design step should, therefore, be to construct a control Lyapunov function for the given system; this will then ensure the existence of controllers in the second design step. In Section 57.4.2 we review the Lyapunov redesign method, in which a Lyapunov function is known for the nominal system (the system without uncertainties) and is used as the control Lyapunov function for the uncertain system. We will see that this method is essentially limited to systems whose uncertainties satisfy a restrictive matching condition. In Section 57.4.3 we show how such limitations can be avoided by taking the uncertainty into account while building the control Lyapunov function. We then present a recursive robust control design procedure in Section 57.4.4 for a class of uncertain nonlinear systems. Flexibilities in this recursive design are discussed in Section 57.4.6.
57.4.2 Lyapunov Redesign
A standard method for achieving robustness to state-space uncertainty is Lyapunov redesign; see [12]. In this method, one begins with a Lyapunov function for a nominal closed-loop system and then uses this Lyapunov function to construct a controller which guarantees robustness to given uncertainties. To illustrate this method, we consider the system,

ẋ = F(x) + G(x)u + Δ(x, t),    (57.74)
at all points where ∇V(x) · G(x) = 0. This inequality constraint on the uncertainty Δ is necessary for the Lyapunov redesign method to succeed. Unfortunately, there are two undesirable aspects of this necessary condition. First, the allowable size of the uncertainty Δ is dictated by F and V and can thus be severely restricted. Second, this inequality (Equation 57.79) cannot be checked a priori on the system (Equation 57.74) because it depends on the choice for V. These considerations lead to the following question. Are there structural conditions that can be imposed on the uncertainty Δ so that the necessary condition (Equation 57.79) is automatically satisfied? One such structural condition is obvious. If we require that the uncertainty Δ is of the form,

Δ(x, t) = G(x) · Δ̄(x, t),    (57.80)
where F and G are known functions comprising the nominal system and Δ is an uncertain function known only to lie within some bounds. For example, we may know a function ρ(x) so that |Δ(x, t)| ≤ ρ(x). A more general uncertainty Δ would also depend on the control variable u, but for simplicity we do not consider such uncertainty here. We assume that the nominal system is stabilizable, that is, that some state feedback u_nom(x) exists so that the nominal closed-loop system,
for some uncertain function Δ̄, then clearly ∇V · Δ = 0 at all points where ∇V · G = 0, and thus the necessary condition (Equation 57.79) is satisfied. In the literature, Equation 57.80 is called the matching condition because it allows the system (Equation 57.74) to be written
has a globally asymptotically stable equilibrium at x = 0. We also assume knowledge of a Lyapunov function V for this system so that
where now the uncertainty Δ̄ is matched with the control u, that is, it enters the system through the same channel as the control [2], [4], [12]. There are many methods available for the design of u_rob(x) when the matching condition (Equation 57.80) is satisfied. For example, if the uncertainty is such that |Δ̄(x, t)| ≤ ρ(x) for some known function ρ, then the control,
whenever x ≠ 0. Our task is to design an additional robustifying feedback u_rob(x) so that the composite feedback u = u_nom + u_rob robustly stabilizes the system (Equation 57.74), that is, guarantees stability for every admissible uncertainty Δ. It suffices that the derivative of V along closed-loop trajectories is negative for all such uncertainties. We compute this derivative as follows:
Can we make this derivative negative by some choice of u_rob(x)? Recall from Equation 57.76 that the first of the two terms in Equation 57.77 is negative; it remains to examine the second of these terms. For those values of x for which the coefficient ∇V(x) · G(x) of the control u_rob(x) is nonzero, we can always choose the value of u_rob(x) large enough to overcome any finite bound on the uncertainty Δ and thus make the second term in Equation 57.77 negative. The only problems occur on the set where ∇V(x) · G(x) = 0, because on this set
regardless of our choice for the control. Thus, to guarantee the negativity of V̇, the uncertainty Δ must satisfy
yields
The first term in Equation 57.83 is negative from the nominal design (Equation 57.76), and the second term is also negative because we know that |Δ̄(x, t)| ≤ ρ(x). The composite control u = u_nom + u_rob thus guarantees stability and robustness to the uncertainty Δ. This controller (Equation 57.82), proposed, for example, by [8], is likely to be discontinuous at points where ∇V(x) · G(x) = 0. Indeed, in the scalar input case, Equation 57.82 becomes

u_rob(x) = −ρ(x) sgn(∇V(x) · G(x)),
which is discontinuous unless ρ(x) = 0 whenever ∇V(x) · G(x) = 0. Corless and Leitmann [4] introduced a continuous approximation to this controller which guarantees convergence, not to the point x = 0, but to an arbitrarily small prescribed neighborhood of this point. We will return to this continuity issue in Section 57.4.5.
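A scalar simulation makes the redesign and its discontinuity concrete. Everything below (the plant, the bound ρ(x) = 2|x| + 1, the particular admissible Δ, and the gains) is an illustrative assumption, not an example from the text.

```python
import numpy as np

# Sketch of Lyapunov redesign for x' = u + Delta(x, t), where Delta is
# trivially matched and |Delta| <= rho(x) = 2|x| + 1.  Nominal design:
# u_nom = -x with V = x^2, so grad V . G = 2x, and the scalar form of
# Equation 57.82 gives the discontinuous term u_rob = -rho(x) sgn(2x).
rho = lambda x: 2.0 * abs(x) + 1.0

dt, x = 1e-3, 3.0
for k in range(int(5.0 / dt)):
    t = k * dt
    delta = rho(x) * np.sin(5.0 * t)       # one admissible uncertainty
    u = -x - rho(x) * np.sign(2.0 * x)     # u = u_nom + u_rob
    x += dt * (u + delta)
print(abs(x))
```

In continuous time V̇ = 2x(−x − ρ(x) sgn(2x) + Δ) ≤ −2x², so the state is driven toward zero; in the Euler simulation the sgn term chatters in a small neighborhood of the origin, which is precisely the continuity issue taken up in Section 57.4.5.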
We have seen that, because of the necessary condition (Equation 57.79), the Lyapunov redesign method is essentially limited to systems whose uncertainties satisfy the restrictive matching condition. In the next sections, we will take a different look at Equation 57.79 and obtain much weaker structural conditions on the uncertainty, which still allow a systematic robust controller design.
57.4.3 Beyond Lyapunov Redesign In the previous section, we have seen that, if a Lyapunov function V is to guarantee robustness to an uncertainty A, then the inequality
must be satisfied at all points where ∇V(x) · G(x) = 0. In the Lyapunov redesign method, this inequality was viewed as a constraint on the uncertainty Δ. Now let us instead view this inequality as a constraint on the Lyapunov function V. This new look at Equation 57.85 will lead us beyond Lyapunov redesign: our construction of V will be based on Equation 57.85 rather than on the nominal system. In other words, we will take the uncertainty Δ into account during the construction of V itself. To illustrate our departure from Lyapunov redesign, consider the second-order, single-input uncertain system,
where Δ1 and Δ2 are uncertain functions which satisfy some known bounds. Let us try Lyapunov redesign. The first step would be to find a state feedback u_nom(x) so that the nominal closed-loop system,
ρ1(x1) ≤ c|x1|, that is, that the uncertainty Δ1 is restricted to exhibit only linear growth in x1 at a rate determined by the constant c. In other words, if the uncertainty Δ1 does not satisfy this c-linear growth, then this particular Lyapunov redesign fails. This was to be expected because the uncertainty Δ1 does not satisfy the matching condition. The above Lyapunov redesign failed because it was based on the linear nominal system, which suggested a quadratic Lyapunov function V. Let us now ignore the nominal system and base our search for V directly on the inequality (Equation 57.85). Let μ(x1) be a smooth function so that μ(0) = 0, and consider the Lyapunov function
This function V is smooth, positive definite, and radially unbounded, and thus qualifies as a candidate Lyapunov function for our system (Equations 57.86–57.87). We will justify this choice for V in the next section; our goal here is to illustrate how we can use our freedom in the choice of the function μ to derive a necessary condition on the uncertainty Δ1 which is much less restrictive than Equation 57.91. For V in Equation 57.92, ∇V(x) · G(x) = 0 if, and only if, x2 = μ(x1), so that the necessary condition Equation 57.85 becomes
for all x1, t ∈ ℝ. Because we have left the choice for μ open, this inequality can be viewed as a constraint on the choice of V (through μ) rather than a constraint on the uncertainty Δ1. We need only impose a structural condition on Δ1 which guarantees the existence of a suitable function μ. An example of such a condition would be the knowledge of a bound ρ1(x1) so that |Δ1(x1, x2, t)| ≤ ρ1(x1); then Equation 57.93 becomes
for all x1 ∈ ℝ. It is then clear that we can satisfy Equation 57.94 by choosing, for example,
has a globally asymptotically stable equilibrium at x = 0. Because the nominal system is linear, this step can be accomplished with a linear control law u_nom(x) = Kx, and we can obtain a quadratic Lyapunov function V(x) = xᵀPx for the stable nominal closed-loop system. In this case, the necessary condition Equation 57.85 becomes
at all points where 2xᵀP[0 1]ᵀ = 0, that is, where x2 = −cx1 for some constant c > 0. We substitute x2 = −cx1 in Equation 57.90, and, after some algebra, we obtain
for all x1, t ∈ ℝ. Now suppose our knowledge of the uncertainty Δ1(x1, −cx1, t) consists of a bound ρ1(x1) so that |Δ1(x1, −cx1, t)| ≤ ρ1(x1). Then Equation 57.91 implies that
A technical detail is that this μ is not smooth at x1 = 0 unless ρ1(0) = 0, which means V in Equation 57.92 may not strictly qualify as a Lyapunov function. As we will show in Section 57.4.5, however, smooth approximations always exist that will end up guaranteeing convergence to a neighborhood of x = 0 in the final design. What is important is that this design succeeds for any function ρ1(x1), regardless of its growth. Thus the c-linear growth condition on Δ1 which appeared in the above Lyapunov redesign through Equation 57.91 is gone; this new design allows arbitrary growth (in x1) of the uncertainty Δ1. We have not yet specified the controller design; rather, we have shown how the limitations of Lyapunov redesign can be overcome through a reinterpretation of the necessary condition (Equation 57.85) as a constraint on the choice of V. Let us now return to the controller design problem and motivate our choice of V in Equation 57.92.
57.4.4 Recursive Lyapunov Design

Let us consider again the system (Equations 57.86–57.87):
We assume knowledge of two bounding functions ρ1(x1) and ρ2(x) so that all admissible uncertainties are characterized by the inequalities,
that we arrived at the same expression for μ in Equations 57.95 and 57.101, both from the viewpoint of the necessary condition (Equation 57.85) and from the viewpoint of the conceptual system (Equation 57.100). Let us now verify that the choice of Equation 57.102 for V leads to a robust controller design for the system (Equations 57.96–57.97). Computing V̇ we obtain
V̇ = 2x1[x2 + Δ1(x, t)] + 2[x2 − μ(x1)][u + Δ2(x, t) − μ′(x1)(x2 + Δ1(x, t))]
(57.103)

for all x ∈ ℝ² and all t ∈ ℝ. Note that the bound ρ1 on the uncertainty Δ1 is allowed to depend only on the state x1; this is the structural condition suggested in the previous section and will be characterized more completely below. We will take a recursive approach to the design of a robust controller for this system. This approach is based on the integrator backstepping technique developed by [11] for the adaptive control of nonlinear systems. The first step in this approach is to consider the scalar system,
which we obtain by treating the state variable x2 in Equation 57.96 as a control variable ū. This new system (Equation 57.100) is only conceptual; its relationship to the actual system (Equations 57.96–57.97) will be explored later. Let us next design a robust controller ū = μ(x1) for this conceptual system. By construction, this new system satisfies the matching condition, and so we may use the Lyapunov redesign method to construct the feedback ū = μ(x1). The nominal system is simply ẋ1 = ū, which can be stabilized by a nominal feedback ū_nom = −x1. A suitable Lyapunov function for the nominal closed-loop system ẋ1 = −x1 would be V1(x1) = x1². We then choose ū = ū_nom + ū_rob, where ū_rob is given, for example, by Equation 57.82 with ρ = ρ1. The resulting feedback function for ū is
If we now apply the feedback ū = μ(x1) to the conceptual system (Equation 57.100), we achieve V̇1 ≤ −2x1² and thus guarantee stability for every admissible uncertainty Δ1. Let us assume for now that this function μ is (sufficiently) smooth; we will return to the question of smoothness in Section 57.4.5. The idea behind the backstepping approach is to use the conceptual controller (Equation 57.101) in constructing a control Lyapunov function V for the actual system (Equations 57.96–57.97). If we treat the Lyapunov function as a penalty function, it is reasonable to include in V a term penalizing the difference between the state x2 and the conceptual control ū designed for the conceptual system (Equation 57.100):
We have seen in the previous section that this choice satisfies the necessary condition (Equation 57.85) and is thus a good candidate for our system (Equations 57.96–57.97). It is no coincidence
where μ′ := dμ/dx1. Rearranging terms and using Equation 57.101,

V̇ ≤ −2x1² + 2[x2 − μ(x1)][x1 + u + Δ2(x, t) − μ′(x1)(x2 + Δ1(x, t))]
(57.104)

The effect of backstepping is that the uncertainties enter the expression (Equation 57.104) with the same coefficient as the control variable u; in other words, the uncertainties effectively satisfy the matching condition. As a result, we can again apply the Lyapunov redesign technique; the feedback control u = u_nom + u_rob with a nominal control,
yields
V̇ ≤ −2x1² − 2[x2 − μ(x1)]² + 2[x2 − μ(x1)][u_rob + Δ2(x, t) − μ′(x1)Δ1(x, t)].    (57.106)
We may complete the design by choosing ur,b as in Equation 57.82:
where ρ is some function satisfying |Δ2(x, t) − μ′(x1)Δ1(x, t)| ≤ ρ(x). This yields
for all admissible uncertainties Δ1 and Δ2, and thus robust stability is guaranteed.
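The complete two-step design can be sketched numerically. The particular uncertainties, the bound ρ2 = 0.5, the gains, and the initial condition below are illustrative assumptions; the bound ρ1(x1) = |x1|³ matches Example 57.8, for which μ(x1) = −x1 − x1³ is smooth, and the nominal-control expression is one form consistent with the derivation in Equations 57.104–57.106.

```python
import numpy as np

# Sketch of the backstepping controller for x1' = x2 + D1, x2' = u + D2 with
# |D1| <= rho1(x1) = |x1|^3 and |D2| <= 0.5 (illustrative bounds):
#   mu(x1) = -x1 - x1^3                        (Equation 57.101, smooth here)
#   u_nom  = -x1 - [x2 - mu(x1)] + mu'(x1) x2  (cancels terms in 57.104)
#   u_rob  = -rho(x) sgn(2 [x2 - mu(x1)])      (as in Equation 57.107)
# with rho(x) = 0.5 + |mu'(x1)| rho1(x1) >= |D2 - mu'(x1) D1|.
rho1 = lambda x1: abs(x1) ** 3
mu = lambda x1: -x1 - x1 ** 3
dmu = lambda x1: -1.0 - 3.0 * x1 ** 2

dt, x1, x2 = 1e-3, 1.0, -2.0
for k in range(int(20.0 / dt)):
    t = k * dt
    D1 = rho1(x1) * np.sin(3.0 * t)          # admissible uncertainties
    D2 = 0.5 * np.cos(t)
    z = x2 - mu(x1)                           # backstepping error variable
    rho = 0.5 + abs(dmu(x1)) * rho1(x1)
    u = -x1 - z + dmu(x1) * x2 - rho * np.sign(2.0 * z)
    x1 += dt * (x2 + D1)
    x2 += dt * (u + D2)
print(abs(x1), abs(x2))
```

Along trajectories, V = x1² + z² satisfies V̇ ≤ −2x1² − 2z² for all admissible uncertainties, so both states settle near the origin (up to the sgn-induced chattering of the Euler simulation).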
EXAMPLE 57.8: The above second-order design applies to the following system:
where we let the uncertainty Δ1 be any function satisfying |Δ1(x, t)| ≤ |x1|³. In this case, the function μ in Equation 57.101 would be μ(x1) = −x1 − x1³, which is smooth as
required. The nominal control u_nom is given by Equation 57.105, which, for this example, becomes
The Lyapunov function will be of the form
Adding the robustifying term (Equation 57.107), we obtain the following robust controller:
where the functions μi are constructed according to the recursive procedure described above. For this approach to succeed, it is sufficient that the uncertainties Δi satisfy the following strict feedback condition:
|Δi(x, t)| ≤ ρi(x1, …, xi),    i = 1, …, n,
for known functions ρi. The restriction here is that the ith bound ρi can depend only on the first i states (x1, …, xi).
This robust controller is not continuous at points where x2 + x1 + x1³ = 0; an alternate smooth design will be proposed in Section 57.4.5. The above controller design for the system (Equations 57.96–57.97) is a two-step design. In the first step, we considered the scalar system (Equation 57.100) and designed a controller ū = μ(x1) which guaranteed robust stability. In the second step, we used the Lyapunov function (Equation 57.102) to construct a controller (Equations 57.105 + 57.107) for the actual system (Equations 57.96–57.97). This step-by-step design can be repeated to obtain controllers for higher order systems. For example, suppose that instead of the system (Equations 57.96–57.97), we have the system
where Δ1 and Δ2 are as in Equations 57.98–57.99, Δ3 is a new uncertainty, and v is the new control variable. We can use the controller u(x) := u_nom(x) + u_rob(x) designed above in Equations 57.105 + 57.107 to obtain the following Lyapunov function W for our new system:
This recursive Lyapunov design technique applies to uncertain systems more general than Equations 57.116–57.117. Multi-input versions are possible, and the strict feedback condition (Equation 57.119) can be relaxed to allow the bound ρi to depend also on the state xi+1. In particular, the bound ρn on the last uncertainty Δn can also depend on the control variable u. More details can be found in [5], [14], [16], [17].
57.4.5 Smooth Control Laws

The control law μ_i designed in the ith step of the recursive design becomes part of the Lyapunov function (Equation 57.118) in the next step. It is imperative, therefore, that each such function μ_i be differentiable. To illustrate how smooth functions can be obtained at each step, let us return to the second-order design in Section 57.4.4. The first step was to design a robust controller ū = μ(x1) for the conceptual system,
ẋ1 = ū + Δ1(x, t),   (57.120)
Here V is the Lyapunov function (Equation 57.102) used above, and we have simply added a term which penalizes the difference between the state variable z and the controller u(x) designed above for the old system (Equations 57.96-57.97). If u(x) is smooth, then Equation 57.115 qualifies as a candidate Lyapunov function for our new system; see Section 57.4.5 below for details on choosing a smooth function u(x). We can now construct a controller u = u(x, z) for our new system in the same manner as the construction of μ(x1) and u(x) above. We have thus obtained a systematic method for constructing Lyapunov functions and robust controllers for systems of the form,

ẋ1 = x2 + Δ1(x, t),
ẋ2 = x3 + Δ2(x, t),
with |Δ1| ≤ ρ1(x1). In general, when ρ1(0) ≠ 0, we cannot choose μ as in Equation 57.101 because of the discontinuity at x1 = 0. One alternative is to approximate Equation 57.101 by a smooth function as follows:
where δ(x1) is a smooth, strictly positive function. This choice for μ is once differentiable, and it reduces to the discontinuous function (Equation 57.101) when δ ≡ 0. We compute the derivative of V1(x1) = x1^2 for the system (Equation 57.120) with the smooth feedback (Equation 57.121):
(57.122)
If we now choose δ(x1) so that ρ1δ is small, we see that V̇1 < 0 except in a small neighborhood of x1 = 0.
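A smoothed control of this kind can be checked in simulation. The sketch below uses an illustrative bound ρ(x) = 1 + x^2 (so ρ(0) ≠ 0) and the smoothing u = −x − ρ(x)^2 x / (ρ(x)|x| + δ); this smoothing formula and the disturbance are assumptions for illustration, not Equation 57.121 itself.

```python
import math

# Smooth approximation sketch for x' = u + d(x, t), |d| <= rho(x), rho(0) != 0.
# With V = x**2 one can show Vdot <= -2*x**2 + 2*delta, so the state converges
# only to a neighborhood of the origin whose size shrinks with delta.

def rho(x):
    return 1.0 + x * x            # rho(0) = 1 != 0

def u_smooth(x, delta=0.01):
    return -x - rho(x)**2 * x / (rho(x) * abs(x) + delta)

def simulate(T=5.0, dt=1e-3):
    x, t = 2.0, 0.0
    while t < T:
        d = rho(x) * math.sin(10.0 * t)   # worst-case-sized disturbance
        x += dt * (u_smooth(x) + d)
        t += dt
    return x

xf = simulate()
```

The trajectory settles into a small residual band around x = 0 rather than converging exactly, which is the trade-off discussed above: smoothness of the control law in exchange for exact convergence.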
In the second design step, we will choose u_nom as before in Equation 57.105, but instead of Equation 57.106 we obtain the Lyapunov derivative:
where the extra term 2ρ1δ comes from Equation 57.122 and is a result of our smooth choice for μ(x1). Our remaining task is to construct the robustifying term u_rob. Using the bound |Δ2(x, t) − μ′(x1)Δ1(x, t)| ≤ ρ(x) as before, we obtain
We cannot choose u_rob as before in Equation 57.107 because it is not continuous at points where x2 = μ(x1). We could choose a smooth approximation to Equation 57.107, as we did above for the function μ, but, to illustrate an alternative approach, we will instead make use of Young's inequality,
which holds for any a, b ∈ R and any ε > 0. Applying this inequality to the last term in Equation 57.124, we obtain
where ε(x) is a smooth, strictly positive function to be chosen below. Thus we obtain the bound in Equation 57.128.
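Young's inequality can be verified numerically. The sketch below checks one common form, ab ≤ (ε/2)a^2 + (1/(2ε))b^2, which follows from (√ε·a − b/√ε)^2 ≥ 0; the section may use a differently scaled variant.

```python
import random

# Numerical check of Young's inequality:
#   a*b <= (eps/2)*a**2 + (1/(2*eps))*b**2   for all a, b and any eps > 0.
random.seed(0)
violations = 0
for _ in range(10000):
    a = random.uniform(-10.0, 10.0)
    b = random.uniform(-10.0, 10.0)
    eps = random.uniform(1e-3, 10.0)
    # small tolerance guards against floating-point rounding
    if a * b > (eps / 2.0) * a * a + b * b / (2.0 * eps) + 1e-12:
        violations += 1
```

The free parameter ε is exactly the design freedom exploited above: it trades the weight on the state-dependent term against the weight on the uncertainty term.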
It is always possible to choose δ(x1) and ε(x) so that the right-hand side of Equation 57.128 is negative, except possibly in a neighborhood of x = 0. Thus we have gained smoothness in the control law but lost exact convergence of the state to zero.

57.4.6 Design Flexibilities

The degrees of freedom in the recursive Lyapunov design procedure outlined above are numerous and allow for the careful shaping of the closed-loop performance. However, this procedure is new, and guidelines for exploiting design flexibilities are only beginning to appear. Our purpose in this section is to point out several of these degrees of freedom and discuss the consequences of various design choices. We have already seen that the choices for the functions μ_i in the Lyapunov function (Equation 57.118) are by no means unique, nor is the final choice for the control law. For example, when choosing a robustifying term u_rob in some step of the design, should we choose a smooth approximation to Equation 57.82 as in Equation 57.121, or should we use Young's inequality and choose Equation 57.127? Also, how should we choose the nominal term u_nom? Is it always good to cancel nonlinearities and apply linearlike feedback as in Equation 57.105, and if so, which gain should we use? Such questions are difficult in general, but there are some guidelines that apply in many situations. For example, consider the task of robustly stabilizing the point x = 0 of the simple scalar system,

ẋ = −x^3 + u + Δ(x, t),   (57.129)

where Δ is an uncertain function with a known bound |Δ(x, t)| ≤ |x|. Because the matching condition is satisfied, we can apply the Lyapunov redesign method and choose u = u_nom + u_rob. One choice for the nominal control would be u_nom(x) = x^3 − x, which, together with a robustifying term as in Equation 57.82, yields the control law (Equation 57.130). This control law is valid from the viewpoint of Lyapunov redesign, and it indeed guarantees robustness to the uncertainty Δ. In such a choice, however, large positive feedback x^3 is used to cancel the nonlinearity −x^3 in Equation 57.129. This is absurd because the nonlinearity −x^3 is beneficial for stabilization, and positive feedback x^3 will lead to unnecessarily large control effort and will cause robustness problems more severe than those caused by the uncertainty Δ. Clearly, a much more reasonable choice is u(x) = −2x, but how can we identify better choices in a more general setting? One option would be to choose the control to minimize some meaningful cost functional. For example, the control (Equation 57.131) is smooth and minimizes the worst-case cost functional

J = sup_{|Δ| ≤ |x|} ∫_0^∞ [x^2 + u^2] dt
for this system (Equation 57.129). The two control laws (Equations 57.130 and 57.131) are shown in Figure 57.1. We see that the optimal control (Equation 57.131) recognizes the benefit of the nonlinearity −x^3 in Equation 57.129 and accordingly produces little control effort for large x. Moreover, this optimal control is never positive feedback. Unfortunately, the computation of the optimal control (Equation 57.131) requires the solution of a Hamilton-Jacobi-Isaacs partial differential equation, and will be difficult and expensive for all but the simplest problems. As a compromise between the benefits of optimality and its computational burden, we might consider the inverse optimal
Figure 57.1 Comparison of the control laws Equation 57.130 (solid) and Equation 57.131 (dotted).
Figure 57.2 A comparison of the control laws Equation 57.135 (solid) and Equation 57.131 (dotted).
control problem, summarized, for example, in [7]. In this approach, we start with a Lyapunov function as in the Lyapunov redesign method above. We then show that this Lyapunov function is in fact the value function for some meaningful optimal stabilization problem, and we use this information to compute the corresponding optimal control. Freeman and Kokotović [6] have shown that the pointwise solutions of static minimization problems of the form,

minimize  u^T S u,   S = S^T > 0,   (57.133)
subject to  sup_Δ [V̇(x, u, Δ) + σ(x)] ≤ 0,   (57.134)

yield optimal controllers (in this inverse sense) for meaningful cost functionals, where V is a given control Lyapunov function and σ is a positive function whose choice represents a degree of freedom. When the system is jointly affine in the control u and the uncertainty Δ, this optimization problem (Equations 57.133-57.134) is a quadratic program with linear constraints and thus has an explicit solution u(x). For example, the solution to Equations 57.133-57.134 for the system (Equation 57.129) with V = x^2 and σ = 2x^2 yields the control law,

u(x) = x^3 − 2x,  when x^2 ≤ 2,
u(x) = 0,         when x^2 > 2.
   (57.135)

As shown in Figure 57.2, this control (Equation 57.135) is qualitatively the same as the optimal control (Equation 57.131); both recognize the benefit of the nonlinearity −x^3 and neither one is ever positive feedback. The advantage of the inverse optimal approach is that the controller computation is simple once a control Lyapunov function is known. We thus have some guidelines for choosing the intermediate control laws μ_i at each step of the design, namely, we can avoid the wasteful cancellations of beneficial nonlinearities. However, these guidelines, when used at early steps of the recursive design, have not yet been proved beneficial for the final design. We have shown that the choices for the functions μ_i in Equation 57.118 represent important degrees of freedom in the recursive Lyapunov design procedure. Perhaps even more important is the choice of the form of V itself. Recall that, given a Lyapunov function V_i and a control μ_i at the ith design step, we constructed the new Lyapunov function V_{i+1} as in Equation 57.136.
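The pointwise quadratic program can be solved in closed form for the scalar example. The sketch below assumes the data as read from this example (system ẋ = −x^3 + u + Δ with |Δ| ≤ |x|, V = x^2, σ = 2x^2); the worst-case constraint reduces to a single linear inequality in u, and minimizing u^2 subject to it is a one-dimensional projection.

```python
# Pointwise min-norm solution of the QP (Equations 57.133-57.134) for the
# scalar example.  The constraint sup_D [Vdot + sigma] <= 0 becomes
#   2*x*(-x**3 + u) + 2*x**2 + 2*x**2 <= 0,  i.e.  c*u <= d  with
#   c = 2*x  and  d = 2*x**4 - 4*x**2.

def u_qp(x):
    if x == 0.0:
        return 0.0
    c = 2.0 * x                          # coefficient of u in the constraint
    d = 2.0 * x**4 - 4.0 * x**2          # constraint is c*u <= d
    bound = d / c                        # u <= bound (x > 0) or u >= bound (x < 0)
    if c > 0.0:
        return min(0.0, bound)           # closest point to u = 0 in {u <= bound}
    return max(0.0, bound)               # closest point to u = 0 in {u >= bound}

# The projection reproduces the piecewise law of Equation 57.135:
# u = x**3 - 2*x when x**2 <= 2, and u = 0 when x**2 > 2.
checks = []
for x in [-3.0, -1.5, -0.5, 0.5, 1.0, 1.4, 2.0, 3.0]:
    expected = x**3 - 2.0 * x if x * x <= 2.0 else 0.0
    checks.append(abs(u_qp(x) - expected) < 1e-12)
```

Note how the projection automatically returns u = 0 wherever the uncontrolled nonlinearity −x^3 already satisfies the Lyapunov constraint, which is exactly why the inverse optimal control never wastes effort canceling beneficial nonlinearities.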
This choice,

V_{i+1}(x) = V_i(x) + [x_{i+1} − μ_i(x1, ..., xi)]^2,   (57.136)

is not the only choice that will lead to a successful design, and we are thus confronted with another degree of freedom in the design procedure. For example, instead of Equation 57.136, we can choose

V_{i+1}(x) = κ(V_i(x)) + [x_{i+1} − μ_i(x1, ..., xi)]^2,   (57.137)
where κ: R+ → R+ is any smooth, positive-definite, unbounded function whose derivative is everywhere strictly positive. This function κ represents a nonlinear weighting on the term V_i and can have a large effect on the control laws obtainable in future steps. The last degree of freedom we wish to discuss involves the quadratic-like term in Equations 57.136 and 57.137. Praly et al. [15] have shown that Equation 57.137 can be replaced by the more general expression

V_{i+1}(x) = κ(V_i(x)) + ∫ from μ_i(x1, ..., xi) to x_{i+1} of φ(s) ds   (57.138)

for a suitable choice of the function φ. Indeed, Equation 57.137 is a special case of Equation 57.138 for φ(s) = 2[s − μ_i(x1, ..., xi)]. This degree of freedom in the choice of φ can be significant; for example, it allowed extensions of the recursive design to the nonsmooth case in [15]. It can also be used to reduce the unnecessarily large control gains often caused by the quadratic term in Equation 57.136. To illustrate this last point, let us return to the second-order example (Equations 57.109-57.110) given by

ẋ1 = x2 + Δ1(x, t),   (57.139)
ẋ2 = u,   (57.140)
57.4. LYAPUNOV DESIGN
where the uncertainty Δ1 is any function satisfying |Δ1(x, t)| ≤ |x1|^3. Recall that using the Lyapunov function,

V(x) = x1^2 + (x2 + x1 + x1^3)^2,   (57.141)

we designed the following robust controller (Equation 57.112) for this system:

u(x) = −(x2 + x1 + x1^3) − x1 − (1 + 3x1^2)x2 − (|x1|^3 + 3|x1|^5) sgn(x2 + x1 + x1^3).   (57.142)
This controller is not continuous at points where x2 + x1 + x1^3 = 0. In other words, the local controller gain ∂u/∂x2 is infinite at such points. Such infinite gain will cause high-amplitude chattering along the manifold M described by x2 + x1 + x1^3 = 0. As a result of such chattering, this control law may not be implementable because of the unreasonable demands on the actuator. However, as was shown in Section 57.4.5, we can use this same Lyapunov function (Equation 57.141) to design a smooth robust controller ū for our system. Will such a smooth controller eliminate the chattering caused by the discontinuity in Equation 57.142? Surprisingly, the answer is no. One can show that the local controller gain ∂ū/∂x2, although finite because of the smoothness of ū, grows rapidly along the manifold M. Thus for large signals, this local gain is extremely large and can cause chattering as if it were infinite. Figure 57.3 shows the Lyapunov function V in Equation 57.141, plotted as a function of the two variables, z1 := x1 and z2 := x2 + x1 + x1^3. A smooth control law ū designed using this V is shown in Figure 57.4, again plotted as a function of z1 and z2. Note that the rapid growth of the local gain of ū along the manifold z2 = 0 is clearly visible in this figure. One might conclude that such high gain is unavoidable for this particular system. This conclusion is false, however, because the growth of the local gain is an artifact of the quadratic form of the Lyapunov function (Equation 57.141) and has nothing to do with robust stability requirements for this system. Freeman and Kokotović [6] have shown how to choose the function φ in Equation 57.138 to reduce greatly the growth of the local gain. For this example (Equations 57.139-57.140), they achieved a local gain of much lower polynomial order than that caused by the quadratic V in Equation 57.141.
Their new, flattened Lyapunov function is shown in Figure 57.5, and the corresponding control law is shown in Figure 57.6. The reduced growth of the local gain is evident when comparing Figures 57.6 and 57.4. The control signals generated from a particular initial condition are compared in Figure 57.7. These controls produce virtually the same state trajectories, but the chattering caused by the old control law has been eliminated by the new one.
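The flattening mechanism can be illustrated with the φ degree of freedom of Equation 57.138. The sketch below compares the gradient of the quadratic term (φ(s) = 2s, taking μ_i = 0 for simplicity) against a hypothetical saturated φ; the saturation level is an illustrative choice, not the specific φ constructed in [6].

```python
# The quadratic term z**2 corresponds to phi(z) = 2*z, whose gradient grows
# without bound.  A saturated phi flattens the Lyapunov function for large
# arguments and keeps its gradient (and hence later control gains) bounded.

def grad_quadratic(z):            # d/dz of z**2, i.e., phi(z) = 2*z
    return 2.0 * z

def grad_flattened(z, c=1.0):     # phi(z) = 2*clip(z, -c, c), a saturated choice
    return 2.0 * max(-c, min(c, z))

g_quad = [abs(grad_quadratic(z)) for z in (1.0, 10.0, 100.0)]
g_flat = [abs(grad_flattened(z)) for z in (1.0, 10.0, 100.0)]
```

Since the Lyapunov gradient enters the control law at the next design step, bounding it is precisely what tames the local gain growth along the manifold z2 = 0 discussed above.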
Figure 57.3 Quadratic-like Lyapunov function (Equation 57.141). (From Freeman, R. A. and Kokotović, P. V., Automatica, 29(6), 1425-1437, 1993. With permission.)
Figure 57.4 Controller from quadratic-like Lyapunov function (Equation 57.141). (From Freeman, R. A. and Kokotović, P. V., Automatica, 29(6), 1425-1437, 1993. With permission.)
Figure 57.5 New flattened Lyapunov function. (From Freeman, R. A. and Kokotović, P. V., Automatica, 29(6), 1425-1437, 1993. With permission.)
Figure 57.6 Controller from flattened Lyapunov function. (From Freeman, R. A. and Kokotović, P. V., Automatica, 29(6), 1425-1437, 1993. With permission.)

Figure 57.7 Comparison of control signals. (From Freeman, R. A. and Kokotović, P. V., Automatica, 29(6), 1425-1437, 1993. With permission.)

References

[1] Artstein, Z., Stabilization with relaxed controls, Nonlinear Anal., 7(11), 1163-1173, 1983.
[2] Barmish, B. R., Corless, M. J., and Leitmann, G., A new class of stabilizing controllers for uncertain dynamical systems, SIAM J. Control Optimiz., 21, 246-255, 1983.
[3] Corless, M. J., Robust stability analysis and controller design with quadratic Lyapunov functions, in Variable Structure and Lyapunov Control, Zinober, A., Ed., Springer, Berlin, 1993.
[4] Corless, M. J. and Leitmann, G., Continuous state feedback guaranteeing uniform ultimate boundedness for uncertain dynamic systems, IEEE Trans. Automat. Control, 26(5), 1139-1144, 1981.
[5] Freeman, R. A. and Kokotović, P. V., Design of 'softer' robust nonlinear control laws, Automatica, 29(6), 1425-1437, 1993.
[6] Freeman, R. A. and Kokotović, P. V., Inverse optimality in robust stabilization, SIAM J. Control Optimiz., to appear; revised November 1994.
[7] Glad, S. T., Robustness of nonlinear state feedback - A survey, Automatica, 23(4), 425-435, 1987.
[8] Gutman, S., Uncertain dynamical systems - Lyapunov min-max approach, IEEE Trans. Automat. Control, 24, 437-443, 1979.
[9] Jacobson, D. H., Extensions of Linear-Quadratic Control, Optimization and Matrix Theory, Academic, London, 1977.
[10] Jurdjevic, V. and Quinn, J. P., Controllability and stability, J. Diff. Eq., 28, 381-389, 1978.
[11] Kanellakopoulos, I., Kokotović, P. V., and Morse, A. S., Systematic design of adaptive controllers for feedback linearizable systems, IEEE Trans. Automat. Control, 36(11), 1241-1253, 1991.
[12] Khalil, H. K., Nonlinear Systems, Macmillan, New York, 1992.
[13] Krasovsky, A. A., A new solution to the problem of a control system analytical design, Automatica, 7, 45-50, 1971.
[14] Marino, R. and Tomei, P., Robust stabilization of feedback linearizable time-varying uncertain nonlinear systems, Automatica, 29(1), 181-189, 1993.
[15] Praly, L., d'Andréa Novel, B., and Coron, J.-M., Lyapunov design of stabilizing controllers for cascaded systems, IEEE Trans. Automat. Control, 36(10), 1177-1181, 1991.
[16] Qu, Z., Robust control of nonlinear uncertain systems under generalized matching conditions, Automatica, 29(4), 985-998, 1993.
[17] Slotine, J. J. E. and Hedrick, K., Robust input-output feedback linearization, Int. J. Control, 57, 1133-1139, 1993.
[18] Sontag, E. D., A Lyapunov-like characterization of asymptotic controllability, SIAM J. Control Optimiz., 21(3), 462-471, 1983.
Further Reading

Contributions to the development of the Lyapunov design methodology for systems with no uncertainties include [1], [9], [10], [13], [18]. A good introduction to Lyapunov redesign can be found in Chapter 5.5 of [12]. Corless [3] has recently surveyed various methods for the design of robust controllers for nonlinear systems using quadratic Lyapunov functions. Details of the recursive design presented in Section 57.4.4 can be found in [5], [14], [16]. The state-space techniques discussed in this chapter can be combined with nonlinear input/output techniques to obtain more advanced designs (see Chapter 56.2). Finally, when the uncertain nonlinearities are given by constant parameters multiplying known nonlinearities, adaptive control techniques apply (see Chapter 57.8).
57.5. VARIABLE STRUCTURE, SLIDING-MODE CONTROLLER DESIGN
57.5 Variable Structure, Sliding-Mode Controller Design

R. A. DeCarlo, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN
S. H. Żak, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN
S. V. Drakunov, Department of Electrical Engineering, Tulane University, New Orleans, LA

57.5.1 Introduction and Background

This chapter investigates Variable Structure Control (VSC) as a high-speed switched feedback control resulting in a sliding mode. For example, the gains in each feedback path switch between two values according to a rule that depends on the value of the state at each instant. The purpose of the switching control law is to drive the nonlinear plant's state trajectory onto a prespecified (user-chosen) surface in the state space and to maintain the plant's state trajectory on this surface for all subsequent time. This surface is called a switching surface. When the plant state trajectory is "above" the surface, a feedback path has one gain, and a different gain if the trajectory drops "below" the surface. This surface defines the rule for proper switching. The surface is also called a sliding surface (sliding manifold) because, ideally speaking, once intercepted, the switched control maintains the plant's state trajectory on the surface for all subsequent time and the plant's state trajectory then slides along this surface. The plant dynamics restricted to this surface represent the controlled system's behavior. The first critical phase of a VSC design is to define properly a switching surface so that the plant, restricted to the surface, has desired dynamics, such as stability to the origin, tracking, regulation, etc. In summary, a VSC control design breaks down into two phases. The first phase is to design or choose a switching surface so that the plant state restricted to the surface has desired dynamics.
The second phase is to design a switched control that will drive the plant state to the switching surface and maintain it on the surface upon interception. A Lyapunov approach is used to characterize this second design phase. Here a generalized Lyapunov function, that characterizes the motion of the state trajectory to the surface, is defined in terms of the surface. For each chosen switched control structure, one chooses the "gains" so that the derivative of this Lyapunov function is negative definite, thus guaranteeing motion of the state trajectory to the surface. As an introductory example, consider the first-order system ẋ(t) = u(x, t) with control
Hence, the system with control satisfies ẋ = −sgn(x) with trajectories plotted in Figure 57.8(a). Here the control u(x, t) switches, changing its value between ±1 around the surface σ(x, t) = x = 0. Hence for any initial condition x0, a finite time t1 exists for which x(t) = 0 for all t ≥ t1. Now suppose ẋ(t) = u(x, t) + v(t) where again u(x, t) = −sgn(x) and v(t) is a bounded disturbance for which sup_t |v(t)| < 1. As before, the control u(x, t) switches its value between ±1 around the surface σ(x, t) = x = 0. It follows that if x(t) > 0, then ẋ(t) = −sgn[x(t)] + v(t) < 0, forcing motion toward the line σ(x, t) = x = 0, and if x(t) < 0, then ẋ(t) = −sgn[x(t)] + v(t) > 0, again forcing motion toward the line σ(x, t) = x = 0. For a positive initial condition, this is illustrated in Figure 57.8(b). The rate of convergence to the line depends on the disturbance. Nevertheless, a finite time t1 exists for which x(t) = 0 for all t ≥ t1. If the disturbance magnitude exceeds 1, then the gain can always be adjusted to compensate. Hence, this VSC law is robust in the face of bounded disturbances, illustrating the simplicity and advantage of the VSC technique.
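This introductory example is easy to reproduce in simulation. The sketch below integrates ẋ = −sgn(x) + v(t) with an arbitrary bounded disturbance satisfying |v| < 1 (the specific v is an illustrative choice):

```python
import math

# Simulation of the introductory VSC example: x' = -sgn(x) + v(t), |v| < 1.
# The reaching rate satisfies d|x|/dt <= -(1 - sup|v|) = -0.5, so the
# surface x = 0 is reached in finite time (within about 2 s from x0 = 1).

def sgn(x):
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def simulate(x0=1.0, T=5.0, dt=1e-3):
    x, t, t_reach = x0, 0.0, None
    while t < T:
        v = 0.5 * math.sin(2.0 * math.pi * t)    # bounded disturbance
        x += dt * (-sgn(x) + v)
        t += dt
        if t_reach is None and abs(x) < 5.0 * dt:
            t_reach = t                           # first arrival near x = 0
    return x, t_reach

xf, t_reach = simulate()
```

After the finite reaching time, the discrete-time simulation chatters inside a band of width proportional to the step size, which is the numerical counterpart of the ideal sliding motion x(t) = 0.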
Figure 57.8 (a) State trajectories for the system ẋ = −sgn(x); (b) State trajectory for the system ẋ(t) = −sgn[x(t)] + v(t).
From the above example, one can see that VSC can provide a robust means of controlling (nonlinear) plants with disturbances and parameter uncertainties. Further, the advances in computer technology and high-speed switching circuitry have made the practical implementation of VSC quite feasible and of increasing interest. Indeed, pulse-width modulation control and switched dc-to-dc power converters [12] can be viewed in a VSC framework.
57.5.2 System Model, Control Structure, and Sliding Modes

System Model    This chapter investigates a class of systems with a state model nonlinear in the state vector x(·) and linear in the control vector u(·) of the form,
A sliding mode exists if, in the vicinity of the switching surface, S, the tangent or velocity vectors of the state trajectory point toward the switching surface. If the state trajectory intersects the sliding surface, the value of the state trajectory, or "representative point," remains within an ε-neighborhood of S. If a sliding mode exists on S, then S, or more commonly σ(x, t) = 0, is also termed a sliding surface. Note that interception of the surface σ_i(x, t) = 0 does not guarantee sliding on the surface for all subsequent time, as illustrated in Figure 57.9, although this is possible.
where x(t) ∈ R^n, u(t) ∈ R^m, and B(x, t) ∈ R^{n×m}; further, each entry in f(x, t) and B(x, t) is assumed continuous with a bounded continuous derivative with respect to x. In the linear time-invariant case, Equation 57.143 becomes
with A (n × n) and B (n × m) constant matrices. Associated with the system is an (n − m)-dimensional switching surface (also called a discontinuity or equilibrium manifold),
S = {(x, t) ∈ R^{n+1} | σ(x, t) = 0},   (57.145)

where

σ(x, t) = [σ1(x, t), ..., σm(x, t)]^T = 0.   (57.146)

(We will often refer to S as σ(x, t) = 0.) When there is no t dependence, this (n − m)-dimensional manifold in the state space R^n is determined as the intersection of m (n − 1)-dimensional surfaces σ_i(x, t) = 0. These surfaces are designed so that the system state trajectory, restricted to σ(x, t) = 0, has a desired behavior such as stability or tracking. Although general nonlinear time-varying surfaces as in Equation 57.145 are possible, linear ones are more prevalent in design [2], [8], [11], [13], [14]. Linear surface design is taken up in Section 57.5.4.
Control Structure    After proper design of the surface, a switched controller, u(x, t) = [u1(x, t), ..., um(x, t)]^T, is constructed of the form,

u_i(x, t) = u_i^+(x, t),  when σ_i(x, t) > 0,
u_i(x, t) = u_i^−(x, t),  when σ_i(x, t) < 0.
   (57.147)
Equation 57.147 indicates that the control changes value depending on the sign of the switching surface at x and t. Thus σ(x, t) = 0 is called a switching surface. The control is undefined on the switching surface. Off the switching surface, the control values u_i^± are chosen so that the tangent vectors of the state trajectory point toward the surface such that the state is driven to and maintained on σ(x, t) = 0. Such controllers result in discontinuous closed-loop systems.
Sliding Modes    The control u(x, t) is designed so that the system state trajectory is attracted to the switching surface and, once having intercepted it, remains on the switching surface for all subsequent time. The state trajectory can be viewed as sliding along the switching surface, and thus the system is in a sliding mode.
Figure 57.9 A situation in which a sliding mode exists on the intersection of the two indicated surfaces for t ≥ t1.
An ideal sliding mode exists only when the state trajectory x(t) of the controlled plant satisfies σ(x(t), t) = 0 at every t ≥ t1 for some t1. This may require infinitely fast switching. In real systems, a switched controller has imperfections, such as delay, hysteresis, etc., which limit switching to a finite frequency. The representative point then oscillates within a neighborhood of the switching surface. This oscillation, called chattering (discussed in a later section), is also illustrated in Figure 57.9. If the frequency of the switching is very high relative to the dynamic response of the system, the imperfections and the finite switching frequencies are often, but not always, negligible. The subsequent development focuses primarily on ideal sliding modes.
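The chattering phenomenon is easy to exhibit numerically. The sketch below contrasts an ideal switched control with a continuous boundary-layer approximation u = −2·sat(x/Φ), a standard chattering remedy; the gains, layer width Φ = 0.05, and disturbance are illustrative choices.

```python
import math

# Chattering comparison for x' = u + v(t), |v| <= 1:
#   switched control   u = -2*sgn(x)
#   boundary layer     u = -2*sat(x/0.05)
# We count control sign reversals after the reaching phase (t > 1).

def sat(s):
    return max(-1.0, min(1.0, s))

def run(use_sat, T=5.0, dt=1e-3):
    x, t = 0.5, 0.0
    u_prev, switches = 0.0, 0
    while t < T:
        v = math.sin(2.0 * math.pi * t)
        s = x / 0.05 if use_sat else (1.0 if x > 0 else -1.0)
        u = -2.0 * (sat(s) if use_sat else s)
        if t > 1.0 and u * u_prev < 0.0:
            switches += 1               # control sign reversal
        u_prev = u
        x += dt * (u + v)
        t += dt
    return switches

sw_sgn = run(False)
sw_sat = run(True)
```

The switched law reverses sign at nearly every integration step once the surface is reached, while the boundary-layer law varies smoothly, trading the ideal sliding motion x ≡ 0 for motion within a thin layer around the surface.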
Conditions for the Existence of a Sliding Mode    The existence of a sliding mode [2], [13], [14] requires stability of the state trajectory to the switching surface σ(x, t) = 0, i.e., after some finite time t1, the system representative point, x(t), must be in some suitable neighborhood, {x : ||σ(x, t)|| < ε}, of S for suitable ε > 0. A domain, D, of dimension n − m in the manifold, S, is a sliding-mode domain if, for each ε > 0, there is a δ > 0, so that any motion starting within an n-dimensional δ-vicinity of D may leave the n-dimensional ε-vicinity of D only through the n-dimensional ε-vicinity of the boundary of D, as illustrated in Figure 57.10. The region of attraction is the largest subset of the state space from which sliding is achievable. A sliding mode is globally reachable if the domain of attraction is the entire state space. The second method of Lyapunov provides the natural setting for a controller design leading to a sliding mode. In this effort one uses a generalized Lyapunov function, V(t, x, σ), that is positive definite with a negative time derivative in the region of attraction.

Figure 57.10 Two-dimensional illustration of the domain of a sliding mode.

THEOREM 57.5 [13, p. 83]: For the (n − m)-dimensional domain D to be the domain of a sliding mode, it is sufficient that in some n-dimensional domain Ω ⊃ D, a function V(t, x, σ) exists, continuously differentiable with respect to all of its arguments and satisfying the following conditions:
(1) V(t, x, σ) is positive definite with respect to σ, i.e., for arbitrary t and x, V(t, x, σ) > 0 when σ ≠ 0 and V(t, x, 0) = 0; on the sphere ||σ|| = ρ > 0, for all x ∈ Ω and any t, the relationships

inf over ||σ|| = ρ of V(t, x, σ) = h_ρ,   h_ρ > 0,
sup over ||σ|| = ρ of V(t, x, σ) = H_ρ,   H_ρ > 0,

hold, where h_ρ and H_ρ depend only on ρ with h_ρ ≠ 0 if ρ ≠ 0.
(2) The total time derivative of V(t, x, σ) on the trajectories of the system of Equation 57.143 has a negative supremum for all x ∈ Ω except for x on the switching surface where the control inputs are undefined and the derivative of V(t, x, σ) does not exist.

In summary, two phases underlie VSC design. The first phase is to construct a switching surface σ(x, t) = 0 so that the system restricted to the surface has a desired global behavior, such as stability, tracking, regulation, etc. The second phase is to design a (switched) controller u(x, t) so that away from the surface σ(x, t) = 0, the tangent vectors of the state trajectories point toward the surface, i.e., so that there is stability to the switching surface. This second phase is achieved by defining an appropriate Lyapunov function V(t, x, σ), differentiating this function so that the control u(x, t) becomes explicit, and adjusting controller gains so that the derivative is negative definite. The choice of V(t, x, σ) determines the allowable controller structures. Conversely, a workable control structure has a set of possible Lyapunov functions to verify its viability. A later section will take up the discussion of general control structures.

An Illustrative Example    To conclude this section we present an illustrative example for a single pendulum system,

ẋ1 = x2,
ẋ2 = −a sin(x1) + u(x),

with a standard feedback control structure, u(x) = k1(x)x1 + k2(x)x2, having nonlinear feedback gains switched according to the rule

k_i(x) = α_i(x),  when σ(x)x_i > 0,
k_i(x) = β_i(x),  when σ(x)x_i < 0,   i = 1, 2,

that depends on the linear switching surface σ(x) = [s1 s2]x. For such single-input systems it is ordinarily convenient to choose a Lyapunov function of the form V(t, x, σ) = 0.5σ²(x). To determine the gains necessary to drive the system state to the surface σ(x) = 0, they may be chosen so that
(1/2) (dσ²/dt)(x) = σ(x)(dσ/dt)(x) = σ(x)[s1 s2]ẋ = σ(x)[(s1 + s2k2(x))x2 + s2(k1(x) − a sin(x1)/x1)x1] < 0.

This is satisfied whenever

α1(x) = a sin(x1)/x1 − 1,   β1(x) = a sin(x1)/x1 + 1,
α2 < −(s1/s2),   β2 > −(s1/s2).
a2 < - ( s l / s 2 ) , and S2 -(sl/s2). Thus, for properly chosen sl and s2, the controller gains are readily computed. This example proposed no methodology for choosing sl and s2, i.e., for designing the switching surface. Section 57.5.4 takes up this topic. Further this example was only single input. For the multi-input case, ease of computation of the control gains depends on a properly chosen Lyapunov function. For most cases, a quadratic Lyapunov function is adequate. This topic is taken up in Section 57.5.5.
57.5.3 Existence and Uniqueness of Solutions to VSC Systems    VSC produces system dynamics with discontinuous right-hand sides due to the switching action of the controller. Thus they fail to satisfy conventional existence and uniqueness results of differential equation theory. Nevertheless, an important aspect of VSC is the presumption that the plant behaves uniquely when restricted to σ(x, t) = 0. One of the earliest and conceptually straightforward approaches addressing existence and uniqueness
is the method of Filippov [7]. The following briefly reviews this method in the two-dimensional, single-input case. From Equation 57.143, ẋ(t) = F(x, t, u) and the control u(x, t) satisfies Equation 57.147. Filippov's work shows that the state trajectories of Equation 57.143 with control Equation 57.147 on the switching manifold Equation 57.145 solve the equation
ẋ(t) = F⁰ = αF⁺ + (1 − α)F⁻,   0 ≤ α ≤ 1.   (57.148)

This is illustrated in Figure 57.11, where F⁺ = F(x, t, u⁺), F⁻ = F(x, t, u⁻), and F⁰ is the resulting velocity vector of the state trajectory in a sliding mode.
Equivalent Control    Equivalent control constitutes an equivalent input which, when exciting the system, produces the motion of the system on the sliding surface whenever the initial state is on the surface. Suppose at t1 the plant's state trajectory intercepts the switching surface, and a sliding mode exists. The existence of a sliding mode implies that, for all t ≥ t1, σ(x(t), t) = 0, and hence σ̇(x(t), t) = 0. Using the chain rule, we define the equivalent control u_eq for systems of the form of Equation 57.143 as the input satisfying

∂σ/∂t + (∂σ/∂x)[f(x, t) + B(x, t)u_eq] = 0.
Assuming that the matrix product (∂σ/∂x)B(x, t) is nonsingular for all t and x, one can compute u_eq as

u_eq = −[(∂σ/∂x)B(x, t)]⁻¹ [∂σ/∂t + (∂σ/∂x)f(x, t)].   (57.149)
Therefore, given that σ(x(t1), t1) = 0, then, for all t ≥ t1, the dynamics of the system on the switching surface will satisfy
Figure 57.11 Illustration of the Filippov method of determining the desired velocity vector F⁰ for the motion of the state trajectory on the sliding surface as per Equation 57.148.
The problem is to determine α, which follows from solving the equation ⟨grad(σ), F⁰⟩ = 0, i.e.,

α = ⟨grad(σ), F⁻⟩ / ⟨grad(σ), F⁻ − F⁺⟩,

provided (1) ⟨grad(σ), (F⁻ − F⁺)⟩ ≠ 0, (2) ⟨grad(σ), F⁺⟩ ≤ 0, and (3) ⟨grad(σ), F⁻⟩ ≥ 0.
ẋ(t) = [I − B(x, t)[(∂σ/∂x)B(x, t)]⁻¹(∂σ/∂x)] f(x, t) − B(x, t)[(∂σ/∂x)B(x, t)]⁻¹(∂σ/∂t).   (57.150)
This equation represents the equivalent system dynamics on the sliding surface. The driving term is present when some form of tracking or regulation is required of the controlled system, e.g., when σ(x, t) = Sx + r(t) = 0 with r(t) serving as a "reference" signal [11]. The (n − m)-dimensional switching surface, σ(x, t) = 0, imposes m constraints on the plant dynamics in a sliding mode. Hence, m of the state variables can be eliminated, resulting in an equivalent reduced-order system whose dynamics represent the motion of the state trajectory in a sliding mode. Unfortunately, the structure of Equation 57.150 does not allow conveniently exploiting this fact in switching-surface design. To set forth a clearer switching-surface design algorithm, we first convert the plant dynamics to the so-called regular form.
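The equivalent-control construction can be sketched for a concrete linear plant. In the example below the matrices A, B, and the surface row S are illustrative (not taken from the text); u_eq = −(SB)⁻¹SAx, and the motion on the surface obeys the reduced-order dynamics.

```python
import math

# Equivalent control for x' = A*x + B*u with sliding surface sigma = S*x.
# Starting on the surface, applying u_eq keeps S*x identically zero, and the
# motion reduces to the lower-order sliding dynamics (here x1' = -2*x1,
# since sigma = 2*x1 + x2 = 0 implies x2 = -2*x1).

A = [[0.0, 1.0], [-2.0, -3.0]]
B = [0.0, 1.0]                    # single input, so S*B is a scalar
S = [2.0, 1.0]

def u_eq(x):
    SB = S[0] * B[0] + S[1] * B[1]
    SAx = sum(S[i] * (A[i][0] * x[0] + A[i][1] * x[1]) for i in range(2))
    return -SAx / SB

def simulate(x0, T=1.0, dt=1e-4):
    x, t = list(x0), 0.0
    while t < T:
        u = u_eq(x)
        dx = [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]
        x = [x[i] + dt * dx[i] for i in range(2)]
        t += dt
    return x

x0 = [1.0, -2.0]                  # chosen on the surface: S*x0 = 0
xf = simulate(x0)
sigma_f = S[0] * xf[0] + S[1] * xf[1]
```

Numerically, σ = Sx remains at zero to machine precision, and x1 follows the reduced first-order dynamics exp(−2t), previewing the regular-form design discussed next.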
57.5.4 Switching-Surface Design
Switching-surface design is predicated upon knowledge of the system behavior in a sliding mode. This behavior depends on the parameters of the switching surface. Nonlinear switching surfaces are nontrivial to design. In the linear case, the switching-surface design problem can be converted into an equivalent state feedback design problem. In any case, achieving a switching-surface design requires analytically specifying the motion of the state trajectory in a sliding mode. The so-called method of equivalent control is essential to this specification.

Regular Form of the Plant Dynamics
The regular form of the dynamics of Equation 57.143 is
where z₁ ∈ R^(n−m) and z₂ ∈ R^m. This form can often be constructed through a linear state transformation, z(t) = Tx(t), where T has the property
57.5. VARIABLE STRUCTURE, SLIDING-MODE CONTROLLER DESIGN

and b₂(z, t) is an (m × m) nonsingular mapping for all t and z. In general, computing the regular form requires the nonlinear transformation,
where (i) T(x, t) is a diffeomorphic transformation, i.e., a continuously differentiable inverse mapping T̄(z, t) = x exists, satisfying T̄(0, t) = 0 for all t, (ii) T₁(·,·): Rⁿ × R → R^(n−m) and T₂(·,·): Rⁿ × R → R^m, and (iii) T(x, t) has the property that
This partial differential equation has a solution only if the so-called Frobenius condition is satisfied [9]. The resulting nonlinear regular form of the plant dynamics has the structure,
ż₁ = (∂T₁/∂x) f(T̄(z, t), t) + ∂T₁/∂t ≡ f₁(z, t),
ż₂ = (∂T₂/∂x) f(T̄(z, t), t) + ∂T₂/∂t + (∂T₂/∂x) B(T̄(z, t), t) u ≡ f₂(z, t) + b₂(z, t) u
Sometimes all nonlinearities in the plant model can be moved to f₂(z, t) so that
which solves the sliding-surface design problem with linear techniques (to be shown). If the original system model is linear, the regular form is given by
Substituting the nonlinear regular form of Equation 57.152 yields

ż₁ = f₁(z₁, z₂, t) = f₁(z₁, −S₂⁻¹S₁z₁ − S₂⁻¹r(t), t).
The goal then is to choose S₁ and S₂ to achieve a desired behavior of this nonlinear system. If this system is linear, i.e., if Equation 57.153 is satisfied, then, using Equation 57.155, the reduced-order dynamics are
Analysis of the State Feedback Structure of Reduced-Order Linear Dynamics
The equivalent reduced-order dynamics of Equation 57.156 exhibit a state feedback structure in which F = S₂⁻¹S₁ is a state feedback map and A₁₂ represents an "input" matrix. Under the condition that the original (linear) system is controllable, the following well-known theorem applies [16]:

THEOREM 57.6  If the linear regular form of the state model (Equation 57.154) is controllable, then the pair (A₁₁, A₁₂) of the reduced-order equivalent system of Equation 57.156 is controllable.
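As an illustration of how the theorem is used in practice, the sketch below places the poles of A₁₁ − A₁₂F at prescribed locations with SciPy's `place_poles` (an open-source analogue of the Matlab `place` command used in the example that follows). The matrices here are a hypothetical chain of integrators, assumed only for illustration; they are not the book's example system.

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical reduced-order pair (A11, A12): a chain of integrators
A11 = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, 0.0]])
A12 = np.array([[0.0],
                [0.0],
                [1.0]])

# Place the poles of A11 - A12 @ F at {-1, -2, -3}
F = place_poles(A11, A12, [-1.0, -2.0, -3.0]).gain_matrix

# With S2 = I, the switching-surface matrix is S = [F  I]
S = np.hstack([F, np.eye(1)])
eigs = np.linalg.eigvals(A11 - A12 @ F)
```

Since the reduced-order pair is single-input, the gain F is unique; the sliding-mode dynamics then have exactly the prescribed eigenvalues.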
This theorem leads to a wealth of switching-surface design mechanisms. First, it permits setting the poles of A₁₁ − A₁₂S₂⁻¹S₁ for stabilizing the state trajectory to zero when r(t) = 0, or to a prescribed rate of tracking otherwise. Alternatively, one can determine S₁ and S₂ to solve the LQR problem when r(t) = 0. As an example, suppose a system has the regular form of Equation 57.154 except that A₂₁ and A₂₂ are time varying and nonlinear,
where z₁ ∈ R^(n−m) and z₂ ∈ R^m are as above.
Equivalent System Dynamics via Regular Form
The regular form of the equivalent state dynamics is convenient for analysis and switching-surface design. To simplify the development we make three assumptions: 1. The sliding surface is given in terms of the states of the regular form. 2. The surface has the linear time-varying structure,
where the matrix S₂ is chosen to be nonsingular. 3. The system is in a sliding mode, i.e., for some t₁, σ(x(t), t) = 0 for all t ≥ t₁. With these three assumptions, one can solve for z₂(t) as
where a_ij = a_ij(t, x) and a_ij^min ≤ a_ij(t, x) ≤ a_ij^max. Let the switching surface be given by
The pertinent matrices of the reduced-order equivalent system are A₁₁ and A₁₂.
THE CONTROL HANDBOOK

To stabilize the system, suppose the goal is to find F so that the equivalent system has eigenvalues at {−1, −2, −3}. Using Matlab's Control System Toolbox yields,
Substituting Equation 57.159 in Equation 57.158 leads to a Lyapunov-like theorem.

THEOREM 57.7  A sufficient condition for the equilibrium manifold (Equation 57.145) to be globally attractive is that the control u(x, t) be chosen so that
Choosing S₂ = I leaves S₁ = F. This then specifies the switching-surface matrix S = [F I]. Alternatively, suppose the objective is to find the control that minimizes the performance index
for σ ≠ 0, i.e., V̇(t, x, σ) is negative definite, where the lower limit of integration refers to the initiation of sliding. This is associated with the equivalent reduced-order system ż₁ = A₁₁z₁ + A₁₂z₂, where
Observe that Equation 57.160 is linear in the control. Virtually all control structures for VSC are chosen so that this inequality is satisfied for appropriate W. Some control laws utilize an x- and t-dependent W, requiring that the derivation above be generalized.
Various Control Structures
z₂ = −S₂⁻¹S₁z₁ = −Fz₁.
To make the needed control structures more transparent, recall the equivalent control of Equation 57.149,
Suppose weighting matrices are taken as
computed assuming that the matrix product (∂σ/∂x) B(x, t) is nonsingular for all t and x. We can now decompose the general control structure as follows. Using Matlab's Control System Toolbox, the optimal feedback is
Again, choosing S₂ = I, the switching-surface matrix is given by S = [F I]. Here, the poles of the system in sliding are {−1.7742, −0.7034 ± j0.2623}.
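The LQR-based surface design can be sketched in the same spirit: treat z₂ as the "control" of the reduced-order system, solve the continuous algebraic Riccati equation, and form S = [F I]. The matrices and weights below are assumptions for illustration (the same hypothetical integrator chain as before), not the book's example or its numerical results.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical reduced-order pair (A11, A12)
A11 = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, 0.0]])
A12 = np.array([[0.0], [0.0], [1.0]])

Q = np.eye(3)   # assumed weight on z1
R = np.eye(1)   # assumed weight on z2 (the "virtual control")

# Riccati solution and the optimal "feedback" z2 = -F z1
P = solve_continuous_are(A11, A12, Q, R)
F = np.linalg.solve(R, A12.T @ P)

S = np.hstack([F, np.eye(1)])               # surface with S2 = I
poles = np.linalg.eigvals(A11 - A12 @ F)    # sliding-mode poles
```

Because the LQR solution stabilizes the reduced-order pair, the resulting sliding-mode poles are guaranteed to lie in the open left half-plane.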
where ū_N(x, t) is as yet an unspecified substructure. Substituting in Equation 57.160 produces the following sufficient condition for stability to the switching surface: Choose ū_N(x, t) so that
Because (∂σ/∂x) B(x, t) is assumed nonsingular for all t and x, it is convenient to set
57.5.5 Controller Design

Stability to Equilibrium Manifold
As mentioned, in VSC a Lyapunov approach is used for deriving conditions on the control u(x, t) that will drive the state trajectory to the equilibrium manifold. Ordinarily, it is sufficient to consider only quadratic Lyapunov functions of the form,

V(t, x, σ) = σᵀ(x, t) W σ(x, t),   (57.157)
where W is a symmetric positive definite matrix. The control u(x, t) must be chosen so that the time derivative of V(t, x, σ) is negative definite for σ ≠ 0. To this end, consider

V̇(t, x, σ) = σ̇ᵀWσ + σᵀWσ̇ = 2σᵀWσ̇,   (57.158)

where we have suppressed specific x and t dependencies. Recalling Equation 57.143, it follows that

Often a switching surface σ(x, t) can be designed to achieve a desired system behavior in sliding and, at the same time, to satisfy the constraint B̂ = I, in which case ū_N = u_N. Without losing generality, we make one last simplifying assumption, W = I, because W > 0, W is nonsingular, and W can be compensated for in the control structure. Hence, stability to the surface reduces to finding ū_N(x, t) such that
These simplifications allow us to specify five common controller structures: 1. Relays with constant gains: ū_N(x, t) is chosen so that
ū_N = α sgn(σ(x, t))
with α = [α_ij] an m × m matrix, and sgn(σ(x, t)) is defined componentwise. Stability to the surface is achieved if α = [α_ij] is chosen diagonally dominant with negative diagonal entries [13]. Alternatively, if α is chosen to be diagonal with negative diagonal entries, then the control can be represented as
57.5.6 Design Examples This section presents two design examples illustrating typical VSC strategies.
DESIGN EXAMPLE 1: In this example, we illustrate a constant-gain relay control with nonlinear sliding-surface design for a single-link robotic manipulator driven by a dc armature-controlled motor modeled by the normalized (i.e., scaled) simplified equations,
ū_Ni = α_ii sgn(σ_i(x, t)) and, for σ_i ≠ 0,
which guarantees stability to the surface given the Lyapunov function, V(t, x, σ) = σᵀ(x, t)σ(x, t). 2. Relays with state-dependent gains: Each entry of ū_N(x, t) is chosen so that
ū_Ni = α_ii(x, t) sgn(σ_i(x, t)),   α_ii(x, t) < 0.
The condition for stability to the surface is that 2α_ii(x, t) σ_i sgn(σ_i) = 2α_ii(x, t)|σ_i| < 0 for σ_i ≠ 0.
in the regular form. To determine the structure of an appropriate sliding surface, recall the assumption that (∂σ/∂x) B(x, t) is nonsingular. Because B = [0 0 1]ᵀ, it follows that ∂σ/∂x₃ must be nonzero. Without losing generality, we set ∂σ/∂x₃ = 1. Hence, it is sufficient to consider sliding surfaces of the form
An adequate choice for α_ii(x, t) is to choose β_i < 0, γ_i > 0, and k a natural number with
Our design presumes that the reduced-order dynamics have a second-order response represented by the reduced-order state dynamics. 3. Linear state feedback with switched gains: Here ū_N(x, t) is chosen so that
To guarantee stability, it is sufficient to choose α_ij and β_ij so that
This form allows us to specify the characteristic polynomial of the dynamics and thus the eigenvalues, i.e., π_Δ(λ) = λ² + a₂λ + a₁. Proper choice of a₁ and a₂ leads to proper rise time, settling time, overshoot, gain margin, etc. The switching-surface structure of Equation 57.165 implies that, in a sliding mode,
4. Linear continuous feedback: Choose
ū_N = −P σ(x, t),   P = Pᵀ > 0,
i.e., P ∈ R^(m×m) is a symmetric positive definite constant matrix. Stability is achieved because
Substituting in the given system model, the reduced-order dynamics become
σᵀσ̇ = −σᵀPσ < 0, where P is often chosen as a diagonal matrix with positive diagonal entries.
Hence the switching-surface design is completed by setting
5. Univector nonlinearity with scale factor: In this case, choose
To complete the controller design, we first compute the equivalent control. Stability to the surface is guaranteed because, for σ ≠ 0,
Of course, it is possible to make ρ time dependent, if necessary, for certain tracking problems. This concludes our discussion of control structures to achieve stability to the sliding surface.
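As a rough numerical sanity check of the first structure above (relays with constant gains), the sketch below simulates u = u_eq + α sgn(σ) for a hypothetical double-integrator plant with a scalar switching surface; both the plant and the surface are assumptions, not the chapter's examples. With α < 0 (scalar, so trivially "diagonally dominant with negative diagonal entries"), σ is driven into a small neighborhood of zero.

```python
import numpy as np

# Hypothetical plant x1' = x2, x2' = u, surface sigma = x1 + x2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
S = np.array([[1.0, 1.0]])
alpha = -2.0                       # negative relay gain
dt, steps = 1e-3, 4000

x = np.array([[2.0], [1.0]])       # start off the surface (sigma = 3)
for _ in range(steps):
    sigma = (S @ x)[0, 0]
    ueq = -np.linalg.solve(S @ B, S @ A @ x)   # equivalent control
    u = ueq + alpha * np.sign(sigma)           # relay term
    x = x + dt * (A @ x + B @ u)               # forward-Euler step

final_sigma = (S @ x)[0, 0]
```

Here σ̇ = α sgn(σ), so the surface is reached in finite time; the small residual oscillation in `final_sigma` is the discretization-induced chattering discussed in Section 57.5.7.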
For the constant gain relay control structure (Equation 57.161)
Stability to the switching surface results whenever α < 0, as
DESIGN EXAMPLE 2: Consider the fourth-order (linear) model of a mass-spring system, which could represent a simplified model of a flexible structure in space with two-dimensional control (Figure 57.12).
will illustrate two controller designs. The first is a hierarchical structure [2] so that, for σ ≠ 0,
with the signs of α₁, α₂ ≠ 0 to be determined. For stability to the surface, it is sufficient to have σ₁σ̇₁ < 0 and σ₂σ̇₂ < 0, as can be seen from Equation 57.158 with W = I. Observe that
and

σ̇₂ = ẋ₄ − ẍ_ref + c₂(ẋ₂ − ẋ_ref) = ẋ₄ − ẍ_ref + c₂(x₄ − ẋ_ref).
Substituting for the derivatives of x3 and x4 leads to
Figure 57.12  A mass-spring system for Design Example 2.
Here, x₁ is the position of m₁, x₂ the position of m₂, u₁ the force applied to m₁, and u₂ the force applied between m₁ and m₂. The differential equation model has the form,
where k is the spring constant. Given that x₃ = ẋ₁ and x₄ = ẋ₂, the resulting state model in regular form is
and

h₂ = (k/m₂)x₁ − (k/m₂)x₂ − ẍ_ref + c₂x₄ − c₂ẋ_ref.

Taking a brute-force approach to the computation of the control gains, stability to the switching surface is achieved provided
i.e., whenever

α₂ > m₂|h₂| (> 0),

and provided the analogous condition holds for α₁. There are two simultaneous control objectives: (1) stabilize oscillations, i.e., x₁ = x₂; (2) track a desired trajectory, x₂(t) = x_ref(t). These goals are achieved if the following relationships are maintained for c₁ and c₂ > 0. In a second controller design, we recall Equation 57.166. For σ₁ ≠ 0 and σ₂ ≠ 0, it is convenient to define the controller as
The first step is to determine the appropriate sliding surface. To achieve the first control objective, set σ₁ = (x₃ − x₄) + c₁(x₁ − x₂) = 0, and to achieve the desired tracking, set σ₂ = (x₄ − ẋ_ref) + c₂(x₂ − x_ref) = 0. (In the second controller design, β₁ and β₂ are to be determined.)
The next step is to design a VSC law to drive the state trajectory to the intersection of these switching surfaces. In this effort we
As in the first controller design, the state trajectory will intercept the sliding surface in finite time and sliding will occur for β₁ and β₂ sufficiently large and negative, thereby achieving the desired control objective.
57.5.7 Chattering
The VSC controllers developed earlier assure the desired behavior of the closed-loop system. These controllers, however, require an infinitely (in the ideal case) fast switching mechanism. The phenomenon of nonideal but fast switching is labeled chattering (the word stems from the noise generated by the switching element). The high-frequency components of the chattering are undesirable because they may excite unmodeled high-frequency plant dynamics, resulting in unforeseen instabilities. To reduce chatter, define a so-called boundary layer as
where u_eq(x, t) is given by Equation 57.149. Given the plant and disturbance model of Equation 57.170, then, per Equation 57.162, it is necessary to choose u_N(x, t) so that
Choosing any one of the control structures outlined in Section 57.5.5, a choice of sufficiently "high" gains will produce a negative definite V̇(t, x, σ). Alternatively, one can use a control structure [2],
whose thickness is 2ε. Now modify the control law of Equation 57.161 (suppressing t and x arguments) to
0, otherwise, where p(σ, x) is any continuous function satisfying p(0, x) = 0 and p(σ, x) = u_N(x) when ‖σ‖ = ε. This control guarantees that trajectories are attracted to the boundary layer. Inside the boundary layer, Equation 57.168 offers a continuous approximation to the usual discontinuous control action. Similar to Corless and Leitmann [1], asymptotic stability is not guaranteed, but ultimate boundedness of trajectories to within an ε-dependent neighborhood of the origin is assured.
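One common concrete choice for p(σ, x) inside the boundary layer is linear interpolation between the relay values, which yields a saturation-type law. The sketch below uses a scalar surface and an assumed relay gain for illustration; the specific gain and layer thickness are not from the text.

```python
import numpy as np

def relay(sigma, alpha=-2.0):
    """Ideal relay term (discontinuous at sigma = 0); gain is assumed."""
    return alpha * np.sign(sigma)

def boundary_layer(sigma, eps=0.1, alpha=-2.0):
    """Continuous approximation: linear for |sigma| < eps, so that
    p(0, x) = 0 and p matches the relay on the layer boundary."""
    if abs(sigma) >= eps:
        return relay(sigma, alpha)
    return alpha * sigma / eps      # one common choice of p(sigma, x)

vals = [boundary_layer(s) for s in (-1.0, -0.05, 0.0, 0.05, 1.0)]
```

The interpolation is continuous across the layer boundary, which is precisely what removes the infinitely fast switching responsible for chattering, at the price of only ultimate boundedness rather than asymptotic stability.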
where α(x, t) is to be determined. Assuming W = I, it follows that, for σ ≠ 0,

57.5.8 Robustness to Matched Disturbances and Parameter Variations
To explore the robustness of VSC to disturbances and parameter variations, one modifies Equation 57.143 to
Choosing α(x, t) = α > 0 leads to the stability of the state trajectory to the equilibrium manifold despite matched disturbances and parameter variations, demonstrating the robustness property of a VSC law.
where η(t) is a vector function representing parameter uncertainties, Δf and ΔB represent the cumulative effects of all plant uncertainties, and d(t) denotes an external (deterministic) disturbance. The first critical assumption in our development is that all uncertainties and external disturbances satisfy the so-called matching condition, i.e., Δf, ΔB, and d(t) lie in the image of B(x, t) for all x and t. As such, they can all be lumped into a single vector function ξ(x, t, η, d, u), so that

57.5.9 Observer Design
Recall the nonlinear model of Equation 57.143 whose linear form is given in Equation 57.144. Under the condition that the state is unavailable, we postulate a linear measurement equation, y =Cx,
(57.173)
where C ∈ R^(p×n). Our goal is to construct a dynamic system that estimates the state based on knowledge of the input and the measurements of Equation 57.173. In the linear case (Equations 57.144 and 57.173), this is referred to as the Luenberger observer,

dx̂/dt = Ax̂ + Bu + L(y − Cx̂),
The second critical assumption is that a positive continuous bounded function ρ(x, t) exists, satisfying
presupposing ( C , A ) is an observable pair. The context for a sliding-mode observer will be
To incorporate robustness into a VSC design, we utilize the control structure of Equation 57.161, u(x, t) = u_eq(x, t) + u_N(x, t),
where E is a user-chosen function to ensure convergence in the presence of uncertainties and/or nonlinear observations; L is
chosen so that A − LC is a stability matrix. Define the estimation error e(t) = x̂(t) − x(t), resulting in the error dynamics
We now apply steps 1 and 2 to Equation 57.179, using a second nonsingular state transformation on the partial state w₁,
where the term −Bξ represents (matched) lumped uncertainties, parameter variations, or disturbances that affect the plant. "Matched" refers to the assumption that all disturbances, etc., affect the plant dynamics through the image of B(x, t) at each x and t. See Section 57.5.8.
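Ignoring the matched lumped-uncertainty term, the linear part of the error dynamics is ė = (A − LC)e, and choosing L by duality-based pole placement makes A − LC a stability matrix. A minimal numerical sketch with a hypothetical observable pair (C, A), assumed only for illustration:

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical observable pair (C, A)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])

# Observer gain by duality: place eigenvalues of A.T - C.T @ L.T
L = place_poles(A.T, C.T, [-5.0, -6.0]).gain_matrix.T

# Error dynamics e' = (A - L C) e (uncertainty term omitted)
Acl = A - L @ C
e = np.array([[1.0], [-1.0]])
dt = 1e-3
for _ in range(10000):
    e = e + dt * (Acl @ e)      # forward-Euler integration over 10 s

err_norm = float(np.linalg.norm(e))
```

With the observer poles at {−5, −6}, the estimation error decays essentially to zero over the simulated interval, consistent with lim e(t) = 0 in the disturbance-free case.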
to obtain
OBSERVER DESIGN 1 [15]: A particular implementation would
Assuming that e₂ = ŷ₂ − y₂ can be measured using Equation 57.178, one forms the estimator for the new partial state,
having error dynamics satisfying η > ‖ξ(t)‖, where η is a design constant and F ∈ R^(m×p), p > m, so that
where P = Pᵀ > 0 satisfies

(A − LC)ᵀP + P(A − LC) = −Q

for an appropriate Q = Qᵀ > 0, if it exists. It follows that V̇ < 0 along the error dynamics, which implies lim_{t→∞} e(t) = 0. For further analysis see [5], [8]. The process is repeated until the estimator encompasses all state variables. Extensions to nonlinear plants can be found in [4].
OBSERVER DESIGN 2 [3]: Here we set L = 0 and E(y, x̂) = sgn(y − Cx̂). Further, we execute a nonsingular state transformation
57.5.10 Concluding Remarks
This chapter has summarized the salient results of variable structure control theory and illustrated the design procedures with various examples. A wealth of literature exists on the subject which cannot be included because of space limitations. In particular, the literature is replete with realistic applications [12], extensions to output feedback [6], extensions to decentralized control [10], as well as extensions to discrete-time systems. The reader is encouraged to search the literature for the many papers in this area.
to obtain the equivalent plant dynamics:

57.5.11 Defining Terms
We will build the observer sequentially as follows: 1. Set e₁(t) = ŷ(t) − y(t) = Cx̂(t) − y(t). This results in the error dynamics for the partial state "y" of the form
where e₂(t) = ŵ(t) − w(t). By choosing an appropriate nonsingular gain matrix G₁, we force Equation 57.177 into a sliding regime along the manifold {e₁(t) = 0}. 2. The "equivalent control" of Equation 57.177 (i.e., along the manifold e₁(t) = 0) would then take the form
3. Now consider the second subsystem of Equation 57.176
Chattering: The phenomenon of nonideal but fast switching. The term stems from the noise generated by a switching element.
Equilibrium (discontinuity) manifold: A specified, user-chosen manifold in the state space to which a system's trajectory is driven and maintained for all time subsequent to intersection of the manifold by a discontinuous control that is a function of the system's states; hence, discontinuity manifold. Other terms commonly used are sliding surface and switching surface.
Equivalent control: The solution to the algebraic equation involving the derivative of the equation of the switching surface and the plant's dynamic model. The equivalent control is used to determine the system's dynamics on the sliding surface.
Equivalent system dynamics: The system dynamics obtained after substituting the equivalent control into the plant's dynamic model. It characterizes state motion parallel to the sliding surface if the system's initial state is off the surface, and state motion on the sliding surface if the initial state is on the surface.
Ideal sliding mode: Motion of a system's state trajectory along a switching surface when switching in the control law is infinitely fast.
Matching condition: The condition requiring the plant's uncertainties to lie in the image of the input matrix; that is, the uncertainties can affect the plant dynamics only through the same channels as the plant's input.
Region of attraction: A set of initial states in the state space from which sliding is achievable.
Regular form: A particular form of the state-space description of a dynamic system obtained by a suitable transformation of the system's state variables.
Sliding surface: See equilibrium manifold.
Switching surface: See equilibrium manifold.
24(2), 187–193, 1988.
[12] Sira-Ramírez, H., Nonlinear P-I Controller Design for Switchmode dc-to-dc Power Converters, IEEE Trans. on Circuits & Systems, 38(4), 410–417, 1991.
[13] Utkin, V.I., Sliding Modes and Their Application in Variable Structure Control, Mir, Moscow, 1978.
[14] Utkin, V.I., Sliding Modes in Control and Optimization, Springer, Berlin, 1992.
[15] Walcott, B.L. and Żak, S.H., State Observation of Nonlinear/Uncertain Dynamical Systems, IEEE Trans. on Automat. Control, AC-32(2), 166–170, 1987.
[16] Young, K.-K.D., Kokotović, P.V., and Utkin, V.I., A Singular Perturbation Analysis of High-Gain Feedback Systems, IEEE Trans. on Automat. Control, AC-22(6), 931–938, 1977.
References
[1] Corless, M.J. and Leitmann, G., Continuous State Feedback Guaranteeing Uniform Ultimate Boundedness for Uncertain Dynamic Systems, IEEE Trans. on Automat. Control, AC-26(5), 1139–1144, 1981.
[2] DeCarlo, R.A., Żak, S.H., and Matthews, G.P., Variable Structure Control of Nonlinear Multivariable Systems: A Tutorial, Proc. IEEE, 76(3), 212–232, 1988.
[3] Drakunov, S.V., Izosimov, D.B., Lukyanov, A.G., Utkin, V.A., and Utkin, V.I., The Block Control Principle, Part II, Automation and Remote Control, 51(6), 737–746, 1990.
[4] Drakunov, S.V., Sliding-Mode Observers Based on Equivalent Control Method, Proc. 31st IEEE Conf. Decision and Control, Tucson, Arizona, 2368–2369, 1992.
[5] Edwards, C. and Spurgeon, S.K., On the Development of Discontinuous Observers, Int. J. Control, 59(5), 1211–1229, 1994.
[6] El-Khazali, R. and DeCarlo, R.A., Output Feedback Variable Structure Control Design, Automatica, 31(5), 805–816, 1995.
[7] Filippov, A.F., Differential Equations with Discontinuous Righthand Sides, Kluwer Academic, Dordrecht, The Netherlands, 1988.
[8] Hui, S. and Żak, S.H., Robust Control Synthesis for Uncertain/Nonlinear Dynamical Systems, Automatica, 28(2), 289–298, 1992.
[9] Hunt, L.R., Su, R., and Meyer, G., Global Transformations of Nonlinear Systems, IEEE Trans. on Automat. Control, AC-28(1), 24–41, 1983.
[10] Matthews, G. and DeCarlo, R.A., Decentralized Variable Structure Control of Interconnected Multi-Input/Multi-Output Nonlinear Systems, Circuits, Systems & Signal Processing, 6(3), 363–387, 1987.
[11] Matthews, G.P. and DeCarlo, R.A., Decentralized Tracking for a Class of Interconnected Nonlinear Systems Using Variable Structure Control, Automatica,

57.6 Control of Bifurcations and Chaos
Eyad H. Abed, Department of Electrical Engineering and the Institute for Systems Research, University of Maryland, College Park, MD
Hua O. Wang, United Technologies Research Center, East Hartford, CT
Alberto Tesi, Dipartimento di Sistemi e Informatica, Università di Firenze, Firenze, Italy

57.6.1 Introduction
This chapter deals with the control of bifurcations and chaos in nonlinear dynamical systems. This is a young subject area that is currently in a state of active development. Investigations of control system issues related to bifurcations and chaos began relatively recently, with most currently available results having been published within the past decade. Given this state of affairs, a unifying and comprehensive picture of control of bifurcations and chaos does not yet exist. Therefore, the chapter has a modest but, it is hoped, useful goal: to summarize some of the motivation, techniques, and results achieved to date on control of bifurcations and chaos. Background material on nonlinear dynamical behavior is also given, to make the chapter somewhat self-contained. However, interested readers unfamiliar with nonlinear dynamics will find it helpful to consult nonlinear dynamics texts to reinforce their understanding. Despite its youth, the literature on control of bifurcations and chaos contains a large variety of approaches as well as interesting applications. Only a small number of approaches and applications can be touched upon here, and these reflect the background and interests of the authors. The Further Reading section provides references for those who would like to learn about alternative approaches or to learn more about the approaches discussed here. Control system design is an enabling technology for a variety of application problems in which nonlinear dynamical behavior arises. The ability to manage this behavior can result in significant practical benefits. This might entail facilitating system operability in regimes where linear control methods break down; taking advantage of chaotic behavior to capture a desired oscillatory behavior without expending much control energy; or purposely introducing a chaotic signal in a communication system to mask a transmitted signal from an adversary while allowing perfect reception by the intended party. The control problems addressed in this chapter are characterized by two main features:
1. nonlinear dynamical phenomena impact system behavior
2. control objectives can be met by altering nonlinear phenomena
Nonlinear dynamics concepts are clearly important in understanding the behavior of such systems (with or without control). Traditional linear control methods are, however, often effective in the design of control strategies for these systems. In other cases, such as for systems of the type discussed in Section 57.6.5, nonlinear methods are needed both for control design and performance assessment. The chapter proceeds as follows. In section two, some basic nonlinear system terminology is recalled. This section also includes a new term, namely "candidate operating condition", which facilitates subsequent discussions on control goals and strategies. Section three contains a brief summary of basic bifurcation and chaos concepts that will be needed. Section four provides application examples for which bifurcations and/or chaotic behavior occur. Remarks on the control aspects of these applications are also given. Section five is largely a review of some basic concepts related to control of parameterized families of (nonlinear) systems. Section five also includes a discussion of what might be called "stressed operation" of a system. Section six is devoted to control problems for systems exhibiting bifurcation behavior. The subject of section seven is control of chaos. Conclusions are given in section eight. The final section gives some suggestions for further reading.
57.6.2 Operating Conditions of Nonlinear Systems In linear system analysis and control, a blanket assumption is made that the operating condition of interest is a particular equilibrium point, which is then taken as the origin in the state space. The topic addressed here relates to applying control to alter the dynamical behavior of a system possessing multiple possible operating conditions. The control might alter these conditions in terms of their location and amplitude, and/or stabilize certain possible operating conditions, permitting them to take the place of an undesirable open-loop behavior. For the purposes of this chapter, therefore, it is important to consider a variety of possible operating conditions, in addition to the single equilibrium point focused on in linear system theory. In this section, some basic terminology regarding operating conditions for nonlinear systems is established.
Consider a finite-dimensional continuous-time system
where x ∈ Rⁿ is the system state and F is smooth in x. (The terminology recalled next extends straightforwardly to discrete-time systems x_{k+1} = F(x_k).) An equilibrium point or fixed point of the system (57.180) is a constant steady-state solution, i.e., a vector x⁰ for which F(x⁰) = 0. A periodic solution of the system is a trajectory x(t) for which there is a minimum T > 0 such that x(t + T) = x(t) for all t. An invariant set is a set for which any trajectory starting from an initial condition within the set remains in the set for all future and past times. An isolated invariant set is a bounded invariant set a neighborhood of which contains no other invariant set. Equilibrium points and periodic solutions are examples of invariant sets. A periodic solution is called a limit cycle if it is isolated. An attractor is a bounded invariant set to which trajectories starting from all sufficiently nearby initial conditions converge as t → ∞. The positive limit set of (57.180) is the ensemble of points that some system trajectory either approaches as t → ∞ or makes passes nearer and nearer to as t → ∞. For example, if a system is such that all trajectories converge either to an equilibrium point or a limit cycle, then the positive limit set would be the union of points on the limit cycle and the equilibrium point. The negative limit set is the positive limit set of the system run with time t replaced by −t. Thus, the positive limit set is the set where the system ends up at t = +∞, while the negative limit set is the set where the system begins at t = −∞ [30]. The limit set of the system is the union of its positive and negative limit sets. It is now possible to introduce a term that will facilitate the discussions on control of bifurcations and chaos. A candidate operating condition of a dynamical system is an equilibrium point, periodic solution, or other invariant subset of its limit set.
Thus, a candidate operating condition is any possible steady-state solution of the system, without regard to its stability properties. This term, though not standard, is useful since it permits discussion of bifurcations, bifurcated solutions, and chaotic motion in general terms without having to specify a particular nonlinear phenomenon. The idea behind this term is that such a solution, if stable or stabilizable using available controls, might qualify as an operating condition of the system. Whether or not it would be acceptable as an actual operating condition would depend on the system and on characteristics of the candidate operating condition. As an extreme example, a steady spin of an airplane is a candidate operating condition, but certainly is not acceptable as an actual operating condition!
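The Van der Pol oscillator (a standard textbook system, not one discussed in this chapter) gives a compact numerical illustration of the term: it has two candidate operating conditions, an unstable equilibrium at the origin and a stable limit cycle, and only the limit cycle is an attractor.

```python
import numpy as np

# Van der Pol oscillator with mu = 1 (illustrative, not from the text):
# x1' = x2, x2' = mu*(1 - x1^2)*x2 - x1.
mu = 1.0

def f(x):
    """Right-hand side of the Van der Pol equations."""
    return np.array([x[1], mu * (1.0 - x[0] ** 2) * x[1] - x[0]])

# Starting near the (unstable) origin, a forward-Euler trajectory is
# repelled from the equilibrium and settles onto the limit cycle.
x = np.array([0.1, 0.0])
dt = 1e-3
for _ in range(60000):
    x = x + dt * f(x)

final_amp = float(np.hypot(x[0], x[1]))   # distance from the origin
```

Both invariant sets are candidate operating conditions in the sense defined above; which (if either) is acceptable as an actual operating condition depends entirely on the application.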
57.6.3 Bifurcations and Chaos This section summarizes background material on bifurcations and chaos that is needed in the remainder of the chapter.
Bifurcations
A bifurcation is a change in the number of candidate operating conditions of a nonlinear system that occurs as a parameter
is quasistatically varied. The parameter being varied is referred to as the bifurcation parameter. A value of the bifurcation parameter at which a bifurcation occurs is called a critical value of the bifurcation parameter. Bifurcations from a nominal operating condition can only occur at parameter values for which the condition (say, an equilibrium point or limit cycle) either loses stability or ceases to exist. To fix ideas, consider a general one-parameter family of ordinary differential equation systems
where x ∈ Rⁿ is the system state, μ ∈ R denotes the bifurcation parameter, and F is smooth in x and μ. Equation 57.181 can be viewed as resulting from a particular choice of control law in a family of nonlinear control systems (in particular, as Equation 57.188 with the control u set to a fixed feedback function u(x, μ)). For any value of μ, the equilibrium points of Equation 57.181 are given by the solutions for x of the algebraic equations F_μ(x) = 0. Local bifurcations are those that occur in the vicinity of an equilibrium point. For example, a small-amplitude limit cycle can emerge (bifurcate) from an equilibrium as the bifurcation parameter is varied. The bifurcation is said to occur regardless of the stability or instability of the "bifurcated" limit cycle. In another local bifurcation, a pair of equilibrium points can emerge from a nominal equilibrium point. In either case, the bifurcated solutions are close to the original equilibrium point, hence the name local bifurcation. Global bifurcations are bifurcations that are not local, i.e., those that involve a domain in phase space. Thus, if a limit cycle loses stability releasing a new limit cycle, a global bifurcation is said to take place.³ The nominal operating condition of Equation 57.181 can be an equilibrium point or a limit cycle. In fact, depending on the coordinates used, it is possible that a limit cycle in one mathematical model corresponds to an equilibrium point in another. This is the case, for example, when a truncated Fourier series is used to approximate a limit cycle, and the amplitudes of the harmonic terms are used as state variables in the approximate model. The original limit cycle is then represented as an equilibrium in the amplitude coordinates. If the nominal operating condition of Equation 57.181 is an equilibrium point, then bifurcations from this condition can occur only when the linearized system loses stability.
Suppose, for example, that the origin is the nominal operating condition for some range of parameter values. That is, let F_μ(0) = 0 for all values of μ for which the nominal equilibrium exists. Denote the Jacobian matrix of (57.181) evaluated at the origin by

A(μ) := (∂F_μ/∂x)(0).
³This use of the terms local bifurcation and global bifurcation is common. However, in some books, a bifurcation from a limit cycle is also referred to as a local bifurcation.
Local bifurcations from the origin can only occur at parameter values μ where A(μ) loses stability. The scalar differential equation

ẋ = μx − x³   (57.182)
provides a simple example of a bifurcation. The origin x⁰ = 0 is an equilibrium point for all values of the real parameter μ. The Jacobian matrix is A(μ) = μ (a scalar). It is easy to see that the origin loses stability as μ increases through μ = 0. Indeed, a bifurcation from the origin takes place at μ = 0. For μ ≤ 0, the only equilibrium point of Equation 57.182 is the origin. For μ > 0, however, there are two additional equilibrium points at x = ±√μ. This pair of equilibrium points is said to bifurcate from the origin at the critical parameter value μ_c = 0. This is an example of a pitchfork bifurcation, which will be discussed later.

Subcritical vs. supercritical bifurcations

In a very real sense, the fact that bifurcations occur when stability is lost is helpful from the perspective of control system design. To explain this, suppose that a system operating condition (the "nominal" operating condition) is not stabilizable beyond a critical parameter value. Suppose a bifurcation occurs at the critical parameter value. That is, suppose a new candidate operating condition emerges from the nominal one at the critical parameter value. Then it may be that the new operating condition is stable and occurs beyond the critical parameter value, providing an alternative operating condition near the nominal one. This is referred to as a supercritical bifurcation. In contrast, it may happen that the new operating condition is unstable and occurs prior to the critical parameter value. In this situation (called a subcritical bifurcation), the system state must leave the vicinity of the nominal operating condition for parameter values beyond the critical value. However, feedback offers the possibility of rendering such a bifurcation supercritical. This is true even if the nominal operating condition is not stabilizable.
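The pitchfork example above can be checked numerically. The sketch below assumes the reconstructed right-hand side ẋ = μx − x³ of Equation 57.182; it finds the equilibria for a given μ and classifies their stability from the sign of the linearization (the function names are ours, for illustration only):

```python
import math

def f(x, mu):
    # Right-hand side of the scalar example (57.182): dx/dt = mu*x - x**3.
    return mu * x - x**3

def equilibria(mu):
    # Real roots of mu*x - x**3 = 0: the origin always, and +/- sqrt(mu)
    # when mu > 0 (the pair that bifurcates at mu = 0).
    eq = [0.0]
    if mu > 0:
        r = math.sqrt(mu)
        eq += [r, -r]
    return eq

def is_stable(x, mu, h=1e-6):
    # A scalar equilibrium is stable when f'(x) < 0 (central difference).
    return (f(x + h, mu) - f(x - h, mu)) / (2 * h) < 0

for mu in (-1.0, 1.0):
    for x in equilibria(mu):
        print(mu, x, "stable" if is_stable(x, mu) else "unstable")
```

For μ ≤ 0 only the stable origin is found; for μ > 0 the origin is unstable and the pair ±√μ is stable, which is the supercritical pitchfork described in the text.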
If such a feedback control can be found, then the system behavior beyond the stability boundary can remain close to its behavior at the nominal operating condition. The foregoing discussion of bifurcations and their implications for system behavior can be gainfully viewed using graphical sketches called bifurcation diagrams. These are depictions of the equilibrium points and limit cycles of a system plotted against the bifurcation parameter. A bifurcation diagram is a schematic representation in which only a measure of the amplitude (or norm) of an equilibrium point or limit cycle need be plotted. In the bifurcation diagrams given in this chapter, a solid line indicates a stable solution, while a dashed line indicates an unstable solution. Several bifurcation diagrams will now be used to further explain the meanings of supercritical and subcritical bifurcation, and to introduce some common bifurcations. It should be noted that not all bifurcations are supercritical or subcritical. For example, a bifurcation can also be transcritical. In such a bifurcation, bifurcated operating conditions occur both prior to and beyond the critical parameter value. Identifying a bifurcation as supercritical, subcritical, transcritical, or otherwise is the problem of determining the direction of the bifurcation. A book on nonlinear dynamics should be consulted for a more extensive treatment
THE CONTROL HANDBOOK

(e.g., [10, 18, 28, 30, 32, 43, 46]). In this chapter only the basic elements of bifurcation analysis can be touched upon. Suppose that the origin of Equation 57.181 loses stability as μ increases through the critical parameter value μ = μ_c. Under mild assumptions, it can be shown that a bifurcation occurs at μ_c.
Figure 57.13 serves two purposes: it depicts a subcritical bifurcation from the origin, and it shows a common consequence of subcritical bifurcation, namely hysteresis. A subcritical bifurcation occurs from the origin at the point labeled A in the figure. It leads to the unstable candidate operating condition corresponding to points on the dashed curve connecting points A and B. As the parameter μ is decreased to its value at point B, the bifurcated solution merges with another (stable) candidate operating condition and disappears. A saddle-node bifurcation is said to occur at point B. This is because the unstable candidate operating condition (the "saddle" lying on the dashed curve) merges with a stable candidate operating condition (the "node" lying on the solid curve). These candidate operating conditions can be equilibrium points or limit cycles; both situations are captured in the figure. Indeed, the origin can also be reinterpreted as a limit cycle and the diagram would still be meaningful. Another common name for a saddle-node bifurcation point is a turning point.
Figure 57.13  Subcritical bifurcation with hysteresis.
The physical scenario implied by Figure 57.13 can be summarized as follows. Starting from operation at the origin for small μ, increasing μ until point A is reached does not alter system behavior. If μ is increased past point A, however, the origin loses stability. The system then transitions ("jumps") to the available stable operating condition on the upper solid curve. This large transition can be intolerable in many applications. As μ is then decreased, another transition back to the origin occurs but at a lower parameter value, namely that corresponding to point B. Thus, the combination of the subcritical bifurcation at A and the saddle-node bifurcation at B can lead to a hysteresis effect. Figure 57.14 depicts a supercritical bifurcation from the origin. This bifurcation is distinguished by the fact that the solution bifurcating from the origin at point A is stable, and occurs locally for parameter values μ beyond the critical value (i.e., those for which the nominal equilibrium point is unstable). In marked difference with the situation depicted in Figure 57.13, here as the critical parameter value is crossed a smooth change is observed in the system operating condition. No sudden jump occurs. Suppose closeness of the system's operation to the nominal equilibrium point (the origin, say) is a measure of the system's performance. Then supercritical bifurcations ensure close operation to the nominal equilibrium, while subcritical bifurcations may lead to large excursions away from the nominal equilibrium point. For this reason, a supercritical bifurcation is commonly also said to be safe or soft, while a subcritical bifurcation is said to be dangerous or hard [49, 32, 52]. Given full information on the nominal equilibrium point, the occurrence of bifurcation is a consequence of the behavior of the linearized system at the equilibrium point.
The manner in which the equilibrium point loses stability as the bifurcation parameter is varied determines the type of bifurcation that arises.
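The jump-and-hysteresis scenario of Figure 57.13 can be reproduced with a quasistatic parameter sweep. The model below, ẋ = μx + x³ − x⁵, is a standard subcritical-pitchfork example chosen purely for illustration (it is not one of the chapter's systems); it has a subcritical bifurcation at μ = 0 (playing the role of point A) and a saddle-node at μ = −1/4 (point B):

```python
def step(x, mu, dt=1e-2):
    # Forward-Euler step of dx/dt = mu*x + x**3 - x**5 (illustrative model:
    # subcritical pitchfork at mu = 0, saddle-node at mu = -1/4).
    return x + dt * (mu * x + x**3 - x**5)

def sweep(mus, x_init=1e-3, floor=1e-3):
    # Quasistatic sweep: at each parameter value, relax toward an attractor.
    # The small floor mimics ambient perturbations, so the state can leave
    # the origin once the origin becomes unstable.
    x, amps = x_init, []
    for mu in mus:
        x = max(abs(x), floor)
        for _ in range(2000):
            x = step(x, mu)
        amps.append(abs(x))
    return amps

up = sweep([-0.5 + 0.025 * k for k in range(33)])        # mu: -0.5 -> 0.3
down = sweep([0.3 - 0.025 * k for k in range(33)], 1.2)  # mu: 0.3 -> -0.5
# At mu = -0.1 (index 16) the two sweeps sit on different branches:
print(up[16], down[16])
```

Sweeping μ upward, the state jumps to the large-amplitude branch only past μ = 0; sweeping downward, it falls back near μ = −1/4, so the two sweeps disagree in between: hysteresis.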
Figure 57.14  Supercritical bifurcation.
Three types of local bifurcation and a global bifurcation are discussed next. These are, respectively, the stationary bifurcation, the saddle-node bifurcation, the Andronov-Hopf bifurcation, and the period doubling bifurcation. All of these except the saddle-node bifurcation can be safe or dangerous. However, the saddle-node bifurcation is always dangerous. There are analytical techniques for determining whether a stationary, Andronov-Hopf, or period doubling bifurcation is safe or dangerous. These techniques are not difficult to understand but involve calculations that are too lengthy to repeat here. The calculations yield formulas for so-called "bifurcation stability coefficients" [18], the meaning of which is addressed below. The references [1, 2, 4, 18, 28, 32] may be consulted for details. What is termed here as "stationary bifurcation" is actually a special case of the usual meaning of the term. In the bifurcation theory literature [10], stationary bifurcation is any bifurcation of one or more equilibrium points from a nominal equilibrium point. When the nominal equilibrium point exists both before and after the critical parameter value, a stationary bifurcation "from a known solution" is said to occur. If the nominal solution disappears beyond the critical parameter value, a stationary bifurcation "from an unknown solution" is said to occur. To simplify the terminology, here the former type of bifurcation is called a stationary bifurcation. The saddle-node bifurcation is the most common example of the latter type of bifurcation. Andronov-Hopf bifurcation also goes by other names. "Hopf bifurcation" is the traditional name in the West, but this name neglects the fundamental early contributions of Andronov and
his co-workers (see, e.g., [6]). The essence of this phenomenon was also known to Poincaré, who did not develop a detailed theory but used the concept in his study of lunar orbital dynamics [37, Secs. 51-52]. The same phenomenon is sometimes called flutter bifurcation in the engineering literature. This bifurcation of a limit cycle from an equilibrium point occurs when a complex conjugate pair of eigenvalues crosses the imaginary axis into the right half of the complex plane at μ = μ_c. A small-amplitude limit cycle then emerges from the nominal equilibrium point at
μ_c.

Saddle-node bifurcation and stationary bifurcation

Saddle-node bifurcation occurs when the linearized system has a zero eigenvalue at μ = μ_c but the origin doesn't persist as an equilibrium point beyond the critical parameter value. Saddle-node bifurcation was discussed briefly before, and will not be discussed in any detail in the following. Several basic remarks are, however, in order.
1. Saddle-node bifurcation of a nominal, stable equilibrium point entails the disappearance of the equilibrium upon its merging with an unstable equilibrium point at a critical parameter value.
2. The bifurcation occurring at point B in Figure 57.13 is representative of a saddle-node bifurcation.
3. The nominal equilibrium point possesses a zero eigenvalue at a saddle-node bifurcation.
4. An important feature of the saddle-node bifurcation is the disappearance, locally, of any stable bounded solution of the system (57.181).

Stationary bifurcation (according to the agreed upon terminology above) is guaranteed to occur when a single real eigenvalue goes from being negative to being positive as μ passes through the value μ_c. More precisely, the origin of Equation 57.181 undergoes a stationary bifurcation at the critical parameter value μ_c = 0 if hypotheses (S1) and (S2) hold.
where the ellipsis denotes higher order terms. One of the new equilibrium points occurs for ε > 0 and the other for ε < 0. Also, the stability of the new equilibrium points is determined by the sign of an eigenvalue β(ε) of the system linearization at the new equilibrium points. This eigenvalue is near 0 and is also given by a smooth function of the parameter ε:

β(ε) = β₁ε + β₂ε² + ⋯
Stability of the bifurcated equilibrium points is determined by the sign of β(ε). If β(ε) < 0 the corresponding equilibrium point is stable, while if β(ε) > 0 the equilibrium point is unstable. The coefficients β_i, i = 1, 2, ... in the expansion above are the bifurcation stability coefficients mentioned earlier, for the case of stationary bifurcation. The values of these coefficients determine the local nature of the bifurcation. Since ε can be positive or negative, it follows that if β₁ ≠ 0 the bifurcation is neither subcritical nor supercritical. (This is equivalent to the condition μ₁ ≠ 0.) The bifurcation is therefore generically transcritical. In applications, however, special structures of system dynamics and inherent symmetries often result in stationary bifurcations that are not transcritical. Also, it is sometimes possible to render a stationary bifurcation supercritical using feedback control. For these reasons, a brief discussion of subcritical and supercritical pitchfork bifurcations is given next. If β₁ = 0 and β₂ ≠ 0, a stationary bifurcation is known as a pitchfork bifurcation. The pitchfork bifurcation is subcritical if β₂ > 0; it is supercritical if β₂ < 0. The bifurcation diagram of a subcritical pitchfork bifurcation is depicted in Figure 57.15, and that of a supercritical pitchfork bifurcation is depicted in Figure 57.16. The bifurcation discussed previously for the example system (57.182) is a supercritical pitchfork bifurcation.
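The direction of the pitchfork in the example (57.182), ẋ = μx − x³, can be verified numerically by continuing the nonzero equilibrium branch with Newton's method and sampling the eigenvalue β along it. With the branch amplitude playing the role of ε, the ratio β/ε² should approach β₂ = −2 < 0 (supercritical). This is an illustrative sketch for this one example, not a general bifurcation-analysis routine:

```python
def f(x, mu):
    # The example system (57.182): dx/dt = mu*x - x**3.
    return mu * x - x**3

def fx(x, mu, h=1e-6):
    # Central-difference approximation of the linearization f'(x).
    return (f(x + h, mu) - f(x - h, mu)) / (2 * h)

def newton(x, mu, iters=50):
    # Continue the nonzero equilibrium branch by Newton's method.
    for _ in range(iters):
        x -= f(x, mu) / fx(x, mu)
    return x

for mu in (1e-2, 1e-4):
    eps = newton(1.0, mu)          # branch amplitude, here eps = sqrt(mu)
    beta = fx(eps, mu)             # eigenvalue on the bifurcated branch
    print(mu, eps, beta / eps**2)  # ratio approaches beta_2 = -2
```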
(S1) F of system (57.181) is sufficiently smooth in x, μ, and F_μ(0) = 0 for all μ in a neighborhood of 0. The Jacobian A(μ) := (∂F_μ/∂x)(0) possesses a simple real eigenvalue λ(μ) such that λ(0) = 0 and λ′(0) ≠ 0.
(S2) All eigenvalues of the critical Jacobian A(0) besides the eigenvalue 0 have negative real parts.
Under (S1) and (S2), two new equilibrium points of Equation 57.181 emerge from the origin at μ = 0. Bifurcation stability coefficients are quantities that determine the direction of bifurcation, and in particular the stability of the bifurcated solutions. Next, a brief discussion of the origin and meaning of these quantities is given. Locally, the new equilibrium points occur for parameter values given by a smooth function of an auxiliary small parameter ε (ε can be positive or negative):

μ(ε) = μ₁ε + μ₂ε² + ⋯
Figure 57.15  Subcritical pitchfork bifurcation.
Andronov-Hopf bifurcation

Suppose that the origin of Equation 57.181 loses stability as the result of a complex conjugate pair of eigenvalues of A(μ) crossing the imaginary axis. All other eigenvalues are assumed to remain stable, i.e., their real parts are negative for all values of μ. Under this simple condition on the linearization of a nonlinear system, the nonlinear system typically undergoes a bifurcation. The word "typically" is used because there is one more condition to satisfy, but it is a
Figure 57.16
exponents have negative real parts, then the limit cycle is stable. Although it isn't possible to discuss the basic theory of limit cycle stability in any detail here, the reader is referred to almost any text on differential equations, dynamical systems, or bifurcation theory for a detailed discussion (e.g., [10, 18, 28, 30, 32, 43, 46, 48]). The stability of the limit cycle resulting from an Andronov-Hopf bifurcation is determined by the sign of a particular characteristic exponent β(ε). This characteristic exponent is real and vanishes in the limit as the bifurcation point is approached. It is given by a smooth and even function of the amplitude ε of the limit cycles:

β(ε) = β₂ε² + β₄ε⁴ + ⋯   (57.186)
Supercritical pitchfork bifurcation.
mild condition. The type of bifurcation that occurs under these circumstances involves the emergence of a limit cycle from the origin as μ is varied through μ_c. This is the Andronov-Hopf bifurcation, a more precise description of which is given next. The following hypotheses are invoked in the Andronov-Hopf Bifurcation Theorem. The critical parameter value is taken to be μ_c = 0 without loss of generality.

(AH1) F of system (57.181) is sufficiently smooth in x, μ, and F_μ(0) = 0 for all μ in a neighborhood of 0. The Jacobian A(μ) possesses a complex-conjugate pair of (algebraically) simple eigenvalues λ(μ) = α(μ) + iω(μ), λ̄(μ), such that α(0) = 0, α′(0) ≠ 0 and ω_c := ω(0) > 0.
(AH2) All eigenvalues of the critical Jacobian A(0) besides ±iω_c have negative real parts.
The coefficients μ₂ and β₂ in the expansions above are related by the exchange of stability formula

β₂ = −2α′(0)μ₂.
Generically, these coefficients do not vanish. Their signs determine the direction of bifurcation. The coefficients β₂, β₄, ... in the expansion (57.186) are the bifurcation stability coefficients for the case of Andronov-Hopf bifurcation. If β₂ > 0, then locally the bifurcated limit cycle is unstable and the bifurcation is subcritical. This case is depicted in Figure 57.17. If β₂ < 0, then locally the bifurcated limit cycle is stable (more precisely, one says that it is orbitally asymptotically stable [10]). This is the case of supercritical Andronov-Hopf bifurcation, depicted in Figure 57.18. If it happens that β₂ vanishes, then stability is determined by the first nonvanishing bifurcation stability coefficient (if one exists).
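The amplitude scaling implied by μ(ε) ≈ μ₂ε² can be observed by simulating a planar normal-form example (a hypothetical illustrative system, not one from the chapter): ẋ = μx − ωy − x(x² + y²), ẏ = ωx + μy − y(x² + y²). It undergoes a supercritical Andronov-Hopf bifurcation at μ = 0, and the limit-cycle amplitude grows like √μ:

```python
import math

def simulate(mu, omega=1.0, dt=1e-3, steps=100_000):
    # Forward-Euler integration of the planar Hopf normal form:
    #   x' = mu*x - omega*y - x*(x^2 + y^2)
    #   y' = omega*x + mu*y - y*(x^2 + y^2)
    # (illustrative system; supercritical Hopf at mu = 0).
    x, y = 0.1, 0.0
    for _ in range(steps):
        r2 = x * x + y * y
        x, y = (x + dt * (mu * x - omega * y - x * r2),
                y + dt * (omega * x + mu * y - y * r2))
    return math.hypot(x, y)

for mu in (0.01, 0.04):
    print(mu, simulate(mu), math.sqrt(mu))
```

Forward Euler with a small step suffices here because the limit cycle is strongly attracting; the computed radius for μ = 0.04 comes out close to √0.04 = 0.2.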
The Andronov-Hopf Bifurcation Theorem asserts that, under (AH1) and (AH2), a small-amplitude nonconstant limit cycle (i.e., periodic solution) of Equation 57.181 emerges from the origin at μ = 0. Locally, the limit cycles occur for parameter values given by a smooth and even function of the amplitude ε of the limit cycles:

μ(ε) = μ₂ε² + μ₄ε⁴ + ⋯
where the ellipsis denotes higher order terms. Stability of an equilibrium point of the system (57.181) can be studied using eigenvalues of the system linearization evaluated at the equilibrium point. The analogous quantities for consideration of limit cycle stability for Equation 57.181 are the characteristic multipliers of the limit cycle. (For a definition, see for example [10, 18, 28, 30, 32, 43, 46, 48].) A limit cycle is stable (precisely: orbitally asymptotically stable) if its characteristic multipliers all have magnitude less than unity. This is analogous to the widely known fact that an equilibrium point is stable if the system eigenvalues evaluated there have negative real parts. The stability condition is sometimes stated in terms of the characteristic exponents of the limit cycle, quantities which are easily obtained from the characteristic multipliers. If the characteristic
Figure 57.17
Subcritical Andronov-Hopf bifurcation.
Period doubling bifurcation

The bifurcations considered above are all local bifurcations, i.e., bifurcations from an equilibrium point of the system (57.181). Solutions emerging at these bifurcation points can themselves undergo further bifurcations. A particularly important scenario involves a global bifurcation known as the period doubling bifurcation.
rived in the literature (see, e.g., [4]). Recently, an approximate test that applies in continuous time has been derived using the harmonic balance approach [47].
Chaos
Figure 57.18
Supercritical Andronov-Hopf bifurcation.
To describe the period doubling bifurcation, consider the one-parameter family of nonlinear systems (57.181). Suppose that Equation 57.181 has a limit cycle γ_μ for a range of values of the real parameter μ. Moreover, suppose that for all values of μ to one side (say, less than) a critical value μ_c, all the characteristic multipliers of γ_μ have magnitude less than unity. If exactly one characteristic multiplier exits the unit circle at μ = μ_c, and does so at the point (−1, 0), and if this crossing occurs with a nonzero rate with respect to μ, then one can show that a period doubling bifurcation from γ_μ occurs at μ = μ_c. (See, e.g., [4] and references therein.) This means that another limit cycle, initially of twice the period of γ_{μ_c}, emerges from γ_μ at μ = μ_c. Typically, the bifurcation is either supercritical or subcritical. In the supercritical case, the period doubled limit cycle is stable and occurs for parameter values on the side of μ_c for which the limit cycle γ_μ is unstable. In the subcritical case, the period doubled limit cycle is unstable and occurs on the side of μ_c for which the limit cycle γ_μ is stable. In either case, an exchange of stability is said to have occurred between the nominal limit cycle γ_μ and the bifurcated limit cycle. Figure 57.19 depicts a supercritical period doubling bifurcation. In this figure, a solid curve represents a stable limit cycle, while a dashed curve represents an unstable limit cycle. The figure assumes that the nominal limit cycle loses stability as μ increases through μ_c.
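In discrete time the multiplier condition is easy to check. For the logistic map x ↦ rx(1 − x) (a standard example, not one of the chapter's systems), the fixed point's multiplier is r(1 − 2x*) = 2 − r, which exits the unit circle through −1 at r = 3, where a stable period-2 orbit appears:

```python
def logistic(r, x):
    return r * x * (1 - x)

def multiplier(r):
    # Characteristic multiplier of the fixed point x* = 1 - 1/r of the
    # logistic map: the map's derivative there, r*(1 - 2*x*) = 2 - r.
    x = 1 - 1 / r
    return r * (1 - 2 * x)

def attractor_period(r, x0=0.5, transient=5000, max_period=8, tol=1e-6):
    # Iterate past the transient, then look for the smallest period.
    x = x0
    for _ in range(transient):
        x = logistic(r, x)
    orbit = [x]
    for _ in range(max_period):
        orbit.append(logistic(r, orbit[-1]))
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None

print(multiplier(2.9), attractor_period(2.9))  # multiplier > -1: period 1
print(multiplier(3.1), attractor_period(3.1))  # multiplier < -1: period 2
```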
Figure 57.19
Period doubling bifurcation (supercritical case).
The direction of a period doubling bifurcation can easily be determined in discrete time, using formulas that have been de-
Bifurcations of equilibrium points and limit cycles are well understood and there is little room for alternative definitions of the main concepts. Although the notation, style, and emphasis may differ among various presentations, the main concepts and results stay the same. Unfortunately, the situation is not quite as tidy in regard to discussions of chaos. There are several distinct definitions of chaotic behavior of dynamical systems. There are also some aspects of chaotic motion that have been found to be true for many systems but have not been proved in general. The aim of this subsection is to summarize in a nonrigorous fashion some important aspects of chaos that are widely agreed upon. The following working definition of chaos will suffice for the purposes of this chapter. The definition is motivated by [46, p. 323] and [11, p. 50]. It uses the notion of "attractor" defined in section two, and includes the definition of an additional notion, namely that of "strange attractor." A solution trajectory of a deterministic system (such as Equation 57.180) is chaotic if it converges to a strange attractor. A strange attractor is a bounded attractor that: (1) exhibits sensitive dependence on initial conditions, and (2) cannot be decomposed into two invariant subsets contained in disjoint open sets. A few remarks on this working definition are in order. An aperiodic motion is one that is not periodic. Long-term behavior refers to steady-state behavior, i.e., system behavior that persists after the transient decays. Sensitive dependence on initial conditions means that for almost any initial condition lying on the strange attractor, there exists another initial condition as close as desired to the given one such that the solution trajectories starting from these two initial conditions separate by at least some prespecified amount after some time.
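Sensitive dependence is easy to exhibit numerically. For the logistic map with r = 4 (a standard chaotic example, not one of the chapter's systems), two trajectories started 10⁻¹² apart separate to order 0.1 within a few dozen iterates:

```python
def logistic4(x):
    # The logistic map with r = 4, a standard example of a chaotic system.
    return 4 * x * (1 - x)

def separation_steps(x0, delta=1e-12, threshold=0.1):
    # Count iterations until two trajectories started delta apart differ
    # by more than the threshold: sensitive dependence on initial conditions.
    a, b = x0, x0 + delta
    for n in range(1000):
        if abs(a - b) > threshold:
            return n
        a, b = logistic4(a), logistic4(b)
    return None

print(separation_steps(0.2))  # a few dozen steps despite delta = 1e-12
```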
The requirement of nondecomposability simply ensures that strange attractors are considered as being distinct if they are not connected by any system trajectories. Often, sensitive dependence on initial conditions is defined in terms of the presence of at least one positive "Liapunov exponent" (e.g., [34, 35]). Further discussion of this viewpoint would entail technicalities that are not needed in the sequel. From a practical point of view, a chaotic motion can be defined as a bounded invariant motion of a deterministic system that is not an equilibrium solution or a periodic solution or a quasiperiodic solution [32, p. 277]. (A quasiperiodic function is one that is composed of finitely many periodic functions with incommensurate frequencies. See [32, p. 231].) There are two more important aspects of strange attractors and chaos that should be noted, since they play an important role in a technique for control of chaos discussed in section 57.6.7. These are:
1. A strange attractor generally has embedded within
itself infinitely many unstable limit cycles. For example, Figure 57.20(a) depicts a strange attractor, and Figure 57.20(b) depicts an unstable limit cycle that is embedded in the strange attractor. Note that the shape of the limit cycle resembles that of the strange attractor. This is to be expected in general. The limit cycle chosen in the plot happens to be one of low period.
2. The trajectory starting from any point on a strange attractor will, after sufficient time, pass as close as desired to any other point of interest on the strange attractor. This follows from the indecomposability of strange attractors noted previously.
Figure 57.20
Strange attractor with embedded unstable limit cycle.
An important way in which chaotic behavior arises is through sequences of bifurcations. A well-known such mechanism is the period doubling route to chaos, which involves the following sequence of events:

1. A stable limit cycle loses stability, and a new stable limit cycle of double the period emerges. (The original stable limit cycle might have emerged from an equilibrium point via a supercritical Andronov-Hopf bifurcation.)
2. The new stable limit cycle loses stability, releasing another stable limit cycle of twice its period.
3. There is a cascade of such events, with the parameter separation between each two successive events decreasing geometrically.⁴ This cascade culminates in a sustained chaotic motion (a strange attractor).
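The cascade can be located numerically for the logistic map x ↦ rx(1 − x) by bisecting on the parameter at which a period-p orbit stops repeating after p iterates. The sketch below (illustrative, not from the chapter) estimates the first three doubling thresholds; ratios of successive parameter gaps approach Feigenbaum's constant ≈ 4.669:

```python
def logistic(r, x):
    return r * x * (1 - x)

def has_period(r, p, transient=20000, tol=1e-9):
    # True if the attractor repeats after p iterates (period p or a divisor).
    x = 0.5
    for _ in range(transient):
        x = logistic(r, x)
    y = x
    for _ in range(p):
        y = logistic(r, y)
    return abs(y - x) < tol

def doubling_point(p, lo, hi, iters=30):
    # Bisect for the parameter where period p gives way to period 2p.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if has_period(mid, p):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# First three period-doubling thresholds of the logistic map.
r1 = doubling_point(1, 2.9, 3.2)
r2 = doubling_point(2, 3.3, 3.5)
r3 = doubling_point(4, 3.5, 3.56)
ratio = (r2 - r1) / (r3 - r2)
print(r1, r2, r3, ratio)
```

Near each threshold the transient decays slowly, so the thresholds are resolved only to roughly 10⁻³ here, and the ratio estimate is correspondingly rough.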
57.6.4 Bifurcations and Chaos in Physical Systems

In this brief section, a list of representative physical systems that exhibit bifurcations and/or chaotic behavior is given. The purpose is to provide practical motivation for study of these phenomena and their control.
⁴This is true exactly in an asymptotic sense. The ratio in the geometric sequence is a universal constant discovered by Feigenbaum. See, e.g., [11, 32, 34, 46, 43, 48].
Examples of physical systems exhibiting bifurcations and/or chaotic behavior include the following.
- An aircraft stalls for flight under a critical speed or above a critical angle-of-attack (e.g., [8, 13, 39, 51]).
- Aspects of laser operation can be viewed in terms of bifurcations. The simplest such observation is that a laser can only generate significant output if the pump energy exceeds a certain threshold (e.g., [19, 46]). More interestingly, as the pump energy increases, the laser operating condition can exhibit bifurcations leading to chaos (e.g., [17]).
- The dynamics of ships at sea can exhibit bifurcations for wave frequencies close to a natural frequency of the ship. This can lead to large-amplitude oscillations, chaotic motion, and ship capsize (e.g., [24, 40]).
- Lightweight, flexible, aircraft wings tend to experience flutter (structural oscillations) (e.g., [39]), along with loss of control surface effectiveness (e.g., [15]).
- At peaks in customer demand for electric power (such as during extremes in weather), the stability margin of an electric power network may become negligible, and nonlinear oscillations or divergence ("voltage collapse") can occur (e.g., [12]).
- Operation of aeroengine compressors at maximum pressure rise implies reduced weight requirements but also increases the risk for compressor stall (e.g., [16, 20, 25]).
- A simple model for the weather consists of fluid in a container (the atmosphere) heated from below (the sun's rays reflected from the ground) and cooled from above (outer space). A mathematical description of this model used by Lorenz [27] exhibits bifurcations of convective and chaotic solutions. This has implications also for boiling channels relevant to heat-exchangers, refrigeration systems, and boiling water reactors [14].
- Bifurcations and chaos have been observed and studied in a variety of chemically reacting systems (e.g., [42]).
- Population models useful in the study and formulation of harvesting strategies exhibit bifurcations and chaos (e.g., [19, 31, 46]).
57.6.5 Control of Parametrized Families of Systems

Tracking of a desired trajectory (referred to as regulation when the trajectory is an equilibrium) is a standard goal in control system design [26]. In applying linear control system design to this standard problem, an evolving (nonstationary) system is modeled by a parametrized family of time-invariant (stationary)
systems. This approach is at the heart of the gain scheduling method, for example [26]. In this section, some basic concepts in control of parameterized families of systems are reviewed, and a notion of stressed operation is introduced. Control laws for nonlinear systems usually consist of a feedforward control plus a feedback control. The feedforward part of the control is selected first, followed by the feedback part. This decomposition of control laws is discussed in the next subsection, and will be useful in the discussions of control of bifurcations and chaos. In the second subsection, a notion of "stressed operation" of a system is introduced. Stressed systems are not the only ones for which control of bifurcations and chaos are relevant, but they are an important class for which such control issues need to be evaluated.
Feedforward/Feedback Structure of Control Laws

Control designs for nonlinear systems can usually be viewed as proceeding in two main steps [26, Chapters 2, 14], [45, Chapter 3]:
1. feedforward control
2. feedback control
In Step 1, the feedforward part of the control input is selected. Its purpose is to achieve a desired candidate operating condition for the system. If the system is considered as depending on one or more parameters, then the feedforward control will also depend on the parameters. The desired operating condition can often be viewed as an equilibrium point of the nonlinear system that varies as the system parameters are varied. There are many situations when this operating condition is better viewed as a limit cycle that varies with the parameters. In Step 2, an additional part of the control input is designed to achieve desired qualities of the transient response in a neighborhood of the nominal operating condition. Typically this second part of the control is selected in feedback form. In the following, the feedforward part of the control input is taken to be designed already and reflected in the system dynamical equations. The discussion centers on design of the feedback part of the control input. Because of this, it is convenient to denote the feedback part of the control simply by u and to view the feedforward part as being determined a priori. It is also convenient to take u to be small (close to zero) near the nominal operating condition. (Any offset in u is considered part of the feedforward control.) It is convenient to view the system as depending on a single exogenous parameter, denoted by μ. For instance, μ can represent the set-point of an aircraft's angle-of-attack, or the power demanded of an electric utility by a customer. In the former example, the control u might denote actuation of the aircraft's elevator angles about the nominal settings. In the latter example, u can represent a control signal in the voltage regulator of a power generator. Consider, then, a nonlinear system depending on a single parameter μ:

ẋ = f(x, u, μ).   (57.188)
Here x ∈ Rn is the n-dimensional state vector, u is the feedback part of the control input, and μ is a parameter. Both u and μ are taken to be scalar-valued for simplicity. The dependence of the system equations on x, u, and μ is assumed smooth; i.e., f is jointly continuous and several times differentiable in these variables. This system is thus actually a one-parameter family of nonlinear control systems. The parameter μ is viewed as being allowed to change so slowly that its variation can be taken as quasistatic. Suppose that the nominal operating condition of the system is an equilibrium point x⁰(μ) that depends on the parameter μ. For simplicity of notation, suppose that the state x has been chosen so that this nominal equilibrium point is the origin for the range of values of μ for which it exists (x⁰(μ) = 0). Recall that the nominal equilibrium is achieved using feedforward control. Although the process of choosing a feedforward control is not elaborated on here, it is important to emphasize that in general this process aims at securing a particular desired candidate operating condition. The feedforward control thus is expected to result in an acceptable form for the nominal operating condition, but there is no reason to expect that other operating conditions will also behave in a desirable fashion. As the parameter varies, the nominal operating condition may interact with other candidate operating conditions in bifurcations. The design of the feedback part of the control should take into account the other candidate operating conditions in addition to the nominal one. For simplicity, suppose the control u isn't allowed to introduce additional dynamics. That is, suppose u is required to be a direct state feedback (or static feedback), and denote it by u = u(x, μ). Note that in general u can depend on the parameter μ. That is, it can be scheduled.
Since in the discussion above it was assumed that u is small near the nominal operating condition, it is assumed that u(0, p ) = 0 (for the parameter range in which the origin is an equilibrium). Denote the linearization of Equation 57.188 at x0 = 0, u = 0 by
Here, A(p) = ∂f/∂x (0, 0, p) and b(p) = ∂f/∂u (0, 0, p).
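Since f is assumed smooth, the Jacobians A(p) and b(p) can also be estimated numerically. The sketch below is not from the handbook; the vector field f and the parameter value are invented for illustration, and the Jacobians are formed by forward differences:

```python
import numpy as np

def linearize(f, x0, u0, p, eps=1e-6):
    """Forward-difference estimates of A(p) = df/dx and b(p) = df/du at
    the equilibrium (x0, u0).  `f` is a hypothetical user-supplied vector
    field for x' = f(x, u, p); nothing here is from the handbook itself."""
    n = len(x0)
    f0 = f(x0, u0, p)
    A = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (f(x0 + dx, u0, p) - f0) / eps
    b = (f(x0, u0 + eps, p) - f0) / eps
    return A, b

# Toy one-parameter family with an equilibrium at the origin for every p
f = lambda x, u, p: np.array([p * x[0] + x[1], -x[1] + x[0] * u + p * u])
A, b = linearize(f, np.zeros(2), 0.0, p=0.5)
# A is approximately [[0.5, 1.0], [0.0, -1.0]] and b approximately [0.0, 0.5]
```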
Consider how control design for the linearized system depends on the parameter p. Recall the terminology from linear system theory that the pair (A(p), b(p)) is controllable if the system (57.189) is controllable. Recall also that there are several simple tests for controllability, one of which is that the controllability matrix
is of full rank. (In this case this is equivalent to the matrix being nonsingular, since here the controllability matrix is square.)
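As a quick numerical illustration of the rank test (with a made-up pair (A, b), not one taken from the text), the controllability matrix can be assembled column by column and its rank checked:

```python
import numpy as np

def controllability_matrix(A, b):
    """Build [b, A b, A^2 b, ..., A^(n-1) b]; the pair (A, b) is
    controllable iff this n x n matrix has full rank n."""
    n = A.shape[0]
    cols = [b]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

# Illustrative pair (invented for this sketch)
A = np.array([[0.0, 1.0], [0.0, -1.0]])
b = np.array([0.0, 1.0])
C = controllability_matrix(A, b)
print(np.linalg.matrix_rank(C))  # 2 -> full rank -> controllable
```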
THE CONTROL HANDBOOK

If p is such that the pair (A(p), b(p)) is controllable, then a standard linear systems result asserts the existence of a linear feedback u(x, p) = -k(p)x stabilizing the system. (The associated closed-loop system would be ẋ = (A(p) - b(p)k(p))x.) Stabilizability tests not requiring controllability also exist, and these are more relevant to the problem at hand. Even more interesting from a practical perspective is the issue of output feedback stabilizability, since not all state variables are accessible for real-time measurement in many systems. As p is varied over the desired regime of operability, the system (57.189) may lose stability and stabilizability.
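For a controllable pair, one standard way to compute such a stabilizing gain k(p) is Ackermann's formula. The sketch below is a generic illustration with an invented pair (A, b), not the handbook's own procedure:

```python
import numpy as np

def ackermann(A, b, poles):
    """Gain k such that eig(A - b k) equals the requested poles
    (Ackermann's formula; valid when (A, b) is controllable)."""
    n = A.shape[0]
    # Controllability matrix [b, Ab, ..., A^(n-1) b]
    C = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(n)])
    phi = np.real(np.poly(poles))   # desired characteristic polynomial coeffs
    # phi(A) = A^n + phi[1] A^(n-1) + ... + phi[n] I
    phiA = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(phi))
    # k = e_n^T C^{-1} phi(A)
    return np.linalg.solve(C.T, np.eye(n)[:, -1]) @ phiA

# Illustrative controllable pair (not the system from the text)
A = np.array([[0.0, 1.0], [0.0, -1.0]])
b = np.array([0.0, 1.0])
k = ackermann(A, b, poles=[-2.0, -3.0])
# The closed loop A - b k then has the requested eigenvalues -2 and -3
```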
Stressed Operation and Break-Down of Linear Techniques

A main motivation for the study of control of bifurcations is the need in some situations to operate a system in a condition for which the stability margin is small and linear (state or output) feedback is ineffective as a means for increasing the stability margin. Such a system is sometimes referred to as being "pushed to its limits" or "heavily loaded". In such situations, the ability of the system to function in a state of increased loading is a measure of system performance. Thus, the link between increased loading and reduced stability margin can be viewed as a performance vs. stability trade-off. Systems operating with a reduced achievable margin of stability may be viewed as being "stressed". This trade-off is not a general fact that can be proved in a rigorous fashion, but has been found to occur in a variety of applications. Examples of this trade-off are given at the end of this subsection.

Consider a system that is weakly damped and for which the available controls cannot compensate with sufficient additional damping. Such a situation may arise for a system for some ranges of parameter values and not for others. Let the operating envelope of a system be the possible combinations of system parameters for which system operability is being considered. Linear control system methods lose their effectiveness on that part of the operating envelope for which the operating condition of interest of system (57.189) is either:
1. not stabilizable with linear feedback using available sensors and actuators
2. linearly stabilizable using available sensors and actuators, but only with unacceptably high feedback gains
3. vulnerable in the sense that small parameter changes can destroy the operating condition completely (as in a saddle-node bifurcation)
Operation in this part of the desired operating envelope can be referred to by terms such as "stressed operation".

An example of the trade-off noted previously is provided by an electric power system under conditions of heavy loading. At peaks in customer demand for electric power (such as during extremes in weather), the stability margin may become negligible, and nonlinear dynamical behaviors or divergence may arise (e.g., [12]). The divergence, known as voltage collapse, can lead to system blackout. Another example arises in operation of an aeroengine compressor at its peak pressure rise. The increased pressure rise comes at the price of nearness to instability. The unstable modes that can arise are strongly related to flow asymmetry modes that are unstabilizable by linear feedback to the compression system throttle. However, bifurcation control techniques have yielded valuable nonlinear throttle actuation techniques that facilitate operation in these circumstances with reduced risk of stall [25, 16].

57.6. CONTROL OF BIFURCATIONS AND CHAOS

57.6.6 Control of Systems Exhibiting Bifurcation Behavior

Most engineering systems are designed to operate with a comfortable margin of stability. This means that disturbances or moderate changes in system parameters are unlikely to result in loss of stability. For example, a jet airplane in straight level flight under autopilot control is designed to have a large stability margin. However, engineering systems with a usually comfortable stability margin may at times be operated at a reduced stability margin. A jet airplane being maneuvered at high angle-of-attack to gain an advantage over an enemy aircraft, for instance, may have a significantly reduced stability margin. If a system operating condition actually loses stability as a parameter (like angle-of-attack) is slowly varied, then generally it is the case that a bifurcation occurs. This means that at least one new candidate operating condition emerges from the nominal one at the point of loss of stability.

The purpose of this section is to summarize some results on control of bifurcations, with an emphasis placed on control of local bifurcations. Control of a particular global bifurcation, the period doubling bifurcation, is considered in the next section on control of chaos. This is because control of a period doubling bifurcation can result in quenching of the period doubling route to chaos summarized at the end of section three.

Bifurcation control involves designing a control input for a system to result in a desired modification to the system's bifurcation behavior. In section 57.6.5, the division of control into a feedforward component and a feedback component was discussed. Both components of a control law can be viewed in terms of bifurcation control. The feedforward part of the control sets the equilibrium points of the system, and may influence the stability margin as well. The feedback part of the control has many functions, one of which is to ensure adequate stability of the desired operating condition over the desired operating envelope.

Linear feedback is used to ensure an adequate margin of stability over a desired parameter range. Use of linear feedback to "delay" the onset of instability to parameter ranges outside the desired operating range is a common practice in control system design. An example is the gain scheduling technique [26]. Delaying instability modifies the bifurcation diagram of a system. Often, the available control authority does not allow stabilization of the nominal operating condition beyond some critical parameter value. At this value, instability leads to bifurcations of new candidate operating conditions. For simplicity, suppose that a single candidate operating condition is born at the bifurcation point. Another important goal in bifurcation control is to ensure that the bifurcation is supercritical (i.e., safe) and that the resulting candidate operating condition remains stable and close to the original operating condition for a range of parameter values beyond the critical value.

The need for control laws that soften (stabilize) a hard (unstable) bifurcation has been discussed earlier in this chapter. This need is greatest in stressed systems, since in such systems delay of the bifurcation by linear feedback is not viable. A soft bifurcation provides the possibility of an alternative operating condition beyond the regime of operability at the nominal condition. Both of these goals (delaying and stabilization) basically involve local considerations and can be approached analytically (if a good system model is available). Another goal might entail a reduction in amplitude of any bifurcated solutions over some prespecified parameter range. This goal is generally impossible to work with on a completely analytical basis. It requires extensive numerical study in addition to local analysis near the bifurcation(s).

In the most severe local bifurcations (saddle-node bifurcations), neither the nominal equilibrium point nor any bifurcated solution exists past the bifurcation. Even in such cases, an understanding of bifurcations provides some insight into control design for safe operation. For example, it may be possible to use this understanding to determine (or introduce via added control) a warning signal that becomes more pronounced as the severe bifurcation is approached. This signal would alert the high-level control system (possibly a human operator) that action is necessary to avoid catastrophe.

In this section, generally it is assumed that the feedforward component of the control has been pre-determined, and the goal is to design the feedback component.
An exception is the following brief discussion of a real-world example of the use of feedforward control to modify a system's operating condition and its stability margin in the face of large parameter variations. In short, this is an example where feedforward controls are used to successfully avoid the possibility of bifurcation. During takeoff and landing of a commercial jet aircraft, one can observe the deployment of movable surfaces on the leading- and trailing-edges of the wings. These movable surfaces, called flaps and slats, or camber changers [13], result in a nonlinear change in the aerodynamics, and, in turn, in an increased lift coefficient [51, 13]. This is needed to allow takeoff and landing at reduced speeds. Use of these surfaces has the drawback of reducing the critical angle-of-attack for stall, resulting in a reduced stability margin. A common method to alleviate this effect is to incorporate other devices, called vortex generators. These are small airfoil-shaped vanes, protruding upward from the wings [13]. The incorporation of vortex generators results in a further modification to the aerodynamics, moving the stall angle-of-attack to a higher value. The camber changers and the vortex generators are examples of feedforward control used to modify the operating condition within a part of the aircraft's operating envelope. References [13] and [51] provide further details, as well as diagrams showing how the lift coefficient curve is affected by these devices.
Local Direct State Feedback

To give a flavor of the analytical results available in the design of the feedback component in bifurcation control, consider the nonlinear control system (57.188), repeated here for convenience:
ẋ = f_p(x, u).    (57.190)
Here, u represents the feedback part of the control law; the feedforward part is assumed to be incorporated into the function f. The technique and results of [1, 2] form the basis for the following discussion. Details are not provided since they would require introduction of considerable notation related to multivariate Taylor series. However, an illustrative example is given based on formulas available in [1, 2]. Suppose for simplicity that Equation 57.190 with u = 0 possesses an equilibrium at the origin for a parameter range of interest (including the value p = 0). Moreover, suppose that the origin of Equation 57.190 with the control set to zero undergoes either a subcritical stationary bifurcation or a subcritical Andronov-Hopf bifurcation at the critical parameter value p = 0. Feedback control laws of the form u = u(x) ("static state feedbacks") are derived in [1, 2] that render the bifurcation supercritical. For the Andronov-Hopf bifurcation, this is achieved using a formula for the coefficient β2 in the expansion (57.186) of the characteristic exponent for the bifurcated limit cycle. Smooth nonlinear controls rendering β2 < 0 are derived. For the stationary bifurcation, the controlled system is desired to display a supercritical pitchfork bifurcation. This is achieved using formulas for the coefficients β1 and β2 in the expansion (57.184) for the eigenvalue of the bifurcated equilibrium determining stability. Supercriticality is ensured by determining conditions on u(x) such that β1 = 0 and β2 < 0.

The following example is meant to illustrate the technique of [2]. The calculations involve use of formulas from [2] for the bifurcation stability coefficients β1 and β2 in the analysis of stationary bifurcations. The general formulas are not repeated here. Consider the one-parameter family of nonlinear control systems
Here x1, x2 are scalar state variables, and p is a real-valued parameter. This is meant to represent a system after application of a feedforward control, so that u is to be designed in feedback form. The nominal operating condition is taken to be the origin (x1, x2) = (0, 0), which is an equilibrium of (57.191), (57.192) when the control u = 0, for all values of the parameter p. Local stability analysis at the origin proceeds in the standard manner. The Jacobian matrix A(p) of the right side of (57.191), (57.192) is given by
The system eigenvalues are p and -1. Thus, the origin is stable for p < 0 but is unstable for p > 0. The critical value of the bifurcation parameter is therefore p = 0. Since the origin persists as an equilibrium point past the bifurcation, and since the critical eigenvalue is 0 (not an imaginary pair), it is expected that a stationary bifurcation occurs. The stationary bifurcation that occurs for the open-loop system can be studied by solving the pair of algebraic equations

for a nontrivial (i.e., nonzero) equilibrium (x1, x2) near the origin for p near 0. Adding Equation 57.194 to Equation 57.195 gives

Disregarding the origin, this gives two new equilibrium points that exist for p < 0, namely

Since these bifurcated equilibria occur for parameter values (p < 0) for which the nominal operating condition is unstable, the bifurcation is a subcritical pitchfork bifurcation.

The first issue addressed in the control design is the possibility of using linear feedback to delay the bifurcation to some positive value of p. This would require stabilization of the origin at p = 0. Because of the way in which the control enters the system dynamics, however, the system eigenvalues are not affected by linear feedback at the parameter value p = 0. To see this, simply note that in Equation 57.192, the term pu vanishes when p = 0, and the remaining impact of the control is through the term x1u. This latter term would result only in the addition of quadratic terms to the right side of Equations 57.191 and 57.192 for any linear feedback u. Hence, the system is an example of a stressed nonlinear system for p near 0.

Since the pitchfork bifurcation cannot be delayed by linear feedback, next consider the possibility of rendering the pitchfork bifurcation supercritical using nonlinear feedback. This rests on determining how feedback affects the bifurcation stability coefficients β1 and β2 for this stationary bifurcation. Once this is known, it is straightforward to seek a feedback that renders β1 = 0 and β2 < 0. The formulas for β1 and β2 derived in [2] simplify for systems with no quadratic terms in the state variables. For such systems, the coefficient β1 always vanishes, and the calculation of β2 also simplifies. Since the dynamical equations (57.191) and (57.192) in the example contain no quadratic terms in x, it follows that β1 = 0 in the absence of control. Moreover, if the control contains no linear terms, then it will not introduce quadratic terms into the dynamics. Hence, for any smooth feedback u(x) containing no linear terms in x, the bifurcation stability coefficient β1 = 0.

As for the bifurcation stability coefficient β2, the pertinent formula in [2] applied to the open-loop system yields β2 = 2. Thus, a subcritical pitchfork bifurcation is predicted for the open-loop system, a fact that was established above using simple algebra. Now let the control consist of a quadratic function of the state and determine conditions under which β2 < 0 for the closed-loop system.⁵ Thus, consider u to be of the form

The formula in [2] yields that β2 for the closed-loop system is then given by

Thus, to render the pitchfork bifurcation in (57.191), (57.192) supercritical, it suffices to take u to be the simple quadratic function

with any gain k1 > 1. In fact, other quadratic terms, as well as any other cubic or higher order terms, can be included along with this term without changing the local result that the bifurcation is rendered supercritical. Additional terms can be useful in improving system behavior as the parameter leaves a local neighborhood of its critical value.
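The distinction the example turns on, between a subcritical (β2 > 0) and a supercritical (β2 < 0) pitchfork, can be illustrated on the scalar normal form ẋ = px + cx³, where the sign of the cubic coefficient c plays the role of β2. This is only a normal-form illustration, not the system (57.191), (57.192):

```python
import numpy as np

def pitchfork_equilibria(p, c):
    """Nonzero equilibria of the normal form x' = p*x + c*x**3.
    c > 0 (like beta2 > 0): the branch exists only for p < 0 and is
    unstable (subcritical).  c < 0 (like beta2 < 0): the branch exists
    only for p > 0 and is stable (supercritical)."""
    if p / c < 0:
        r = np.sqrt(-p / c)
        return [-r, r]
    return []

# Open-loop analogue (c = +2, echoing beta2 = 2): subcritical branch
sub = pitchfork_equilibria(p=-0.25, c=2.0)   # +/- sqrt(0.125)
# After feedback flips the sign of the cubic coefficient: supercritical branch
sup = pitchfork_equilibria(p=0.25, c=-1.0)   # +/- 0.5
```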
Local Dynamic State Feedback

Use of a static state feedback control law u = u(x) has potential disadvantages in nonlinear control of systems exhibiting bifurcation behavior. To explain this, consider the case of an equilibrium x0(p) as the nominal operating condition. The equilibrium is not translated to the origin, in order to illustrate how it is affected by feedback. In general, a static state feedback
designed with reference to the nominal equilibrium path x O ( p ) of Equation 57.190 will affect not only the stability of this equilibrium but also the location and stability of other equilibria. Now suppose that Equation 57.190 is only an approximate model for the physical system of interest. Then the nominal equilibrium branch will also be altered by the feedback. A main disadvantage of such an effect is the wasted control energy that is associated with the forced alteration of the system equilibrium structure. Other disadvantages are that system performance is often degraded by operating at an equilibrium that differs from the one at which the system is designed to operate. Moreover, by influencing the locations of system equilibria, the feedback control is competing with the feedforward part of the control. For these reasons, dynamic state feedback-type bifurcation control laws have been developed that do not affect the locations
5. Cubic terms are not included because they would result in quartic terms on the right side of Equations 57.191 and 57.192, while the formula for β2 in [2] involves only terms up to cubic order in the state.
of system equilibria [22, 23, 50]. The method involves incorporation of filters called "washout filters" into the controller architecture. A washout filter-aided control law preserves all system equilibria, and does so without the need for an accurate system model. A washout filter is a stable high-pass filter with zero static gain [15, p. 474]. The typical transfer function for a washout filter is
G(s) = y_w(s)/x_s(s) = s/(s + d),  d > 0,
where x_s is the input variable to the filter and y_w is the output of the filter. A washout filter produces a nonzero output only during the transient period. Thus, such a filter "washes out" sensed signals that have settled to constant steady-state values. Washout filters occur in control systems for power systems [5, p. 277], [41, Chapter 9] and aircraft [7, 15, 39, 45]. Washout filters are positioned in a control system so that a sensed signal being fed back to an actuator first passes through the washout filter. If, due to parameter drift or intentional parameter variation, the sensed signal has a steady-state value that deviates from the assumed value, the washout filter will give a zero output and the deviation will not propagate. If a direct state feedback were used instead, the steady-state deviation in the sensed signal would result in the control modifying the steady-state values of the other controlled variables. As an example, washout filters are used in aircraft yaw damping control systems to prevent these dampers from "fighting" the pilot in steady turns [39, p. 947].

From a nonlinear systems perspective, the property of washout filters described above translates to achieving equilibrium preservation, i.e., zero steady-state tracking error, in the presence of system uncertainties. Washout filters can be incorporated into bifurcation control laws for Equation 57.190. This should be done only after the feedforward control has been designed and incorporated, and after the part of the feedback control ensuring satisfactory equilibrium point structure has also been designed and incorporated into the dynamics. Otherwise, since washout filters preserve equilibrium points, there will be no possibility of modifying the equilibria. It is assumed below that these two parts of the control have been chosen and implemented. For each system state variable xi, i = 1, ..., n, in Equation 57.190, introduce a washout filter governed by the dynamic equation
dzi/dt = xi - di zi,

along with output equation

yi = xi - di zi.
Here, the di are positive parameters (this corresponds to using stable washout filters). Finally, require the control u to depend only on the measured variables y, and that u(y) satisfy u(0) = 0. In this formulation, n washout filters, one for each system state, are present. In fact, the actual number of washout filters needed, and hence also the resulting increase in system order, can usually be taken to be less than n.
It is straightforward to see that washout filters result in equilibrium preservation and automatic equilibrium (operating point) following. Indeed, since u(0) = 0, it is clear that y vanishes at steady state. Hence, the x subvector of a closed-loop equilibrium point (x, z) agrees exactly with the open-loop equilibrium value of x. Also, since yi may be re-expressed as
the control function u = u(y) is guaranteed to center at the correct operating point.
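A minimal numerical sketch of a single washout filter (with invented signal values; d plays the role of the di above) shows the equilibrium-preserving property: a constant sensed signal produces an output that decays to zero:

```python
import numpy as np

def washout(x_seq, d=1.0, dt=0.01):
    """Euler simulation of one washout filter z' = x - d*z with output
    y = x - d*z (transfer function s/(s + d)): the output passes
    transients but washes out constant steady-state signals."""
    z, ys = 0.0, []
    for x in x_seq:
        y = x - d * z
        z += dt * y
        ys.append(y)
    return np.array(ys)

# Constant sensed signal: the output starts at the signal value and
# decays to zero, so a feedback u(y) with u(0) = 0 leaves equilibria alone
y = washout(np.full(2000, 3.0), d=1.0, dt=0.01)
# y[0] is 3.0; the final output has decayed essentially to zero
```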
57.6.7 Control of Chaos

Chaotic behavior of a physical system can either be desirable or undesirable, depending on the application. It can be beneficial for many reasons, such as enhanced mixing of chemical reactants, or, as proposed recently [36], as a replacement for random noise as a masking signal in a communication system. Chaos can, on the other hand, entail large amplitude motions and oscillations that might lead to system failure. The control techniques discussed in this section have as their goal the replacement of chaotic behavior by a nonchaotic steady-state behavior. The first technique discussed is that proposed by Ott, Grebogi, and Yorke [33]. The Ott-Grebogi-Yorke (OGY) method sparked significant interest and activity in control of chaos. The second technique discussed is the use of bifurcation control to delay or extinguish the appearance of chaos in a family of systems.
Exploitation of Chaos for Control

Ott, Grebogi, and Yorke [33] proposed an approach to control of chaotic systems that involves use of small controls and exploitation of chaotic behavior. To explain this method, recall from section three that a strange attractor has embedded within itself a "dense" set (infinitely many) of unstable limit cycles. The strange attractor is a candidate operating condition, according to the definition in section two. In the absence of control, it is the actual system operating condition. Suppose that system performance would be significantly improved by operation at one of the unstable limit cycles embedded in the strange attractor. The OGY method replaces the originally chaotic system operation with operation along the selected unstable limit cycle.

Figure 57.20 depicts a strange attractor along with a particular unstable limit cycle embedded in it. It is helpful to keep such a figure in mind in contemplating the OGY method. The goal is to reach a limit cycle such as the one shown in Figure 57.20, or another of some other period or amplitude. Imagine a trajectory that lies on the strange attractor in Figure 57.20, and suppose the desire is to use control to force the trajectory to reach the unstable limit cycle depicted in the figure and to remain there for all subsequent time.

Control design to result in operation at the desired limit cycle is achieved by the following reasoning. First, note that the desired unstable limit cycle is also a candidate operating condition. Next, recall from section three that a trajectory on the strange
attractor will, after sufficient time, pass as close as desired to any other point of interest on the attractor. Thus, the trajectory will eventually come close (indeed, arbitrarily close) to the desired limit cycle. Thus, no control effort whatsoever is needed in order for the trajectory to reach the desired limit cycle - chaos guarantees that it will. To maintain the system state on the desired limit cycle, a small stabilizing control signal is applied once the trajectory enters a small neighborhood of the limit cycle. Since the limit cycle is rendered stable by this control, the trajectory will converge to the limit cycle. If noise drives the trajectory out of the neighborhood where control is applied, the trajectory will wander through the strange attractor again until it once again enters the neighborhood and remains there by virtue of the control.

Note that the control obtained by this method is an example of a variable structure control, since it is "on" in one region in state space and "off" in the rest of the space. Also, the particular locally stabilizing control used in the neighborhood of the desired limit cycle has not been discussed in the foregoing. This is because several approaches are possible, among them pole placement [38]. See [34, 35] for further discussion.

Two particularly significant strengths of the OGY technique are:
1. it requires only small controls
2. it can be applied to experimental systems for which no mathematical model is available
The first of these strengths is due to the assumption that operation at an unstable limit cycle embedded in the strange attractor is desirable. If none of the embedded limit cycles provides adequate performance, then a large control could possibly be used to introduce a new desirable candidate operating condition within the strange attractor. This could be followed by a small control possibly designed within the OGY framework.
Note that the large control would be a feedforward control, in the terminology used previously in this chapter. For a discussion of the second main strength of the OGY control method mentioned above, see, e.g., [17, 33, 34]. It suffices to note here that a construction known as experimental delay coordinate embedding is one means to implement this control method without a priori knowledge of a reliable mathematical model. Several interesting applications of the OGY control method have been performed, including control of cardiac chaos (see [35]). In [17], a multimode laser was controlled well into its usually unstable regime.
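The OGY idea can be sketched on a simple chaotic map. The example below is an illustration on the logistic map, not the original OGY surface-of-section construction; the window size and saturation level are invented. Small parameter perturbations are applied only when the free-running chaotic orbit wanders near the unstable fixed point, which then pins the orbit there:

```python
import numpy as np

# OGY-style sketch on the logistic map x -> r*x*(1-x).  At r = 3.9 the
# map is chaotic, and x* = 1 - 1/r is an unstable fixed point embedded in
# the attractor.  Small nudges dr to the parameter are applied ONLY when
# the orbit enters a small window around x*; chaos itself guarantees the
# uncontrolled orbit eventually wanders into that window.
r0 = 3.9
xstar = 1.0 - 1.0 / r0
lam = 2.0 - r0                  # df/dx at x*:  r0*(1 - 2*x*) = 2 - r0
g = xstar * (1.0 - xstar)       # df/dr at x*:  x*(1 - x*)
x = 0.3
history = []
for _ in range(2000):
    dr = 0.0
    if abs(x - xstar) < 0.02:   # control window: linearized one-step correction
        dr = np.clip(-lam * (x - xstar) / g, -0.25, 0.25)
    x = (r0 + dr) * x * (1.0 - x)
    history.append(x)
# After a transient of chaotic wandering, the orbit is held at x*
```

The clip models the "small controls" requirement: outside the window the control is off, and inside it the perturbation stays bounded.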
Bifurcation Control of Routes to Chaos
The bifurcation control techniques discussed in section six have direct relevance for issues of control of chaotic behavior of dynamical systems. This is because chaotic behavior often arises as a result of bifurcations, such as through the period doubling route to chaos. The bifurcation control technique discussed earlier in this chapter is model-based. Thus, the control of chaos applications of the technique also require availability of a reliable mathematical model. Only a few comments are given here on bifurcation control of
routes to chaos, since the main tool has already been discussed in section six. The cited references may be consulted for details and examples. In [50], a thermal convection loop model is considered. The model, which is equivalent to the Lorenz equations mentioned in section four, displays a series of bifurcations leading to chaos. In [50], an Andronov-Hopf bifurcation that occurs in the model is rendered supercritical using local dynamic state feedback (of the type discussed in section six). The feedback is designed using local calculations at the equilibrium point of interest. It is found that this simple control law results in elimination of the chaotic behavior in the system. From a practical perspective, this allows operation of the convection loop in a steady convective state with a desired velocity and temperature profile. Feedback control to render supercritical a previously subcritical period doubling bifurcation was studied in [4, 47]. In [4], a discrete-time model is assumed, whereas a continuous-time model is assumed in [47]. The discrete-time model used in [4] takes the form (k is an integer)
where x(k) ∈ R^n is the state, u(k) is a scalar control input, p ∈ R is the bifurcation parameter, and the mapping f_p is sufficiently smooth in x, u, and p. The continuous-time model used in [47] is identical to Equation 57.188. For discrete-time systems, a limit cycle must have an integer period. A period-1 limit cycle sheds a period-2 limit cycle upon period doubling bifurcation. The simplicity of the discrete-time setting results in explicit formulas for bifurcation stability coefficients and feedback controls [4]. By improving the stability characteristics of a bifurcated period-2 limit cycle, an existing period doubling route to chaos can be extinguished. Moreover, the period-2 limit cycle will then remain close to the period-1 limit cycle for an increased range of parameters.
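The period doubling phenomenon itself is easy to exhibit on a discrete-time map. The sketch below uses the logistic map, chosen purely for illustration (it is not the model of [4]), and detects the attractor's period on either side of the first period doubling:

```python
import numpy as np

def attractor_period(r, n_transient=1000, tol=1e-6):
    """Crude period detection for the logistic map x -> r*x*(1-x):
    discard a transient, then count iterates until the orbit repeats."""
    x = 0.4
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    orbit = [x]
    for _ in range(1, 64):
        x = r * x * (1.0 - x)
        if abs(x - orbit[0]) < tol:
            return len(orbit)
        orbit.append(x)
    return None  # no short period found -- possibly chaotic

p1 = attractor_period(2.9)  # stable fixed point: period-1
p2 = attractor_period(3.2)  # past the first period doubling: period-2
```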
For continuous-time systems, limit cycles cannot in general be obtained analytically. Thus, [47] employs an approximate analysis technique known as harmonic balance. Approximate bifurcation stability coefficients are obtained, and control to delay the onset of period doubling bifurcation or to stabilize such a bifurcation is discussed. The washout filter concept discussed in section six is extended in [4] to discrete-time systems. In [47], an extension of the washout filter concept is used that allows approximate preservation of limit cycles of a certain frequency.
57.6.8 Concluding Remarks

Control of bifurcations and chaos is a developing area with many interesting avenues for research and for application. Some of the tools and ideas that have been used in this area were discussed. Connections among these concepts, and relationships to traditional control ideas, have been emphasized.
57.6.9 Acknowledgment

During the preparation of this chapter, the authors were supported in part by the Air Force Office of Scientific Research (U.S.), the Electric Power Research Institute (U.S.), the National Science Foundation (U.S.), and the Ministero della Università e della Ricerca Scientifica e Tecnologica under the National Research Plan 40% (Italy).
References [ l ] Abed, E.H. and Fu, J.-H., Local feedback stabilization and bifurcation control, I. Hopf bifurcation, Syst. Control Lett., 7. L 1-17, 1986. [2] Abed E.H. and I:u 7.-H., Local feedback stabilization and bifurcation control, 11. Stationary bifurcation, Syst. Control Lett:, 8,467-473, 1987. [3] Abed E.H. and Mrang H.O., Feedback control ofbifurcation and chaos in dynamical systems, in Nonlinear Dynamics and Stochastic Mechanics, W. Kliemann and N. Sri Namachchivaya, Eds., CRC Press, Boca Raton, 1995,153-15'3. [4] Abed E.H., Wang H.O., and Chen R.C., Stabilization of period doubling bifurcations and implications for control of chaos, Physica D, 70(1-2), 154-164, 1994. [5] Anderson P.M. and Fouad A.A., Power System Control and Stability, Iowa State University Press, Ames, 1977. [6] AndronovA A., Vitt, AA., andKhaikin S.E., Theory of Oscillators, P'ergamon Press, Oxford, 1966 (reprinted by Dover, NewYork, 1987),English translation of Second Russian Edition; Original Russian edition published in 1937. [7] Blakelock J.H., Automatic Control of Aircraft and Missiles, 2nd ed., John Wiley & Sons, New York, 1991. [8] Chapman, G.T., Yates, L.A., and Szady, M.J., Atmospheric flight dynamics and chaos: Some issues in modeling and dimensionality, in Applied Chaos, J.H. Kim and J. Stringer, Eds., John Wiley & Sons, New York, 1992,87-141. [9] Chen, G. and Dong, X., From chaos to order -Perspectives and methodologies in controlling chaotic nonlinear dynamical systems, Internat. J. Bifurcation Chaos, 3, 1363-1409, 1993. [lo] Chow, S.N. andHale, J.K., MethodsofBifurcation Theory, Springer-Verlag,New York, 1982. [ l 11 Devaney, R.L., An Introduction to Chaotic Dynamical Systems, 2nd led., ,4ddison-Wesley, Redwood City, CA, 1989. [12] Dobson, I., CJavitsch, H., Liu, C.-C., Tamura, Y., and Vu, K., Voltage collapse in power systems, IEEE Circuits and Devices Magazine, 8(3), 40-45, 1992. 
[13] Dole, C.E., Flight Theory and Aerodynamics: A Practical Guide for Operational Safety, John Wiley & Sons, New York, 1981. [14] Dorning, J.J. and Kim, J.H., Bridging the gap between the science of chaos and its technological applications,
in Applied Chaos, J.H. Kim and J. Stringer, Eds., John Wiley & Sons, New York, 1992, 3-30. [15] Etkin, B., Dynamics of Atmospheric Flight, John Wiley & Sons, New York, 1972. [16] Eveker, K.M., Gysling, D.L., Nett, C.N., and Sharma, O.P., Integrated control of rotating stall and surge in aeroengines, in Sensing, Actuation, and Control in Aeropropulsion, J.D. Paduano, Ed., Proc. SPIE 2494, 1995, 21-35. [17] Gills, Z., Iwata, C., Roy, R., Schwartz, I.B., and Triandaf, I., Tracking unstable steady states: Extending the stability regime of a multimode laser system, Phys. Rev. Lett., 69, 3169-3172, 1992. [18] Hassard, B.D., Kazarinoff, N.D., and Wan, Y.H., Theory and Applications of Hopf Bifurcation, Cambridge University Press, Cambridge, 1981. [19] Jackson, E.A., Perspectives of Nonlinear Dynamics, Vols. 1 and 2, Cambridge University Press, Cambridge, 1991. [20] Kerrebrock, J.L., Aircraft Engines and Gas Turbines, 2nd ed., MIT Press, Cambridge, MA, 1992. [21] Kim, J.H. and Stringer, J., Eds., Applied Chaos, John Wiley & Sons, New York, 1992. [22] Lee, H.C., Robust Control of Bifurcating Nonlinear Systems with Applications, Ph.D. Dissertation, Department of Electrical Engineering, University of Maryland, College Park, 1991. [23] Lee, H.-C. and Abed, E.H., Washout filters in the bifurcation control of high alpha flight dynamics, Proc. 1991 Am. Control Conf., Boston, pp. 206-211, 1991. [24] Liaw, C.Y. and Bishop, S.R., Nonlinear heave-roll coupling and ship rolling, Nonlinear Dynamics, 8, 197-211, 1995. [25] Liaw, D.-C. and Abed, E.H., Analysis and control of rotating stall, Proc. NOLCOS'92: Nonlinear Control System Design Symposium, M. Fliess, Ed., June 1992, Bordeaux, France, pp. 88-93, published by the International Federation of Automatic Control; see also: Active control of compressor stall inception: A bifurcation-theoretic approach, Automatica, 32, 1996 (to appear).
[26] Lin, C.-F., Advanced Control Systems Design, Prentice Hall Series in Advanced Navigation, Guidance, and Control, and their Applications, Prentice Hall, Englewood Cliffs, NJ, 1994. [27] Lorenz, E.N., Deterministic nonperiodic flow, J. Atmosph. Sci., 20, 130-141, 1963. [28] Marsden, J.E. and McCracken, M., The Hopf Bifurcation and Its Applications, Springer-Verlag, New York, 1976. [29] McRuer, D., Ashkenas, I., and Graham, D., Aircraft Dynamics and Automatic Control, Princeton University Press, Princeton, 1973. [30] Mees, A.I., Dynamics of Feedback Systems, John Wiley & Sons, New York, 1981.
THE CONTROL HANDBOOK
[31] Murray, J.D., Mathematical Biology, Springer-Verlag, New York, 1990. [32] Nayfeh, A.H. and Balachandran, B., Applied Nonlinear Dynamics: Analytical, Computational, and Experimental Methods, Wiley Series in Nonlinear Science, John Wiley & Sons, New York, 1995. [33] Ott, E., Grebogi, C., and Yorke, J.A., Controlling chaos, Phys. Rev. Lett., 64, 1196-1199, 1990. [34] Ott, E., Chaos in Dynamical Systems, Cambridge University Press, Cambridge, 1993. [35] Ott, E., Sauer, T., and Yorke, J.A., Eds., Coping with Chaos: Analysis of Chaotic Data and the Exploitation of Chaotic Systems, Wiley Series in Nonlinear Science, John Wiley & Sons, New York, 1994. [36] Pecora, L.M. and Carroll, T.L., Synchronization in chaotic systems, Phys. Rev. Lett., 64, 821-824, 1990. [37] Poincaré, H., New Methods of Celestial Mechanics, Parts 1, 2, and 3 (edited and introduced by D.L. Goroff), Vol. 13 of the History of Modern Physics and Astronomy Series, American Institute of Physics, U.S., 1993; English translation of the French edition Les Méthodes nouvelles de la Mécanique céleste, originally published during 1892-1899. [38] Romeiras, F.J., Grebogi, C., Ott, E., and Dayawansa, W.P., Controlling chaotic dynamical systems, Physica D, 58, 165-192, 1992. [39] Roskam, J., Airplane Flight Dynamics and Automatic Flight Controls (Part II), Roskam Aviation and Engineering Corp., Lawrence, Kansas, 1979. [40] Sanchez, N.E. and Nayfeh, A.H., Nonlinear rolling motions of ships in longitudinal waves, Internat. Shipbuilding Progress, 37, 247-272, 1990. [41] Sauer, P.W. and Pai, M.A., Power System Dynamics and Stability, draft manuscript, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 1995. [42] Scott, S.K., Chemical Chaos, International Series of Monographs on Chemistry, Oxford University Press, Oxford, 1991. [43] Seydel, R., Practical Bifurcation and Stability Analysis: From Equilibrium to Chaos, 2nd ed., Springer-Verlag, Berlin, 1994.
[44] Shinbrot, T., Grebogi, C., Ott, E., and Yorke, J.A., Using small perturbations to control chaos, Nature, 363, 411-417, 1993. [45] Stevens, B.L. and Lewis, F.L., Aircraft Control and Simulation, John Wiley & Sons, New York, 1992. [46] Strogatz, S.H., Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering, Addison-Wesley, Reading, MA, 1994. [47] Tesi, A., Abed, E.H., Genesio, R., and Wang, H.O., Harmonic balance analysis of period doubling bifurcations with implications for control of nonlinear dynamics, Report No. RT 11/95, Dipartimento di Sistemi e Informatica, Università di Firenze, Firenze, Italy, 1995 (submitted for publication).
[48] Thompson, J.M.T. and Stewart, H.B., Nonlinear Dynamics and Chaos, John Wiley & Sons, Chichester, U.K., 1986. [49] Thompson, J.M.T., Stewart, H.B., and Ueda, Y., Safe, explosive, and dangerous bifurcations in dissipative dynamical systems, Phys. Rev. E, 49, 1019-1027, 1994. [50] Wang, H.O. and Abed, E.H., Bifurcation control of a chaotic system, Automatica, 31, 1995, in press. [51] Wegener, P.P., What Makes Airplanes Fly?: History, Science, and Applications of Aerodynamics, Springer-Verlag, New York, 1991. [52] Ziegler, F., Mechanics of Solids and Fluids, Springer-Verlag, New York, 1991.
Further Reading Detailed discussions of bifurcation and chaos are available in many excellent books (e.g., [6, 10, 11, 18, 19, 28, 30, 32, 34, 35, 43, 46, 48]). These books also discuss a variety of interesting applications. Many examples of bifurcations in mechanical systems are given in [52]. There are also several journals devoted to bifurcations and chaos. Of particular relevance to engineers are Nonlinear Dynamics, the International Journal of Bifurcation and Chaos, and the journal Chaos, Solitons and Fractals. The book [45] discusses feedforward control in the context of "trimming" an aircraft using its nonlinear equations of motion and the available controls. Bifurcation and chaos in flight dynamics are discussed in [8]. Lucid explanations of specific uses of washout filters in aircraft control systems are given in [15, pp. 474-475], [7, pp. 144-146], [39, pp. 946-948 and pp. 1087-1095], and [45, pp. 243-246 and p. 276]. The book [31] discusses applications of nonlinear dynamics in biology and population dynamics. Computational issues related to bifurcation analysis are addressed in [43]. Classification of bifurcations as safe or dangerous is discussed in [32, 43, 48, 49]. The edited book [14] contains interesting articles on research needs in applications of bifurcations and chaos. The article [3] contains a large number of references on bifurcation control, related work on stabilization, and applications of these techniques. The review papers [9, 44] address control of chaos methods. In particular, [44] includes a discussion of the use of sensitive dependence on initial conditions to direct trajectories to targets. The book [35] includes articles on control of chaos, detection of chaos in time series, chaotic data analysis, and potential applications of chaos in communication systems. The book [32] also contains discussions of control of bifurcations and chaos, and of analysis of chaotic data.
57.7 Open-Loop Control Using Oscillatory Inputs
J. Baillieul, B. Lehman
Boston University, Northeastern University
57.7.1 Introduction The interesting discovery that the topmost equilibrium of a pendulum can be stabilized by oscillatory vertical movement of the suspension point has been attributed to Bogolyubov [11] and Kapitsa [26], who published papers on this subject in 1950 and 1951, respectively. In the intervening years, literature appeared analyzing the dynamics of systems with oscillatory forcing, e.g., [31]. Control designs based on oscillatory inputs have been proposed (for instance, [8] and [9]) for a number of applications. Many classical results on the stability of operating points for systems with oscillatory inputs depend on the eigenvalues of the averaged system lying in the left half-plane. Recently, there has been interest in the stabilization of systems to which such classical results do not apply. Coron [10], for instance, has shown the existence of a time-varying feedback stabilizer for systems whose averaged versions have eigenvalues on the imaginary axis. This design is interesting because it provides smooth feedback stabilization for systems which Brockett [15] had previously shown were never stabilizable by smooth, time-invariant feedback. For conservative mechanical systems with oscillatory control inputs, Baillieul [7] has shown that stability of operating points may be assessed in terms of an energy-like quantity known as the averaged potential. Control designs with objectives beyond stabilization have been studied in path-planning for mobile robots [40] and in other applications where the models result in "drift-free" controlled differential equations. Work by Sussmann and Liu [36]-[38], extending earlier ideas of Haynes and Hermes [21], has shown that, for drift-free systems satisfying a certain Lie algebra rank condition (LARC, discussed in Section 57.7.2), arbitrary smooth trajectories may be interpolated to an arbitrary accuracy by appropriate choice of oscillatory controls.
Leonard and Krishnaprasad [28] have reported algorithms for generating desired trajectories when certain "depth" conditions on the brackets of the defining vector fields are satisfied. This chapter summarizes the current theory of open-loop control using oscillatory forcing. The recent literature has emphasized geometric aspects of the methods, and our discussion in Sections 57.7.2 and 57.7.3 will reflect this emphasis. Open-loop methods are quite appealing in applications in which the real-time sensor measurements needed for feedback designs are expensive or difficult to obtain. Because the methods work by virtue of the geometry of the motions in the systems, the observed effects
6 The first author gratefully acknowledges support of the U.S. Air Force Office of Scientific Research under grant AFOSR-90-0226. 7 The second author gratefully acknowledges support of an NSF Presidential Faculty Fellow Award, NSF CMS-9453473.
may be quite robust. This is borne out by experiments described below. The organization of the article is as follows. In the present section we introduce oscillatory open-loop control laws in two very different ways. Example 57.9 illustrates the geometric mechanism through which oscillatory forcing produces nonlinear behavior in certain types of (drift-free) systems. Following this, the remainder of the section introduces a more classical analytical approach to control systems with oscillatory inputs. Section 57.7.2 provides a detailed exposition of open-loop design methods for so-called "drift-free" systems. The principal applications are in kinematic motion control, and the section concludes with an application to grasp mechanics. Section 57.7.3 discusses some geometric results of oscillatory forcing for stabilization. Examples have been chosen to illustrate different aspects of the theory.
Noncommuting Vector Fields, Anholonomy, and the Effect of Oscillatory Inputs We begin by describing a fundamental mathematical mechanism for synthesizing motions in a controlled dynamical system using oscillatory forcing. We shall distinguish among three classes of systems: I. Drift-free systems with input entering linearly:

ẋ = g1(x)u1 + ... + gm(x)um.  (57.207)
Here we assume each gi : Rⁿ → Rⁿ is a smooth (i.e., analytic) vector field, and each "input" ui(·) is a piecewise analytic function of time. Generally, we assume m ≤ n. II. Systems with drift and input entering affinely:

ẋ = f(x) + g1(x)u1 + ... + gm(x)um.  (57.208)
The assumptions here are as in the previous case, with f : Rⁿ → Rⁿ also assumed to be smooth. III. Systems with no particular restriction on the way in which the control enters:

ẋ = f(x, u).  (57.209)
Here u = (u1, . . . , um)ᵀ is a vector of piecewise analytic inputs, as in the previous two cases, and f : Rⁿ × Rᵐ → Rⁿ is analytic. This is a hierarchy of types, each a special case of its successor in the list. More general systems could be considered, and indeed, in the Lagrangian models described in Section 57.7.3, we shall encounter systems in which the derivatives of inputs also enter the equations of motion. It will be shown that these systems can be reduced to Class III, however.
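As a concrete (purely illustrative) encoding of this hierarchy, a Class II right-hand side can be assembled from a drift field f0 and input fields gi, with Class I the drift-free special case; any such map is, in turn, an instance of the general Class III form ẋ = f(x, u). The helper names below are our own, not from the text.

```python
def class_ii(f0, gs):
    """Build xdot = f0(x) + sum_i g_i(x)*u_i as a callable f(x, u)."""
    def f(x, u):
        xdot = list(f0(x))
        for g, ui in zip(gs, u):
            gv = g(x)
            xdot = [xi + ui * gi for xi, gi in zip(xdot, gv)]
        return xdot
    return f

def class_i(gs):
    """Class I is Class II with zero drift (f0 = 0)."""
    zero = lambda x: [0.0] * len(x)
    return class_ii(zero, gs)
```

For instance, `class_i` applied to the two vector fields of the Heisenberg system of Example 57.9 below reproduces its right-hand side.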
REMARK 57.10 The extra term on the right-hand side of Equation 57.208 is called a "drift" because, in the absence of control input, the differential equation "drifts" in the direction of the vector field f.
EXAMPLE 57.9:
Even Class I systems possess the essential features of the general mechanism (anholonomy) by which oscillatory inputs may be used to synthesize desired motions robustly. (See the remarks below on the robustness of open-loop methods.) Consider the simple and widely studied "Heisenberg" system (see [16]):

ẋ1 = u1,  ẋ2 = u2,  ẋ3 = x2u1 - x1u2.  (57.210)

This system is a special case of Equation 57.207 in which m = 2 and

g1(x) = (1, 0, x2)ᵀ,  g2(x) = (0, 1, -x1)ᵀ.

If we define the Lie bracket of these vector fields by [g1, g2] = (∂g2/∂x)g1 - (∂g1/∂x)g2, then a simple calculation reveals [g1, g2] = (0, 0, -2)ᵀ. Another fairly straightforward calculation shows that, from general considerations, there is a choice of inputs (u1(·), u2(·)) which generates a trajectory pointing approximately in the bracket direction (0, 0, -2)ᵀ, and this approximation may be made arbitrarily accurate (see Nijmeijer and Van der Schaft [30], p. 77, or Bishop and Crittenden [10], p. 18). In the present case, we can be more precise and more explicit. Starting at the origin, (x1, x2, x3) = (0, 0, 0), motion in any of the three coordinate directions is possible. By choosing u1(t) ≡ 1, u2(t) ≡ 0, for instance, motion along the x1-axis is produced, and motion along the x2-axis may similarly be produced by reversing the roles of u1(·) and u2(·). Motion along the x3-axis is more subtle. If we let the inputs (u1(·), u2(·)) trace a closed curve, so that the states x1 and x2 end with the same values with which they began, the net motion of the system is along the x3-axis. This is illustrated in Figure 57.21. Brockett [17] has observed that the precise shape of the input curve is unimportant, but the x3-distance is twice the (signed) area circumscribed by the (x1, x2)-curve. For this simple system, we thus have a way to prescribe trajectories between any two points in R³. Taking the case of trajectories starting at the origin, for instance, we may specify a trajectory passing through any other point (x, y, z) at time t = 1 by finding a circular arc initiating at the origin in (x1, x2)-space with appropriate length, initial direction, and curvature. This construction of inputs may be carried out somewhat more generally, as discussed below in Section 57.7.2. Ideas along this line are also treated in more depth in [28]. The geometric mechanism by which motion is produced by oscillatory forcing is fairly transparent in the case of the Heisenberg system. For systems of the form of Equation 57.208, and especially of the form of Equation 57.209, there is no comparably complete geometric theory. Indeed, much of the literature on such systems makes no mention of geometry. A brief survey/overview of some of the classical literature on such systems is given next. We shall return to the geometric point of view in Sections 57.7.2 and 57.7.3.

Figure 57.21 The anholonomy present in the Heisenberg system is depicted in a typical situation in which the input variables trace a closed curve and the state variables trace a curve which does not close. The distance between endpoints of the state space curve (measured as the length of the dashed vertical line) reflects the anholonomy in the system.
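Both computations in Example 57.9 (the Lie bracket and Brockett's twice-the-enclosed-area rule) can be checked numerically. The sketch below is ours; it encodes the vector fields of Equation 57.210 and drives the planar states once around a circle of area πr² with a hand-rolled RK4 integrator.

```python
import math

# Heisenberg vector fields g1 = (1, 0, x2), g2 = (0, 1, -x1) and Jacobians.
g1 = lambda x: [1.0, 0.0, x[1]]
J1 = lambda x: [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
g2 = lambda x: [0.0, 1.0, -x[0]]
J2 = lambda x: [[0, 0, 0], [0, 0, 0], [-1, 0, 0]]

def bracket(x):
    """[g1, g2](x) = J2(x) g1(x) - J1(x) g2(x); should be (0, 0, -2)."""
    a, b = g1(x), g2(x)
    return [sum(J2(x)[i][j] * a[j] - J1(x)[i][j] * b[j] for j in range(3))
            for i in range(3)]

def loop(r=1.0, steps=20000):
    """Drive (x1, x2) once around a circle of radius r starting at the
    origin; the final x3 equals twice the (signed) enclosed area."""
    x = [0.0, 0.0, 0.0]
    dt = 2 * math.pi / steps
    def f(t, x):
        u1, u2 = r * math.cos(t), r * math.sin(t)   # closed input loop
        return [u1, u2, x[1] * u1 - x[0] * u2]
    for k in range(steps):                          # classical RK4
        t = k * dt
        k1 = f(t, x)
        k2 = f(t + dt/2, [x[i] + dt/2 * k1[i] for i in range(3)])
        k3 = f(t + dt/2, [x[i] + dt/2 * k2[i] for i in range(3)])
        k4 = f(t + dt, [x[i] + dt * k3[i] for i in range(3)])
        x = [x[i] + dt/6 * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(3)]
    return x
```

With these inputs the (x1, x2)-trace is a circle of area πr², so x1 and x2 return to zero while |x3| ends at 2πr².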
Oscillatory Inputs to Control Physical Systems
The idea of using oscillatory forcing to control physical processes is not new. Prompted by work in the 1960s on the periodic optimal control of chemical processes, Speyer and Evans [33] derived a sufficiency condition for a periodic process to minimize a certain integral performance criterion. This approach also led to the observation that periodic paths could be used to improve aircraft fuel economy (see [32]). Previously cited work of Bogolyubov and Kapitsa, [11] and [26], led Bellman et al. [8], [9] to investigate the systematic use of vibrational control as an open-loop control technique in which zero-average oscillations are introduced into a system's parameters to achieve a dynamic response (such as stabilizing effects). For example, it has been shown in [8], [9], [24], and [25] that the oscillation of flow rates in a continuous stirred tank reactor allows operating exothermic reactions at (average) yields which were previously unstable. Similarly, [6] and [7] describe the general geometric mechanism by which oscillations along the vertical support of an inverted pendulum stabilize the upper equilibrium point. This section treats the basic theory of vibrational control introduced in [8] and [9]. The techniques are primarily based on [8], with additional material taken from [24] and [25]. In particular, in [24] and [25], the technique of vibrational control has been extended to delay differential equations.
Problem Statement Consider the nonlinear differential equation (Class III)

ẋ(t) = f(x(t), u(t)),  (57.211)
where f : Rⁿ × Rᵐ → Rⁿ is continuously differentiable, x ∈ Rⁿ is the state, and u = (u1, . . . , um)ᵀ is a vector of control inputs assumed to be piecewise analytic functions of time. These are the quantities which we can directly cause to vibrate. Introduce into Equation 57.211 oscillatory inputs according to the law u(t) = λ0 + y(t), where λ0 is a constant vector and y(t) is a periodic average zero (PAZ) vector. (For simplicity, y(t) has been assumed periodic. However, the following discussion can be extended to the case where y(t) is an almost periodic zero-average vector [8], [9], [24], and [25].) Then Equation 57.211 becomes

ẋ(t) = f(x(t), λ0 + y(t)).  (57.212)
Assume that Equation 57.211 has a fixed equilibrium point x_s = x_s(λ0) for fixed u(t) = λ0.

DEFINITION 57.2 An equilibrium point x_s(λ0) of Equation 57.211 is said to be vibrationally stabilizable if, for any δ > 0, there exists a PAZ vector y(t) such that Equation 57.212 has an asymptotically stable periodic solution, x*(t), characterized by
‖x̄* - x_s(λ0)‖ ≤ δ,  where  x̄* = (1/T) ∫₀ᵀ x*(t) dt.
It is often preferable that Equations 57.211 and 57.212 have the same fixed equilibrium point, x_s(λ0). However, this is not usually the case because the right-hand side of Equation 57.212 is time varying and periodic. Therefore, the technique of vibrational stabilization is to determine vibrations y(t) so that the (possibly unstable) equilibrium point x_s(λ0) bifurcates into a stable periodic solution whose average is close to x_s(λ0). The engineering aspects of the problem consist of 1) finding conditions for the existence of stabilizing vibrations, 2) determining which oscillatory inputs, u(·), are physically realizable, and 3) determining the shape (waveform type, amplitude, phase) of the oscillations which will ensure the desired response. In Section 57.7.3, we shall present an example showing how oscillatory forcing induces interesting stable motion in neighborhoods of points which are not close to equilibria of the time-varying system of Equation 57.212.
Vibrational Stabilization
It is frequently assumed that Equation 57.212 can be decomposed as

ẋ(t) = f1(x(t)) + f2(x(t), y(t)),  (57.213)

where λ0 and y(·) are as above, where f1(x(t)) = f1(λ0, x(t)), and where the function f2(x(t), y(t)) is linear with respect to its second argument. Systems for which this assumption does not hold are discussed in Section 57.7.3. For simplicity only, assume that f1 and f2 are analytic functions. Additionally, assume that y(t), the control, is periodic of period T (0 < T << 1) and of the form y(t) = ωẑ(ωt), where ω = 2π/T and ẑ(·) is some fixed period-2π function (e.g., sin or cos). We write y(·) in this way because, although the theory is not heavily dependent on the exact shape of the waveform of the periodic input, there is a crucial dependence on the simultaneous scaling of the frequency and amplitude. Because we are usually interested in high-frequency behavior, this usually implies that the amplitude of y(t) is large. It is possible, however, that ẑ(·) has small amplitude, making the amplitude of y(t) small also. Under these assumptions, Equation 57.213 can be rewritten

ẋ(t) = f1(x(t)) + ω f2(x(t), ẑ(ωt)).  (57.214)

To proceed with the stability analysis, Equation 57.214 will be transformed to an ordinary differential equation in "standard" form (dx/dτ = ε f(x, τ)) so that the method of averaging can be applied (see [11] and [24]). This allows the stability properties of the time-varying system, Equation 57.214, to be related to the stability properties of a simpler autonomous differential equation (the averaged equation). To make this transformation, consider the so-called "generating equation"

ẋ(t) = ω f2(x(t), ẑ(ωt)).

Suppose that this generating equation has a T-periodic general solution h(t, c) for some ẑ(·) and t ≥ t0, where h : R × Rⁿ → Rⁿ and c ∈ Rⁿ is uniquely defined for every initial condition x(t0) ∈ Ω ⊂ Rⁿ. Introduce into Equation 57.214 the Lyapunov substitution x(t) = h(ωt, q(t)) to obtain an equation for q(·):

q̇(t) = [∂h/∂c (ωt, q(t))]⁻¹ f1(h(ωt, q(t))),

which, in slow time τ = ωt, with z(τ) = q(t) and ε = 1/ω, becomes

dz/dτ = ε [∂h/∂c (τ, z)]⁻¹ f1(h(τ, z)).  (57.215)

Equation 57.215 is a periodic differential equation in "standard" form, and averaging can be applied. If T denotes the period of the right-hand side of Equation 57.215, then the averaged equation (autonomous) corresponding to Equation 57.215 is given as

dy/dτ = ε Ȳ(y(τ)),  where  Ȳ(c) = (1/T) ∫₀ᵀ [∂h/∂c (τ, c)]⁻¹ f1(h(τ, c)) dτ.  (57.216)

By the theory of averaging, it is known that an ε0 > 0 exists such that, for 0 < ε ≤ ε0, the hyperbolic stability properties of Equation 57.215 and Equation 57.216 are the same. Specifically, if y_s is an asymptotically stable equilibrium point of Equation 57.216, this implies that, for 0 < ε ≤ ε0, a unique periodic solution z*(τ) of Equation 57.215 exists in the vicinity of y_s that is asymptotically stable also. Since the transformation x(t) = h(ωt, q(t)) is a homeomorphism, there will exist an asymptotically stable, periodic, solution to Equation 57.214
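The averaging approximation invoked here can be illustrated with a small numerical check. The scalar example below is ours, not from the text: for the standard-form system dz/dτ = ε(-2 sin²τ)z, the averaged equation is dz/dτ = -εz, and the two solutions track each other to O(ε).

```python
import math

def rk4(f, z0, t0, t1, steps):
    """Classical RK4 for a scalar ODE dz/dt = f(t, z)."""
    dt = (t1 - t0) / steps
    z, t = z0, t0
    for _ in range(steps):
        k1 = f(t, z)
        k2 = f(t + dt/2, z + dt/2 * k1)
        k3 = f(t + dt/2, z + dt/2 * k2)
        k4 = f(t + dt, z + dt * k3)
        z += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
        t += dt
    return z

eps = 0.05
# Periodic system in standard form: dz/dtau = eps * (-2 sin^2 tau) * z.
periodic = lambda tau, z: eps * (-2.0 * math.sin(tau)**2) * z
# Its average over one period (mean of 2 sin^2 is 1): dz/dtau = -eps * z.
averaged = lambda tau, z: -eps * z

z_per = rk4(periodic, 1.0, 0.0, 40.0, 40000)
z_avg = rk4(averaged, 1.0, 0.0, 40.0, 40000)   # = exp(-eps * 40) = exp(-2)
```

The exact periodic solution is z(τ) = exp(-ετ + ε sin(2τ)/2), so the relative gap between z_per and z_avg is of order ε, as the averaging theorem predicts.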
linearization of the averaged equation reveals that its upper equilibrium point is asymptotically stable when a²ω² > 2gL. Under these conditions, the upper equilibrium point of the inverted pendulum is said to be vibrationally stabilized.
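A quick numerical experiment illustrates this stabilization condition. The model below is our own sketch of the standard vertically forced pendulum, with θ measured from the upright position, pivot height a sin(ωt), and parameter values chosen by us so that a²ω² > 2gL.

```python
import math

G, L = 9.81, 1.0  # gravity and pendulum length (assumed values)

def max_angle(a, omega, theta0=0.05, t_end=5.0, dt=1e-4):
    """Integrate theta'' = (G - a*omega^2*sin(omega*t)) * sin(theta) / L,
    theta measured from the UPRIGHT equilibrium; return max |theta|."""
    def acc(t, th):
        return (G - a * omega**2 * math.sin(omega * t)) * math.sin(th) / L
    theta, vel, t = theta0, 0.0, 0.0
    worst = abs(theta0)
    for _ in range(int(t_end / dt)):
        # RK4 on the first-order system (theta, vel)
        k1v, k1x = acc(t, theta), vel
        k2v, k2x = acc(t + dt/2, theta + dt/2 * k1x), vel + dt/2 * k1v
        k3v, k3x = acc(t + dt/2, theta + dt/2 * k2x), vel + dt/2 * k2v
        k4v, k4x = acc(t + dt, theta + dt * k3x), vel + dt * k3v
        theta += dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
        vel += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        t += dt
        worst = max(worst, abs(theta))
    return worst
```

With a = 0.1 and ω = 100 (so a²ω² = 100 > 2gL ≈ 19.6), a pendulum released near upright stays near upright; with no forcing it falls over.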
57.7.2 The Constructive Controllability of Drift-Free (Class I) Systems Class I control systems arise naturally as kinematic models of mechanical systems. In this section, we outline the current theory of motion control for such systems, emphasizing the geometric mechanism (anholonomy) through which oscillatory inputs to Equation 57.207 produce motions of the state variables. Explicit results along the lines given in Section 57.7.1 for the Heisenberg system have been obtained in a variety of settings, and some recent work will be discussed below. The question of when such explicit constructions are possible more generally for Class I systems does not yet have a complete answer. Thus we shall also discuss computational approaches that yield useful approximate solutions. After briefly outlining the state of current theory, we conclude the section with an example involving motion specification for a ball "grasped" between two plates. The recent literature treating control problems for such systems suggests that it is useful to distinguish between two control design problems: P1: The prescribed endpoint steering problem requires that, given any pair of points x0, x1 ∈ Rⁿ, a vector of piecewise analytic control inputs u(·) = (u1(·), . . . , um(·)) is to be determined to steer the state of Equation 57.207 from x0 at time t = 0 to x1 at time t = T > 0. P2: The trajectory approximation steering problem requires that, given any sufficiently "regular" curve γ : [0, T] → Rⁿ, we determine a sequence {u_j(·)} of control input vectors such that the corresponding sequence of trajectories of Equation 57.207 converges (uniformly) to γ.
A general solution to either of these problems requires that a certain Lie algebra rank condition (LARC) be satisfied. More specifically, with the Lie bracket of vector fields defined as in Section 57.7.1, consider the set of vector fields consisting of g1, . . . , gm together with all of their iterated Lie brackets [g_i, g_j], [[g_i, g_j], g_k], and so on. Then the span of this set (= the set of all linear combinations of its elements), denoted C, is called the Lie algebra generated by {g1, . . . , gm}. We say Equation 57.207 (or, equivalently, the set of vector fields {g1, . . . , gm}) satisfies the Lie algebra rank condition on Rⁿ if C spans Rⁿ at each point of Rⁿ. The following result is fundamental because it characterizes those systems for which the steering problems may, in principle, be solved.

THEOREM 57.9 A drift-free system, Equation 57.207, is completely controllable in the sense that, given any T > 0 and any pair
of points x0, x1 ∈ Rⁿ, there is a vector of inputs u = (u1, . . . , um) which are piecewise analytic on [0, T] and which steer the system from x0 to x1 in T units of time if, and only if, the system satisfies the Lie algebra rank condition.
As stated, this theorem is essentially due to W.L. Chow [19], but it has been refined and tailored to control theory by others ([13] and [35]). The various versions of this theorem in the literature have all been nonconstructive. Methods for the explicit determination of optimal (open-loop) control laws for steering Class I systems between prescribed endpoints have appeared in [1], [2], [16], and [18]. The common features in all this work are illustrated by the following:
A model nonlinear optimal control problem with three states and two inputs: Find controls u1(·), u2(·) which steer the system

ẋ = u1 A1x + u2 A2x,  (57.217)

with A1 and A2 the skew-symmetric generators of rotations about the x- and y-axes,
between prescribed endpoints to minimize the cost criterion

∫₀ᵀ u1(t)² + u2(t)² dt.
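A sketch of this model problem follows; the explicit form is our assumption, taking Equation 57.217 as ẋ = u1 A1x + u2 A2x with A1x = (0, -x3, x2) and A2x = (x3, 0, -x1) (rotations about the x- and y-axes). Integrating under sinusoidal inputs confirms numerically that trajectories never leave the sphere of the initial condition, which is why the endpoints must satisfy ‖x0‖ = ‖x1‖.

```python
import math

def sphere_flow(x0, mu=1.0, om=5.0, phi=0.3, T=2.0, steps=20000):
    """Integrate x' = u1*A1x + u2*A2x with A1x = (0, -x3, x2),
    A2x = (x3, 0, -x1), and sinusoidal inputs
    u1 = mu*sin(om*t + phi), u2 = mu*cos(om*t + phi)."""
    def f(t, x):
        u1 = mu * math.sin(om * t + phi)
        u2 = mu * math.cos(om * t + phi)
        return [u2 * x[2], -u1 * x[2], u1 * x[1] - u2 * x[0]]
    x, dt = list(x0), T / steps
    for k in range(steps):                       # classical RK4
        t = k * dt
        k1 = f(t, x)
        k2 = f(t + dt/2, [x[i] + dt/2 * k1[i] for i in range(3)])
        k3 = f(t + dt/2, [x[i] + dt/2 * k2[i] for i in range(3)])
        k4 = f(t + dt, [x[i] + dt * k3[i] for i in range(3)])
        x = [x[i] + dt/6 * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(3)]
    return x
```

Because the right-hand side is a time-varying rotation, ‖x(t)‖ is an invariant of the flow, up to integration error.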
Several comments regarding the geometry of this problem are in order. First, an appropriate version of Chow's theorem shows that Equation 57.217 is controllable on any 2-sphere, S = {x ∈ R³ : ‖x‖ = r}, for r > 0, centered at the origin in R³. Hence, the optimization problem is well-posed precisely when the prescribed endpoints x0, x1 ∈ R³ satisfy ‖x0‖ = ‖x1‖. Second, the problem may be interpreted physically as seeking minimum-length paths on a sphere in which only motions composed of rotations about the x-axis (associated with input u1) and y-axis (associated with u2) are admissible. General methods for solving this type of problem appear in [1] and [2]. Specifically, in the first author's 1975 Ph.D. thesis (see reference in [1] and [2]), it was shown that the optimal inputs have the form u1(t) = μ sin(ωt + φ), u2(t) = μ cos(ωt + φ). The optimal inputs depend on three parameters, reflecting the fact that the set (group) of rotations of the 2-sphere is three dimensional. The details for determining the values of the parameters μ, ω, and φ in terms of the endpoints x0 and x1 are given in the thesis cited in [1] and [2]. The general nonlinear quadratic optimal control problem of steering Equation 57.207 to minimize a cost of the form ∫ ‖u‖² dt has not yet been solved in such an explicit fashion. The general classes of problems which have been discussed in [1], [2], [16], and [18] are associated with certain details of structure in the set of vector fields {g1, . . . , gm} and the corresponding Lie algebra C. In [16] and [18], for example, Brockett discusses various higher dimensional versions of the Lie algebraic structure characterizing the above sphere problem and the Heisenberg system. In addition to optimal control theory having intrinsic interest, it also points to a broader approach to synthesizing control inputs. Knowing the form of optimal trajectories, we may relax
the requirement that inputs be optimal and address the simpler question of whether problems P1 and P2 may be solved using inputs with the given parametric dependence. Addressing the cases where we have noted the optimal inputs are phase-shifted sinusoids, we study the effect of varying each of the parameters. For instance, consider Equation 57.210 steered by the inputs u1(t) = μ sin(ωt + φ), u2(t) = μ cos(ωt + φ), with μ and φ fixed and ω allowed to vary. As ω increases, the trajectories produced become increasingly tight spirals ascending about the z-axis. One consequence of this is that, although the vector fields
are both perpendicular to the x3-axis at the origin, pure motion in the x3-coordinate direction may nevertheless be produced to an arbitrarily high degree of approximation. This basic example can be generalized, and extensive work on the use of oscillatory inputs for approximating arbitrary motions in Class I systems has been reported by Sussmann and Liu [36]. The general idea is that, when a curve γ(t) is specified, a sequence of appropriate oscillatory inputs u_j(·) is produced so that the corresponding trajectories of Equation 57.207 converge to γ(·) uniformly. The interested reader is referred to [37], [38], and the earlier work of Haynes and Hermes [21], for further details. Progress on problem P1 has been less extensive. Although a general constructive procedure for generating motions of Equation 57.207 which begin and end exactly at specified points, x0 and x1, has not yet been developed, solutions in a number of special cases have been reported. For the case of arbitrary "nilpotent" systems, Lafferriere and Sussmann, [22] and [23], provide techniques for approximating solutions for general systems. Leonard and Krishnaprasad [28] have designed algorithms for synthesizing open-loop sinusoidal control inputs for point-to-point system maneuvers where up to depth-two brackets are required to satisfy the LARC. Brockett and Dai [18] have studied a natural subclass of nilpotent systems within which the Heisenberg system, Equation 57.210, is the simplest member. The underlying geometric mechanism through which a rich class of motions in R³ is produced by oscillatory inputs to Equation 57.210 is also present in systems with two vector fields but higher dimensional state spaces. These systems are constructed in terms of nonintegrable p-forms in the coordinate variables x1 and x2. We briefly describe the procedure, referring to [18] for more details. The number of linearly independent p-forms in x1 and x2 is p + 1.
(Recall that a p-form in x1 and x2 is a monomial of the form x1^k x2^(p-k). The linearly independent p-forms may be listed explicitly: {x1^p, x1^(p-1) x2, ..., x2^p}.) Thus, there are 2(p + 1) linearly independent expressions of the form

η = φ dx1 + ψ dx2,

where φ, ψ are homogeneous polynomials of degree p in x1 and x2. Within the set of such expressions, there is a set of p + 2 linearly independent expressions of the form η = dγ, where γ is a homogeneous polynomial in x1, x2 of degree p + 1 (such expressions are called exact differentials). There is a complementary p-dimensional family (p = 2(p + 1) − (p + 2)) of η's which are not integrable. For example, if p = 2, there are 2 linearly independent nonintegrable forms η, and we may take these to be (x1² dx2, x2² dx1). From these, we construct a completely controllable two-input system.

More generally, for each positive integer p we could write a completely controllable two-input system whose state space has dimension 2 + p(p + 1)/2. Brockett and Dai [18] consider the optimal control problem of steering Equation 57.218 between prescribed endpoints x(0), x(T) to minimize the cost functional

∫₀ᵀ (u1² + u2²) dt.

It is shown that explicit solutions may also be obtained in this case, and these are given in terms of elliptic functions. Before treating an example problem in mechanics which makes use of these ideas, we summarize some other recent work on control synthesis for Class I systems. Whereas Sussmann and his coworkers [22], [23], [36]-[38] have used the concept of a P. Hall basis for free Lie algebras to develop techniques applicable in complete generality to Class I systems, an approach somewhat in the opposite direction has been pursued by S. Sastry and his co-workers [29], [39], [40]. This approach sought to characterize systems controlled by sinusoidal inputs. Motivated by problems in the kinematics of wheeled vehicles and robot grasping, they have defined a class of chained systems in which desired motions result from inputs which are sinusoids with integrally related frequencies. While the results are no less explicit than the Heisenberg example in Section 57.7.1, the systems themselves are very special. Conditions under which a Class I system may be converted to chained form are given in [29].

Figure 57.23  A ball rolling without slipping between two flat plates.
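For p = 2, the construction can be exercised numerically. The sketch below assumes the realization ẋ1 = u1, ẋ2 = u2, ẋ3 = x1² u2, ẋ4 = x2² u1 of the nonintegrable forms (x1² dx2, x2² dx1); driving (x1, x2) once around a closed circular loop returns those states to their initial values while, by Green's theorem, x3 and x4 pick up net increments 2aπr² and −2bπr² determined by the loop's center (a, b) and radius r:

```python
import math

# Assumed realization of the p = 2 construction from the forms x1^2 dx2, x2^2 dx1:
#   x1' = u1, x2' = u2, x3' = x1^2 u2, x4' = x2^2 u1
def f(x, u):
    return [u[0], u[1], x[0]**2 * u[1], x[1]**2 * u[0]]

def rk4(x, u, dt):
    def ax(v, k, s): return [vi + s * ki for vi, ki in zip(v, k)]
    k1 = f(x, u); k2 = f(ax(x, k1, dt/2), u)
    k3 = f(ax(x, k2, dt/2), u); k4 = f(ax(x, k3, dt), u)
    return [xi + dt/6 * (p1 + 2*p2 + 2*p3 + p4)
            for xi, p1, p2, p3, p4 in zip(x, k1, k2, k3, k4)]

a, b, r = 1.0, 0.5, 0.5            # loop center (a, b) and radius r
x = [a + r, b, 0.0, 0.0]
dt = 1e-3
for k in range(round(2 * math.pi / dt)):
    t = (k + 0.5) * dt             # midpoint input sample
    u = (-r * math.sin(t), r * math.cos(t))   # CCW circle in (x1, x2)
    x = rk4(x, u, dt)

# Green's theorem: net x3 = 2 a pi r^2, net x4 = -2 b pi r^2,
# while (x1, x2) return to their starting point.
print(x)
```

The nonzero residues left in x3 and x4 after the loop closes are exactly the anholonomy that the constructive controllability results exploit.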
57.7. OPEN-LOOP CONTROL USING OSCILLATORY INPUTS

EXAMPLE 57.11: The example we discuss next (due to Brockett and Dai [18]) is prototypical of applications involving the kinematics of objects in the grasp of a robotic hand. Consider a ball that rolls without slipping between two flat horizontal plates. It is convenient to assume the bottom plate is fixed. Suppose that the ball has unit radius. Fix a coordinate system whose x- and y-axes lie in the fixed bottom plate with the positive z-axis perpendicular to the plate in the direction of the ball. Call this the (bottom) "plate frame." We keep track of the ball's motion by letting (x1, x2, 1) denote the plate-frame coordinates of the ball's center. We also fix an orthonormal frame in the ball, and we denote the plate-frame directions of the three coordinate axes by a 3 × 3 orthogonal matrix (a_ij).
The ball's position and orientation are thus specified by the 4 × 4 matrix

H = [ a11  a12  a13  x1 ]
    [ a21  a22  a23  x2 ]
    [ a31  a32  a33  1  ]
    [ 0    0    0    1  ]
As the top plate moves in the plate-frame x direction with velocity v1, the ball's center also moves in the same direction with velocity u1 = v1/2. This motion imparts a counterclockwise rotation about the y-axis, and since the ball has unit radius, the angular velocity is also u1. Similarly, if the top plate moves in the (plate-frame) y direction with velocity v2, the ball's center moves in the y direction with velocity u2 = v2/2, and the angular velocity imparted about the x-axis is −u2. The kinematic description of this problem is obtained by differentiating H with respect to time.
where
This velocity relationship, at the point (x1, x2) = (0, 0), is

Computing the Lie bracket of the vector fields U1H and U2H according to the formula given in Section 57.7.1, we obtain a new vector field U3H = [U1H, U2H], where
Computing additional Lie brackets yields only quantities which may be expressed as linear combinations of U1H, U2H, and U3H. Since the number of linearly independent vector fields obtained by taking Lie brackets is three, the system is completely controllable on a three-dimensional space. Comparison with the problem of motion on a sphere discussed above shows that, by having the top plate execute a high-frequency oscillatory motion, u1(t) = ρ sin(ωt + φ), u2(t) = ρ cos(ωt + φ), the ball may be made to rotate about its z-axis. The motion of the ball is "retrograde": if the top plate executes a small-amplitude clockwise loop about the "plate-frame" z-axis, the motion of the ball is counterclockwise.
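The retrograde net rotation has a clean quantitative expression on the Heisenberg model: over any closed loop of the (x1, x2) inputs, the third state accumulates twice the signed enclosed area. The sketch below (again assuming the standard form of Equation 57.210) drives one counterclockwise circle of radius r and recovers a net change of 2πr² in x3:

```python
import math

# Assumed standard form of the Heisenberg kinematics (Equation 57.210):
#   x1' = u1, x2' = u2, x3' = x1 u2 - x2 u1.
# Over a closed (x1, x2) loop, the net change in x3 is twice the signed
# enclosed area -- the mechanism behind the ball's net retrograde rotation.
def f(x, u):
    return [u[0], u[1], x[0] * u[1] - x[1] * u[0]]

def rk4(x, u, dt):
    def ax(v, k, s): return [vi + s * ki for vi, ki in zip(v, k)]
    k1 = f(x, u); k2 = f(ax(x, k1, dt/2), u)
    k3 = f(ax(x, k2, dt/2), u); k4 = f(ax(x, k3, dt), u)
    return [xi + dt/6 * (p1 + 2*p2 + 2*p3 + p4)
            for xi, p1, p2, p3, p4 in zip(x, k1, k2, k3, k4)]

r, dt = 0.3, 1e-3
x = [0.0, 0.0, 0.0]
for k in range(round(2 * math.pi / dt)):
    t = (k + 0.5) * dt                        # midpoint input sample
    u = (r * math.cos(t), r * math.sin(t))    # one CCW circle of radius r
    x = rk4(x, u, dt)

print(x)   # x1, x2 return to ~0; x3 ~ 2 pi r^2, twice the loop area
```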
57.7.3 Systems with Drift: Stability Analysis Using Energy Methods and the Averaged Potential

Constructive design methods for the control of systems of Class II are less well developed than in the Class I case. Nevertheless, the classical results on stability described in Section 57.7.2 apply in many cases, and the special structure of Class II systems again reveals the geometric mechanisms underlying stability.
First and Second Order Stabilizing Effects of Oscillatory Inputs in Systems with Drift

Much of the published literature on systems of Class I and Class II is devoted to the case in which the vector fields are linear in the state. Consider the control system
where A, B1, ..., Bm are constant n × n matrices and x(t) ∈ Rⁿ, with control inputs of the type we have been considering. We shall be interested in the possibility of using high-frequency oscillatory inputs to create stable motions in a neighborhood of the origin x = 0. Assume that ū1(·), ..., ūm(·) are periodic functions and (for simplicity only) assume that each ūi(·) has common fundamental period T > 0. To apply classical averaging theory to study the motion of Equation 57.220, we consider the effect of increasing the frequency of the forcing. Specifically, we study the dynamics of
where ω is large. The analysis proceeds by scaling time and considering τ = ωt. Let z(τ) = x(t). This satisfies the differential equation
Assuming ω > 0 is large, and letting ε = 1/ω, we see that Equation 57.221 is in a form to which classical averaging theory applies.
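The content of this scaling argument can be checked numerically. The sketch below compares a two-dimensional instance of Equation 57.220, with an illustrative choice of A, B, and ū not taken from the text, against its constant-coefficient average; the sup-norm tracking error shrinks as ω grows, consistent with an O(1/ω) bound:

```python
import math

A = ((0.0, 1.0), (-1.0, -0.5))   # illustrative stable drift matrix
B = ((0.0, 0.0), (0.3, 0.0))     # illustrative input matrix
UBAR_MEAN = 0.5                  # mean of ub(t) = 0.5 + sin(t)

def mat(M, v):
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

def rhs(t, z, w):
    # w = None selects the averaged system y' = (A + mean(ub) B) y
    c = 0.5 + math.sin(w * t) if w is not None else UBAR_MEAN
    Az, Bz = mat(A, z), mat(B, z)
    return (Az[0] + c*Bz[0], Az[1] + c*Bz[1])

def step(t, z, w, dt):           # one RK4 step
    k1 = rhs(t, z, w)
    k2 = rhs(t + dt/2, (z[0] + dt/2*k1[0], z[1] + dt/2*k1[1]), w)
    k3 = rhs(t + dt/2, (z[0] + dt/2*k2[0], z[1] + dt/2*k2[1]), w)
    k4 = rhs(t + dt, (z[0] + dt*k3[0], z[1] + dt*k3[1]), w)
    return (z[0] + dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            z[1] + dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def sup_err(w, T=5.0, dt=1e-4):
    x = y = (1.0, 0.0)
    worst = 0.0
    for k in range(round(T / dt)):
        t = k * dt
        x = step(t, x, w, dt)     # oscillatory system
        y = step(t, y, None, dt)  # averaged system
        worst = max(worst, math.hypot(x[0] - y[0], x[1] - y[1]))
    return worst

e_slow, e_fast = sup_err(50.0), sup_err(500.0)
print(e_slow, e_fast)            # error decreases as w increases
```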
THEOREM 57.10 Consider the system of Equation 57.220 with ui(·) = ūi(·) where, for i = 1, ..., m, ūi(·) is continuous on 0 ≤ t < tf ≤ ∞ and periodic of period T << tf. Let

μi = (1/T) ∫₀ᵀ ūi(s) ds,   i = 1, ..., m,

and let y(τ) be a solution of the constant coefficient linear system

dy/dτ = ε (A + Σi μi Bi) y.   (57.222)

If z0 and y0 are respective initial conditions associated with Equations 57.221 and 57.222 such that |z0 − y0| = O(ε), then |z(τ) − y(τ)| = O(ε) on a timescale of order 1/ε. If A + Σi μi Bi has its eigenvalues in the left half plane, then |z(τ) − y(τ)| = O(ε) as τ → ∞.
This theorem relies on classical averaging theory and is discussed in [7]. A surprising feature of systems of this form is that the stability characteristics can be modified differently if both the magnitude and frequency of the oscillatory forcing are increased. A contrast to Theorem 57.10 is the following:
THEOREM 57.11 Let ū1(·), ..., ūm(·) be periodic functions of period T > 0. Assume that each ūi(·) has mean 0. Consider Equation 57.220, and assume that, for all i, j = 1, ..., m, Bi Bj = 0. Let ε = 1/ω. Define for each i = 1, ..., m the periodic function

vi(t) = ∫₀ᵗ ūi(s) ds,

and let

v̄i = (1/T) ∫₀ᵀ vi(s) ds,   i = 1, ..., m,

and

σij = (1/T) ∫₀ᵀ vi(s) vj(s) ds,   i, j = 1, ..., m.

Let y(t) be a solution of the constant coefficient linear system (57.223). Suppose that the eigenvalues of Equation 57.223 have negative real parts. Then there is a t1 > 0 such that, for ω > 0 sufficiently large (i.e., for ε > 0 sufficiently small), if x0 and y0 are respective initial conditions for Equations 57.220 and 57.223 such that |x0 − y0| = O(ε), then |x(t) − y(t)| = O(ε) for all t > t1.

The key distinction between these two theorems is that induced stability is a first-order effect in Theorem 57.10, where we scale frequency alone, whereas in Theorem 57.11, where both frequency and magnitude are large, any induced stability is a second-order effect (depending on the rms value of the integral of the forcing). Further details are provided in [7]. Rather than pursue these results, we describe a closely related theory which may be applied to mechanical and other physical systems.

A Stability Theory for Lagrangian and Hamiltonian Control Systems with Oscillatory Inputs

The geometric nature of emergent behavior in systems subject to oscillatory forcing is apparent in the case of conservative physical systems. In this subsection, we again discuss stabilizing effects of oscillatory forcing. The main analytical tool will be an energy-like quantity called the averaged potential. This is naturally defined in terms of certain types of Hamiltonian control systems of the type studied in [30]. Because we shall be principally interested in systems with symmetries most easily described by Lagrangian models, we shall start from a Lagrangian viewpoint and pass to the Hamiltonian description via the Legendre transformation. Following Brockett [14] and Nijmeijer and Van der Schaft [30], we define a Lagrangian control system on a differentiable manifold M as a dynamical system with inputs whose equations of motion are prescribed by applying the Euler-Lagrange operator to a function L : TM × U → R, L = L(q, q̇; u), whose dependence on the configuration q, the velocity q̇, and the control input u is smooth. U is a set of control inputs satisfying the general properties outlined in Section 57.7.1.

Lagrangian systems arising via reduction with respect to cyclic coordinates: Consider a Lagrangian control system with configuration variables (q1, q2) ∈ R^n1 × R^n2. The variable q1 will be called cyclic if it does not enter into the Lagrangian when u = 0, i.e., if ∂L/∂q1 (q1, q2, q̇1, q̇2; 0) ≡ 0. A symmetry is associated with the cyclic variable q1 which manifests itself in the invariance of L(q1, q2, q̇1, q̇2; 0) with respect to a change of coordinates q1 ↦ q1 + a for any constant a ∈ R^n1. We shall be interested in Lagrangian control systems with cyclic variables in which the cyclic variables may be directly controlled. In such systems, we shall show how the velocities associated with the cyclic coordinates may themselves be viewed as controls. Specifically, we shall consider systems of the form
where
The matrices m(q2) and M(q2) are symmetric and positive definite of dimension n1 × n1 and n2 × n2 respectively, where dim q1 = n1, dim q2 = n2, and the matrix A(q2) is arbitrary. To emphasize the distinguished role to be played by the velocity associated with the cyclic variable q1, we write v = q̇1. Applying the usual Euler-Lagrange operator to this function leads to the equations of motion,

d/dt (∂L/∂q̇1) = u,
The first of these equations may be written more explicitly as
Although, in the physical problem represented by the Lagrangian Equation 57.224, u(·) is clearly the input with v(·) determined via Equation 57.226, it is formally equivalent to take v(·) as the input with the corresponding u(·) determined by Equation 57.226. In actual practice, this may be done provided we may command actuator inputs u(·) large enough to dominate the dynamics. The motion of q2 is then determined by Equation 57.225, with the influence of the actual control input felt only through v(·) and v̇(·). Viewing v(·) together with v̇(·) as control input, Equation 57.225 is a Lagrangian control system in its own right. The defining Lagrangian is given by L̂(q2, q̇2; v) = ½ q̇2ᵀ M(q2) q̇2 + q̇2ᵀ A(q2) v − Va(q2; v), where Va is the augmented potential defined by Va(q; v) = V(q) − ½ vᵀ m(q2) v. In the remainder of our discussion, we shall confine our attention to controlled dynamical systems arising from such a reduced Lagrangian. Because the reduction process itself will typically not be central, we henceforth omit the subscript "2" on the generalized configuration and velocity variables which we wish to control. We write:
L̂(q, q̇; v) = ½ q̇ᵀ M(q) q̇ + q̇ᵀ A(q) v − Va(q; v).   (57.227)
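The role of the augmented potential Va(q; v) = V(q) − ½ vᵀ m(q) v can be illustrated with a hypothetical example not taken from the text: a planar pendulum of length L whirling about the vertical axis at constant cyclic rate v = w has m(θ) = L² sin²θ and V(θ) = −gL cos θ, and for w²L > g the tilted relative equilibrium is the minimizer of Va, satisfying cos θ* = g/(w²L):

```python
import math

g, L, w = 9.81, 1.0, 5.0   # illustrative parameters; w^2 L > g

def Va(th):
    """Augmented potential V(th) - (1/2) w^2 m(th) for the whirling pendulum."""
    return -g * L * math.cos(th) - 0.5 * w**2 * (L * math.sin(th))**2

# locate the minimizer by a fine grid search on [0, pi]
ths = [k * math.pi / 20000 for k in range(20001)]
th_star = min(ths, key=Va)
print(th_star, math.cos(th_star), g / (w**2 * L))  # cos(th*) = g/(w^2 L)
```

Critical points of Va with v frozen are exactly the relative equilibria of the reduced system; the averaged potential used below plays the same role when v oscillates.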
EXAMPLE 57.12: The rotating pendulum.
As in [4], we consider a mechanism consisting of a solid uniform rectangular bar fixed at one end to a universal joint, as depicted in Figure 57.24. The universal joint is comprised of two single degree-of-freedom revolute joints with mutually orthogonal intersecting axes (labeled x and y in Figure 57.24). These joints are assumed to be frictionless. Angular displacements about the x- and y-axes are denoted φ1 and φ2 respectively, with (φ1, φ2) = (0, 0) designating the configuration in which the pendulum hangs straight down. The pendulum also admits a controlled rotation about a spatially fixed vertical axis. Let θ denote the amount of this rotation relative to some chosen reference configuration. To describe the dynamics of the forced pendulum, we choose a (principal axis) coordinate frame, fixed in the bar, consisting of the x- and y-axes of the universal joint together with the corresponding z-axis prescribed by the usual right-hand rule. When the pendulum is at rest, for some reference value of θ, the body frame x-, y-, and z-axes will coincide with corresponding axes x′, y′, and z′ of an inertial frame, as in Figure 57.24, with respect to which we shall measure all motion. Let Ix, Iy, and Iz denote the principal moments of inertia with respect to the body (x, y, z)-coordinate system. Then the system has the Lagrangian

L(φ, φ̇; θ̇) = ½[Ix φ̇1² + (Iy c1² + Iz s1²) φ̇2²] + [Ix s2 φ̇1 + (Iz − Iy) s1 c1 c2 φ̇2] θ̇ + ½[(Iy s1² + Iz c1²) c2² + Ix s2²] θ̇² + c1 c2,   (57.228)

Figure 57.24  A rotating pendulum suspended from a universal joint.

where the last term is a normalized gravitational potential, and where si = sin φi, ci = cos φi. We assume that there is an actuator capable of rotating the mechanism about the inertial z-axis with any prescribed angular velocity θ̇. We shall study the dynamics of this system when Ix ≥ Iy >> Iz, and, as above, we shall view v = θ̇(·) as a control input. The reduced Lagrangian Equation 57.227 takes the form

L̂(φ, φ̇; v) = ½[Ix φ̇1² + (Iy c1² + Iz s1²) φ̇2²] + [Ix s2 φ̇1 + (Iz − Iy) s1 c1 c2 φ̇2] v + ½[(Iy s1² + Iz c1²) c2² + Ix s2²] v² + c1 c2.   (57.229)

The corresponding control system is given by applying the Euler-Lagrange operator to L̂, and this is represented by a system of coupled second-order differential equations,

d/dt (∂L̂/∂φ̇) − ∂L̂/∂φ = 0.   (57.230)

From the way in which v appears in the reduced Lagrangian, the equations of motion Equation 57.230 will have terms involving v̇ as well as terms involving v. Though it is possible to analyze such a system directly, we shall not discuss this approach. The general analytical technique advocated here is to transform all Lagrangian models to Hamiltonian form.

The Hamiltonian viewpoint and the averaged potential: Recall that in Hamiltonian form the dynamics are represented in terms of the configuration variables q and conjugate momenta p = ∂L̂/∂q̇ (see [30], p. 351). Referring to the Lagrangian in Equation 57.227 and applying the Legendre transformation H(q, p; v) = [p·q̇ − L̂(q, q̇; v)]|q̇ = q̇(q, p), we obtain the corresponding Hamiltonian

H(q, p; v) = ½ (p − A(q)v)ᵀ M(q)⁻¹ (p − A(q)v) + Va(q; v).   (57.231)

The equations of motion are then written in the usual way in position and momentum coordinates as
THE CONTROL HANDBOOK
We may obtain an averaged version of this system by replacing all coefficients involving the input v ( . ) by their time-averages. Assuming v ( . ) is bounded, piecewise continuous, and periodic of period T > 0, v(.) has a Fourier series representation:
We may think of the first term, ½ (p − A(q)v̄)ᵀ M(q)⁻¹ (p − A(q)v̄), in Equation 57.235 as an "averaged kinetic energy." It is not difficult to see that there is an "averaged Lagrangian" from which the Hamiltonian H̄ in Equation 57.235 is obtained by means of the Legendre transformation. Equations 57.232 and 57.233 contain terms of order not greater than two in v. Averaging the coefficients, we obtain

PROPOSITION 57.6 Suppose v(·) is given by Equation 57.234. Then, if all coefficients in Equations 57.232 and 57.233 are replaced by their time averages, the resulting averaged system is Hamiltonian with corresponding Hamiltonian function
The averaged potential is useful in assessing the stability of motion in Hamiltonian (or Lagrangian) control systems. The idea behind this is that strict local minima of the averaged potential will correspond to stable equilibria of the averaged Hamiltonian system. The theory describing the relationship with the stability of the forced system is discussed in [6] and [7]. The connection with classical averaging is emphasized in [6], where Rayleigh dissipation is introduced to make the critical points hyperbolically stable. In [7], dissipation is not introduced, and a purely geometric analysis is applied within the Hamiltonian framework. We state the principal stability result.
where

v̄ = (1/T) ∫₀ᵀ v(t) dt  and  Σ = (1/T) ∫₀ᵀ v(t)² dt.

DEFINITION 57.3 We refer to H̄(q, p) in Equation 57.235 as the averaged Hamiltonian associated with Equation 57.231. VA(q), defined in Equation 57.236, is called the averaged potential. (Averaged kinetic and potential energies, the averaged Lagrangian.)

Before describing the way in which the averaged potential may be used for stability analysis, we discuss its formal definition in more detail. The Legendre transformation used to find the Hamiltonian corresponding to Equation 57.227 makes use of the conjugate momentum
THEOREM 57.12 Consider a Lagrangian control system prescribed by Equation 57.227 or its Hamiltonian equivalent (Equation 57.231). Suppose that the corresponding system of Equations 57.232 and 57.233 is forced by the oscillatory input given in Equation 57.234. Let q0 be a critical point of the averaged potential which is independent of the period T (or frequency) of the forcing. Suppose, moreover, that, for all T sufficiently small (frequencies sufficiently large), q0 is a strict local minimum of the averaged potential. Then (q, q̇) = (q0, 0) is a stable equilibrium of the forced Lagrangian system, provided T is sufficiently small. If (q, p) = (q0, 0) is the corresponding equilibrium of the forced Hamiltonian system, then it is likewise stable, provided T is sufficiently small.

This theorem is proved in [7]. We end this section with two examples to which this theorem applies, followed by a simple example which does not satisfy the hypothesis and for which the theory is currently less complete.
REMARK 57.11 The conjugate momentum explicitly depends on the input v(t). Given a point in the phase space, (q, q̇), the corresponding averaged momentum is
EXAMPLE 57.13: Oscillatory stabilization of a simple pendulum.

Consider, once again, the inverted pendulum discussed in Example 57.10. Assume now that no friction exists in the hinge and, therefore, b = 0. Using the classical theory of vibrational control of Example 57.10, it is not possible to draw conclusions on the stabilization of the upper equilibrium point by fast oscillating control, because the averaged equation will have purely imaginary eigenvalues when b = 0. The averaged potential provides a useful alternative in this case. For b = 0, the pendulum dynamics may be written
where all the parameters have been previously defined in Example 57.10. Writing the pendulum's vertical velocity as v(t) = ṙ(t), this is a system of the type we are considering, with (reduced) Lagrangian L̂(θ, θ̇; v) = ½ ℓ θ̇² + v θ̇ sin θ + g cos θ. To find stable motions using the theory we have presented, we construct the averaged potential by passing to the Hamiltonian description of the system. The momentum (conjugate to θ) is p = ℓ θ̇ + v sin θ. Applying the Legendre transformation, p = ∂L̂/∂θ̇, we obtain the corresponding Hamiltonian
If we replace the coefficients involving v(·) with their time averages over one period, we obtain the averaged Hamiltonian

where v̄ and Σ are the time averages over one period of v(t) and v(t)² respectively. The averaged potential is just VA(θ) = (1/(2ℓ))(Σ − v̄²) sin²θ − g cos θ. Consider the simple sinusoidal oscillation of the hinge point, r(t) = α sin βt. Then v(t) = αβ cos βt. Carrying out the construction we have outlined, the averaged potential is given more explicitly by
VA(θ) = (α²β²/(4ℓ)) sin²θ − g cos θ.

Looking at the first derivative VA′(θ), we find that θ = π is a critical point for all values of the parameters. Looking at the second derivative, we find that VA″(π) > 0 precisely when α²β² > 2ℓg. From Theorem 57.12 we conclude that, for sufficiently large values of the frequency β, the upright equilibrium is stable in the sense that motions of the forced system will remain nearby. This is, of course, completely consistent with classical results on this problem. (Cf. Example 57.10.)

EXAMPLE 57.14: Example 57.12, reprise: oscillatory stabilization of a rotating pendulum.

Let us return to the mechanical system treated in Example 57.12. Omitting a few details, we proceed as follows. Starting from the Lagrangian in Equation 57.229, we obtain the corresponding Hamiltonian (the general formula for which is given by Equation 57.231). The averaged potential is given by the formula in Equation 57.236. Suppose the pendulum is forced to rotate at a constant rate, perturbed by a small-amplitude sinusoid, v(t) = ω + α sin βt. Then the coefficients in Equation 57.236 are

v̄ = (1/T) ∫₀ᵀ v(t) dt = ω  and  Σ = ω² + α²/2,

and some algebraic manipulation shows that the averaged potential is given in this case by

VA(φ1, φ2) = −c1c2 − ½ Σ [(Iy s1² + Iz c1²) c2² + Ix s2²] + (α²/4) [Ix s2² + (Iz − Iy)² s1² c1² c2² / (Iy c1² + Iz s1²)].

Stable modes of behavior under this type of forcing correspond to local minima of the averaged potential. A brief discussion of how this analysis proceeds will illustrate the utility of the approach. When α = 0, the pendulum undergoes rotation at a constant rate ω about the vertical axis. For all rates ω, the pendulum is in equilibrium when it hangs straight down. There is a critical value, ωcr, however, above which the vertical configuration is no longer stable. A critical point analysis of the averaged potential yields the relevant information and more. The partial derivatives of VA with respect to φ1 and φ2 both vanish at (φ1, φ2) = (0, 0) for all values of the parameters α, β, ω. To assess the stability of this critical point using Theorem 57.12, we compute the Hessian (matrix of second partial derivatives) of VA evaluated at (φ1, φ2) = (0, 0).

Let us treat the constant rotation case first. We have assumed Ix ≥ Iy >> Iz. When α = 0, this means that the Hessian matrix above is positive definite for 0 ≤ ω² < 1/(Ix − Iz). This inequality gives the value of the critical rotation rate precisely as ωcr = 1/√(Ix − Iz). We wish to answer the following question: Is it possible to provide a stabilizing effect by superimposing a small-amplitude, high-frequency sinusoidal oscillation on the constant-rate forced rotation? The answer emerges from Theorem 57.12 together with analysis of the Hessian. In the symmetric case, Ix = Iy, the answer is "no" because any nonzero value of α will decrease the (1, 1)-entry and hence the value of ωcr. If Ix > Iy, however, there is the possibility of increasing ωcr slightly because, although the (1, 1)-entry is decreased, the more important (2, 2)-entry is increased.

Current research on oscillatory forcing to stabilize rotating systems (chains, shafts, turbines, etc.) is quite encouraging. Though only modest stabilization was possible for the rotating pendulum in the example above, more pronounced effects are generally possible with axial forcing. Because this approach to control appears to be quite robust (as seen in the next example), it merits attention in applications where feedback designs would be difficult to implement. We conclude with an example to which Theorem 57.12 does not apply and for which the theory is currently less well developed. Methods of [6] can be used in this case.
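The threshold α²β² > 2ℓg obtained for the simple pendulum of Example 57.13 is easy to confirm by direct simulation of the forced system. The sketch below, with illustrative parameter values, integrates ℓθ̈ + (g + r̈(t)) sin θ = 0, r(t) = α sin βt, from a small offset off the inverted position and checks that the motion remains near θ = π:

```python
import math

ell, g = 1.0, 9.81
alpha, beta = 0.1, 100.0        # alpha^2 beta^2 = 100 > 2 ell g ~ 19.6

def acc(t, th):
    rdd = -alpha * beta**2 * math.sin(beta * t)     # hinge acceleration r''(t)
    return -(g + rdd) * math.sin(th) / ell

th, om = math.pi + 0.05, 0.0    # start near the inverted position, at rest
dt, T, max_dev = 1e-4, 10.0, 0.0
for k in range(round(T / dt)):
    t = k * dt                  # one RK4 step for (th, om)
    k1t, k1o = om, acc(t, th)
    k2t, k2o = om + dt/2*k1o, acc(t + dt/2, th + dt/2*k1t)
    k3t, k3o = om + dt/2*k2o, acc(t + dt/2, th + dt/2*k2t)
    k4t, k4o = om + dt*k3o, acc(t + dt, th + dt*k3t)
    th += dt/6*(k1t + 2*k2t + 2*k3t + k4t)
    om += dt/6*(k1o + 2*k2o + 2*k3o + k4o)
    max_dev = max(max_dev, abs(th - math.pi))
print(max_dev)                  # stays small: the inverted equilibrium is stable
```

Lowering β until α²β² falls below 2ℓg makes the same run diverge from θ = π, in agreement with the averaged-potential prediction.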
EXAMPLE 57.15: Oscillation-induced rest points in a pendulum on a cart.

We consider a slight variation on Example 57.13, wherein we consider oscillating the hinge point of the pendulum along a line which is not vertical. More specifically, consider a cart to which there is a simple pendulum (as described in Example 57.13) attached, so that the cart moves along a track inclined at an angle ψ to the horizontal. Suppose the position of the cart along its track at time t is prescribed by a variable r(t). Then the pendulum dynamics are expressed
ℓ θ̈ + r̈ cos(θ − ψ) + g sin θ = 0.

Note that when ψ = π/2, the track is aligned vertically, and we recover the problem treated in Example 57.13. In the general case, let v(t) = ṙ(t) and write the averaged potential
where

v̄ = (1/T) ∫₀ᵀ v(t) dt.
As in Example 57.13, we may take v(t) = αβ cos βt. For sufficiently large frequencies β, there are strict local minima of the averaged potential which are not equilibrium points of the forced system. Nevertheless, as noted in [6], the pendulum will execute motions in a neighborhood of such a point. To distinguish such emergent behavior from stable motions in neighborhoods of equilibria (of the nonautonomous system), we have called motions confined to neighborhoods of nonequilibrium critical points of the averaged potential hovering motions. For more information on such motions, the reader is referred to [41].

Remark on the robustness of open-loop methods. The last example suggests, and laboratory experiments bear out, that the stabilizing effects of oscillatory forcing of the type we have discussed are quite pronounced. Moreover, they are quite insensitive to the fine details of the mathematical models and to physical disturbances which may occur. Thus, the stabilizing effect observed in the inverted pendulum will be entirely preserved if the pendulum is perturbed or if the direction of the forcing isn't really vertical. Such robustness suggests that methods of this type are worth exploring in a wider variety of applications.

Remark on oscillatory control with feedback. There are interesting applications (e.g., laser cooling) where useful designs arise through a combination of oscillatory forcing and certain types of feedback. For the theory of time-varying feedback designs, the reader is referred to [20] and the chapters on stability by Khalil, Teel, Sontag, Praly, and Georgiou appearing in this handbook.
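A hovering motion of the kind described in Example 57.15 can be exhibited numerically. Assuming the averaged-potential analog of Example 57.13 for the inclined track, VA(θ) = (α²β²/(4ℓ)) cos²(θ − ψ) − g cos θ (an assumption, chosen to be consistent with the vertical case ψ = π/2), the sketch below locates its minimizer θ*, verifies that θ* is not an equilibrium of the pendulum (g sin θ* ≠ 0), and confirms by simulation that the forced motion nevertheless stays near θ*:

```python
import math

ell, g, psi = 1.0, 9.81, math.pi / 4    # illustrative parameters
alpha, beta = 0.05, 300.0

def VA(th):
    # assumed averaged potential for oscillation along a line at angle psi
    return (alpha * beta)**2 / (4 * ell) * math.cos(th - psi)**2 - g * math.cos(th)

ths = [-math.pi + k * math.pi / 100000 for k in range(100001)]
th_star = min(ths, key=VA)              # grid search for the minimizer on [-pi, 0]

def acc(t, th):
    rdd = -alpha * beta**2 * math.sin(beta * t)       # cart acceleration r''(t)
    return -(rdd * math.cos(th - psi) + g * math.sin(th)) / ell

th, om, dt, T, max_dev = th_star, 0.0, 2e-5, 3.0, 0.0
for k in range(round(T / dt)):
    t = k * dt                          # one RK4 step for (th, om)
    k1t, k1o = om, acc(t, th)
    k2t, k2o = om + dt/2*k1o, acc(t + dt/2, th + dt/2*k1t)
    k3t, k3o = om + dt/2*k2o, acc(t + dt/2, th + dt/2*k2t)
    k4t, k4o = om + dt*k3o, acc(t + dt, th + dt*k3t)
    th += dt/6*(k1t + 2*k2t + 2*k3t + k4t)
    om += dt/6*(k1o + 2*k2o + 2*k3o + k4o)
    max_dev = max(max_dev, abs(th - th_star))
print(th_star, g * math.sin(th_star), max_dev)
```

The gravitational torque at θ* is far from zero, yet the trajectory remains confined to a neighborhood of θ*: a hovering motion rather than motion about an equilibrium.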
Defining Terms

Anholonomy: Consider the controlled differential equation 57.207, and suppose that there is a nonzero function φ : Rⁿ × Rⁿ → R such that φ(x, gi(x)) ≡ 0 for i = 1, ..., m. This represents a constraint on the state velocities which can be commanded. Despite such a constraint, it may happen that any two specified states can be joined by a trajectory x(t) generated via Equation 57.207 by an appropriate choice of inputs ui(·). Any state trajectory arising from Equation 57.207 constrained in this way is said to be determined from the inputs ui(·) by anholonomy. In principle, the notion of anholonomy can be extended to systems given by Equation 57.208 or 57.209. Some authors who were consulted in preparation of this chapter objected to the use of the word in this more general context.

Averaged potential: An energy-like function that describes the steady-state behavior produced by high-frequency forcing of a physical system.

Completely controllable: A system of Equation 57.209 is said to be completely controllable if, given any T > 0 and any pair of points x0, x1 ∈ Rⁿ, there is a control input u(·) producing a motion x(·) of Equation 57.209 such that x(0) = x0 and x(T) = x1.

LARC: The Lie algebra rank condition is the condition that the defining vector fields in systems such as Equation 57.207 or Equation 57.208, together with their Lie brackets of all orders, span Rⁿ.
57.7.4 Acknowledgment The authors are indebted to many people for help in preparing this chapter. R.W. Brockett, in particular, provided useful guidance and criticism.
References

[1] Baillieul, J., Multilinear optimal control, Proc. Conf. Geom. Control Eng. (NASA-Ames, Summer 1976), Math Sci Press, Brookline, MA, 337-359, 1977.
[2] Baillieul, J., Geometric methods for nonlinear optimal control problems, J. Optimiz. Theory Appl., 25(4), 519-548, 1978.
[3] Baillieul, J., The Behavior of Super-Articulated Mechanisms Subject to Periodic Forcing, in Analysis of Controlled Dynamical Systems, B. Bonnard, B. Bride, J.P. Gauthier, and I. Kupka, Eds., Birkhäuser, 1991, 35-50.
[4] Baillieul, J. and Levi, M., Constrained relative motions in rotational mechanics, Arch. Rational Mech. Anal., 115/2, 101-135, 1991.
[5] Baillieul, J., The behavior of single-input superarticulated mechanisms, Proc. 1991 Am. Control Conf., Boston, June 26-28, 1622-1626, 1991.
[6] Baillieul, J., Stable Average Motions of Mechanical Systems Subject to Periodic Forcing, in Dynamics and Control of Mechanical Systems: The Falling Cat and Related Problems, Fields Institute Communications, 1, AMS, Providence, RI, 1-23, 1993.
[7] Baillieul, J., Energy methods for stability of bilinear systems with oscillatory inputs, Int. J. Robust Nonlinear Control, Special Issue on the Control of Nonlinear Mechanical Systems, H. Nijmeijer and A.J. van der Schaft, Guest Eds., July, 285-301, 1995.
[8] Bellman, R., Bentsman, J., and Meerkov, S.M., Vibrational control of nonlinear systems: Vibrational stabilizability, IEEE Trans. Automatic Control, AC-31, 710-716, 1986.
[9] Bellman, R., Bentsman, J., and Meerkov, S.M., Vibrational control of nonlinear systems: Vibrational controllability and transient behavior, IEEE Trans. Automatic Control, AC-31, 717-724, 1986.
[10] Bishop, R.L. and Crittenden, R.J., Geometry of Manifolds, Academic Press, New York, 1964.
[11] Bogolyubov, N.N., Perturbation theory in nonlinear mechanics, Sb. Stroit. Mekh. Akad. Nauk Ukr. SSR, 14, 9-34, 1950.
[12] Bogolyubov, N.N. and Mitropolsky, Y.A., Asymptotic Methods in the Theory of Nonlinear Oscillations, 2nd ed., Gordon & Breach, New York, 1961.
[13] Brockett, R.W., System theory on group manifolds and coset spaces, SIAM J. Control, 10(2), 265-284, 1972.
[14] Brockett, R.W., Control Theory and Analytical Mechanics, in Geometric Control Theory, Vol. VII of Lie Groups: History, Frontiers, and Applications, C. Martin and R. Hermann, Eds., Math Sci Press, Brookline, MA, 1-46, 1977.
[15] Brockett, R.W., Asymptotic Stability and Feedback Stabilization, in Differential Geometric Control Theory, R.W. Brockett, R.S. Millman, and H.J. Sussmann, Eds., Birkhäuser, Basel, 1983.
[16] Brockett, R.W., Control Theory and Singular Riemannian Geometry, in New Directions in Applied Mathematics, Springer-Verlag, New York, 13-27, 1982.
[17] Brockett, R.W., On the rectification of vibratory motion, Sens. Actuat., 20, 91-96, 1989.
[18] Brockett, R.W. and Dai, L., Nonholonomic Kinematics and the Role of Elliptic Functions in Constructive Controllability, in Nonholonomic Motion Planning, Kluwer Academic Publishers, 1-21, 1993.
[19] Chow, W.L., Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Ann., 117, 98-105, 1939.
[20] Coron, J.M., Global asymptotic stabilization for controllable systems without drift, Math. Control, Sig., Syst., 5, 295-312, 1992.
[21] Haynes, G.W. and Hermes, H., Nonlinear controllability via Lie theory, SIAM J. Control, 8(4), 450-460, 1970.
[22] Lafferriere, G. and Sussmann, H.J., Motion planning for controllable systems without drift, Proc. IEEE Intl. Conf. Robot. Automat., 1148-1153, 1991.
[23] Lafferriere, G. and Sussmann, H.J., A Differential Geometric Approach to Motion Planning, in Nonholonomic Motion Planning, Kluwer Academic Publishers, 235-270, 1993.
[24] Lehman, B., Bentsman, J., Lunel, S.V., and Verriest, E.I., Vibrational control of nonlinear time lag systems with bounded delay: Averaging theory, stabilizability, and transient behavior, IEEE Trans. Auto. Control, AC-39, 898-912, 1994.
[25] Lehman, B., Vibrational Control of Time Delay Systems, in Ordinary and Delay Equations, J. Wiener and J. Hale, Eds., Pitman Research Notes in Mathematics Series (272), 111-115, 1992.
[26] Kapitsa, P.L., Dynamic stability of a pendulum with a vibrating point of suspension, Zh. Ehksp. Teor. Fiz., 21(5), 588-598, 1951.
[27] Leonard, N.E. and Krishnaprasad, P.S., Control of Switched Electrical Networks Using Averaging on Lie Groups, 33rd IEEE Conference on Decision and Control, Orlando, FL, Dec. 14-16, 1919-1924, 1994.
[28] Leonard, N.E. and Krishnaprasad, P.S., Motion control of drift-free, left-invariant systems on Lie groups, IEEE Trans. Automat. Control, AC-40(9), 1539-1554, 1995.
[29] Murray, R.M. and Sastry, S.S., Nonholonomic motion planning: steering using sinusoids, IEEE Trans. Auto. Control, 38(5), 700-716, 1993.
[30] Nijmeijer, H. and van der Schaft, A.J., Nonlinear Dynamical Control Systems, Springer-Verlag, New York, 1990.
[31] Sanders, J.A. and Verhulst, F., Averaging Methods in Nonlinear Dynamical Systems, Applied Mathematical Sciences, 59, Springer-Verlag, New York, 1985.
[32] Speyer, J.L., Nonoptimality of steady-state cruise for aircraft, AIAA J., 14(11), 1604-1610, 1976.
[33] Speyer, J.L. and Evans, R.T., A second variational theory for optimal periodic processes, IEEE Trans. Auto. Control, AC-29(2), 138-148, 1984.
[34] Stoker, J.J., Nonlinear Vibrations in Mechanical and Electrical Systems, J. Wiley & Sons, New York, 1950. Republished 1992 in Wiley Classics Library Edition.
[35] Sussmann, H. and Jurdjevic, V., Controllability of nonlinear systems, J. Diff. Eqs., 12, 95-116, 1972.
[36] Sussmann, H.J. and Liu, W., Limits of highly oscillatory controls and approximation of general paths by admissible trajectories, 30th IEEE Conf. Decision Control, Brighton, England, 1991.
[37] Sussmann, H.J. and Liu, W., An approximation algorithm for nonholonomic systems, Rutgers University Department of Mathematics preprint SYCON-93-11. To appear in SIAM J. Optimiz. Control.
[38] Sussmann, H.J. and Liu, W., Lie Bracket Extension and Averaging: The Single Bracket Case, in Nonholonomic Motion Planning, Kluwer Academic Publishers, 109-147, 1994.
[39] Tilbury, D., Murray, R., and Sastry, S., Trajectory generation for the N-trailer problem using Goursat normal form, 32nd IEEE Conf. Decision Control, San Antonio, Texas, 971-977, 1993.
[40] Tilbury, D., Murray, R., and Sastry, S., Trajectory generation for the N-trailer problem using Goursat normal form, IEEE Trans. Auto. Control, AC-40(5), 802-819, 1995.
[41] Weibel, S., Baillieul, J., and Kaper, T., Small-amplitude periodic motions of rapidly forced mechanical systems, 34th IEEE Conf. Decision Control, New Orleans, 1995.
where z2 is an error variable expressing the fact that x2 is not the true control. Differentiating z1 and z2 with respect to time, the complete system, Equations 57.238 and 57.239, is expressed in the error coordinates of Equations 57.241 and 57.242:
57.8 Adaptive Nonlinear Control
It is important to observe that the time derivative of α1 is implemented analytically, without a differentiator. For the system of Equations 57.243 and 57.244 we now design a control law u = α2(x1, x2) to render the time derivative of a Lyapunov function negative definite. The design can be completed with the simplest Lyapunov function
Miroslav Krstić, Department of Mechanical Engineering, University of Maryland, College Park, MD Petar V. Kokotović, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 57.8.1 Introduction: Backstepping Realistic models of physical systems are nonlinear and usually contain parameters (masses, inductances, aerodynamic coefficients, etc.) which are either poorly known or depend on a slowly changing environment. If the parameters vary in a broad range, it is common to employ adaptation: a parameter estimator, the identifier, continuously acquires knowledge about the plant and uses it to tune the controller "on-line". Instabilities in nonlinear systems can be more explosive than in linear systems. During the parameter estimation transients, the state can "escape" to infinity in finite time. For this reason, adaptive nonlinear controllers cannot simply be the "adaptive versions" of standard nonlinear controllers. Currently the most systematic methodology for adaptive nonlinear control design is backstepping. We introduce the idea of backstepping by carrying out a nonadaptive design for the system
where θ is a known parameter vector and φ(x1) is a smooth nonlinear function. Our goal is to stabilize the equilibrium x1 = 0, x2 = −φ(0)ᵀθ = 0. Backstepping design is recursive. First, the state x2 is treated as a virtual control for the x1-equation (57.238), and a stabilizing function
is designed to stabilize Equation 57.238 assuming that x2 = α1(x1) can be implemented. Since this is not the case, we define
Its derivative for Equations 57.243 and 57.244 is
An obvious way to achieve negativity of V̇ is to employ u to make the bracketed expression equal to −c2z2 with c2 > 0, namely,
This control may not be the best choice because it cancels some terms which may contribute to the negativity of V̇. Backstepping design offers enough flexibility to avoid cancellation. However, for the sake of clarity, we will assume that none of the nonlinearities is useful, so that they all need to be cancelled as in the control law (57.247). This control law yields
which means that the equilibrium z = 0 is globally asymptotically stable. In view of Equations 57.241 and 57.242, the same is true about x = 0. The resulting closed-loop system in the z-coordinates is linear:
ż1 = −c1z1 + z2,   ż2 = −z1 − c2z2.
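As a sketch of this nonadaptive design, the following simulation implements the stabilizing function, the error coordinates, and the cancelling control law, and checks that x converges to zero. The nonlinearity φ(x1) = x1², the gains c1 = c2 = 1, and all numerical values are illustrative assumptions, not from the text:

```python
# Nonadaptive backstepping for x1' = x2 + phi(x1)*theta, x2' = u, theta KNOWN.
# phi(x1) = x1**2 is a hypothetical smooth nonlinearity chosen for illustration.
def backstep(theta=2.0, c1=1.0, c2=1.0, x1=1.0, x2=0.0, dt=1e-3, T=20.0):
    for _ in range(int(T / dt)):
        phi, dphi = x1**2, 2.0 * x1
        alpha1 = -c1 * x1 - phi * theta        # stabilizing function
        z1, z2 = x1, x2 - alpha1               # error coordinates
        x1dot = x2 + phi * theta
        # u cancels the nonlinearities so that Vdot = -c1*z1^2 - c2*z2^2
        u = -z1 - c2 * z2 + (-c1 - dphi * theta) * x1dot
        x1 += dt * x1dot
        x2 += dt * u
    return x1, x2
```

In the z-coordinates the closed loop is then exactly the linear system above, so x(t) → 0 exponentially.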
In the next four sections we present adaptive nonlinear designs through examples. Summaries of general design procedures are also provided but without technical details, for which the reader is referred to the text on nonlinear and adaptive control design
by Krstić, Kanellakopoulos, and Kokotović [9]. Only elementary background on Lyapunov stability is assumed, while no previous familiarity with adaptive linear control is necessary. The two main methodologies for adaptive backstepping design are the tuning functions design, Section 57.8.2, and the modular design, Section 57.8.3. These sections assume that the full state is available for feedback. Section 57.8.4 presents designs where only the output is measured. Section 57.8.5 discusses various extensions to more general classes of systems, followed by a brief literature review.
that is, the parameter estimation error θ̃ continues to act as a disturbance which may destabilize the system. Our task is to find an update law for θ̂(t) which preserves the boundedness of x(t) and achieves its regulation to zero. To this end, we consider the Lyapunov function
where Γ is a positive definite symmetric matrix referred to as the "adaptation gain". The derivative of V1 is
57.8.2 Tuning Functions Design In the tuning functions design both the controller and the parameter update law are designed recursively. At each consecutive step a tuning function is designed as a potential update law. The tuning functions are not implemented as update laws. Instead, the stabilizing functions use them to compensate the effects of parameter estimation transients. Only the final tuning function is used as the parameter update law. Introductory Examples The tuning functions design will be introduced through examples with increasing complexity:
Our goal is to select an update law for θ̂ to guarantee
The only way this can be achieved for any unknown θ is to choose
This choice yields
V̇1 = −c1x1²,   (57.257)
which guarantees global stability of the equilibrium x1 = 0, θ̂ = θ, and hence, the boundedness of x1(t) and θ̂(t). By LaSalle's invariance theorem (see the chapter by Khalil in this volume), all of the trajectories of the closed-loop adaptive system converge to the set where V̇1 = 0, that is, to the set where c1x1² = 0, implying that
lim x1(t) = 0, as t → ∞.   (57.258)
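Design A can be checked numerically. The sketch below implements the certainty-equivalence control u = −c1x − φ(x)θ̂ with the update law θ̂̇ = γφ(x)x and exhibits the behavior proved above: x(t) → 0 while θ̂(t) stays bounded (without necessarily converging to θ). The regressor φ(x) = x³, the gains, and the initial data are illustrative assumptions:

```python
# Design A: scalar plant x' = u + phi(x)*theta, theta unknown to the controller.
def design_a(theta=3.0, c1=2.0, gamma=2.0, x=1.0, dt=1e-3, T=50.0):
    th = 0.0                          # parameter estimate thetahat
    xmax = abs(x)
    for _ in range(int(T / dt)):
        phi = x**3                    # regressor, phi(0) = 0
        u = -c1 * x - phi * th        # certainty-equivalence control
        th += dt * gamma * phi * x    # tuning-functions update law (57.256)
        x += dt * (u + phi * theta)
        xmax = max(xmax, abs(x))
    return x, th, xmax
```

Note that the Lyapunov function V1 bounds the whole transient: |x(t)| can never exceed the level set fixed by the initial conditions, even while θ̂ is far from θ.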
The adaptive problem arises because the parameter vector θ is unknown. The nonlinearity φ(x1) is known and, for simplicity, it is assumed that φ(0) = 0. The systems A, B, and C differ structurally: the number of integrators between the control u and the unknown parameter θ increases from zero at A, to two at C. Design A will be the simplest because the control u and the uncertainty φ1(x1)ᵀθ are "matched," that is, the control does not have to overcome integrator transients to counteract the effects of the uncertainty. Design C will be the hardest because the control must act through two integrators before it reaches the uncertainty. Design A. Let θ̂ be an estimate of the unknown parameter θ in the system
Alternatively, we can prove Equation 57.258 as follows. Integrating Equation 57.257, we obtain ∫₀ᵗ c1x1(τ)² dτ = V1(x1(0), θ̂(0)) − V1(x1(t), θ̂(t)), which, due to the nonnegativity of V1, implies that ∫₀^∞ c1x1(τ)² dτ ≤ V1(x1(0), θ̂(0)) < ∞. Hence, x1 is square-integrable. Due to the boundedness of x1(t) and θ̂(t), from Equations 57.252 and 57.256 it follows that ẋ1(t) and θ̂̇(t) are also bounded. By Barbalat's lemma, we conclude that x1(t) → 0. The update law Equation 57.256 is driven by the vector φ(x1), called the regressor, and the state x1. This is a typical form of an update law in the tuning functions design: the speed of adaptation is dictated by the nonlinearity φ(x1) and the state x1. Design B. For the system
If this estimate were correct, θ̂ = θ, then the control law
we have already designed a nonadaptive controller in Section 57.8.1. To design an adaptive controller, we replace the unknown θ by its estimate θ̂ in the stabilizing function Equation 57.240 and in the change of coordinate Equation 57.242,
would achieve global asymptotic stability of x = 0. Because θ̂ − θ ≢ 0,
Because the control input is separated from the unknown parameter by an integrator in the system (57.259), the control law Equation 57.247 will be strengthened by a term v2(x1, x2, θ̂) which will compensate for the parameter estimation transients,
The only way to eliminate the unknown parameter error θ̃ is to select the update law
Then V̇2 is nonpositive, which means that global stability of z = 0, θ̃ = 0 is achieved. Moreover, by applying either the LaSalle or the Barbalat argument mentioned in Design A, we prove that z(t) → 0 as t → ∞. Finally, from Equation 57.260, it follows that the equilibrium x = 0, θ̂ = θ is globally stable and x(t) → 0 as t → ∞. The resulting system in the z coordinates is
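The Design B equations can be exercised in simulation. In this sketch (scalar θ, hypothetical φ(x1) = x1², illustrative gains) the stabilizing function α1, the tuning-function update θ̂̇ = Γ(φz1 + w2z2) with w2 = −(∂α1/∂x1)φ, and the v2-compensated control law are implemented exactly as derived:

```python
# Design B: x1' = x2 + phi(x1)*theta, x2' = u, with scalar unknown theta.
def design_b(theta=1.0, c1=2.0, c2=2.0, gamma=1.0, dt=1e-3, T=40.0):
    x1, x2, th = 1.0, 0.0, 0.0
    for _ in range(int(T / dt)):
        phi = x1**2
        alpha1 = -c1 * x1 - phi * th
        da_dx1 = -c1 - 2.0 * x1 * th       # d(alpha1)/dx1
        da_dth = -phi                      # d(alpha1)/d(thetahat)
        z1, z2 = x1, x2 - alpha1
        w2 = -da_dx1 * phi
        tau2 = phi * z1 + w2 * z2          # tuning function
        thdot = gamma * tau2               # final update law
        # control law; the v2-term da_dth*thdot compensates the transients
        u = -z1 - c2 * z2 + da_dx1 * (x2 + phi * th) + da_dth * thdot
        x1 += dt * (x2 + phi * theta)
        x2 += dt * u
        th += dt * thdot
    return x1, x2, th
```

Along these trajectories V̇2 = −c1z1² − c2z2² by construction, so z(t), and hence x(t), is regulated to zero even though θ̂ need not converge to θ.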
or, in vector form,
The term v2 can now be chosen to eliminate the last brackets,
This expression is implementable because θ̂̇ will be available from the update law. Thus we obtain the error system
When the parameter error θ̃ is zero, this system becomes the linear asymptotically stable system Equation 57.249. Our remaining task is to select the update law θ̂̇ = Γτ2(x, θ̂). Consider the Lyapunov function
V̇2 = −c1z1² − c2z2² + θ̃ᵀ(τ2 − Γ⁻¹θ̂̇).
The closed-loop adaptive system, Equations 57.266 and 57.269.
The crucial property of the control law in Design B is that it incorporates the v2-term Equation 57.265, which is proportional to θ̂̇ and compensates for the effect of parameter estimation transients on the coordinate change Equation 57.260. It is this departure from the certainty equivalence principle that makes adaptive stabilization possible for systems with nonlinearities of arbitrary growth. By comparing Equations 57.269 and 57.256, we note that the first term, Γφz1, is the potential update law for the z1-system. The functions
are referred to as the tuning functions, because of their role as potential update laws for intermediate systems in the backstepping procedure. Design C. The system,
Because θ̃̇ = −θ̂̇, the derivative of V2 is
Figure 57.25
is obtained by augmenting system Equation 57.259 with an integrator. The control law α2(x1, x2, θ̂) designed in Equation 57.261 can no longer be directly applied because x3 is a state and not a
57.8. ADAPTIVE NONLINEAR CONTROL control input. We ''step back" through the integrator x3 = u and design the control li~wfor the actual input u. However, we keep the stabilizing function a2 and use it to define the third error coordinate 23 = x3 -u2(xl9X2,6). (57.274) The parameter update law Equation 57.269 will have to be modified with an additional z3-term. Instead of 8; in Equation 57.265, the compensating term v2 will now use the potential update law Equation 57.272 for the system equation 57.266,
Hence, the role of the tuning function τ2 is to substitute for the actual update law in compensating for the effects of parameter estimation transients. With Equations 57.260, 57.274, 57.272, and 57.275, and the stabilizing function α2 in Equation 57.261,
Again we must eliminate the unknown parameter error θ̃ from V̇3. For this we must choose the update law as
Upon inspecting the bracketed terms in V̇3, we pick the control law,
The compensation term v3 is yet to be chosen. Substituting Equation 57.282 into Equation 57.280,
From this expression it is clear that v3 should cancel the cross-term z2(∂α1/∂θ̂)(Γτ2 − θ̂̇).
In
This system differs from the error system Equation 57.266 only in its last term. Likewise, instead of the Lyapunov inequality Equation 57.270,
order to cancel the cross-term z2(∂α1/∂θ̂)(Γτ2 − θ̂̇) with v3, we need to divide by z3. However, the variable z3 might take a zero value during the transient, and should be regulated to zero to accomplish the control objective. We resolve this difficulty by noting that
Differentiating Equation 57.274,
so that
v3 in Equation 57.283 is rewritten as
From Equation 57.285 the choice of v3 is immediate. We now stabilize the (z1, z2, z3)-system, Equations 57.276 and 57.279, with respect to the Lyapunov function below. The resulting v3 is
Its derivative along Equations 57.276 and 57.279 is
which guarantees that the equilibrium x = 0, θ̃ = 0 is globally stable, and x(t) → 0 as t → ∞. The Lyapunov design leading to Equation 57.285 is effective but does not reveal the stabilization mechanism. To provide further insight we write the (z1, z2, z3)-system, Equations 57.276 and 57.279, with u given in Equation 57.282 but with v3 yet to be selected:
The general design summarized in Table 57.1 achieves asymptotic tracking, that is, the output y = x1 of the system Equation 57.292 is forced to track asymptotically the reference output y_r(t), whose first n derivatives are assumed to be known, bounded, and piecewise continuous. While
v3 can cancel the matched term (∂α2/∂θ̂)θ̂̇, it cannot cancel the term (∂α1/∂θ̂)(Γτ2 − θ̂̇) in the second equation. By substituting Equation 57.284, we note that (∂α1/∂θ̂)(Γτ2 − θ̂̇) has z3 as a factor and absorb it into the "system matrix":
TABLE 57.1 Summary of the Tuning Functions Design for Tracking.
Now v3 in (57.287) yields
This choice, which places the compensating term at the (2,3) position in the system matrix, achieves skew-symmetry with its positive image above the diagonal. What could not be achieved by pursuing a linear-like form was achieved by designing a nonlinear system where the nonlinearities are 'balanced' rather than cancelled.
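The 'balancing' mechanism rests on a simple algebraic fact: a skew-symmetric off-diagonal part contributes nothing to a quadratic form, so zᵀA_z z = −Σ cᵢzᵢ² no matter how large the compensating terms grow. A quick numerical check of this fact (random gains and skew entries; all values illustrative):

```python
import random

# For A_z = -C + S with C = diag(c_i) > 0 and S skew-symmetric (S^T = -S),
# z^T A_z z = -sum_i c_i z_i^2 for every z: the S-terms cancel in pairs.
def check_skew_balance(n=5, trials=200, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        c = [1.0 + rng.random() for _ in range(n)]
        s = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for k in range(i + 1, n):
                s[i][k] = rng.uniform(-5.0, 5.0)
                s[k][i] = -s[i][k]          # skew-symmetry
        z = [rng.uniform(-2.0, 2.0) for _ in range(n)]
        quad = sum(z[i] * (-c[i] * z[i] + sum(s[i][k] * z[k] for k in range(n)))
                   for i in range(n))
        if abs(quad - (-sum(ci * zi * zi for ci, zi in zip(c, z)))) > 1e-9:
            return False
    return True
```

This is why V̇ remains −Σ cᵢzᵢ² in the closed loop even though the off-diagonal entries of A_z depend on the (possibly large) adaptation transients.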
General Recursive Design Procedure
Adaptive control law:
Parameter update law:
A systematic backstepping design with tuning functions has been developed for the class of nonlinear systems transformable into the parametric strict-feedback form,
The closed-loop system has the form
where β and the φi are smooth nonlinear functions, and β(x) ≠ 0, ∀x ∈ Rⁿ. (Broader classes of systems that can be controlled by adaptive backstepping are listed in Section 57.8.5.)
(For notational convenience we define z0 = 0, α0 = 0, τ0 = 0.)
and
Because of the skew-symmetry of the off-diagonal part of the matrix A_z, it is easy to see that the Lyapunov function
has the derivative
which guarantees that the equilibrium z = 0, θ̂ = θ is globally stable, and z(t) → 0 as t → ∞. This means, in particular, that the system state and the control input are bounded and asymptotic tracking is achieved: lim [y(t) − y_r(t)] = 0 as t → ∞. To help understand how the control design of Table 57.1 leads to the closed-loop system Equations 57.301-57.304, we provide an interpretation of the matrix A_z for n = 5:
It is not hard to extend various standard identifiers for linear systems to nonlinear systems. It is therefore desirable to have adaptive designs where the controller can be combined with different identifiers (gradient, least-squares, passivity based, etc.). We refer to such adaptive designs as modular. In nonlinear systems it is not a good idea to connect a good identifier with a controller which is good when the parameter is known (a "certainty equivalence" controller). To illustrate this, let us consider the error system
obtained by applying a certainty equivalence controller u = −x − φ(x)θ̂ to the scalar system ẋ = u + φ(x)θ. The parameter estimators commonly used in adaptive linear control generate bounded estimates θ̂(t) with convergence rates not faster than exponential. Suppose that θ̃(t) = e⁻ᵗ and φ(x) = x³, which, upon substitution in Equation 57.308, gives
For initial conditions |x0| > (3/2)^(1/2), the system Equation 57.309 is unstable, and its solution escapes to infinity in finite time:
x(t) = e⁻ᵗ[x0⁻² − (2/3)(1 − e⁻³ᵗ)]^(−1/2),   t_e = (1/3) ln[2x0²/(2x0² − 3)].
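Both the finite escape of the certainty-equivalence loop and the boundedness restored by a nonlinear damping term −κφ(x)²x (discussed next) can be seen in a simulation; here θ̃(t) = e⁻ᵗ and φ(x) = x³ as above, while the damping gain κ and the numerical tolerances are illustrative assumptions:

```python
import math

# x' = -x - kappa*phi(x)^2*x + phi(x)*exp(-t), phi(x) = x^3.
# kappa = 0 is the certainty-equivalence loop (57.309); kappa > 0 adds damping.
def run(kappa, x0, dt=1e-5, T=1.0, blow=1e6):
    x, t = x0, 0.0
    while t < T:
        if abs(x) > blow:
            return t, x                  # finite escape detected
        phi = x**3
        x += dt * (-x - kappa * phi * phi * x + phi * math.exp(-t))
        t += dt
    return None, x                       # bounded up to time T

t_esc, _ = run(kappa=0.0, x0=1.5)        # escapes in finite time
t_ok, x_end = run(kappa=1.0, x0=1.5)     # nonlinear damping: bounded
t_e = math.log(3.0) / 3.0                # analytic escape time for x0 = 1.5
```

For x0 = 1.5 the analytic escape time is t_e = (1/3) ln 3 ≈ 0.366, which the undamped simulation reproduces, while the damped loop stays bounded for the same initial condition.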
From this example we conclude that, for nonlinear systems, we need stronger controllers which prevent unbounded behavior caused by θ̃. If the parameters were known, θ̂ = θ, in which case we would not use adaptation, Γ = 0, the stabilizing functions Equation 57.295 would be implemented with vi = 0, and hence σik = 0. Then A_z would be just the above constant tridiagonal asymptotically stable matrix. When the parameters are unknown, we use Γ > 0 and, due to the change of variable zi = xi − y_r^(i−1) − αi−1, in each of the żi-equations a term −(∂αi−1/∂θ̂)θ̂̇ = Σ σik zk appears. The term
We strengthen the controller for the preceding example, u = −x − φ(x)θ̂, with a nonlinear damping term −κφ(x)²x, that is, u = −x − φ(x)θ̂ − κφ(x)²x. With this stronger controller, the closed-loop system is
vi = −Σ σki zk in the stabilizing function Equation 57.295 is crucial in compensating for the effect of θ̂̇. The σik-terms above the diagonal in Equation 57.307 come from θ̂̇. Their skew-symmetric negative images come from the feedback vi.
Controller Design
To see that x is bounded whenever θ̃ is, we consider the Lyapunov function V = (1/2)x². Its derivative along the solutions of Equation 57.311 is
It can be shown that the resulting closed-loop system, Equations 57.302 and 57.301, as well as each intermediate system, has a strict passivity property from θ̃ as the input to τ as the output. The loop around this operator is closed (see Figure 57.25) with the vector integrator with gain Γ, which is a passive block. It follows from passivity theory that this feedback connection of one strictly passive and one passive block is globally stable.
57.8.3 Modular Design In the tuning functions design, the controller and the identifier are derived in an interlaced fashion. This interlacing led to considerable controller complexity and inflexibility in the choice of the update law.
From this inequality it is clear that |x(t)| will not grow larger than |θ̃(t)|/(2κ^(1/2)), because then V̇ becomes negative and V = (1/2)x² decreases. Thanks to the nonlinear damping, the boundedness of θ̃(t) guarantees that x(t) is bounded. To show how nonlinear damping is incorporated into a higher-order backstepping design, we consider the system
Viewing x2 as a control input, we first design a control law α1(x1, θ̂) to guarantee that the state x1 in ẋ1 = x2 + φ(x1)ᵀθ is bounded whenever θ̃ is bounded. In the first stabilizing function we include a nonlinear damping term −κ1|φ(x1)|²x1:
Then we define the error variable z2 = x2 − α1(x1, θ̂), and for uniformity denote z1 = x1. The first equation is now
nonlinear damping terms −κ2|(∂α1/∂x1)φ|²z2 and −g2|∂α1/∂θ̂|²z2 to counteract the effects of both θ̃ and θ̂̇:
where c2, κ2, g2 > 0. Upon completing the squares, as in Equation 57.316,
If z2 were zero, the Lyapunov function V1 = (1/2)z1² would have the derivative, which means that the state of the error system,
so that z1 would be bounded whenever θ̃ is bounded.
Differentiating x2 = z2 + α1(x1, θ̂), the second expression in Equation 57.313 yields
The derivative of the Lyapunov function
is bounded whenever the disturbance inputs θ̃ and θ̂̇ are bounded. Moreover, because V2 is quadratic in z, see Equation 57.319, we can use Equation 57.322 to show that the boundedness of z is guaranteed also when θ̂̇ is square-integrable but not bounded. This observation is crucial for the modular design with passive identifiers, where θ̂̇ cannot be a priori guaranteed to be bounded. The recursive controller design for the parametric strict-feedback systems Equation 57.292 is summarized in Table 57.2. Comparing the expression for the stabilizing function Equation 57.325 in the modular design with Equation 57.295 for the tuning functions design, we see that the difference is in the second lines. Though the stabilization in the tuning functions design is achieved with the terms vi, in the modular design stabilization is accomplished with the nonlinear damping terms −si zi, where
along the solutions of Equations 57.315 and 57.318 is
The resulting error system is
where A_z, W, and Q are given below. We note that, in addition to the θ̃-dependent disturbance term Wᵀθ̃, we also have a θ̂̇-dependent disturbance Qᵀθ̂̇. No such term appeared in the scalar system Equation 57.311. We now use
The Euclidean norm of a vector v is denoted |v| = (vᵀv)^(1/2).
TABLE 57.2 Summary of the Controller Design in the Modular Approach. (For notational convenience we define z0 = 0, α0 = 0.)
In addition to boundedness of x(t), our goal is to achieve asymptotic tracking, that is, to regulate z(t) to zero. With z and θ̂ bounded, it is not hard to prove that z(t) → 0 provided W(z(t), θ̂(t), t)ᵀθ̃(t) → 0 and θ̂̇(t) → 0.
Let us factor the regressor matrix W, using Equations 57.333, 57.326, and 57.293, as
i = 1, . . . , n
φi(x1, . . . , xi),
Since the matrix N(z, θ̂, t) is invertible, the tracking condition W(z(t), θ̂(t), t)ᵀθ̃(t) → 0 becomes F(x(t))ᵀθ̃(t) → 0. In the next two subsections we develop identifiers for the general parametric model
ȳ_r^(i) = (y_r, ẏ_r, . . . , y_r^(i))   (57.328)
The parametric strict-feedback system Equation 57.292 is a special case of this model with F(x, u) given by Equation 57.293 and f(x, u) = [x2, . . . , xn, β(x)u]ᵀ. Before we present the design of identifiers, we summarize the properties required from the identifier module:
Adaptive control law:
(i) θ̃ ∈ L∞, and θ̂̇ ∈ L2 or L∞;
(ii) if x ∈ L∞, then F(x(t))ᵀθ̃(t) → 0 and θ̂̇(t) → 0.
Controller module guarantees:
If θ̃ ∈ L∞ and θ̂̇ ∈ L2 or L∞, then x ∈ L∞.
We present two types of identifiers: the passive identifier and the swapping identifier.
Passive Identifier
Since the controller module guarantees that x is bounded whenever θ̃ is bounded and θ̂̇ is either bounded or square-integrable, we need identifiers which independently guarantee these properties. Both the boundedness and the square-integrability requirements for θ̂̇ are essentially conditions which limit the speed of adaptation, and only one of them needs to be satisfied. The modular design needs slow adaptation because the controller does not cancel the effect of θ̂̇, as was the case in the tuning functions design.
Figure 57.26
The passive identifier.
For the parametric model Equation 57.335, we implement the "observer"
where λ > 0 and A0 is an arbitrary constant matrix such that
PA0 + A0ᵀP = −I,   P = Pᵀ > 0.   (57.337)
By direct substitution it can be seen that the observer error
ε = x − x̂   (57.338)
is governed by
θ̂̇(t) → 0. Both properties are established by Barbalat's lemma. The latter property can easily be shown to follow from the square-integrability of θ̂̇. The regulation of F(x)ᵀθ̃ to zero follows upon showing that both ε(t) and ξ(t) converge to zero. Though the convergence of ε(t) follows by deducing its square-integrability from Equation 57.342, the convergence of ξ(t) follows from the fact that its integral, ∫₀^∞ ξ(τ) dτ = ε(∞) − ε(0) = −ε(0), exists.
ε̇ = [A0 − λF(x, u)ᵀF(x, u)P]ε + F(x, u)ᵀθ̃.   (57.339)
Swapping Identifier For the parametric model Equation 57.335, we implement two filters,
The observer error system Equation 57.339 has a strict passivity property from the input θ̃ to the output F(x, u)Pε.
Ω̇ᵀ = [A0 − λF(x, u)ᵀF(x, u)P]Ωᵀ + F(x, u)ᵀ,   (57.343)
Ω̇0 = [A0 − λF(x, u)ᵀF(x, u)P](Ω0 + x) − f(x, u),   (57.344)
where λ ≥ 0 and A0 is as defined in Equation 57.337. The estimation error,
can be written in the form
Figure 57.27 Negative feedback connection of the strictly passive system Equation 57.339 with the passive system.
where ε̃ = x + Ω0 − Ωᵀθ decays exponentially because it is governed by
A standard result of passivity theory is that the equilibrium
θ̃ = 0, ε = 0, of the negative feedback connection of one strictly passive and one passive system is globally stable. Using integral feedback, such a connection can be formed as in Figure 57.27. This suggests the use of the following update law
To analyze the stability properties of the passive identifier, we use the Lyapunov function
After uncomplicated calculations, its derivative can be shown to satisfy
This guarantees the boundedness of θ̃ and ε, even when λ = 0. However, θ̂̇ cannot be shown to be bounded (unless x and u are known to be bounded). Instead, for the passive identifier one can show that θ̂̇ is square-integrable. For this we must use λ > 0, that is, we rely on the nonlinear damping term −λF(x, u)ᵀF(x, u)P in the observer. The boundedness of θ̃ and the square-integrability of θ̂̇ imply (cf. Table 57.2) that x is bounded. To prove the tracking, we need to show that the identifier guarantees that, whenever x is bounded, F(x(t))ᵀθ̃(t) → 0 and
Figure 57.28
The swapping identifier.
The filters Equations 57.343 and 57.344 have converted the dynamic model Equation 57.335 into the linear static parametric model Equation 57.346, to which we can apply standard estimation algorithms. As our update law, we will employ either the gradient
and the vectors of unknown constant parameters are
or the least squares algorithm
a = [a_q, . . . , a_0]ᵀ,   b = [b_m, . . . , b_0]ᵀ
where Γ(0) = Γ(0)ᵀ > 0 and the normalization signal is 1 + ν tr{ΩΩᵀ}, ν ≥ 0. (57.349) By allowing ν = 0, we encompass unnormalized gradient and least-squares. The complete swapping identifier is shown in Figure 57.28. The update law normalization, ν > 0, and the nonlinear damping, λ > 0, are two different means for slowing down the identifier in order to guarantee the boundedness and square-integrability of θ̂̇. For the gradient update law Equation 57.348, the identifier properties (boundedness of θ̃ and θ̂̇ and regulation of F(x)ᵀθ̃ and θ̂̇) are established via the Lyapunov function
Γ̇ = −Γ (ΩΩᵀ / (1 + ν tr{ΩΩᵀ})) Γ.   (57.355)
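To make the swapping construction concrete, the sketch below applies it to a hypothetical scalar plant ẋ = u + F(x)θ with F(x) = x, A0 = −a0, λ = 0, and an unnormalized gradient update; the probing input u = sin t, the gains, and all numerical values are illustrative assumptions. With zero initial conditions the prediction error is exactly ε = Ωθ̃, and θ̂ converges to θ:

```python
import math

# Swapping identifier for x' = f + F(x)*theta with f = u, F(x) = x.
# Filters (57.343)-(57.344) with lambda = 0, A0 = -a0:
#   Omega'  = -a0*Omega  + F(x)
#   Omega0' = -a0*(Omega0 + x) - f
# Static model: eps = x + Omega0 - Omega*thetahat; gradient update (57.348).
def swapping(theta=-2.0, a0=1.0, gamma=5.0, dt=1e-3, T=100.0):
    x = om = om0 = th = 0.0
    t = 0.0
    for _ in range(int(T / dt)):
        u = math.sin(t)                    # probing input
        eps = x + om0 - om * th            # static estimation error
        th += dt * gamma * om * eps        # unnormalized gradient update
        x, om, om0 = (x + dt * (u + x * theta),
                      om + dt * (-a0 * om + x),
                      om0 + dt * (-a0 * (om0 + x) - u))
        t += dt
    return th
```

The filters have "swapped" the dynamics out of the parametric model: the update law sees only a static linear regression, which is why standard gradient and least-squares estimators apply directly.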
We make the following assumptions: the sign of b_m is known, the polynomial B(s) = b_m s^m + · · · + b_1 s + b_0 is known to be Hurwitz, and σ(y) ≠ 0, ∀y ∈ R. An important restriction is that the nonlinearities φ(y) and Φ(y) are allowed to depend only on the output y. Even when θ is known, this restriction is needed to achieve global stability.
TABLE 57.3 State Estimation Filters.
Filters:
whose derivative is
The Lyapunov function for the least-squares update law Equation 57.349 is V = θ̃ᵀΓ(t)⁻¹θ̃ + εᵀPε.
We define the parameter-dependent state estimate
57.8.4 Output Feedback Designs For linear systems, a common solution to the output-feedback problem is a stabilizing state-feedback controller employing the state estimates from an exponentially converging observer. Unfortunately, this approach is not applicable to nonlinear systems. Additional difficulties arise when the nonlinear plant has unknown parameters because adaptive observers, in general, are not exponentially convergent. These obstacles have been overcome for systems transformable into the output-feedback form,
x̂ = ξ + Ωᵀθ,   (57.361)
which employs the filters given in Table 57.3, with the vector k = [k1, . . . , kn]ᵀ chosen so that the matrix A0 = A − ke1ᵀ is Hurwitz, that is,
PA0 + A0ᵀP = −I,   P = Pᵀ > 0.   (57.362)
The state estimation error, E=X-X,
(57.363)
as is readily shown, satisfies
ε̇ = A0ε.   (57.364)
Here y = e1ᵀx is the plant output (Equation 57.352), and only the output y is available for measurement.
The following two expressions for ẏ are instrumental in the backstepping design:
ẏ = ω0 + ωᵀθ + ε2,   (57.365)
ẏ = b_m v_{m,2} + ω̄0 + ω̄ᵀθ + ε2,   (57.366)
where
ω0 = φ_{0,1} + ξ2,   (57.367)
ω = [v_{m,2}, v_{m−1,2}, . . . , v_{0,2}, Ξ_(2) + Φ_(2)]ᵀ.   (57.368)
Since the states x2, . . . , xn are not measured, the backstepping design is applied to the system
TABLE 57.4
Output-Feedback Tuning Functions Design.
The order of this system is equal to the relative degree of the plant Equation 57.352.
Output-Feedback Design with Tuning Functions The output-feedback design with tuning functions is summarized in Table 57.4. The resulting error.system is
where
and
Adaptive control law:
The nonlinear damping terms −di(∂αi−1/∂y)²zi in Equation 57.375 are included to counteract the exponentially decaying state estimation error ε2. The variable ϱ̂ is an estimate of ϱ = 1/b_m.
Output-Feedback Modular Design In addition to sgn(b_m), in the modular design we assume that a positive constant ς_m is known so that |b_m| ≥ ς_m.
Parameter update laws:
The complete design of the control law is summarized in Table 57.5. The resulting error system is given below.
TABLE 57.5 Output-Feedback Controller in the Modular Design.
z1 = y − y_r,   (57.389)
zi = v_{m,i} − (1/b̂_m) y_r^(i−1) − αi−1,   i = 2, . . . , ρ,   (57.390)
where
and
Passive identifier For the parametric model Equation 57.365, we introduce the scalar observer
ŷ̇ = −(c0 + κ0|ω|²)(ŷ − y) + ω0 + ωᵀθ̂.   (57.403)
The observer error, ε = y − ŷ, is governed by
Adaptive control law:
The parameter update law is
θ̂̇ = Proj(Γωε),   b̂_m(0) sgn(b_m) > ς_m,   Γ = Γᵀ > 0,   (57.406)
where the projection operator is employed to guarantee that |b̂_m(t)| ≥ ς_m > 0, ∀t ≥ 0.
where xi is a νi-vector, ν1 ≤ ν2 ≤ · · · ≤ νn, x̄i = [x1ᵀ, . . . , xiᵀ]ᵀ, x = x̄n, and the matrices Bi(x̄i) have full
Swapping identifier
The estimation error
satisfies the following equation linear in the parameter error:
rank for all x̄i. The input u is a νn-vector. The matrices Bi can be allowed to be unknown provided they are constant and positive definite.
Block strict-feedback systems. The update law for θ̂ is either the gradient,
or the least squares,
the regressor ω is augmented by
Adaptive nonlinear control designs presented in the preceding sections are applicable to classes of nonlinear systems broader than the parametric strict-feedback systems Equation 57.292. Pure-feedback systems.
ζ̇i = Φ_{i,0}(x̄i, ζ̄i) + Φi(x̄i, ζ̄i)ᵀθ,   i = 1, . . . , ρ,   (57.414)
with the following notation: x̄i = [x1, . . . , xi]ᵀ, ζ̄i = [ζ1ᵀ, . . . , ζiᵀ]ᵀ, x = x̄ρ, and ζ = ζ̄ρ. Each ζi-subsystem of Equation 57.414 is assumed to be bounded-input bounded-state (BIBS) stable with respect to the input (x̄i, ζ̄i−1). For this class of systems it is quite simple to modify the procedure in Tables 57.1 and 57.2. Because of the dependence of φi on ζi, the stabilizing function αi is augmented by an additional term, and
57.8.5 Extensions
Partial state-feedback systems. In many physical systems there are unmeasured states as in the output-feedback form Equation 57.352, but there are also states other than the output y = x1 that are measured. An example of such a system is
ẋn = [β0(x) + β(x)ᵀθ]u + φ0(x) + φn(x)ᵀθ,   (57.411)
where φ0(0) = 0, φ1(0) = · · · = φn(0) = 0, β0(0) ≠ 0. Because of the dependence of φi on xi+1, the regulation or tracking for pure-feedback systems is, in general, not global, even when θ is known.
Unknown virtual control coefficients.
ẋn = bn β(x)u + φn(x1, . . . , xn)ᵀθ,   (57.412)
where, in addition to the unknown vector θ, the constant coefficients bi are also unknown. The unknown bi-coefficients are frequent in applications ranging from electric motors to flight dynamics. The signs of bi, i = 1, . . . , n, are assumed to be known. In the tuning functions design, in addition to estimating bi, we also estimate its inverse ϱi = 1/bi. In the modular design, we assume that, in addition to sgn(bi), a positive constant ςi is known such that |bi| ≥ ςi. Then, instead of estimating ϱi = 1/bi, we use the inverse of the estimate b̂i, i.e., 1/b̂i, where b̂i(t) is kept away from zero by using parameter projection. Multi-input systems.
The states x3 and ζ are assumed not to be measured. To apply the adaptive backstepping designs presented in this chapter, we combine the state-feedback techniques with the output-feedback techniques. The subsystem (x2, x3, ζ) is in the output-feedback form with x2 as a measured output, so that we employ a state estimator for (x2, x3, ζ) using the filters introduced in Section 57.8.4.
References

[1] Bastin, G., Adaptive nonlinear control of fed-batch stirred tank reactors, Int. J. Adapt. Control Sig. Process., 6, 273–284, 1992.
[2] Dawson, D.M., Carroll, J.J., and Schneider, M., Integrator backstepping control of a brushed DC motor turning a robotic load, IEEE Trans. Control Syst. Tech., 2, 233–244, 1994.
[3] Janković, M., Adaptive output feedback control of nonlinear feedback-linearizable systems, Int. J. Adapt. Control Sig. Process., to appear.
[4] Jiang, Z.P. and Pomet, J.B., Combining backstepping and time-varying techniques for a new set of adaptive controllers, Proc. 33rd IEEE Conf. Dec. Control, Lake Buena Vista, FL, December 1994, pp. 2207–2212.
[5] Kanellakopoulos, I., Kokotović, P.V., and Morse, A.S., Systematic design of adaptive controllers for feedback linearizable systems, IEEE Trans. Auto. Control, 36, 1241–1253, 1991.
[6] Kanellakopoulos, I., Kokotović, P.V., and Morse, A.S., Adaptive nonlinear control with incomplete state information, Int. J. Adapt. Control Sig. Process., 6, 367–394, 1992.
[7] Khalil, H., Adaptive output-feedback control of nonlinear systems represented by input-output models, Proc. 33rd IEEE Conf. Dec. Control, Lake Buena Vista, FL, December 1994, pp. 199–204; also submitted to IEEE Trans. Auto. Control.
[8] Krstić, M., Kanellakopoulos, I., and Kokotović, P.V., Adaptive nonlinear control without overparametrization, Syst. Control Lett., 19, 177–185, 1992.
[9] Krstić, M., Kanellakopoulos, I., and Kokotović, P.V., Nonlinear and Adaptive Control Design, John Wiley & Sons, New York, NY, 1995.
[10] Krstić, M. and Kokotović, P.V., Adaptive nonlinear design with controller-identifier separation and swapping, IEEE Trans. Auto. Control, 40, 426–441, 1995.
[11] Marino, R., Peresada, S., and Valigi, P., Adaptive input-output linearizing control of induction motors, IEEE Trans. Auto. Control, 38, 208–221, 1993.
[12] Marino, R. and Tomei, P., Global adaptive output-feedback control of nonlinear systems, Part I: linear parametrization, IEEE Trans. Auto. Control, 38, 17–32, 1993.
[13] Pomet, J.B. and Praly, L., Adaptive nonlinear regulation: estimation from the Lyapunov equation, IEEE Trans. Auto. Control, 37, 729–740, 1992.
[14] Praly, L., Bastin, G., Pomet, J.-B., and Jiang, Z.P., Adaptive stabilization of nonlinear systems, in Foundations of Adaptive Control, P.V. Kokotović, Ed., Springer-Verlag, Berlin, 1991, 347–434.
[15] Sastry, S.S. and Isidori, A., Adaptive control of linearizable systems, IEEE Trans. Auto. Control, 34, 1123–1131, 1989.
[16] Seto, D., Annaswamy, A.M., and Baillieul, J., Adaptive control of a class of nonlinear systems with a triangular structure, IEEE Trans. Auto. Control, 39, 1411–1428, 1994.
[17] Teel, A.R., Adaptive tracking with robust stability, Proc. 32nd IEEE Conf. Dec. Control, San Antonio, TX, December 1993, pp. 570–575.
Further Reading

Here we have briefly surveyed representative results in adaptive nonlinear control, a research area that has been rapidly growing in the 1990s. The first adaptive backstepping design was developed by Kanellakopoulos, Kokotović and Morse [5]. Its overparametrization was removed by the tuning functions design of Krstić, Kanellakopoulos and Kokotović [8]. Possibilities for extending the class of systems in [5] were studied by Seto, Annaswamy and Baillieul [16]. Among the early estimation-based results are Sastry and Isidori [15], Pomet and Praly [13], etc. They were surveyed in Praly, Bastin, Pomet and Jiang [14]. All of these designs involve some growth conditions. The modular approach of Krstić and Kokotović [10] removed the growth conditions and achieved a complete separation of the controller and the identifier. One of the first output-feedback designs was proposed by Marino and Tomei [12]. Kanellakopoulos, Kokotović and Morse [6] presented a solution to the partial state-feedback problem. A tracking design where the regressor depends only on reference signals was given in Teel [17]. Current efforts in adaptive nonlinear control focus on broadening the class of nonlinear systems for which adaptive controllers are available. Jiang and Pomet [4] developed a design for nonholonomic systems using the tuning functions technique. Khalil [7] and Janković [3] developed semiglobal output-feedback designs for a class which includes some systems not transformable into the output-feedback form. Among the applications of adaptive nonlinear control are biochemical processes (Bastin [1]) and electric motors (Marino, Peresada and Valigi [11], and Dawson, Carroll and Schneider [2]). An important future research topic is the robustness of adaptive nonlinear designs.
For a complete and pedagogical presentation of adaptive nonlinear control the reader is referred to the text Nonlinear and Adaptive Control Design by Krstić, Kanellakopoulos, and Kokotović [9]. The book introduces backstepping and illustrates it with numerous applications (including jet engine, automotive suspension, aircraft wing rock, robotic manipulator, and magnetic levitation). It contains the details of the methods surveyed here and their extensions. It also covers several important topics not mentioned here; among them is the systematic improvement of transient performance. It also shows the advantages of applying adaptive backstepping to linear systems.
57.9 Intelligent Control

Kevin M. Passino, Department of Electrical Engineering, Ohio State University, Columbus, OH

57.9.1 Introduction

Intelligent control,* the discipline in which control algorithms are developed by emulating certain characteristics of intelligent biological systems, is an emerging area of control that is being fueled by advancements in computing technology [1], [16], [18], [8]. For instance, software development and validation tools for expert systems (computer programs that emulate the actions of a human who is proficient at some task) are being used to construct "expert controllers" that seek to automate the actions of a human operator who controls a system. Other knowledge-based systems such as fuzzy systems (rule-based systems that use fuzzy logic for knowledge representation and inference) and planning systems (that emulate human planning activities) are being used in a similar manner to automate the perceptual, cognitive (deductive and inductive), and action-taking characteristics of humans who perform control tasks. Artificial neural networks emulate biological neural networks and have been used (1) to learn how to control systems by observing the way that a human performs a control task, and (2) to learn in an on-line fashion how best to control a system by taking control actions, rating the quality of the responses achieved when these actions are used, then adjusting the recipe used for generating control actions so that the response of the system improves. Genetic algorithms are being used to evolve controllers via off-line computer-aided design of control systems or in an on-line fashion by maintaining a population of controllers and using "survival of the fittest" principles, where "fittest" is defined by the quality of the response achieved by the controller.
For fuzzy, genetic, and neural systems, in addition to software development tools there are several computer chips available that provide efficient parallel implementation so that complex reasoning processes can be implemented in real time. Via these examples we see that computing technology is driving the development of the field of control by providing alternative strategies for the functionality and implementation of controllers for dynamical systems. In fact, there is a trend in the field of control to integrate the functions of intelligent systems, such as those listed above, with conventional control systems to form highly "autonomous" systems that have the capability to perform complex control tasks independently with a high degree of success. This trend toward the development of intelligent autonomous control systems is gaining momentum as control engineers have solved many problems and are naturally seeking control problems where broader issues must be taken into consideration and where
*Partial support for this work came from the National Science Foundation under grants IRI-9210332 and EEC-9315257.
the full range of capabilities of available computing technologies is used. The development of such sophisticated controllers does, however, still fit within the conventional engineering methodology for the construction of control systems. Mathematical modeling using first principles or data from the system, along with heuristics, is used. Some intelligent control strategies rely more on the use of heuristics (e.g., direct fuzzy control), but others utilize mathematical models in the same way that they are used in conventional control, while still others use a combination of mathematical models and heuristics (see, e.g., the approaches to fuzzy adaptive control in [17]). There is a need for systematic methodologies for the construction of controllers. Some methodologies for the construction of intelligent controllers are quite ad hoc (e.g., for the fuzzy controller) yet often effective, since they provide a method and formalism for incorporating and representing the nonlinearities that are needed to achieve high-performance control. Other methodologies for the construction of intelligent controllers are no more ad hoc than ones for conventional control (e.g., for neural and fuzzy adaptive controllers). There is a need for nonlinear analysis of stability, controllability, and observability properties. Although there has been significant recent progress in stability analysis of fuzzy, neural, and expert control systems, there is need for much more work in nonlinear analysis of intelligent control systems. Simulations and experimental evaluations of intelligent control systems are necessary. Comparative analysis of competing control strategies (conventional or intelligent) is, as always, important. Engineering cost-benefit analysis that involves issues of performance, stability, ease of design, lead time to implementation, complexity of implementation, cost, and other issues must be used.
Overall, while the intelligent control paradigm focuses on biologically motivated approaches, it is not clear that there are drastic differences in the behavior of the resulting controllers that are finally implemented (they are not mystical; they are simply nonlinear, often adaptive controllers). This is, however, not surprising since there seems to be an existing conventional control approach that is analogous to every new intelligent control approach that has been introduced. This is illustrated in Table 57.6 below. It is not surprising then that while there seem to be some new concepts growing from the field of intelligent control, there is a crucial role for the control engineer and control scientist to play in evaluating and developing the emerging field of intelligent control. For more detailed discussions on the relationships between conventional and intelligent control see [12], [13]. In this chapter we briefly examine the basic techniques of intelligent control, provide an overview of intelligent autonomous control, and discuss the recent advances that have focused on comparative analysis, modeling, and nonlinear analysis of intelligent control systems. The intent is to provide only a brief introduction to the field of intelligent control and to the next two chapters; the interested reader should consult the references provided at the end of the chapter, or the chapters on fuzzy and neural control for more details.
TABLE 57.6  Analogies Between Conventional and Intelligent Control

Intelligent Control Technique | Conventional Control Approach
Direct fuzzy control | Nonlinear control
Fuzzy adaptive/learning control | Adaptive control and identification
Fuzzy supervisory control | Gain-scheduled control, hierarchical control
Direct expert control | Controllers for automata, Petri nets, and other discrete event systems
Planning systems for control | Certain types of controllers for discrete event systems; receding horizon control of nonlinear systems
Neural control | Adaptive control and identification, optimal control
Genetic algorithms for computer-aided design (CAD) of control systems, controller tuning | CAD using heuristic optimization techniques, optimal control, receding horizon control, and stochastic adaptive control
57.9.2 Intelligent Control Techniques

The major approaches to intelligent control are outlined, and references are provided in case the reader would like to learn more details about any one of these approaches. It is important to note that while each of the approaches is presented separately, in practice there is a significant amount of work being done to determine the best ways to utilize various aspects of each of the approaches in "hybrid" intelligent control techniques. For instance, neural and fuzzy control approaches are often combined. In other cases, neural networks are trained with genetic algorithms. One can imagine justification for integration of just about any permutation of the presented techniques depending on the application at hand.
Fuzzy Control

A fuzzy controller can be designed to roughly emulate the human deductive process (i.e., the process whereby we successively infer conclusions from our knowledge). As shown in Figure 57.29, the fuzzy controller consists of four main components. The rule base holds a set of "if-then" rules that are quantified via fuzzy logic and used to represent the knowledge that human experts may have about how to solve a problem in their domain of expertise. The fuzzy inference mechanism successively decides what rules are most relevant to the current situation and applies the actions indicated by these rules. The fuzzification interface converts numeric inputs into a form that the fuzzy inference mechanism can use to determine which knowledge in the rule base is most relevant at the current time. The defuzzification interface combines the conclusions reached by the fuzzy inference mechanism and provides a numeric value as an output. Overall, the fuzzy control design methodology, which primarily involves the specification of the rule base, provides a heuristic technique to construct nonlinear controllers, and this seems to be one of its main advantages. For more details on direct fuzzy control, see the next chapter.

Figure 57.29  Fuzzy control system.

Often it is the case that we have higher-level knowledge about how to control a process, such as information on how to tune the controller while it is in operation or how to coordinate the application of different controllers based on the operating point of the system. For instance, in aircraft control, certain key variables are used in the tuning ("scheduling") of control laws, and fuzzy control provides a unique approach to the construction and implementation of such a gain scheduler. In process control, engineers or process operators often have a significant amount of heuristic expertise on how to tune proportional-integral-derivative (PID) controllers while they are in operation, and such information can be loaded into the rule base of a "fuzzy PID tuner" and used to make sure that a PID controller is tuned properly at all times. In the more general case, we may have knowledge of how to tune and coordinate the application of conventional or fuzzy controllers, and this can be used in a rule-based supervisor as shown in Figure 57.30. For more details on fuzzy supervisory control, see the next chapter.
Figure 57.30  Rule-based supervisory control.
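The four fuzzy-controller components described above can be sketched in a few lines of Python. The triangular membership functions, the three-rule base, and the scalar error input are illustrative assumptions for a minimal single-input controller, not a prescription from the text.

```python
# Minimal sketch of the four fuzzy-controller components: fuzzification,
# rule base, inference, and defuzzification. Membership functions, rules,
# and the scalar error input are all illustrative assumptions.

def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(e):
    """Fuzzification interface: membership of the error in three fuzzy sets."""
    return {
        "neg":  tri(e, -2.0, -1.0, 0.0),
        "zero": tri(e, -1.0,  0.0, 1.0),
        "pos":  tri(e,  0.0,  1.0, 2.0),
    }

# Rule base: "if error is X then control is U" with crisp consequent centers.
RULES = {"neg": -1.0, "zero": 0.0, "pos": 1.0}

def fuzzy_controller(e):
    mu = fuzzify(e)                                 # fuzzification
    num = sum(mu[s] * RULES[s] for s in RULES)      # inference: weight rules
    den = sum(mu.values())
    return num / den if den > 0 else 0.0            # center-average defuzz.

print(fuzzy_controller(0.0))   # zero error -> zero control
print(fuzzy_controller(0.5))   # interpolates between "zero" and "pos" rules
```

Note how the rule base is the only piece a designer normally edits; the other three components are generic machinery, which is what makes the method a heuristic route to nonlinear controllers.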
In other fuzzy control approaches, rather than implementing deductive systems, the goal is to implement an "inductive system," i.e., one that can learn and generalize from particular examples (e.g., examples of how the system is behaving). Such approaches
typically fall under the title of "fuzzy learning control" or "fuzzy adaptive control." In one approach, shown in Figure 57.31, called "fuzzy model reference learning control" (FMRLC) [10], there is a fuzzy controller with a rule base that has no knowledge about how to control the system. A "reference model" with output y_m(t) is used to characterize how you would like the closed-loop system to behave (i.e., it holds the performance specifications). Then, a learning mechanism compares y(t) to y_m(t) (i.e., the way that the system is currently performing to how you would like it to perform) and decides how to synthesize/tune the fuzzy controller so that the difference between y(t) and y_m(t) goes to zero and hence the performance objectives are met.
Figure 57.31  Fuzzy model reference learning control.
Overall, our experiences with the FMRLC seem to indicate that significant advantages may be obtained if one can implement a controller that can truly learn from its experiences (while forgetting appropriate information) so that when a similar situation is encountered repeatedly, the controller already has a good idea of how to react. This seems to represent an advance over some adaptive controllers where parameters are adapted in such a way that each time the same situation is encountered, some amount of re-adaptation must occur, no matter how often this situation is encountered (for more details on this and other fuzzy learning/adaptive control approaches, see the next chapter or [17]).
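The learning mechanism's role can be illustrated with a toy sketch: the consequent of the rule most active for the current error is shifted in proportion to the reference-model tracking error y_m − y, so the adjustment is retained for similar future situations. The update law, gain, and rule partition below are illustrative assumptions, not the FMRLC's exact synthesis procedure.

```python
# Toy sketch of an FMRLC-style learning mechanism: consequent centers of a
# small rule base are adjusted in proportion to the tracking error ym - y,
# so rules that fire in a recurring situation retain what was learned.
# The gain and the rule partition are illustrative assumptions.

def nearest_rule(e, centers):
    """Index of the rule whose premise center is closest to the error e."""
    return min(range(len(centers)), key=lambda i: abs(e - centers[i]))

def fmrlc_step(consequents, rule_centers, e, ym, y, gain=0.5):
    """Shift the consequent of the rule most active for error e by an
    amount proportional to the reference-model tracking error ym - y."""
    i = nearest_rule(e, rule_centers)
    consequents[i] += gain * (ym - y)
    return consequents[i]   # control contribution when this rule fires next

rule_centers = [-1.0, 0.0, 1.0]
consequents = [0.0, 0.0, 0.0]      # controller starts with no knowledge
fmrlc_step(consequents, rule_centers, e=1.0, ym=1.0, y=0.4)
print(consequents)   # only the rule near e = 1.0 was updated
```

The key contrast with ordinary parameter adaptation is locality: only the rule responsible for the current situation is touched, so knowledge stored in other rules is not forgotten.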
Expert and Planning Systems
While fuzzy control techniques similar to those described above have been employed in a variety of industrial control applications, more general "knowledge-based controllers" have also been used successfully. For instance, there are "expert controllers" that are being used to directly control complex processes or in a supervisory role similar to that shown in Figure 57.30 [1], [11]. Others are being used to supervise conventional control algorithms. For instance, the work by Astrom, Anton, and Arzen [5] describes the use of expert supervisory systems for conventional adaptive controllers. Expert systems are also being used as the basis for learning controllers. In addition, there are planning systems (computer programs that emulate the way that experts plan) that have been used in path planning and high-level decisions about control tasks for robots [1], [16], [14], [6]. A generic planning system taken from Passino and Antsaklis [14], configured in the architecture of a standard control system, is shown in Figure 57.32. Here, the "problem
domain" is the environment in which the planner operates, i.e., the plant. There are measured outputs y_i at step i (variables of the problem domain that can be sensed in real time), control actions u_i (the ways in which we can affect the problem domain), disturbances d_i (that represent random events that can affect the problem domain and hence the measured variable y_i), and goals g_i (what we would like to achieve in the problem domain). There are closed-loop specifications that quantify performance specifications and stability requirements. It is the task of the planner in Figure 57.32 to monitor the measured outputs and goals and to generate control actions that will counteract the effects of the disturbances and result in the goals and the closed-loop specifications being achieved. To do this, the planner performs "plan generation," whereby it projects into the future (usually a finite number of steps, and, often, a model of the problem domain is used) and tries to determine a set of candidate plans. Next, this set of plans is pruned to one plan that is the best one to apply at the current time. The plan is executed and, during execution, the performance resulting from the plan is monitored and evaluated. Often, due to disturbances, plans will fail, and hence the planner must generate a new set of candidate plans, select one, then execute that one. While not pictured in Figure 57.32, some systems use "situation assessment" to try to estimate the state of the problem domain (this can be useful in execution monitoring and in plan generation); others perform "world modeling," where a model of the problem domain is developed in an on-line fashion (similar to on-line system identification), and "planner design," where information from the world modeler is used to tune the planner (so that it makes the right plans for the current problem domain). The reader will, perhaps, think of such a planning system as a general "self-tuning regulator."
For more details on the use of planning systems for control see [1], [14], [6].
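The plan-generate, prune, and execute cycle described above can be sketched in a receding-horizon style: candidate plans are projected forward with a model of the problem domain, the best one is kept, and only its first action is executed before replanning. The scalar integrator model, horizon, and cost below are illustrative assumptions.

```python
# Sketch of the plan-generation / pruning / execution cycle: enumerate
# candidate plans (action sequences), project each with a model of the
# problem domain, keep the best, execute its first step, then replan.
# The integrator model, action set, horizon, and cost are illustrative.

from itertools import product

def plan_and_act(state, goal, actions=(-1.0, 0.0, 1.0), horizon=3):
    def model(x, u):            # assumed model of the problem domain
        return x + 0.5 * u      # simple integrator dynamics

    best_plan, best_cost = None, float("inf")
    for plan in product(actions, repeat=horizon):   # plan generation
        x, cost = state, 0.0
        for u in plan:                              # project into the future
            x = model(x, u)
            cost += abs(goal - x)                   # distance from the goal
        if cost < best_cost:                        # prune to one best plan
            best_plan, best_cost = plan, cost
    return best_plan[0]                             # execute the first action

print(plan_and_act(state=0.0, goal=1.0))   # first action steps toward the goal
```

Replanning after every executed action is what lets the scheme recover when a disturbance invalidates the remainder of the current plan.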
Neural Networks for Control

There has been a flurry of activity in the use of artificial neural networks for control [1], [18], [8], [9]. In this approach engineers are trying to emulate the low-level biological functions of the brain and to use these to solve challenging control problems. For instance, for some control problems we may train an artificial neural network to remember how to regulate a system by repeatedly providing it with examples of how to perform such a task. After the neural network has learned the task, it can be used to recall the control input for each value of the sensed output. Some other approaches to neural control, taken from Hunt et al. [9] (and the references therein), include a neural "internal model control" method and a "model reference structure" that is based on an approach to using neural networks for system identification. Still other neural control approaches bear some similarities to the FMRLC in Figure 57.31 in the sense that they automatically learn how to control a system by observing the behavior from that system. For instance, in Figure 57.33 we show a "neural predictive control" approach from [9] where one neural network is used as an identifier (structure) for the plant and another is used
Figure 57.32  Closed-loop planning systems.
as a feedback controller for the plant that is tuned on-line. This tuning proceeds at each step by having the "optimizer" specify an input u' for the neural model of the plant over some time interval. The predicted behavior of the plant y' is obtained and used by the optimizer, along with y_m, to pick the best parameters of the neural controller so that the difference between the plant and reference model outputs is as small as possible (if y' predicts y well, we would expect that the optimizer would be quite successful at tuning the controller). For more details on the multitude of techniques for using neural networks for control see the later chapter in this section or [9].
Figure 57.33  Neural predictive control.
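The optimizer's role in the loop above can be sketched as follows: an identified model of the plant predicts y' for each candidate input u', and the input whose prediction is closest to the desired output y_m is applied. The one-neuron stand-in for the identifier network and the grid search are illustrative assumptions; in practice the identifier is a trained multilayer network and the optimizer adjusts controller parameters rather than searching inputs directly.

```python
# Sketch of the neural predictive control idea: an identified model of the
# plant predicts y' for candidate inputs u', and an optimizer picks the u
# whose prediction best matches the reference output ym. The one-neuron
# "identifier" and the grid-search optimizer are illustrative assumptions.

import math

def neural_model(x, u, w=(0.8, 0.5)):
    """One-neuron stand-in for the identifier network: y' = tanh(w0*x + w1*u)."""
    return math.tanh(w[0] * x + w[1] * u)

def predictive_control(x, ym, candidates=None):
    """Optimizer: choose u minimizing the predicted tracking error (y' - ym)^2."""
    if candidates is None:
        candidates = [i / 10.0 for i in range(-20, 21)]   # u in [-2, 2]
    return min(candidates, key=lambda u: (neural_model(x, u) - ym) ** 2)

u = predictive_control(x=0.0, ym=0.5)
print(u)
```

As the text notes, the scheme works only as well as the identifier: if y' predicts y poorly, the optimizer minimizes the wrong surrogate.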
Genetic Algorithms for Control

A genetic algorithm uses the principles of evolution, natural selection, and genetics from natural biological systems in a computer algorithm to simulate evolution [7]. Essentially, the genetic algorithm performs a parallel, stochastic, but directed search to evolve the most fit population. It has been shown that a genetic algorithm can be used effectively in the (off-line) computer-aided design of control systems because it can artificially "evolve" an appropriate controller that meets the performance specifications to the greatest extent possible. To do this, the genetic algorithm maintains a population of strings that each represent a different controller, and it uses the genetic operators of "reproduction" (which represents the "survival of the fittest" characteristic of evolution), "crossover" (which represents "mating"), and "mutation" (which represents the random introduction of new "genetic material"), coupled with a "fitness measure" (which often quantifies the performance objectives), to generate
successive generations of the population. After many generations, the genetic algorithm often produces an adequate solution to a control design problem since the stochastic, but directed, search helps avoid locally optimal designs and seeks to obtain the best design possible. Another more challenging problem is how to evolve controllers while the system is operating, rather than in off-line design. Recently, progress in this direction has been made by the introduction of the "genetic model reference adaptive controller" (GMRAC) shown in Figure 57.34 [15]. As in the FMRLC, the GMRAC uses a reference model to characterize the desired performance. For the GMRAC, a genetic algorithm maintains a population of strings that represents candidate controllers. This genetic algorithm uses a process model (e.g., a linear model of the process) and data from the process to evaluate the fitness of each controller in the population at each time step. Using this fitness evaluation, the genetic algorithm propagates controllers into the next generation via the standard genetic operators. The controller that is the most fit one in the population is used to control the system. This allows the GMRAC to automatically evolve a controller from generation to generation (i.e., from one time step to the next) and hence to tune a controller in response to changes in the process (e.g., due to temperature variations, parameter drift, etc.) or due to an on-line change of the specifications in the reference model. Early indications are that the GMRAC seems quite promising as a new technique for stochastic adaptive control, since it provides a unique feature whereby alternative controllers can be applied quickly to the problem if they appear useful, and since it has some inherent capabilities to learn via evolution of its population of controllers. There is, however, a significant amount of comparative and nonlinear analysis that needs to be done to more fully evaluate this approach to control.
Figure 57.34  Genetic model reference adaptive controller.
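The off-line design use of a genetic algorithm can be sketched compactly: a population of candidate controllers (here, scalar feedback gains) is evolved through fitness evaluation, selection, crossover, and mutation. The scalar plant model, fitness measure, and GA parameters are illustrative assumptions, not a design from the text.

```python
# Sketch of a genetic algorithm evolving a population of candidate
# controllers (scalar feedback gains) via fitness evaluation, selection
# ("survival of the fittest"), crossover ("mating"), and mutation.
# The plant model, fitness measure, and GA parameters are illustrative.

import random

def fitness(k):
    """Rate a gain k by simulating x(t+1) = x(t) - k*x(t) from x = 1 and
    penalizing accumulated error (slow decay or instability)."""
    x, cost = 1.0, 0.0
    for _ in range(20):
        x = x - k * x          # closed-loop step of the assumed scalar plant
        cost += abs(x)
    return -cost               # higher fitness = smaller accumulated error

def evolve(generations=30, pop_size=20, seed=1):
    rng = random.Random(seed)
    pop = [rng.uniform(-2.0, 2.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # selection (elitist)
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = 0.5 * (a + b)                 # crossover (blend)
            child += rng.gauss(0.0, 0.1)          # mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(best)   # should approach k = 1, which drives the assumed plant to zero
```

The GMRAC uses the same operators, but evaluates fitness against a process model at every time step and applies the fittest member on-line.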
57.9.3 Autonomous Control

The goal of the field of autonomous control is to design control systems that automate enough functions so that they can perform well independently under significant uncertainties for extended periods of time, even if there are significant system failures or disturbances. Next we overview some of the basic ideas from [1], [3], [2] on how to specify controllers that can in fact achieve high levels of autonomy.
The Intelligent Autonomous Controller

Figure 57.35 shows a functional architecture for an intelligent autonomous controller with an interface to the process (plant) involving sensing (e.g., via conventional sensing technology, vision, touch, smell, etc.), actuation (e.g., via hydraulics, robotics, motors, etc.), and an interface to humans (e.g., a driver, pilot, crew, etc.) and other systems. The "execution level" has low-level numeric signal processing and control algorithms (e.g., PID, optimal, or adaptive control; parameter estimators; failure detection and identification (FDI) algorithms). The "coordination level" provides for tuning, scheduling, supervising, and redesigning the execution-level algorithms, crisis management, planning and learning capabilities for the coordination of execution-level tasks, and higher-level symbolic decision making for FDI and control algorithm management. The "management level" provides for the supervision of lower-level functions and for managing the interface to the human(s). In particular, the management level interacts with the users in generating goals for the controller and in assessing capabilities of the system. The management level also monitors performance of the lower-level systems, plans activities at the highest level (and in cooperation with the human), and performs high-level learning about the user and the lower-level algorithms. Applications that have used this type of architecture can be found in, e.g., [1], [16]. Intelligent systems/controllers (e.g., fuzzy, neural, genetic, expert, etc.) can be employed as appropriate in the implementation of various functions at the three levels of the intelligent autonomous controller (e.g., adaptive fuzzy control may be used at the execution level; planning systems may be used at the management level for sequencing operations; and genetic algorithms may be used in the coordination level to pick an optimal coordination strategy).
Hierarchical controllers composed of a hybrid mix of intelligent and conventional systems are commonly used in the intelligent control of complex dynamical systems. This is due to the fact that to achieve high levels of autonomy, we often need high levels of intelligence, which calls for incorporation of a diversity of decision-making approaches for complex dynamic reasoning. Several fundamental characteristics have been identified for intelligent autonomous control systems (see [1], [16], [3], [2] and the references therein). For example, there is generally a successive delegation of duties from the higher to lower levels, and the number of distinct tasks typically increases as we go down the hierarchy. Higher levels are often concerned with slower aspects of the system's behavior and with its larger portions, or broader aspects. There is then a smaller contextual horizon at
Figure 57.35  Intelligent autonomous controller.
lower levels; i.e., the control decisions are made by considering less information. Higher levels are typically concerned with longer time horizons than lower levels. It is said that there is "increasing intelligence with decreasing precision as one moves from the lower to the higher levels" (see [16] and the references therein). At the higher levels there is typically a decrease in time scale density, a decrease in bandwidth or system rate, and a decrease in the decision (control action) rate. In addition, there is typically a decrease in granularity of models used, or equivalently, an increase in model abstractness at the higher levels. Finally, we note that there is an ongoing evolution of the intelligent functions of an autonomous controller, so that by the time one implements its functions they no longer appear intelligent, just algorithmic. It is due to this evolution principle and the fact that implemented intelligent controllers are nonlinear controllers that many researchers feel more comfortable focusing on achieving autonomy rather than on whether the resulting controller is intelligent.
The Control-Theoretic View of Autonomy

Next it is explained how to incorporate the notion of autonomy into the conventional manner of thinking about control problems. Consider the general control system shown in Figure 57.36, where P is a model of the plant, C represents the controller, and T represents specifications on how we would like the closed-loop system to behave (i.e., closed-loop specifications). For some classical control problems, the scope is limited so that C and P are linear and T represents simply, for example, stability, rise time, overshoot, and steady-state tracking error specifications. In this case, intelligent control techniques may not be
needed. For engineers, the simplest solution that works is the best one. We tend to need more complex controllers for more complex plants (where, for example, there is a significant amount of uncertainty) and more demanding closed-loop specifications T (see [16], [1] and the references therein).
Figure 57.36  Control system.
Consider the case where

1. P is so complex that it is most convenient to represent it with ordinary differential equations and discrete-event system (DES) models (or some other "hybrid" mix of models), and for some parts of the plant the model is not known (i.e., it may be too expensive to find).

2. T is used to characterize the desire to make the system perform well and act with high degrees of autonomy (i.e., so that the system performs well under significant uncertainties in the system and its environment for extended periods of time, and compensates for significant system failures without external intervention [1]).

The general control problem is how to construct C, given P, so that T holds. The intelligent autonomous controller described briefly in the previous section provides a general architecture for C to achieve highly autonomous behavior specified by T for very complex plants P.
Modeling and Analysis

Conventional approaches10 to modeling P include the use of ordinary differential or difference equations, partial differential equations, stochastic models, models for hierarchical and distributed systems, and so on. However, often some portion of P is more easily represented with an automata model, Petri net, or some other DES model. Moreover, for analysis of the closed-loop system where a variety of intelligent and conventional controllers are working together to form C (e.g., a planning system and a conventional adaptive controller), there is the need for "hybrid" modeling formalisms that can represent dynamics
10 The reader can find references for the work on modeling and analysis of intelligent control systems discussed in [13].
with differential equations and DES models. Control engineers use a model of the plant P to aid in the construction of the model of the controller C, and this model is then implemented to control the real plant. In addition, models of C and P are used as formalisms to represent the dynamics of the closed-loop system so that analysis of the properties of the feedback system is possible before implementation (for redesign, verification, certification, and safety analysis). If the model of P is chosen to be too complex and C is very complex, it will be difficult to develop and utilize mathematical approaches for the analysis of the resulting closed-loop system. Often we want the simplest possible model P that will allow for the development of the (simplest) controller C and for it to be proven/demonstrated that the closed-loop specifications T are met (of course, a separate, more complex model P may be needed for simulation). Unfortunately, there is no clear answer to the question of how much or what type of modeling is needed for the plant P, and there is no standardization of models for intelligent control in the way that there is for many areas of conventional control. Hence, although it is not exactly clear how to proceed with the modeling task, it is clear that knowledge of many different types of models may be needed, depending on the task at hand. Given the model of the plant P and the model of the controller C, the next task often considered by a control engineer is the use of analysis to more fully understand the behavior of P or the closed-loop system, and to show that when C and P are connected, the closed-loop specifications T are satisfied. Formal mathematical analysis can be used to verify stability, controllability, and observability properties. There is, in fact, a growing body of literature on nonlinear analysis (e.g., stability and describing function analysis) of fuzzy control systems (both direct and adaptive [17]).
There is a significant amount of activity in the area of nonlinear analysis of neural control systems, and there are past results on nonlinear analysis of (numerical) learning control systems (see the later chapter in this section of this book and [9]). There have already been some applications of DES theory to artificial intelligence (AI) planning systems, and there have been recent results on stability analysis of expert control systems [11]. There has been recent progress in defining models and developing approaches to analysis for some hybrid systems, but there is the need for much more work in this area. Many fundamental modeling and representation issues need to be reconsidered; different design objectives and control structures need to be examined; our repertoire of approaches to analysis and design needs to be expanded; and there is the need for more work in the area of simulation and experimental evaluation for hybrid systems. The importance of the solution to the hybrid control system analysis problem is based on the importance of solving the general control problem described above; that is, hybrid system analysis techniques could provide an approach to verifying the operation of intelligent controllers that seek to obtain truly autonomous operation. Finally, we must emphasize that while formal verification of the properties of a control system is important, simulation and experimental evaluation always play an especially important role also.
57.9.4 Concluding Remarks

We have provided a brief overview of the main techniques in the field of intelligent control and have provided references for the reader who is interested in investigating the details of any one of these techniques. The intent of this chapter is to provide an overview of a quickly emerging area of control, while the intent of the next two chapters is to provide an introduction to two of the more well-developed areas in intelligent control: fuzzy and neural control. In a chapter so brief, it seems important to indicate what has been omitted. We have not discussed:
1. Failure detection and identification (FDI) methods, which are essential for a truly autonomous controller
2. Reconfigurable (fault-tolerant) control strategies that use conventional nonlinear robust control techniques and intelligent control techniques
3. Sensor fusion and integration techniques that will be needed for autonomous control
4. Architectures for intelligent and autonomous control systems (e.g., alternative ways to structure interconnections of intelligent subsystems)
5. Distributed intelligent systems
6. Applications.
Moreover, the debate over the meaning of the word "intelligent" in the field of intelligent control has been purposefully avoided. The reader interested in a detailed discussion on this topic should see [4].
57.9.5 Defining Terms: Expert system: A computer program designed to emulate the actions of a human who is proficient at some task. Often the expert system is broken into two components: a "knowledge base" that holds information about the problem domain, and an inference mechanism (engine) that evaluates the current knowledge and decides what actions to take. An "expert controller" is an expert system that is designed to automate the actions of a human operator who controls a system. Fuzzy system: A type of knowledge-based system that uses fuzzy logic for knowledge representation and inference. It is composed of four primary components: the fuzzification interface, the rule base (a knowledge base that is composed of rules), an inference mechanism, and a defuzzification interface. A fuzzy system that is used to control a system is called a "fuzzy controller." Genetic algorithm: A genetic algorithm uses the principles of evolution, natural selection, and genetics from natural biological systems in a computer algorithm to simulate evolution. Essentially, the genetic algorithm performs a parallel, stochastic, but directed search to evolve the most fit population.
THE CONTROL HANDBOOK
Intelligent autonomous control system: A control system that uses conventional and intelligent control techniques to provide enough automation so that the system can perform well independently under significant uncertainties for extended periods of time, even if there are significant system failures or disturbances. Neural network: Artificial hardware (e.g., electrical circuits) designed to emulate biological neural networks. These may be simulated on conventional computers or on specially designed "neural processors." Planning system: Computer program designed to emulate human planning activities. These may be a type of expert system that has a special knowledge base that has plan fragments and strategies for planning and an inference process that generates and evaluates alternative plans.
References

[1] Antsaklis, P.J. and Passino, K.M., Eds., An Introduction to Intelligent and Autonomous Control, Kluwer Academic Publishers, Norwell, MA, 1993.
[2] Antsaklis, P.J., Passino, K.M., and Wang, S.J., An introduction to autonomous control systems, IEEE
Control Syst. Mag., Special Issue on Intelligent Control, 11(4), 5-13, June 1991.
[3] Antsaklis, P.J., Passino, K.M., and Wang, S.J., Towards intelligent autonomous control systems: architecture and fundamental issues, J. Intelligent Robotic Syst., 1(4), 315-342, 1989.
[4] Antsaklis, P.J., Defining intelligent control, report of the IEEE Control Systems Society Task Force on Intelligent Control (P.J. Antsaklis, Chair), IEEE Control Systems Magazine, 14(3), 4-5, 58-66, June 1994; see also Proc. IEEE Int. Conf. Intelligent Control, Columbus, OH, August 1994. (Coauthors include: J. Albus, P.J. Antsaklis, K. Baheti, J.D. Birdwell, M. Lemmon, M. Mataric, A. Meystel, K. Narendra, K. Passino, H. Rauch, G. Saridis, H. Stephanou, P. Werbos)
[5] Astrom, K.J., Anton, J.J., and Arzen, K.E., Expert control, Automatica, 22, 277-286, 1986.
[6] Dean, T. and Wellman, M.P., Planning and Control, Morgan Kaufmann, CA, 1991.
[7] Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[8] Gupta, M.M. and Sinha, N.K., Eds., Intelligent Control: Theory and Applications, IEEE Press, Piscataway, NJ, 1996.
[9] Hunt, K.J., Sbarbaro, D., Zbikowski, R., and Gawthrop, P.J., Neural networks for control systems - a survey, in Neuro-Control Systems: Theory and Applications, Gupta, M.M. and Rao, D.H., Eds.,
IEEE Press, NJ, 1994.
[10] Layne, J.R. and Passino, K.M., Fuzzy model reference learning control for cargo ship steering, IEEE Control Syst. Mag., 13(6), 23-34, December 1993.
[11] Passino, K.M. and Lunardhi, A.D., Qualitative analysis of expert control systems, in Intelligent Control: Theory and Applications, Gupta, M.M. and Sinha, N.K., Eds., IEEE Press, Piscataway, NJ, 1996.
[12] Passino, K.M., Bridging the gap between conventional and intelligent control, Special Issue on Intelligent Control, IEEE Control Syst. Mag., 13(4), June 1993.
[13] Passino, K.M., Towards bridging the perceived gap between conventional and intelligent control, in Intelligent Control: Theory and Applications, Gupta, M.M. and Sinha, N.K., Eds., IEEE Press, Piscataway, NJ, 1996.
[14] Passino, K.M. and Antsaklis, P.J., A system and control theoretic perspective on artificial intelligence planning systems, Int. J. Appl. Artif. Intelligence, 3, 1-32, 1989.
[15] Porter, L.L. and Passino, K.M., Genetic model reference adaptive control, Proc. IEEE Int. Symp. Intelligent Control, 219-224, Columbus, OH, August 16-18, 1994.
[16] Valavanis, K.P. and Saridis, G.N., Intelligent Robotic Systems: Theory, Design, and Applications, Kluwer Academic Publishers, Norwell, MA, 1992.
[17] Wang, Li-Xin, Adaptive Fuzzy Systems and Control, Prentice Hall, Englewood Cliffs, NJ, 1994.
[18] White, D.A. and Sofge, D.A., Eds., Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, Van Nostrand Reinhold, NY, 1992.

Further Reading

While the books and articles referenced above (particularly [1] and [18]) should provide the reader with an introduction to the area of intelligent control, there are other sources that may also be useful. For instance, there are many relevant conferences, including:

IEEE International Symposium on Intelligent Control
American Control Conference
IEEE Conference on Decision and Control
IEEE Conference on Control Applications
IEEE International Conference on Systems, Man, and Cybernetics

In addition, there are many conferences on fuzzy systems, expert systems, genetic algorithms, and neural networks where applications to control are often studied. There are many journals that cover the topic of intelligent control, including:

IEEE Control Systems Magazine
IEEE Transactions on Control Systems Technology
IEEE Transactions on Systems, Man, and Cybernetics
IEEE Transactions on Fuzzy Systems
IEEE Transactions on Neural Networks
Engineering Applications of Artificial Intelligence
Journal of Intelligent and Robotic Systems
Applied Artificial Intelligence
Journal of Intelligent and Fuzzy Systems
Journal of Intelligent Control and Systems

There are many other journals on expert systems, neural networks, genetic algorithms, and fuzzy systems where applications to control can often be found. The professional societies most active in intelligent control are:

IEEE Control Systems Society
International Federation of Automatic Control
IEEE Systems, Man, and Cybernetics Society

57.10 Fuzzy Control11

Kevin M. Passino, Department of Electrical Engineering, The Ohio State University, Columbus, OH
Stephen Yurkovich, Department of Electrical Engineering, The Ohio State University, Columbus, OH

11 This work was supported in part by National Science Foundation Grants IRI 9210332 and EEC 9315257.

57.10.1 Introduction

When confronted with a control problem for a complicated physical process, the control engineer usually follows a predetermined design procedure, which begins with the need for understanding the process and the primary control objectives. A good example of such a process is that of an automobile "cruise control," designed with the objective of providing the automobile with the capability of regulating its own speed at a driver-specified set-point (e.g., 55 mph). One solution to the automotive cruise control problem involves adding an electronic controller that can sense the speed of the vehicle via the speedometer and actuate the throttle position so as to regulate the vehicle speed at the driver-specified value even if there are road grade changes, head-winds, or variations in the number of passengers in the automobile. Control engineers typically solve the cruise control problem by (1) developing a model of the automobile dynamics (which may model vehicle and power train dynamics, road grade variations,
etc.); (2) using the mathematical model to design a controller (e.g., via a linear model, develop a linear controller with techniques from classical control); (3) using the mathematical model of the closed-loop system and mathematical or simulation-based analysis to study its performance (possibly leading to redesign); and (4) implementing the controller via, for example, a microprocessor, and evaluating the performance of the closed-loop system (again possibly leading to redesign). See Chapter 74 for a sophisticated example of such a system. The difficult task of modeling and simulating complex real-world systems for control systems development, especially when implementation issues are considered, is well documented. Even if a relatively accurate model of a dynamic system can be developed, it is often too complex to use in controller development, especially for many conventional control design procedures that require restrictive assumptions for the plant (e.g., linearity). It is for this reason that in practice conventional controllers are often developed via simple crude models of the plant behavior that satisfy the necessary assumptions, and via the ad hoc tuning of relatively simple linear or nonlinear controllers. Regardless, it is well understood (although sometimes forgotten) that heuristics enter the design process when the conventional control design process is used, as long as one is concerned with the actual implementation of the control system. It must be acknowledged, however, that conventional control engineering approaches that use appropriate heuristics to tune the design have been relatively successful (the vast majority of all controllers currently in operation are conventional PID controllers).
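The four-step procedure can be made concrete with a deliberately crude sketch. Everything numeric below is our own illustrative assumption, not from the chapter: a first-order vehicle model (mass plus lumped drag) stands in for step (1), a hand-tuned PI controller for step (2), and an Euler-integration loop for the simulation study of step (3).

```python
# Steps (1)-(3) of the conventional design procedure, in miniature.
# All parameter values are invented for illustration.
m, b = 1300.0, 50.0          # step (1): vehicle mass (kg), lumped drag coefficient
kp, ki = 600.0, 40.0         # step (2): PI gains, "tuned" by trial and error
dt = 0.01                    # integration step (s)
setpoint = 24.6              # desired speed, roughly 55 mph in m/s

v, integ = 20.0, 0.0         # initial speed and integrator state
for _ in range(int(60 / dt)):            # step (3): simulate 60 s of closed loop
    e = setpoint - v                     # speed error
    integ += e * dt
    u = kp * e + ki * integ              # throttle force command (N)
    v += dt * (-b * v + u) / m           # Euler step of the first-order model

print(round(v, 2))           # speed settles near the driver-specified set-point
```

In step (4) this controller would be coded on a microprocessor and evaluated on the real vehicle; the heuristics mentioned in the text enter exactly when kp and ki are retuned against the real plant.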
One may ask the following questions: How much of the success can be attributed to the use of the mathematical model and conventional control design approach, and how much should be attributed to the clever heuristic tuning that the control engineer uses upon implementation? If we exploit the use of heuristic information throughout the entire design process, can we obtain higher-performance control systems? Fuzzy control provides a formal methodology for representing, manipulating, and implementing a human's heuristic knowledge about how to control a system. Fuzzy controller design involves incorporating human expertise on how to control a system into a set of rules (a rule base). The inference mechanism in the fuzzy controller reasons over the information in the knowledge base, the process outputs, and the user-specified goals to decide what inputs to generate for the process so that the closed-loop fuzzy control system will behave properly (e.g., so that the user-specified goals are met). For the cruise control example discussed above, it is clear that anyone who has experience in driving a car can practice regulating the speed about a desired set-point and load this information into a rule base. For instance, one rule that a human driver may use is "IF speed is lower than the set-point THEN press down further on the accelerator pedal." A rule that would represent even more detailed information about how to regulate the speed would be "IF speed is lower than the set-point AND speed is approaching the set-point very fast THEN release the accelerator pedal by a small amount." This second rule characterizes our knowledge about how to make sure that we do not overshoot our desired (goal) speed. Generally speaking, if
we load very detailed expertise into the rule base, we enhance our chances of obtaining better performance. Overall, the focus in fuzzy control is on the use of heuristic knowledge to achieve good control, whereas in conventional control the focus is on the use of a mathematical model for control systems development and subsequent use of heuristics in implementation.
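The two quoted driver rules can be read as a crisp caricature of a rule base. The thresholds and pedal increments below are invented for illustration; a real fuzzy controller would replace these hard IF boundaries with overlapping membership functions, as described in Section 57.10.2.

```python
def pedal_adjustment(speed, setpoint, accel):
    """Crisp caricature of the two heuristic cruise-control rules quoted
    above (thresholds and increments are our own invented values)."""
    error = setpoint - speed
    if error > 0 and accel > 1.0:   # lower than set-point AND closing very fast
        return -0.05                # release the accelerator by a small amount
    if error > 0:                   # speed lower than the set-point
        return +0.10                # press down further on the accelerator
    return 0.0                      # otherwise leave the pedal alone

print(pedal_adjustment(50.0, 55.0, 0.2))   # below set-point, closing slowly
```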
Philosophy of Fuzzy Control

Due to the substantial amount of hype and excitement about fuzzy control, it is important to begin by providing a sound control engineering philosophy for this approach. First, there is a need for the control engineer to assess what (if any) advantages fuzzy control methods have over conventional methods. Time permitting, this must be done by careful comparative analyses involving modeling, mathematical analysis, simulation, implementation, and a full engineering cost-benefit analysis (which involves issues of cost, reliability, maintainability, flexibility, lead time to production, etc.). When making the assessment of what control technique to use, the engineer should be cautioned that most work in fuzzy control to date has only focused on its advantages and has not taken a critical look at what possible disadvantages there could be to using it. For example, the following questions are cause for concern: Will the behaviors observed by a human expert include all situations that can occur due to disturbances, noise, or plant parameter variations? Can the human expert realistically and reliably foresee problems that could arise from closed-loop system instabilities or limit cycles? Will the expert be able to effectively incorporate stability criteria and performance objectives (e.g., rise-time, overshoot, and tracking specifications) into a rule base to ensure that reliable operation can be obtained? Can an effective and widely used synthesis procedure be devoid of mathematical modeling and subsequent use of proven mathematical analysis tools?
These questions may seem even more troublesome if: (1) the control problem involves a "critical environment" where the failure of the control system to meet performance objectives could lead to loss of human life or an environmental disaster (e.g., in aircraft or nuclear power plant control), or (2) the human expert's knowledge implemented in the fuzzy controller is somewhat inferior to that of a very experienced specialist whom we expect to have design the control system (different designers have different levels of expertise). Clearly, then, for some applications there is a need for a methodology to develop, implement, and evaluate fuzzy controllers to ensure that they are reliable in meeting their performance specifications. As discussed above, the standard control engineering methodology involves repeatedly coordinating the use of modeling, controller (re)design, simulation, mathematical analysis, and experimental evaluations to develop control systems. What is the relevance of this established methodology to the development of fuzzy control systems? Engineering a fuzzy control system uses many ideas from the standard control engineering methodology, except that in fuzzy control it is often said that a formal mathematical model is assumed unavailable, so that mathematical analysis is impossible. While it is often the case that it is difficult, impossible, or cost prohibitive to develop an accurate mathematical model for many processes, it is almost always possible for the control engineer to specify some type of approximate model of the process (after all, we do know what physical object we are trying to control). Indeed, it has been our experience that most often the control engineer developing a fuzzy control system does have a mathematical model available. While it may not be used directly in controller design, it is often used in simulation to evaluate the performance of the fuzzy controller before it is implemented (and it is often used for rule base redesign). Certainly there are some applications where one can design a fuzzy controller and evaluate its performance directly via an implementation. In such applications one may not be overly concerned with a high performance level of the control system (e.g., for some commercial products such as washing machines or a shaver). In such cases there may thus be no need for conducting simulation-based evaluations (requiring a mathematical model) before implementation. In other applications there is the need for a high level of confidence in the reliability of the fuzzy control system before it is implemented (e.g., in systems where there is a concern for safety). In addition to simulation-based studies, one approach to enhancing our confidence in the reliability of fuzzy control systems is to use the mathematical model of the plant and nonlinear analysis for (1) verification of stability and performance specifications and (2) possible redesign of the fuzzy controller (for an overview of the results in this area, see [1]).
Some may be confident that a true expert would never need anything more than intuitive knowledge for rule-based design, and therefore would never design a faulty fuzzy controller. However, a true expert will certainly use all available information to ensure the reliable operation of a control system, including approximate mathematical models, simulation, nonlinear analysis, and experimentation. We emphasize that mathematical analysis alone cannot provide definitive answers about the reliability of the fuzzy control system, because such analysis proves properties about the model of the process, not the actual physical process. It can be argued that a mathematical model is never a perfect representation of a physical process; hence, while nonlinear analysis (e.g., of stability) may appear to provide definitive statements about control system reliability, it is understood that such statements are only accurate to the extent that the mathematical model is accurate. Nonlinear analysis does not replace the use of common sense and evaluation via simulations and experimentation; it simply assists in providing a rigorous engineering evaluation of a fuzzy control system before it is implemented. It is important to note that the advantages of fuzzy control often become most apparent for very complex problems where we have an intuitive idea about how to achieve high performance control. In such control applications an accurate mathematical model is so complex (i.e., high order, nonlinear, stochastic,
with many inputs and outputs) that it is sometimes not very useful for the analysis and design of conventional control systems (since the assumptions needed to utilize conventional control design approaches are often violated). The conventional control engineering approach to this problem is to use an approximate mathematical model that is accurate enough to characterize the essential plant behavior, yet simple enough so that the necessary assumptions to apply the analysis and design techniques are satisfied. However, due to the inaccuracy of the model, upon implementation the developed controllers often need to be tuned via the "expertise" of the control engineer. The fuzzy control approach, where explicit characterization and utilization of control expertise occurs earlier in the design process, largely avoids the problems with model complexity that are related to design. That is, for the most part fuzzy control system design does not depend on a mathematical model unless it is needed to perform simulations to gain insight into how to choose the rule base and membership functions. However, the problems with model complexity that are related to analysis have not been solved (i.e., analysis of fuzzy control systems critically depends on the form of the mathematical model); hence, it is often difficult to apply nonlinear analysis techniques to the applications where the advantages of fuzzy control are most apparent. For instance, as shown in [1], existing results for stability analysis of fuzzy control systems typically require that the plant model be deterministic, satisfy some continuity constraints, and sometimes require the plant to be linear or "linear-analytic."
The only results for analysis of steady-state tracking error of fuzzy control systems, and the existing results on the use of describing functions for analysis of limit cycles, essentially require a linear time-invariant plant (or one that has a special form so that the nonlinearities can be bundled into a separate nonlinear component in the loop). The current status of the field, as characterized by these limitations, coupled with the importance of nonlinear analysis of fuzzy control systems, makes it an open area for investigation that will help establish the necessary foundations for a bridge between the communities of fuzzy control and nonlinear analysis. Clearly, fuzzy control technology is leading the theory; the practitioner will proceed with the design and implementation of many fuzzy control systems without the aid of nonlinear analysis. In the meantime, theorists will attempt to develop a mathematical theory for the verification and certification of fuzzy control systems. This theory will have a synergistic effect by driving the development of fuzzy control systems for applications where there is a need for highly reliable implementations.
Summary

The focus of this chapter is on providing a practical introduction to fuzzy control (a "user's guide"); hence, we omit discussions of mathematical analysis of fuzzy control systems and invite the interested reader to investigate this topic further by consulting the bibliographic references. The remainder of this chapter is arranged as follows. We begin by providing a general mathematical introduction to fuzzy systems in a tutorial fashion. Next we introduce a rotational inverted pendulum "theme problem."
Many details on control design using principles of fuzzy logic are presented via this theme problem. We perform comparative analyses for fixed (nonadaptive) fuzzy and linear controllers. Following this, we introduce the area of adaptive fuzzy control and show how one adaptive fuzzy technique has proven to be particularly effective for balancing control of the inverted pendulum. In the concluding remarks we explain how the area of fuzzy control is related to other areas in the field of intelligent control and what research needs to be performed as the field of fuzzy control matures.
57.10.2 Introduction to Fuzzy Control

The functional architecture of the fuzzy system (controller)12 is composed of a rule base (containing a fuzzy logic quantification of the expert's linguistic description of how to achieve good control), an inference mechanism (which emulates the expert's decision making in interpreting and applying knowledge about how to do good control), a fuzzification interface (which converts controller inputs into information that the inference mechanism can easily use to activate and apply rules), and a defuzzification interface (which converts the conclusions of the inference mechanism into actual inputs for the process). In this section we describe each of these four components in more detail (see Section 57.10.4 for a block diagram) [2,3,4].
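The four components form a processing pipeline from crisp measurements to a crisp actuator command. The skeleton below is our own minimal sketch of that pipeline (the class layout, the min interpretation of "and," and the center-average defuzzification are simplifying assumptions, not the chapter's prescriptions):

```python
class FuzzyController:
    """Bare skeleton of the four-component architecture described above."""

    def __init__(self, rule_base, input_sets, output_centers):
        self.rule_base = rule_base            # list of (antecedent, consequent_name)
        self.input_sets = input_sets          # per input: {value_name: membership_fn}
        self.output_centers = output_centers  # crisp center of each output value

    def fuzzify(self, crisp_inputs):
        # Fuzzification interface: certainty of each linguistic value per input.
        return [{name: mu(u) for name, mu in sets.items()}
                for u, sets in zip(crisp_inputs, self.input_sets)]

    def infer(self, fuzz):
        # Inference mechanism: each rule fires to the degree its antecedent
        # holds; "and" is taken here as the minimum of the membership grades.
        return [(min(fuzz[i][name] for i, name in ante), cons)
                for ante, cons in self.rule_base]

    def defuzzify(self, fired):
        # Defuzzification interface: center-average of the fired consequents.
        den = sum(w for w, _ in fired)
        return sum(w * self.output_centers[c] for w, c in fired) / den if den else 0.0

    def __call__(self, crisp_inputs):
        return self.defuzzify(self.infer(self.fuzzify(crisp_inputs)))

# One-input example: "neg"/"pos" values with overlapping ramps on [-1, 1].
neg = lambda u: max(0.0, min(1.0, (1.0 - u) / 2.0))
pos = lambda u: max(0.0, min(1.0, (1.0 + u) / 2.0))
fc = FuzzyController(
    rule_base=[([(0, "neg")], "down"), ([(0, "pos")], "up")],
    input_sets=[{"neg": neg, "pos": pos}],
    output_centers={"down": -1.0, "up": 1.0},
)
print(fc([0.5]))   # blends the "down" and "up" conclusions toward "up"
```

Because the two ramps overlap, the controller interpolates smoothly between the rule conclusions instead of switching abruptly, which is the essential difference from a crisp IF-THEN table.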
Linguistic Rules

For our purposes, a fuzzy system is a static nonlinear mapping between its inputs and outputs (i.e., it is not a dynamic system). It is assumed that the fuzzy system has inputs $u_i \in \mathcal{U}_i$, where $i = 1, 2, \ldots, n$, and outputs $y_i \in \mathcal{Y}_i$, where $i = 1, 2, \ldots, m$. The ordinary ("crisp") sets $\mathcal{U}_i$ and $\mathcal{Y}_i$ are called the "universes of discourse" for $u_i$ and $y_i$, respectively (in other words, they are their domains). To specify rules for the rule base the expert will use a "linguistic description"; hence, linguistic expressions are needed for the inputs and outputs and the characteristics of the inputs and outputs. We will use "linguistic variables" (constant symbolic descriptions of what are, in general, time-varying quantities) to describe fuzzy system inputs and outputs. For our fuzzy system, linguistic variables denoted by $\tilde{u}_i$ are used to describe the inputs $u_i$. Similarly, linguistic variables denoted by $\tilde{y}_i$ are used to describe outputs $y_i$. For instance, an input to the fuzzy system may be described as $\tilde{u}_1$ = "velocity error" and an output from the fuzzy system may be $\tilde{y}_1$ = "voltage in." Just as $u_i$ and $y_i$ take on values over each universe of discourse $\mathcal{U}_i$ and $\mathcal{Y}_i$, respectively, linguistic variables $\tilde{u}_i$ and $\tilde{y}_i$ take on "linguistic values" that are used to describe characteristics of the variables. Let $\tilde{A}_i^j$ denote the jth linguistic value of the linguistic variable $\tilde{u}_i$ defined over the universe of discourse $\mathcal{U}_i$. If we assume that there exist many linguistic values defined over $\mathcal{U}_i$, then the linguistic variable $\tilde{u}_i$ takes on the elements from the set of linguistic values denoted by $\tilde{A}_i = \{\tilde{A}_i^j : j = 1, 2, \ldots, N_i\}$ (sometimes for convenience we will let the $j$ indices take on negative integer values). Similarly, let $\tilde{B}_i^p$ denote the pth linguistic value of the linguistic variable $\tilde{y}_i$ defined over the universe of discourse $\mathcal{Y}_i$. The linguistic variable $\tilde{y}_i$ takes on elements from the set of linguistic values denoted by $\tilde{B}_i = \{\tilde{B}_i^p : p = 1, 2, \ldots, M_i\}$ (sometimes for convenience we will let the $p$ indices take on negative integer values). Linguistic values are generally expressed by descriptive terms such as "positive large," "zero," and "negative big" (i.e., adjectives). The mapping of the inputs to the outputs for a fuzzy system is characterized in part by a set of condition → action rules, or in modus ponens (If ... Then) form,

If (antecedent) Then (consequent).    (57.415)

As usual, the inputs of the fuzzy system are associated with the antecedent, and the outputs are associated with the consequent. These If ... Then rules can be represented in many forms. Two standard forms, multi-input, multi-output (MIMO) and multi-input, single-output (MISO), are considered here. The MISO form of a linguistic rule is

If $\tilde{u}_1$ is $\tilde{A}_1^j$ and $\tilde{u}_2$ is $\tilde{A}_2^k$ and, ..., and $\tilde{u}_n$ is $\tilde{A}_n^l$ Then $\tilde{y}_q$ is $\tilde{B}_q^p$.    (57.416)

It is a whole set of linguistic rules of this form that the expert specifies on how to control the system. Note that if $\tilde{u}_1$ = "velocity error" and $\tilde{A}_1^j$ = "positive large," then "$\tilde{u}_1$ is $\tilde{A}_1^j$", a single term in the antecedent of the rule, means "velocity error is positive large." It can be shown easily that the MIMO form for a rule (i.e., one with consequents that have terms associated with each of the fuzzy controller outputs) can be decomposed into a number of MISO rules (using simple rules from logic). We assume that there are a total of R rules in the rule base, numbered $1, 2, \ldots, R$. For simplicity we will use tuples $(j, k, \ldots, l; p, q)_i$ to denote the ith MISO rule of the form given in Equation 57.416. Any of the terms associated with any of the inputs for any MISO rule can be included or omitted. Finally, we naturally assume that the rules in the rule base are distinct (i.e., there are no two rules with exactly the same antecedents and consequents).

12 Sometimes a fuzzy controller is called a "fuzzy logic controller" or even a "fuzzy linguistic controller" since, as we will see, it uses fuzzy logic in the quantification of linguistic descriptions.
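The tuple bookkeeping $(j, k, \ldots, l; p, q)_i$ can be made concrete. Below, a rule base over two inputs is stored as such tuples; the particular indices and the error/change-in-error reading of the comments are invented for illustration.

```python
# Each MISO rule "If u~1 is A_1^j and u~2 is A_2^k Then y~q is B_q^p"
# is stored as the tuple (j, k, p, q).  Negative indices are allowed,
# matching the convention mentioned in the text.
rules = [
    (-1, -1, -1, 1),   # e.g., If e is "neg" and de is "neg" Then y_1 is "neg"
    (-1,  0, -1, 1),
    ( 0,  0,  0, 1),   # If e is "zero" and de is "zero" Then y_1 is "zero"
    ( 1,  0,  1, 1),
    ( 1,  1,  1, 1),
]

# The distinctness assumption from the text: no two identical rules.
assert len(set(rules)) == len(rules)

def consequent(j, k):
    """Look up the consequent index p of the rule with antecedent (j, k);
    returns None when no rule with that antecedent is in the rule base
    (terms may be omitted, so not every combination need appear)."""
    for jj, kk, p, q in rules:
        if (jj, kk) == (j, k):
            return p
    return None

print(consequent(0, 0))
```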
Fuzzy Sets, Fuzzy Logic, and the Rule Base

Fuzzy sets and fuzzy logic are used to heuristically quantify the meaning of linguistic variables, linguistic values, and linguistic rules that are specified by the expert. The concept of a fuzzy set is introduced by first defining a "membership function." Let Ui denote a universe of discourse and A_i^j denote a specific linguistic value for the linguistic variable ũi. The function μ(ui) associated with A_i^j that maps Ui to [0, 1] is called a "membership function." This membership function describes the "certainty" that an element of Ui, denoted ui, with a linguistic description ũi, may be classified linguistically as A_i^j. Membership functions are
57.10. FUZZY CONTROL

generally subjectively specified in an ad hoc (heuristic) manner from experience or intuition. For instance, if Ui = [-150, 150], ũi = "velocity error," and A_i^j = "positive large," then μ(ui) may be a bell-shaped curve that peaks at one at ui = 75 and is near zero when ui < 50 or ui > 100. Then if ui = 75, μ(75) = 1, so that we are absolutely certain that ui is "positive large." If ui = -25, then μ(-25) is very near zero, which represents that we are very certain that ui is not "positive large." Clearly many other choices for the shape of the membership function are possible (e.g., triangular and trapezoidal shapes), and these will each provide a different meaning for the linguistics that they quantify. Below, we will show how to specify membership functions for a fuzzy controller for the rotational inverted pendulum. Given a linguistic variable ũi with a linguistic value A_i^j defined on the universe of discourse Ui, and membership function μ_{A_i^j}(ui) (the membership function associated with the fuzzy set A_i^j) that maps Ui to [0, 1], a "fuzzy set" denoted by A_i^j is defined as

A_i^j = {(ui, μ_{A_i^j}(ui)) : ui ∈ Ui}    (57.417)
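A minimal sketch of the bell-shaped membership function described above for "positive large" velocity error on U = [-150, 150]. The Gaussian form and the width are illustrative assumptions; only the peak at u = 75 and near-zero values away from roughly [50, 100] come from the text.

```python
# A bell-shaped membership function peaking at 1 when u == center and
# decaying toward 0 away from it. Gaussian shape and width are assumed.
import math

def mu_positive_large(u, center=75.0, width=10.0):
    """Certainty that u is "positive large" on U = [-150, 150]."""
    return math.exp(-0.5 * ((u - center) / width) ** 2)

print(mu_positive_large(75.0))           # 1.0: certainly "positive large"
print(mu_positive_large(-25.0) < 1e-3)   # True: certainly not "positive large"
```

Swapping in a triangular or trapezoidal function here, as the text notes, changes the meaning assigned to the same linguistic value.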
In fuzzy logic, union is used to represent the "or" operation. The intersection and union above are both defined for fuzzy sets that lie on the same universe of discourse. The fuzzy Cartesian product is used to quantify operations on many universes of discourse. If A_1^j, A_2^k, ..., A_n^l are fuzzy sets defined on the universes of discourse U1, U2, ..., Un, respectively, their Cartesian product is a fuzzy set (sometimes called a "fuzzy relation"), denoted by A_1^j × A_2^k × ... × A_n^l, with a membership function defined by

μ_{A_1^j × A_2^k × ... × A_n^l}(u1, u2, ..., un) = μ_{A_1^j}(u1) * μ_{A_2^k}(u2) * ... * μ_{A_n^l}(un)
Next, we show how to quantify the linguistic elements in the antecedent and consequent of the linguistic If...Then rule with fuzzy sets. For example, suppose we are given the If...Then rule in MISO form in Equation 57.416, whose antecedent and consequent terms we quantify with the fuzzy sets A_1^j, A_2^k, ..., A_n^l and B_q^p.
Next, we specify some set-theoretic and logical operations on fuzzy sets. Given fuzzy sets A_i^1 and A_i^2 associated with the universe of discourse Ui (Ni = 2), with membership functions denoted μ_{A_i^1}(ui) and μ_{A_i^2}(ui), respectively, A_i^1 is defined to be a "fuzzy subset" of A_i^2, denoted by A_i^1 ⊂ A_i^2, if μ_{A_i^1}(ui) ≤ μ_{A_i^2}(ui) for all ui ∈ Ui. The intersection of fuzzy sets A_i^1 and A_i^2, which are defined on the universe of discourse Ui, is a fuzzy set, denoted by A_i^1 ∩ A_i^2, with a membership function defined by either

Minimum: μ_{A_i^1 ∩ A_i^2}(ui) = min{μ_{A_i^1}(ui), μ_{A_i^2}(ui)} : ui ∈ Ui, or
Algebraic Product: μ_{A_i^1 ∩ A_i^2}(ui) = μ_{A_i^1}(ui) μ_{A_i^2}(ui) : ui ∈ Ui    (57.418)

Suppose that we use the notation x * y to denote min{x, y}, or at other times we will use it to denote the product x * y = xy (* is sometimes called the "triangular norm"). Then μ_{A_i^1}(ui) * μ_{A_i^2}(ui) is a general representation for the intersection of two fuzzy sets. In fuzzy logic, intersection is used to represent the "and" operation. The union of fuzzy sets A_i^1 and A_i^2, which are defined on the universe of discourse Ui, is a fuzzy set denoted by A_i^1 ∪ A_i^2, with a membership function defined by either

Maximum: μ_{A_i^1 ∪ A_i^2}(ui) = max{μ_{A_i^1}(ui), μ_{A_i^2}(ui)} : ui ∈ Ui, or
Algebraic Sum: μ_{A_i^1 ∪ A_i^2}(ui) = μ_{A_i^1}(ui) + μ_{A_i^2}(ui) - μ_{A_i^1}(ui) μ_{A_i^2}(ui) : ui ∈ Ui    (57.419)
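The two intersection ("and") and two union ("or") membership operations can be written out directly; the membership values below are illustrative.

```python
# The four pointwise operations on membership values: two t-norms for
# intersection ("and") and two t-conorms for union ("or").

def intersect_min(a, b):      # Minimum t-norm
    return min(a, b)

def intersect_prod(a, b):     # Algebraic product t-norm
    return a * b

def union_max(a, b):          # Maximum t-conorm
    return max(a, b)

def union_sum(a, b):          # Algebraic sum t-conorm
    return a + b - a * b

mu1, mu2 = 0.5, 0.25
print(intersect_min(mu1, mu2))   # 0.25
print(intersect_prod(mu1, mu2))  # 0.125
print(union_max(mu1, mu2))       # 0.5
print(union_sum(mu1, mu2))       # 0.625
```

Note that all four operations keep the result in [0, 1] whenever the arguments are in [0, 1], as membership values must be.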
These fuzzy sets quantify the terms in the antecedent and the consequent of the given If...Then rule, to make a "fuzzy implication"

If A_1^j and A_2^k and, ..., and A_n^l Then B_q^p    (57.422)
where the fuzzy sets A_1^j, A_2^k, ..., A_n^l, and B_q^p are defined above. Therefore, the fuzzy set A_1^j is associated with, and quantifies, the meaning of the linguistic statement "ũ1 is A_1^j", and B_q^p quantifies the meaning of "ỹq is B_q^p". Each rule in the rule base (j, k, ..., l; p, q)_i, i = 1, 2, ..., R, is represented with such a fuzzy implication (a fuzzy quantification of the linguistic rule). The reader who is interested in more mathematical details on fuzzy sets and fuzzy logic should consult [5].
Fuzzification
Fuzzy sets are used to quantify the information in the rule base, and the inference mechanism operates on fuzzy sets to produce fuzzy sets; hence, we must specify how the fuzzy system will convert its numeric inputs ui ∈ Ui into fuzzy sets (a process called "fuzzification") so that they can be used by the fuzzy system. Let Ûi denote the set of all possible fuzzy sets that can be defined on Ui. Given ui ∈ Ui, fuzzification transforms ui to a fuzzy set, denoted by Â_i^fuz, defined over the universe of discourse
Suppose that we use the notation x ⊕ y to denote max{x, y}, or at other times we will use it to denote x ⊕ y = x + y - xy (⊕ is sometimes called the "triangular co-norm"). Then μ_{A_i^1}(ui) ⊕ μ_{A_i^2}(ui) is a general representation for the union of two fuzzy sets.
13 In this section, as we introduce various fuzzy sets, we always use a hat over any fuzzy set whose membership function changes dynamically over time as the ui change.
THE CONTROL HANDBOOK
Ui. This transformation is produced by the fuzzification operator F defined by F : Ui → Ûi, where F(ui) := Â_i^fuz. Quite often "singleton fuzzification" is used, which produces a fuzzy set
Â_i^fuz ∈ Ûi with a membership function defined by

μ_{Â_i^fuz}(x) = 1 for x = ui, and μ_{Â_i^fuz}(x) = 0 otherwise

(any fuzzy set with this form for its membership function is called a "singleton"). Singleton fuzzification is generally used in implementations since, without the presence of noise, we are absolutely certain that ui takes on its measured value (and no other value), and since it provides certain savings in the computations needed to implement a fuzzy system (relative to, e.g., "Gaussian fuzzification," which would involve forming bell-shaped membership functions about input points). Throughout the remainder of this chapter we use singleton fuzzification.

The Inference Mechanism

The inference mechanism has two basic tasks: (1) determining the extent to which each rule is relevant to the current situation as characterized by the inputs ui, i = 1, 2, ..., n (we call this task "matching"), and (2) drawing conclusions using the current inputs ui and the information in the rule base (we call this task an "inference step"). For matching, note that A_1^j × A_2^k × ... × A_n^l is the fuzzy set representing the antecedent of the ith rule (j, k, ..., l; p, q)_i (there may be more than one such rule with this antecedent). Suppose that at some time we get inputs ui, i = 1, 2, ..., n, and fuzzification produces Â_1^fuz, Â_2^fuz, ..., Â_n^fuz, the fuzzy sets representing the inputs. The first step in matching involves finding fuzzy sets Â_1^j, Â_2^k, ..., Â_n^l (for all j, k, ..., l) that combine the fuzzy sets from fuzzification with the fuzzy sets used in each of the terms in the antecedents of the rules. If singleton fuzzification is used, then each of these fuzzy sets is a singleton that is scaled by the antecedent membership function (e.g., μ_{Â_1^j}(ū1) = μ_{A_1^j}(u1) for ū1 = u1, and μ_{Â_1^j}(ū1) = 0 for ū1 ≠ u1). Let μ̂(ū1, ū2, ..., ūn) denote the membership function for Â_1^j × Â_2^k × ... × Â_n^l. Since we are using singleton fuzzification, we have

μ̂(ū1, ū2, ..., ūn) = μ_{A_1^j}(ū1) * μ_{A_2^k}(ū2) * ... * μ_{A_n^l}(ūn) for ūi = ui, and μ̂(ū1, ū2, ..., ūn) := 0 for ūi ≠ ui, i = 1, 2, ..., n    (57.425)

Since the ui are given, the values μ_{A_1^j}(u1), μ_{A_2^k}(u2), ..., μ_{A_n^l}(un) are constants. Second, we form membership values μ_i(u1, u2, ..., un) for each rule that represent the overall certainty that rule i matches the current inputs. In particular, we first let

μ_i(u1, u2, ..., un) = μ_{A_1^j}(u1) * μ_{A_2^k}(u2) * ... * μ_{A_n^l}(un)    (57.426)

which is simply a function of the inputs ui. We use μ_i(u1, u2, ..., un) to represent the certainty that the antecedent of rule i matches the input information. This concludes the process of matching input information with the antecedents of the rules. Next, the inference step is taken by computing, for the ith rule (j, k, ..., l; p, q)_i, the "implied fuzzy set" B̂_q^i with membership function

μ_{B̂_q^i}(yq) = μ_i(u1, u2, ..., un) * μ_{B_q^p}(yq)    (57.427)
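The matching step of Equation 57.426 and the inference step of Equation 57.427 can be sketched for a single two-input MISO rule, assuming singleton fuzzification and the product for "*". The triangular membership functions and the input values are illustrative, not taken from the text.

```python
# Matching and inference for one rule "If u1 is A1 and u2 is A2 Then yq is B",
# with illustrative triangular membership functions and product t-norm.

def tri(u, center, half_width):
    """Symmetric triangular membership function."""
    return max(0.0, 1.0 - abs(u - center) / half_width)

mu_A1 = lambda u: tri(u, 0.0, 1.0)   # A1: "zero" (illustrative)
mu_A2 = lambda u: tri(u, 0.5, 1.0)   # A2: "positive small" (illustrative)
mu_B = lambda y: tri(y, 1.0, 1.0)    # B: consequent set (illustrative)

u1, u2 = 0.2, 0.3   # crisp inputs (singleton fuzzification)

# Premise certainty mu_i(u1, u2): product of antecedent memberships
mu_i = mu_A1(u1) * mu_A2(u2)

# Implied fuzzy set: consequent membership scaled by the premise certainty
mu_implied = lambda y: mu_i * mu_B(y)

print(round(mu_i, 2))             # 0.64
print(round(mu_implied(1.0), 2))  # 0.64
```

Replacing the products with `min` gives the other common choice of t-norm; the implied set is then the consequent clipped, rather than scaled, at the premise certainty.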
The implied fuzzy set B̂_q^i specifies the certainty level that the output should be a specific crisp output yq within the universe of discourse Yq, taking into consideration only rule i. Note that since μ_i(u1, u2, ..., un) will vary with time, so will the shape of the membership functions μ_{B̂_q^i}(yq) for each rule. Alternatively, the inference mechanism could, in addition, compute the "overall implied fuzzy set" B̂q with membership function

μ_{B̂q}(yq) = μ_{B̂_q^1}(yq) ⊕ μ_{B̂_q^2}(yq) ⊕ ... ⊕ μ_{B̂_q^R}(yq)
that represents the conclusion reached considering all the rules in the rule base at the same time (notice that determining B̂q can, in general, require significant computational resources). Using the mathematical terminology of fuzzy sets, the computation of μ_{B̂q}(yq) is said to be produced by a "sup-star compositional rule of inference." The "sup" in this terminology corresponds to the ⊕ operation, and the "star" corresponds to *. "Zadeh's compositional rule of inference" [6] is the special case of the sup-star compositional rule of inference when max is used for ⊕ and min is used for *. The overall justification for using the above operations to represent the inference step lies in the fact that we can be no more certain about our conclusions than we are about our premises (antecedents). The operations performed in taking an inference step adhere to this principle. To see this, study Equation 57.427 and note that the scaling from μ_i(u1, u2, ..., un) that is produced by the antecedent matching process ensures that sup_{yq}{μ_{B̂_q^i}(yq)} ≤ μ_i(u1, u2, ..., un) (a similar statement holds for the overall implied fuzzy set). Up to this point we have used fuzzy logic to quantify the rules in the rule base, fuzzification to produce fuzzy sets characterizing the inputs, and the inference mechanism to produce fuzzy sets representing the conclusions that it reaches considering the current inputs and the information in the rule base. Next, we look at how to convert this fuzzy set quantification of the conclusions to a numeric value that can be input to the plant.
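Forming the overall implied fuzzy set pointwise, using max for the ⊕ operation across rules, can be sketched as follows; the two implied membership functions (already scaled by their premise certainties) are illustrative.

```python
# Overall implied fuzzy set: pointwise max over the per-rule implied sets.

implied = [lambda y: 0.3 * max(0.0, 1.0 - abs(y)),        # rule 1 (illustrative)
           lambda y: 0.6 * max(0.0, 1.0 - abs(y - 1.0))]  # rule 2 (illustrative)

def mu_overall(y):
    """mu for the overall implied fuzzy set, with max used for ⊕."""
    return max(mu(y) for mu in implied)

print(mu_overall(0.0))  # 0.3 (only rule 1 contributes at y = 0)
print(mu_overall(1.0))  # 0.6 (only rule 2 contributes at y = 1)
```

Note that the peak of each term never exceeds its rule's premise certainty, reflecting the principle that conclusions can be no more certain than premises.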
Defuzzification

A number of defuzzification strategies exist. Each provides a means to choose a single output (which we denote by y_q^crisp) based on either the implied fuzzy sets or the overall implied fuzzy set (depending on the type of inference strategy chosen). First, we present typical defuzzification techniques for the overall implied fuzzy set B̂q:

Max Criterion: A crisp output y_q^crisp is chosen as a point on the output universe of discourse Yq for which the overall implied fuzzy set B̂q achieves a maximum, i.e.,

y_q^crisp ∈ {arg sup_{yq} {μ_{B̂q}(yq)}}
Since the supremum can occur at more than one point in Yq, one also needs to specify a strategy on how to pick only one point for y_q^crisp (e.g., choosing the smallest value). Often this defuzzification strategy is avoided due to this ambiguity.

Mean of Maximum: A crisp output y_q^crisp is chosen to represent the mean value of all elements whose membership in B̂q is a maximum. We define b̂_q^max as the supremum of the membership function of B̂q over the universe of discourse Yq. Moreover, we define a fuzzy set B̂q* ∈ Yq with a membership function defined as

μ_{B̂q*}(yq) = 1 for μ_{B̂q}(yq) = b̂_q^max, and μ_{B̂q*}(yq) = 0 otherwise    (57.430)
then a crisp output, using the mean of maximum method, is defined as

y_q^crisp = ∫_{Yq} yq μ_{B̂q*}(yq) dyq / ∫_{Yq} μ_{B̂q*}(yq) dyq    (57.431)
Note that the integrals in Equation 57.431 must be computed at each time instant since they depend on B̂q*, which changes with time. This can require excessive computational resources; hence, this defuzzification technique is often avoided in practice.

Center of Area (COA): A crisp output y_q^crisp is chosen as the center of area of the membership function of the overall implied fuzzy set B̂q. For a continuous output universe of discourse Yq, the center of area output is given by

y_q^crisp = ∫_{Yq} yq μ_{B̂q}(yq) dyq / ∫_{Yq} μ_{B̂q}(yq) dyq    (57.432)
Note that, similar to the mean of maximum method, this defuzzification approach can be computationally expensive. Also, the fuzzy system must be defined so that ∫_{Yq} μ_{B̂q}(yq) dyq ≠ 0 for all ui.
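The COA computation can be approximated numerically with Riemann sums, which illustrates why it is expensive: both integrals must be recomputed at every time instant. The grid resolution, output universe [-1, 1], and placeholder membership function are illustrative assumptions.

```python
# Numerical center-of-area defuzzification for an overall implied fuzzy
# set, approximating both integrals by Riemann sums on a gridded universe.

def coa(mu, lo=-1.0, hi=1.0, n=2001):
    """Approximate integral(y*mu(y))dy / integral(mu(y))dy on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    ys = [lo + i * step for i in range(n)]
    num = sum(y * mu(y) for y in ys) * step
    den = sum(mu(y) for y in ys) * step
    if den == 0.0:
        raise ValueError("fuzzy system must guarantee nonzero area")
    return num / den

# A symmetric triangular overall implied set centered at 0.25 -> COA = center
print(round(coa(lambda y: max(0.0, 1.0 - abs(y - 0.25) / 0.5)), 6))  # 0.25
```

The guard on the denominator mirrors the requirement in the text that the integral of μ_{B̂q} be nonzero for all inputs.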
Next, we specify typical defuzzification techniques for the implied fuzzy sets B̂_q^i:

Center Average: A crisp output y_q^crisp is chosen using the centers of each of the output membership functions and the maximum certainty of each of the conclusions represented with the implied fuzzy sets, and is given by

y_q^crisp = Σ_{i=1}^R c_q^i sup_{yq}{μ_{B̂_q^i}(yq)} / Σ_{i=1}^R sup_{yq}{μ_{B̂_q^i}(yq)}    (57.433)
where c_q^i is the center of area of the membership function of B_q^p associated with the implied fuzzy set B̂_q^i for the ith rule (j, k, ..., l; p, q)_i. Notice that sup_{yq}{μ_{B̂_q^i}(yq)} is often very easy to compute, since if μ_{B_q^p}(yq) = 1 for at least one yq (which is the normal way to define membership functions), then for many inference strategies sup_{yq}{μ_{B̂_q^i}(yq)} = μ_i(u1, u2, ..., un), which has already been computed in the matching process. Notice that the fuzzy system must be defined so that Σ_{i=1}^R sup_{yq}{μ_{B̂_q^i}(yq)} ≠ 0 for all ui.
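The center-average formula reduces to a simple weighted mean when, as just noted, each supremum equals the premise certainty μ_i found during matching. The centers and certainties below are illustrative.

```python
# Center-average defuzzification: weighted mean of consequent centers,
# with weights equal to the premise certainties mu_i (assuming output
# membership functions that peak at one).

def center_average(centers, mus):
    """Crisp output: sum(c_i * mu_i) / sum(mu_i)."""
    denom = sum(mus)
    if denom == 0.0:
        raise ValueError("fuzzy system must guarantee a nonzero denominator")
    return sum(c * m for c, m in zip(centers, mus)) / denom

# Two rules fired, with consequent centers 0.0 and 1.0
print(center_average([0.0, 1.0], [0.25, 0.75]))  # 0.75
```

Because only centers and already-computed certainties appear, this is among the cheapest defuzzification methods to implement.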
Center of Gravity (COG): A crisp output y_q^crisp is chosen using the center of area and area of each implied fuzzy set, and is given by

y_q^crisp = Σ_{i=1}^R c_q^i ∫_{Yq} μ_{B̂_q^i}(yq) dyq / Σ_{i=1}^R ∫_{Yq} μ_{B̂_q^i}(yq) dyq    (57.434)
where c_q^i is the center of area of the membership function of B_q^p associated with the implied fuzzy set B̂_q^i for the ith rule (j, k, ..., l; p, q)_i. Notice that COG can be easy to compute, since it is often easy to find closed-form expressions for ∫_{Yq} μ_{B̂_q^i}(yq) dyq, the area under a membership function. Notice that the fuzzy system must be defined so that Σ_{i=1}^R ∫_{Yq} μ_{B̂_q^i}(yq) dyq ≠ 0 for all ui. Overall, we see that using the overall implied fuzzy set in defuzzification is often undesirable for two reasons: (1) the overall implied fuzzy set B̂q is itself difficult to compute in general, and (2) the defuzzification techniques based on an inference mechanism that provides B̂q are also difficult to compute. It is for this reason that most existing fuzzy controllers (including the ones in this chapter) use defuzzification techniques based on the implied fuzzy sets, such as center average or COG.
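The COG computation is cheap when the areas have closed forms. A minimal sketch, assuming symmetric triangular output membership functions of half-width w (full area w) scaled by each rule's premise certainty μ_i via product implication, so each implied set has area μ_i × w; all numeric values are illustrative.

```python
# Center-of-gravity defuzzification with closed-form areas: for a
# symmetric triangle of half-width w scaled by mu_i, area_i = mu_i * w.

def cog(centers, mus, w=1.0):
    """Crisp output: sum(c_i * area_i) / sum(area_i)."""
    areas = [mu * w for mu in mus]
    total = sum(areas)
    if total == 0.0:
        raise ValueError("fuzzy system must guarantee nonzero total area")
    return sum(c * a for c, a in zip(centers, areas)) / total

# Two rules fired: centers -0.5 and 1.0 with certainties 0.5 and 0.25
print(cog([-0.5, 1.0], [0.5, 0.25]))  # 0.0
```

With equal half-widths, w cancels and COG coincides with center average here; the two methods differ once implied sets have unequal areas (e.g., min implication, or mixed output-set widths).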
57.10.3 Theme Problem: Rotational Inverted Pendulum

One of the classic problems in the study of nonlinear systems is that of the inverted pendulum. The primary control problem
one considers with such a system is regulating the position of the pendulum (typically a rod with a mass at the endpoint) to the vertical (up) position, i.e., "balanced." A secondary problem is that of "swinging up" the pendulum from its rest position (vertical down). Often, actuation is accomplished either via a motor at the base of the pendulum (at the hinge), or via a cart through translational motion. In this example, actuation of the pendulum is accomplished through rotation of a separate, attached link, referred to henceforth as the "base."
For controller synthesis (and model linearization) we require a state variable description of the system. This is easily done by defining state variables x1 = θ0, x2 = θ̇0, x3 = θ1, x4 = θ̇1, and control signal u = va. Linearization of these equations about the vertical position (i.e., θ1 = 0), and using the system physical parameters [7], results in a linear, time-invariant state-variable description.
Experimental Apparatus
The test bed consists of three primary components: the plant, digital and analog interfaces, and the digital controller. The overall system is shown in Figure 57.37, where the three components can be clearly identified [7]. The plant is composed of a pendulum and a rotating base made of aluminum rods, two optical encoders as the angular position sensors with effective resolutions of 0.2 degrees for the pendulum and 0.1 degrees for the base, and a large, high-torque permanent-magnet DC motor (with rated stall torque of 5.15 N-m). As the base rotates through the angle θ0, the pendulum is free to rotate (high precision bearings are utilized) through its angle θ1 made with the vertical. Interfaces between the digital controller and the plant consist of two data acquisition cards and some signal conditioning circuitry, structured for the two basic functions of sensor integration and control signal generation. The signal conditioning is accomplished via a combination of several logic gates to filter quadrature signals from the optical encoders, which are then processed through a separate data acquisition card to utilize the four 16-bit counters (accessed externally to count pulses from the circuitry itself). Another card supplies the control signal interface through its 12-bit D/A converter (to generate the actual control signal), while the board's 16-bit timer is used as a sampling clock. The computer used for control is a personal computer with an Intel 80486DX processor operating at 50 MHz. The real-time codes for control are written in C.
Mathematical Model

For brevity, and because this system is a popular example for nonlinear control, we omit details of the necessary physics and geometry for modeling. The differential equations that approximately describe the dynamics of the plant are given by
where, again, θ0 is the angular displacement of the rotating base, θ̇0 is the angular speed of the rotating base, θ1 is the angular displacement of the pendulum, θ̇1 is the angular speed of the pendulum, va is the motor armature voltage, Kp and ap are parameters of the DC motor with torque constant K1, g is the acceleration due to gravity, m1 is the pendulum mass, ℓ1 is the pendulum length, J1 is the pendulum inertia, and C1 is a constant associated with friction (actual parameter values appear in [7]).
Swing Up Control

Because we intend to develop control laws that will be valid in regions about the vertical position (θ1 = 0), it is crucial to swing the pendulum up so that it is near-vertical at near-zero (angular) velocity. Elaborate schemes can be used for this task (such as those employing concepts from differential geometry), but for the purposes of this example we choose to use a simple heuristic procedure based on an "energy pumping strategy" proposed in [8] for a similar under-actuated system. The goal of this simple swing up control strategy is to "pump" energy into the pendulum link in such a way that the energy or magnitude of each swing increases until the pendulum approaches its inverted position. To apply such an approach, we simply consider how one would (intuitively) swing the pendulum from its hanging position (θ1 = π) to its upright position. If the rotating base is swung to the left and right continually at an appropriate frequency, the magnitude of the pendulum swing will increase with each swing. The control scheme we will ultimately employ consists of two main components: the "scheduler" (which we will also call a "supervisor") observes the position of the pendulum relative to its stable equilibrium point (θ1 = π), then schedules the transitions between two reference positions of the rotating base (θ0ref = ±r); and the "positioning control" regulates the base to the desired reference point. These two components compose a closed-loop planning algorithm to command the rotating base to move in a certain direction based on the position of the pendulum. In effect, the human operator acts as the supervisor in tuning the positioning control (through trial and error on the system). For simplicity, a proportional controller will be used as the positioning control. The gain Kp is chosen just large enough so that the actuator drives the base fast enough without saturating the control output; after several trials, Kp was set to 0.5.
The parameter r determines how far the base is allowed to swing; larger swings transfer more energy to swinging up the pendulum. The swing-up motion of the pendulum can be approximated as an exponentially growing cosine function. The parameter r significantly affects the "negative damping" (i.e., exponential growth) of the swing-up motion. By tuning r, one can adjust the motion
Figure 57.37    Hardware setup: the controller, interfaces, and controlled object.
of the pendulum in such a way that the velocity of the pendulum and the control output are at a minimum when the pendulum reaches its inverted position (i.e., the pendulum has the largest potential energy and the lowest kinetic energy). Notice that if the dynamics of the pendulum are changed (e.g., by adding extra weight to the endpoint of the pendulum), then the parameter r must be retuned. In [7] it is shown how a rule-based system can be used to effectively automate the swing up control by implementing fuzzy strategies in the supervisor portion of the overall scheme.
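The two-component swing-up scheme can be sketched as follows: a "scheduler" flips the base reference between +r and -r depending on which side of the hanging equilibrium (θ1 = π) the pendulum is on, and a proportional "positioning control" drives the base to that reference. Only Kp = 0.5 comes from the text; the value r = 1.0 and the sign test used by the scheduler are illustrative assumptions.

```python
# Sketch of the swing-up scheme: scheduler + proportional positioning control.
import math

KP = 0.5   # positioning-control gain (value from the text)
R = 1.0    # swing amplitude parameter r (illustrative; retuned if dynamics change)

def scheduler(theta1):
    """Schedule the base reference from the pendulum's position
    relative to the hanging equilibrium theta1 = pi (assumed sign test)."""
    return R if math.sin(theta1 - math.pi) > 0.0 else -R

def positioning_control(theta0, theta0_ref):
    """Proportional controller regulating the base to its reference."""
    return KP * (theta0_ref - theta0)

# One step: pendulum slightly past hanging position, base at zero
u = positioning_control(theta0=0.0, theta0_ref=scheduler(math.pi + 0.3))
print(u)  # 0.5
```

In the experimental system this loop runs at the sampling rate, with the human operator (and later, in [7], a fuzzy supervisor) tuning Kp and r.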
Balancing Control

Synthesis of the fuzzy controllers to follow is aided by (1) a good understanding of the pendulum dynamics (the analytical model and intuition related to the physical process), and (2) experience with the performance of linear control strategies. Although numerous linear control design techniques have been applied to this particular system, here we consider the performance of only one linear strategy (the one tested) as applied to the experimental system: the linear quadratic regulator (LQR). Our purpose is twofold. First, we wish to form a baseline for comparison to the fuzzy control designs to follow, and second, we wish to provide a starting point for synthesis of the fuzzy controller. It is important to note that extensive simulation results (on the nonlinear model) were carried out prior to application to the laboratory apparatus; designs were carried out on the linearized model of the system. Specifics of the design process for the LQR and other applicable linear design techniques may be found in other chapters of this volume. Because the linearized system is completely controllable and observable, state feedback strategies, including the optimal strategies of the LQR, are applicable. Generally speaking, the system performance is prescribed via the optimal performance index

J = ∫_0^∞ (x^T Q x + u^T R u) dt
where Q and R are the weighting matrices corresponding to the state x and input u, respectively. Given fixed Q and R, the feedback gains that optimize the function J can be uniquely determined by solving an algebraic Riccati equation. Because we
are more concerned with balancing the pendulum than regulating the base, we put the highest priority on controlling θ1 by choosing the weighting matrices Q = diag(1, 0, 5, 0) and R = [1]. For a 10 msec sampling time, the discrete optimal feedback gains corresponding to the weighting matrices Q and R are k1 = -0.9, k2 = -1.1, k3 = -9.2, and k4 = -0.9. Although observers may be designed to estimate the states θ̇1 and θ̇0, we choose to use an equally effective and simple first-order approximation for each derivative. Note that this controller is designed in simulation for the system as modeled (and subsequently linearized). When the resulting controller gains (k1 through k4) are implemented on the actual system, some "trial and error" tuning is required (due primarily to modeling uncertainties), which amounted to adjusting the designed gains by about 10% to obtain performance matching the predicted results from simulation. Moreover, it is critical to note that the design process (as well as the empirical tuning) has been done for the "nominal" system (i.e., the pendulum system with no additional mass on the endpoint). Using a swing up control strategy tuned for the nominal system, the results of the LQR control design are given in Figure 57.38 for the base angle (top plot), pendulum angle (center plot), and control output (bottom plot).
57.10.4 Fuzzy Control for the Rotational Inverted Pendulum

Aside from serving to illustrate procedures for synthesizing a fuzzy controller, several reasons arise for considering the use of a nonlinear control scheme for the pendulum system. Because all linear controllers are designed based on a linearized model of the system, they are inherently valid only for a region about a specific point (in this case, the vertical, θ1 = 0, position). For this reason, such linear controllers tend to be very sensitive to parametric variations, uncertainties, and disturbances. This is indeed the case for the experimental system under study; when an extra weight or sloshing liquid (using a water-tight bottle) is attached at the endpoint of the pendulum, the performance of all linear controllers degrades considerably, often resulting in unstable behavior. Thus, to enhance the performance of the balancing
Figure 57.38    Experimental results: LQR on the nominal system.
control, one naturally turns to some nonlinear control scheme that is expected to exhibit improved performance in the presence of disturbances and uncertainties in modeling. Two such nonlinear controllers will be investigated here: in this section, a direct fuzzy controller is constructed, and later an adaptive version of this same controller is discussed.

Controller Synthesis

For simplicity, the controller synthesis example explained next will utilize singleton fuzzification and symmetric, "triangular" membership functions on the controller inputs and output (they are, in fact, very simple to implement in real-time code). We choose to use seven membership functions for each input, uniformly distributed across their universes of discourse (over crisp values of each input ei), as shown in Figure 57.39. The linguistic values for the ith input are denoted by Ẽ_i^r, where r ∈ {-3, -2, -1, 0, 1, 2, 3}. Linguistically, we would therefore define Ẽ_i^{-3} as "negative large," Ẽ_i^{-2} as "negative medium," Ẽ_i^0 as "zero," and so on. Note also that a "saturation nonlinearity" is built in for each input in the membership functions corresponding to the outermost regions of the universes of discourse. We use min to represent the antecedent and implication (i.e., "*" from Section 57.10.2 is min) and COG defuzzification. To synthesize a fuzzy controller for our example system, we pursue the idea of seeking to "expand" the region of operation of the fixed (nonadaptive) controller. In doing so, we will utilize the results of the LQR design presented in Section 57.10.3 to lead us in the design. A block diagram of the fuzzy controller is shown in Figure 57.40. Similar to the linear quadratic regulator, the fuzzy controller for the inverted pendulum system will have four inputs and one output. The four (crisp) inputs to the fuzzy controller are the position error of the base e1, its derivative e2, the position error of the pendulum e3, and its derivative e4.
The normalizing gains gi essentially serve to expand and compress the universes of discourse to some predetermined, uniform
region, primarily to standardize the choice of the various parameters in synthesizing the fuzzy controller. A crude approach to choosing these gains is strictly based on intuition and does not require a mathematical model of the plant. In that case the input normalizing gains are chosen in such a way that all the desired operating regions are mapped into [-1, 1]. Such a simple approach in design often works for a number of systems, as witnessed by the large number of applications documented in the open literature. For complicated systems, however, such a procedure can be very difficult to implement because there are many ways to define the linguistic values and linguistic rules; indeed, it can be extremely difficult to find a viable set of linguistic values and rules just to maintain stability. Such was the case for this system. What we propose here is an approach based on experience in designing the LQR controller for the linearized model of the plant, leading to a mechanized procedure for determining the normalizing gains, output membership functions, and rule base. Recall from our discussion in Section 57.10.2 that a fuzzy system is a static nonlinear map between its inputs and output. Certainly, therefore, a linear map such as the LQR can be easily approximated by a fuzzy system (for small values of the inputs to the fuzzy system). Two components of the LQR are the optimal gains and the summer; the optimal gains can be replaced with the normalizing gains of a fuzzy system, and the summer can essentially be incorporated into the rule base of a fuzzy system. By doing this, we can effectively utilize a fuzzy system implementation to expand the region of operation of the controller beyond the "linear region" afforded by the linearization/design process.
Intuitively, this is done by making the "gain" of the fuzzy controller match that of the LQR when the fuzzy controller inputs are small, while shaping the nonlinear mapping representing the fuzzy controller for larger inputs (in regions further from zero). As pointed out in Section 57.10.2, the rule base contains information about the relationships between the inputs and output of
Figure 57.39    Four sets of input membership functions: (a) "base position error" (ẽ1), (b) "base derivative error" (ẽ2), (c) "pendulum position error" (ẽ3), and (d) "pendulum derivative error" (ẽ4).
Figure 57.40    Block diagram of the direct fuzzy controller: a normalized fuzzy controller with its knowledge base, driving the inverted pendulum system (θ0 and θ1 fed back).
a fuzzy system. Within the controller structure chosen, recall that we wish to construct the rule base to perform a weighted sum of the inputs. The summation operation is straightforward; prior to normalization, it is simply a matter of arranging each If...Then rule such that the antecedent indices sum to the consequent index. Specification of the normalizing gains is explained next. The basic idea [9] in specifying g0 through g4 is that, for "small" controller inputs (ei), the local slope (about zero) of the input-output mapping representing the controller will be the same as the LQR gains (i.e., the ki). As alluded to above, the normalizing gains g1 through g4 transform the (symmetric) universes of discourse for each input (see Figure 57.39) to [-1, 1]. For example, if [-βi, βi] is the interval of interest for input i, the choice gi = 1/βi would achieve this normalization, whereas the choice g0 = β0 would map the output of the normalized fuzzy system to the real output to achieve a corresponding interval of [-β0, β0]. Then, assuming the fuzzy system provides the summation operation, the "net gain" for the ith input-output pair is gi g0. Finally, therefore, this implies that gi g0 = ki is required to match the local slopes of the LQR controller and the fuzzy controller (in the sense of input-output mappings). We are now in a position to summarize the gain selection procedure. Recalling (Section 57.10.3) that the optimal feedback gains based on the LQR approach are k1 = -0.9, k2 = -1.1, k3 = -9.2, and k4 = -0.9, transformation of the optimal LQR gains into the normalizing gains of the fuzzy system is achieved according to the following simple scheme: Choose the controller input that most greatly influences plant behavior and overall control objectives; in our case, we choose the pendulum position error e3. Subsequently, we specify the operating range of this input (e.g., the interval [-0.5, +0.5] radians, for which the corresponding normalizing input gain is g3 = 2). Given g3, the output gain of the fuzzy controller is calculated according to g0 = k3/g3 = -4.6. Given the output gain g0, the remaining input gains can be calculated according to gj = kj/g0, where j ∈ {1, 2, 3, 4}, j ≠ i (note that i = 3). For g0 = -4.6, the input gains g1, g2, g3, and g4 are 0.1957, 0.2391, 2, and 0.1957, respectively.
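The gain-selection scheme can be reproduced numerically. The discrete LQR gains ki and the chosen range of ±0.5 rad for the pendulum position error come from the text; the procedure matches local slopes by requiring gj g0 = kj.

```python
# Normalizing-gain selection from the LQR gains, per the scheme above.

k = {1: -0.9, 2: -1.1, 3: -9.2, 4: -0.9}   # discrete LQR gains (from text)

g3 = 1.0 / 0.5        # normalize [-0.5, 0.5] rad to [-1, 1] -> g3 = 2
g0 = k[3] / g3        # output gain: -9.2 / 2 = -4.6
g = {j: k[j] / g0 for j in (1, 2, 4)}      # remaining input gains: gj = kj / g0
g[3] = g3

print(g0)                                      # -4.6
print([round(g[j], 4) for j in (1, 2, 3, 4)])  # [0.1957, 0.2391, 2.0, 0.1957]
```

These values reproduce the gains 0.1957, 0.2391, 2, and 0.1957 quoted in the text, confirming that the net gain gj g0 of each channel equals the corresponding LQR gain kj.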
THE CONTROL HANDBOOK

Determination of the controller output universe of discourse and corresponding normalizing gain is dependent on the structure of the rule base. A nonlinear mapping can be used to rearrange the output membership functions (in terms of their centers) for several purposes, such as to add higher gain near the center, to create a dead zone near the center, to eliminate discontinuities at the saturation points, and so on. This represents yet another area where intuition (i.e., knowledge about how best to control the process) may be incorporated into the design process. In order to preserve behavior in the "linear" region (i.e., the region near the origin) of the LQR-extended controller, but at the same time provide a smooth transition from the linear region to its extensions (e.g., regions of saturation), we choose an arctangent-type mapping to achieve this rearrangement. Because of the "flatness" of such a mapping near the origin, we expect the fuzzy controller to behave like the LQR when the states are near the process equilibrium. It is important to note that, unlike the input membership functions of Figure 57.39, the output membership functions at the outermost regions of the universe of discourse do not include the saturating effect; rather, they return to zero value, which is required so that the fuzzy controller mapping is well defined. In general, for a fuzzy controller with n inputs and one output, the center of the controller output fuzzy set Y^s would be located in proportion to (j + k + ... + l)/n, where s = j + k + ... + l is the index of the output fuzzy set Y^s (and the output linguistic value), (j, k, ..., l) are the indices of the input fuzzy sets (and linguistic values), N is the number of membership functions on each input, and n is the number of inputs. Note that we must nullify the effect of the divisor n by multiplying the output gain g0 by the same factor.
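The center placement and arctangent rearrangement just described can be sketched as follows. The exact warping function and its scale factor `a` are not given in the text, so the normalized form below (identity slope near zero, saturating smoothly toward the edges) is an assumption for illustration:

```python
import math

# Hypothetical sketch of output-center placement for an n-input controller
# with N membership functions per input, indexed -(N-1)/2 .. +(N-1)/2.
# Linear placement puts the center for index sum s in proportion to s/n;
# the arctangent map below keeps that slope near the origin (preserving the
# LQR-like linear region) while saturating smoothly toward the edges.
# The scale factor `a` is an assumed tuning knob, not a value from the text.

n, N = 4, 7
half = (N - 1) // 2          # per-input indices run -half .. +half
s_max = n * half             # largest possible index sum (here 12)

def linear_center(s):
    """Normalized linear placement of the output center for index sum s."""
    return s / s_max

def arctan_center(s, a=3.0):
    """Arctangent rearrangement: ~identity near 0, saturating for large s."""
    return math.atan(a * linear_center(s)) / a

for s in (0, 1, 6, 12):
    print(s, round(linear_center(s), 3), round(arctan_center(s), 3))
# s = 1 is nearly unchanged (0.083 -> 0.082), so behavior near the origin
# is preserved; s = 12 is compressed (1.0 -> 0.416), giving a smooth
# transition into the saturation region.
```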
Performance Evaluation: Simulation    Some performance evaluation via simulation is prudent to investigate the effectiveness of the strategies employed in the controller synthesis. Using the complete nonlinear model, the simulated responses of the direct fuzzy controller with seven membership functions on each input indicate that the fuzzy controller successfully balances the pendulum, but with slightly degraded performance as compared to that of the LQR (e.g., the fuzzy controller produces undesirable high-frequency "chattering" effects over a bandwidth that may not be realistic in implementation). One way to increase the "resolution" of the fuzzy controller is to increase the number of membership functions. As we increase the number of membership functions on each input to 25, responses using the fuzzy controller become smoother and closer to those of the LQR. Additionally, the control surface of the fuzzy controller also becomes smoother and has a much smaller gain near the center. As a result, the control output of the fuzzy controller is significantly smoother. On the other hand, the direct fuzzy controller with 25 membership functions on each input comes with increased complexity in design and implementation (e.g., a four-input, one-output fuzzy system with 25 membership functions on each input has 25^4 = 390,625 linguistic rules).

Application to Nominal System    Given the experience of the simulation studies, the final step is to implement the fuzzy
controller (with seven membership functions on each input) on the actual apparatus. For comparative purposes, we again consider application to the nominal system, that is, the pendulum alone with no added weight or disturbances. With the pendulum initialized at its hanging position (θ1 = π), the swing up control was tuned to give the best swing up response, as in the case of the LQR results of Section 57.10.3. The sampling time was set to 10 msec (smaller sampling times produced no significant difference in responses for any of the controllers tested on this apparatus). The only tuning required for the fuzzy control scheme (from simulation to implementation in experimentation) was in adjusting the value of g3 upward to improve performance; recall that the gain g3 is critical in that it essentially determines the other normalizing gains. Figure 57.41 shows the results for the fuzzy controller on the laboratory apparatus; the top plot shows the base position (angle), the center plot shows the pendulum position (angle), and the bottom plot shows the controller output (motor voltage input). The response is comparable to that of the LQR controller (compare to Figure 57.38) in terms of the pendulum angle (ability to balance in the vertical position). However, some oscillation is noticed (particularly in the controller output, as predicted in simulation studies), but any difference in the ability to balance the pendulum is only slightly discernible in viewing the operation of the system.

Application to Perturbed System    When the system experiences disturbances and changes in dynamics (by attaching additional weight to the pendulum endpoint, or by attaching a bottle half-filled with liquid), degraded responses are observed for these controllers.
57.10. FUZZY CONTROL

Figure 57.41    Experimental results: Direct fuzzy control on the nominal system.

Figure 57.42    Experimental results: Direct fuzzy control on the pendulum with sloshing liquid at its endpoint.

Such experiments are also informative for considerations of robustness analysis, although here we regard such perturbations on the nominal system as probing the limits of linear controllers (i.e., operating outside the linear region). As a final evaluation of the performance of the fuzzy controller as developed above, we show results when a container half-filled with water was attached to the pendulum endpoint. This essentially gives a "sloshing liquid" effect, because the additional dynamics associated with the sloshing liquid are easily excited. In addition, the added weight shifted the pendulum center of mass away from the pivot point; as a result, the natural frequency of the pendulum decreased. Furthermore, the effect of friction becomes less dominant because the inertia of the pendulum increases. These effects obviously come to bear on the balancing controller performance, but significantly affect the swing up controller as well. For the present article, we note that the swing up control scheme requires tuning once additional weight is added to the endpoint, and refer the interested reader to [7] for details of a supervisory fuzzy controller scheme where tuning of the swing up controller is carried out autonomously. With the sloshing liquid added to the pendulum endpoint, the LQR controller (and, in fact, other linear control schemes we implemented on this system) produced an unstable response (was unable to balance the pendulum). Of course, the linear control schemes can be tuned to improve performance for the perturbed system, at the expense of degraded performance for the nominal system. Moreover, it is important to note that tuning of the LQR-type controller is difficult and ad hoc without additional modeling to account for the added dynamics. Such an attempt on this system produced a controller with stable but poor performance. The fuzzy controller, on the other hand, because of its expanded region of operation (in the sense that it acts like the LQR for small inputs and induces a nonlinearity for larger signals), was able to maintain stability in the presence of the additional dynamics and disturbances caused by the sloshing liquid, without tuning. These results are shown in Figure 57.42, where some degradation of controller performance is apparent. Such experiments may also motivate the need for a controller that can adapt to changing dynamics during operation; this issue is discussed later when we address adaptation in fuzzy controllers.
57.10.5 Adaptive Fuzzy Control

Overview    While fuzzy control has, for some applications, emerged as a practical alternative to classical control schemes, there exist rather obvious drawbacks. We do not address all of these drawbacks here (such as stability, which is a current research direction); rather, we focus on an important emerging topical area within the realm of fuzzy control, that of adaptive fuzzy control, as it relates to some of these drawbacks. The point that is probably most often raised in discussion of controller synthesis using fuzzy logic is that such procedures are usually performed in an ad hoc manner, where mechanized synthesis procedures, for the most part, are nonexistent (e.g., it is often not clear exactly how to justify the choices for many controller parameters, such as membership functions, defuzzification strategy, and inference strategy). On the other hand, some mechanized synthesis procedures do exist for particular applications, such as the one discussed above for the balancing control part of the inverted pendulum problem, where a conventional LQR scheme was utilized in the fuzzy control design. Typically such procedures arise primarily out of necessity because of system complexity (such as when many inputs and multiple objectives must be achieved). Controller adaptation, in which a form of automatic controller synthesis is achieved, is one way of attacking this problem when no other "direct" synthesis procedure is known. Another, perhaps equally obvious, requirement in the design and operation of any controller (fuzzy or otherwise) involves questions of system robustness. For instance, we illustrated in our theme example that performance of the direct fuzzy controller constructed for the nominal plant may degrade if significant and unpredictable plant parameter variations, structural changes, or environmental disturbances occur. Clearly, controller adaptation is one way of overcoming these difficulties to achieve reliable controller performance in the presence of unmodeled parameter variations and disturbances. Some would argue that the solution to such problems is always to incorporate more expertise into the rule base to enhance performance; however, there are several limitations to such a philosophy, including: (1) the difficulty of developing (and characterizing in a rule base) an accurate intuition about how best to compensate for the unpredictable and significant process variations that can occur for all possible process operating conditions; and (2) the complexity of constructing a fuzzy controller that potentially has a large number of membership functions and rules.
Experience has shown that it is often possible to tune fuzzy controllers to perform very well if the disturbances are known. Hence, the problem does not result from a lack of basic expertise in the rule base, but from the fact that there is no facility for automatically redesigning (i.e., retuning) the fuzzy controller so that it can appropriately react to unforeseen situations as they occur. There have been many techniques introduced for adaptive fuzzy control. For instance, one adaptive fuzzy control strategy that borrows certain ideas from conventional "model reference adaptive control" (MRAC) is called "fuzzy model reference learning control" (FMRLC) [10]. The FMRLC can automatically synthesize a fuzzy controller for the plant and later tune it if there are significant disturbances or process variations. The FMRLC has been successfully applied to an inverted pendulum, a ship-steering problem [10], antiskid brakes [11], reconfigurable control for aircraft [9], and in implementation for a flexible-link robot [12]. Modifications to the basic FMRLC approach have been studied in [13]. The work on the FMRLC and subsequent modifications to it tends to follow the main focus in fuzzy control, where one seeks to employ heuristics in control. There are other techniques that take an approach more like conventional adaptive control, in the sense that a mathematical model of the plant and a Lyapunov-type approach are used to construct the adaptation mechanism. Such work is described in [2]. There are many other "direct" and "indirect" adaptive fuzzy control approaches that have been used in a wide variety of applications
(e.g., for scheduling manufacturing systems [14]). The reader should consult the references in the papers cited above for more details. Another type of system adaptation, where a significant amount and variety of knowledge can be loaded into the rule base of a fuzzy system to achieve high-performance operation, is the supervisory fuzzy controller, a two-level hierarchical controller that uses a higher-level fuzzy system to supervise (coordinate or tune) a lower-level conventional or fuzzy controller. For instance, an expert may know how to control the system very well in one set of operating conditions, but if the system switches to another set of operating conditions, the controller may be required to behave differently to again achieve high-performance operation. A good example is the PID controller, which is often designed (tuned) for one set of plant operating conditions; if the operating conditions change, the controller will not be properly tuned. This is such an important problem that there is a significant amount of expertise on how to manually and automatically tune PID controllers. Such expertise may be utilized in the development of a supervisory fuzzy controller which can observe the performance of a low-level control system and automatically tune the parameters of the PID controller. Many other examples exist of applications where the control engineer may have a significant amount of knowledge about how to tune a controller. One such example is in aircraft control, where controller gains are scheduled based on the operating conditions. Fuzzy supervisory controllers have been used as schedulers in such applications.
In other applications we may know that conventional or fuzzy controllers need to be switched on based on the operating conditions (see [15] for work on fuzzy supervision of conventional controllers for a flexible-link robot), or a supervisory fuzzy controller may be used to tune an adaptive controller (see [9], where a fuzzy supervisor is used to tune an adaptive fuzzy controller that serves as a reconfigurable controller for an aircraft). It is this concept of monitoring and supervising lower-level controllers (possibly fuzzy, possibly conventional) that defines the supervisory control scheme. Indeed, in this sense supervisory and adaptive systems can be described as special cases of one another.
Autotuning for Pendulum Balancing Control    Many techniques exist for automatically tuning a fuzzy controller in order to meet the objectives mentioned above. One simple technique we present next, studied in [7], [13], [14], expands on the idea of increasing the "resolution" of the fuzzy controller in terms of the characteristics of the input membership functions. Recall from Section 57.10.4 for our theme problem that when we increased the number of membership functions on each input to 25, improved performance (and smoother control action) resulted. Likewise, we suspect that increasing the resolution would result in improved performance for the perturbed pendulum case. To increase the resolution of the direct fuzzy controller with a limited number of membership functions (as before, we will impose a limit of seven), we propose an "autotuned fuzzy control."
To gain insight on how the autotuned fuzzy control works, consider the idea of a "fine controller," with smooth interpolation, achieved using a fuzzy system where the input and output universes of discourse are narrow (i.e., the input normalizing gains are large, and the output gain is small). In this case there are many membership functions on a small portion of the universe of discourse (i.e., "high resolution"). Intuitively, we reason that as the input gains are increased and the output gain is decreased, the fuzzy controller will have better resolution. However, we also conjecture that to get the most effective control action the input universes of discourse must also be large enough to avoid saturation. This obviously raises a question of trying to satisfy two opposing objectives. The answer is to adjust the gains based on the current operating states of the system. For example, if the states move closer to the center, then the input universe of discourse should be compressed to obtain better resolution, yet still cover all the active states. If the states move away from the center, then the input universe of discourse must expand to cover these states, at the expense of lowering the resolution. The input-output gains of the fuzzy controller can be tuned periodically (e.g., every 50 samples) using the autotuning mechanism shown in Figure 57.43. Ideally, the autotuning algorithm should not alter the nominal control algorithm near the center; we therefore do not adjust each input gain independently. We can, however, tune the most significant input gain, and then adjust the rest of the gains based on this gain. For the inverted pendulum system, the most significant controller input is the position error of the pendulum, e3 = θ1.
The input-output gains are updated every ns samples in the following manner:

- Find the maximum |e3| over the most recent ns samples and denote it by e3max.
- Set the input gain g3 = 1/|e3max|.
- Recalculate the remaining gains using the technique discussed in Section 57.10.4, so as to preserve the nominal control action near the center.

We note that the larger ns is, the slower the updating rate is, and that too fast an updating rate may cause instability. Of course, if a large enough buffer were available to store the most recent ns samples of the input, the gains could be updated at every sample (utilizing an average); here we minimized the usage of memory and opted for the procedure mentioned above (finding the maximum value of e3). Simulation tests (with a 50-sample observation window and normalizing gain g3 = 2) reveal that when the fuzzy controller is activated (after swing up), the input gains gradually increase while the output gain decreases, as the pendulum moves closer to its inverted position. As a result, the input and output universes of discourse contract, and the resolution of the fuzzy system increases. As g3 reaches its maximum value (see footnote 14), the control action
14. In practice, it is important to constrain the maximum value for g3
near θ1 = 0 is smoother than that of direct fuzzy control with 25 membership functions (as investigated previously via simulation), and very good balancing performance is achieved. When turning to actual implementation on the laboratory apparatus, some adjustments were made in order to optimize the performance of the autotuning controller. As with the direct fuzzy controller, the value of g3 was adjusted upward, and the tuning (window) length was increased to 75 samples. The first test was to apply the scheme to the nominal system; the autotuning mechanism improved the response of the direct fuzzy controller (Figure 57.41) by varying the controller resolution online. That is, as the resolution of the fuzzy controller increased over time, the high-frequency effects diminished. The true test of the adaptive (autotuning) mechanism is to evaluate its ability to adapt its controller parameters as the process dynamics change. Once again we investigate the performance when the "sloshing liquid" dynamics (and additional weight) are appended to the endpoint of the pendulum. As expected from simulation exercises, the tuning mechanism, which "stretches" and "compresses" the universes of discourse on the input and output, not only varied the resolution of the controller but also effectively contained and suppressed the disturbances caused by the sloshing liquid, as clearly shown in Figure 57.44.
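The gain-update procedure described above can be sketched in a few lines (an illustrative sketch; the clamp value of 10 is taken from footnote 14, the nominal gains are those of Section 57.10.3, and the function names are hypothetical):

```python
# Sketch of the periodic gain-update rule. The clamp (10) is the constraint
# from footnote 14; k1-k4 are the nominal LQR gains, and the g3 * g0 = k3
# relation follows Section 57.10.4. Names are illustrative, not from the text.

k = {1: -0.9, 2: -1.1, 3: -9.2, 4: -0.9}   # nominal control gains k_i

def update_gains(e3_window, g3_max=10.0):
    """Recompute normalizing gains from the last n_s samples of e3."""
    e3_peak = max(abs(e) for e in e3_window)
    g3 = min(1.0 / e3_peak, g3_max)  # contract the universe to cover the peak
    g0 = k[3] / g3                   # preserve g3 * g0 = k3 near the center
    g = {j: k[j] / g0 for j in (1, 2, 4)}
    g[3] = g3
    return g, g0

# As the pendulum settles, |e3| shrinks, the input universes contract,
# and resolution increases -- until g3 hits its clamp:
g, g0 = update_gains([0.5, -0.3, 0.2])   # peak 0.5 -> g3 = 2
print(g[3], round(g0, 2))                # 2.0 -4.6
g, g0 = update_gains([0.05, -0.02])      # peak 0.05 -> 1/0.05 = 20, clamped
print(g[3], round(g0, 2))                # 10.0 -0.92
```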
57.10.6 Concluding Remarks

We have introduced the general fuzzy controller, shown how to design a fuzzy controller for our rotational inverted pendulum theme problem, and compared its performance to a nominal LQR controller. We overviewed supervisory and adaptive fuzzy control techniques and presented a particular adaptive scheme that worked very effectively for the balancing control of our rotational inverted pendulum theme problem. Throughout the chapter we have emphasized the importance of comparative analysis of fuzzy controllers with conventional controllers, with the hope that more attention will be given to detailed engineering cost-benefit analyses rather than the hype that has surrounded fuzzy control in the past. There are close relationships between fuzzy control and several other intelligent control methods. For instance, expert systems are generalized fuzzy systems since they have a knowledge base (a generalized version of a rule base where information in the form of general rules or other representations may be used), an inference mechanism that utilizes more general inference strategies, and are constructed using the same general approach as fuzzy systems. There are close relationships between some types of neural networks (particularly radial basis function neural networks) and fuzzy systems and, while we did not have the space to cover it here, fuzzy systems can be trained with numerical data in the same way that neural networks can (see [2] for more details). Genetic algorithms provide a stochastic optimization
(for our system, to a value of 10) because disturbances and inaccuracies in measurements could have adverse effects.
Figure 57.43    Autotuned fuzzy control. (Block diagram: an auto-tuning mechanism adjusts the normalizing gains of the fuzzy controller driving the inverted pendulum system; the ki are the nominal control gains.)
Figure 57.44    Experimental results: Autotuned fuzzy control on the pendulum with sloshing liquid at its endpoint.
technique and can be useful for computer-aided design of fuzzy controllers, training fuzzy systems for system identification, or tuning fuzzy controllers in an adaptive control setting.
While there exist such relationships between fuzzy control and other techniques in intelligent control, the exact relationships among all the techniques have not been established. Research along these lines is progressing but will take many years to complete. Another research area that is gaining increasing attention is the nonlinear analysis of fuzzy control systems and the use of comparative analysis of fuzzy control systems and conventional control systems to determine the advantages and disadvantages of fuzzy control. Such work, coupled with the application of fuzzy control to increasingly challenging problems, will help establish the technique as a viable control engineering approach.

57.10.7 Acknowledgments

A large portion of the ideas, concepts, and results reported in this chapter have evolved over several years from the work of many students under the direction of the authors. We therefore gratefully acknowledge the contributions of: A. Angsana, E. Garcia-Benitez, S. Gyftakis, D. Jenkins, W. Kwong, E. Laukonen, J. Layne, W. Lennon, A. Magan, S. Morris, V. Moudgal, J. Spooner, and M. Widjaja. We would like, in particular, to acknowledge the work of J. Layne and E. Laukonen for their assistance in writing earlier versions of Section 57.10.2 of this chapter, and M. Widjaja, who produced the experimental results for the rotational inverted pendulum.
57.10.8 Defining Terms

Rule Base: A part of the fuzzy system that contains a set of If...Then rules that quantify a human expert's knowledge about how to best control a plant. It is a special form of a knowledge base that only contains If...Then rules that are quantified with fuzzy logic.

Inference Mechanism: A part of the fuzzy system that reasons over the information in the rule base and decides what actions to take. For the fuzzy system the inference mechanism is implemented with fuzzy logic.

Fuzzification: Converts standard numerical fuzzy system inputs into a fuzzy set that the inference mechanism can operate on.

Defuzzification: Converts the conclusions reached by the inference mechanism (i.e., fuzzy sets) into numeric inputs suitable for the plant.

Supervisory Fuzzy Controller: A two-level hierarchical controller which uses a higher-level fuzzy system to supervise (coordinate or tune) a lower-level conventional or fuzzy controller.

Adaptive Fuzzy Controller: An adaptive controller that uses fuzzy systems either in the adaptation mechanism or as the controller that is tuned.
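To make the fuzzification, inference, and defuzzification terms concrete, here is a minimal one-input sketch using triangular membership functions and center-of-gravity defuzzification; the membership centers and rule consequents are made-up illustrative values, not parameters from the chapter:

```python
# One-input sketch of the terms defined above: triangular input membership
# functions (fuzzification), rules fired to their membership degree
# (inference), and center-of-gravity defuzzification.

def tri(x, center, width):
    """Triangular membership function: peak 1 at `center`, support `width`."""
    return max(0.0, 1.0 - abs(x - center) / width)

in_centers = [-1.0, 0.0, 1.0]    # input fuzzy sets: Negative, Zero, Positive
out_centers = [-2.0, 0.0, 2.0]   # consequent (output set) centers

def fuzzy_system(x):
    # Fuzzification: degree of membership of x in each input set.
    mu = [tri(x, c, 1.0) for c in in_centers]
    # Inference: rule "If x is A_i Then y is B_i" fires to degree mu[i].
    # Defuzzification: center of gravity of the fired output centers.
    den = sum(mu)
    return sum(m * c for m, c in zip(mu, out_centers)) / den if den else 0.0

print(fuzzy_system(0.0))    # 0.0
print(fuzzy_system(0.5))    # 1.0 (halfway between the Zero and Positive rules)
```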
References

[1] Jenkins, D. and Passino, K.M., Introduction to nonlinear analysis of fuzzy control systems, submitted for journal publication, 1995.
[2] Wang, L.-X., Adaptive Fuzzy Systems and Control: Design and Stability Analysis, Prentice Hall, Englewood Cliffs, NJ, 1994.
[3] Driankov, D., Hellendoorn, H., and Reinfrank, M., An Introduction to Fuzzy Control, Springer-Verlag, New York, 1993.
[4] Pedrycz, W., Fuzzy Control and Fuzzy Systems, 2nd ed., Research Studies Press, New York, 1993.
[5] Klir, G.J. and Folger, T.A., Fuzzy Sets, Uncertainty and Information, Prentice Hall, Englewood Cliffs, NJ, 1988.
[6] Zadeh, L.A., Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. Syst. Man Cybern., 3(1), 28-44, 1973.
[7] Widjaja, M. and Yurkovich, S., Intelligent control for swing up and balancing of an inverted pendulum system, Proc. of IEEE Conf. on Control Applications, Albany, NY, Sept. 1995.
[8] Spong, M.W., Swing up control of the acrobot, Proc. IEEE Int. Conf. on Robotics and Automation, San Diego, May 1994, pp. 2356-2361.
[9] Kwong, W.A., Passino, K.M., Laukonen, E.G., and Yurkovich, S., Expert supervision of fuzzy learning systems for fault tolerant aircraft control, Proc. IEEE, 83(3), 1995.
[10] Layne, J.R. and Passino, K.M., Fuzzy model reference learning control for cargo ship steering, IEEE Control Syst. Mag., 13(6), 23-34, 1993.
[11] Layne, J.R., Passino, K.M., and Yurkovich, S., Fuzzy learning control for antiskid braking systems, IEEE Trans. Control Syst. Technol., 1(2), 122-129, 1993.
[12] Moudgal, V.G., Kwong, W.A., Passino, K.M., and Yurkovich, S., Fuzzy learning control for a flexible-link robot, IEEE Trans. Fuzzy Syst., 3(2), 1995.
[13] Kwong, W.A. and Passino, K.M., Dynamically focused fuzzy learning control, IEEE Trans. Syst., Man Cybern., in press.
[14] Angsana, A. and Passino, K.M., Distributed fuzzy control of flexible manufacturing systems, IEEE Trans. Control Syst. Technol., 2(4), 423-435, 1994.
[15] Garcia-Benitez, E., Yurkovich, S., and Passino, K., Rule-based supervisory control of a two-link flexible manipulator, J. Intelligent Robotic Syst., 7(2), 195-213, 1993.
Further Reading

Other introductions to fuzzy control are given in [2], [3], and [4]. To study other applications of fuzzy control, the reader should consult the references in the papers and books cited above. There are several conferences which have sessions or papers that study various aspects of fuzzy control. For instance, see the Proceedings of FUZZ-IEEE (in the World Congress on Computational Intelligence), the IEEE Int. Symp. on Intelligent Control, the Proceedings of the IEEE Conf. on Decision and Control, the American Control Conf., and many others. The best journals to consult are the IEEE Transactions on Fuzzy Systems, Fuzzy Sets and Systems, and the IEEE Transactions on Systems, Man, and Cybernetics. Also, there are many short courses that are given on the topic of fuzzy control.
57.11 Neural Control

Jay A. Farrell, College of Engineering, University of California, Riverside

57.11.1 Introduction

In control, filtering, and prediction applications the available a priori model information is sometimes not sufficiently accurate for a prior fixed design to satisfy all of the design specifications. In these circumstances, the designer may
1. reduce the specified level of performance,
2. expend additional efforts to reduce the level of model uncertainty, or
3. design a system to adjust itself on-line to increase its performance level as the accumulated input and
output data is used to reduce the amount of model uncertainty.

Controllers (filters and predictors can be implemented by similar techniques) which implement the last approach are the topic of this chapter. Such techniques are necessary when additional effort in modeling and validation is unacceptable because of cost and feasibility, and the desired level of performance must be maintained. Linear adaptive control mechanisms [11] have been developed for systems that, by suitable limitation of their operating domain, can be adequately modeled as linear. Such control mechanisms are well developed, as described in Chapter 52. The on-line system identification and control techniques developed for nonlinear systems can be divided into two categories, parametric and nonparametric. Parametric techniques assume that the functional form of the unknown function is known, based on physical modeling principles, but that the model parameters are unknown. Nonparametric techniques are necessary when the functional form of the unknown function is unknown. In this case a general family of function approximators is selected based on the known properties of the approximation family and the characteristics of the application. In spite of the name, the objective of the nonparametric approach is still selecting a set of parameters to effect an optimal approximation to the unknown function. Learning and neural control (and system identification) [3], [4], [5] are nonparametric techniques developed to enhance the performance of those poorly modeled nonlinear systems for which a suitable model structure is not known. This performance improvement will be achieved by exploiting experiences obtained through on-line interactions with the actual plant. Approximation-based adaptive control is an alternative name used in the literature for this class of techniques. Neural control is the name applied in the literature when a neural network is selected as the function approximator.
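The parametric/nonparametric distinction can be illustrated with a toy identification problem (the quadratic drag law and all names below are assumptions for illustration, not from the text):

```python
import random

# Toy identification problem contrasting the two categories. The "true"
# system obeys a quadratic drag law f(v) = a * v * |v| (an assumed example).

random.seed(0)
a_true = 2.5
samples = [random.uniform(-1.0, 1.0) for _ in range(200)]
data = [(v, a_true * v * abs(v)) for v in samples]   # noise-free samples

# Parametric: the form a * v * |v| is assumed known from physics; only the
# scalar a is estimated (one-parameter least squares, closed form).
num = sum(y * v * abs(v) for v, y in data)
den = sum((v * abs(v)) ** 2 for v, y in data)
a_hat = num / den
print(round(a_hat, 6))   # 2.5 -- the true parameter is recovered

# Nonparametric: no functional form is assumed; a generic approximator
# (crude piecewise-constant bin averages over [-1, 1]) is fit instead.
nbins = 20
sums, counts = [0.0] * nbins, [0] * nbins
for v, y in data:
    b = min(int((v + 1.0) / 2.0 * nbins), nbins - 1)
    sums[b] += y
    counts[b] += 1

def f_nonparam(v):
    b = min(int((v + 1.0) / 2.0 * nbins), nbins - 1)
    return sums[b] / counts[b] if counts[b] else 0.0

print(f_nonparam(0.95))  # approximates f(0.95) without knowing the form
```

Both routes end in a parameter fit; the nonparametric one simply fits the parameters of a generic structure (here, bin averages) rather than a physically derived one.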
To be both concise and general, learning control will be the name used throughout this chapter. This chapter will focus on the motivation for, and implementation of, learning control systems. Topics of related interest include

- motivational applications with nonlinear dynamics and unknown model structure,
- the relationship of learning and neural control to alternative approaches, and
- the specification of the various components of learning/neural control systems.
57.11.2 Motivating Example

Although the techniques presented have wide applicability, we will cite examples from underwater vehicle control applications when useful for exemplary purposes. At this point it is of interest to note that, based on an understanding of the physical behavior of fluids, Newton's laws, Archimedes' principle, etc., a dynamic model of an underwater vehicle can be developed. However, due to both theoretical and practical limitations, the application of this model to particular vehicles may result in dynamic models too uncertain for satisfactory fixed control system design, and too complex for on-line adjustment. In such circumstances, it is interesting to consider whether a (possibly) nonphysically based model, suitable for high performance control system design, can be constructed on-line based on performance feedback information.

57.11.3 Problem Statement

Assume that the dynamics of the system can be described as

    x(k + 1) = f[x(k), v(k)] + w(k),    y(k) = x(k) + n(k),

where x(k) is the state of the system, n(k) indicates noise on the measurement, and w(k) is process noise. The control system design problem is to specify a control law v(k) = K(x(k), r(k); h[x(k), r(k)]) so that the closed-loop system achieves the desired level of performance. The function h[x(k), r(k)] is any auxiliary mapping defined over a compact domain D useful in the control system design. Examples include an estimate of the system dynamics f̂[x(k), v(k)], and mappings from the current operating condition to either linear model or control system parameters. When the function h[x(k), r(k)] is unknown or inaccurately known, it is of interest to consider whether on-line data can be used to synthesize an approximate function ĥ(x(k), r(k), θ) so that

    J(θ) = ∫_D || h(s) - ĥ(s, θ) ||² ds    (57.440)

achieves a minimum. Solutions to problems such as that stated above and analysis of existing solutions are still active areas of research. In the discussion following, we will see that the choice of the function approximation structure and the experimental conditions may be critical to the level of performance achieved.

57.11.4 Active and Passive Learning

Due to the fact that h(s) is unknown, Equation 57.440 cannot be optimized directly. Instead, a typical approach is to optimize a cost function based on samples of the function (i.e., yi = h(si)).^15 This results in a cost function defined as a summation of sample errors

    J_N(θ) = (1/N) Σ_{i=1}^{N} || y_i - ĥ(s_i, θ) ||².    (57.441)

15. The more general case in which the output of the approximated function cannot be directly measured is discussed in Section 57.11.7. Comments similar to those in this section still hold.
57.11. NEURAL CONTROL

However, the solution to Equation 57.441 is usually distinct from the solution to Equation 57.440. This is true because, for large N, Equation 57.441 converges to

∫_D ||h(s) − ĥ(s, θ)||² p(s) ds,    (57.442)
where p(s) is the experimental sample distribution. The solutions to Equations 57.440 and 57.442 will coincide only when the experimental sample distribution is uniform. Unfortunately, a uniform sample distribution is not typical of control applications. This difficulty distinguishes between active and passive learning scenarios. Active learning describes those situations in which the designer has the freedom to control the training sample distribution. Passive learning describes those situations in which the training sample density is defined by some external mechanism. On-line applications usually involve passive learning because the plant is performing some useful function.
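The effect of the sample distribution in Equation 57.442 can be seen in a small numerical sketch (the function, sample sets, and helper names below are invented for illustration, not from the text): even fitting a single constant parameter by least squares gives different answers under uniform (active) and clustered (passive) sampling.

```python
# Sketch: the sample-based cost weights the error by the sample density,
# so passive (nonuniform) sampling biases the "optimal" estimate.

def h(s):            # the "unknown" scalar function being approximated
    return s * s

def best_constant(samples):
    # Least-squares fit of a constant c to {h(s_i)}: c* = mean of h(s_i)
    return sum(h(s) for s in samples) / len(samples)

uniform = [i / 10.0 for i in range(11)]           # active: cover [0, 1]
clustered = [0.9 + i / 100.0 for i in range(11)]  # passive: plant sits near s = 1

c_active = best_constant(uniform)
c_passive = best_constant(clustered)
# The two estimates disagree because each minimizes the error weighted
# by its own sample distribution, as in Equation 57.442.
print(c_active, c_passive)
```

The same mechanism applies to any parameterized approximator: the minimizer of the sampled cost depends on where the data were collected.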
57.11.5 Model Types: Adaptation and Learning
In the selection of a suitable model structure, it is necessary to note the domain over which the model is expected to apply. In particular, the following two definitions are of interest.

DEFINITION 57.4 Local Approximation Structure: A parametric model f̂(x, θ) is a local approximation to f(x) at x0 if, for any ε > 0, a θ and a δ > 0 exist so that ||f(x) − f̂(x, θ)|| ≤ ε for all x ∈ B(x0, δ), where B(x0, δ) = {x : ||x − x0|| < δ}.

DEFINITION 57.5 Global Approximation Structure: A parametric model f̂(x, θ) is a global approximation to f(x) over domain D if, for any ε > 0, a θ exists so that ||f(x) − f̂(x, θ)|| ≤ ε for all x ∈ D.
The following items are of interest relative to the definitions of local and global models: Physical models derived from first principles (see Chapter 7) are expected to be global models (i.e., valid over the domain of operation D). Whether a given approximation structure is local or global depends on the system that is being modeled and the domain D. The set of global models is a strict subset of the set of local models. If a set of parameters θ exists satisfying Definition 57.5 for a particular ε, then this θ also satisfies Definition 57.4 for the same ε at each x0 ∈ D. As the desired level of accuracy increases (i.e., ε decreases), the dimension of the parameter vector or the complexity of the model structure will usually have to increase. To maintain accuracy over domain D, a local approximation
structure can either adjust its parameter vector through time as the operating point x0 changes, or store its parameter vector as a function of the operating point. The former approach is typical of linear adaptive control methodologies in nonlinear control applications. The latter approach has been set forth [5] as an instance of learning control; it effectively constructs a global approximation structure by connecting several local approximating structures. The conceptual differences between linear adaptive and learning control are of interest. In a nonlinear application, linear adaptive methodologies can be developed to maintain a locally accurate model or control law at the current operating point x0. Because the linearization will change over time, the adaptive mechanism will adjust the parameter estimates to maintain the accuracy of the local fit. Such an approach can provide satisfactory performance if the local linearization changes slowly, but will have to estimate the same model parameters repetitively if the system moves repetitively over a set of standard operating points. A more ambitious proposal is to develop a global model of the dynamics. One mechanism for this approach is to store the local linear models or controllers as a function of the operating point,

f̂(x, x0, v, v0) = A(x0, v0)(x − x0) + B(x0, v0)(v − v0),    (57.443)

where the unknown portions of the matrix functions A(x0, v0) and B(x0, v0) would be approximated over the domain D. The name learning control stems from the idea that past performance information is used to estimate and store information (relevant to an operating point) for use when the system operates in the vicinity of that operating point in the future. Learning control will require more extensive use of computer memory and computational effort. The required amount of memory and computation will be determined predominantly by the selection of the function approximation and parameter estimation (training) techniques.
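The store-and-reuse idea behind Equation 57.443 can be sketched in code. This is an illustrative sketch only: the dictionary-based store, the quantization rule, and all names below are assumptions, not from the chapter.

```python
# Sketch of "learning control" storage: local linear models (A, B) are
# estimated once and *stored* keyed by operating point, instead of being
# re-estimated every time the system revisits that operating condition.

class LocalModelStore:
    def __init__(self, resolution):
        self.resolution = resolution   # grid spacing for operating points
        self.models = {}               # key -> (A, B)

    def _key(self, x0, v0):
        # Quantize the operating point so nearby points share one model
        r = self.resolution
        return (round(x0 / r), round(v0 / r))

    def update(self, x0, v0, A, B):
        self.models[self._key(x0, v0)] = (A, B)

    def lookup(self, x0, v0):
        return self.models.get(self._key(x0, v0))

store = LocalModelStore(resolution=1.0)
store.update(5.0, 0.2, A=-0.8, B=1.5)   # scalar "matrices" for brevity
nearby = store.lookup(5.1, 0.3)         # a later revisit reuses the fit
```

An adaptive controller without the store would have to re-identify A and B on every revisit; the trade-off is the memory the store consumes.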
EXAMPLE 57.16: Global underwater vehicle model
Underwater vehicle dynamics can be represented as

m (dvc/dt) = F    (57.444)

and

dH/dt = M,    (57.445)

where vc is the vehicle velocity, ω is the angular velocity, H is the angular momentum, m is the vehicle mass matrix, and the right-hand sides F and M represent the total hydrodynamic, control, propulsion, and gravitational forces and moments acting on the vehicle. Although this equation is intended to describe the vehicle dynamics over the operating envelope, limitations of physical measurements and approximations introduced in the specification of the right-hand side for a particular vehicle usually result in a large number (> 100) of model parameters being scheduled as a
THE CONTROL HANDBOOK
function of the operating condition. The accuracy of the global fit is improved by storing the model parameters as a function of the operating condition. Such parameter scheduling techniques are typical in aircraft and underwater vehicle applications.
EXAMPLE 57.17: Local underwater vehicle model
In the vicinity of an operating point (x0, v0), the vehicle dynamic model, Equations 57.444-57.445, can be approximated by a linear system,
ẋ = A_{x0,v0} (x − x0) + B_{x0,v0} (v − v0),

where

A_{x0,v0} = ∂f(x, v)/∂x |_{x=x0, v=v0}

and

B_{x0,v0} = ∂f(x, v)/∂v |_{x=x0, v=v0}.

Such models, which allow linear control system design and analysis, are only expected to hold locally. Modeling and linearization techniques are discussed in detail in various control system textbooks (e.g., [10]) and in Chapter 5.
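When f is only available numerically, the Jacobians above can be formed by finite differences. The toy scalar dynamics below are invented for illustration; only the central-difference linearization procedure mirrors the text.

```python
# Numerical sketch of Example 57.17: forming A = df/dx and B = df/dv
# at an operating point (x0, v0) by central differences.

def f(x, v):
    # invented 1-state "vehicle" dynamics: nonlinear drag plus rudder input
    return -0.5 * x * abs(x) + 2.0 * v

def linearize(f, x0, v0, eps=1e-5):
    A = (f(x0 + eps, v0) - f(x0 - eps, v0)) / (2 * eps)
    B = (f(x0, v0 + eps) - f(x0, v0 - eps)) / (2 * eps)
    return A, B

A, B = linearize(f, x0=3.0, v0=0.0)
# analytically, d/dx of -0.5*x*|x| at x = 3 is -3.0 and d/dv is 2.0
print(A, B)
```

As the text warns, the resulting (A, B) pair is only trustworthy in a neighborhood of (x0, v0).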
EXAMPLE 57.18: Connected local models of an underwater vehicle
A common practice for control system design, referred to as scheduling (see Chapter 20), is to generate a global model or controller for a nonlinear system by interpolating between various local linear models. For example, if the dominant nonlinearities over a desired operating envelope D are due to changes in the forward velocity u, and linear models are available at integer values of the forward velocity between 5 and 15 feet per second, then

f̂(x, v) = Σ_{i=5}^{15} [A_i (x − xi) + B_i (v − vi)] Γi(u),

where xi = (0, …, 0, ui, 0, …, 0) is the ith operating point, by assumption ui = i, vi is the steady-state value of the actuation vector (i.e., propulsion force) necessary to maintain the operating condition, and
Γi(u) = 1 − |u − i|  if i − 1 ≤ u ≤ i + 1 (with 5 ≤ u ≤ 15),
Γi(u) = 0            otherwise,

is a linearly interpolated fit to the nonlinear dynamics. Such scheduled models and controllers are commonly assumed to convert a nonlinear problem to a set of linear design problems. Such approximation structures, where each parameter has a local effect on the approximation, can have appealing properties for passive learning control applications.
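The triangular influence functions above can be sketched directly. The local models are reduced to scalar gains purely for brevity, and the gain values are invented; only the hat-function blending follows the example.

```python
# Sketch of Example 57.18: triangular ("hat") influence functions that
# linearly interpolate local models stored at integer forward velocities
# u = 5..15.

def gamma(u, i):
    # Influence of the model at operating point i: 1 at u = i, decaying
    # linearly to 0 at u = i - 1 and u = i + 1, zero outside that support.
    return max(0.0, 1.0 - abs(u - i))

def scheduled(u, local_gain):
    # Blend only the local models whose supports contain u
    return sum(local_gain[i] * gamma(u, i) for i in range(5, 16))

gains = {i: 0.1 * i for i in range(5, 16)}   # invented local gains
# At u = 7 only model 7 is active; at u = 7.5 models 7 and 8 blend equally.
print(scheduled(7.0, gains), scheduled(7.5, gains))
```

Because each Γi has local support, adjusting the ith gain changes the blended model only for i − 1 < u < i + 1, which is exactly the localization property discussed later.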
57.11.6 Function Approximators
Based on the discussion of the previous sections, it is sometimes of interest to perform on-line function approximation to improve control performance. This is accomplished by using on-line experience (i.e., input-output data) to estimate the optimal parameters of a function approximation structure. Examples of several classes of function approximators are given in the following. The topic of the second subsection is a discussion of several useful approximator properties. An understanding of these properties is necessary to select an appropriate class of approximation structures for a particular application.
Examples of Function Approximators
Numerous classes of approximators exist. Below are a few examples.
Sigmoidal Neural Networks A strict mathematical definition of what is or is not a neural network does not exist; however, the class of sigmoidal neural networks has become widely accepted as a function approximator. Single hidden-layer neural networks have the form
ĥ(x) = W2 φ(W1 x + B2),    (57.447)

where ĥ = (ĥ1, …, ĥm) is a vector of functions of the independent variable xᵀ = (x1, …, xn); W1, B2, and W2 are parameter matrices of the appropriate dimensions; and φ, called the nodal function, is a scalar function of a single variable, applied componentwise. The parameters in B2, W1, and W2 are considered unknown and must be estimated to produce an optimal approximating function. Multiple-layer networks are constructed by cascading single-layer networks together, with the output from one network serving as the input to the next.
Orthogonal Polynomial Series A set of polynomial functions φi(s), i = 1, 2, …, is orthogonal with respect to a nonnegative weight function w(s) over the interval [a, b] if

∫_a^b φi(s) φj(s) w(s) ds = 0 for i ≠ j, and ∫_a^b φi(s)² w(s) ds = ri,
for some nonzero constants ri. When a function h(s) satisfies certain integrability conditions, it can be expanded as

h(s) = Σ_{n=0}^{∞} hn φn(s),

where

hn = (1/rn) ∫_a^b h(s) φn(s) w(s) ds.    (57.448)

The mth-order finite approximation of h(s) with respect to the polynomial series φi(s) is given by

ĥ(s) = Σ_{n=0}^{m−1} hn φn(s).

The integral of the squared error between h(s) and ĥ(s) over [a, b] converges to zero as m approaches ∞. When h(s) is unknown, it may be reasonable to approximate the hn in Equation 57.448 with on-line data.
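The truncated series of Equation 57.448 can be sketched with Legendre polynomials, which are orthogonal on [−1, 1] with weight w(s) = 1 and rn = 2/(2n + 1). The target function and the simple midpoint quadrature below are illustrative choices, not from the text.

```python
# Sketch: m-th order orthogonal-series approximation using Legendre
# polynomials, with coefficients h_n computed by numerical quadrature.

def legendre(n, s):
    if n == 0:
        return 1.0
    p0, p1 = 1.0, s
    for k in range(2, n + 1):          # Bonnet recursion
        p0, p1 = p1, ((2 * k - 1) * s * p1 - (k - 1) * p0) / k
    return p1

def coeff(h, n, steps=2000):
    # h_n = (1/r_n) * integral of h(s) * phi_n(s) ds, with r_n = 2/(2n+1)
    total, ds = 0.0, 2.0 / steps
    for j in range(steps):             # midpoint rule on [-1, 1]
        s = -1.0 + (j + 0.5) * ds
        total += h(s) * legendre(n, s) * ds
    return (2 * n + 1) / 2.0 * total

def h_hat(h, s, m):
    # m-th order finite approximation: sum of the first m terms
    return sum(coeff(h, n) * legendre(n, s) for n in range(m))

h = lambda s: s ** 3                   # expands exactly in 4 Legendre terms
print(abs(h(0.4) - h_hat(h, 0.4, 4)))  # small residual from quadrature only
```

In an on-line setting the integrals would be replaced by sums over measured samples, which reintroduces the sample-distribution issue of Section 57.11.4.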
Localized Basis Influence Functions Due to the usefulness of the interconnection of local models to generate global models, the class of basis influence functions [1] is presented. The definition is followed by examples of several approximation architectures that satisfy the definition, both to demonstrate the concept and to illustrate that several approximators often discussed independently can be studied as a class within the setting of the definition. This class of models is introduced to reduce redundancy in discussing the function approximation properties in the following subsections.

DEFINITION 57.6 Localized-basis influence functions: A function approximator is of the BI class if, and only if, it can be written as

f̂(x, θ) = Σ_i f̂i(x, θ; xi) Γi(x; xi),    (57.449)

where each f̂i(x, θ; xi) is a local approximation to f(x) for all x ∈ B(xi, δ), and Γi(x; xi) has local support Si = {x : Γi(x; xi) ≠ 0}, which is a subset of B(xi, δ), so that D ⊆ ∪i Si.

BOXES One of the simplest approximation structures that satisfies Definition 57.6 is the piecewise-constant function,

ĥ(s; θ) = Σ_i θi Γi(s),

where

Γi(s) = 1 if s ∈ Di, and 0 otherwise,

with ∪i Di = D and Di ∩ Dj = ∅ for i ≠ j. Often, the Di are selected in a rectangular grid covering D. In this case, very efficient code can be written to calculate the approximation and update its parameters. Generalizations of this concept include using more complex basis elements (e.g., parameterized linear functions) in place of the constants θi, and interpolating between several neighboring Di to produce a more continuous approximation.

Splinelike Functions Interpolating between regions in the BOXES approach can result in spline functions, such as (for a one-dimensional domain)

ĥ(s; θ) = θi (si+1 − s)/(si+1 − si) + θi+1 (s − si)/(si+1 − si)    (57.450)

for si+1 > s ≥ si, where θi is the value of ĥ(s; θ) at si. When the knots (i.e., the si) are equally spaced, Equation 57.450 can be rewritten to satisfy Definition 57.6 with constant basis functions and influence functions of the form

Γi(s) = 1 − |s − si|/(si+1 − si)  if si−1 ≤ s ≤ si+1, and 0 otherwise.    (57.451)

Note that such schemes can be extended to both multidimensional inputs and outputs, at the expense of increased memory and computational requirements. For example, a multidimensional output could be the state feedback gain as a function of operating condition. It is interesting to note the similarity between spline approximations, as described above, and some fuzzy approximators. In the language of fuzzy systems, a basis function is called a rule and an influence function is called a membership function.

Radial Basis Functions An approximation structure defined as

ĥ(s, θ) = Σ_i θi Γ(s; si),

where Γ(s; si) is a function such as exp[−(s − si)²], which decays with the distance of s from si, is called a radial basis function (RBF) approximator. In the case where Γ(s; si) has only local support, the RBF also satisfies Definition 57.6, with the basis elements being constant functions. Of course, the basis functions can again be generalized to more complicated, but more inclusive, functions. Although the exponential and inverse-square nodal functions do not themselves have local support, in most applications the designer only uses nodes within a fixed distance of the current evaluation point to calculate the value of the approximating function; hence, the effective nodal function does have local support. Alternatively, if the definition of Si were relaxed to Si = {x : Γ(x; xi) ≥ ε} for some small ε, then standard RBFs are also included in this class.
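The BOXES member of this class can be sketched on a one-dimensional grid. The grid, target values, and step size below are invented; only the partitioned supports and the local update follow the definition.

```python
# Sketch of a 1-D BOXES approximator (Definition 57.6): one constant
# parameter per grid cell D_i, with an incremental update that touches
# only the cell containing the training sample.

class Boxes:
    def __init__(self, lo, hi, n):
        self.lo, self.hi, self.n = lo, hi, n
        self.theta = [0.0] * n          # one constant per cell D_i

    def cell(self, s):
        # index of the cell D_i containing s (the supports partition D)
        i = int((s - self.lo) / (self.hi - self.lo) * self.n)
        return min(max(i, 0), self.n - 1)

    def value(self, s):
        return self.theta[self.cell(s)]

    def train(self, s, y, step=0.5):
        # Localization: only the parameter of the active cell changes
        i = self.cell(s)
        self.theta[i] += step * (y - self.theta[i])

approx = Boxes(0.0, 1.0, 10)
for _ in range(50):
    approx.train(0.25, 1.0)             # train only near s = 0.25
print(approx.value(0.25), approx.value(0.75))  # distant cells untouched
```

Replacing the cell constants with local linear models, and the indicator supports with overlapping hat functions, recovers the spline and scheduling structures above.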
Useful Approximator Properties The properties of families of approximators are discussed in this section to aid the designer in making an educated choice among the many available approximators. Universal Approximation Several universal approximation results have been derived in the literature. A typical result is presented below for discussion. For less restrictive results for neural networks, see [7].
THEOREM 57.13 [6] Let φ(·): R → R be a nonconstant, bounded, and monotone increasing continuous function. Let D be a compact subset of R^m and f(x1, …, xm) be a continuous function from D to R^n. Then, for any arbitrary ε > 0, an integer N and real parameter matrices B2, W1, and W2 exist so that the approximator defined in Equation 57.447 satisfies max_{x∈D} ||f(x) − f̂(x)||_∞ < ε.

This theorem indicates that if φ is a nodal function satisfying certain technical assumptions and ε is a specified approximation accuracy, then any given continuous function can be approximated to the desired degree of accuracy if the number of nodes N is sufficiently large. Such universal approximation results are powerful but must be interpreted with caution. First, note that universal approximation considers increasing the number of nodes, not the number of network layers. If the nodal functions are defined as pth-order polynomials, increasing the number of nodes per layer does not change the structure of the approximation from that of pth-order polynomials. It is well known that the set of pth-order polynomials cannot approximate
arbitrary continuous functions to a given accuracy ε. Therefore, a network of pth-order polynomials cannot be a universal approximator. Note that polynomials do not meet the conditions required for the above theorem. However, it is well known that, for any given continuous function, an integer P exists so that a polynomial of order p > P can be found that approximates the function to accuracy ε over the domain D. This increase in polynomial order would correspond to an increase in the number of layers of the network. Therefore, the fact that a family of parametric approximators does not have the universal approximation property does not by itself indicate that the family of approximators should be rejected. It should be kept in mind that universal approximation is a special case of a family of approximators being dense in the set of continuous functions. Second, universal approximation requires that the number of nodes per layer be expandable. In most applications, the number of nodes per layer Ni is fixed a priori. Once each Ni is fixed, the network can no longer approximate an arbitrary continuous function to a specified accuracy ε; hence, approximation structures that provide criteria for the network structure specification are beneficial.
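The single hidden-layer structure of Equation 57.447 can be sketched as follows. The sizes and weight values are invented; only the structure (a bounded, monotone sigmoidal nodal function applied componentwise between two weight layers) follows the text.

```python
# Sketch of Equation 57.447: h(x) = W2 * phi(W1 x + B2), a single
# hidden-layer sigmoidal network with scalar output.
import math

def phi(z):
    # bounded, monotone increasing nodal function (logistic sigmoid)
    return 1.0 / (1.0 + math.exp(-z))

def net(x, W1, B2, W2):
    # hidden layer: phi applied componentwise to W1 x + B2
    hidden = [phi(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B2)]
    # output layer: linear combination of the hidden nodal outputs
    return sum(w * h for w, h in zip(W2, hidden))

W1 = [[1.0, -1.0], [0.5, 0.5]]   # 2 inputs -> 2 hidden nodes (invented)
B2 = [0.0, -0.25]
W2 = [1.0, 2.0]
y = net([0.3, 0.1], W1, B2, W2)  # scalar network output
print(y)
```

Note that this approximator is not linear in W1 and B2, which is why the gradient-based training methods of Section 57.11.7 (e.g., backpropagation) are needed.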
Generalization For the applications discussed, the parameters of the approximation structure will be estimated from a finite set of training data,

T = {(si, yi), i = 1, …, N},

where yi and si are defined in Equations 57.439-57.440; however, the approximating function may be evaluated at any point s ∈ D. Therefore, the resulting approximation must be accurate throughout D, not only at the points specified by the training set T. The ability of parametric approximators to provide answers (good or bad) at points outside the training set is referred to in the neural network literature as generalization. The fact that an approximation structure is capable of generalizing from the training data is not necessarily a beneficial feature. Note that the approximator will output estimates ĥ(s) at any desired point s. The user must take appropriate steps to verify or ensure the accuracy of the resulting approximation at that point. Families of approximators which allow this assessment to occur on-line are desirable. Generalization of training data may also be described as interpolation or extrapolation of the training data. Interpolation refers to the process of filling in the holes between nearby training samples. Interpolation is a desirable form of generalization; most approximators interpolate well, and some allow on-line analysis of the interpolation accuracy. Extrapolation refers to the process of providing answers in regions of the learning domain D that the training set T did not adequately represent. Extrapolation from the training data is risky when the functional form is unknown by assumption. Clearly, it is desirable to know whether the approximator is interpolating between the training data or extrapolating from the training data.

Linearity in Parameters Function approximators that are linear in the parameters can be represented as

ĥ(x, v, θ) = θᵀ Φ(x, v),

where θ is the unknown parameter vector and Φ(x, v), the regressor vector, is a known, possibly nonlinear, vector function of x and v. The powerful parameter estimation and performance analysis techniques that exist for approximators linear in their parameters (see [8]) make this a beneficial property. The second major advantage of approximators that are linear in their parameters is that, for squared-error cost functions such as Equation 57.455, the cost function is a quadratic function of the parameters. Therefore, there is a unique global minimum.16 Similar conclusions hold for more general objective functions, as described in Equation 57.440. As discussed in Section 57.11.4, note that when a sample-error cost function such as Equation 57.455 is used in place of an integral cost function such as Equation 57.440, the optimal parameter estimate that results will depend on the distribution of the training samples. The fact that different parameter estimates result from different sets of training samples is not the result of multiple local minima; it is, instead, the result of different weightings by the sample density in the cost function.

Coverage This property can be stated formally as follows: for each f̂i and for any x ∈ D, at least one θj exists so that the function ∂f̂/∂θj is nonzero in the vicinity of x. If this property does not hold, then some point x ∈ D exists at which the approximation cannot be changed. This is obviously an undesirable situation.

Localization This property is stated formally as follows: for all f̂i and θj, if the function ∂f̂/∂θj is nonzero in the vicinity of x, then it must be zero outside some ball B(x, δ'). Under this condition the effects of changing any single parameter are limited to the local region. Thus, experience and consequent learning in one part of the input domain do not affect the knowledge accrued in other parts of the mapping domain. For example, in the scheduling example of Section 57.11.5, the parameters of the ith linear model only affect the function output for i − 1 < u < i + 1; hence, they should only be adjusted when u is in this range. If the gains were instead fitted with a Taylor series, for example, then adjusting any parameter of the approximation would affect the accuracy of the approximating function over the entire domain. Localization is a desirable property because it prevents extended training in the vicinity of any given operating point from adversely affecting the approximation accuracy in distant regions of D. Therefore, localization is beneficial in passive learning applications. The fact that a limited subset of the model parameters is relevant for a given point in the learning domain also makes it possible to reduce significantly the required amount of computation per sample interval. The trade-off is that an approximator with the localization property may require a higher number of parameters (more memory) than an approximator without the property. For example, if the domain D has dimension d, and m parameters are necessary for a given local approximator per input dimension, then on the order of m^d parameters will generally be required to approximate the d-dimensional function. This exponential increase in the memory requirements with input dimension is referred to in the literature as the curse of dimensionality.17

Discussion Function approximation structures that satisfy Definition 57.6 are of interest in learning control applications due to the fact that, when properly defined, they are dense on the set of continuous functions and satisfy the coverage and localization properties. In addition, if the functions f̂i(x, θ; xi) are each linear in their parameters, then the overall approximation is also linear in its parameters. Additional useful properties include the ability to store a priori knowledge in the approximation structure, the suitability of the approximation structure to available control design methodologies, and the availability of rules for specifying the structure of the function approximator.

16 Possibly an equivalent connected set of parameters attaining the global minimum, if the approximator is overparametrized.
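The uniqueness of the minimum for a linear-in-parameters approximator can be illustrated numerically. The data, regressors, step size, and iteration counts below are invented; the point is only that gradient descent on the quadratic cost reaches the same minimizer from very different initial guesses.

```python
# Sketch of the "linearity in parameters" point: for h(s, theta) =
# theta^T Phi(s) and a squared-error cost, the cost is quadratic in
# theta, so there are no local minima to get trapped in.

samples = [(0.0, 1.0), (0.5, 2.25), (1.0, 4.0)]   # from y = 1 + 2s + s^2

def regressor(s):
    return [1.0, s, s * s]     # Phi(s): linear in theta, nonlinear in s

def fit(theta, step=0.3, iters=5000):
    for _ in range(iters):
        for s, y in samples:   # incremental gradient steps on the cost
            phi = regressor(s)
            err = y - sum(t * p for t, p in zip(theta, phi))
            theta = [t + step * err * p for t, p in zip(theta, phi)]
    return theta

a = fit([0.0, 0.0, 0.0])
b = fit([5.0, -5.0, 5.0])      # very different initial guess
print(a, b)                    # both estimates agree
```

With a nonlinear-in-parameters structure (e.g., the sigmoidal network above), no such guarantee exists and different initializations can yield genuinely different fits.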
57.11.7 Parameter Estimation
Section 57.11.6 has presented various classes of function approximators and discussed their relevant properties. The present section discusses estimating the parameters of the approximation to optimize the objective function particular to a specific application.
Estimation Criteria
Various parameter estimation algorithms exist. Due to our interest in on-line control applications, we are interested in those techniques that minimize the amount of required calculation. Below, we will briefly consider a few parameter estimation criteria. In each case, the goal is to find the optimal parameter vector θ* so that θ* = argmin_θ J(θ), where J(θ) is a positive function of the parameter error. Parameter estimation is discussed in greater detail in Chapters 7 and 58 and in [8]. Approximation structures that are not linear in their parameters, and approximation structures with memory (i.e., their own state), will also be briefly considered. For approximation structures that are linear in the parameters, the resulting parameter estimation algorithms will generally have the form

θ̂(k) = θ̂(k − 1) + W(k) φ(k) e(k),    (57.454)
where, at instant k, θ̂(k) is the estimate of θ*, W(k) is a (possibly data-dependent) weighting matrix, φ(k) is the regressor vector,
17 The curse of dimensionality was a term originally used by Bellman to refer to the increasing complexity of the solution of dynamic programming problems with increasing input dimension.
and e(k) = y(k) − ŷ(k) is the prediction error.
Least Squares The least-squares parameter estimate is based on minimization of

J(θ) = (1/N) Σ_{i=1}^{N} [y(i) − ŷ(i)]²,    (57.455)

which, for an approximator that is linear in its parameters,

ŷ(i) = φ(i)ᵀ θ,    (57.456)

results in the estimate

θ̂(N) = P(N) R(N),    (57.457)

where P(N) = [(1/N) Σ_{i=1}^{N} φ(i) φ(i)ᵀ]⁻¹ is the inverse regressor sample autocorrelation matrix and R(N) = (1/N) Σ_{i=1}^{N} φ(i) y(i)ᵀ is the sample cross-correlation matrix for the regressor and the desired function output. By the matrix inversion lemma, the least-squares parameter estimate can be calculated recursively by Equation 57.454, where

W(k) = P(k − 1) [I + φ(k)ᵀ P(k − 1) φ(k)]⁻¹

and

P(k) = P(k − 1) − P(k − 1) φ(k) [I + φ(k)ᵀ P(k − 1) φ(k)]⁻¹ φ(k)ᵀ P(k − 1).

Note that the recursive least-squares solution to Equation 57.455 via Equation 57.454, with correct initialization of P, gives the same answer as the solution via Equation 57.457. The advantage of the recursive approach is that it involves inverting the p × p matrix (I + φ(k)ᵀ P(k − 1) φ(k)), where p is the dimension of ŷ, often one. The solution by Equation 57.457 involves inverting the M × M matrix ((1/N) Σ_{i=1}^{N} φ(i) φ(i)ᵀ), where M is the dimension of θ, which is usually very large. The main theoretical advantages of the least-squares parameter estimate are that the estimate is unbiased and minimum variance under the assumptions that 1) the unknown function is actually in the assumed function class described by Equation 57.456, 2) the noise in the measurement of the output y(t) is zero mean, and 3) there is no error in the measurement of the function input s. The convergence of the LS-based algorithms is also typically much faster than that of other estimation techniques. The main disadvantages of LS-based algorithms are that 1) the algorithm is computationally expensive, even in the recursive version, 2) the first and third assumptions listed in the previous paragraph do not typically hold, and 3) the algorithm is difficult to implement in the nonlinear case.
Nonlinear Least Squares The previous section discussed a special case of the problem stated in Section 57.11.3, where the unknown function was assumed linear in the parameters and samples of the unknown function were measured directly. When the approximation structure is not linear, or the samples of the unknown function cannot be measured directly, a nonlinear optimization problem results. In the first case, the parameter estimate is based on minimization of Equation 57.455 where
ŷ(i) = g[s(i), θ],

and g[s(i), θ], representing the structure of the function approximator, is not linear in θ. In the second case,

J(θ) = (1/N) Σ_{i=1}^{N} [y(i) − ŷ(i)]²

and

ŷ(i) = g[ĥ(s(i), θ), v(i)],

where g(·, ·) is again a nonlinear function. In this case, g(·, ·) represents the effect of the approximator output on the plant output. There are situations in which the form of this function is not completely known. In the general case of minimizing the two-norm of the sample error, (1/N) Σ_{i=1}^{N} {y(i) − g[ĥ(i), θ]}², as a function of θ, the iterative parameter estimation algorithm will have the form

θ(k + 1) = θ(k) + d(k),    (57.461)

where d(k) satisfies a Gauss-Newton normal equation of the form

[AᵀA + B] d(k) = Aᵀ e(k),    (57.462)

and where A contains the first derivatives of g with respect to θ, evaluated at the samples, and B contains the associated second-derivative terms. Specialization of this algorithm to the two cases of interest illustrates some important points. In the first special case,

A = (1/√N) ∂g(s, θ)/∂θ |_{s=s(i), θ=θ(k)},

and g(·, ·) is a known function because it represents the function approximator. In the special case where g(·, ·) is linear in the parameters, A = (1/√N) Φ[s(i)] = (1/√N) [φ(s1), …, φ(sN)]ᵀ, B = 0, and the solution of Equations 57.461-57.462 reduces to the least-squares solution previously presented. In the general case, because the functional form of g(·, ·) is known, the derivative information required to calculate A and B could be determined. In real-time applications, simplified algorithms, such as those discussed in the next section, are often used to reduce the computational complexity. In the second case, A and B would similarly involve first- and second-derivative information for g(·, ·); here g represents the manner in which the approximating function outputs affect the plant outputs. In those situations where the functional form of g is not known, the derivative information will not be available and approximations to the algorithm will be necessary. Standard drawbacks of the above Gauss-Newton solution to the nonlinear least-squares problem include the difficulty in calculating or estimating the second-order derivative information and the expense of solving Equation 57.462. Numerous algorithms have been developed to simplify the computations and improve the estimation performance (see, for example, [12], [13]).
Gradient Descent A common simplification of the above parameter estimation algorithms relies on the fact that a function's gradient with respect to a vector variable indicates the direction of the function's maximum increase with respect to that variable; the direction opposite the gradient indicates the direction of maximum decrease. By taking a step in the direction opposite the objective function's gradient with respect to the parameters, the objective function should be decreased. Unfortunately, the gradient does not indicate the size of the step that should be taken. The step size depends on higher-order derivative information, which has been neglected in the simplification. Therefore, the step size is usually designed to be small. It must be small because we are using local gradient information, which will change throughout the region of interest D. There is, of course, a trade-off in selecting the step size. Too small a step size will lead to slow convergence, while too large a step size will cause the parameter estimate to overstep and possibly oscillate around the optimal value. Several mechanisms have been proposed for adjusting the step size on-line. For the objective function and approximator structure defined in Equations 57.455-57.456, the gradient is

∂J/∂θ = −(2/N) Σ_{i=k−N+1}^{k} φ[s(i)] [y(i) − ŷ(i)] |_{θ=θ(k)},

where θ(k) is assumed constant over the previous N samples (i.e., θ(i) = θ(k − N + 1) for i = k − N + 1, …, k). This is referred to as batch training. The resulting parameter estimate update rule is
57.11. NEURAL CONTROL

where W(k) = diag[α_i(k)] is a diagonal matrix containing the step size for each element of the parameter vector. For convergence, α_i(k) > 0 and, because the gradient is only a local direction of steepest descent, usually α_i(k) << 1. This algorithm is also in the standard form of Equation 57.454.

Incremental training refers to the situation where N = 1, and the parameter vector is updated at each sample instant. In this case, the update algorithm is

θ(k+1) = θ(k) − W(k) {φ(k−i) [y(k−i) − ŷ(k−i)]}.

Batch training is often used in off-line training applications, but incremental training is preferred for on-line training. Incremental training avoids the delay in updating the parameters associated with accumulating a batch of data. This is appropriate, because in passive learning applications there is no guarantee that a batch of training data would be uniformly representative of the domain D. For the second case considered in the previous section (with N = 1), the gradient algorithm is

θ(k+1) = θ(k) − W(k) {φ(k−i) (∂g/∂ŷ) [y(k−i) − y_s(k−i)]},

where J(θ, k) = ||y(k) − y_s(k)||. Again, g(·, ·) describes the manner in which the approximated function output affects the plant output, and it may include the plant dynamics. In these cases, ∂g/∂ŷ will not be known, and approximate methods will again be required. The main advantage of gradient-descent algorithms is their simplicity, which allows rapid on-line calculation. The tradeoff is that gradient-following techniques normally converge at a much slower rate than the other techniques discussed above.

Backpropagation

As discussed in Section 57.11.6, some popular approximation architectures are not linear in the parameters. These approximators are referred to in Case 1 of the Nonlinear Least-Squares Section. When g(·, ·) represents the approximator structure as in Case 1 discussed above, the gradient-descent algorithm involves the partial derivative of g with respect to θ. Care should be taken in the calculation of the partial derivative ∂g/∂θ evaluated at s = s(i), θ = θ(k), which may, depending on the approximation structure, be computationally expensive. A particularly efficient algorithm for calculating the gradient in layered approximators (e.g., sigmoidal neural networks) is the backpropagation algorithm (see, for example, [9]).

Approximation of the Gradient Due to Internal State

Additional complexity will arise in situations where the approximating structure includes its own state. Consider the example of state estimation. Let an unknown plant be defined by a linear state-space model, where C is assumed known and the design objective is to construct a system so that the state estimation error ||x(k) − x_e(k)|| is minimized. To estimate the parameter vector θ incrementally by a gradient-based algorithm, note that the solution of the gradient-based algorithm is complicated by the solution of Equation 57.466, a nonlinear matrix difference equation. In most applications, an approximate solution to Equation 57.466 would be used.

57.11.8 Example

This example will illustrate the ideas discussed in this chapter, by means of a simplified underwater vehicle heading control example. The assumed dynamic model for vehicle heading is

dh(t)/dt = r(t),
dr(t)/dt = −a(u(t)) r(t) + b(u(t)) dr(t),

where h(t) and r(t) are the heading and yaw rate, u(t) is the forward velocity, dr(t) is the rudder deflection, a(u(t)) is the velocity-dependent friction, and b(u(t)) is the velocity-dependent rudder effectiveness. In reality, the model would be more complex. For example, it would include models of crossflow friction and the effects of nonzero roll angles. The model is, however, sufficient to demonstrate the desired concepts. The model above shows that the heading dynamics are velocity dependent. Therefore, the controller will have to react differently at different speeds. For this example, the functions a(u(t)) and b(u(t)) will be assumed unknown.
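The incremental (N = 1) gradient update discussed earlier in this section can be sketched for a linear-in-parameters approximator as follows. The helper name, step size, and data are illustrative, not from the chapter; the step plays the role of the diagonal entries of W(k).

```python
def incremental_gradient_step(theta, phi, y, step=0.05):
    """One incremental (N = 1) gradient step for a linear-in-parameters
    predictor y_hat = theta . phi. Gradient descent on the squared
    prediction error moves theta along +phi * (y - y_hat); `step` plays
    the role of the diagonal step-size matrix W(k)."""
    y_hat = sum(t * p for t, p in zip(theta, phi))
    err = y - y_hat
    return [t + step * p * err for t, p in zip(theta, phi)]

# Repeated application drives the prediction error toward zero when the
# regressors are persistently exciting, as the chapter's convergence
# discussion requires.
```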
THE CONTROL HANDBOOK

The equivalent discrete-time model is

h(k+1) = h(k) + c·r(k) + e·dr(k),
r(k+1) = d·r(k) + f·dr(k),

where d = e^(−aT), c = (1/a)(1 − d), f = bc, and e = (1/a)(bT − f). For parameter estimation, the discrete-time model will be represented as

h(k+1) − h(k) = d [h(k) − h(k−1)] + e·dr(k) + g·dr(k−1),
r(k+1) = d·r(k) + f·dr(k),

where θ^T = (d, e, g, f) and g = fc − de.

Figure 57.45  Velocity trajectory.

Because a and b are both functions of u, each parameter in the vector θ will also be a function of u. The objective of the example is to use on-line data to approximate the function θ(u) = (d(u), e(u), g(u), f(u)). Because this is a simulation, the actual function is known and will be used for performance analysis. This would not be the case in most applications. For this simulation example, the continuous-time model parameters, as a function of forward velocity, were assumed to be a(u) = 0.0461·u and b(u) = 0.0006 − 0.0050·u − 0.0009·u² for positive values of u. The simulation includes 18,000 input-output samples, corresponding to one hour of real-time data with the control system sample period equal to 0.2 sec (i.e., 18,000 samples = 5 samples/sec × 60 sec/min × 60 min/hr). The commanded heading during this period was generated using rrand, a uniform random variable in [−0.1, 0.1] that is changed every two seconds. This input signal is sufficiently and persistently exciting for the linear model with four parameters.

Figure 57.46  Velocity trajectory for t ∈ [500, 1000] sec.

The forward velocity is plotted versus time in Figure 57.45. The same forward velocity trajectory will be used for both the adaptive and learning-based controllers. The commanded forward velocity is held constant for one-minute intervals (i.e., 300 samples). At the end of each interval, a new commanded forward velocity is generated randomly in the interval from 2.0 to 16.0 feet per second. The actual forward velocity then changes continuously according to the dynamics u(k+1) = u(k) + lmt(0.1, u_c(k) − u(k)), where u_c is the commanded velocity and lmt(·) is a function that limits the magnitude of the acceleration to 0.1 feet per sec². A short segment of Figure 57.45 is shown in Figure 57.46 to display the acceleration-limited dynamics.
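The acceleration-limited velocity dynamics just described can be sketched as follows; the function names are illustrative.

```python
def lmt(bound, x):
    """Limit the magnitude of x to `bound` (the saturation used in the
    velocity dynamics above)."""
    return max(-bound, min(bound, x))

def velocity_step(u, u_c, accel_limit=0.1):
    """u(k+1) = u(k) + lmt(accel_limit, u_c - u): the commanded velocity
    u_c is approached at no more than `accel_limit` per sample, producing
    the ramped trajectories shown in Figure 57.46."""
    return u + lmt(accel_limit, u_c - u)
```

Once the actual velocity reaches the command, the limited increment is zero and the velocity holds constant, matching the flat segments of the trajectory.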
Using state feedback control dr(k) = [k_h h(k) + k_r r(k)], with the objective of placing the discrete-time closed-loop poles at λ₁(u) and λ₂(u), the equations for the state feedback gains follow by matching the closed-loop characteristic polynomial to (z − λ₁(u))(z − λ₂(u)).
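A sketch of this gain computation, assuming the reconstructed discrete model h(k+1) = h(k) + c·r(k) + e·dr(k), r(k+1) = d·r(k) + f·dr(k) and a negative-feedback convention dr(k) = −[k_h h(k) + k_r r(k)]. Matching the trace and determinant of the closed-loop matrix against the desired characteristic polynomial yields two linear equations in the gains; the function names and numeric values are illustrative.

```python
import math

def zoh_heading_model(a, b, T):
    """Zero-order-hold discretization of h' = r, r' = -a*r + b*dr
    (a reconstruction of the model constants quoted in the example)."""
    d = math.exp(-a * T)
    c = (1.0 - d) / a
    f = b * c
    e = (b * T - f) / a
    return c, d, e, f

def heading_gains(c, d, e, f, p1, p2):
    """Gains k_h, k_r placing the poles of A - B*K at p1 and p2, where
    A = [[1, c], [0, d]], B = [[e], [f]], K = [k_h, k_r].
    The kh*kr cross terms cancel in the determinant, so the pole-placement
    conditions reduce to two linear equations."""
    kh = (1.0 - p1) * (1.0 - p2) / (e * (1.0 - d) + c * f)
    kr = (1.0 + d - p1 - p2 - e * kh) / f
    return kh, kr
```

For the poles 0.75 and 0.95 used in the example, the resulting closed-loop matrix has exactly the requested trace and determinant.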
Because the parameter vector is forward-velocity dependent, the state feedback gains will be functions of the forward velocity even if the desired eigenvalues are constant. The closed-loop, discrete-time poles were selected as the constants 0.75 and 0.95. These were the fastest real poles achievable without excessive saturation (at 0.2 rad) of the rudder. Two controllers will be considered. Each controller would calculate the state feedback gains based on the estimated linear model parameters. The first controller is adaptive. The objective of the parameter estimation is to identify the linear model parameters for the current operating condition (i.e., forward velocity). The linear model parameters will change whenever the forward velocity changes. The parameter estimation algorithm is recursive least squares with covariance resetting. The covariance is reset whenever the forward velocity changes by more than 1.0 ft per sec. The necessity of covariance resetting or forgetting is discussed in [8]. When the covariance is reset, the last best parameter estimate, before the reset, is stored and used until the postreset parameter estimate is less than 3.00 in magnitude. Without this latching of the last best estimate, large transients may occur in the parameter estimator. Figure 57.47 displays the error (i.e., the two-norm of the error) between the actual closed-loop eigenvalues and the desired eigenvalues. The actual closed-loop eigenvalues are calculated at each instant using the current linear model parameter estimates. The large errors in the closed-loop eigenvalues correspond to the times at which the forward velocity is changed. When the velocity is constant, the eigenvalue error becomes small. This is shown more clearly in Figure 57.48, which expands a portion of the time axis. For this same portion of the time axis, Figures 57.49 and 57.50 display the trajectories of the parameter estimates of d and 250e. Although the performance of this adaptive controller could be improved with additional tuning, it is important to remember that possibly large transients will occur each time the forward velocity changes. The fact that the system may have operated at a particular forward velocity in the past will not decrease the transients at that velocity in the future, because the linear model is not capable of storing information as a function of the operating point.

Figure 57.47  Norm of the eigenvalue error over the entire training period.

Figure 57.48  Norm of the eigenvalue error between 500 and 1000 sec.

Figure 57.49  Adaptive estimate of d.
The second controller will incorporate a mechanism to synthesize a function which will store the linearized model (i.e., the θ vector) as a function of the forward velocity. Therefore, the objective function is as stated in Equation 57.440, with h(·) the linearized plant dynamic model and s the forward velocity. Three approximators will be considered, but only the piecewise constant approximator will be considered in detail. The piecewise constant approximator has a single input u and four outputs, the elements of θ(u). The domain of the input variable is D = [2, 16]. The region D was divided into 20 regions. Each of the four outputs will be approximated by a constant value on each of the 20 regions. Because, for this approximator, the regressor vector contains 19 elements that are zero and a single element that is one, extremely efficient algorithms can be written to solve the least-squares parameter fit. In fact, the computational load will be the same as that of the adaptive control solution described in the previous paragraphs.¹⁸ Although there are a total of eighty parameters (4 parameters per region times 20 regions), only four parameters are relevant to any training sample. The estimates of the four parameters for a given region will not be reliable until on the order of 16 training samples are received in that region (i.e., until the regressor sample autocorrelation matrix becomes sufficiently well-conditioned). Once this condition is satisfied, the approximation to the linear dynamics should be accurate over the given region despite what happens to the system when the forward velocity is outside the region. The piecewise approximations to d(u), 250e(u), 50f(u), and 250g(u) that resulted from training over the same set of data as described for the adaptive control approach are shown in Figures 57.51 through 57.54. A histogram showing the number of training samples in each region of the domain D is shown in Figure 57.55. The norm of the error between the actual and desired eigenvalues as a function of forward velocity is shown in Figure 57.56. The integral of the norm of the eigenvalue error (i.e., RMS error) over the region D is 0.016. Note that this corresponds to on-line training for one hour. At this point, if the actual system were assumed to be time invariant, then the parameter update could be stopped without losing control system performance. Repeating this example with a 20-element piecewise linear or a 5-element Legendre polynomial approximator resulted in RMS errors over the region D of 3.47e-4 and 1.04e-7, respectively.

Figure 57.50  Adaptive estimate of e.

Figure 57.51  Actual and piecewise constant approximation to d(u).
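The bin-local storage scheme can be sketched as follows. For brevity, this sketch averages locally obtained parameter samples in each bin rather than running the full per-bin least squares described above; the class and method names are illustrative.

```python
class PiecewiseConstantApprox:
    """Piecewise constant approximator over D = [2, 16] with 20 bins and a
    four-element parameter vector per bin. Because the regressor is
    one-hot in the bin index, each training sample touches only one bin,
    so the stored function never forgets what it learned at other speeds."""

    def __init__(self, lo=2.0, hi=16.0, nbins=20, dim=4):
        self.lo, self.hi, self.nbins = lo, hi, nbins
        self.theta = [[0.0] * dim for _ in range(nbins)]
        self.count = [0] * nbins

    def _bin(self, u):
        i = int((u - self.lo) / (self.hi - self.lo) * self.nbins)
        return min(max(i, 0), self.nbins - 1)

    def update(self, u, theta_sample):
        """Incremental mean of the samples seen in the active bin only."""
        i = self._bin(u)
        self.count[i] += 1
        n = self.count[i]
        self.theta[i] = [t + (s - t) / n
                         for t, s in zip(self.theta[i], theta_sample)]

    def __call__(self, u):
        return self.theta[self._bin(u)]
```

Training in one bin leaves every other bin untouched, which is the locality property the comparison with the global polynomial approximator turns on.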
Of the three approximators that were implemented, the piecewise constant approximator and the polynomial approximator had the lowest and highest computational requirements, respectively. Although the number of bins in the piecewise constant approach would have to increase significantly to achieve the same level of accuracy as the polynomial approximator, the computational complexity would not have to increase. Such an increase would not necessarily be beneficial, as the approximation could become 'spiky' due to the difficulty of getting a sufficient number of good samples in each bin. It would probably be more beneficial to move to a higher-order approximator, such as the piecewise linear function. The number of parameters that must be adjusted for the piecewise linear approximator is twice that of the piecewise constant approximator, but it decreased the RMS eigenvalue error by almost two orders of magnitude. The polynomial series approximator outperformed the piecewise linear approximator by another three orders of magnitude; however, this improved performance disappears when noise and variations in time must be taken into account. Time variations are important because they require, for example, covariance resetting or forgetting in the parameter estimation routine. These techniques then require meeting a persistence-of-excitation condition to ensure parameter convergence. The conditions for persistence of excitation are much easier to ensure for local approximators.

¹⁸An alternative approach to this piecewise constant solution is to view it as 20 adaptive control problems, where only one adaptive control problem is valid at each sampling instant.

Figure 57.52  Actual and piecewise constant approximation to e(u).
Figure 57.53  Actual and piecewise constant approximation to f(u).

Figure 57.54  Actual and piecewise constant approximation to g(u).

Figure 57.55  Histogram of u(k).
57.11.9 Conclusion

This chapter has discussed techniques for using on-line function approximation to improve control or state estimation performance. Successful implementations require proper selection of an objective function, an approximation structure, a training algorithm, and analysis of these three choices under the appropriate experimental conditions. Several characteristics of function approximators have been discussed. In the literature, much attention is devoted to selecting certain function approximators because as a class they have the universal approximation property. By itself, this condition is not sufficient. First, the set of function approximators that are dense on the set of continuous functions is larger than the set of universal function approximators. Second, universal function approximation requires that the approximator have an arbitrarily large size (i.e., width). Once the size of the approximator is limited, a limitation is also placed on the set of functions that can be approximated to within a given ε. In a particular application, the designer should instead consider which approximation structure can most efficiently represent the class of expected functions. Efficiency should be measured both in terms of the number of
Figure 57.56  Norm of eigenvalue error versus forward velocity.
parameters (i.e., computer memory) and the amount of computation (i.e., required computer speed). Due to the inexpensive cost of computer memory and the hard constraints on the amount of possible computation due to sampling frequency considerations, computational efficiency is often more important in applications.

Although the goal of many applications presented in the literature is to approximate a given function, the approximation accuracy is rarely evaluated directly, even in simulation examples. Instead, performance is often evaluated by analyzing graphs of the sample error, for example, a plot of the prediction error between the actual and predicted heading in the previous example. This is a dangerous approach because such graphs indicate only the accuracy of the approximation at the pertinent operating points. It is quite possible, as in the above adaptive controller example, to have a small sample error over a given time span without ever accurately approximating the desired function over the desired region. If the goal is to approximate a function over a given region, then some analysis other than monitoring the sample error during training must be performed to determine the success level. In simulation, the approximated and actual
functions can be compared directly. Successful simulation performance is necessary to provide the confidence for implementation. In implementations, where the actual function is not known, a simple demonstration that the function has been approximated correctly is to show that the performance does not degrade when the parameter update law is turned off.

Several of the training algorithms discussed herein were derivative based. The main disadvantage of derivative-based parameter estimation algorithms is that they will find only a local minimum of the objective function. This is not an issue when the approximator is linear in the parameters and the objective function is convex in the approximator output, because then the objective function is (quasi) convex in the parameters. It can be a major difficulty in other situations. Genetic and stochastic search techniques are not directly applicable to controller design, because a population of identical plants is not available for performance comparison. But techniques have been developed for constructing credit assignment functions which allow reinforcement learning techniques to search for globally optimal parameter estimates [1, 14].
References

[1] Millington, P., Associative Reinforcement Learning for Optimal Control, M.S. Thesis, Department of Aeronautics and Astronautics, MIT, 1991.
[2] Slotine, J. and Li, W., Applied Nonlinear Control, Prentice Hall, Englewood Cliffs, NJ, 1991.
[3] Fu, K., Learning Control Systems: Review and Outlook, IEEE Trans. Automat. Control, AC-15(2), 1970.
[4] Tsypkin, Y., Foundations of the Theory of Learning Systems, Academic Press, 1971.
[5] Farrell, J. and Baker, W., Learning Control Systems, in Intelligent-Autonomous Control Systems, Antsaklis, P. and Passino, K., Eds., Kluwer Academic, 1993.
[6] Funahashi, K., On the Approximate Realization of Continuous Mappings by Neural Networks, Neural Networks, 2, 183-192, 1989.
[7] Hornik, K., Stinchcombe, M., and White, H., Multilayer Feedforward Networks Are Universal Approximators, Neural Networks, 2, 359-366, 1989.
[8] Goodwin, G. C. and Sin, K. S., Adaptive Filtering Prediction and Control, Prentice Hall, Englewood Cliffs, NJ, 1984.
[9] Rumelhart, D. E., McClelland, J. L., and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume I: Foundations, MIT Press, Cambridge, MA, 1986.
[10] Ogata, K., Modern Control Engineering, Prentice Hall, Englewood Cliffs, NJ, 1990.
[11] Astrom, K. and Wittenmark, B., Adaptive Control, Addison-Wesley, 1989.
[12] Nonlinear Optimization: Theory and Algorithms, Dixon, L., Spedicato, E., and Szego, G., Eds., Birkhauser, Boston, 1980.
[13] Dennis, J. and Schnabel, R., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice Hall, Englewood Cliffs, NJ, 1983.
[14] Barto, A., Sutton, R., and Anderson, C., Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems, IEEE Trans. Syst., Man, and Cybernetics, SMC-13(5), 1983.
SECTION XII  System Identification
System Identification

58.1 The Basic Ideas .................................................. 1033
  10 Basic Questions About System Identification * Background and Literature * An Archetypical Problem - ARX Models and the Linear Least Squares Method * The Main Ingredients
58.2 General Parameter Estimation Techniques ........................ 1036
  Fitting Models to Data * Model Quality * Measures of Model Fit * Model Structure Selection * Algorithmic Aspects
58.3 Linear Black-Box Systems ....................................... 1041
  Linear System Descriptions in General * Linear, Ready-Made Models
58.4 Special Estimation Techniques for Linear Black-Box Models ...... 1043
  Transient and Frequency Analysis * Estimating Impulse Responses by Correlation Analysis * Estimating the Frequency Response by Spectral Analysis * Subspace Estimation Techniques for State-Space Models
58.5 Physically Parameterized Models ................................ 1047
58.6 Nonlinear Black-Box Models ..................................... 1048
  Nonlinear Black-Box Structures * Nonlinear Mappings: Possibilities * Estimating Nonlinear Black-Box Models
58.7 User's Issues ................................................... 1050
  Experiment Design * Model Validation and Model Selection * Software for System Identification * The Practical Side of System Identification

Lennart Ljung, Department of Electrical Engineering, Linkoping University, Sweden

References ......................................................... 1053
58.1 The Basic Ideas

58.1.1 10 Basic Questions About System Identification
1. What is System Identification? System Identification allows you to build mathematical models of a dynamic system based on measured data. 2. How is that done? By adjusting parameters within a given model until its output coincides as well as possible with the measured output.
3. How do you know if the model is any good? A good test is to take a close look at the model's output compared to the measurements on a data set that wasn't used for the fit ("Validation Data").
4. Can the quality of the model be tested in other ways? It is also valuable to look at what the model couldn't reproduce in the data ("the residuals"). There should be no correlation with other available information, such as the system's input.

0-8493-8570-9/96/$0.00+$.50 (c) 1996 by CRC Press, Inc.
5. What models are most common? The techniques apply to very general models. Most common models are difference equation descriptions, such as ARX and ARMAX models, as well as all types of linear state-space models. Lately, black-box nonlinear structures, such as Artificial Neural Networks, Fuzzy models, and so on, have been much used.
6. Do you have to assume a model of a particular type? For parametric models, you have to specify the structure. However, if you just assume that the system is linear, you can directly estimate its impulse or step response using Correlation Analysis or its frequency response using Spectral Analysis. This allows useful comparisons with other estimated models.
7. How do you know what model structure to try? Well, you don't. For real life systems there is never any "true model", anyway. You have to be generous at this point and try out several different structures.
8. Can nonlinear models be used in an easy fashion? Yes. Most common model nonlinearities are such that the measured data should be nonlinearly transformed (like squaring a voltage input if you think
that it's the power that is the stimulus). Use physical insight about the system you are modeling and try out such transformations on models that are linear in the new variables, and you will cover a lot.
9. What does this article cover? After reviewing an archetypical situation in this section, we describe the basic techniques for parameter estimation in arbitrary model structures. Section 58.3 deals with linear models of black-box structure, and Section 58.4 deals with particular estimation methods that can be used (in addition to the general ones) for such models. Physically parameterized model structures are described in Section 58.5, and nonlinear black-box models (including neural networks) are discussed in Section 58.6. Section 58.7 deals with the choices and decisions the user is faced with.
10. Is this really all there is to System Identification? Actually, there is a huge amount written on the subject. Experience with real data is the driving force to further understanding. It is important to remember that any estimated model, no matter how good it looks on your screen, is only a simple reflection of reality. Surprisingly often, however, this is sufficient for rational decision making.
58.1.2 Background and Literature

System Identification has its roots in standard statistical techniques, and many of the basic routines have direct interpretations as well-known statistical methods such as Least Squares and Maximum Likelihood. The control community took an active part in developing and applying these basic techniques to dynamic systems right after the birth of "modern control theory" in the early 1960s. Maximum likelihood estimation was applied to difference equations (ARMAX models) by [3], and thereafter a wide range of estimation techniques and model parameterizations flourished. By now, the area is mature with established and well-understood techniques. Industrial use and application of the techniques has become standard. See [13] for a common software package.

The literature on System Identification is extensive. For a practical, user-oriented introduction we may mention [16]. Texts that go deeper into the theory and algorithms include [14] and [26]. A classical treatment is [4]. These books all deal with the "mainstream" approach to system identification, as described in this article. In addition, there is a substantial literature on other approaches, such as "set membership" (compute all those models that reproduce the observed data within a certain given error bound), estimation of models from given frequency response measurements [24], on-line model estimation [17], nonparametric frequency domain methods [6], etc. To follow the development in the field, the IFAC series of Symposia on System Identification (Budapest, 1991; Copenhagen, 1994) is a good source.

58.1.3 An Archetypical Problem - ARX Models and the Linear Least Squares Method

The Model

We shall generally denote the system's input and output at time t by u(t) and y(t), respectively. The most basic relationship between the input and output is the linear difference equation

y(t) + a1 y(t - 1) + ... + an y(t - n) = b1 u(t - 1) + ... + bm u(t - m).   (58.1)

We have chosen to represent the system in discrete time, primarily because observed data are always collected by sampling. It is thus more straightforward to relate observed data to discrete-time models. Nothing prevents us, however, from working with continuous-time models; we shall return to that in Section 58.5. In Equation 58.1 we assume that the sampling interval is one time unit. This is not essential, but makes notation easier.

A pragmatic and useful way to see Equation 58.1 is to view it as a way of determining the next output value given previous observations:

y(t) = -a1 y(t - 1) - ... - an y(t - n) + b1 u(t - 1) + ... + bm u(t - m).   (58.2)

For more compact notation we introduce the vectors

θ = [a1, ..., an, b1, ..., bm]^T   (58.3)

and

φ(t) = [-y(t - 1) ... -y(t - n)  u(t - 1) ... u(t - m)]^T.   (58.4)

With these, Equation 58.2 can be rewritten as

y(t) = φ^T(t) θ.   (58.5)

To emphasize that the calculation of y(t) from past data in Equation 58.2 depends on the parameters in θ, we shall call this calculated value ŷ(t|θ) and write

ŷ(t|θ) = φ^T(t) θ.
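The regression-vector bookkeeping of Equations 58.3 through 58.5 can be sketched as follows; the helper names are illustrative, and samples outside the measured range are set to zero, the convention used in Example 58.1.

```python
def arx_regressor(y, u, t, n, m):
    """phi(t) = [-y(t-1) ... -y(t-n)  u(t-1) ... u(t-m)]  (Equation 58.4).
    Indices before the start of the record are taken as zero."""
    past_y = [-(y[t - k] if t - k >= 0 else 0.0) for k in range(1, n + 1)]
    past_u = [u[t - k] if t - k >= 0 else 0.0 for k in range(1, m + 1)]
    return past_y + past_u

def predict(theta, phi):
    """y_hat(t|theta) = phi(t)^T theta  (Equation 58.5)."""
    return sum(p * th for p, th in zip(phi, theta))
```

With theta = [a1, ..., an, b1, ..., bm], `predict` returns the one-step-ahead prediction of Equation 58.2.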
The Least Squares Method

Now suppose for a given system that we do not know the values of the parameters in θ, but that we have recorded inputs and outputs over a time interval 1 ≤ t ≤ N:

Z^N = {u(1), y(1), ..., u(N), y(N)}.   (58.6)

An obvious approach is then to select θ in Equations 58.1 through 58.5 so as to fit the calculated values ŷ(t|θ) as well as possible to the measured outputs by the least squares method:

min_θ V_N(θ, Z^N),   (58.7)

where

V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} (y(t) - ŷ(t|θ))².   (58.8)
58.1. THE BASIC IDEAS
We shall denote the value of θ that minimizes Equation 58.7 by θ̂_N:

θ̂_N = arg min_θ V_N(θ, Z^N).   (58.9)
("arg min" means the minimizing argument, i.e., that value of θ which minimizes V_N.) Since V_N is quadratic in θ, we can find the minimum value easily by setting the derivative to zero:

0 = (2/N) Σ_{t=1}^{N} φ(t) [y(t) - φ^T(t) θ],   (58.10)

which gives

θ̂_N = [ Σ_{t=1}^{N} φ(t) φ^T(t) ]^{-1} Σ_{t=1}^{N} φ(t) y(t).   (58.11)

Once the vectors φ(t) are defined, the solution can easily be found by modern numerical software, such as MATLAB.

EXAMPLE 58.1: First order difference equation

Consider the simple model

y(t) + a y(t - 1) = b u(t - 1).

This gives us the estimate according to Equations 58.3, 58.4, and 58.11. All sums are from t = 1 to t = N. A typical convention is to set values outside the measured range to zero. In this case we would thus set y(0) = 0.

The simple model Equation 58.1 and the well-known least squares method Equation 58.11 form the archetype of System Identification. They also give the most commonly used parametric identification method and are much more versatile than perceived at first sight. In particular, one should realize that Equation 58.1 can directly be extended to several different inputs (this just calls for a redefinition of φ(t) in Equation 58.4) and that the inputs and outputs do not have to be the raw measurements. On the contrary, it is often most important to think over the physics of the application and come up with suitable inputs and outputs for Equation 58.1 from the actual measurements.

EXAMPLE 58.2: An immersion heater

Consider a process consisting of an immersion heater in a cooling liquid. We measure

v(t): the voltage applied to the heater,
r(t): the temperature of the liquid, and
y(t): the temperature of the heater coil surface.

Suppose we need a model showing how y(t) depends on r(t) and v(t). Some simple considerations based on common sense and high school physics ("semi-physical modeling") reveal the following: The change in temperature of the heater coil over one sample is proportional to the electrical power in it (the inflow power) minus the heat loss to the liquid. The electric power is proportional to v²(t). The heat loss is proportional to y(t) - r(t). This suggests the model

y(t) = y(t - 1) + α v²(t - 1) - β [y(t - 1) - r(t - 1)],

which fits into the form

y(t) + a y(t - 1) = b1 v²(t - 1) + b2 r(t - 1).

This is a two-input (v² and r) and one-output model, and corresponds to choosing

φ(t) = [-y(t - 1)  v²(t - 1)  r(t - 1)]^T

in Equation 58.5.

Some Statistical Remarks

Model structures, such as Equation 58.5, that are linear in θ are known in statistics as linear regressions, and the vector φ(t) is called the regression vector (its components are the regressors). "Regress" here alludes to the fact that we try to calculate (or describe) y(t) by "going back" to φ(t). Models such as Equation 58.1, where the regression vector, φ(t), contains old values of the variable to be explained, y(t), are then partly autoregressions. For that reason the model structure Equation 58.1 has the standard name ARX-model (Autoregression with extra inputs).

There is a rich statistical literature on the properties of the estimate θ̂_N under varying assumptions (see, e.g., [8]). So far we have just viewed Equations 58.7 and 58.8 as "curve-fitting". In Section 58.2.2 we shall deal with a more comprehensive statistical discussion, which includes the ARX model as a special case. Some direct calculations will be done in the following subsection.

Model Quality and Experiment Design

Let us consider the simplest special case, that of a Finite Impulse Response (FIR) model, obtained from Equation 58.1 by setting n = 0:

y(t) = b1 u(t - 1) + ... + bm u(t - m).   (58.12)

Suppose that the observed data really have been generated by a similar mechanism
where e(t) is a white noise sequence with variance λ, but otherwise unknown. (That is, e(t) can be described as a sequence of independent random variables with zero mean values and variances λ.) Analogous to Equation 58.5, we can write this as

y(t) = φ^T(t) θ₀ + e(t).

We can now replace y(t) in Equation 58.11 by the above expression and obtain the parameter error

θ̃_N = θ̂_N - θ₀ = [ Σ φ(t) φ^T(t) ]^{-1} Σ φ(t) e(t).

Suppose that the input u is independent of the noise e. Then φ and e are independent in this expression, so it is easy to see that E θ̃_N = 0, because e has zero mean. The estimate is consequently unbiased. Here E denotes mathematical expectation.

We can also form the expectation of θ̃_N θ̃_N^T, i.e., the covariance matrix of the parameter error. Denote the matrix within brackets by R_N. Take expectation with respect to the white noise e. Then R_N is a deterministic matrix, and

E θ̃_N θ̃_N^T = λ R_N^{-1},

because the double sum collapses to λ R_N. We have thus computed the covariance matrix of the estimate θ̂_N, determined entirely by the input properties and the noise level. Moreover define

R̄ = lim_{N→∞} (1/N) R_N.

This will be the covariance matrix of the input, i.e., the i-j element of R̄ is R_u(i - j), as defined by Equation 58.89 later on. If the matrix R̄ is nonsingular, the covariance matrix of the parameter estimate is approximately (and the approximation improves as N → ∞)

P_N = (λ/N) R̄^{-1}.   (58.18)

A number of things follow from this, all typical of the general properties to be described in Section 58.2.2:

• The covariance decays like 1/N, so that the parameters approach the limiting value at the rate 1/√N.
• The covariance is proportional to the noise-to-signal ratio, that is, it is proportional to the noise variance and inversely proportional to the input power.
• The covariance does not depend on the input's or noise's signal shapes, only on their variance/covariance properties.
• Experiment design, i.e., the selection of the input u, aims at making the matrix R̄^{-1} "as small as possible." Note that the same R̄ can be obtained for many different signals u.

58.1.4 The Main Ingredients

The main ingredients for the System Identification problem are as follows:

• The data set Z^N,
• A class of candidate model descriptions; a Model Structure,
• A criterion of fit between data and models, and
• Routines to validate and accept resulting models.

We have seen in Section 58.1.3 a particular model structure, the ARX-model. In fact, the major problem in system identification is to select a good model structure, and a substantial part of this article deals with various model structures. See Sections 58.3, 58.5, and 58.6, all concerning this problem. Generally speaking, a model structure is a parameterized mapping from past inputs and outputs z^(t-1) (cf. Equation 58.6) to the space of the model outputs:

ŷ(t|θ) = g(θ, z^(t-1)).   (58.19)

Here θ is the finite dimensional vector used to parameterize the mapping. Actually, the problem of fitting a given model structure to measured data is much simpler, and can be dealt with independently of the model structure used. We shall do so in the following section. The problem of assuring a data set with adequate information is the problem of experiment design, and it will be described in Section 58.7.1. Model validation is a process to discriminate between various model structures and the final quality control station, before a model is delivered to the user. This problem is discussed in Section 58.7.2.
58.2 General Parameter Estimation Techniques

In this section we shall deal with issues that are independent of model structure. Principles and algorithms for fitting models to data, as well as the general properties of the estimated models, are all model-structure independent and equally well applicable to ARMAX models and Neural Network models. The section is organized as follows. In the first subsection, the general principles for parameter estimation are outlined. Thereafter
we deal with the asymptotic (in the number of observed data) properties of the models, while algorithms, both for on-line and off-line use, are described in the last subsection.
58.2.1 Fitting Models to Data
In Section 58.1.3 we showed one way to parameterize descriptions of dynamical systems. There are many other possibilities, and we shall spend a fair amount of this contribution discussing the different choices and approaches. This is actually the key problem in system identification. No matter how the problem is approached, a model parameterization leads to a predictor

ŷ(t|θ) = g(θ, z^{t−1}) (58.20)

that depends on the unknown parameter vector θ and past data z^{t−1} (see Equation 58.6). This predictor can be linear in y and u. This in turn contains several special cases, both in terms of black-box models and physically parameterized ones, as will be discussed in Sections 58.3 and 58.5, respectively. The predictor could also be of general, nonlinear nature, as will be discussed in Section 58.6.

In any case we now need a method to determine a good value of θ, based on the information in an observed, sampled data set Equation 58.6. It suggests itself that the basic least-squares approach, Equations 58.7 through 58.9, still is a natural approach, even when the predictor ŷ(t|θ) is a more general function of θ. A procedure with some more degrees of freedom is the following:

1. From observed data and the predictor ŷ(t|θ), form the sequence of prediction errors

ε(t, θ) = y(t) − ŷ(t|θ), t = 1, 2, ..., N. (58.21)

2. Filter the prediction errors through a linear filter L(q),

ε_F(t, θ) = L(q) ε(t, θ) (58.22)

(here q denotes the shift operator, qu(t) = u(t + 1)), to enhance or depress interesting or unimportant frequency bands in the signals.

3. Choose a scalar-valued, positive function ℓ(·) to measure the "size" or "norm" of the prediction error:

ℓ(ε_F(t, θ)). (58.23)

4. Minimize the sum of these norms:

θ̂_N = argmin_θ V_N(θ, Z^N), (58.24)

where

V_N(θ, Z^N) = Σ_{t=1}^N ℓ(ε_F(t, θ)). (58.25)

This procedure is natural and pragmatic; we can still think of it as "curve fitting" between y(t) and ŷ(t|θ). It also has several statistical and information-theoretic interpretations. Most importantly, if the noise source in the system (as in Equation 58.62 below) is supposed to be a sequence of independent random variables {e(t)}, each having a probability density function f_e(x), then Equation 58.24 becomes the Maximum Likelihood estimate (MLE) if we choose

L(q) = 1 and ℓ(ε) = − log f_e(ε). (58.26)
The MLE has several nice statistical features and thus gives a strong "moral support" for using the outlined method. Another pleasing aspect is that the method is independent of the particular model parameterization used (although this will affect the actual minimization procedure). For example, the method of "back propagation," often used in connection with neural network parameterizations, amounts to computing θ̂_N in Equation 58.24 by a recursive gradient method. We shall deal with these aspects later on in this section.
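The "curve fitting" of Equations 58.21 - 58.25 can be sketched numerically. The following is a minimal sketch, not the chapter's own code: it assumes L(q) = 1, the quadratic norm ℓ(ε) = ε², and a made-up noise-free first-order ARX predictor ŷ(t|θ) = −a·y(t−1) + b·u(t−1); a crude grid search stands in for the numerical minimization routines discussed later in Section 58.2.5.

```python
import math

def simulate_arx(a, b, u, e):
    # y(t) = -a*y(t-1) + b*u(t-1) + e(t), started from y(0) = 0
    y = [0.0]
    for t in range(1, len(u)):
        y.append(-a * y[t - 1] + b * u[t - 1] + e[t])
    return y

def criterion(theta, y, u):
    """V_N(theta): sum of squared prediction errors, L(q) = 1 (Eqs. 58.21-58.25)."""
    a, b = theta
    N = len(y)
    V = 0.0
    for t in range(1, N):
        yhat = -a * y[t - 1] + b * u[t - 1]   # predictor yhat(t|theta)
        V += (y[t] - yhat) ** 2               # quadratic norm of eps(t, theta)
    return V / (2 * N)

# made-up deterministic data; noise-free, so the true theta gives V_N = 0
u = [math.sin(0.5 * t) for t in range(200)]
e = [0.0] * 200
y = simulate_arx(0.5, 1.0, u, e)

# crude grid search over theta = (a, b); a real implementation would use a
# numerical search routine instead
best = min((criterion((a, b), y, u), (a, b))
           for a in [x / 10 for x in range(0, 11)]
           for b in [x / 10 for x in range(5, 16)])
```

With noise-free data the criterion is exactly zero at the true parameters, so the grid search recovers (a, b) = (0.5, 1.0).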
58.2.2 Model Quality

An essential question is, of course, what properties the estimate resulting from Equation 58.24 will have. These will naturally depend on the properties of the data record Z^N defined by Equation 58.6. It is, in general, a difficult problem to characterize the quality of θ̂_N exactly. One normally has to be content with the asymptotic properties of θ̂_N as the number of data, N, tends to infinity. It is an important aspect of the general identification method (Equation 58.24) that the asymptotic properties of the resulting estimate can be expressed in general terms for arbitrary model parameterizations. The first basic result is the following:
θ̂_N → θ* as N → ∞, (58.27)

where

θ* = argmin_θ E ℓ[ε_F(t, θ)], (58.28)
that is, as more and more data become available, the estimate converges to the value θ* that would minimize the expected value of the "norm" of the filtered prediction errors. This is the best possible approximation of the true system that is available within the model structure. The expectation E in Equation 58.28 is taken with respect to all random disturbances that affect the data, and it also includes averaging over the input properties. This means, in particular, that θ* will make ŷ(t|θ*) a good approximation of y(t) with respect to those aspects of the system enhanced by the input signal.

The second basic result is the following one: if {ε(t, θ*)} is approximately white noise, then the covariance matrix of θ̂_N is approximately given by
E(θ̂_N − θ*)(θ̂_N − θ*)^T ≈ (λ/N) [E ψ(t)ψ^T(t)]^{-1}, (58.29)

where

λ = E ε²(t, θ*), (58.30)

and

ψ(t) = (d/dθ) ŷ(t|θ)|_{θ=θ*}. (58.31)
THE CONTROL HANDBOOK

Think of ψ as the sensitivity derivative of the predictor with respect to the parameters. Then Equation 58.29 says that the covariance matrix for θ̂_N is proportional to the inverse of the covariance matrix of this sensitivity derivative. This is a quite natural result. Note: for all these results, the expectation operator E can, under most general conditions, be replaced by the limit of the sample mean, that is,

Ē f(t) = lim_{N→∞} (1/N) Σ_{t=1}^N E f(t). (58.32)
The results Equations 58.27 through 58.31 are general and hold for all model structures, both linear and nonlinear, subject only to some regularity and smoothness conditions. They are also fairly natural and will give the guidelines for all user choices involved in the process of identification. See [14] for more details around this.
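The covariance expressions of Equations 58.29 - 58.31 can be evaluated from data by replacing expectations with sample means. The sketch below does this for a made-up first-order ARX model, where the sensitivity derivative is simply ψ(t) = (−y(t−1), u(t−1)); the noise variance λ is estimated here from the known noise sequence, which is only possible because the data are simulated.

```python
import math, random

random.seed(0)
N = 1000
u = [math.sin(0.5 * t) for t in range(N)]
e = [0.1 * random.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0]
for t in range(1, N):
    y.append(-0.5 * y[t - 1] + 1.0 * u[t - 1] + e[t])

# lambda (Eq. 58.30): residual variance, known here by construction
lam = sum(e[t] ** 2 for t in range(1, N)) / (N - 1)

# E psi psi^T (Eq. 58.29) estimated by sample means (cf. Eq. 58.32),
# with psi(t) = (-y(t-1), u(t-1))
s11 = sum(y[t - 1] ** 2 for t in range(1, N)) / (N - 1)
s12 = sum(-y[t - 1] * u[t - 1] for t in range(1, N)) / (N - 1)
s22 = sum(u[t - 1] ** 2 for t in range(1, N)) / (N - 1)

# P = (lam/N) * inverse of the 2x2 matrix [[s11, s12], [s12, s22]]
det = s11 * s22 - s12 * s12
P = [[lam / N * s22 / det, -lam / N * s12 / det],
     [-lam / N * s12 / det, lam / N * s11 / det]]
```

Note the 1/N factor: doubling the data record halves the parameter covariance, which is the 1/√N convergence rate mentioned in connection with Equation 58.18.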
58.2.3 Measures of Model Fit

Some quite general expressions for the expected model fit, that are independent of the model structure, can also be developed. Let us measure the (average) fit between any model Equation 58.20 and the true system as

V̄(θ) = E|y(t) − ŷ(t|θ)|². (58.33)
Here expectation E is over the data properties (i.e., expectation over Z^∞ with the notation of Equation 58.6). Recall that expectation can also be interpreted as sample means, as in Equation 58.32. Before we continue, note that the fit V̄ will depend, not only on the model and the true system, but also on data properties, like input spectra, possible feedback, etc. We shall say that the fit depends on the experimental conditions.

The estimated model parameter θ̂_N is a random variable, because it is constructed from observed data that can themselves be described as random variables. To evaluate the model fit, we then take the expectation of V̄(θ̂_N) with respect to the estimation data. That gives our measure

F̄_N = E V̄(θ̂_N). (58.34)

In general, the measure F̄_N depends on a number of things: the model structure used, the number of data points N, the data properties for which the fit V̄ is defined, and the properties of the data used to estimate θ̂_N. The rather remarkable fact is that, if the two last data properties coincide, then, asymptotically in N (see [14], Chapter 16),

F̄_N ≈ V̄(θ*) (1 + dim θ / N). (58.35)

Here θ* is the value that minimizes the expected criterion Equation 58.28. The notation dim θ means the number of estimated parameters. The result also assumes that the criterion function is ℓ(ε) = ||ε||², and that the model structure is successful in the sense that ε_F(t) is approximately white noise.

Despite the reservations about the formal validity of Equation 58.35, it carries a most important conceptual message: if a model is evaluated on a data set with the same properties as the estimation data, then the fit will not depend on the data properties, and it will depend on the model structure only in terms of the number of parameters used and of the best fit offered within the structure.

The expression can be rewritten as follows. Let ŷ_0(t|t−1) denote the "true" one-step-ahead prediction of y(t), let

W(θ) = E|ŷ_0(t|t−1) − ŷ(t|θ)|², (58.36)

and let

λ = E|y(t) − ŷ_0(t|t−1)|². (58.37)

Then λ is the innovations variance, i.e., that part of y(t) that cannot be predicted from the past. Moreover, W(θ*) is the bias error, i.e., the discrepancy between the true predictor and the best one available in the model structure. Under the same assumptions as above, Equation 58.35 can be rewritten as

F̄_N ≈ λ + W(θ*) + λ (dim θ)/N. (58.38)

The three terms constituting the model error then have the following interpretations:
λ is the unavoidable error, stemming from the fact that the output cannot be exactly predicted, even with perfect system knowledge.

W(θ*) is the bias error. It depends on the model structure and on the experimental conditions. It will typically decrease as dim θ increases.

The last term is the variance error. It is proportional to the number of estimated parameters and inversely proportional to the number of data points. It does not depend on the particular model structure or on the experimental conditions.
58.2.4 Model Structure Selection

The most difficult choice for the user is to find a suitable model structure to fit the data to. This is of course a very application-dependent problem, and it is difficult to give general guidelines. (Still, some general practical advice will be given in Section 58.7.) The heart of the model structure selection process is handling the trade-off between bias and variance, as formalized by Equation 58.38. The "best" model structure is the one that minimizes F̄_N, the fit between the model and the data for a fresh data set, one that was not used for estimating the model. Most procedures for choosing the model structure are also aiming at finding this best choice.

Cross-Validation

A very natural and pragmatic approach is cross-validation. This means that the available data set is split into two parts:
estimation data, Z_est^N, that is used to estimate the models,

θ̂_N = argmin_θ V_N(θ, Z_est^N), (58.39)

and validation data, Z_val, for which the criterion is evaluated:

F̂_N = V_N(θ̂_N, Z_val). (58.40)
Here V_N is the criterion Equation 58.25. Then F̂_N will be an unbiased estimate of the measure F̄_N, defined by Equation 58.34, discussed at length in the previous section. The procedure would be to try out a number of model structures and choose the one that minimizes F̂_N. Cross-validation to find a good model structure has an immediate intuitive appeal. We simply check if the candidate model is capable of "reproducing" data it hasn't yet seen. If that works well, we have some confidence in the model, regardless of any probabilistic framework that might be imposed. Such techniques are also the most commonly used.

A few comments can be added. In the first place, one can use different splits of the original data into estimation and validation data. For example, in statistics, there is a common cross-validation technique called "leave one out." This means that the validation data set consists of one data point "at a time," successively applied over the whole original set. In the second place, the test of the model on the validation data does not have to be in terms of the particular criterion Equation 58.40. In system identification it is common practice to simulate (or predict several steps ahead) the model using the validation data, and then visually inspect the agreement between measured and simulated (predicted) output.
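The split of Equations 58.39 - 58.40 can be sketched as follows. This is a minimal illustration, not the chapter's code: made-up FIR models y(t) = b_1 u(t−1) + ... + b_n u(t−n) of increasing order n are fitted by least squares on the first half of a simulated record and scored on the second half.

```python
import random

def fit_fir(y, u, n, t0, t1):
    """Least-squares fit of y(t) = b1*u(t-1)+...+bn*u(t-n) on samples t0..t1-1."""
    # normal equations A b = c
    A = [[sum(u[t - i - 1] * u[t - j - 1] for t in range(t0, t1)) for j in range(n)]
         for i in range(n)]
    c = [sum(u[t - i - 1] * y[t] for t in range(t0, t1)) for i in range(n)]
    # Gaussian elimination (no pivoting; adequate for this well-conditioned sketch)
    for i in range(n):
        for j in range(i + 1, n):
            f = A[j][i] / A[i][i]
            for k in range(i, n):
                A[j][k] -= f * A[i][k]
            c[j] -= f * c[i]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (c[i] - sum(A[i][k] * b[k] for k in range(i + 1, n))) / A[i][i]
    return b

def V(y, u, b, t0, t1):
    """Mean-square prediction error, the criterion of Equation 58.40."""
    n = len(b)
    return sum((y[t] - sum(b[k] * u[t - k - 1] for k in range(n))) ** 2
               for t in range(t0, t1)) / (t1 - t0)

random.seed(1)
N = 400
u = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0] * N
for t in range(2, N):                       # true system has two taps
    y[t] = 1.0 * u[t - 1] + 0.5 * u[t - 2] + 0.1 * random.gauss(0.0, 1.0)

# estimate on the first half (Eq. 58.39), validate on the second half (Eq. 58.40)
scores = {}
for n in (1, 2, 4):
    b = fit_fir(y, u, n, 4, N // 2)
    scores[n] = V(y, u, b, N // 2 + 4, N)
```

Here the order-1 model pays a large bias penalty on the validation data, while the order-2 model (matching the true system) approaches the noise floor; the over-parameterized order-4 model gains nothing.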
Estimating the Variance Contribution - Penalizing the Model Complexity

It is clear that the criterion Equation 58.40 has to be evaluated on the validation data to be of any use. It would be strictly decreasing as a function of model flexibility if evaluated on the estimation data. In other words, the adverse effect of the dimension of θ shown in Equation 58.38 would be missed. There are a number of criteria, often derived from entirely different viewpoints, that try to capture the influence of this variance error term. The two best known are Akaike's Information Theoretic Criterion, AIC, which has the form (for Gaussian disturbances)

V̄_N(θ) = V_N(θ, Z^N) (1 + 2 dim θ / N), (58.41)

and Rissanen's Minimum Description Length Criterion, MDL, in which dim θ in the expression above is replaced by dim θ · log N. See [1] and [23]. The criterion V̄_N is then to be minimized both with respect to θ and to a family of model structures. The relationship to Equation 58.35 for F̄_N is obvious.

58.2.5 Algorithmic Aspects

In this section we shall discuss how to achieve the best fit between observed data and the model, i.e., how to carry out the minimization of Equation 58.24. For simplicity we here assume a quadratic criterion and set the prefilter L to unity:

V_N(θ, Z^N) = (1/(2N)) Σ_{t=1}^N [y(t) − ŷ(t|θ)]². (58.42)

No analytic solution to this problem is possible unless the model ŷ(t|θ) is linear in θ, so the minimization has to be done by some numerical search procedure. A classical treatment of the problem of minimizing the sum of squares is given in [7]. Most efficient search routines are based on iterative local search in a "downhill" direction from the current point. We then have an iterative scheme of the following kind:

θ̂^(i+1) = θ̂^(i) − μ_i R_i^{-1} ĝ_i. (58.43)

Here θ̂^(i) is the parameter estimate after iteration number i. The search scheme is thus made up of the three entities:

μ_i, the step size,
ĝ_i, an estimate of the gradient V_N'(θ̂^(i)), and
R_i, a matrix that modifies the search direction.

It is useful to distinguish between two different minimization situations:

1. Off-line or batch: the update μ_i R_i^{-1} ĝ_i is based on the whole available data record Z^N.
2. On-line or recursive: the update is based only on data up to sample i (Z^i), (done so that the gradient estimate ĝ_i is based only on data up to sample i).

We shall discuss these two modes separately below. First some general aspects will be treated.

Search Directions

The basis for the local search is the gradient

V_N'(θ, Z^N) = −(1/N) Σ_{t=1}^N (y(t) − ŷ(t|θ)) ψ(t, θ), (58.44)

where

ψ(t, θ) = (∂/∂θ) ŷ(t|θ). (58.45)

The gradient ψ is, in the general case, a matrix with dim θ rows and dim y columns. It is well known that gradient search for the minimum is inefficient, especially close to the minimum. There it is optimal to use the Newton search direction

[V_N''(θ, Z^N)]^{-1} V_N'(θ, Z^N), (58.46)

where

V_N''(θ, Z^N) = (d²/dθ²) V_N(θ, Z^N). (58.47)
The true Newton direction will thus require that the second derivative V_N''(θ, Z^N) be computed. Also, far from the minimum, R(θ) need not be positive semidefinite. Therefore alternative search directions are more common in practice:

Gradient direction. Simply take

R_i = I. (58.48)

Gauss-Newton direction. Use

R_i = H_i = (1/N) Σ_{t=1}^N ψ(t, θ̂^(i)) ψ^T(t, θ̂^(i)). (58.49)

Levenberg-Marquardt direction. Use

R_i = H_i + δI, (58.50)

where H_i is defined by Equation 58.49.

Conjugate gradient direction. Construct the Newton direction from a sequence of gradient estimates. Think of V_N'' as constructed by difference approximation of d gradients. The direction Equation 58.46 is, however, constructed directly, without explicitly forming and inverting V_N''.

It is generally considered [7] that the Gauss-Newton search direction is to be preferred. For ill-conditioned problems the Levenberg-Marquardt modification is recommended.
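The iterative scheme of Equation 58.43 with the Gauss-Newton choice of Equation 58.49 can be sketched for a scalar parameter, where R_i and ĝ_i are just numbers. The model below, ŷ(t|θ) = exp(−θ·t), is a made-up decay model chosen only because it is nonlinear in θ; its sensitivity derivative is ψ(t, θ) = −t·exp(−θ·t).

```python
import math

ts = [0.1 * k for k in range(50)]
y = [math.exp(-0.7 * t) for t in ts]   # noise-free data, true theta = 0.7

theta = 0.0
for _ in range(50):
    g = 0.0   # gradient estimate, Equation 58.44
    H = 0.0   # Gauss-Newton "Hessian", Equation 58.49 (scalar here)
    for t, yt in zip(ts, y):
        yhat = math.exp(-theta * t)
        psi = -t * yhat
        g += -(yt - yhat) * psi
        H += psi * psi
    g /= len(ts)
    H /= len(ts)
    theta = theta - g / H   # Equation 58.43 with mu_i = 1, R_i = H_i
```

For this zero-residual problem the Gauss-Newton step behaves like a Newton step near the minimum, and the iteration converges to the true value 0.7 in a handful of iterations.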
On-Line Algorithms

Equations 58.44 and 58.47 for the Gauss-Newton search clearly assume that the whole data set Z^N is available during the iterations. If the application is off-line, i.e., the model θ̂_N is not required during the data acquisition, this is also the most natural approach. However, many adaptive situations require on-line (or recursive) algorithms, where the data are processed as they are measured. (In Neural Network contexts such algorithms are often used also off-line: the measured data record is then concatenated with itself several times to create a long record that is fed into the on-line algorithm.) We may refer to [17] as a general reference for recursive parameter estimation algorithms. In [27] the use of such algorithms in the off-line case is discussed. It is natural to consider the following algorithm as the basic one:

θ̂(t) = θ̂(t−1) + μ_t R_t^{-1} ψ(t, θ̂(t−1)) ε(t, θ̂(t−1)), (58.51)

ε(t, θ) = y(t) − ŷ(t|θ), (58.52)

R_t = R_{t−1} + μ_t [ψ(t, θ̂(t−1)) ψ^T(t, θ̂(t−1)) − R_{t−1}]. (58.53)
The reason is that if ŷ(t|θ) is linear in θ, then Equations 58.51 to 58.53, with μ_t = 1/t, provide the analytical solution to the minimization problem Equation 58.42. This also means that this is a natural algorithm close to the minimum, where a second-order expansion of the criterion is a good approximation. In fact, it is shown in [17] that Equations 58.51 - 58.53 in general give an estimate θ̂(t) with the same ("optimal") statistical, asymptotic properties as the true minimum of Equation 58.42. The quantities ŷ(t|θ̂(t−1)) and ψ(t, θ̂(t−1)) would normally (except in the linear regression case) require that the whole data record be processed at each step. This would violate the recursiveness of the algorithm. In practical implementations these quantities are therefore replaced by recursively computed approximations. The idea behind these approximations is to use the defining equations for ŷ(t|θ) and ψ(t, θ) (which typically are recursive equations), and replace any appearance of θ with its latest available estimate. See [17] for more details. Some averaged variants of Equations 58.51 to 58.53 have also been discussed:
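For a linear regression the recursion of Equations 58.51 - 58.53 can be written out directly. The sketch below uses a made-up one-parameter model ŷ(t|θ) = θ·u(t−1), so that ψ(t) = u(t−1) and R_t is a scalar; with μ_t = 1/t the recursion reproduces the off-line least-squares estimate, as stated above.

```python
import math, random

random.seed(5)
N = 300
u = [math.cos(0.3 * t) for t in range(N)]
y = [0.0] + [2.0 * u[t - 1] + 0.1 * random.gauss(0.0, 1.0)
             for t in range(1, N)]          # true theta = 2, small noise

theta = 0.0
R = 1.0
for t in range(1, N):
    mu = 1.0 / t
    psi = u[t - 1]                          # sensitivity derivative
    eps = y[t] - theta * psi                # Equation 58.52
    R = R + mu * (psi * psi - R)            # Equation 58.53
    theta = theta + mu * (1.0 / R) * psi * eps   # Equation 58.51
```

Note that R_t becomes the running sample mean of ψ², and the initial value of R is forgotten after the first step because μ_1 = 1.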
θ̂(t) = θ̂(t−1) + μ_t ψ(t, θ̂(t−1)) ε(t, θ̂(t−1)), (58.54)

and

θ̄(t) = θ̄(t−1) + ρ_t [θ̂(t) − θ̄(t−1)]. (58.55)
The basic algorithm Equations 58.51 - 58.53 then corresponds to ρ_t = 1. Using ρ_t < 1 gives a so-called "accelerated convergence" algorithm. It was introduced by [21] and has been extensively discussed by [10] and others. The remarkable thing about this averaging is that we achieve the same asymptotic statistical properties of θ̄(t) by Equations 58.54 and 58.55 with R_t = I (gradient search) as with Equations 58.51 - 58.53, provided the gain sequences μ_t and ρ_t are suitably chosen. It is thus an alternative to Equations 58.51 to 58.53, of interest in particular if dim θ is large, since R_t is then a big matrix.
Local Minima

A fundamental problem with minimization tasks like Equation 58.42 is that V_N(θ) may have several or many local (nonglobal) minima, where local search algorithms may get caught. There is no easy solution to this problem. It is usually well worth the effort to find a good initial value θ^(0) to start the iterations. Other than that, only various global search strategies are left, such as random search, random restarts, simulated annealing, and genetic algorithms.
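Random restarts can be sketched in a few lines. The criterion below is a made-up one-dimensional function with one local and one global minimum, and the "local search" is a crude fixed-step descent standing in for the iterative schemes above; restarting it from several random initial values and keeping the best result is the whole idea.

```python
import random

def V(theta):
    # made-up criterion: local minimum near theta = 1, global minimum near -1
    return (theta ** 2 - 1) ** 2 + 0.2 * theta

def local_search(theta, step=0.01, iters=2000):
    """Crude descent: move in 0.01 steps while the criterion decreases."""
    for _ in range(iters):
        if V(theta + step) < V(theta):
            theta += step
        elif V(theta - step) < V(theta):
            theta -= step
        else:
            break
    return theta

random.seed(2)
starts = [random.uniform(-3, 3) for _ in range(20)]   # random restarts
best = min((local_search(th) for th in starts), key=V)
```

A start on the wrong side of the "hill" near θ ≈ 0 descends into the local minimum near θ ≈ 1; with enough restarts, at least one start lands in the basin of the global minimum near θ ≈ −1.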
58.3 Linear Black-Box Systems

58.3.1 Linear System Descriptions in General
A Linear System with Additive Disturbances

A linear system with additive disturbances v(t) can be described by

y(t) = G(q)u(t) + v(t). (58.56)

Here u(t) is the input signal, and G(q) is the transfer function from input to output y(t). The symbol q is the shift operator, so Equation 58.56 should be interpreted as

y(t) = Σ_{k=1}^∞ g_k u(t − k) + v(t). (58.57)

The disturbance v(t) can, in general terms, be characterized by its spectrum, which is a description of its frequency content. It is often more convenient to describe v(t) as obtained by filtering a white-noise source e(t) through a linear filter H(q):

v(t) = H(q)e(t). (58.58)

From a linear identification perspective, this is equivalent to describing v(t) as a signal with spectrum

Φ_v(ω) = λ |H(e^{iω})|², (58.59)

where λ is the variance of the noise source e(t). We shall assume that H(q) is normalized to be monic, i.e.,

H(q) = 1 + h_1 q^{-1} + h_2 q^{-2} + ... (58.60)

Putting all of this together, we arrive at the standard linear system description

y(t) = G(q)u(t) + H(q)e(t). (58.61)

Parameterized Linear Models

Now, if the transfer functions G and H in Equation 58.61 are not known, we introduce parameters θ in their description reflecting our lack of knowledge. The exact way of doing this is the topic of the present section as well as of Section 58.5. In any case the resulting, parameterized model will be described by

y(t) = G(q, θ)u(t) + H(q, θ)e(t). (58.62)

The parameters θ can then be estimated from data using the general procedures described in Section 58.2.

Predictors for Linear Models

Given a system description Equation 58.62 and input-output data up to time t − 1,

{y(s), u(s), s ≤ t − 1}, (58.63)

how shall we predict the next output value y(t)? In the general case of Equation 58.62, the prediction can be deduced in the following way. Dividing Equation 58.62 by H(q, θ) and rearranging gives

y(t) = [1 − H^{-1}(q, θ)]y(t) + H^{-1}(q, θ)G(q, θ)u(t) + e(t). (58.64)

In view of the normalization Equation 58.60, the expression [1 − H^{-1}(q, θ)]y(t) only contains old values of y(s), s ≤ t − 1. The right side of Equation 58.64 is thus known at time t − 1, with the exception of e(t). The prediction of y(t) is obtained from Equation 58.64 by deleting e(t):

ŷ(t|θ) = [1 − H^{-1}(q, θ)]y(t) + H^{-1}(q, θ)G(q, θ)u(t). (58.65)

This generally expresses how linear models predict the next value of the output, given old values of y and u.

A Characterization of the Limiting Model in a General Class of Linear Models

Let us apply the general limit result, Equations 58.27 - 58.28, to the linear model structure Equation 58.62 (or Equation 58.65). If we choose a quadratic criterion ℓ(ε) = ε² (in the scalar output case), then this tells us, in the time domain, that the limiting parameter estimate is the one minimizing the filtered prediction error variance (for the input used during the experiment). Suppose that the data actually have been generated by

y(t) = G_0(q)u(t) + v(t). (58.66)

Let Φ_u(ω) be the input spectrum and Φ_v(ω) be the spectrum of the additive disturbance v. Then the filtered prediction error is

ε_F(t, θ) = [L(q)/H(q, θ)] {[G_0(q) − G(q, θ)]u(t) + v(t)}. (58.67)
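The general predictor of Equation 58.65 can be sketched for a made-up first-order ARMAX model (1 + a q^{-1})y(t) = b q^{-1}u(t) + (1 + c q^{-1})e(t). Here G = B/A and H = C/A, so Equation 58.65 becomes (B/C)u + ((C−A)/C)y, i.e., the recursion ŷ(t) = −c·ŷ(t−1) + (c − a)·y(t−1) + b·u(t−1).

```python
import math

a, b, c = -0.8, 1.0, 0.5
u = [math.sin(0.2 * t) for t in range(100)]

# noise-free output (e = 0), started from y(0) = 1 to exhibit a transient
y = [1.0]
for t in range(1, 100):
    y.append(-a * y[t - 1] + b * u[t - 1])

# one-step-ahead predictions from Equation 58.65, started from yhat(0) = 0
yhat = [0.0]
for t in range(1, 100):
    yhat.append(-c * yhat[t - 1] + (c - a) * y[t - 1] + b * u[t - 1])
```

With e ≡ 0 the prediction error obeys ε(t) = −c·ε(t−1), so the effect of the wrong initial condition dies out at the rate of the C-polynomial root, and the predictor tracks the output exactly thereafter.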
By Parseval's relationship, the prediction error variance can also be written as an integral over the spectrum of the prediction error. This spectrum, in turn, is directly obtained from Equation 58.67, so that the limit estimate θ* in Equation 58.28 can also be defined as

θ* = argmin_θ ∫_{−π}^{π} [ |G_0(e^{iω}) − G(e^{iω}, θ)|² Φ_u(ω) + Φ_v(ω) ] · |L(e^{iω})|² / |H(e^{iω}, θ)|² dω. (58.68)
If the noise model H(q, θ) = H*(q) does not depend on θ (as in the output error model, Equation 58.75), Equation 58.68 shows that the resulting model G(e^{iω}, θ*) will give the frequency function in the model set that is closest to the true one, in a quadratic frequency norm with weighting function

Q(ω) = Φ_u(ω) |L(e^{iω})|² / |H*(e^{iω})|². (58.69)

This shows that the fit can be affected by the choice of prefilter L, the input spectrum Φ_u, and the noise model H*.
58.3.2 Linear, Ready-Made Models

Sometimes systems or subsystems cannot be modeled based on physical insights, because the function of the system or its construction is unknown, or because it would be too complicated to sort out the physical relationships. Then standard models can be used, which handle a wide range of different system dynamics. Linear systems constitute the most common class of such standard models. From a modeling point of view, these models serve as ready-made models: tell us the size (model order), and it should be possible to find something that fits (to data).
A Family of Transfer Function Models

A very natural approach is to describe G and H in Equation 58.62 as rational transfer functions in the shift (delay) operator with unknown numerator and denominator polynomials:

G(q, θ) = B(q)/F(q), (58.70)

with

B(q) = b_1 q^{-nk} + b_2 q^{-nk-1} + ... + b_{nb} q^{-nk-nb+1},
F(q) = 1 + f_1 q^{-1} + ... + f_{nf} q^{-nf}.

Then

η(t) = [B(q)/F(q)] u(t) (58.71)

is a shorthand notation for the relationship

η(t) + f_1 η(t−1) + ... + f_{nf} η(t − nf) = b_1 u(t − nk) + ... + b_{nb} u(t − (nb + nk − 1)). (58.72)

There is also a time delay of nk samples. We assume, for simplicity, that the sampling interval T is one time unit. In the same way, the disturbance transfer function is

H(q, θ) = C(q)/D(q), (58.73)

with C(q) and D(q) monic polynomials of degrees nc and nd, defined analogously to F(q). The parameter vector θ contains the coefficients b_i, c_i, d_i, and f_i of the transfer functions. This ready-made model is described by five structural parameters: nb, nc, nd, nf, and nk. When these have been chosen, it remains to adjust the parameters b_i, c_i, d_i, and f_i to data with the methods of Section 58.2. The ready-made model Equations 58.71 - 58.74 gives

y(t) = [B(q)/F(q)] u(t) + [C(q)/D(q)] e(t), (58.74)

named the Box-Jenkins (BJ) model, after statisticians G. E. P. Box and G. M. Jenkins.

An important special case occurs when the properties of the disturbance signals are not modeled, and the noise model H(q) is chosen to be H(q) = 1, that is, nc = nd = 0. This special case is known as an output error (OE) model, since the noise source e(t) will then be the difference (error) between the actual output and the noise-free output:

y(t) = [B(q)/F(q)] u(t) + e(t). (58.75)

A common variant is to use the same denominator for G and H:

F(q) = D(q) = A(q) = 1 + a_1 q^{-1} + ... + a_{na} q^{-na}. (58.76)

Multiplying both sides of Equation 58.74 by A(q) gives

A(q) y(t) = B(q) u(t) + C(q) e(t). (58.77)

This ready-made model is known as the ARMAX model. The name derives from the fact that A(q)y(t) represents an AutoRegression, C(q)e(t) a Moving Average of white noise, and B(q)u(t) an extra input (or, with econometric terminology, an eXogenous variable). The physical difference between ARMAX and BJ models is that in the ARMAX case the noise and the input are subjected to the same dynamics (same poles). This is reasonable if the dominating disturbances enter early in the process (together with the input). Consider, for example, an airplane, where the disturbances from wind gusts give rise to the same type of forces on the airplane as the deflections of the control surfaces.

Finally, we have the special case of Equation 58.77 when C(q) ≡ 1, that is, nc = 0:

A(q) y(t) = B(q) u(t) + e(t), (58.78)

which, with the same terminology, is called an ARX model and which we discussed at length in Section 58.1.3. Figure 58.1 shows the most common model structures. To use these ready-made models, decide on the orders na, nb, nc, nd, nf, and nk, and let the computer pick the best model in the class defined. The model obtained is then scrutinized, and it might be found that other orders must also be tested. A relevant question is how to use the freedom that the different model structures give. Each of the BJ, OE, ARMAX, and ARX structures offers its own advantages, and we will discuss them in Section 58.7.2.
Prediction

Starting with model Equation 58.74, it is possible to predict what the output y(t) will be, based on measurements of u(s), y(s), s ≤ t − 1, using the general formula Equation 58.65. It is easiest to calculate the prediction for the OE case, H(q, θ) ≡ 1, when we obtain the model

y(t) = [B(q)/F(q)] u(t) + e(t),

with the natural prediction (1 − H^{-1} = 0)

ŷ(t|θ) = [B(q)/F(q)] u(t). (58.79)

From the ARX case, Equation 58.78,

y(t) = −a_1 y(t−1) − ... − a_{na} y(t − na) + b_1 u(t − nk) + ... + b_{nb} u(t − nk − nb + 1) + e(t), (58.80)

and the prediction (delete e(t)!) is

ŷ(t|θ) = −a_1 y(t−1) − ... − a_{na} y(t − na) + b_1 u(t − nk) + ... + b_{nb} u(t − nk − nb + 1). (58.81)

Note the difference between Equations 58.79 and 58.81. In the OE model the prediction is based entirely on the input {u(t)}, whereas the ARX model also uses old values of the output.

Linear Regression

Both tailor-made and ready-made models describe how the predicted value of y(t) depends on old values of y and u and on the parameters θ. We denote this prediction by ŷ(t|θ) (see Equation 58.65). In general, this can be a rather complicated function of θ. The estimation work is considerably easier if the prediction is a linear function of θ:

ŷ(t|θ) = θ^T φ(t). (58.82)

Here θ is a column vector that contains the unknown parameters, while φ(t) is a column vector formed by old inputs and outputs. Such a model structure is called a linear regression. We discussed such models in Section 58.1.3 and noted that the ARX model Equation 58.78 is one common model of the linear regression type. Linear regression models can also be obtained in several other ways (see Example 58.2).

Figure 58.1    Model structures.

58.4 Special Estimation Techniques for Linear Black-Box Models

An important feature of a linear, time-invariant system is that it is entirely characterized by its impulse response. If we know the system's response to an impulse, we will also know its response to any input. Equivalently, we could study the frequency response, which is the Fourier transform of the impulse response. In this section we shall consider estimation methods for linear systems that do not use particular model parameterizations. First, in Section 58.4.1, we shall consider direct methods to determine the impulse response and the frequency response by applying the definitions of these concepts. In Section 58.4.2, methods for estimating the impulse response by correlation analysis will be described, and in Section 58.4.3, spectral analysis for frequency function estimation will be discussed. Finally, in Section 58.4.4 a recent method of estimating general linear systems (of given order, but unspecified structure) will be described.

58.4.1 Transient and Frequency Analysis

Transient Analysis

The first step in modeling is to decide which quantities and variables are important to describe what happens in the system. A simple and common experiment that shows how and in what time span various variables affect each other is called step-response analysis or transient analysis. In these experiments the inputs are varied (typically one at a time) as a step: u(t) = u_0, t < t_0; u(t) = u_1, t ≥ t_0. The other measurable variables in the system are recorded during this time. We thus study the step response of the system. An alternative would be to study the impulse response of the system by letting the input be a pulse of short duration. From such measurements, the following information can be found:

1. The variables affected by the input in question. This makes it easier to draw block diagrams for the system and to decide which influences can be neglected.
2. The time constants of the system. This also allows us to decide which relationships in the model can be described as static (that is, they have significantly faster time constants than the time scale we are working with).
3. The characteristics (oscillatory, poorly damped, monotone, and the like) of the step responses, as well as the levels of the static gains. Such information is useful when studying the behavior of the final model in simulation. Good agreement with the measured step responses should give confidence in the model.
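A step-response experiment of this kind can be sketched as follows, using a made-up first-order sampled system; the static gain is read off as the steady-state value, and a time constant as the time to reach about 63% of it.

```python
# Made-up first-order system y(t) = 0.9*y(t-1) + 0.2*u(t-1):
# static gain 0.2/(1 - 0.9) = 2, discrete pole 0.9.
y = [0.0]
u_step = 1.0                       # unit step applied from t = 0
for t in range(1, 100):
    y.append(0.9 * y[t - 1] + 0.2 * u_step)

gain = y[-1]                       # steady-state value for a unit step
# time constant: first sample where the response passes 63% of the gain
tau = next(t for t, yt in enumerate(y) if yt >= 0.632 * gain)
```

For this system the continuous-time equivalent time constant is −1/ln(0.9) ≈ 9.5 samples, so the 63% crossing occurs at sample 10.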
Frequency Analysis If a linear system has the transfer function G ( q ) and the input is
As soon as we use the term covariance function, there is always an implied assumption that the involved signals are such that either Equation 58.87 or 58.89 is well defined. The cross-correlation signal between a signal u and itself, i.e., RUM (r) = Ru (;c), is called the (auto) covariance function of the signal. We shall say that two signals are uncorrelated if their crosscovariance function is identically equal to zero. Let us consider the general linear model Equation 58.56, and assume that the input u and the noise v are uncorrelated:
then the output, after possible transients have faded away, will be y ( t ) = yo cos(ot
+ Q), for t = T, 2T, 3T, . . .
(58.84)
The cross-covariance function between u and y is then
where yo = l ~ ( e ' @ ~. uo )l rp = arg G ( e i o T )
(58.85) (58.86)
If the system is driven by the input Equation 58.83 for certain u0 and ω, and we measure y0 and φ from the output signal, it is possible to determine the complex number G(e^{iωT}) using Equations 58.85 - 58.86. By repeating this procedure for a number of different ω, we can get a good estimate of the frequency function G(e^{iωT}). This method is called frequency analysis. Sometimes it is possible to see or measure u0, y0, and φ directly from graphs of the input and output signals. Most of the time, however, noise and irregularities will make it difficult to determine φ directly. A suitable procedure is then to correlate the output with cos(ωt) and sin(ωt).

58.4.2 Estimating Impulse Responses by Correlation Analysis

It is not necessary to use an impulse as input to estimate the impulse response of a system directly. That can also be done by correlation techniques. To explain how these work, let us first define correlation functions. The cross-covariance function between two signals y and u is defined as the covariance between the random variables y(t) and u(t - τ), viewed as a function of the time difference τ:

R_yu(τ) = E y(t) u(t - τ). (58.87)

It is implicitly assumed here that the indicated expectation does not depend on absolute time t, that is, that the signals are (weakly) stationary. As in the case of Equation 58.32, the expectation can be replaced by sample means:

m_y = lim_{N→∞} (1/N) Σ_{t=1}^{N} y(t),

R_yu(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} (y(t) - m_y)(u(t - τ) - m_u). (58.89)

If the input and output are related by the impulse response {g_k} as in

y(t) = Σ_{k=0}^{∞} g_k u(t - k) + v(t), (58.90)

with u and v uncorrelated, then

R_yu(τ) = E y(t) u(t - τ) = Σ_{k=0}^{∞} g_k E u(t - k) u(t - τ) = Σ_{k=0}^{∞} g_k R_u(τ - k). (58.91)

If the input is white noise,

R_u(τ) = E u(t) u(t - τ) = λ δ(τ),

then R_yu(τ) = λ g_τ. The cross-covariance function R_yu(τ) will thus be proportional to the impulse response. Of course, this function is not known, but it can be estimated from observed inputs and outputs as the corresponding sample mean:

R̂_yu^N(τ) = (1/N) Σ_{t=1}^{N} y(t) u(t - τ). (58.93)

In this way we also obtain an estimate of the impulse response:

ĝ_τ = (1/λ) R̂_yu^N(τ). (58.94)

If we cannot choose the input ourselves, and it is nonwhite, we can estimate its covariance function as R̂_u^N(τ), analogously to Equation 58.93, and then solve for g_k from Equation 58.91 with R_u and R_yu replaced by the corresponding estimates. However, a better and more common way is the following. First note that, if both input and output are filtered through the same filter,

u_F(t) = L(q) u(t), y_F(t) = L(q) y(t),

then the filtered signals will be related by the same impulse response as in Equation 58.90. Now, for any given input u(t), we can choose the filter L so that the signal {u_F(t)} will be as white as possible. Such a filter is called a whitening filter. It is computed by describing u(t) as an AR process (this is an ARX model without an input; cf. Section 58.1.3): A(q)u(t) = e(t). The polynomial A(q) = L(q) can then be estimated using the least-squares method (see Section 58.1.3). We can now apply the estimate Equation 58.94 to the filtered signals.
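The estimates of Equations 58.93 and 58.94 are easy to sketch numerically. The following is a minimal illustration, not the handbook's own code: the test system's impulse response g_k = 0.5^k, the noise-free output, and all constants are invented for the example.

```python
import math
import random

def cross_covariance(y, u, tau, N):
    # Sample estimate R^_yu(tau) = (1/N) sum_t y(t) u(t - tau)  (cf. Equation 58.93)
    return sum(y[t] * u[t - tau] for t in range(tau, N)) / N

def impulse_response_estimate(y, u, lam, num_lags):
    # For white-noise input with variance lam: g^_k = R^_yu(k) / lam  (cf. Equation 58.94)
    N = len(y)
    return [cross_covariance(y, u, k, N) / lam for k in range(num_lags)]

# Simulate a system with known impulse response g_k = 0.5**k, driven by white noise.
random.seed(0)
N = 20000
lam = 1.0
u = [random.gauss(0.0, math.sqrt(lam)) for _ in range(N)]
g_true = [0.5 ** k for k in range(10)]
y = [sum(g_true[k] * u[t - k] for k in range(min(t + 1, 10))) for t in range(N)]

g_hat = impulse_response_estimate(y, u, lam, 5)
```

With a long white-noise record, g_hat recovers the leading impulse response coefficients up to sampling error.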
58.4.3 Estimating the Frequency Response by Spectral Analysis

Definitions

The cross spectrum between two (stationary) signals u(t) and y(t) is defined as the Fourier transform of their cross-covariance function, provided this exists:

Φ_yu(ω) = Σ_{τ=−∞}^{∞} R_yu(τ) e^{−iτω},

where R_yu(τ) is defined by Equation 58.87 or Equation 58.89. The (auto) spectrum Φ_u(ω) of a signal u is defined as Φ_uu(ω), i.e., its cross spectrum with itself. The spectrum describes the frequency content of the signal. The connection to more explicit Fourier techniques is evident from the relationship

Φ_u(ω) = lim_{N→∞} (1/N) E |U_N(ω)|² (58.98)

("weak limit"), where U_N is the discrete-time Fourier transform

U_N(ω) = Σ_{t=1}^{N} u(t) e^{iωt}. (58.99)

Equation 58.98 is shown in [16]. Consider now the general linear model Equation 58.56. It is straightforward to show that the relationships between the spectra and cross spectra of y and u (provided u and v are uncorrelated) are

Φ_yu(ω) = G(e^{iω}) Φ_u(ω), (58.101)

Φ_y(ω) = |G(e^{iω})|² Φ_u(ω) + Φ_v(ω). (58.102)

The transfer function G(e^{iω}) and the noise spectrum Φ_v(ω) can be estimated with these expressions, if only we have a method to estimate cross spectra.

Estimation of Spectra

The spectrum is defined as the Fourier transform of the correlation function. A natural idea would then be to take the transform of the estimate R̂_yu^N(τ) in Equation 58.93, but that will not work in most cases, because the estimate, based on only a few observations, is not reliable for large τ. These "bad" estimates are mixed with good ones in the Fourier transform, creating an overall bad estimate. It is better to introduce a weighting, so that correlation estimates for large lags τ carry smaller weights:

Φ̂_u^N(ω) = Σ_{τ=−γ}^{γ} R̂_u^N(τ) w_γ(τ) e^{−iτω}. (58.103)

This spectral estimation method is known as the Blackman-Tukey approach. Here w_γ(τ) is a window function that decreases with |τ|. This function controls the trade-off between frequency resolution and variance of the estimate. A window that gives significant weight to the correlation at large lags will be able to provide finer frequency details (a longer time span is covered). At the same time it must use "bad" estimates, so the statistical quality (the variance) is poorer. We shall return to this trade-off in a moment. How should we choose the shape of the window function w_γ(τ)? There is no optimal solution to this problem, but the most common window used in spectral analysis is the Hamming window:

w_γ(k) = (1/2)(1 + cos(πk/γ)), |k| < γ,
w_γ(k) = 0, |k| ≥ γ. (58.104)

From the spectral estimates Φ̂_u, Φ̂_y, and Φ̂_yu obtained in this way, we can now use Equation 58.101 to obtain a natural estimate of the frequency function G(e^{iω}):

Ĝ_N(e^{iω}) = Φ̂_yu^N(ω) / Φ̂_u^N(ω). (58.105)

Furthermore, the disturbance spectrum can be estimated from Equation 58.102 as

Φ̂_v^N(ω) = Φ̂_y^N(ω) − |Ĝ_N(e^{iω})|² Φ̂_u^N(ω). (58.106)

To compute these estimates, perform the following steps:

Algorithm SPA

1. Collect data y(k), u(k), k = 1, ..., N.
2. Subtract the corresponding sample means from the data. This will avoid bad estimates at very low frequencies.
3. Choose the width γ of the lag window w_γ(k).
4. Compute R̂_y^N(k), R̂_u^N(k), and R̂_yu^N(k) for |k| ≤ γ according to Equation 58.93.
5. Form the spectral estimates Φ̂_y^N(ω), Φ̂_u^N(ω), and Φ̂_yu^N(ω) according to Equation 58.103 and analogous expressions.
6. Form Equation 58.105 and possibly also Equation 58.106.

The user only has to choose γ. A good value for systems without sharp resonances is γ = 20 to 30. Larger values of γ may be required for systems with narrow resonances.
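Algorithm SPA can be sketched in a few lines. This is a simplified illustration, not a production implementation: the one-tap test system y(t) = 0.8 u(t−1), the window width γ = 20, and all constants are invented, and the mean-subtraction step is omitted because the simulated signals are already zero-mean.

```python
import cmath
import math
import random

def hamming(k, gamma):
    # Lag window w_gamma(k) of Equation 58.104
    return 0.5 * (1.0 + math.cos(math.pi * k / gamma)) if abs(k) < gamma else 0.0

def cov(y, u, tau):
    # Sample covariance estimate R^_yu(tau), cf. Equation 58.93
    N = len(y)
    if tau >= 0:
        return sum(y[t] * u[t - tau] for t in range(tau, N)) / N
    return sum(y[t] * u[t - tau] for t in range(N + tau)) / N

def blackman_tukey(y, u, gamma, omega):
    # Windowed transform of the covariance estimate, cf. Equation 58.103
    return sum(hamming(k, gamma) * cov(y, u, k) * cmath.exp(-1j * k * omega)
               for k in range(-gamma + 1, gamma))

def freq_estimate(y, u, gamma, omega):
    # G^_N(e^{i omega}) = Phi^_yu / Phi^_u, cf. Equation 58.105
    return blackman_tukey(y, u, gamma, omega) / blackman_tukey(u, u, gamma, omega)

# One-tap example system y(t) = 0.8 u(t-1): |G(e^{i omega})| = 0.8 at every omega.
random.seed(0)
u = [random.gauss(0, 1) for _ in range(10000)]
y = [0.0] + [0.8 * u[t - 1] for t in range(1, 10000)]
G = freq_estimate(y, u, 20, 1.0)
```

The magnitude of the estimate should come out close to the true gain 0.8, with the variance and resolution trade-off governed by γ as described above.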
THE CONTROL HANDBOOK
Quality of the Estimates

The estimates Ĝ_N and Φ̂_v^N are formed entirely from estimates of spectra and cross spectra. Their properties will, therefore, be inherited from the properties of the spectral estimates. For the Hamming window with width γ, the frequency resolution will be about

π/γ radians/time unit. (58.108)

This means that details in the true frequency function that are finer than this expression will be smeared out in the estimate. The estimates' variances satisfy

Var Ĝ_N(e^{iω}) ≈ 0.7 (γ/N) Φ_v(ω)/Φ_u(ω) (58.109)

and

Var Φ̂_v^N(ω) ≈ 0.7 (γ/N) Φ_v²(ω). (58.110)

["Variance" here refers to taking expectation over the noise sequence v(t).] The relative variance in Equation 58.109 typically increases dramatically as ω tends to the Nyquist frequency, because |G(e^{iω})| typically decays rapidly, and the noise-to-signal ratio Φ_v(ω)/Φ_u(ω) has a tendency to increase as ω increases. In a Bode diagram the estimates will show considerable fluctuations at high frequencies. Moreover, the constant frequency resolution of Equation 58.108 will look thinner and thinner at higher frequencies in a Bode diagram, due to the logarithmic frequency scale. See [16] for a more detailed discussion.

Choice of Window Size

The choice of γ is a pure trade-off between frequency resolution and variance (variability). For a spectrum with narrow resonance peaks, it is necessary to choose a large value of γ and accept a higher variance. For a flatter spectrum, smaller values of γ will do well. In practice a number of different values of γ are tried out. We start with a small value of γ and increase it successively until an estimate is found that balances the trade-off between frequency resolution (true details) and variance (random fluctuations). A typical value for spectra without narrow resonances is γ = 20-30.

58.4.4 Subspace Estimation Techniques for State-Space Models

A linear system can always be represented in state-space form:

x(t + 1) = A x(t) + B u(t) + w(t),
y(t) = C x(t) + D u(t) + e(t). (58.111)

We assume that we have no insight into the particular structure, and we estimate any matrices A, B, C, and D that describe the input-output behavior of the system. This is not without problems, because there is an infinite number of matrices that describe the same system (related by similarity transforms). The coordinate basis of the state-space realization thus needs to be fixed.

Let us for a moment assume that not only u and y are measured, but also the sequence of state vectors x. This would fix the coordinate basis of the state-space realization. Now, with known u, y, and x, the model (58.111) becomes a linear regression: the unknown parameters, all of the matrix entries in all the matrices, mix with measured signals in linear combinations. To see this clearly, let

Y(t) = [x(t + 1); y(t)], Θ = [A B; C D], Φ(t) = [x(t); u(t)], E(t) = [w(t); e(t)].

Then Equation 58.111 becomes

Y(t) = Θ Φ(t) + E(t).

From this equation all of the matrix elements in Θ can be estimated by the simple least-squares method described in Section 58.1.3. The covariance matrix for E(t) can also be estimated as the sample sum of the model residuals. That will give the covariance matrices for w and e, as well as the cross-covariance matrix between w and e. These matrices allow us to compute the Kalman filter for Equation 58.111. All of the above holds without changes for multivariable systems, i.e., when the output and input signals are vectors.

The only remaining problem is where to get the state vector sequence x. It has long been known ([22] and [2]) that all state vectors x(t) that can be reconstructed from input-output data are linear combinations of the components of the k-step ahead output predictors

ŷ(t + k|t), k = 1, ..., n,

where n is the model order (the dimension of x). See also Appendix 4.A in [14]. We could then form these predictors and select a basis among their components:

x(t) = L [ŷ(t + 1|t), ..., ŷ(t + n|t)]^T.

The choice of L will determine the basis for the state-space realization and is made so that it is well conditioned. The predictor ŷ(t + k|t) is a linear function of u(s), y(s), 1 ≤ s ≤ t, and can efficiently be determined by linear projections directly on the input-output data. (One complication is that u(t + 1), ..., u(t + k) should not be predicted, even if they affect y(t + k).)

What we have described now is the subspace projection approach to estimating the matrices of the state-space model Equation 58.111, including the basis for the representation and the noise covariance matrices. For a number of variants of this approach, see [19] and [12]. The approach gives very useful algorithms for model estimation and is particularly well suited for multivariable systems.
The algorithms also allow numerically very reliable implementations. At present, the asymptotic properties of the methods are not fully investigated, and the general results quoted in Section 58.2.2 are not directly applicable. Experience has shown, however, that confidence intervals computed according to the general asymptotic theory are good approximations. One may also use the estimates obtained by a subspace method as initial conditions for minimizing the prediction error criterion Equation 58.24.
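The least-squares regression step at the heart of the subspace approach (from known states to the matrices A, B, C, D) can be sketched for a scalar system. This is an invented illustration: the true system x(t+1) = 0.7x(t) + 0.5u(t), y(t) = 2x(t), the noise levels, and the assumption that the state sequence is directly available (in practice it comes from the projections described above) are all for the example only.

```python
import random

def estimate_state_space(x, u, y):
    # Least-squares estimate of Theta = [A B; C D] from the linear regression
    # [x(t+1); y(t)] = Theta [x(t); u(t)] + E(t)   (scalar case of Equation 58.111)
    N = len(u) - 1
    s = [[0.0, 0.0], [0.0, 0.0]]   # sum of Phi(t) Phi(t)^T
    r = [[0.0, 0.0], [0.0, 0.0]]   # sum of Y(t) Phi(t)^T; rows: x(t+1), y(t)
    for t in range(N):
        phi = (x[t], u[t])
        yy = (x[t + 1], y[t])
        for i in range(2):
            for j in range(2):
                s[i][j] += phi[i] * phi[j]
                r[i][j] += yy[i] * phi[j]
    # Solve Theta = r * inv(s) for the 2x2 moment matrix s
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    return [[sum(r[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]   # [[A, B], [C, D]]

random.seed(1)
N = 500
u = [random.gauss(0, 1) for _ in range(N)]
x = [0.0]
for t in range(N - 1):
    x.append(0.7 * x[t] + 0.5 * u[t] + 0.01 * random.gauss(0, 1))
y = [2.0 * x[t] + 0.01 * random.gauss(0, 1) for t in range(N)]
theta = estimate_state_space(x, u, y)
```

With low noise, theta recovers [[0.7, 0.5], [2.0, 0.0]] closely; the residuals of this regression are what supply the noise covariance estimates mentioned above.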
58.5 Physically Parameterized Models

So far we have treated the parameters θ only as vehicles to give reasonable flexibility to the transfer functions in the general linear model Equation 58.62. This model can also be arrived at from other considerations. Consider a continuous-time state-space model

ẋ(t) = A(θ) x(t) + B(θ) u(t), (58.115a)
y(t) = C(θ) x(t) + v(t), (58.115b)

where v denotes disturbances and noise. Here x(t) is the state vector, consisting of physical variables (such as positions and velocities). The state-space matrices A, B, and C are parameterized by the parameter vector θ, reflecting physical insight into the process. The parameters could be physical constants (resistance, heat transfer coefficients, aerodynamical derivatives) whose values are not known. They could also reflect other types of insight into the system's properties.

EXAMPLE 58.3: An electric motor

Consider an electric motor with the input u the applied voltage and the output y the angular position of the motor shaft. A first, but reasonable, approximation of the motor's dynamics is as a first-order system from voltage to angular velocity, followed by an integrator:

G(s) = b / (s(s + a)). (58.116)

If we select the state variables as the angular position and the angular velocity, the state-space form is

A(θ) = [0 1; 0 −a], B(θ) = [0; b], C = (1 0), θ = [a b]^T. (58.117)

The parameterization reflects our insight that the system contains an integration, but is in this case not directly derived from detailed physical modeling. Basic physical laws would have shown how θ depends on physical constants, such as the resistance of the wiring, the amount of inertia, friction coefficients, and magnetic field constants.

Now, how do we fit a continuous-time model Equation 58.115 to sampled data? If the input u(t) has been piecewise constant over the sampling intervals,

u(t) = u(kT), kT ≤ t < (k + 1)T,

then the states, inputs, and outputs at the sampling instants will be represented by the discrete-time model

x((k + 1)T) = Ā(θ) x(kT) + B̄(θ) u(kT),
y(kT) = C(θ) x(kT) + v(kT), (58.118)

where

Ā(θ) = e^{A(θ)T}, B̄(θ) = ∫_0^T e^{A(θ)s} B(θ) ds. (58.119)

This follows from solving Equation 58.115 over one sampling period. We could also further model the added noise term v(kT) and represent the system in the innovations form

x̄((k + 1)T) = Ā(θ) x̄(kT) + B̄(θ) u(kT) + K(θ) e(kT),
y(kT) = C(θ) x̄(kT) + e(kT), (58.120)

where {e(kT)} is white noise. The step from Equation 58.118 to Equation 58.120 is a standard Kalman filter step: x̄ will be the one-step-ahead predicted Kalman states. A pragmatic view is as follows: in Equation 58.118 the term v(kT) may not be white noise. If it is colored, we may separate out the part of v(kT) that cannot be predicted from past values. Denote this part by e(kT): it will be the innovation. The other part of v(kT), the part that can be predicted, can then be described as a combination of earlier innovations e(lT), l < k. Its effect on y(kT) can be described via the states, by changing them from x to x̄, where x̄ contains additional states associated with obtaining v(kT) from e(lT), l ≤ k.

Now Equation 58.120 can be written in input-output form (let T = 1) as

y(t) = G(q, θ) u(t) + H(q, θ) e(t), (58.121)

with

G(q, θ) = C(θ)(qI − Ā(θ))^{-1} B̄(θ) and H(q, θ) = I + C(θ)(qI − Ā(θ))^{-1} K(θ). (58.122)

We thus return to the basic linear model Equation 58.62. The parameterization of G and H in terms of θ is, however, more complicated than those discussed in Section 58.3.2. The general estimation techniques, model properties (including the frequency-domain characterization Equation 58.68), algorithms, etc., apply exactly as described in Section 58.2.

From these examples it is also quite clear that nonlinear models with unknown parameters can be approached in the same way. We then arrive at a structure

ẋ(t) = f(x(t), u(t); θ), y(t) = h(x(t), u(t); θ) + v(t). (58.123)

In this model, all noise effects are collected as additive output disturbances v(t), which is a restriction, but also a simplification. If we define ŷ(t|θ) as the simulated output response to Equation 58.123 for a given input, ignoring the noise v(t), everything that was said in Section 58.2 about parameter estimation, model properties, etc., still applies.
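The sampling formulas of Equation 58.119 can be evaluated numerically with the standard augmented-matrix identity exp([[A, B], [0, 0]] T) = [[Ā, B̄], [0, I]]. The sketch below uses the motor of Example 58.3 with invented values a = 2, b = 3, T = 0.1; the tiny matrix-exponential routine is a plain scaling-and-squaring Taylor approximation, not a production implementation.

```python
import math

def mat_mul(A, B):
    # Plain matrix product for lists of lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm(M, scal=20, terms=15):
    # Matrix exponential by scaling and squaring with a truncated Taylor series
    n = len(M)
    S = [[M[i][j] / 2.0 ** scal for j in range(n)] for i in range(n)]
    E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, S)]  # S^k / k!
        E = [[E[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(scal):
        E = mat_mul(E, E)
    return E

def c2d(A, B, T):
    # Zero-order-hold sampling (Equation 58.119): the augmented matrix
    # exp([[A, B], [0, 0]] T) = [[Abar, Bbar], [0, I]] delivers both blocks.
    n, m = len(A), len(B[0])
    M = [[(A[i][j] if j < n else B[i][j - n]) * T for j in range(n + m)]
         for i in range(n)]
    M += [[0.0] * (n + m) for _ in range(m)]
    E = expm(M)
    return [row[:n] for row in E[:n]], [row[n:] for row in E[:n]]

# Motor of Example 58.3 with invented values a = 2, b = 3 and T = 0.1
Abar, Bbar = c2d([[0.0, 1.0], [0.0, -2.0]], [[0.0], [3.0]], 0.1)
```

For this A the blocks have closed forms (Ā = [[1, (1−e^{−aT})/a], [0, e^{−aT}]]), so the routine is easy to check against the analytic answer.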
58.6 Nonlinear Black-Box Models

In this section we shall describe the basic ideas behind model structures that can cover any nonlinear mapping from past data to the predicted value of y(t). Recall that we defined a general model structure as a parameterized mapping in Equation 58.19:

ŷ(t|θ) = g(θ, Z^{t−1}). (58.124)

We shall consequently allow quite general nonlinear mappings g. This section will deal with some general principles for constructing such mappings and will cover Artificial Neural Networks as a special case. See [25] and [9] for recent and more comprehensive surveys.
58.6.1 Nonlinear Black-Box Structures

The model structure family Equation 58.124 is really too general, and it is useful to write g as a concatenation of two mappings: one that takes the increasing number of past observations Z^{t−1} and maps them into a finite-dimensional vector φ(t) of fixed dimension, and one that takes this vector to the space of the outputs:

ŷ(t|θ) = g(θ, Z^{t−1}) = g(φ(t), θ), (58.125)

where

φ(t) = φ(Z^{t−1}). (58.126)

Let the dimension of φ be d. As before, we shall call this vector the regression vector, and its components will be referred to as the regressors. We also allow the more general case that the formation of the regressors is itself parameterized:

φ(t) = φ(Z^{t−1}, η), (58.127)

or, for short, φ(t, η). For simplicity, the extra argument η will, however, be used explicitly only when essential. The choice of the nonlinear mapping in Equation 58.124 has thus been reduced to two partial problems for dynamical systems:

1. How to choose the nonlinear mapping g(φ) from the regressor space to the output space (i.e., from R^d to R^p).
2. How to choose the regressors φ(t) from past inputs and outputs.

The second problem is the same for all dynamical systems. Most regression vectors are chosen to contain past inputs and outputs, and also past predicted/simulated outputs. The regression vector will thus be of the character of Equation 58.4. We now turn to the first problem.

58.6.2 Nonlinear Mappings: Possibilities

Function Expansions and Basis Functions

The nonlinear mapping g(φ, θ) goes from R^d to R^p for any θ. At this point it does not matter how the regression vector φ is constructed; it is just a vector that lives in R^d. It is natural to think of the parameterized function family as function expansions,

g(φ, θ) = Σ_k θ(k) g_k(φ), (58.129)

where g_k are the basis functions and the coefficients θ(k) are the "coordinates" of g in the chosen basis. Now, the only remaining question is: how to choose the basis functions g_k? Depending on the support of g_k (i.e., the area in R^d for which g_k(φ) is (practically) nonzero), we shall distinguish between three types of basis functions:

Global basis functions,
Semiglobal or ridge-type basis functions,
Local basis functions.

A typical and classical global basis function expansion is the Taylor series, or polynomial expansion, where g_k contains multinomials in the components of φ of total degree k. Fourier series are also examples. We shall not, however, discuss global basis functions here any further. Experience has indicated that they are inferior to semiglobal and local functions in practical applications.
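Before doing so, the most common regressor choice just mentioned (past outputs and inputs, as in an ARX structure) can be sketched as follows; the orders na and nb are illustrative:

```python
def regression_vector(y, u, t, na, nb):
    # phi(t) = [y(t-1), ..., y(t-na), u(t-1), ..., u(t-nb)]:
    # past outputs and inputs, the most common regressor choice.
    return [y[t - k] for k in range(1, na + 1)] + [u[t - k] for k in range(1, nb + 1)]

y = [1.0, 2.0, 3.0, 4.0]
u = [0.1, 0.2, 0.3, 0.4]
phi = regression_vector(y, u, 3, na=2, nb=1)  # phi(3) = [y(2), y(1), u(2)]
```

Regressors containing past simulated outputs (output-error and related structures) replace the measured y values with model outputs but keep the same shape.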
Local Basis Functions

Local basis functions have their support only in some neighborhood of a given point. Think (in the case of p = 1) of the indicator function for the unit cube:

κ(φ) = 1 if |φ_k| ≤ 1 for all k, and 0 otherwise. (58.130)

By scaling the cube and placing it at different locations we obtain the functions

g_k(φ) = κ(α_k (φ − β_k)). (58.131)

By allowing α to be a vector of the same dimension as φ and interpreting the multiplication * as componentwise multiplication (like ".*" in MATLAB), we may also reshape the cube into any parallelepiped. The parameters α are scale or dilation parameters, and β determines location or translation. For notational convenience we write

g_k(φ) = κ(α_k * (φ − β_k)) = κ(ρ_k · φ), (58.132)

where ρ_k = [α_k, α_k * β_k]. In the last equality, with some abuse of notation, we expanded the regression vector φ to contain some "1"s. This stresses the
point that the argument of the basic function κ is bilinear in the scale and location parameters ρ_k and in the regression vector φ, as indicated by the notation ρ_k · φ.

This choice of g_k in Equation 58.129 gives functions that are piecewise constant over areas in R^d that can be chosen arbitrarily small by proper choice of the scaling parameters. It should be fairly obvious that such functions g_k can approximate any reasonable function arbitrarily well. Now it is also reasonable that the same will be true for any other localized function, such as the Gaussian bell function,

κ(x) = e^{−|x|²}. (58.133)

Ridge-Type Basis Functions

A useful alternative is to let the basis functions be local in one direction of the φ-space and global in the others. This is achieved quite analogously to Equation 58.131 as follows. Let σ(x) be a local function from R to R. Then form

g_k(φ) = σ(α_k^T (φ − β_k)) = σ(α_k^T φ + γ_k) = σ(ρ_k · φ), (58.134)

where the scalar γ_k = −α_k^T β_k. Note the difference with Equation 58.131! The scalar product α_k^T φ is constant in the subspace of R^d that is perpendicular to the scaling vector α_k. Hence the function g_k(φ) varies like σ in a direction parallel to α_k and is constant across this direction. This leads to the term semiglobal or ridge-type for this choice of functions. As in Equation 58.131 we expanded, in the last equality in Equation 58.134, the vector φ with the value "1", again just to emphasize that the argument of the fundamental basis function σ is bilinear in ρ_k and φ.

Connection to "Named Structures"

Here we briefly review some popular structures. Other structures related to interpolation techniques are discussed in [9] and [25].

Wavelets: The local approach corresponding to Equations 58.129 and 58.131 has direct connections to wavelet networks and wavelet transforms. The exact relationships are discussed in [25]. Via the dilation parameters in ρ_k, we can work with different scales simultaneously to pick up both local and not-so-local variations. With appropriate translations and dilations of a single suitably chosen function κ (the "mother wavelet"), we can make the expansion (58.129) orthonormal. This is discussed extensively in [9].

Wavelet and radial basis networks: The choice of Equation 58.133 without any orthogonalization is found in wavelet networks [28] and radial basis neural networks [20].

Neural networks: The ridge choice of Equation 58.134 with

σ(x) = 1 / (1 + e^{−x})

gives a much used neural network structure, viz., the one-hidden-layer feedforward sigmoidal net.

Hinging hyperplanes: Instead of using the sigmoid σ function, if we choose "V-shaped" functions (in the form of a higher-dimensional "open book"), Breiman's hinging hyperplane structure is obtained [5]. Hinging hyperplane model structures have the form

g(x) = max{β⁺x + γ⁺, β⁻x + γ⁻} or g(x) = min{β⁺x + γ⁺, β⁻x + γ⁻}.

It can be written differently as a hinge: the superposition of a linear map and a semiglobal function. Therefore, we consider hinge functions as semiglobal or ridge-type, though not in strict accordance with our definition.

Nearest neighbors or interpolation: By selecting κ as in Equation 58.130, and the location and scale vector ρ_k in the structure Equation 58.131 so that exactly one observation falls into each "cube", the nearest neighbor model is obtained: just load the input-output record into a table, and, for a given φ, pick the pair (ŷ, φ̂) for which φ̂ is closest to the given φ; ŷ is the desired output estimate. If we replace Equation 58.130 by a smoother function and allow some overlapping of the basis functions, we get interpolation-type techniques such as kernel estimators.

Fuzzy models: So-called fuzzy models based on fuzzy set membership belong to the model structures of the class Equation 58.129. The basis functions g_k then are constructed from the fuzzy set membership functions and the inference rules. The exact relationship is described in [25].
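As a small numerical illustration of the ridge construction in Equation 58.134, the following sketches a one-hidden-layer sigmoidal net with two nodes; all weights are invented for the example.

```python
import math

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^{-x}), the ridge basis function of the sigmoidal net
    return 1.0 / (1.0 + math.exp(-x))

def ridge_expansion(phi, coords, scales, offsets):
    # g(phi, theta) = sum_k theta_k * sigma(alpha_k . phi + gamma_k)
    # (cf. Equations 58.129 and 58.134): each node varies along alpha_k
    # and is constant in directions perpendicular to it.
    out = 0.0
    for theta_k, alpha_k, gamma_k in zip(coords, scales, offsets):
        ridge = sum(a * p for a, p in zip(alpha_k, phi)) + gamma_k
        out += theta_k * sigmoid(ridge)
    return out

# Hypothetical two-node net on a 2-dimensional regression vector
g = ridge_expansion([1.0, -1.0],
                    coords=[2.0, -1.0],
                    scales=[(1.0, 0.0), (0.0, 3.0)],
                    offsets=[0.0, 0.5])
```

Estimating such a net means fitting the coordinates theta_k together with the scale and location parameters, exactly the parameter split discussed in the next subsection.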
58.6.3 Estimating Nonlinear Black-Box Models

The model structure is determined by the following choices:

The regression vector (typically built up from past inputs and outputs),
The basic function κ (local) or σ (ridge), and
The number of elements (nodes) in the expansion Equation 58.129.

Once these choices have been made, ŷ(t|θ) = g(φ(t), θ) is a well-defined function of past data and the parameters θ. The parameters are made up of coordinates in the expansion Equation 58.129 and of location and scale parameters in the different basis functions. All of the algorithms and analytical results of Section 58.2 can thus be applied. For neural network applications these are also the typical estimation algorithms used, often complemented with regularization, which means that a term is added to the criterion Equation 58.24 that penalizes the norm of θ. This will reduce the variance of the model, in that "spurious" parameters are not allowed to take on large, and mostly random, values (see [25]).

For wavelet applications it is common to distinguish between those parameters that enter linearly in ŷ(t|θ) (i.e., the coordinates in the function expansion) and those that enter nonlinearly (i.e., the location and scale parameters). Often the latter are fixed to certain values, and the coordinates are estimated by the linear least-squares method. Basis functions that give a small contribution to the fit (corresponding to nonuseful values of the scale and location parameters) can then be trimmed away ("pruning" or "shrinking").
58.7 User's Issues

58.7.1 Experiment Design

It is desirable to affect the conditions under which the data are collected. The objective with such experiment design is to make the collected data set Z^N as informative as possible for the models being built from the data. A considerable amount of theory can be developed around this topic, and here we shall just review some basic points. The first and most important point is the following:
The input signal u must expose all of the relevant properties of the system. It must not be too "simple". For example, a pure sinusoid

u(t) = A cos(ωt)

will only give information about the system's frequency response at frequency ω. This can also be seen from Equation 58.68. The rule is that the input must contain at least as many different frequencies as the order of the linear model being built. Another case where the input is too simple is when it is generated by feedback such as

u(t) = −f y(t). (58.135)

If we would like to build a first-order ARX model

y(t) + a y(t − 1) = b u(t − 1) + e(t),

we find that all models with

â + f b̂ = a + f b

will give identical input-output data. We can thus not distinguish between these models using an experiment with Equation 58.135; that is, we cannot distinguish between any combination of â and b̂ satisfying the above condition. The rule is as follows: If closed-loop experiments have to be performed, the feedback law must not be too simple. It is preferred that a set point in the regulator is changed in a random fashion.

The second main point in experiment design is as follows:

Allocate the input power to those frequency bands where a good model is particularly important.

This is also seen from Equation 58.68. If we let the input be filtered white noise, this gives information for choosing the filter. In the time domain, these suggestions are useful:

Use binary (two-level) inputs if linear models are being built. This gives maximal variance for amplitude-constrained inputs.

Check the changes between the levels so that the input occasionally stays on one level long enough for a step response from the system to more or less settle. There is no need to let the input signal switch so quickly back and forth that no response in the output is clearly visible.

Note that the second point is a reformulation in the time domain of the basic frequency-domain advice: let the input energy be concentrated in the important frequency bands.

A third basic piece of advice about experiment design concerns the choice of sampling interval. A typical good sampling frequency is 10 times the bandwidth of the system. That roughly corresponds to 5-7 samples along the rise time of a step response.
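The time-domain advice can be turned into a simple input generator. This is a sketch only; the switching probability and the levels are illustrative choices, not prescriptions.

```python
import random

def random_binary_input(n, levels=(-1.0, 1.0), switch_prob=0.1, seed=0):
    # Random binary input: holds its level and switches with a given probability
    # per sample. A small switch_prob keeps the signal on one level long enough
    # for step-like responses to develop, concentrating energy at low frequencies.
    rng = random.Random(seed)
    u, level = [], levels[0]
    for _ in range(n):
        if rng.random() < switch_prob:
            level = levels[1] if level == levels[0] else levels[0]
        u.append(level)
    return u

u = random_binary_input(500, switch_prob=0.05)
```

With switch_prob = 0.05 the average hold time is about 20 samples, so roughly 25 level changes are expected over 500 samples; tuning switch_prob shifts the input energy between frequency bands.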
58.7.2 Model Validation and Model Selection

The system identification process has, as we have seen, these basic ingredients:

the set of models
the data
the selection criterion

Once these have been decided upon, we have, at least implicitly, defined a model: the one in the set that best describes the data according to the criterion. It is thus the best available model in the chosen set. But is it good enough? It is the objective of model validation to answer that question. Often the answer turns out to be "no", and we then have to review the choice of model set or modify the data set (see Figure 58.2).

How do we check the quality of a model? The prime method is to investigate how well it reproduces the behavior of a new set of data (the validation data) that was not used to fit the model. That is, we simulate the obtained model with a new input and compare the simulated output with the measured one. One may then use one's eyes or numerical measures of fit to decide if the fit is good enough. Suppose we have obtained several different models in different model structures (say a 4th-order ARX model, a 2nd-order BJ model, a physically parameterized one, and so on) and would like to know which one is best. The simplest and most pragmatic approach is then to simulate each one of them with validation
data, evaluate their performance, and pick the one with the best fit to measured data. (This could be a subjective criterion!)

Figure 58.2: The identification loop.

The second basic method for model validation is to examine the residuals ("the leftovers") from the identification process. These are the prediction errors

ε(t, θ̂_N) = y(t) − ŷ(t|θ̂_N),

i.e., what the model could not "explain". Ideally these should be independent of information that was at hand at time t − 1. For example, if ε(t) and u(t − τ) are correlated, then elements in y(t) that originate from u(t − τ) have not been properly accounted for by ŷ(t|θ̂_N). The model has then not extracted all relevant information about the system from the data. It is always good practice to check the residuals for such (and other) dependencies. This is known as residual analysis.

58.7.3 Software for System Identification

In practice, system identification is characterized by lengthy numerical calculations to determine the best model in each given class of models. This is due to multiple user choices, trying different model structures, filtering data, and so on. In practical applications we need good software support. There are now many different commercial packages available, such as MathWorks' System Identification Toolbox [13], MATRIXx's System Identification Module [18], and PIM [11]. In common they offer the following routines:

A Handling of data, plotting, etc. Filtering of data, removal of drift, choice of data segments, etc.
B Nonparametric identification methods Estimation of covariances, Fourier transforms, correlation and spectral analysis, etc.
C Parametric estimation methods Calculation of parametric estimates in different model structures.
D Presentation of models Simulation of models, estimation and plotting of poles and zeros, computation of frequency functions, and plotting Bode diagrams, etc.
E Model validation Computation and analysis of residuals (ε(t, θ̂_N)). Comparison between different models' properties, etc.
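The residual analysis described in Section 58.7.2 can be sketched as a cross-correlation test. This is a simplified, invented illustration: the "forgotten" u(t−2) term and all constants are chosen only to show how an unmodeled dependence shows up at its lag.

```python
import random

def residual_input_correlation(eps, u, max_lag):
    # Normalized sample cross-correlation between residuals eps(t) and u(t - tau).
    # Significant values indicate dynamics the model has not captured.
    N = len(eps)
    def cc(tau):
        num = sum(eps[t] * u[t - tau] for t in range(tau, N)) / N
        se = (sum(e * e for e in eps) / N) ** 0.5
        su = (sum(x * x for x in u) / N) ** 0.5
        return num / (se * su)
    return [cc(tau) for tau in range(1, max_lag + 1)]

# A model that missed the u(t-2) term leaves that dependence in its residuals.
random.seed(2)
N = 4000
u = [random.gauss(0, 1) for _ in range(N)]
noise = [0.1 * random.gauss(0, 1) for _ in range(N)]
eps = [0.8 * u[t - 2] + noise[t] if t >= 2 else noise[t] for t in range(N)]

corr = residual_input_correlation(eps, u, 4)
```

A large correlation at lag 2 (and only there) tells the user exactly which input delay the model failed to account for.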
The existing program packages differ mainly in various user interfaces and in options regarding the choice of model structure according to C above. For example, MATLAB's Identification Toolbox [13] covers all linear model structures discussed here, including arbitrarily parameterized linear models in continuous time. Regarding the user interface, there is now a trend toward graphical orientation. This avoids syntax problems, relies more on "click and move", and avoids tedious menu labyrinths. More aspects of CAD tools for system identification are treated in [15].
58.7.4 The Practical Side of System Identification

It follows from our discussion that, once the data have been recorded, the most essential steps in the process of identification are to try out various model structures, compute the best model in each structure using Equation 58.24, and then validate this model. Typically this has to be repeated with quite a few different structures before a satisfactory model can be found. The difficulties of this process should not be underestimated, and it will require substantial experience to master it. The following procedure may prove useful:

Step 1: Looking at the Data

Plot the data. Look at them carefully. Try to grasp the dynamics. Can you see the effects in the outputs of the input changes? Can you see nonlinear effects, like different responses at different levels, or different responses to a step up and a step down? Are there portions of the data that are "messy" or carry no information? Use this insight to select portions of the data for estimation and validation purposes.

Do physical levels play a role in your model? If not, detrend the data by removing their mean values. The models will then describe how changes in the input give changes in the output, but will not explain the actual levels of the signals. This is the normal situation. The default situation, with good data, is that you detrend by removing means, and then select the first two-thirds or so of the data record for estimation purposes, using the remaining data for validation. (All of this corresponds to the "Data Quickstart" in the MATLAB Identification Toolbox.)
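Removing the mean values, as recommended above, is a one-liner; a minimal sketch:

```python
def detrend(x):
    # Subtract the sample mean so models describe deviations around the
    # operating point rather than absolute signal levels.
    m = sum(x) / len(x)
    return [v - m for v in x]

y = detrend([3.0, 5.0, 4.0])  # mean 4.0 removed: [-1.0, 1.0, 0.0]
```

Removing linear drifts, rather than just means, follows the same pattern with a least-squares line fit in place of the mean.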
Step 2: Getting a Feel for the Difficulties

Compute and display the spectral-analysis frequency response estimate, the correlation-analysis impulse response estimate, a fourth-order ARX model with a delay estimated from the correlation analysis, and a default-order state-space model computed by a subspace method. (All of this corresponds to the "Estimate Quickstart" in the MATLAB Identification Toolbox.) This gives three plots. Look at the agreement between:

the spectral analysis estimate and the ARX and state-space models' frequency functions,
the correlation analysis estimate and the ARX and state-space models' transient responses,
the measured validation data output and the ARX and state-space models' simulated outputs.

We call this last plot the Model Output Plot. If these agreements are reasonable, the problem is not so difficult, and a relatively simple linear model will serve. Some fine-tuning of model orders and noise models has to be made, and you can proceed to Step 4. Otherwise, go to Step 3.

Step 3: Examining the Difficulties

There may be several reasons why the comparisons in Step 2 did not look good. This section discusses the most common ones and how they can be handled:

Model Unstable: The ARX or state-space model may be unstable but could still be useful for control purposes. Then change to a 5- or 10-step-ahead prediction instead of simulation in the Model Output Plot.

Feedback in Data: If there is feedback from the output to the input, due to some regulator, then the spectral and correlation analysis estimates are not reliable. Discrepancies between these estimates and the ARX and state-space models can, therefore, be disregarded. In residual analysis of the parametric models, feedback in data can also be visible as correlation between residuals and input for negative lags.
Noise Model: If the state-space model is clearly better than the ARX model at reproducing the measured output, this is an indication that the disturbances have a substantial influence, and it will be necessary to model them carefully.

Model Order: If a fourth-order model does not give a good Model Output Plot, try an eighth-order one. If the fit improves, it follows that higher-order models will be required, but that linear models could be sufficient.
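A sketch of the order test above, assuming plain least-squares ARX estimation. The fit_arx/simulate_arx helpers and the second-order example system are illustrative, not the Toolbox routines:

```python
import numpy as np

def fit_arx(y, u, na, nb, nk=1):
    """Least-squares ARX fit: y(t) = -a1*y(t-1)-...-a_na*y(t-na) + b1*u(t-nk)+... ."""
    n0 = max(na, nb + nk - 1)
    X = np.array([[-y[t - i] for i in range(1, na + 1)]
                  + [u[t - nk - i] for i in range(nb)] for t in range(n0, len(y))])
    theta, *_ = np.linalg.lstsq(X, y[n0:], rcond=None)
    return theta[:na], theta[na:]

def simulate_arx(a, b, u, nk=1):
    """Pure simulation of the fitted model (the basis of the Model Output Plot)."""
    y = np.zeros(len(u))
    for t in range(max(len(a), len(b) + nk - 1), len(u)):
        y[t] = (-sum(a[i] * y[t - 1 - i] for i in range(len(a)))
                + sum(b[i] * u[t - nk - i] for i in range(len(b))))
    return y

# Data from an assumed second-order system; compare 4th- vs 8th-order fits
# on held-back validation data.
rng = np.random.default_rng(0)
u = rng.standard_normal(600)
y = np.zeros(600)
for t in range(2, 600):
    y[t] = 1.2 * y[t-1] - 0.5 * y[t-2] + u[t-1] + 0.3 * u[t-2] \
           + 0.02 * rng.standard_normal()

fits = {}
for na in (4, 8):
    a, b = fit_arx(y[:400], u[:400], na, na)
    ysim = simulate_arx(a, b, u[400:])
    fits[na] = float(np.sqrt(np.mean((y[400:] - ysim) ** 2)))
```

If the eighth-order fit were markedly better than the fourth-order one, that would indicate higher orders are needed; here both capture the (second-order) dynamics.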
Additional Inputs: If the Model Output fit has not significantly improved by the tests so far, review the physics of the application. Are there more signals that have been, or could be, measured that might influence the output? If so, include these among the inputs and again try a fourth-order ARX model from all the inputs. (Note that the inputs need not at all be control signals; anything measurable, including disturbances, should be treated as inputs.)

Nonlinear Effects: If the fit between measured and model output is still bad, consider the physics of the application. Are there nonlinear effects in the system? In that case, form the nonlinearities from the measured data. This could be as simple as forming the product of voltage and current measurements, if you realize that it is the electrical power that is the driving stimulus in, say, a heating process, and temperature is the output. This is, of course, application dependent. It does not cost very much work, however, to form a number of additional inputs by reasonable nonlinear transformations of the measured ones, and test whether including them improves the fit (see Example 58.2).
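A sketch of forming such a candidate nonlinear input from measured voltage and current. All signals are synthetic, and a static one-coefficient fit stands in for the full dynamic model; the point is only that the physically motivated product (power) explains the output far better than the raw measurement:

```python
import numpy as np

rng = np.random.default_rng(1)
volt = 1.0 + 0.5 * rng.standard_normal(500)
curr = 2.0 + 0.5 * rng.standard_normal(500)
temp = 3.0 * volt * curr + 0.1 * rng.standard_normal(500)   # power drives the output

def one_regressor_fit(x, y):
    """Residual RMS of the best single-coefficient fit y ~ c*x."""
    c = np.dot(x, y) / np.dot(x, x)
    return float(np.sqrt(np.mean((y - c * x) ** 2)))

rms_volt  = one_regressor_fit(volt, temp)          # raw input only
rms_power = one_regressor_fit(volt * curr, temp)   # the nonlinear, physical input
```

A clear drop in residual RMS when the transformed input is included is the kind of evidence the test in the text is after.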
Still Problems? If none of these tests leads to a model that reproduces the Validation Data reasonably well, the conclusion might be that a sufficiently good model cannot be produced from the data, for several possible reasons. The most important one is that the data simply do not contain sufficient information, e.g., due to bad signal-to-noise ratios, large and nonstationary disturbances, varying system properties, etc. The system may also have some complicated nonlinearities, which cannot be realized on physical grounds. In such cases, nonlinear black-box models could be a solution. Among the most used models of this character are the Artificial Neural Networks (ANN) (see Section 58.6). Otherwise, use the insights on which inputs to use and which model orders to expect, and proceed to Step 4.

Step 4: Fine Tuning Orders and Noise Structures
For real data there is no such thing as a "correct model structure." However, different structures can give quite different model quality. The only way to determine this is to try a number of different structures and compare the properties of the models obtained. There are a few things to look for in these comparisons:
58.7. USER'S ISSUES
Fit between simulated and measured output: Look at the fit between the model's simulated output and the measured one for the Validation Data. Formally, you could pick the model for which this number is the lowest. In practice, it is better to be more pragmatic, to take into account the model complexity, and to judge whether the important features of the output response are captured.

Residual analysis test: You should require of a good model that the cross-correlation function between residuals and input does not go significantly outside the confidence region. A clear peak at lag k shows that the effect from input u(t - k) on y(t) is not properly described. A rule of thumb is that a slowly varying cross-correlation function outside the confidence region is an indication of too few poles, while sharper peaks indicate too few zeros or incorrect delays.

Pole-Zero cancellations: If the pole-zero plot (including confidence intervals) indicates pole-zero cancellations in the dynamics, this suggests that lower-order models can be used. In particular, if the order of ARX models has to be increased to get a good fit, but pole-zero cancellations are indicated, then the extra poles are just introduced to describe the noise. Then try ARMAX, OE, or BJ model structures with an A or F polynomial of an order equal to the number of noncancelled poles.

What Model Structures Should be Tested? You can spend any amount of time checking out a very large number of structures. It often takes only a few seconds to compute and evaluate a model in a certain structure, so you should have a generous attitude to the testing. However, experience shows that when the basic properties of the system's behavior have been picked up, it is not useful to fine tune orders ad infinitum to improve the fit by fractions of a percent. For ARX models and state-space models estimated by subspace methods, there are also efficient algorithms for handling many model structures in parallel.
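A sketch of the residual-input cross-correlation test, with an approximate 99% confidence band under the usual whiteness assumptions (the helper name and the synthetic signals are ours):

```python
import numpy as np

def residual_input_xcorr(eps, u, max_lag=20):
    """Normalized cross-correlation r_eu(k), k = 0..max_lag, plus a 99% band.

    A peak outside +/-2.58/sqrt(N) at lag k suggests the effect of u(t-k)
    on y(t) is not captured (too few zeros, or an incorrect delay).
    """
    eps = eps - eps.mean()
    u = u - u.mean()
    n = len(eps)
    denom = np.std(eps) * np.std(u) * n
    r = np.array([np.sum(eps[k:] * u[:n - k]) / denom for k in range(max_lag + 1)])
    band = 2.58 / np.sqrt(n)
    return r, band

rng = np.random.default_rng(2)
u = rng.standard_normal(2000)
eps = rng.standard_normal(2000)          # white residuals, independent of u
r, band = residual_input_xcorr(eps, u)   # should stay inside the band

r_bad, _ = residual_input_xcorr(u.copy(), u)  # perfectly correlated at lag 0
```

For a good model r stays inside the band at all lags; r_bad shows the kind of clear peak (here at lag 0) that signals an unmodeled input effect.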
Multivariable Systems: Systems with many input signals and/or many output signals are called multivariable. Such systems are often more challenging to model. In particular, systems with several outputs can be difficult because the couplings between several inputs and outputs lead to more complex models. The structures involved are richer, and more parameters are required to obtain a good fit.
Generally speaking, it is preferable to work with state-space models in the multivariable case, since the model structure complexity is easier to deal with. It is just a matter of choosing the model order.

Working with Subsets of the Input-Output Channels: In the process of identifying good system models, it is useful to select subsets of the input and output channels. Partial models of the system's behavior will then be constructed. It might not, for example, be clear whether all measured inputs have a significant influence on the outputs. That is most easily tested by removing an input channel from the data, building a model for the dependence of the output(s) on the remaining input channels, and checking whether there is a significant deterioration in the fit of the model's output to the measured output. See also the discussion under Step 3 above. Generally speaking, the fit gets better when more inputs are included and worse when more outputs are included. To understand the latter fact, you should realize that a model required to explain the behavior of several outputs has a tougher job than one accounting for only a single output. If you have difficulties in obtaining good models for a multi-output system, model one output at a time, to find out which ones are difficult to handle. Models to be used for simulations could very well be built up from single-output models, one output at a time. However, models for prediction and control can produce better results if constructed for all outputs simultaneously. This follows from the fact that knowing the set of all previous output channels is a better basis for prediction than knowing only the past outputs in one channel.

Step 5: Accepting the Model
The final step is to accept, at least for the time being, the model for its intended application. Recall the answer to question 10 in the introduction: No matter how good an estimated model looks on your screen, it is only a simple reflection of reality.
Surprisingly often, however, this is sufficient for rational decision making.
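The input-removal check described above can be sketched with a static least-squares fit (synthetic signals; for brevity the dynamics are ignored, so a single coefficient per channel stands in for a full dynamic model):

```python
import numpy as np

def fit_and_score(X, y):
    """Least-squares fit y ~ X @ theta; return the residual RMS on the same data."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((y - X @ theta) ** 2)))

rng = np.random.default_rng(3)
u1 = rng.standard_normal(400)
u2 = rng.standard_normal(400)            # candidate extra input channel
y = 1.0 * u1 + 0.8 * u2 + 0.1 * rng.standard_normal(400)

rms_without = fit_and_score(u1[:, None], y)
rms_with = fit_and_score(np.column_stack([u1, u2]), y)
```

A clear deterioration when u2 is removed (rms_without much larger than rms_with) says the channel carries real information about the output.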
SECTION XIII
Stochastic Control
Discrete Time Markov Processes
Adam Shwartz, Electrical Engineering, Technion - Israel Institute of Technology, Haifa, Israel
59.1 Caveat ................................................... 1057
59.2 Introduction ............................................. 1057
59.3 Definitions and Construction ............................. 1058
59.4 Properties and Classification ............................ 1059
59.5 Algebraic View and Stationarity .......................... 1060
59.6 Random Variables ......................................... 1061
59.7 Limit Theorems: Transitions .............................. 1062
59.8 Ergodic Theorems ......................................... 1063
59.9 Extensions and Comments .................................. 1064
References .................................................... 1064
59.1 Caveat

What follows is a quick survey of the main ingredients in the theory of discrete-time Markov processes. It is a bird's view, rather than the definitive "state of the art". To maximize accessibility, the nomenclature of mathematical probability is avoided, although rigor is not sacrificed. To compensate, examples (and counterexamples) abound and the bibliography is annotated. Relevance to control is discussed in Section 59.9.
59.2 Introduction

Discrete-time Markov processes, or Markov chains, are a powerful tool for modeling and analysis of discrete-time systems whose behavior is influenced by randomness. A Markov chain is probably the simplest object which incorporates both dynamics (that is, notions of "state" and time) and randomness. Let us illustrate the idea through a gambling example.
EXAMPLE 59.1:
A gambler bets one dollar on "red" at every turn of a (fair) game of roulette. Then, at every turn he gains either one dollar (win, with probability 1/2) or (-1) (lose). In an ideal game, the gains form a sequence of independent, identically distributed (i.i.d.) random variables. We cannot predict the outcome of each bet, although we do know the odds. Denote by X_t the total fortune the gambler has at time t. If we know X_s, then we can calculate the distribution of X_{s+1} (that is, the probability that X_{s+1} = y, for all possible values of y), and even of X_{s+k} for any k > 0. The variable X_t serves as a "state" in the following sense: given X_s,
0-8493-8570-9/96/$0.00+$.50 (c) 1996 by CRC Press, Inc.
knowledge of X_t for t < s is irrelevant to the calculation of the distribution of future values of the state. This notion of a state is similar to classical "state space" descriptions. Consider the standard linear model of a dynamical system

X_{t+1} = A X_t + V_t,    (59.1)

or the more general nonlinear model

X_{t+1} = f(X_t, V_t).    (59.2)
It is intuitively clear that, given the present state, the past does not provide additional information about the future, as long as V_t is not predictable (this is why deterministic state-space models require that V_t be allowed to change arbitrarily at each t). This may remain true even when V_t is random: for example, when the V_t are i.i.d., for then the past does not provide additional information about future V_t's. As we shall see in Theorem 59.1, in this case (59.1) and (59.2) define Markov chains. In the next section we give the basic definitions and describe the dynamics of Markov chains. We assume that the state space S is countable. We restrict our attention to time-homogeneous dynamics (see the comment following Theorem 59.1), and discuss the limiting properties of the Markov chain. Finally, we shall discuss extensions to continuous time and to more general state spaces. We conclude this section with a brief review of standard notation. All of our random variables and events are defined on a probability space Omega, with a collection F of events (subsets of Omega) and probability P. The "probability triple" (Omega, F, P) is fixed, and we denote expectation by E. For events A and B with P(B) > 0, the basic definition of a conditional probability (the multiplication rule) is

P(A | B) = P(A and B) / P(B).
The abbreviation i.i.d. stands for independent, identically distributed, and random variables are denoted by capital letters. The identity matrix is denoted by I, and 1_A is the indicator function of A, that is, 1_A(w) = 1 if w is in A and 1_A(w) = 0 otherwise.
59.3 Definitions and Construction

Let X_0, X_1, ... be a sequence of random variables, with values in a state space S. We assume that S is finite or countable, and for convenience we usually set S = {1, 2, ...}.
DEFINITION 59.1 A sequence X_0, X_1, ... on S is a Markov chain if it possesses the Markov property, that is, if for all t > 0,

P(X_t = j_t | X_{t-1} = j_{t-1}, X_{t-2} = j_{t-2}, ..., X_0 = j_0) = P(X_t = j_t | X_{t-1} = j_{t-1}).

A Markov chain is called homogeneous if P(X_t = j | X_{t-1} = i) does not depend on t. In this case we denote the transition probability from i to j by

p_ij = P(X_t = j | X_{t-1} = i).

The matrix P = {p_ij, i, j = 1, ...} is called the transition matrix. Henceforth, we restrict our attention to homogeneous chains. Using the Markov property, a little algebra with the definition of conditional probability gives the more general Markov property: if t_1 <= t_2 <= ... <= t_k <= ... <= t_l, then

P(X_{t_l} = j_l, X_{t_{l-1}} = j_{l-1}, ..., X_{t_k} = j_k | X_{t_{k-1}} = j_{k-1}, ..., X_{t_1} = j_1)
    = P(X_{t_l} = j_l, X_{t_{l-1}} = j_{l-1}, ..., X_{t_k} = j_k | X_{t_{k-1}} = j_{k-1}).    (59.4)

This is a precise statement of the intuitive idea given in the introduction: given the present state X_{t_{k-1}} = j_{k-1}, the past does not provide additional information. A chain is called finite if S is a finite set. An alternative name for "homogeneous Markov chain" is "Markov chain with stationary transition probabilities". There are those who call a homogeneous Markov chain a "stationary Markov chain." However, since "stationary process" means something entirely different, we shall avoid such usage (see Example 59.6).

Suppose that a process is a Markov chain according to Definition 59.1, and that its initial distribution is given by the row vector p(0), that is,

p_i(0) = P(X_0 = i).

Then we can calculate the joint probability distribution at times 0 and 1 from the definition of conditional probability,

P(X_0 = i, X_1 = j) = p_i(0) p_ij.

More generally, using the Markov property:

P(X_0 = j_0, X_1 = j_1, ..., X_t = j_t) = p_{j_0}(0) p_{j_0 j_1} p_{j_1 j_2} ... p_{j_{t-1} j_t}.

So, the probability distribution of the whole process can be calculated from two quantities: the initial probability distribution and the transition probabilities. In particular, we can calculate the probability distribution at any time t:

p_j(t) = P(X_t = j) = sum over i_0, ..., i_{t-1} of p_{i_0}(0) p_{i_0 i_1} ... p_{i_{t-1} j},    (59.5)

where the sum is over all states in S, that is, each index is summed over the values 1, 2, .... In vector notation,

p(t) = p(0) P^t.    (59.6)

(If S is countable then, of course, the matrix is infinite, but it is manipulated in exactly the same way as a finite matrix.) Thus, the probability distribution of a Markov chain evolves as a linear dynamical system, even when its evolution equation (59.2) is nonlinear. The one-dimensional probability distribution and the transition probabilities clearly satisfy

p_j(t) >= 0,  sum_j p_j(t) = 1,  p_ij >= 0,  sum_j p_ij = 1.    (59.7)
That is, the rows of the transition matrix P sum to one. Thus, P is a stochastic matrix: its elements are nonnegative and its rows sum to one. If we denote by p_ij^(n) = P(X_{m+n} = j | X_m = i) the n-step transition probability from i to j, then we obtain from the definition of conditional probability and the Markov property (59.4) the Chapman-Kolmogorov equations

p_ij^(m+n) = sum_k p_ik^(m) p_kj^(n).    (59.8)

Therefore, the p_ij^(n) are the elements of the matrix P^n. This matrix notation yields a compact expression for expectations of functions of the state. To compute E g(X_t) for some function g on S = {1, 2, ...}, we represent g by a column vector g = (g(1), g(2), ...)^T (where T denotes transpose). Then

E g(X_t) = sum_j g(j) p_j(t) = p(0) P^t g.
Note that this expression does not depend on the particular S: since the state space is countable we can, by definition, relabel the states so that the state space becomes {1, 2, ...}. Let us now summarize the connection between the representations (59.1)-(59.2) and Markov chains.
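The linear evolution p(t) = p(0) P^t and the expectation formula can be checked numerically; the two-state chain below is illustrative:

```python
import numpy as np

# Two-state chain: p(t) = p(0) P^t, and E g(X_t) = p(t) @ g.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
p0 = np.array([1.0, 0.0])            # start in state 1
g = np.array([0.0, 1.0])             # indicator of state 2

def distribution(p0, P, t):
    """Row-vector evolution p(t) = p(0) P^t."""
    return p0 @ np.linalg.matrix_power(P, t)

p10 = distribution(p0, P, 10)
expected_g = float(p10 @ g)          # equals P(X_10 = state 2)
```

For this P the invariant distribution is (0.8, 0.2), so p(10) is already very close to it (the second eigenvalue is 0.5, so the deviation decays like 0.5^t).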
THEOREM 59.1 Let V_0, V_1, ... be a sequence of i.i.d. random variables, independent of X_0. Then for any (measurable) function f, the sequence X_0, X_1, ... defined through (59.2) is a Markov chain. Conversely, let Y_0, Y_1, ... be a Markov chain with values in S. Then there is a (probability triple and a measurable) function f and a sequence V_0, V_1, ... of i.i.d. random variables so that the process X_0, X_1, ... defined by (59.2) with X_0 = Y_0 has the same probability distribution as Y_0, Y_1, ..., that is, for all t and j_0, ..., j_t,

P(X_0 = j_0, ..., X_t = j_t) = P(Y_0 = j_0, ..., Y_t = j_t).
Note that, whether the system (59.2) is linear or not, the evolution of the probability distribution (59.5)-(59.6) is always linear. We have seen that a Markov chain defines a set of transition probabilities. The converse is also true: given a set of transition probabilities and an initial probability distribution, it is possible to construct a stochastic process which is a Markov chain with the specified transitions and probability distribution.
THEOREM 59.2 If X_0, X_1, ... is a homogeneous Markov chain then its probability distribution and transition probabilities satisfy (59.7) and (59.8). Conversely, given p(0) and a matrix P that satisfy (59.7), there exists a (probability triple and a) Markov chain with initial distribution p(0) and transition matrix P.

The restriction to homogeneous chains is not too bad: if we
define a new state (t, X_t), then it is not hard to see that we can incorporate explicit time dependence, and the new state space is still countable.

EXAMPLE 59.2:
Let V_0, V_1, ... be i.i.d. and independent of X_0. Assume both V_t and X_0 have integer values. Then

X_{t+1} = X_t + V_t    (59.9)

defines a Markov chain called a chain with stationary independent increments, with state space {..., -1, 0, 1, ...}. The transition probability p_ij depends only on the difference j - i. It turns out [1] that the converse is also true: if the transition probabilities of a Markov chain depend only on the difference j - i, then the process can be obtained via (59.9) with i.i.d. V_t. A random walk is a process defined through Equation 59.9, but where the V_t are not necessarily integers; they are real valued.

59.4 Properties and Classification

Given two states i and j, it may or may not be possible to reach j from i. This leads to the notion of classes.

DEFINITION 59.2 We say a state i leads to j if

P(X_t = j for some t | X_0 = i) > 0.

This holds if and only if p_ij^(t) > 0 for some t. We say states i and j communicate, denoted by i <-> j, if i leads to j and j leads to i.

Communication is a property of pairs: it is obviously symmetric (i <-> j if and only if j <-> i) and is transitive (i <-> j and j <-> k implies i <-> k) by the Chapman-Kolmogorov equations (59.8). By convention, i <-> i. By these three properties, <-> defines an equivalence relation. We can therefore partition S into non-empty communicating classes S_0, S_1, ... with the properties:

1. every state i belongs to exactly one class,
2. if i and j belong to the same class then i <-> j, and
3. if i and j belong to different classes then i and j do not communicate.

We denote the class containing state i by S(i). Note that if i leads to j but j does not lead to i, then i and j do not communicate.

DEFINITION 59.3 A set of states C is closed, or absorbing, if p_ij = 0 whenever i is in C and j is not in C. Equivalently, sum over j in C of p_ij = 1 for every i in C.

If a set C is not closed, then it is called open. A Markov chain is irreducible if all states communicate. In this case its partition contains exactly one class. A Markov chain is indecomposable if its partition contains at most one closed set. Define the incidence matrix Z as follows:

Z_ij = 1 if p_ij > 0, and Z_ij = 0 otherwise.
We can also define a directed graph whose nodes are the states, with a directed arc between any two states for which Z_ij = 1. The communication properties can obviously be extracted from Z or from the directed graph. The chain is irreducible if and only if the directed graph is connected in the sense that, going in the direction of the arcs, we can reach any node from any other node. Closed classes can also be defined in terms of the incidence matrix or the graph. The classification leads to the following maximal decomposition.

THEOREM 59.3 By renumbering the states, if necessary, we can put the matrix P into the block form

P = [ P_1  0    ...  0    0 ]
    [ 0    P_2  ...  0    0 ]
    [ ...  ...  ...  ...    ]
    [ R_1  R_2  ...  R_k  Q ]    (59.10)

where the blocks P_i correspond to closed irreducible classes. It is maximal in the sense that smaller classes will not be closed, and no subset of states corresponding to Q is a closed irreducible class.

If S is countable, then the number of classes may be infinite. Note that all the definitions in this section apply when we replace S with any closed class. Much like other dynamical systems, Markov chains can have cyclic behavior, and can be unstable. The relevant definitions are:

DEFINITION 59.4 A state i has period d if d is the greatest common divisor of the set {t : p_ii^(t) > 0}. If d is finite and d > 1 then the state is called periodic; otherwise it is aperiodic.

DEFINITION 59.5 A state i is called recurrent if the probability of starting at i and returning to i in finite time is 1. Formally, if P(X_t = i for some t >= 1 | X_0 = i) = 1. Otherwise it is called transient.
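The partition into communicating classes can be computed mechanically from the incidence matrix Z; the helper names below are ours, and the two chains are the ones of Example 59.3 with states renumbered 0 and 1:

```python
import numpy as np

def reachable(Z):
    """R[i, j] is True iff i leads to j in one or more steps."""
    R = np.array(Z, dtype=np.int64)
    for _ in range(len(Z)):
        R = np.minimum(R + R @ R, 1)       # grow paths; clip entries to {0, 1}
    return R.astype(bool)

def communicating_classes(P):
    """Partition of the states under i <-> j (i <-> i holds by convention)."""
    Z = (np.asarray(P) > 0).astype(np.int64)   # incidence matrix
    R = reachable(Z)
    comm = (R & R.T) | np.eye(len(Z), dtype=bool)
    classes, seen = [], set()
    for i in range(len(Z)):
        if i not in seen:
            cls = frozenset(int(j) for j in np.flatnonzero(comm[i]))
            classes.append(cls)
            seen |= cls
    return classes

classes_periodic = communicating_classes([[0.0, 1.0], [1.0, 0.0]])   # irreducible
classes_absorbing = communicating_classes([[0.0, 1.0], [0.0, 1.0]])  # {1} closed, {0} open
```

The first chain yields a single class (irreducible); the second splits into two singleton classes, matching the discussion of Example 59.3.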
EXAMPLE 59.3:
In the chain on S = {1, 2} with p_ij = 1 if and only if i != j, both states are periodic with period d = 2 and both states are recurrent. The states communicate, and so S contains exactly one class, which is therefore closed. Consequently the chain is irreducible and indecomposable. However, if p_12 = p_22 = 1 then state 1 does not lead to itself, the states are not periodic, state 2 is recurrent and state 1 is transient. In this case, the partition contains two sets: the closed set {2}, and the open set {1}. Consequently, the chain is not irreducible, but it is indecomposable. When S is finite, then either it is irreducible or it contains a closed proper subset.

EXAMPLE 59.4:
Let S = {1, 2, ...}. Suppose p_ij = 1 if and only if j = i + 1. Then all states are transient, and S is indecomposable but not irreducible. Every set of the form {i : i >= k} is closed, but in the partition of the state space each state is the only member in its class. Suppose now p_11 = 1 and, for i > 1, p_ij = 1 if and only if j = i - 1. Then state 1 is the only recurrent state, and again each state is alone in its class.

THEOREM 59.4 Let S_k be a class. Then either all states in S_k are recurrent, or all are transient. Moreover, all states in S_k have the same period d.

59.5 Algebraic View and Stationarity

The matrix P is positive, in the sense that its entries are positive. When S is finite, the Perron-Frobenius theorem implies [8]:

THEOREM 59.5 Let S be finite. Then P has a nonzero left eigenvector pi whose entries are nonnegative, and pi P = pi, that is, the corresponding eigenvalue is 1. Moreover, |lambda| <= 1 for all other eigenvalues lambda. The multiplicity of the eigenvalue 1 is equal to the number of irreducible closed subsets of the chain. In particular, if the entries of P^n are all positive for some n, then the eigenvalue 1 has multiplicity 1. In this case, the entries of pi are all positive and |lambda| < 1 for all other eigenvalues lambda.

If the entries of P^n are all positive for some n then the chain is irreducible and aperiodic, hence the second part of the theorem. If the chain is irreducible and periodic with period d, then the dth roots of unity are left eigenvalues of P, each is of multiplicity 1, and all other eigenvalues have strictly smaller modulus. The results for a general finite chain can be obtained by writing the chain in the block form (59.10).

DEFINITION 59.6 Let S be finite or countable. A probability distribution p satisfying p P = p is called invariant (under P) or stationary.

Theorem 59.5 thus implies that every finite Markov chain possesses at least one invariant probability distribution. For countable chains, Example 59.4 shows that this is not true.
EXAMPLE 59.5:
Returning to Example 59.3, in the first case (1/2, 1/2) is the only invariant probability distribution, while in the second case (0, 1) is the only invariant probability distribution. In Example 59.4, in the first case there is no invariant probability distribution, while in the second case (1, 0, 0, ...) is the only invariant probability distribution. Finally, if P = I, the 2 x 2 identity matrix, then pi = (p, 1 - p) is invariant for any 0 <= p <= 1.
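A numerical route to an invariant distribution, following Theorem 59.5: take a left eigenvector of P for the eigenvalue 1 (equivalently, a right eigenvector of P transposed) and normalize it to sum to one. The 3-state chain is illustrative:

```python
import numpy as np

def invariant_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to a probability vector."""
    w, V = np.linalg.eig(np.asarray(P).T)   # right eigenvectors of P^T = left of P
    k = np.argmin(np.abs(w - 1.0))          # pick the eigenvalue closest to 1
    pi = np.real(V[:, k])
    return pi / pi.sum()                    # normalization also fixes the sign

P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
pi = invariant_distribution(P)
```

For this chain the balance equations give pi = (0.25, 0.5, 0.25) in closed form, which the eigenvector computation reproduces.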
EXAMPLE 59.6:
Recall that a process X_0, X_1, ... is called stationary if, for all positive t and s, the distribution of {X_0, X_1, ..., X_t} is the same as the distribution of {X_s, X_{s+1}, ..., X_{t+s}}. From the definitions it follows that a (homogeneous) Markov chain (finite or not) is stationary if and only if p(0) is invariant.

A very useful tool in the calculation of invariant probability distributions is the "balance equations":

sum_i pi_i p_ij = pi_j = sum_i pi_j p_ji,    (59.11)

where the first equality is just a restatement of the definition of invariant probability, and the second follows since, by (59.7), the last sum equals 1. The intuition behind these equations is very useful: in steady state, the rate at which "probability mass enters" must be equal to the rate it "leaves". This is particularly useful for continuous-time chains. More generally, given any set of states, the rate
at which "probability mass enters" the set (under the stationary distribution) equals the rate it "leaves":
THEOREM 59.6 Let A be a set of states, a subset of S. Then

sum over i in A, j not in A of pi_i p_ij = sum over i not in A, j in A of pi_i p_ij.
EXAMPLE 59.7:
Random walk with a reflecting barrier. This example models a discrete-time queue where, at each instance, either arrival or departure occurs. The state space S is the set of non-negative integers (including 0), and

p_00 = 1 - p,  p_01 = p,  p_{i(i+1)} = p,  p_{i(i-1)} = 1 - p  for i >= 1.

Then all states communicate so that the chain is irreducible, and the chain is aperiodic and recurrent. From (59.11) we obtain

pi_i p = pi_{i+1} (1 - p),  i >= 0.

When p < 1/2, this and (59.7) imply that

pi_i = ((1 - 2p)/(1 - p)) (p/(1 - p))^i  for i >= 0.

EXAMPLE 59.8:
Birth-death process. A Markov chain on S = {0, 1, ...} is a birth-death process if p_ij = 0 whenever |i - j| >= 2. If X_t is the number of individuals alive at time t then, at any point in time, this number can increase by one (birth), decrease by one (death) or remain constant (simultaneous birth and death). Unlike Example 59.7, here the probability of a change in size may depend on the state.

59.6 Random Variables

In this section we shift our emphasis back from algebra to the stochastic process. We define some useful random variables associated with the Markov chain. It will be convenient to use P_j for the probability conditioned on the process starting at state j. That is, for an event A,

P_j(A) = P(A | X_0 = j),

with a similar convention for expectation E_j. The Markov property says that the past of a Markov chain is immaterial given the present. But suppose we observe a process until a random time, say the time a certain event occurs. Is this property preserved? The answer is positive, but only for non-anticipative times:

DEFINITION 59.7 Let A be a collection of states, that is, a subset of S. The hitting time tau_A of A is the first time the Markov chain visits a state in A. Formally,

tau_A = inf{t > 0 : X_t in A}.

Note that by convention, if X_t never visits A then tau_A = infinity. The initial time, here t = 0, does not qualify in testing whether the process did or did not visit A. By definition, hitting times have the following property: in order to decide whether or not tau_A = t, it suffices to know the values of X_0, ..., X_t. This gives rise to the notion of Markov time or stopping time.

DEFINITION 59.8 A random variable tau with positive integer values is called a stopping time, or Markov time (with respect to the process X_0, X_1, ...) if one of the following equivalent conditions holds. For each t >= 0:

1. it suffices to know the values of X_0, X_1, ..., X_t in order to determine whether the event {tau = t} occurred or not,
2. there exists a function f_t so that 1_{tau = t}(w) = f_t(X_0(w), ..., X_t(w)).

An equivalent, and more standard, definition is obtained by replacing tau = t by tau <= t. With respect to such times, the Markov property holds in a stronger sense.

THEOREM 59.7 Strong Markov property. If tau is a stopping time for X_0, X_1, ..., then

P(X_{tau+1} = j_1, X_{tau+2} = j_2, ..., X_{tau+m} = j_m | X_t = i_t, t < tau, X_tau = i*)
    = P(X_1 = j_1, X_2 = j_2, ..., X_m = j_m | X_0 = i*).

We can now rephrase and complement the definition of recurrence. We write tau_j when we really mean tau_{{j}}.
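Hitting times are easy to estimate by simulation. For a positive recurrent chain the mean return time satisfies E_j tau_j = 1/pi_j (Kac's formula); the two-state chain below, whose invariant distribution is (0.8, 0.2), is illustrative:

```python
import numpy as np

def hitting_time(P, start, target, rng, max_steps=10_000):
    """Simulate the chain from `start`; return the first t > 0 with X_t in `target`."""
    x = start
    for t in range(1, max_steps + 1):
        x = rng.choice(len(P), p=P[x])
        if x in target:
            return t
    return np.inf

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
rng = np.random.default_rng(4)
samples = [hitting_time(P, 0, {0}, rng) for _ in range(4000)]
mean_return = float(np.mean(samples))    # should approach 1/pi_0 = 1.25
```

A short exact check: from state 0 the chain returns in one step with probability 0.9, and otherwise spends a Geometric(0.4) number of steps in state 1, giving 0.9 * 1 + 0.1 * (1 + 2.5) = 1.25.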
DEFINITION 59.9 The state j is recurrent if P_j(tau_j < infinity) = 1. It is called positive recurrent if E_j tau_j < infinity, and null-recurrent if E_j tau_j = infinity.

If state j is recurrent, then the hitting time of j is finite. By the strong Markov property, when the process hits j for the first time, it "restarts"; therefore, it will hit j again, and again. So, let N_j be the number of times the process hits state j:

N_j = sum over t >= 1 of 1_{X_t = j}.
THEOREM 59.8
1. If a state is positive recurrent, then all states in its class are positive recurrent. The same holds for null recurrence.
2. Suppose j is recurrent. Then P_j(N_j = ∞) = 1, and consequently E_j N_j = ∞. Moreover, for every state i,

P_i(N_j = ∞) = P_i(τ_j < ∞) · P_j(N_j = ∞) = P_i(τ_j < ∞),

and if P_i(τ_j < ∞) > 0, then E_i N_j = ∞.
3. Suppose j is transient. Then P_j(N_j < ∞) = 1, and for all i,

E_i N_j = P_i(τ_j < ∞) / (1 − P_j(τ_j < ∞)).

THE CONTROL HANDBOOK

To see why the last relation should hold, note that by the strong Markov property,

P_i(τ_j < ∞ and a second visit occurs) = P_i(τ_j < ∞) · P_j(τ_j < ∞),

and similarly for later visits. This means that the distribution of the number of visits is geometric: with every visit we get another chance, with equal probability, to revisit. Therefore,

E_i N_j = P_i(τ_j < ∞) + P_i(τ_j < ∞ and a second visit occurs) + ···
        = P_i(τ_j < ∞)(1 + P(a second visit occurs | τ_j < ∞) + ···)
        = P_i(τ_j < ∞)(1 + P_j(τ_j < ∞) + P_j(τ_j < ∞)² + ···),

which is what we obtain if we expand the denominator. A similar interpretation gives rise to a simple criterion for recurrence in terms of transition probabilities, since E_i N_j = Σ_{t≥1} p_ij^(t); in particular, if state j is transient, then Σ_t p_ij^(t) < ∞.

59.7 Limit Theorems: Transitions

Classical limit theorems concern the behavior of t-step transition probabilities, for large t. Limits for the random variables are discussed in Section 59.8.

THEOREM 59.9 For every Markov chain, the limit

P* := lim_{t→∞} (1/t) Σ_{s=1}^t P^s

exists and satisfies

P* · P = P · P* = P* · P* = P*.

If S is finite, then P* is a stochastic matrix.

EXAMPLE 59.9: Example 59.3 continued.

For the periodic chain, p_ij^(t) clearly does not converge. However, P*_ij = 1/2 for all i, j, and the rows define an invariant measure.

Here are the missing cases from Theorem 59.9. Denote the mean hitting time of state j starting at i by m_ij = E_i τ_j. Clearly m_jj is infinite if j is not positive recurrent, and we shall use the convention that a/∞ = 0 whenever a is finite.

THEOREM 59.10
1. Suppose the Markov chain is indecomposable, recurrent, and nonperiodic. Then, for all states i, j, k,

lim_{t→∞} p_ij^(t) = lim_{t→∞} p_kj^(t) = 1/m_jj.

2. An irreducible chain is positive recurrent if and only if it has an invariant probability distribution π, and in this case lim_{t→∞} p_ij^(t) = π(j) for all i, j. If it is null recurrent, then for all i, j, lim_{t→∞} p_ij^(t) = 0.
3. If state j is transient, then lim_{t→∞} p_ij^(t) = 0. If j is recurrent with period d, then lim_{t→∞} p_jj^(td) = d/m_jj; if j is moreover nonperiodic, this remains true with d = 1, so that (by Theorem 59.9) π(j) · m_jj = 1.

The last statement should be intuitive: the steady-state probability of visiting state j is a measure of how often this state is "visited", and this is inversely proportional to the mean time between visits. Since a finite Markov chain always contains a finite closed set of states, there always exists an invariant distribution. Moreover, if a finite set is recurrent, then it is positive recurrent.

EXAMPLE 59.10: Example 59.8 continued.

Assume that for the birth-death process p_{i(i+1)} > 0 and p_{(i+1)i} > 0 for all i ≥ 0, and that p_{ii} > 0 for some i. Then the chain is obviously irreducible, and aperiodic (if p_{ii} = 0 for all i, then d = 2). Using (59.11) we obtain that an invariant probability distribution, if it exists, must satisfy

π_i = (p_{01} p_{12} ··· p_{(i−1)i}) / (p_{10} p_{21} ··· p_{i(i−1)}) · π_0.   (59.13)

Therefore, any invariant probability must satisfy π_i > 0 for all i, and in particular π_0 > 0. So, we can invoke (59.7) to obtain the following dichotomy. Either

Z := Σ_{i≥0} (p_{01} ··· p_{(i−1)i}) / (p_{10} ··· p_{i(i−1)}) < ∞,

in which case (59.13)-(59.14) determine the unique invariant probability, and we conclude that the Markov chain is positive recurrent; or Z = ∞, in which case there is no invariant probability and the chain is not positive recurrent.

In terms of the transition matrix P: if a chain is nonperiodic, indecomposable, and recurrent, then the matrix P^t converges (uniformly over rows) to a matrix having identical rows, which are either all zeros (the null-recurrent case) or equal to the invariant probability distribution. The rate at which convergence takes place depends on the second-largest eigenvalue of P; in particular, if the Markov chain is finite, indecomposable, and aperiodic with invariant probability distribution π, then the t-step transition probabilities converge to π geometrically fast.
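These limits can be seen on a small example. The two-state chain below is a hypothetical illustration, not from the text: the rows of P^t converge geometrically to the invariant distribution π (here the second eigenvalue of P is 0.3), and the mean return time to state 0 satisfies π(0) · m_00 = 1.

```python
import random

# Hypothetical two-state chain used to illustrate the limit theorems above.
P = [[0.7, 0.3],
     [0.4, 0.6]]
pi = [4 / 7, 3 / 7]   # invariant distribution: pi = pi P, pi_0 + pi_1 = 1

# 1) Rows of P^t converge to pi, at geometric rate 0.3^t (second eigenvalue).
Pt = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(30):
    Pt = [[sum(Pt[i][k] * P[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
row_err = max(abs(Pt[i][j] - pi[j]) for i in range(2) for j in range(2))

# 2) pi(0) * m_00 = 1: the mean return time to state 0 should be 1/pi_0 = 1.75.
rng = random.Random(1)
def return_time(rng):
    x, t = 0, 0
    while True:
        x = 0 if rng.random() < P[x][0] else 1
        t += 1
        if x == 0:
            return t
mean_rt = sum(return_time(rng) for _ in range(20000)) / 20000
```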
In fact, for such a chain there are a constant R and a ρ < 1 such that

|p_ij^(t) − π(j)| ≤ R ρ^t   for all i, j,

and this of course implies that the one-dimensional distributions converge geometrically fast. On the other hand, the Markov structure implies that if indeed the one-dimensional distributions converge, then the distribution of the whole process converges:

THEOREM 59.11 Suppose that for all i and j we have lim_{t→∞} p_ij^(t) = π(j) for some probability distribution π. Then π is an invariant probability distribution, and for any i,

lim_{t→∞} P_i(X_{t+1} = j_1, X_{t+2} = j_2, ...) = P_π(X_1 = j_1, X_2 = j_2, ...),

where P_π is obtained by starting the process with the distribution π (in fact, the distribution of the process converges).

59.8 Ergodic Theorems

We do not expect the Markov chain itself to converge: since transition probabilities are homogeneous, the probability of leaving a given state does not change in time. However, in analogy with i.i.d. random variables, there are limit theorems under the right scaling. The connection to the i.i.d. case comes from the following construction. Fix an arbitrary state j and let T_k denote the time between the (k − 1)st and the kth visits of the process to j, with T_1 the time of the first visit.

THEOREM 59.12 If j is recurrent and the Markov chain starts at j (with probability one), then T_1, T_2, ... is a sequence of i.i.d. random variables. Moreover, the random vectors of states observed between consecutive visits to j are independent and identically distributed (in the space of sequences of variable length!).

This is another manifestation of the fact that, once we know the Markov chain hits some state, future behavior is (probabilistically) determined. Fix a recurrent state i and a time t, and define

T^t = max{k : T_1 + ··· + T_k ≤ t}.

Denote by N_s(j) the number of times in 1, 2, ..., s that X_u = j. By Theorem 59.12, the random variables {N_{T_1+···+T_{k+1}}(j) − N_{T_1+···+T_k}(j), k = 1, 2, ...} are independent, and (except possibly for k = 1) are identically distributed for each j. By the law of large numbers this implies the following.

THEOREM 59.13 Let i be recurrent. Then, starting at i,

lim_{t→∞} N_t(j)/t = lim_{t→∞} (1/t) E_i Σ_{s=1}^t 1{X_s = j} =: π(j)   with P_i-probability 1,

and π is an invariant probability distribution, concentrated on the closed class containing i; that is, π_k = 0 if i ↛ k. Moreover, if we start in some state j, then

lim_{t→∞} N_t(i)/t = 1/m_ii   with P_j-probability 1.

The last relation implies (59.12): taking expectations, we obtain that if i is recurrent, then P*_ji = P_j(τ_i < ∞)/m_ii. From here follows a limit theorem for functions of a Markov chain.

THEOREM 59.14 Ergodic Theorem. Let S be a single recurrent class, and assume π is an invariant probability distribution. Let f and g be functions such that E_π|f(X_0)| < ∞ and E_π|g(X_0)| < ∞. Then for an arbitrary starting state i, with probability one,

lim_{t→∞} (Σ_{s=0}^t f(X_s)) / (Σ_{s=0}^t g(X_s)) = lim_{t→∞} (E_i Σ_{s=0}^t f(X_s)) / (E_i Σ_{s=0}^t g(X_s)) = E_π f(X_0) / E_π g(X_0),

provided not both the numerator and the denominator of the last term are zero.

Setting g ≡ 1 we obtain a law of large numbers for a function of the Markov chain. Here is a statement of a Central Limit Theorem and a Law of the Iterated Logarithm.

THEOREM 59.15 Let S be a single positive recurrent class with an invariant probability distribution π. Fix a function f, and suppose that for some i, E_i[τ_i²] < ∞ and E_π f(X_0) = 0. Define

γ² := E_i[(Σ_{s=1}^{τ_i} f(X_s))²] / m_ii.

Then γ² < ∞, and if γ² > 0 then (CLT)

(Σ_{s=0}^t f(X_s)) / √(γ² t)

converges in distribution (as t → ∞) to a standard Gaussian random variable. Moreover, the sum satisfies the law of the iterated logarithm (LIL); that is, the lim sup of

(Σ_{s=0}^t f(X_s)) / √(2 γ² t log log t),
as t → ∞, is 1, and the lim inf is −1.

59.9 Extensions and Comments

Markov processes are very general objects, of which our treatment covered just a fraction. Many of the basic ideas extend, but definitely not all, and usually some effort is required. In addition, the mathematical difficulties rise exponentially fast. There are two obvious directions in which to extend: more general state spaces, and continuous time.

When the state space is not countable, the probability that an arbitrary point in the state space is "visited" by the process is usually zero. Therefore, the notions of communication, hitting, recurrence, and periodicity have to be modified. The most extensive reference here is [4].

In the discrete-space, continuous-time setting, the Markov property implies that if X_t = i, then the values of X_s, s < t, are not relevant. In particular, the length of time from the last change in state until t should be irrelevant. This implies that the distribution of the time between jumps (= changes in state) should be exponential. If the only possible transition is from i to i + 1, and if all transition times have the same distribution (that is, they are all exponential with the same parameter), then we obtain the Poisson process. More generally, we can describe most discrete-state, continuous-time Markov chains as follows. The process stays at state i an exponential amount of time with parameter λ(i). It then jumps to the next state according to a transition probability p_ij, and the procedure is repeated (this is correct if, for example, λ(i) ≥ λ > 0 for all i). This subject is covered, for example, in [1]. If we observe such a process at jump times, then we recover a Markov chain. This is one of the major tools in the analysis of continuous-time chains.

Semi-Markov processes are a further generalization, where the time between events is drawn from a general distribution, which depends on the state, and possibly on the next state. This is no longer a Markov process; however, if we observe the process only at jump times, then we recover a Markov chain.

Finally, in applications, the information structure, and consequently the set of events, is richer: we can measure more than the values of the Markov chain. This is often manifested in a recursion of the type (59.2), but where the V_t are not independent. Do we still get a Markov chain? And in what sense? The rough answer is that, if the Markov property (Equation 59.4) holds, but where we condition on all the available information, then we are back on track: all of our results continue to hold. For this to happen we need the "noise sequence" V_0, V_1, ... to be nonanticipative in a probabilistic sense.

Criteria for Stability. As in the case of dynamical systems, there are criteria for stability and for recurrence based on Lyapunov functions. This is one of the main tools in [4]. These techniques are often the easiest and the most powerful.

Relevance to Control. Many models of control systems subject to noise can be modeled as Markov processes, and the discrete-time, discrete-space models are controlled Markov chains. Here the transition probabilities are parameterized by the control; see the section on Dynamic Programming. In addition, many filtering and identification algorithms give rise to Markov chains (usually with values in R^d). Limit theorems for Markov chains can then be used to analyze the limiting properties of these algorithms. See, for example, [4], [9].
EXAMPLE 59.11:

Extending Example 59.2, consider the recursion

X_{t+1} = X_t + U_t + V_t,

where U_t is a control variable. This is a simple instance of a controlled recursion of the ARMA type. Suppose that U_t can only take the values ±1. Of course, we require that the control depend only on past information. If the control values U_t depend on the past states, then X_0, X_1, ... may not be a Markov chain. For example, if we choose U_t = sign(X_0), then the sequence X_0, X_1, ... violates the Markov property (Definition 59.1). However, we do have a controlled Markov chain. This means that Definition 59.1 is replaced with the relation

P(X_{t+1} = j | X_t = i, U_t = u, X_{t−1}, U_{t−1}, ...) = P(X_{t+1} = j | X_t = i, U_t = u).

This in fact is the general definition of a controlled Markov chain. If we choose a feedback control, that is, U_t = f(X_t) for some function f, then X_0, X_1, ... is again a Markov chain; but the transitions and the limit behavior now depend on the choice of f. For more information on controlled Markov chains, see the section on Dynamic Programming.
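A minimal simulation of a feedback-controlled chain of this kind (the specific recursion X_{t+1} = X_t + U_t + V_t and the feedback law below are illustrative assumptions, not from the original text) shows how a stabilizing f keeps the closed-loop Markov chain in a small set of states:

```python
import random

rng = random.Random(2)

def f(x):
    """Hypothetical feedback law with values in {-1, +1}: push the state toward 0."""
    return -1 if x >= 0 else 1

# Controlled recursion X_{t+1} = X_t + U_t + V_t with V_t = +/-1 i.i.d. noise and
# feedback U_t = f(X_t); the closed-loop sequence X_0, X_1, ... is again Markov.
x, traj = 0, [0]
for _ in range(1000):
    v = 1 if rng.random() < 0.5 else -1
    x = x + f(x) + v
    traj.append(x)
# With this feedback the state never leaves {-2, ..., 2}.
```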
References

Markov chains are covered by most introductory texts on stochastic processes. Here are some more specific references.

[1] Chung, K.L., Markov Chains with Stationary Transition Probabilities, Springer-Verlag, New York, 1967. A classic on discrete-space Markov chains, both discrete and continuous time. Very thorough and detailed, but mathematically not elementary.

[2] Çinlar, E., Introduction to Stochastic Processes, Prentice Hall, 1975. A precise, elegant, and accessible book, covering the basics.

[3] Kemeny, J.G., Snell, J., and Knapp, A.W., Denumerable Markov Chains, Van Nostrand, Princeton, NJ, 1966. The most elementary in this list, but fairly thorough, not only in coverage of Markov chains, but also as an introduction to Markov processes.

[4] Meyn, S.P. and Tweedie, R.L., Markov Chains and Stochastic Stability, Springer-Verlag, London, 1993. Deals with general discrete-time Markov chains, and covers the state of the art. It is therefore demanding mathematically, although not much measure theory is required. The most comprehensive book, if you can handle it.

[5] Nummelin, E., General Irreducible Markov Chains and Non-Negative Operators, Cambridge University Press, 1984. More general than [1], treats general state spaces. Introduced many new techniques; perhaps less encyclopedic than [4], but fairly mathematical.

[6] Orey, S., Limit Theorems for Markov Chain Transition Probabilities, Van Nostrand Reinhold, London, 1971. A thin gem on limit theorems, fairly accessible.

[7] Revuz, D., Markov Chains, North-Holland, Amsterdam, 1984. A mathematical, thorough treatment.

[8] Seneta, E., Non-Negative Matrices and Markov Chains, Springer-Verlag, New York, 1981. Gives the algebraic point of view on Markov chains.

[9] Tijms, H.C., Stochastic Modelling and Analysis: A Computational Approach, John Wiley, New York, 1986. Contains a basic introduction to the subject of Markov chains, both discrete and continuous time, with a wealth of examples for applications and computations.
Stochastic Differential Equations

John A. Gubner, University of Wisconsin-Madison

60.1 Introduction .......................................................... 1067
  Ordinary Differential Equations (ODEs) · Stochastic Differential Equations (SDEs)
60.2 White Noise and the Wiener Process ................................. 1068
60.3 The Itô Integral ..................................................... 1070
  Definition of Itô's Stochastic Integral · Properties of the Itô Integral · A Simple Form of Itô's Rule
60.4 Stochastic Differential Equations and Itô's Rule ................... 1072
60.5 Applications of Itô's Rule to Linear SDEs .......................... 1073
  Homogeneous Equations · Linear SDEs in the Narrow Sense · The Langevin Equation
60.6 Transition Probabilities for General SDEs .......................... 1075
60.7 Defining Terms ....................................................... 1077
Acknowledgments ........................................................... 1077
References ................................................................ 1077
Further Reading ........................................................... 1077
60.1 Introduction

This chapter deals with nonlinear differential equations of the form

Ẋ_t = a(t, X_t) + b(t, X_t) Z_t,  t ≥ t_0,  X_{t_0} = X,

where Z_t is a Gaussian white noise driving term that is independent of the random initial state X. Since the solutions of these equations are random processes, we are also concerned with the probability distribution of the solution process {X_t}. The classical example is the Langevin equation,

Ẋ_t = −μ X_t + β Z_t,

where μ and β are positive constants. In this linear differential equation, X_t models the velocity of a free particle subject to frictional forces and to impulsive forces due to collisions. Here μ is the coefficient of friction, and β = √(2μkT/m), where m is the mass of the particle, k is Boltzmann's constant, and T is the absolute temperature [7]. As shown at the end of Section 5, with a suitable Gaussian initial condition, the solution of the Langevin equation is a Gaussian random process known as the Ornstein-Uhlenbeck process.

The subject of stochastic differential equations is highly technical. However, to make this chapter as accessible as possible, the presentation is mostly on a heuristic level. On occasion, when deeper theoretical results are needed, the reader is referred to an appropriate text for details. Suggestions for further reading are given at the end of the chapter.

60.1.1 Ordinary Differential Equations (ODEs)

Consider a deterministic nonlinear system whose state at time t is x(t). In many engineering problems, it is reasonable to assume that x satisfies an ordinary differential equation (ODE) of the form

ẋ(t) = a(t, x(t)) + b(t, x(t)) z(t),  x(t_0) = ξ, (60.1)

where z(t) is a separately specified input signal. Note that if we integrate both sides from t_0 to t, we obtain

x(t) − x(t_0) = ∫_{t_0}^t [a(s, x(s)) + b(s, x(s)) z(s)] ds.

Since x(t_0) = ξ, x(t) satisfies the integral equation

x(t) = ξ + ∫_{t_0}^t [a(s, x(s)) + b(s, x(s)) z(s)] ds.

Under certain technical conditions, e.g., [3], it can be shown that there exists a unique solution to Equation 60.1; this is usually accomplished by solving the corresponding integral equation. Now suppose x(t) satisfies Equation 60.1. If x(t) is passed through a nonlinearity, say y(t) := g(x(t)), then y(t_0) = g(ξ), and by the chain rule, y(t) satisfies the differential equation

ẏ(t) = g′(x(t)) a(t, x(t)) + g′(x(t)) b(t, x(t)) z(t), (60.2)

assuming g is differentiable.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
60.1.2 Stochastic Differential Equations (SDEs)

If x models a mechanical system subject to significant vibration, or if x models an electronic system subject to significant thermal noise, it makes sense to regard z(t) as a stochastic, or random, process, which we denote by Z_t. (Our convention is to denote deterministic functions by lowercase letters with arguments in parentheses, and to denote random functions by uppercase letters with subscript arguments.) Now, with a random input signal Z_t, the ODE of Equation 60.1 becomes the stochastic differential equation (SDE),

Ẋ_t = a(t, X_t) + b(t, X_t) Z_t,  X_{t_0} = X, (60.3)

where the initial condition X is also random. As our notation indicates, the solution of an SDE is a random process. Typically, we take Z_t to be a white noise process; i.e.,

E[Z_t] = 0  and  E[Z_t Z_s] = δ(t − s),

where E denotes expectation and δ is the Dirac delta function. In this discussion we further restrict attention to Gaussian white noise. The surprising thing about white noise is that it cannot exist as an ordinary random process (though it does exist as a generalized process [1]). Fortunately, there is a well-defined ordinary random process, known as the Wiener process (also known as Brownian motion), denoted by W_t, that makes a good model for integrated white noise; i.e., W_t behaves as if

W_t = ∫_0^t Z_θ dθ,

or symbolically, dW_t = Z_t dt. Thus, if we multiply Equation 60.3 by dt, and write dW_t for Z_t dt, we obtain

dX_t = a(t, X_t) dt + b(t, X_t) dW_t,  X_{t_0} = X. (60.4)

To give meaning to Equation 60.4 and to solve Equation 60.4, we will always understand it as shorthand for the corresponding integral equation,

X_t = X + ∫_{t_0}^t a(s, X_s) ds + ∫_{t_0}^t b(s, X_s) dW_s. (60.5)

In order to make sense of Equation 60.5, we have to assign a meaning to integrals with respect to a Wiener process. There are two different ways to do this. One is due to Itô, and the other is due to Stratonovich. Since the Itô integral is more popular, and since the Stratonovich integral can be expressed in terms of the Itô integral [1], we restrict attention in our discussion to the Itô integral.

Now suppose X_t is a solution to the SDE of Equation 60.4, and suppose we pass X_t through a nonlinearity, say Y_t := g(X_t). Of course, Y_{t_0} = g(X), but astonishingly, by the stochastic chain rule, the analog of Equation 60.2 is [1]

dY_t = g′(X_t) dX_t + ½ g″(X_t) b(t, X_t)² dt, (60.6)

assuming g is twice continuously differentiable. Equation 60.6 is known as Itô's rule, and the last term in Equation 60.6 is called the Itô correction term. In addition to explaining its presence, the remainder of our discussion is as follows. Section 2 introduces the Wiener process as a model for integrated white noise. In Section 3, integration with respect to the Wiener process is defined, and a simple form of Itô's rule is derived. Section 4 focuses on SDEs: Itô's rule is derived for time-invariant nonlinearities, and its extension to time-varying nonlinearities is also given. In Section 5, Itô's rule is used to solve special forms of linear SDEs, and ODEs are derived for the mean and variance of the solution in this case. When the initial condition is also Gaussian, the solution to the linear SDE is a Gaussian process, and its distribution is completely determined by its mean and variance. Nonlinear SDEs are considered in Section 6. The solutions of nonlinear SDEs are non-Gaussian Markov processes; in this case, we characterize their transition distributions in terms of the Kolmogorov forward (Fokker-Planck) and backward partial differential equations.
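One standard way to approximate equations such as the Langevin equation numerically (a technique not discussed in this chapter) is the Euler-Maruyama scheme: replace dt by a small step Δ and dW_t by a zero-mean Gaussian increment of variance Δ. The sketch below uses arbitrary illustrative parameter values; for dX_t = −μX_t dt + β dW_t, the stationary variance of the Ornstein-Uhlenbeck solution is β²/(2μ).

```python
import math, random

rng = random.Random(3)
mu, beta = 1.0, math.sqrt(2.0)   # arbitrary illustrative parameters; beta^2/(2 mu) = 1
dt, n_steps = 0.01, 200_000

x = 0.0
second_moment = 0.0
for _ in range(n_steps):
    dW = rng.gauss(0.0, math.sqrt(dt))    # Wiener increment ~ N(0, dt)
    x = x + (-mu * x) * dt + beta * dW    # Euler-Maruyama update
    second_moment += x * x
var_est = second_moment / n_steps
# var_est should be close to the stationary variance beta^2 / (2 mu) = 1.
```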
60.2 White Noise and the Wiener Process

A random process {Z_t} is said to be a white noise process if

E[Z_t] = 0  and  E[Z_t Z_s] = δ(t − s), (60.7)

where δ(t) is the Dirac delta, which is characterized by the two properties δ(t) = 0 for t ≠ 0 and ∫_{−∞}^{∞} δ(t) dt = 1. Consider the integrated white noise

W_t = ∫_0^t Z_θ dθ. (60.8)

We show that integrated white noise satisfies the following five properties. For 0 ≤ θ ≤ r ≤ s ≤ t,

W_0 = 0, (60.9)
E[W_t] = 0, (60.10)
E[(W_t − W_s)²] = t − s, (60.11)
E[(W_t − W_s)(W_r − W_θ)] = 0, (60.12)
E[W_t W_s] = min{t, s}. (60.13)

In other words: W_0 is a constant random variable with value zero; W_t has zero mean; W_t − W_s has variance t − s; if (θ, r] and (s, t] are nonoverlapping time intervals, then the increments W_r − W_θ and W_t − W_s are uncorrelated; and the correlation between W_t and W_s is E[W_t W_s] = min{t, s}. (A process that satisfies Equation 60.12 is said to have orthogonal increments. A very accessible introduction to orthogonal-increments processes can be found in [4].)

The property defined in Equation 60.9 is immediate from Equation 60.8. To establish Equation 60.10, write E[W_t] = ∫_0^t E[Z_u] du = 0. To derive Equation 60.11, write

E[(W_t − W_s)²] = ∫_s^t ∫_s^t E[Z_u Z_v] du dv = ∫_s^t ∫_s^t δ(u − v) du dv = t − s.

To obtain Equation 60.12, write

E[(W_t − W_s)(W_r − W_θ)] = ∫_s^t ( ∫_θ^r δ(u − v) dv ) du.

Because the ranges of integration, which are understood as (θ, r] and (s, t], do not intersect, δ(u − v) = 0, and the inner integral is zero. Finally, the properties defined in Equations 60.9, 60.11, and 60.12 yield Equation 60.13 by writing, when t > s,

E[W_t W_s] = E[(W_t − W_s + W_s) W_s] = E[(W_t − W_s)(W_s − W_0)] + E[W_s²] = 0 + s = s. (60.14)

If t < s, a symmetric argument yields E[W_t W_s] = t. Hence, we can in general write E[W_t W_s] = min{t, s}.

It is well known that no process {Z_t} satisfying Equation 60.7 can exist in the usual sense [1]. Hence, defining W_t by Equation 60.8 does not make sense. Fortunately, it is possible to define a random process {W_t, t ≥ 0} satisfying Equations 60.9 to 60.13 (as well as additional properties).

DEFINITION 60.1 The standard Wiener process, or Brownian motion, denoted by {W_t, t ≥ 0}, is characterized by the following four properties:

W-1: W_0 = 0.
W-2: For 0 ≤ s < t, the increment W_t − W_s is a Gaussian random variable with zero mean and variance t − s.
W-3: {W_t, t ≥ 0} has independent increments; i.e., if 0 ≤ t_1 ≤ ... ≤ t_n, then the increments W_{t_2} − W_{t_1}, ..., W_{t_n} − W_{t_{n−1}} are statistically independent random variables.
W-4: {W_t, t ≥ 0} has continuous sample paths with probability 1.

A proof of the existence of the Wiener process is given, for example, in [2] and in [4]. A sample path of a standard Wiener process is shown in Figure 60.1.

Figure 60.1: Sample path of a standard Wiener process.

We now show that properties W-1, W-2, and W-3 are sufficient to prove that the Wiener process satisfies Equations 60.9 to 60.13. Clearly, property W-1 and Equation 60.9 are the same. To establish Equation 60.10, put s = 0 in property W-2 and use property W-1. It is clear that property W-2 implies Equation 60.11. Also, Equation 60.12 is an immediate consequence of properties W-3 and W-2. Finally, since the Wiener process satisfies Equations 60.9, 60.11, and 60.12, the derivation in Equation 60.14 holds for the Wiener process, and thus Equation 60.13 also holds for the Wiener process.

REMARK 60.1 From Figure 60.1, we see that the Wiener process has very jagged sample paths. In fact, if we zoom in on any subinterval, say [2, 4] as shown in Figure 60.2, the sample path looks just as jagged. In other words, the Wiener process is continuous, but seems to have corners everywhere. In fact, it can be shown mathematically [2] that the sample paths of the Wiener process are nowhere differentiable. In other words, W_t cannot be the integral of any reasonable function, which is consistent with our earlier claim that continuous-time white noise cannot exist as an ordinary random process.
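Properties W-2 and W-3 make the Wiener process straightforward to simulate from independent Gaussian increments. The sketch below (an illustration; the times s, t and the sample size are arbitrary choices) checks E[W_t W_s] = min{t, s} at s = 0.5, t = 1 by Monte Carlo:

```python
import math, random

rng = random.Random(4)
s, t, n = 0.5, 1.0, 20000

acc = 0.0
for _ in range(n):
    Ws = rng.gauss(0.0, math.sqrt(s))            # W_s ~ N(0, s)           (W-1, W-2)
    Wt = Ws + rng.gauss(0.0, math.sqrt(t - s))   # independent increment   (W-3)
    acc += Wt * Ws
cov_est = acc / n
# cov_est should be close to E[W_t W_s] = min{t, s} = 0.5.
```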
Figure 60.2: Closeup of the sample path in Figure 60.1.

Let {W_t, t ≥ 0} be a standard Wiener process. The history of the process up to (and including) time t is denoted by F_t := σ(W_θ, 0 ≤ θ ≤ t). For our purposes, we say that a random variable X is F_t-measurable if it is a function of the history up to time t; i.e., if there is a deterministic function h such that X = h(W_θ, 0 ≤ θ ≤ t). For example, if X = W_t − W_s for some s ≤ t, then X is F_t-measurable. Another example would be X = ∫_0^t W_θ dθ; in this case, X depends on all of the variables W_θ, 0 ≤ θ ≤ t, and is F_t-measurable. We also borrow the following notation from probability theory: if Z is any random variable, we write E[Z|F_t] instead of E[Z|W_θ, 0 ≤ θ ≤ t].

We also require the following results from probability theory [2], which we refer to as the smoothing properties of conditional expectation. If Z is arbitrary but X is F_t-measurable, then

E[XZ|F_t] = X E[Z|F_t]. (60.15)

A special case of Equation 60.15 is obtained if Z = 1; then E[Z|F_t] = E[1|F_t] = 1, and hence, if X is F_t-measurable,

E[X|F_t] = X. (60.16)

Furthermore,

E[Z] = E[E[Z|F_t]], (60.17)

and

E[Z|F_s] = E[E[Z|F_t]|F_s],  t ≥ s ≥ 0. (60.18)

We refer to Equation 60.17 as the first smoothing property and to Equation 60.18 as the second smoothing property.

DEFINITION 60.2 A random process {H_t, t ≥ 0} is {F_t}-adapted (or simply adapted if {F_t} is understood) if, for each t, H_t is an F_t-measurable random variable.

Obviously, {W_t, t ≥ 0} is {F_t}-adapted. We now show that W_t is a martingale, i.e.,

E[W_t|F_s] = W_s,  t ≥ s. (60.19)

To see this, first note that, on account of the independent increments of the Wiener process, W_t − W_s is independent of the history F_s for t ≥ s; i.e., for any function f,

E[f(W_t − W_s)|F_s] = E[f(W_t − W_s)]. (60.20)

Then, since W_t − W_s has zero mean,

E[W_t − W_s|F_s] = E[W_t − W_s] = 0,

which is equivalent to Equation 60.19.

Having shown that W_t is a martingale, we now show that W_t² − t is also a martingale. To do this, we need the following three facts:
1. W_t − W_s is independent of F_s, with E[(W_t − W_s)²] = t − s.
2. W_t is a martingale (cf. Equation 60.19).
3. The properties defined in Equations 60.15 and 60.16.

For t > s, write

E[W_t²|F_s] = E[(W_t − W_s)² + 2W_tW_s − W_s² | F_s]
            = E[(W_t − W_s)²|F_s] + 2W_s E[W_t|F_s] − W_s²
            = (t − s) + 2W_s² − W_s² = (t − s) + W_s².

Rearranging, we have

E[W_t² − t|F_s] = W_s² − s,  t > s,

i.e., W_t² − t is a martingale.
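Both martingale identities can be sanity-checked by simulation (an illustrative sketch; the test function tanh and all constants are arbitrary choices): E[W_t|F_s] = W_s implies E[(W_t − W_s) h(W_s)] = 0 for any bounded h, and the martingale property of W_t² − t implies E[W_t² − t] = E[W_0² − 0] = 0.

```python
import math, random

rng = random.Random(5)
s, t, n = 1.0, 2.0, 40000

acc_mart, acc_sq = 0.0, 0.0
for _ in range(n):
    Ws = rng.gauss(0.0, math.sqrt(s))
    Wt = Ws + rng.gauss(0.0, math.sqrt(t - s))
    acc_mart += (Wt - Ws) * math.tanh(Ws)  # ~0: increment is mean-zero given F_s
    acc_sq += Wt * Wt - t                  # ~0: W_t^2 - t has constant mean 0
mart_est = acc_mart / n
sq_est = acc_sq / n
```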
60.3 The Itô Integral

60.3.1 Definition of Itô's Stochastic Integral

We now define the Itô integral. As in the development of the Riemann integral, we begin by defining the integral for functions that are piecewise constant in time; i.e., we consider integrands {H_t, t ≥ 0} that are {F_t}-adapted processes satisfying

H_t = H_{t_i},  t_i ≤ t < t_{i+1}, (60.21)

for some breakpoints t_i < t_{i+1}. Thus, while H_t = H_{t_i} on [t_i, t_{i+1}), the value H_{t_i} is an F_{t_i}-measurable random variable. Without loss of generality, given any 0 ≤ s < t, we may assume s = t_0 < ... < t_n = t. Then the Itô integral is defined to be

∫_s^t H_θ dW_θ := Σ_{i=0}^{n−1} H_{t_i} (W_{t_{i+1}} − W_{t_i}). (60.22)

To handle the general case, suppose H_t is a process for which there exists a sequence of processes H_t^k of the form of Equation 60.21 (where now n and the breakpoints {t_i} depend on k) such that

lim_{k→∞} E[ ∫_s^t |H_θ^k − H_θ|² dθ ] = 0.

Then we take ∫_s^t H_θ dW_θ to be the mean-square limit (which exists) of ∫_s^t H_θ^k dW_θ; i.e., there exists a random variable, denoted by ∫_s^t H_θ dW_θ, such that

lim_{k→∞} E[ | ∫_s^t H_θ^k dW_θ − ∫_s^t H_θ dW_θ |² ] = 0. (60.23)

See [10] for details. It should also be noted that when H_t = h(t) is a deterministic function, the right-hand side of Equation 60.22 is a Gaussian random variable, and since mean-square limits of such quantities are also Gaussian [4], {∫_s^t h(θ) dW_θ, t ≥ s} is a Gaussian random process. When the integrand of an Itô integral is deterministic, the integral is sometimes called a Wiener integral.

60.3.2 Properties of the Itô Integral

The Itô integral satisfies the following three properties. First, the Itô integral is a zero-mean random variable, i.e.,

E[ ∫_s^t H_θ dW_θ ] = 0. (60.24)

Second, the variance of ∫_s^t H_θ dW_θ is

E[ ( ∫_s^t H_θ dW_θ )² ] = ∫_s^t E[H_θ²] dθ. (60.25)

Third, if X_t := ∫_0^t H_θ dW_θ, then X_t is a martingale, i.e., E[X_t|F_s] = X_s, or in terms of Itô integrals,

E[X_t − X_s|F_s] = E[ ∫_s^t H_θ dW_θ | F_s ] = 0,  t ≥ s.

We verify these properties when H_t is of the form of Equation 60.21. To prove Equation 60.24, take the expectation of Equation 60.22 to obtain

E[ ∫_s^t H_θ dW_θ ] = Σ_{i=0}^{n−1} E[ H_{t_i} (W_{t_{i+1}} − W_{t_i}) ].

By the first smoothing property of conditional expectation, the fact that H_{t_i} is F_{t_i}-measurable, and the properties defined in Equations 60.15 and 60.20,

E[H_{t_i}(W_{t_{i+1}} − W_{t_i})] = E[ E[H_{t_i}(W_{t_{i+1}} − W_{t_i}) | F_{t_i}] ]
                               = E[ H_{t_i} E[W_{t_{i+1}} − W_{t_i} | F_{t_i}] ] = 0.

To establish Equation 60.25, use Equation 60.22 to write

E[ ( ∫_s^t H_θ dW_θ )² ] = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} E[ H_{t_i} H_{t_j} (W_{t_{i+1}} − W_{t_i})(W_{t_{j+1}} − W_{t_j}) ].

First consider a typical term in the double sum for which j = i. Then by the first smoothing property and Equation 60.15,

E[ H_{t_i}² (W_{t_{i+1}} − W_{t_i})² ] = E[ H_{t_i}² E[(W_{t_{i+1}} − W_{t_i})² | F_{t_i}] ].

Since W_{t_{i+1}} − W_{t_i} is independent of F_{t_i},

E[(W_{t_{i+1}} − W_{t_i})² | F_{t_i}] = E[(W_{t_{i+1}} − W_{t_i})²] = t_{i+1} − t_i,

so that E[H_{t_i}²(W_{t_{i+1}} − W_{t_i})²] = E[H_{t_i}²](t_{i+1} − t_i). For the terms with j ≠ i, we can, without loss of generality, take j < i. In this case,

E[ H_{t_i} H_{t_j} (W_{t_{i+1}} − W_{t_i})(W_{t_{j+1}} − W_{t_j}) ]
  = E[ E[ H_{t_i} H_{t_j} (W_{t_{i+1}} − W_{t_i})(W_{t_{j+1}} − W_{t_j}) | F_{t_i} ] ]
  = E[ H_{t_i} H_{t_j} (W_{t_{j+1}} − W_{t_j}) E[W_{t_{i+1}} − W_{t_i} | F_{t_i}] ] = 0,

by Equation 60.20. Thus,

E[ ( ∫_s^t H_θ dW_θ )² ] = Σ_{i=0}^{n−1} E[H_{t_i}²](t_{i+1} − t_i) = ∫_s^t E[H_θ²] dθ,

which is Equation 60.25. To show that X_t := ∫_0^t H_θ dW_θ is a martingale, it suffices to prove that E[X_t − X_s|F_s] = 0. Write

E[ ∫_s^t H_θ dW_θ | F_s ] = E[ Σ_{i=0}^{n−1} H_{t_i}(W_{t_{i+1}} − W_{t_i}) | F_s ].

Then by the second smoothing property of conditional expectation and the properties defined in Equations 60.15 and 60.20, each term satisfies

E[H_{t_i}(W_{t_{i+1}} − W_{t_i}) | F_s] = E[ H_{t_i} E[W_{t_{i+1}} − W_{t_i} | F_{t_i}] | F_s ] = 0,

so that E[ ∫_s^t H_θ dW_θ | F_s ] = 0.
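The zero-mean and variance properties (Equations 60.24 and 60.25) can be checked numerically for the adapted integrand H_θ = W_θ, for which ∫_0^1 E[H_θ²] dθ = ∫_0^1 θ dθ = 1/2. The sketch below is illustrative; the path count and step count are arbitrary, and the key detail is that the Itô sum evaluates the integrand at the left endpoint of each subinterval.

```python
import math, random

rng = random.Random(6)
n_paths, n_steps = 4000, 400
dt = 1.0 / n_steps

m1 = m2 = 0.0
for _ in range(n_paths):
    w, ito = 0.0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito += w * dW          # Ito sum: integrand at the LEFT endpoint (Eq. 60.22)
        w += dW
    m1 += ito
    m2 += ito * ito
mean_est = m1 / n_paths        # Equation 60.24: should be ~0
var_est = m2 / n_paths         # Equation 60.25: should be ~1/2
```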
THE CONTROL HANDBOOK
60.3.3 A Simple Form of Itb's Rule
dYt = gt(Wt)d Wt. As we now show, the correct answer is dYt = gt(Wt)dWt $ g t t ( ~ t d) t . Consider the Taylor expansion,
+
The following result is essential to derive ItG's rule.[ 11.
LEMMA 60.1
For any partition of [ s ,t ] ,say s = to < tn = t, put A := maxogsn-l Iti+l - til. If
... < Suppose Yt = g(Wr). For s = to < . . . < tn = t, write
n-l
v then
:=
C(wti+,- wti12,
E[lV - (t - s ) l 2 ] 5 2A(t - s ) .
The importance of the lemma is that it implies that as the A of the partition becomes small, the sum of squared increments 1.' converges in mean square to the length of the interval, t - s.
PROOF60.1 The first step is to note that t - s = ti). Next, let D denote the difference
xyzd (ti+l -
Note that gt(Wti) and g"(Wli)are Fli-measurable. Hence, as the partition becomes finer, we obtain
Writing this in differential form, we have a special case of ItB's rule: If Yt = g(Wt),then
+
dYt = gt(Wt)d ~ t ig1'(l.lr,)d t .
Thus, D is a sum of independent, zero-mean random variables. It follows that the expectation of the cross terms in D2 vanishes, thus leaving
Then Zi is a zero-mean Put Zi := (Wti+,- W t , ) /J-. Gaussian random variable with variance 1 (which implies E [ z ~ ] = 3). We can now write
__
As a simple application of this result, we show that

∫_0^t W_θ dW_θ = (W_t² − t)/2.

Take g(w) = w². Then g'(w) = 2w, and g''(w) = 2. The special case of Itô's rule gives

dY_t = 2W_t dW_t + dt.

Converting this to integral form and noting that Y_0 = W_0² = 0,

Y_t = Y_0 + ∫_0^t 2W_θ dW_θ + ∫_0^t dθ.

Since Y_t = g(W_t) = W_t², the result follows. As noted earlier, integrals with respect to the Wiener process are martingales. Since ∫_0^t W_θ dW_θ = (W_t² − t)/2, we now have an alternative proof to the one following Equation 60.20 that W_t² − t is a martingale.
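The identity ∫ W dW = (W_t² − t)/2 can also be seen pathwise at the level of the approximating sums: for any partition, the left-endpoint (Itô) sum satisfies Σ W_{t_i}(W_{t_{i+1}} − W_{t_i}) = ½W_{t_n}² − ½Σ(W_{t_{i+1}} − W_{t_i})², and the last sum converges to t by Lemma 60.1. A quick standard-library Python check (path, seed, and step count are illustrative choices):

```python
import math
import random

rng = random.Random(42)
n, t = 10_000, 1.0
dt = t / n
w = [0.0]
for _ in range(n):
    w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))

# Left-endpoint (Ito) sum of W dW over the partition
ito_sum = sum(w[i] * (w[i + 1] - w[i]) for i in range(n))
# Sum of squared increments (converges to t as the partition is refined)
quad_var = sum((w[i + 1] - w[i]) ** 2 for i in range(n))

# Exact algebraic identity, valid for every path and every partition:
assert abs(ito_sum - 0.5 * (w[-1] ** 2 - quad_var)) < 1e-9
# quad_var is close to t, so ito_sum is close to (W_t^2 - t)/2:
print(ito_sum, 0.5 * (w[-1] ** 2 - t))
```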
Let {H_t, t ≥ 0} be a continuous {F_t}-adapted process. It can be shown [1] that, as the partition becomes finer, the corresponding approximating sums converge, and the limit defines the Itô integral of H.
60.4 Stochastic Differential Equations and Itô's Rule

Suppose X_t satisfies the SDE

dX_t = a(t, X_t) dt + b(t, X_t) dW_t,    (60.27)

or equivalently, the integral equation,

X_t = X_s + ∫_s^t a(θ, X_θ) dθ + ∫_s^t b(θ, X_θ) dW_θ.    (60.28)
If Y_t = g(X_t), where g is twice continuously differentiable, we show that Y_t satisfies the SDE

dY_t = g'(X_t) dX_t + ½ g''(X_t) b(t, X_t)² dt.    (60.29)
Using the Taylor expansion of g as we did in the preceding section, write

Y_t − Y_s = Σ_{i=0}^{n−1} [ g(X_{t_{i+1}}) − g(X_{t_i}) ] ≈ Σ_{i=0}^{n−1} g'(X_{t_i})(X_{t_{i+1}} − X_{t_i}) + ½ Σ_{i=0}^{n−1} g''(X_{t_i})(X_{t_{i+1}} − X_{t_i})².

From Equation 60.28 we have the approximation

X_{t_{i+1}} − X_{t_i} ≈ a(t_i, X_{t_i})(t_{i+1} − t_i) + b(t_i, X_{t_i})(W_{t_{i+1}} − W_{t_i}).

Hence, as the partition becomes finer,

Σ_{i=0}^{n−1} g'(X_{t_i})(X_{t_{i+1}} − X_{t_i})

converges to

∫_s^t g'(X_θ)[ a(θ, X_θ) dθ + b(θ, X_θ) dW_θ ].

It remains to consider sums of the form (cf. Equation 60.26)

Σ_{i=0}^{n−1} g''(X_{t_i})(X_{t_{i+1}} − X_{t_i})².

The ith term in the sum is approximately

g''(X_{t_i})[ a(t_i, X_{t_i})²(t_{i+1} − t_i)² + 2a(t_i, X_{t_i})b(t_i, X_{t_i})(t_{i+1} − t_i)(W_{t_{i+1}} − W_{t_i}) + b(t_i, X_{t_i})²(W_{t_{i+1}} − W_{t_i})² ].

Now with Δ = max_{0≤i≤n−1} |t_{i+1} − t_i|,

| Σ_{i=0}^{n−1} g''(X_{t_i}) a(t_i, X_{t_i})²(t_{i+1} − t_i)² | ≤ Δ Σ_{i=0}^{n−1} |g''(X_{t_i})| a(t_i, X_{t_i})²(t_{i+1} − t_i),

which converges to zero as Δ → 0. Also,

Σ_{i=0}^{n−1} g''(X_{t_i}) a(t_i, X_{t_i})(t_{i+1} − t_i)(W_{t_{i+1}} − W_{t_i})

converges to 0. Finally, note that

Σ_{i=0}^{n−1} g''(X_{t_i}) b(t_i, X_{t_i})²(W_{t_{i+1}} − W_{t_i})²

converges to ∫_s^t g''(X_θ) b(θ, X_θ)² dθ. Putting this all together, as the partition becomes finer, we have

Y_t − Y_s = ∫_s^t g'(X_θ)[ a(θ, X_θ) dθ + b(θ, X_θ) dW_θ ] + ½ ∫_s^t g''(X_θ) b(θ, X_θ)² dθ,

which is indeed the integral form of Itô's rule in Equation 60.29.

Itô's rule can be extended to handle a time-varying nonlinearity g(t, x) whose partial derivatives

g_t := ∂g/∂t,   g_x := ∂g/∂x,   and   g_xx := ∂²g/∂x²

are continuous [1]. If Y_t = g(t, X_t), where X_t satisfies Equation 60.27, then

dY_t = g_t(t, X_t) dt + g_x(t, X_t) dX_t + ½ g_xx(t, X_t) b(t, X_t)² dt
     = g_t(t, X_t) dt + g_x(t, X_t) a(t, X_t) dt + g_x(t, X_t) b(t, X_t) dW_t + ½ g_xx(t, X_t) b(t, X_t)² dt.    (60.30)

EXAMPLE 60.2:

Consider the Langevin equation

dX_t = −3X_t dt + 5 dW_t.

Suppose Y_t = sin(tX_t). Then with g(t, x) = sin(tx), g_t(t, x) = x cos(tx), g_x(t, x) = t cos(tx), and g_xx(t, x) = −t² sin(tx). By the extended Itô rule,

dY_t = [ X_t cos(tX_t) − 3tX_t cos(tX_t) − (25/2) t² sin(tX_t) ] dt + 5t cos(tX_t) dW_t.

60.5 Applications of Itô's Rule to Linear SDEs

Using Itô's rule, we can verify explicit solutions to linear SDEs. By a linear SDE we mean an equation of the form

dX_t = [ a(t) + c(t)X_t ] dt + [ β(t) + γ(t)X_t ] dW_t,   X_{t0} = ξ.    (60.31)

If ξ is independent of {W_t − W_{t0}, t ≥ t0}, and if a, c, β, and γ are bounded on a finite interval [t0, tf], then a unique continuous solution exists on [t0, tf]; if a, c, β, and γ are bounded on [t0, tf] for every finite tf > t0, then a unique solution exists on [t0, ∞) [1].
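The increment approximation used in the derivation above is exactly the Euler-Maruyama simulation scheme X_{t_{i+1}} ≈ X_{t_i} + a(t_i, X_{t_i})Δt + b(t_i, X_{t_i})ΔW. A standard-library Python sketch applied to the Langevin equation of Example 60.2 (step counts, horizon, and seed are arbitrary choices; for c < 0 the solution settles to the stationary variance −β²/(2c), here 25/6):

```python
import math
import random

def euler_maruyama(a, b, x0, t0, t1, n_steps, rng):
    """Simulate dX = a(t, X) dt + b(t, X) dW by the Euler-Maruyama scheme,
    i.e., the discrete increment approximation used in the derivation."""
    dt = (t1 - t0) / n_steps
    t, x = t0, x0
    for _ in range(n_steps):
        x += a(t, x) * dt + b(t, x) * rng.gauss(0.0, math.sqrt(dt))
        t += dt
    return x

# Langevin equation of Example 60.2: dX = -3 X dt + 5 dW, X_0 = 0.
rng = random.Random(1)
samples = [euler_maruyama(lambda t, x: -3.0 * x, lambda t, x: 5.0,
                          0.0, 0.0, 5.0, 1_000, rng)
           for _ in range(1_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)  # mean near 0; variance near -beta^2/(2c) = 25/6
```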
60.5.1 Homogeneous Equations

A linear SDE of the form

dX_t = c(t)X_t dt + γ(t)X_t dW_t,   X_{t0} = ξ,    (60.32)

is said to be homogeneous. In this case, we claim that the solution is X_t = ξ exp(Y_t), where

Y_t := ∫_{t0}^t [ c(θ) − γ(θ)²/2 ] dθ + ∫_{t0}^t γ(θ) dW_θ,
or in differential form,

dY_t = [ c(t) − γ(t)²/2 ] dt + γ(t) dW_t.

To verify our claim, we follow [1] and simply apply Itô's rule of Equation 60.29 to X_t = ξ exp(Y_t):

dX_t = ξ e^{Y_t} dY_t + ½ ξ e^{Y_t} γ(t)² dt = c(t)X_t dt + γ(t)X_t dW_t.

Since X_{t0} = ξ exp(Y_{t0}) = ξ exp(0) = ξ, we have indeed solved Equation 60.32.
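The exponential solution formula can be evaluated on a time grid given a sampled Brownian path. A standard-library Python sketch (function and variable names are illustrative; the grid size is arbitrary); the sanity check uses γ = 0, where the SDE reduces to the deterministic ODE dX = c(t)X dt:

```python
import math

def homogeneous_solution(c, gamma, xi, grid, dw):
    """X_t = xi * exp(Y_t), with Y_t = int [c - gamma^2/2] dt + int gamma dW,
    evaluated on a time grid with supplied Wiener increments dw."""
    y = 0.0
    xs = [xi]
    for i, dwi in enumerate(dw):
        dt = grid[i + 1] - grid[i]
        y += (c(grid[i]) - 0.5 * gamma(grid[i]) ** 2) * dt + gamma(grid[i]) * dwi
        xs.append(xi * math.exp(y))
    return xs

n, T = 10_000, 1.0
grid = [i * T / n for i in range(n + 1)]

# Sanity check: gamma = 0 and c(t) = cos(t) gives X_T = xi * exp(sin(T)).
xs = homogeneous_solution(math.cos, lambda t: 0.0, 1.0, grid, [0.0] * n)
print(xs[-1], math.exp(math.sin(T)))
```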
EXAMPLE 60.3:

Consider the homogeneous SDE

dX_t = cos(t) X_t dt + 2X_t dW_t,   X_0 = 1.

For this problem, Y_t = sin(t) − 2t + 2W_t, and the solution is

X_t = e^{Y_t} = e^{sin(t) − 2(t − W_t)}.
60.5.2 Linear SDEs in the Narrow Sense

A linear SDE of the form

dX_t = [ a(t) + c(t)X_t ] dt + β(t) dW_t,   X_{t0} = ξ,    (60.33)

is said to be linear in the narrow sense because it is obtained by setting γ(t) = 0 in the general linear SDE in Equation 60.31. The solution of this equation is obtained as follows. First put

Φ(t, t0) := exp( ∫_{t0}^t c(θ) dθ ).

Observe that

∂Φ(t, t0)/∂t = c(t)Φ(t, t0),    (60.34)

Φ(t0, t0) = 1, and Φ(t, t0)Φ(t0, θ) = Φ(t, θ). Next, let

Y_t := ξ + ∫_{t0}^t Φ(t0, θ)a(θ) dθ + ∫_{t0}^t Φ(t0, θ)β(θ) dW_θ,

or in differential form, dY_t = Φ(t0, t)a(t) dt + Φ(t0, t)β(t) dW_t. Now put

X_t := Φ(t, t0)Y_t,

that is,

X_t = Φ(t, t0)ξ + ∫_{t0}^t Φ(t, θ)a(θ) dθ + ∫_{t0}^t Φ(t, θ)β(θ) dW_θ.    (60.35)

In other words, X_t = g(t, Y_t), where g(t, y) = Φ(t, t0)y. Using Equation 60.34, g_t(t, y) = c(t)Φ(t, t0)y. We also have g_y(t, y) = Φ(t, t0), and g_yy(t, y) = 0. By the extended Itô rule of Equation 60.30,

dX_t = c(t)Φ(t, t0)Y_t dt + Φ(t, t0) dY_t = [ a(t) + c(t)X_t ] dt + β(t) dW_t,

which is exactly Equation 60.33. Recalling the text following Equation 60.23, and noting the form of the solution in Equation 60.35, we see that {X_t} is a Gaussian process if and only if ξ is a Gaussian random variable. In any case, we can always use Equation 60.35 to derive differential equations for the mean and variance of X_t. For example, put m(t) := E[X_t]. Since Itô integrals have zero mean (recall Equation 60.24), we obtain from Equation 60.35,

m(t) = Φ(t, t0)E[ξ] + ∫_{t0}^t Φ(t, θ)a(θ) dθ,

and thus

m'(t) = a(t) + c(t)m(t),

and m(t0) = E[ξ]. We now turn to the covariance function of X_t, r(t, s) := E[(X_t − m(t))(X_s − m(s))]. We assume that the initial condition ξ is independent of {W_t − W_{t0}, t ≥ t0}. For s < t, write

r(t, s) = Φ(t, t0)E[(ξ − E[ξ])²]Φ(s, t0) + ∫_{t0}^s Φ(t, θ)β(θ)²Φ(s, θ) dθ.

Letting var(ξ) := E[(ξ − E[ξ])²], for arbitrary s and t, we can write

r(t, s) = Φ(t, t0)var(ξ)Φ(s, t0) + ∫_{t0}^{min(s,t)} Φ(t, θ)β(θ)²Φ(s, θ) dθ.
In particular, if we put v(t) := r(t, t) = E[(X_t − m(t))²], then a simple calculation shows

v'(t) = 2c(t)v(t) + β(t)²,   v(t0) = var(ξ).
60.5.3 The Langevin Equation

If a(t) = 0 and c(t) and β(t) do not depend on t, then the narrow-sense linear SDE in Equation 60.33 becomes

dX_t = cX_t dt + β dW_t,   X_{t0} = ξ,

which is the Langevin equation when c < 0 and β ≠ 0. Now, since

Φ(t, t0) = exp( ∫_{t0}^t c dθ ) = e^{c(t − t0)},

the solution in Equation 60.35 simplifies to

X_t = e^{c(t − t0)}ξ + β ∫_{t0}^t e^{c(t − θ)} dW_θ.

Then the mean is

m(t) = e^{c(t − t0)}E[ξ],

and the covariance is

r(t, s) = e^{c(t − t0)}var(ξ)e^{c(s − t0)} + β² ∫_{t0}^{min(s,t)} e^{c(t − θ)}e^{c(s − θ)} dθ.

Now assume c < 0, and suppose that E[ξ] = 0 and var(ξ) = −β²/(2c). Then m(t) = 0 and

r(t, s) = −(β²/(2c)) e^{c|t − s|}.    (60.36)

Since E[X_t] does not depend on t, and since r(t, s) depends only on |t − s|, {X_t, t ≥ t0} is said to be wide-sense stationary. If ξ is also Gaussian, then {X_t, t ≥ t0} is a Gaussian process known as the Ornstein-Uhlenbeck process.

60.6 Transition Probabilities for General SDEs

The transition function for a process {X_t, t ≥ t0} is defined to be the conditional cumulative distribution

P(t, y|s, x) := Pr(X_t ≤ y | X_s = x).

When X_t is the solution of an SDE, we can write down a partial differential equation that uniquely determines the transition function if certain hypotheses are satisfied. Under these hypotheses, the conditional cumulative distribution P(t, y|s, x) has a density p(t, y|s, x) that is uniquely determined by another partial differential equation.

To motivate the general results below, we first consider the narrow-sense linear SDE in Equation 60.33, whose solution is given in Equation 60.35. Since Φ(t, θ) = Φ(t, s)Φ(s, θ), Equation 60.35 can be rewritten as

X_t = Φ(t, s)[ X_s + ∫_s^t Φ(s, θ)a(θ) dθ + ∫_s^t Φ(s, θ)β(θ) dW_θ ].

Now consider Pr(X_t ≤ y | X_s = x). Since we are conditioning on X_s = x, we can replace X_s in the preceding equation by x. For notational convenience, let

Z := Φ(t, s)[ x + ∫_s^t Φ(s, θ)a(θ) dθ + ∫_s^t Φ(s, θ)β(θ) dW_θ ].

Then Pr(X_t ≤ y | X_s = x) = Pr(Z ≤ y | X_s = x). Now, in the definition of Z, the only randomness comes from the Itô integral with deterministic integrand over [s, t]. The randomness in this integral comes only from the increments of the Wiener process on [s, t]. From Equation 60.35 with t replaced by s, we see that the only randomness in X_s comes from ξ and from the increments of the Wiener process on [t0, s]. Hence, Z and X_s are independent, and we can write Pr(Z ≤ y | X_s = x) = Pr(Z ≤ y). Next, from the development of the Itô integral in Section 60.3, we see that Z is a Gaussian random variable with mean

μ(t, s, x) := Φ(t, s)x + ∫_s^t Φ(t, θ)a(θ) dθ

and variance

σ²(t, s) := ∫_s^t [ Φ(t, θ)β(θ) ]² dθ.

Hence, the transition function is

P(t, y|s, x) = Pr(Z ≤ y) = ∫_{−∞}^y p(t, η|s, x) dη,

and the transition density is the Gaussian density

p(t, y|s, x) = (2πσ²(t, s))^{−1/2} exp( −[y − μ(t, s, x)]² / (2σ²(t, s)) ).
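For the Langevin case (constant c < 0 and β, a = 0), the mean and variance above reduce to e^{c(t−s)}x and β²(e^{2c(t−s)} − 1)/(2c). The sketch below (standard-library Python; names and grid sizes are illustrative choices) checks that this variance approaches the stationary value −β²/(2c) and that the Gaussian transition density integrates to one.

```python
import math

def ou_transition(c, beta, s, t, x):
    """Mean and variance of the Langevin (OU) transition density p(t, .|s, x)."""
    mean = math.exp(c * (t - s)) * x
    var = beta ** 2 * (math.exp(2.0 * c * (t - s)) - 1.0) / (2.0 * c)
    return mean, var

c, beta = -3.0, 5.0
mean, var = ou_transition(c, beta, 0.0, 10.0, 2.0)
print(var, -beta ** 2 / (2.0 * c))  # both near 25/6 for large t - s

# Numerical check that the Gaussian density integrates to one.
sd = math.sqrt(var)
step = sd / 100.0
ys = [mean - 8.0 * sd + i * step for i in range(1601)]
dens = [math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        for y in ys]
total = sum(dens) * step
print(total)  # close to 1
```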
We now return to the general SDE,

dX_t = a(t, X_t) dt + b(t, X_t) dW_t,   X_{t0} = ξ,    (60.37)

where ξ is independent of {W_t − W_{t0}, t ≥ t0}. To guarantee a unique continuous solution on a finite interval, say [t0, tf], we assume [1] that there exists a finite constant K > 0 such that for all t ∈ [t0, tf] and all x and y, a and b satisfy the Lipschitz conditions

|a(t, x) − a(t, y)| ≤ K|x − y|,   |b(t, x) − b(t, y)| ≤ K|x − y|,
REMARK 60.2  If the necessary partial derivatives are continuous, then we can differentiate the backward equation with respect to y and obtain the following equation for the transition density p:

−∂p(t, y|s, x)/∂s = a(s, x) ∂p(t, y|s, x)/∂x + ½ b(s, x)² ∂²p(t, y|s, x)/∂x².
and the growth restrictions

a(t, x)² ≤ K²(1 + x²),   b(t, x)² ≤ K²(1 + x²).

If such a K exists for every finite tf > t0, then a unique solution exists on [t0, ∞) [1]. Under the above conditions for the general SDE in Equation 60.37, a unique solution exists, although one cannot usually give an explicit formula for it, and so one cannot find the transition function and density as we did in the narrow-sense linear case. However, if for some b0 > 0, b(t, x) ≥ b0 for all t and all x, then [10] the transition function P is the unique solution of Kolmogorov's backward equation

−∂P(t, y|s, x)/∂s = a(s, x) ∂P(t, y|s, x)/∂x + ½ b(s, x)² ∂²P(t, y|s, x)/∂x²,   t0 < s < t < tf,    (60.38)

satisfying

lim_{s↑t} P(t, y|s, x) = 1 for x ≤ y, and 0 for x > y.

Furthermore, P has a density, p(t, y|s, x) = ∂P(t, y|s, x)/∂y. If ∂a/∂x, ∂b/∂x, and ∂²b/∂x² also satisfy the Lipschitz and growth conditions above, and if b(t, x) ≥ b0 > 0, then the transition density p is the unique fundamental solution of Kolmogorov's forward equation

∂p(t, y|s, x)/∂t = −∂/∂y[ a(t, y)p(t, y|s, x) ] + ½ ∂²/∂y²[ b(t, y)²p(t, y|s, x) ],   t0 < s < t < tf,    (60.39)

satisfying p(s, y|s, x) = δ(y − x).

The forward partial differential equation is also known as the Fokker-Planck equation. Equation 60.39 is called the forward equation because it evolves forward in time starting at s; note also that x is fixed and y varies. In the backward equation, t and y are fixed, and x varies as s evolves backward in time starting at t.

In general, it is very hard to obtain explicit solutions to the forward or backward equations. However, when a(t, x) = a(x) and b(t, x) = b(x) do not depend on t, it is sometimes possible to obtain a limiting density p(y) = lim_{t→∞} p(t, y|s, x) that is independent of s and x. The existence of this limit suggests that for large t, p(t, y|s, x) settles down to a constant as a function of t. Hence, the partial derivative with respect to t in the forward equation in Equation 60.39 should be zero. This results in the ODE

0 = −d/dy[ a(y)p(y) ] + ½ d²/dy²[ b(y)²p(y) ].    (60.40)

EXAMPLE 60.5:

Let ρ and λ be positive constants, and suppose that

dX_t = −ρX_t dt + √(λ(1 + X_t²)) dW_t.

(The case ρ = 1 and λ = 2 was considered by [10].) Then Equation 60.40 becomes

0 = d/dy[ ρy p(y) ] + ½ d²/dy²[ λ(1 + y²)p(y) ].

Integrating both sides, we obtain

ρy p(y) + ½ d/dy[ λ(1 + y²)p(y) ] = K

for some constant K. Now, the left-hand side of this equation is (λ + ρ)y p(y) + λ(1 + y²)p'(y)/2. If we assume that this goes to zero as |y| → ∞, then K = 0. In this case,

p'(y)/p(y) = −(2 + 2ρ/λ) y/(1 + y²).

Integrating from 0 to y yields

p(y) = p(0)/(1 + y²)^{1 + ρ/λ}.

Of course, p(0) is determined by the requirement that ∫_{−∞}^{∞} p(y) dy = 1. For example, if ρ/λ = 1/2, then p(0) = 1/2, which can be found directly after noting that the antiderivative of 1/(1 + y²)^{3/2} is y/√(1 + y²). As a second example, suppose ρ/λ = 1. Then p(y) has the form p(0)f(y)², where f(y) = 1/(1 + y²). If we let F(ω) denote the Fourier transform of f(y), then Parseval's equation yields ∫_{−∞}^{∞} |f(y)|² dy = ∫_{−∞}^{∞} |F(ω)|² dω/2π. Since F(ω) = πe^{−|ω|}, this last integral can be computed in closed form, and we find that p(0) = 2/π.
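The two normalization constants found above can be verified by numerically integrating p(y) = p(0)/(1 + y²)^{1+ρ/λ} over the real line. A standard-library Python sketch (function name, grid limits, and step are arbitrary choices):

```python
import math

def normalization(p0, exponent, half_width=200.0, step=0.01):
    """Numerically integrate p(y) = p0 / (1 + y^2)^exponent over [-L, L]."""
    n = int(2 * half_width / step)
    total = 0.0
    for i in range(n + 1):
        y = -half_width + i * step
        total += p0 / (1.0 + y * y) ** exponent
    return total * step

# rho/lambda = 1/2: p(0) = 1/2, exponent 3/2
print(normalization(0.5, 1.5))            # close to 1
# rho/lambda = 1: p(0) = 2/pi, exponent 2
print(normalization(2.0 / math.pi, 2.0))  # close to 1
```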
60.7 Defining Terms

Adapted: A random process {H_t} is {F_t}-adapted, where F_t = σ(W_θ, 0 ≤ θ ≤ t), if for each t, H_t is F_t-measurable. See measurable.
Brownian motion: Synonym for Wiener process.
Fokker-Planck equation: Another name for Kolmogorov's forward equation.
History: The history of a process {W_θ, θ ≥ 0} up to and including time t is denoted by F_t = σ(W_θ, 0 ≤ θ ≤ t). See also measurable.
Homogeneous: Linear stochastic differential equations (SDEs) of the form in Equation 60.32 are homogeneous.
Integrated white noise: A random process W_t that behaves as if it had the representation W_t = ∫_0^t Z_θ dθ, where Z_θ is a white noise process. The Wiener process is an example of integrated white noise.
Itô correction term: The last term in Itô's rule in Equations 60.6 and 60.30. This term accounts for the fact that the Wiener process is not differentiable.
Itô's rule: A stochastic version of the chain rule. The general form is given in Equation 60.30.
Kolmogorov's backward equation: The partial differential Equation 60.38 satisfied by the transition function of the solution of an SDE.
Kolmogorov's forward equation: The partial differential Equation 60.39 satisfied by the transition density of the solution of an SDE.
Martingale: {W_t} is an {F_t}-martingale if {W_t} is {F_t}-adapted and if for all t ≥ s ≥ 0, E[W_t | F_s] = W_s, or equivalently, E[W_t − W_s | F_s] = 0.
Measurable: See also history. Let F_t = σ(W_θ, 0 ≤ θ ≤ t). A random variable X is F_t-measurable if it is a deterministic function of the random variables {W_θ, 0 ≤ θ ≤ t}.
Narrow sense: An SDE is linear in the narrow sense if it has the form of Equation 60.33.
Ornstein-Uhlenbeck process: A Gaussian process with zero mean and covariance function in Equation 60.36.
Smoothing properties of conditional expectation: See Equations 60.17 and 60.18.
Transition function: For a process {X_t}, the transition function is P(t, y|s, x) := Pr(X_t ≤ y | X_s = x).
White noise process: A random process with zero mean and covariance E[Z_t Z_s] = δ(t − s).
Wiener integral: An Itô integral with deterministic integrand. Always yields a Gaussian process.
Wiener process: A random process satisfying properties W-1 through W-4 in Section 2. It serves as a model for integrated white noise.
Acknowledgments  The author is grateful to Bob Barmish, Wei-Bin Chang, Majeed Hayat, Bill Levine, Raúl Sequeira, and Rajesh Sharma for reading the first draft of this chapter and for their suggestions for improving it.
References
[1] Arnold, L., Stochastic Differential Equations: Theory and Applications, Wiley, New York, 1974, 48-56, 91, 98, 105, 113, 128, 137, 168.
[2] Billingsley, P., Probability and Measure, 2nd ed., Wiley, New York, 1986, sect. 37, pp. 469-470 (Equations 34.2, 34.5, 34.6).
[3] Coddington, E. A. and Levinson, N., Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955, chap. 1.
[4] Davis, M. H. A., Linear Estimation and Stochastic Control, Chapman and Hall, London, 1977, chap. 3, 52-53.
[5] Elliott, R. J., Stochastic Calculus, Springer-Verlag, New York, 1982.
[6] Ethier, S. N. and Kurtz, T. G., Markov Processes: Characterization and Convergence, Wiley, New York, 1986.
[7] Karatzas, I. and Shreve, S. E., Brownian Motion and Stochastic Calculus, 2nd ed., Springer-Verlag, New York, 1991, 397.
[8] Protter, P., Stochastic Integration and Differential Equations, Springer-Verlag, Berlin, 1990.
[9] Segall, A., Stochastic processes in estimation theory, IEEE Trans. Inf. Theory, 22(3), 275-286, 1976.
[10] Wong, E. and Hajek, B., Stochastic Processes in Engineering Systems, Springer-Verlag, New York, 1985, chap. 4, pp. 173-174.
Further Reading  For background on probability theory, especially conditional expectation, we recommend [2]. For linear SDEs driven by orthogonal-increments processes, we recommend the very readable text by Davis [4].
For SDEs driven by continuous martingales, there is the more advanced book by Karatzas and Shreve [7]. For SDEs driven by right-continuous martingales, the theory becomes considerably more complicated. However, the tutorial paper by Segall [9], which compares discrete-time and continuous-time results, is very readable. Also, Chapter 6 of [10] is accessible. Highly technical books on SDEs driven by right-continuous martingales include [5] and [8]. For the reader interested in Markov processes there is the advanced text of Ethier and Kurtz [6].
Linear Stochastic Input-Output Models

61.1 Introduction ... 1079
61.2 ARMA and ARMAX Models ... 1079
61.3 Linear Filtering ... 1081
61.4 Spectral Factorization ... 1081
61.5 Yule-Walker Equations and Other Algorithms for Covariance Calculations ... 1082
61.6 Continuous-Time Processes ... 1084
61.7 Optimal Prediction ... 1085
61.8 Wiener Filters ... 1086
References ... 1088

Torsten Söderström
Systems and Control Group, Uppsala University, Sweden
61.1 Introduction

Stationary stochastic processes are good ways of modeling random disturbances. The treatment here is basically for linear discrete-time input-output models. Most modern systems for control and signal processing work with sampled data; hence, discrete-time models are of primary interest. Properties of models, ways to calculate variances, and other second-order moments are treated.

The chapter is organized as follows. Autoregressive moving average (ARMA) and ARMA with exogenous input (ARMAX) models are introduced in Section 61.2, while Section 61.3 deals with the effect of linear filtering. Spectral factorization, which has a key role when finding appropriate model representations for optimal estimation and control, is described in Section 61.4. Some ways to analyze stochastic systems by covariance calculations are presented in Section 61.5, while Section 61.6 gives a summary of results for continuous-time processes. In stochastic control, it is a fundamental ingredient to predict future values of the process. A more general situation is the problem of estimating unmeasurable variables. Mean square optimal prediction is dealt with in Section 61.7, with minimal output variance control as a special application. In Section 61.8, a more general estimation problem is treated (covering optimal prediction, filtering, and smoothing), using Wiener filters, which are described in some detail. The chapter is based on [6], where proofs and derivations can be found, as well as several extensions to the multivariable case and to complex-valued signal processing problems.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
61.2 ARMA and ARMAX Models

Wide-sense stationary random processes are often characterized by their first- and second-order moments, that is, by the mean value

m = E x(t)    (61.1)

and the covariance function

r(τ) = E[x(t + τ) − m][x(t) − m],    (61.2)

where t, τ take integer values 0, ±1, ±2, .... For a wide-sense stationary process, the expected values in Equations 61.1 and 61.2 are independent of t. As an alternative to the covariance function, one can use its discrete Fourier transform, that is, the spectrum,

φ(z) = Σ_{τ=−∞}^{∞} r(τ) z^{−τ}.    (61.3)

Evaluated on the unit circle it is called the spectral density, φ(e^{iω}). Through the inverse relation

r(τ) = (1/2πi) ∮ φ(z) z^{τ−1} dz    (61.5)

(where the last integration is counterclockwise around the unit circle), the spectral density describes how the energy of the signal is distributed over different frequency bands (set τ = 0 in Equation 61.5). Similarly, the cross-covariance function between two wide-sense stationary processes y(t) and x(t) is defined as

r_yx(τ) = E[y(t + τ) − m_y][x(t) − m_x],    (61.6)

and its associated spectrum is

φ_yx(z) = Σ_{τ=−∞}^{∞} r_yx(τ) z^{−τ}.    (61.7)

A sequence of independent identically distributed (i.i.d.) random variables is called white noise. A white noise has

r(τ) = 0 for τ ≠ 0.    (61.8)
Equivalently, its spectrum is constant for all z. Hence, its energy is distributed evenly over all frequencies. In order to simplify the development here, it is generally assumed that signals have zero mean. This is equivalent to considering only deviations of the signals from an operating point (given by the mean values). Next, an important class of random processes, obtained by linear filtering of white noise, is introduced. Consider y(t) given as the solution to the difference equation

y(t) + a1 y(t − 1) + ... + an y(t − n) = e(t) + c1 e(t − 1) + ... + cm e(t − m),    (61.9)

where e(t) is white noise. Such a process is called an autoregressive moving average (ARMA) process. If m = 0, it is called an autoregressive (AR) process; and if n = 0, a moving average (MA) process. Introduce the polynomials

A(q) = q^n + a1 q^{n−1} + ... + an,   C(q) = q^m + c1 q^{m−1} + ... + cm,

and the shift operator q, qx(t) = x(t + 1). The ARMA model of Equation 61.9 can then be written compactly as

A(q)y(t − n) = C(q)e(t − m).

As the white noise can be "relabelled" without changing the statistical properties of y(t), Equation 61.9 is much more frequently written in the form

A(q)y(t) = C(q)e(t).    (61.10)

As an alternative to q, one can use the backward shift operator q^{−1}, q^{−1}x(t) = x(t − 1). An ARMA model would then be written as

A*(q^{−1})y(t) = C*(q^{−1})e(t),

where the polynomials are

A*(q^{−1}) = 1 + a1 q^{−1} + ... + an q^{−n},   C*(q^{−1}) = 1 + c1 q^{−1} + ... + cm q^{−m}.

Some illustrations of ARMA processes are given next.

EXAMPLE 61.1:

Consider an ARMA process A(q)y(t) = C(q)e(t). The coefficients of the A(q) and C(q) polynomials determine the properties of the process. In particular, the roots of A(z), which are called the poles of the process, determine the frequency contents of the process. The closer the poles are located towards the unit circle, the slower or more oscillating the process is. Figure 61.1 illustrates the connections between the A and C coefficients, realizations of processes, and their second-order moments as expressed by covariance function and spectral density.

Figure 61.1  Illustration of some ARMA processes: (a) Process I: A(q) = q² − 1.5q + 0.8, C(q) = 1. Pole locations: 0.75 ± i0.49; (b) Process II: A(q) = q² − 1.0q + 0.3, C(q) = 1. Pole locations: 0.50 ± i0.22; (c) Process III: A(q) = q² − 0.5q + 0.8, C(q) = 1. Pole locations: 0.25 ± i0.86; (d) Process IV: A(q) = q² − 1.0q + 0.8, C(q) = q² − 0.5q + 0.9. Pole locations: 0.50 ± i0.74; (e) Covariance functions; (f) Spectral densities [Processes I (solid lines), II (dashed lines), III (dotted lines), IV (dash-dotted lines)].
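An ARMA process is easy to simulate directly from the difference equation of Equation 61.9. A standard-library Python sketch (the zero initial conditions and coefficient list layout are assumptions of this illustration, not from the text):

```python
import random

def simulate_arma(a, c, e):
    """y(t) + a1 y(t-1) + ... + an y(t-n) = e(t) + c1 e(t-1) + ... + cm e(t-m).
    a = [a1, ..., an], c = [c1, ..., cm]; zero initial conditions."""
    y = []
    for t in range(len(e)):
        val = e[t]
        for j, cj in enumerate(c, start=1):
            if t - j >= 0:
                val += cj * e[t - j]
        for j, aj in enumerate(a, start=1):
            if t - j >= 0:
                val -= aj * y[t - j]
        y.append(val)
    return y

rng = random.Random(3)
e = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
# Process I of Figure 61.1: A(q) = q^2 - 1.5 q + 0.8, C(q) = 1
y = simulate_arma([-1.5, 0.8], [], e)
print(len(y), max(abs(v) for v in y))
```

Note the sanity check implied by the model: if A(q) = C(q), then y(t) = e(t) exactly.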
The advantage of using the q-formalism is that stability corresponds to the "natural" condition |z| < 1. An advantage of the alternative q^{−1} formalism is that causality considerations (see below) become easier to handle. The q-formalism is used here. In some situations, such as modeling a drifting disturbance, it is appropriate to allow A(z) to have zeros on or even outside the unit circle. Then the process is not wide-sense stationary, but drifts away. It should hence only be considered for a finite period of time. The special simple case A(z) = z − 1, C(z) = z is known as a random walk.

If an input signal term is added to Equation 61.10, we obtain an ARMA model with exogenous input (ARMAX):

A(q)y(t) = B(q)u(t) + C(q)e(t).    (61.13)

The system must be causal, which means that y(t) is not allowed to depend on future values of the input u(t + τ), τ > 0. Hence, it is required that deg A > deg B. Otherwise, y(t) would depend on future input values.

Sometimes higher-order moments (that is, moments of order higher than two) and the associated higher-order spectra are useful. Such spectra are useful tools for the following:

Extracting information due to deviations from a Gaussian distribution
Estimating the phase of a non-Gaussian process
Detecting and characterizing nonlinear mechanisms in time series
To exemplify, we consider the bispectrum, which is the simplest form of higher-order spectrum. Bispectra are useful only for signals that do not have a probability density function that is symmetric around its mean value. For signals with such a symmetry, spectra of order at least four are needed. Let x(t) be a scalar stationary process of zero mean. Its third moment sequence, R(m, n), is defined as

R(m, n) = E[x(t)x(t + m)x(t + n)],

and satisfies a number of symmetry relations. The bispectrum is

B(z1, z2) = Σ_m Σ_n R(m, n) z1^{−m} z2^{−n}.

Let us consider two special cases:

Let x(t) be zero mean Gaussian. Then R(m, n) = 0 for all m and n, and the bispectrum vanishes.

Let x(t) be non-Gaussian white noise, so that x(t) and x(s) are independent for t ≠ s, E x(t) = 0, E x²(t) = σ², E x³(t) = β. Then

R(m, n) = β if m = n = 0, and 0 elsewhere.    (61.17)

The bispectrum becomes a constant,

B(z1, z2) = β.

61.3 Linear Filtering

Let u(t) be a stationary process with mean m_u and spectrum φ_u(z), and consider

y(t) = G(q)u(t) = Σ_{k=0}^{∞} g_k u(t − k),    (61.19)

where G(q) = Σ_{k=0}^{∞} g_k q^{−k} is an asymptotically stable filter (that is, it has all poles strictly inside the unit circle). Then y(t) is a stationary process with mean

m_y = G(1)m_u,    (61.20)

and spectrum φ_y(z) and cross-spectrum φ_yu(z) given by, respectively,

φ_y(z) = G(z)G(z^{−1})φ_u(z),    (61.21)

φ_yu(z) = G(z)φ_u(z).    (61.22)

The interpretation of Equation 61.20 is that the mean value m_u is multiplied by the static gain of the filter to get the output mean value m_y. The following corollary is a form of Parseval's relation. Assume that u(t) is a white noise sequence with variance λ². Then

E y²(t) = λ² Σ_{k=0}^{∞} g_k².

Let u(t) further have bispectrum B_u(z1, z2). Then B_y(z1, z2) can be found after a straightforward calculation,

B_y(z1, z2) = G(z1^{−1}z2^{−1}) G(z1) G(z2) B_u(z1, z2).    (61.24)

This is a generalization of Equation 61.21. Note that the spectral density (the power spectrum) does not carry information about the phase properties of a filter. In contrast to this, phase properties can be recovered from the bispectrum when it exists. This point is indirectly illustrated in Example 61.2.

61.4 Spectral Factorization

Let φ(z) be a scalar spectrum that is rational in z, that is, it can be written as a ratio of two polynomials in z and z^{−1}. Then there are two polynomials

A(z) = z^n + a1 z^{n−1} + ... + an,   C(z) = z^m + c1 z^{m−1} + ... + cm,

and a positive real number λ² so that (1) A(z) has all zeros inside the unit circle, (2) C(z) has all zeros inside or on the unit circle, and (3)

φ(z) = λ² [C(z)C(z^{−1})] / [A(z)A(z^{−1})].    (61.27)

In the case where φ(e^{iω}) > 0 for all ω, C(z) will have no zeros on the circle. Note that any continuous spectral density can be approximated arbitrarily well by a rational function in z = e^{iω} as in Equations 61.25 to 61.27, provided that m and n are appropriately chosen. Hence, the assumptions imposed are not restrictive. Instead, the results are applicable, at least with a small approximation error, to a very wide class of stochastic processes.
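For an MA(1) spectrum the factorization of Equation 61.27 can be computed in closed form. With A(z) = z and C(z) = z + c, Equation 61.27 reads φ(z) = λ²(1 + c² + c(z + z^{−1})), so matching a given r(0) and r(1) means solving λ²(1 + c²) = r(0) and λ²c = r(1) with |c| ≤ 1. A standard-library Python sketch (the function name is illustrative; a valid spectrum requires r(0) ≥ 2|r(1)|):

```python
import math

def ma1_spectral_factor(r0, r1):
    """Return (c, lam2) with lam2*(1 + c^2) = r0, lam2*c = r1, |c| <= 1."""
    if r1 == 0.0:
        return 0.0, r0
    # c solves r1*c^2 - r0*c + r1 = 0; the two roots are reciprocal,
    # and this branch picks the one inside the unit circle.
    disc = math.sqrt(r0 * r0 - 4.0 * r1 * r1)
    c = (r0 - disc) / (2.0 * r1)
    return c, r1 / c

# MA(1) y(t) = e(t) + 0.5 e(t-1) with lam2 = 2: r(0) = 2.5, r(1) = 1.
c, lam2 = ma1_spectral_factor(2.5, 1.0)
print(c, lam2)  # 0.5 2.0
```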
It is an important implication that (as far as second-order moments are concerned) the underlying stochastic process can be regarded as generated by filtering white noise, that is, as an ARMA process

A(q)y(t) = C(q)e(t),   E e(t)e*(t) = λ².    (61.28)

Hence, for describing stochastic processes (as long as they have rational spectral densities), it is no restriction to assume the input signals are white noise. In Equation 61.28, the sequence {e(t)} is called the output innovations.

Spectral factorization can also be viewed as a form of aggregation of noise sources. Assume, for example, that an ARMA process

A(q)x(t) = C(q)v(t)    (61.29a)

is observed, but the observations include measurement noise,

y(t) = x(t) + e(t),    (61.29b)

and that v(t) and e(t) are uncorrelated white noise sequences with variances λ_v² and λ_e², respectively. As far as the second-order properties (such as the spectrum or the covariance function) are concerned, y(t) can be viewed as generated from one single noise source:

A(q)y(t) = D(q)ε(t).    (61.29c)

The polynomial D(q) and the noise variance λ_ε² are derived as follows. The spectrum is, according to Equations 61.29a, 61.29b and 61.29c,

φ_y(z) = λ_v² [C(z)C(z^{−1})]/[A(z)A(z^{−1})] + λ_e² = λ_ε² [D(z)D(z^{−1})]/[A(z)A(z^{−1})].

Equating these two expressions gives

λ_ε² D(z)D(z^{−1}) = λ_v² C(z)C(z^{−1}) + λ_e² A(z)A(z^{−1}).

The two representations of Equations 61.29a and 61.29b, and Equation 61.29c, of the process y(t) are displayed schematically in Figure 61.2.

Figure 61.2  Two representations of an ARMA process with noisy observations.

Spectral factorization can also be performed using a state-space formalism. Then an algebraic Riccati equation (ARE) has to be solved. Its different solutions correspond to different polynomials C(z) satisfying Equation 61.27. The positive definite solution of the ARE corresponds to the C(z) polynomial with all zeros inside the unit circle.

One would expect, for a given process, the same type of filter representation to appear for the power spectrum and for the bispectrum. This is not so in general, as illustrated by the following example.

EXAMPLE 61.2:

Consider a process consisting of the sum of two independent AR processes

y(t) = (1/A(q)) e(t) + (1/C(q)) v(t),    (61.30)

e(t) being Gaussian white noise and v(t) non-Gaussian white noise. Both sequences are assumed to have unit variance, and E v³(t) = 1. The Gaussian process will not contribute to the bispectrum. Further, B_v(z1, z2) = 1, and according to Equation 61.24 the bispectrum will be

B_y(z1, z2) = [1/C(z1^{−1}z2^{−1})] [1/C(z1)] [1/C(z2)].    (61.31)

Hence,

H(z) = 1/C(z)    (61.32)

is the relevant filter representation as far as the bispectrum is concerned. However, the power spectrum becomes

φ(z) = 1/[A(z)A(z^{−1})] + 1/[C(z)C(z^{−1})],    (61.33)

and in this case it will have a spectral factor of the form

H(z) = B(z)/[A(z)C(z)],    (61.34)

where the polynomial B(z) satisfies B(z)B(z^{−1}) = C(z)C(z^{−1}) + A(z)A(z^{−1}), due to the spectral factorization. Clearly, the two filter representations of Equations 61.32 and 61.34 differ.

61.5 Yule-Walker Equations and Other Algorithms for Covariance Calculations
When analyzing stochastic systems it is often important to compute variances and covariances between inputs, outputs, and other variables. This can mostly be reduced to the problem of computing the covariance function of an ARMA process. Some ways to do this are presented in this section. Consider first the case of an autoregressive (AR) process

A(q)y(t) = e(t),   E e²(t) = λ².
61.5. YULE-WALKER EQL'ATIONS AND OTHER ALGORITHMS FOR COVARIANCE CALCULATIONS Note that y (t) can be viewed as a linear combination of all the old values of the noise, that is, {e(s)}:.=-,. By multiplying y(t) by a delayed value of the process, say y(t - r ) , r 2 0, and applying the expectation operator, one obtains
EXAMPLE 61.3: Consider the ARMA process
In this case 12 = 1, m = 1. Equation 61.42 gives
Using Equation 6 1.40 for t = 0 and 1 gives
which is called a Yule-Walker equation. By using Equation 61.37 fort = 0, . . . , n, one can construct the followingsystem of equations for determining the covariance elements r(O), r ( l ) , . . . , r (n):
Consider Equation 61.41 for r = 0 and 1, which gives
By straightforward calculations, it is found that ~ey(O)
rcy(-l) (61.38) Once r(O), . . . , r(n) are known, Equation 61.37 can be iterated (for t = n 1, n 2, . . .) to find further covariance elements. Consider next a full ARMA process
+
k2,
=
k2(c - a ) ,
+
~(t).taly(t-l)+...+a~y(t-n)
= e(t)
,b
---(c - a ) ( l
1-a2
-
ac),
(61.39)
Now computing the cross-covariance function between y (r) and e(t) must involve an intermediate step. Multiplying Equation 61.39 by y(t -- r ) , r > 0, and applying the expectabon operator gives
+ airy ( r - 1) + . . . + anry( r - n) = + c1rey(5 - 1) +. . . + cmrey(T - m).
rFy(l) = and finally,
+ cle(t - 1) + . . . + cme(t - m), ~ e ~ (=t h2. )
ry(r)
=
As an example of alternative approaches for covariance calculations, consider the following situation. Assume that two ARMA processes are given: A(q)y(t) A(q) B(q)
fey(5
(6 1.40)
= B(q)e(t), = qn+alqn-'+ = bog" blqn-I
+
...+a,,
(61.43)
+ . . . + b,,
and
In order to obtain the output covariance function ry(r), the cross-covariance function rel,(t) must first be found. This is done by multiplying Equation 61.39 by e(t - r ) and applying the expectation operator, which leads to Assume that e(t) is the same in Equations 61.43 anJ61.44 and that it is white noise of zero mean and unit variance. The problem is to find the cross-covariance elements where at,, is the Kronecker delta (St,, = 1 if t = s, and 0 elsewhere). As y(t) is a linear combination of {e(s)}:=-,, it is found that rey(T) = 0 for r > 0. Hence, Equation 61.40 gives
The use of Equations 61.40 to 61.42 to derive the autocovariance function is illustrated next by applying them to a first-order ARMA process.
for a number of arguments k. The cross-covariance function r (k) is related to the cross-spectrum 4y (2) as (see Equations 6 1.7 and 6 1.22)
1084
THE CONTROL HANDBOOK
Introduce the two polynomials

F(z) = f0 z^n + f1 z^{n−1} + ... + f_n,
G(z^{−1}) = g0 z^{−m} + g1 z^{−(m−1)} + ... + g_{m−1} z^{−1},  (61.47)

through

B(z) D(z^{−1}) / [A(z) C(z^{−1})] = F(z)/A(z) + z G(z^{−1})/C(z^{−1}),  (61.48)

or, equivalently,

B(z) D(z^{−1}) = F(z) C(z^{−1}) + z A(z) G(z^{−1}).  (61.49)

Since zA(z) and C(z^{−1}) are coprime (that is, these two polynomials have no common factor), Equation 61.49 has a unique solution. Note that as a linear system of equations, Equation 61.49 has n + m + 1 equations and the same number of unknowns. The coprimeness condition ensures that a unique solution exists. Equations 61.46 and 61.48 now give the decomposition of the cross-spectrum into a causal and an anticausal part. The two terms on the right-hand side of Equation 61.50 can be identified with the two parts of the sum. Equating the powers of z gives

r_yw(k) = f_k − Σ_{j=1}^{k} a_j r_yw(k − j),   (0 ≤ k ≤ n),
r_yw(k) = − Σ_{j=1}^{n} a_j r_yw(k − j),   (k > n).  (61.52)

Note that the last part of Equation 61.52 is nothing but a Yule–Walker type of equation. Similarly,

r_yw(−1) = g0,
r_yw(−k) = g_{k−1} − Σ_{j=1}^{k−1} c_j r_yw(−k + j),   (2 ≤ k ≤ m),
r_yw(−k) = − Σ_{j=1}^{m} c_j r_yw(−k + j),   (k > m).  (61.53)
EXAMPLE 61.4:

Consider again a first-order ARMA process

y(t) + a y(t−1) = e(t) + c e(t−1),   Ee²(t) = 1.

In this case, the autocovariance function is sought. Hence, choose w(t) = y(t), and thus A(q) = q + a, B(q) = q + c, C(q) = q + a, D(q) = q + c. Equation 61.49 becomes

(z + c)(z^{−1} + c) = (f0 z + f1)(z^{−1} + a) + z(z + a)(g0 z^{−1}).

Equating the powers of z gives

f0 = (1 + c² − 2ac)/(1 − a²),   f1 = c,   g0 = (c − a)(1 − ac)/(1 − a²).

Hence, Equation 61.52 implies that

r(0) = f0 = (1 + c² − 2ac)/(1 − a²),
r(1) = f1 − a r(0) = (c − a)(1 − ac)/(1 − a²),
r(k) = (−a)^{k−1} r(1),   k ≥ 1,

while Equation 61.53 gives

r(−1) = g0 = (c − a)(1 − ac)/(1 − a²),
r(−k) = (−a)^{k−1} r(−1),   k ≥ 1.

Needless to say, these expressions for the covariance function are the same as those derived in the previous example.
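The closed-form covariances above can be checked numerically. The sketch below (with assumed values of a and c, not taken from the chapter) computes the autocovariance from the impulse response of H(z) = (1 + c z^{−1})/(1 + a z^{−1}), using r(k) = Σ_j h_j h_{j+k} for unit-variance white noise, and compares it with the formulas of Example 61.4.

```python
# Numerical check of the Example 61.4 covariance formulas for the
# first-order ARMA process y(t) + a*y(t-1) = e(t) + c*e(t-1), Ee^2 = 1.
# The autocovariance is computed from the impulse response of
# H(z) = (1 + c z^-1)/(1 + a z^-1):  r(k) = sum_j h_j * h_{j+k}.

def impulse_response(a, c, n):
    # h_0 = 1, h_j = (c - a)(-a)^(j-1) for j >= 1
    h = [1.0]
    for j in range(1, n):
        h.append((c - a) * (-a) ** (j - 1))
    return h

def acov(a, c, k, n=2000):
    # truncated sum; the impulse response decays geometrically for |a| < 1
    h = impulse_response(a, c, n + k)
    return sum(h[j] * h[j + k] for j in range(n))

a, c = -0.5, 0.3          # assumed stable, invertible example values
r0 = (1 + c**2 - 2*a*c) / (1 - a**2)
r1 = (c - a) * (1 - a*c) / (1 - a**2)

assert abs(acov(a, c, 0) - r0) < 1e-9
assert abs(acov(a, c, 1) - r1) < 1e-9
assert abs(acov(a, c, 3) - (-a)**2 * r1) < 1e-9   # r(k) = (-a)^(k-1) r(1)
```

The same check works for any |a| < 1; the truncation length only needs to be large enough for the geometric tail to be negligible.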
61.6 Continuous-Time Processes

This section illustrates how some of the properties of discrete-time stochastic systems appear in an analog form for continuous-time models. However, white noise in continuous time leads to considerable mathematical difficulties, which must be solved in a rigorous way. See Chapter 60 for more details on this aspect. The covariance function of a process y(t) is still defined as (compare Equation 61.2)

r(τ) = E y(t + τ) y(t),

assuming for simplicity that y(t) has zero mean. The spectrum will now be

φ(s) = ∫_{−∞}^{∞} r(τ) e^{−sτ} dτ,  (61.55)

and the spectral density is φ(iω). The inverse relation to Equation 61.55 is

r(τ) = (1/2πi) ∫ φ(s) e^{sτ} ds,

where integration is along the whole imaginary axis. Consider a stationary stochastic process described by a spectral density φ(iω) that is a rational function of iω. By pure analogy with the discrete-time case it is found that
φ(s) = λ² B(s)B(−s) / [A(s)A(−s)],  (61.58)
where the polynomials
have all their roots in the left half-plane (i.e., in the stability area). Here p is an arbitrary polynomial argument, but can be interpreted as the differentiation operator [p y(t) = ẏ(t)]. The effect of filtering a stationary process, say u(t), with an asymptotically stable filter, say H(p), can be easily phrased using the spectra. Let the filtering be described by

y(t) = H(p) u(t).
( 6 1.60)
φ_y(s) = H(s) H(−s) φ_u(s),
(61.61)
Then Equation 61.61 follows, again paralleling the discrete-time case. As a consequence, one can interpret any process with a rational spectral density (Equation 61.58) as having been generated by filtering white noise as in Equation 61.60, using

H(p) = B(p)/A(p).
This is the decomposition mentioned previously. The term F(q)e(t + k) is a weighted sum of future noise values, while the term qL(q)/C(q) y(t) is a weighted sum of available measurements Y_t. Note that it is crucial for stability that C(q) has all zeros strictly inside the unit circle (but this is not restrictive due to spectral factorization). As the future values of the noise are unpredictable, the mean square optimal predictor is given by
The signal u(t) would then have a constant spectral density, φ_u(iω) ≡ 1. As for the discrete-time case, such a process is called white noise. It will have a covariance function r(τ) = δ(τ) and hence, in particular, infinite variance. This indicates the difficulties of treating it with mathematical rigor.
61.7 Optimal Prediction
Consider an ARMA process

A(q) y(t) = C(q) e(t),   Ee²(t) = λ²,  (61.63)
where A(q) and C(q) are of degree n and have all their roots inside the unit circle. We seek a k-step predictor, that is, a function of available data y(t), y(t−1), ..., that will be close to the future value y(t + k). In particular, we seek the predictor that is optimal in a mean square sense. The clue to finding this predictor is to rewrite y(t + k) as two terms. The first term is a weighted sum of future noise values, {e(t + j)}, j = 1, ..., k. As this term is uncorrelated with all available data, it cannot be reconstructed in any way. The second term is a weighted sum of past noise values {e(t − s)}, s ≥ 0. By inverting the process model, the second term can be written as a weighted sum of output values, {y(t − s)}, s ≥ 0. Hence, it can be computed exactly from data. In order to proceed, introduce the predictor identity

z^{k−1} C(z) = A(z) F(z) + L(z),  (61.64)

where

F(z) = z^{k−1} + f1 z^{k−2} + ... + f_{k−1},  (61.65)
L(z) = l0 z^{n−1} + l1 z^{n−2} + ... + l_{n−1}.  (61.66)

Equation 61.64 is a special case of a Diophantine equation for polynomials. A solution is always possible. This is analogous to the Diophantine equation n = qm + r for integers (for given integers n and m, there exist unique integers q and r). The resulting mean square optimal predictor is

ŷ(t + k | t) = [q L(q) / C(q)] y(t),  (61.68)

with prediction error F(q) e(t + k) and error variance λ²(1 + f1² + ... + f_{k−1}²).

As a more general case, consider prediction of y(t + k) in the ARMAX model

A(q) y(t) = B(q) u(t) + C(q) e(t).  (61.71)

In this case, one proceeds as in Equation 61.67.
We find that the prediction error is still given by Equation 61.69, while the optimal predictor is
This result can also be used to derive a minimum output variance regulator. That is, let us seek a feedback control for the process of Equation 61.71 that minimizes Ey²(t). Let k = deg A − deg B denote the delay in the system. As ŷ(t + k | t) and the prediction error ỹ(t + k | t) are independent,
E y²(t + k) = E ŷ²(t + k | t) + E ỹ²(t + k | t) ≥ E ỹ²(t + k | t),  (61.74)

with equality if and only if ŷ(t + k | t) = 0. The minimum variance regulator is therefore obtained by choosing u(t) so that the k-step prediction of the output is zero. Optimal prediction can also be carried out using a state-space formalism. It will then involve computing the Kalman filter, and a Riccati equation has to be solved (which corresponds to the spectral factorization). See Chapter 39 for a treatment of linear quadratic stochastic control using state-space techniques.

It is possible to give an interpretation and alternative view of Equation 61.80. For the optimal filter, the estimation error, s̃(t), should be uncorrelated with all past measurements, {y(t − j)}, j ≥ 0. Otherwise, there would be another linear combination of the past measurements giving a smaller estimation error variance. Hence,

E s̃(t) y(t − j) = 0,   all j ≥ 0,  (61.81)

or, equivalently,

E s̃(t) [G1(q) y(t)] = 0  (61.82)

for any stable and causal G1(q). Rewritten in the frequency domain, this is precisely the optimality condition of Equation 61.80.
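As a numerical sanity check (a sketch with assumed parameter values, not taken from the chapter), the k = 1 case of the predictor identity can be verified by simulation: for y(t) + a y(t−1) = e(t) + c e(t−1), Equation 61.64 with k = 1 gives F = 1 and L(z) = c − a, so the predictor recursion is ŷ(t+1|t) = −c ŷ(t|t−1) + (c − a) y(t), and the prediction error variance should approach λ².

```python
# Simulation check of the one-step ARMA predictor from identity (61.64):
#   yhat(t+1|t) = -c*yhat(t|t-1) + (c - a)*y(t),
# whose prediction error is exactly e(t+1), of variance lambda^2.
import random

random.seed(2)
a, c, lam = -0.6, 0.4, 1.0     # assumed stable/invertible example values
N = 40000
y_prev = e_prev = yhat = 0.0
err2 = 0.0
for _ in range(N):
    e = random.gauss(0.0, lam)
    y = -a * y_prev + e + c * e_prev      # ARMA(1,1) process
    err2 += (y - yhat) ** 2               # error of the previous prediction
    yhat = -c * yhat + (c - a) * y        # predict y(t+1) from data up to t
    y_prev, e_prev = y, e

assert abs(err2 / N - lam**2) < 0.05      # close to the theoretical minimum
```

For k > 1 the same experiment applies with F and L of Equations 61.65 and 61.66, and the error variance grows to λ²(1 + f1² + ... + f_{k−1}²).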
61.8 Wiener Filters

The steady-state linear least mean square estimate is considered in this section. It can be computed using a state-space formalism (like Kalman filters and smoothers), but here a polynomial formalism for an input-output approach is utilized. In case time-varying or transient situations have to be handled, a state-space approach must be used. See Chapter 25 for a parallel treatment of Kalman filters. Let y(t) and s(t) be two correlated and stationary stochastic processes. Assume that y(t) is measured, and find a causal, asymptotically stable filter G(q) such that G(q)y(t) is the optimal linear mean square estimator of s(t); that is, it minimizes the criterion

V = E[s(t) − G(q)y(t)]².  (61.76)
From Equation 61.80, one easily finds the unrealizable Wiener filter. Setting the integrand to zero gives G(z)φ_y(z) = φ_sy(z), and

G(z) = φ_sy(z) φ_y^{−1}(z).  (61.84)
The filter is not realizable since it relies (except in very degenerate cases) on all future data points of y(t). Note, though, that when "deriving" Equation 61.84 from Equation 61.80, it was effectively required that Equation 61.80 hold for any G1(z). However, it is required only that Equation 61.80 hold for any causal and stable G1(z). This observation will eventually lead to the optimal realizable filter. To proceed, let the process y(t) have the innovations representation (remember that this is always possible by Section 61.4)
This problem is best treated in the frequency domain. This implies in particular that data are assumed to be available from the infinite past, t = −∞. Introduce the estimation error

s̃(t) = s(t) − G(q) y(t).
The criterion V (Equation 61.76) can be rewritten as
with H(q) and H^{−1}(q) asymptotically stable. Then φ_y(z) = H(z) H(z^{−1}) λ². Further, introduce the causal part of an analytic function. Let
Next note that
where it is required that the series converges in a strip that includes the unit circle. The causal part of G(z), written [G(z)]+, collects the terms with nonnegative powers of z^{−1}. Now let G(q) be the optimal filter and G1(q) any causal filter. Replace G(q) in Equation 61.76 by G(q) + εG1(q). As a function of ε, V can then be written as V = V0 + εV1 + ε²V2. For G(q) to be the optimal filter, it is required that V ≥ V0 for all ε, which leads to V1 = 0, giving
and the anticausal part is the complementary part of the sum:
It is important to note that the term g0 z^{−0} in Equation 61.86 appears in the causal part, [G(z)]+. Note that the anticausal part [G(z)]− of a transfer function G(z) has no poles inside or on the unit circle, and that a filter G(z) is causal if and only if G(z) = [G(z)]+. Using the conventions of Equations 61.87 and 61.88, the optimality condition of Equation 61.80 can be formulated as
Note that it is noncausal, but it is otherwise a perfect estimate since it is without error! Next, the realizable filter is calculated:
To proceed, let A ( z ) and C ( z )have degree n, and introduce the polynomial F ( z ) of degree k - 1 and the polynomial L ( z ) of degree n - 1 by the predictor identity (Equation 61.64). This gives
The stability requirements imply that the function H(z^{−1})G1(z^{−1}) does not have any poles inside the unit circle. The same is true for [φ_sy(z){H(z^{−1})}^{−1}λ^{−2}]−, by construction. The latter function has a zero at z = 0. Hence, by the residue theorem,
The optimal predictor therefore has the form
in agreement with Equation 61.68.
(1/2πi) ∮ [φ_sy(z){H(z^{−1})}^{−1}λ^{−2}]− H(z^{−1}) G1(z^{−1}) (dz/z) = 0.  (61.90)
The optimal condition of Equation 61.90 is therefore satisfied if
This is the realizable Wiener filter. It is clear from its construction that it is a causal and asymptotically stable filter. The Wiener filter is illustrated by two examples.
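Before the worked examples, the unrealizable filter of Equation 61.84 can be illustrated numerically. The sketch below (with an assumed AR(1) signal model and noise variance, not values from the chapter) uses the signal-in-noise setup y(t) = s(t) + w(t) with s and w uncorrelated, so that φ_sy = φ_s, φ_y = φ_s + φ_w, and the noncausal filter gain is φ_s/(φ_s + φ_w) at each frequency.

```python
# Frequency response of the unrealizable Wiener filter (61.84) for
# y(t) = s(t) + w(t), s and w uncorrelated:
#   G(e^{i omega}) = phi_s(omega) / (phi_s(omega) + phi_w(omega)).
import cmath

def phi_s(om, a=0.8, var_v=1.0):
    # spectrum of the assumed AR(1) signal s(t) = a*s(t-1) + v(t)
    return var_v / abs(1 - a * cmath.exp(-1j * om)) ** 2

def wiener_gain(om, var_w=1.0):
    ps = phi_s(om)
    return ps / (ps + var_w)          # Equation 61.84 for this setup

low = wiener_gain(0.0)                # signal dominates at low frequency
high = wiener_gain(3.14159)           # noise dominates near the Nyquist rate
assert 0.0 < high < low < 1.0
```

The gain is always between 0 and 1 and follows the signal-to-noise ratio across frequency, which is the intuitive content of Equation 61.84; the realizable filter of the text additionally enforces causality through the spectral factor H(z).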
EXAMPLE 61.6:
Consider the measurement of a random signal s(t) in additive noise w(t). This noise source need not be white, but it is assumed to be uncorrelated with the signal s(t). Model s(t) and w(t) as ARMA processes (see Figure 61.3). Thus,
EXAMPLE 61.5:
Consider the ARMA process
Treat the prediction problem
s(t) = y(t + k),   k > 0.
Figure 61.3
Setup for a polynomial estimation problem.
In this case
Ee(t)e(s) = λ² δ_{t,s},   Ee(t)u(s) = 0. The unrealizable filter of Equation 61.84 becomes, as before,
meaning that
ŝ(t) = y(t + k).
The polynomials in the model are

A(q) = q^n + a1 q^{n−1} + ... + a_n,
C(q) = q^n + c1 q^{n−1} + ... + c_n,
M(q) = q^r + m1 q^{r−1} + ... + m_r,
N(q) = q^r + n1 q^{r−1} + ... + n_r.
Assume that all four polynomials have all their roots inside the unit circle. The problem to be treated is to estimate s(t − m) from {y(t − j)}, j ≥ 0, where m is an integer. (By definition, m = 0 gives filtering, m > 0 smoothing, and m < 0 prediction.) To solve the estimation problem, first perform a spectral factorization of the output spectrum,
requiring that B(z) is a monic polynomial (that is, it has a leading coefficient equal to one) of degree n + r, and that it has all zeros inside the unit circle. The polynomial B(z) is therefore uniquely given by the identity
According to Equation 61.91, the optimal filter becomes
The causal part [ ]+ can be found by solving the Diophantine equation
where the unknown polynomials have degrees

deg R = n − min(0, −m),
deg L = n + r − 1 + max(0, −m).
Note that the "−1" that appears in deg L has no direct correspondence in deg R. The reason is that the direct term g0 z^{−0} in Equation 61.86 is associated with the causal part of G(z). The optimal filter is readily found:
References

[1] Åström, K.J., Introduction to Stochastic Control, Academic Press, New York, 1970.
[2] Åström, K.J. and Wittenmark, B., Computer Controlled Systems, Prentice Hall, Englewood Cliffs, NJ, 1990.
[3] Grimble, M.J. and Johnson, M.A., Optimal Control and Stochastic Estimation, John Wiley & Sons, Chichester, UK, 1988.
[4] Hunt, K.J., Ed., Polynomial Methods in Optimal Control and Filtering, Peter Peregrinus Ltd., Stevenage, UK, 1993 (in particular, Chapter 6: A. Ahlén and M. Sternad, Optimal Filtering Problems).
[5] Kučera, V., Discrete Linear Control, John Wiley & Sons, Chichester, UK, 1979.
[6] Söderström, T., Discrete-Time Stochastic Systems: Estimation and Control, Prentice Hall International, Hemel Hempstead, UK, 1994.
Minimum Variance Control
M. R. Katebi and A. W. Ordys, Industrial Control Centre, Strathclyde University, Glasgow, Scotland
62.1 Introduction
62.2 Basic Requirements
    The Process Model • The Disturbance Model • The Performance Index • The Control Design Specification
62.3 Optimal Prediction
    The d-Step Ahead Predictor • Predictor Solution
62.4 Minimum-Variance Control
    Minimum-Variance Control Law • Closed-Loop Stability • The Tracking Problem • The Weighted Minimum Variance • The Generalized Minimum Variance Control
62.5 State-Space Minimum-Variance Controller
62.6 Example
62.7 Defining Terms
    Expectation Operator • Linear Quadratic Gaussian (LQG) • Minimum Phase • Steady-State Offset • White Noise
References
Further Reading
62.1 Introduction

Historically, Minimum-Variance Control (MVC) has been a practical application of linear stochastic control theory. The control algorithm was first formulated in [1]. Åström used the MVC technique to minimize the variance of the output signal for control of paper thickness in a paper machine. The control objective was to achieve the lowest possible variation of paper thickness with stochastic disturbances acting on the process. Since then, the MVC technique has attracted many practical applications (e.g., [8], [12]) and has gained significant theoretical development, leading to a wide range of control algorithms known as Model-Based Predictive Control techniques. The reasons for MVC's popularity lie in its simplicity, ease of interpretation and implementation, and its relatively low requirements for model accuracy and complexity. In fact, the models can be obtained from straightforward identification schemes. The MVC technique was also used for self-tuning algorithms. When MVC was developed, its low computational overhead was important. MVC is still used as a simple and efficient algorithm in certain types of control problems [10]. It is particularly well suited for situations where [1]
the control task is to keep variables close to the operating points,
the process can be modeled as linear, time invariant, but with significant time delay, and
disturbances can be described by their stochastic characteristics.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
MVC, as it was originally presented, provides a stable control action for only a limited range of processes. One of its main assumptions is the so-called "minimum-phase" behavior of the system. Later, extensions to MVC were introduced in [3], [5], [7] to relax the most restrictive assumptions. The penalty for this extension was increased complexity of the algorithms, eventually approaching or sometimes even exceeding the complexity of the standard Linear Quadratic Gaussian (LQG) solution. However, the formulation of MVC was a cornerstone in stimulating development of the more general class of Model-Based Predictive Control, of which MVC is an example. This article is organized as follows. The model and control design requirements are described in Section 62.2. The formulation of an MV predictor is given in Section 62.3. The basic MVC and its variations are developed in Section 62.4. The state-space formulation is given in Section 62.5. An example of the application of MVC is given in Section 62.6. Finally, defining terms are given in Section 62.7.
62.2 Basic Requirements

62.2.1 The Process Model

In its basic form, minimum-variance control assumes a single-input, single-output (SISO), linear, time-invariant stationary stochastic process described by

A(z^{−1}) y(t) = z^{−d} B(z^{−1}) u(t) + C(z^{−1}) w(t),  (62.1)

where y(t) represents the variation of the output signal around a given steady-state operating value, u(t) is the control signal, and w(t) denotes a disturbance assumed to be zero mean, Gaussian white noise of variance σ. A(z^{−1}), B(z^{−1}), and C(z^{−1}) are nth-order polynomials. Note that the first coefficients, associated with the zero power of z^{−1} in A(z^{−1}) and C(z^{−1}), are unity. This does not impose any restrictions on the system description. For example, if the leading coefficient in polynomial A(z^{−1}) is a0, then both sides of Equation 62.1 may be divided by a0 and the remaining coefficients adjusted accordingly. Similarly, if the leading coefficient in polynomial C(z^{−1}) is c0, then its value may be included in the disturbance description,

w̄(t) = c0 · w(t).  (62.6)

The new disturbance is still a zero mean, Gaussian white noise with variance σ̄ = c0² · σ.

62.2.2 The Disturbance Model

Consider a discrete, time-invariant system with transfer function G(z^{−1}): y(t) = G(z^{−1}) w(t). Let the input signal be white noise of zero mean and unit variance. The spectral density of the output signal may be written as

φ_yy(ω) = G(e^{−jω}) G(e^{jω}) φ_ww(ω) = |G(e^{−jω})|² φ_ww(ω).  (62.7)

For the special class of stationary stochastic processes with rational spectral density, the process can be represented by the rational transfer function G(z^{−1}) = C(z^{−1})/A(z^{−1}) driven by white noise, so that G has all its poles outside the unit circle and zeros outside or on the unit circle. This process is known as spectral factorization and may be used to model the disturbance part of the model given in Equation 62.1. This implies that the output spectrum should first be calculated using the output measurement. The transfer function G(z^{−1}) is then fitted to the spectrum using a least-squares algorithm.
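The spectral factorization idea can be illustrated in the simplest MA(1) case. The sketch below (with assumed covariance values, not the chapter's least-squares fitting procedure) recovers the invertible factor from r0 = λ²(1 + c²) and r1 = λ²c by solving c² − (r0/r1)c + 1 = 0 and keeping the root with |c| ≤ 1.

```python
# Spectral factorization sketch for an MA(1) spectrum: given the
# autocovariances r0 and r1 of y(t) = e(t) + c*e(t-1), Ee^2 = lam2,
# recover (c, lam2) with the invertible root |c| <= 1.
import math

def ma1_spectral_factor(r0, r1):
    # requires r1 != 0 and |r0/r1| >= 2 (a valid MA(1) covariance pair)
    s = r0 / r1                           # = (1 + c^2)/c; roots c and 1/c
    if s > 0:
        c = (s - math.sqrt(s * s - 4.0)) / 2.0
    else:
        c = (s + math.sqrt(s * s - 4.0)) / 2.0
    lam2 = r1 / c
    return c, lam2

# Forward direction with assumed values lam2 = 2.0, c = 0.5:
r0, r1 = 2.0 * (1 + 0.25), 2.0 * 0.5
c, lam2 = ma1_spectral_factor(r0, r1)
assert abs(c - 0.5) < 1e-12 and abs(lam2 - 2.0) < 1e-12
```

For higher-order spectra the same principle applies: the roots of the covariance generating function come in reciprocal pairs, and the stable/invertible member of each pair is assigned to the factor.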
Here z^{−1} is the backward shift operator, i.e., z^{−1} y(t) = y(t − 1); thus z^{−d} in Equation 62.1 represents a d-step delay in the control signal. This means that the control signal starts to act on the system after d time increments. The coefficients of the polynomials A(z^{−1}), B(z^{−1}), and C(z^{−1}) are selected so that the linear dynamics of the system are accurately represented about an operating point. The coefficient values may result from considering physical laws governing the process or, more likely, from system identification schemes. Some of these coefficients may be set to zero. In particular, if certain coefficients with index n (e.g., c_n) are set to zero, then the polynomials will have different orders.

Figure 62.1  The process and the feedback system structure.

62.2.3 The Performance Index

The performance index to be minimized is the variance of the output at t + d, given all the information up to time t, expressed by

J(t) = E { y(t + d)² },  (62.8)

where E{·} denotes the expectation operator. It is assumed that the control selected must be a function of the information available at time t, i.e., all past control signals, all past outputs, and the present output. Therefore, the feasibility of the solution requires rewriting the performance index of Equation 62.8 in the form

J(t) = E { y(t + d)² | Y(t) },

where E{·|·} is the conditional expectation. Then, the optimization task is choosing the control signal u(t) so that J(t) is minimized with respect to u(t).
62.2.4 The Control Design Specification

From a practical point of view, after applying the MVC it is important that the closed-loop system meets the following requirements:

The closed-loop system is stable.
The steady-state offset is zero.
The system is robust, i.e., the behavior of the system does not change with respect to parameter changes.
The random variables [w(t + d − 1), w(t + d − 2), ...] are assumed independent of the process output [y(t), y(t − 1), ...], and all future control inputs are assumed to be zero. We can therefore separate the disturbance signal into causal and noncausal parts [3]:

[C(z^{−1}) / A(z^{−1})] w(t + d) = E(z^{−1}) w(t + d) + [F(z^{−1}) / A(z^{−1})] w(t),
where E and F are polynomials defined as

E(z^{−1}) = 1 + e1 z^{−1} + ... + e_{d−1} z^{−(d−1)},
F(z^{−1}) = f0 + f1 z^{−1} + ... + f_{n−1} z^{−(n−1)},

It will be shown in later sections how the minimum-variance design fulfills these requirements.
62.3 Optimal Prediction

The control objective of the feedback system shown in Figure 62.1 is to compensate for the effect of the disturbance w(t) at any time t. Because of the process time delay z^{−d}, the control input u(t) at time t influences the output y(t + d) at time t + d. This implies that the value of the output at time t + d is needed to calculate the control input u(t) at time t. Because the output is not known at the future time t + d, we predict its value using the process model given in Equation 62.1. Thus, we first develop a predictor in the next section.
and the following polynomial relationship should be satisfied:

A(z^{−1}) E(z^{−1}) + z^{−d} F(z^{−1}) = C(z^{−1}).  (62.17)
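The identity of Equation 62.17 can be solved exactly as the text later suggests, by equating coefficients of the powers of z^{−1}. A minimal sketch (assuming monic A and C of equal degree n, stored as ascending coefficient lists):

```python
# Solve A*E + z^-d * F = C (Equation 62.17) by equating coefficients.
# Polynomials are lists [p0, p1, ...] in ascending powers of z^-1;
# A and C are monic of degree n, E has degree d-1, F has degree n-1.

def solve_identity(A, C, d):
    n = len(A) - 1
    Cfull = C + [0.0] * (n + d - len(C) + 1)
    # match coefficients of z^0 ... z^-(d-1) to get E
    E = [0.0] * d
    for i in range(d):
        E[i] = Cfull[i] - sum(A[j] * E[i - j] for j in range(1, min(i, n) + 1))
    # remaining powers z^-d ... z^-(d+n-1) determine F
    AE = [sum(A[j] * E[i - j] for j in range(max(0, i - d + 1), min(i, n) + 1))
          for i in range(n + d)]
    F = [Cfull[d + j] - AE[d + j] for j in range(n)]
    return E, F

# First-order check: A = 1 + a z^-1, C = 1 + c z^-1, d = 2 gives
# E = 1 + (c - a) z^-1 and F = -a(c - a).
a, c = -0.5, 0.3
E, F = solve_identity([1.0, a], [1.0, c], d=2)
assert abs(E[1] - (c - a)) < 1e-12
assert abs(F[0] + a * (c - a)) < 1e-12
```

Because A is monic, the coefficients of E follow one at a time from the low powers, and F then absorbs whatever remains — which is exactly the "set of linear algebraic equations" solution mentioned in Section 62.4.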
Substituting for w ( t ) in Equation 62.13 using Equation 62.14 gives
which can be simplified to
62.3.1 The d-Step Ahead Predictor

Our objective is to predict the value of the output at time t + d, denoted ŷ(t + d | t), given all the information Y(t) = [u(t − 1), u(t − 2), ..., y(t), y(t − 1), ...] up to time t. We also want to predict the output so that the variance of the prediction error, e_p(t + d) = [y(t + d) − ŷ(t + d | t)], is minimized, i.e., find ŷ(t + d | t) so that
Replacing z^{−d} F(z^{−1}) using Equation 62.17,

y(t + d) = (BE/C) u(t) + (F/C) y(t) + E w(t + d).  (62.20)

Substituting this equation into the minimum prediction cost J(t + d) leads to
is minimized subject to the process dynamics given in Equation 62.1.
62.3.2 Predictor Solution

We first calculate the output at time t + d using the process model,
where the argument z^{−1} is deleted from the polynomials A, B, and C for convenience. The control input is known up to time t − 1. The disturbance input can be split into two sets of signals, namely, [w(t), w(t − 1), ...] and [w(t + d − 1), w(t + d − 2), ..., w(t + 1)]. The latter are future disturbances, and so their values are unknown. The former are past disturbances, and their values may be calculated from the known past inputs and outputs by
The last term on the right-hand side is zero because [w(t + d), w(t + d − 1), ..., w(t + 1)] are independent of [y(t), y(t − 1), ...] and E{w(t)} = 0. The variance of the error is therefore the sum of two positive terms. The second term represents the future disturbance input at t + d and can take any value. To minimize J(t + d), we make the first term zero to find the minimum variance error predictor,
This can be implemented in the time domain as a difference equation. For the predictor to be stable, the polynomials C and B should have all of their roots outside the unit circle. The implementation of the above optimal predictor requires solving the polynomial Equation 62.16 to find E(z^{−1}) and F(z^{−1}). There are a number of efficient techniques for solving this equation. A simple solution is to equate the coefficients of the different powers of z^{−1} and solve the resulting set of linear algebraic equations. The minimum variance of the error can also be calculated.

62.4 Minimum-Variance Control

Given the dynamical process model in Equation 62.1, the minimum-variance control problem can be stated as finding a control law which minimizes the variance of the output described in the performance index of Equation 62.11.

62.4.1 Minimum-Variance Control Law

Assuming that all of the outputs [y(t), y(t − 1), ...] and the past control inputs [u(t − 1), u(t − 2), ...] are known at time t, the problem is to determine u(t) so that the variance of the output is as small as possible. The control signal u(t) will affect only y(t + d), but not any earlier outputs. Use Equation 62.17 for the output, and substitute this equation into the minimum variance performance index. The expected value of the cross-product term is zero because [w(t + d), w(t + d − 1), ..., w(t + 1)] are assumed to be independent of [y(t), y(t − 1), ...]. The performance index is minimum if the first term is zero. The minimum variance control law is then given by

u(t) = −[F(z^{−1}) / (B(z^{−1}) E(z^{−1}))] y(t).  (62.28)

Note that the controller is stable only if B(z^{−1}) has all of its roots outside the unit circle. This implies that the process should be minimum phase for the closed-loop system to be stable.

62.4.2 Closed-Loop Stability

The closed-loop characteristic equation for the system of Figure 62.1 may be written from the process and controller transfer functions. Replacing for G and H from Equation 62.27 and using the polynomial identity of Equation 62.17 shows that the closed loop is asymptotically stable if the polynomials C and B have all of their roots outside the unit circle. This implies that the basic version of MVC is applicable to minimum-phase systems with disturbances that have a stable and rational power spectrum.

62.4.3 The Tracking Problem

The basic MVC regulates the variance of the output about the operating point. It is often necessary in process control to change the level of the operating point while the system is under closed-loop control. The control will then be required to minimize the variance of the error between the desired process set point and the actual output. Assuming the set point, r(t + d), is known, the error at time t + d is defined as

e(t + d) = r(t + d) − y(t + d).

The performance index to be minimized may now be written as the variance of this error. Following the same procedure as in the previous section, the control law may be derived accordingly. The feedback controller is similar to the basic MVC, and hence the stability properties are the same. The set point is introduced directly in the controller, and the variance of the error rather than the variance of the output signal is minimized.
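The variance reduction promised by the control law of Equation 62.28 can be checked by simulation. The sketch below uses a hypothetical first-order plant (assumed values, not the chapter's example): y(t) + a y(t−1) = b u(t−1) + e(t) + c e(t−1) with d = 1, for which Equation 62.17 gives E = 1, F = c − a, and hence u(t) = −((c − a)/b) y(t); the closed-loop output then reduces to e(t).

```python
# Closed-loop simulation of the basic minimum-variance law (62.28)
# for an assumed first-order plant with unit delay (d = 1).
import random

def simulate(a, b, c, N, control):
    random.seed(1)
    y_prev = e_prev = u_prev = 0.0
    ys = []
    for _ in range(N):
        e = random.gauss(0.0, 1.0)
        y = -a * y_prev + b * u_prev + e + c * e_prev   # plant
        u = -((c - a) / b) * y if control else 0.0      # MVC law (62.28)
        ys.append(y)
        y_prev, e_prev, u_prev = y, e, u
    return sum(v * v for v in ys) / N                   # output variance

a, b, c = -0.7, 1.0, 0.2
var_ol = simulate(a, b, c, 20000, control=False)        # open loop
var_cl = simulate(a, b, c, 20000, control=True)         # minimum variance
assert var_cl < var_ol            # variance is reduced by the controller
assert abs(var_cl - 1.0) < 0.1    # theoretical minimum is Ee^2 = 1
```

For general d, the closed-loop output becomes E(z^{−1})w(t), so the achievable variance grows with the delay, as Section 62.3 indicates.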
62.4.4 The Weighted Minimum Variance

The performance index given in Equation 62.10 does not penalize the control input u(t). This can lead to excessive input changes and hence actuator saturation. Moreover, the original form of MVC developed by Åström [1] stabilizes only minimum-phase systems. Clarke and Hastings-James [7] proposed that a weighting, R, of the manipulated variable be included in the performance index,
Minimizing this performance index with respect to the control input and subject to the process dynamics leads to the weighted MV control law,
u(t) = [C(z^{−1}) r(t + d) − F(z^{−1}) y(t)] / [B(z^{−1}) E(z^{−1}) + (R/b0) C(z^{−1})].  (62.35)

(In the generalized case treated below, the control law is chosen so that the d-step ahead prediction φ̂(t + d | t) of the auxiliary output is zero.)
The closed-loop characteristic equation for the weighted minimum variance is calculated from Equation 62.24 as
The closed-loop system is now asymptotically stable only if the roots of C are outside the unit circle. The system can now be nonminimum phase if an appropriate value of R is chosen to move the unstable zero outside the unit circle. The disadvantage is that the variance of the error is not directly minimized. It should be pointed out that MVC without the control input weighting can also be designed for nonminimum-phase systems, as shown by Åström and Wittenmark [3].
Note that by setting Pd = Qd = 1 and R = Pn = 0, the basic MVC is obtained. Qd may be used to influence the magnitude of the control input signal. In particular, choosing Qd = (1 − z^{−1}) Q̄d introduces integral action in the controller. The weighting R(z^{−1}) adjusts the speed of response of the controller and hence prevents actuator saturation. The closed-loop transfer function, derived by substituting for u(t) in Equation 62.1, is
Solving the characteristic equation,
62.4.5 The Generalized Minimum Variance Control
gives the closed-loop poles. The weightings Q and R may be selected so that the poles are located in a desired position in the complex plane.
Clarke and Gawthrop [5], [6] developed the Generalized Minimum-Variance Controller (GMVC) for self-tuning control applications by introducing the reference signal and auxiliary variables (weighting functions) into the performance index. GMVC minimizes the variance of an auxiliary output of the form,
62.5 State-Space Minimum-Variance Controller

The state-space description of a linear, time-invariant discrete-time system has the form,
where Q(z^{−1}) = Qn(z^{−1})/Qd(z^{−1}), R(z^{−1}) = Rn(z^{−1})/Rd(z^{−1}), and P(z^{−1}) = Pn(z^{−1})/Pd(z^{−1}) are stable weighting functions. The performance index minimized subject to the process dynamics is
The signals u(t) and r(t + d) are known at time t. The prediction of φ(t + d | t) will reduce to the prediction of Q y(t + d). Multiplying the process model by Q,
where x(t), u(t), and y(t) are state, input, and output vectors of size n, m, and r, respectively, and A, B, and C are system matrices of appropriate dimensions. Also, {w(t)} and {v(t)} are white noise sequences. The states and outputs of the system may be estimated using a Kalman filter, as discussed in other chapters of this handbook,
Replacing A, B, and C by Qd A, Qn B, and Qn C in Equation 62.23 leads to the predictor equation for GMVC,
and the polynomial identity,
where e(t) is the innovation signal with the property E[e(t) | t] = 0, and K(t) is the Kalman filter gain. The optimal d-step ahead predictor of x(t) is the conditional mean of x(t + d). From Equation 62.47,
Adding the contributions from u(t) and r(t) to Equation 62.41, the predictor for the auxiliary output φ(t) is
Taking the conditional expectation and using the property of the innovation signal,

ŷ(t + d | t) = C A^{d−1} x̂(t + 1)  (62.49)
             = C A^{d−1} { [A − K(t)C] x̂(t) + B u(t) + K(t) y(t) }.
For the performance index given by Equation 62.11, the optimum control input is achieved if the d-step prediction of the output signal is zero. Therefore, the state-space version of the minimum-variance controller takes the form,

u(t) = −[C A^{d−1} B]^{−1} C A^{d−1} { [A − K(t)C] x̂(t) + K(t) y(t) }.  (62.50)

Figure 62.3  The prediction of a sine wave.
It can be shown [4], [13] that this controller is equivalent to the minimum-variance regulator described by Equation 62.28. The multivariable formulation of the MVC algorithm in state-space form is similar to the one described above. The polynomial MVC for multi-input, multi-output systems is given in [9].
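The algebra behind Equation 62.50 can be verified directly: substituting the control back into the prediction of Equation 62.49 must give zero. A small sketch with hypothetical example matrices (d = 1, so A^{d−1} = I; the Kalman gain K is simply assumed, not computed from a Riccati equation):

```python
# Verify Equation 62.50: with
#   u(t) = -[C A^{d-1} B]^{-1} C A^{d-1} { (A - K C) xhat(t) + K y(t) },
# the prediction C A^{d-1} { (A - K C) xhat + B u + K y } is zero.

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# 2-state, 1-input, 1-output example (all values assumed).
A = [[0.9, 0.1], [0.0, 0.5]]
B = [1.0, 0.5]                 # input column
C = [1.0, 0.0]                 # output row
K = [0.3, 0.1]                 # assumed Kalman gain column
xhat = [0.7, -0.4]
y = 1.2

# w = (A - K C) xhat + K y
AKC = [[A[i][j] - K[i] * C[j] for j in range(2)] for i in range(2)]
w = [mat_vec(AKC, xhat)[i] + K[i] * y for i in range(2)]

CB = sum(C[j] * B[j] for j in range(2))        # scalar C A^{d-1} B
u = -sum(C[j] * w[j] for j in range(2)) / CB   # Equation 62.50 with d = 1

yhat = sum(C[j] * (w[j] + B[j] * u) for j in range(2))  # d-step prediction
assert abs(yhat) < 1e-12                        # prediction driven to zero
```

For d > 1 the same check holds with C replaced by C A^{d−1}, provided that scalar (or matrix, in the multivariable case) C A^{d−1} B is invertible.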
62.6 Example

As a simple example illustrating some properties of minimum-variance control, consider a plant described by the following input-output relationship:
Note that the delay of the control signal is 2. The open-loop response of the system is shown in Figure 62.2, where the variance of the output is 2.8. To illustrate the performance of the predictor, the polynomial identity is solved to give E(z^{−1}) = (1 − 0.01z^{−1}) and F(z^{−1}) = (0.049 − 0.0082z^{−1}). A sine wave, u(t) = sin(0.2t), is applied to the input, and the prediction of the output is shown in Figure 62.3. The theoretical minimum prediction error variance of 1 should be obtained, but the simulated error variance is calculated as 1.49.

The basic MV controller of Equation 62.28 is then applied to the system. The input and output are shown in Figure 62.4. The variance is now reduced to 1.26. To examine the set point tracking of the basic MVC, a reference of magnitude 10 is used. The results are shown in Figure 62.5. The tracking error is small, but the magnitude of the control signal is large, and this is not realistic in many practical applications. To reduce the size of the control input, WMV may be used. The result of introducing a weighting of R = 0.8 is shown in Figure 62.6. The set point is changed to 20 to illustrate the effect more clearly. The WMVC reduces the control activity, but it also increases the tracking error. To keep the control input within a realistic range and to obtain zero tracking error, the GMVC may be used. Selecting Qd = (1 − z^{−1}) introduces integral action in the controller. The solution to the identity polynomial may then be obtained as E(z^{−1}) = (1 − 0.01z^{−1}) and F(z^{−1}) = (1.06 − 1.84z^{−1} + 0.083z^{−2}). Note that the order of the controller has increased by one due to the integral action. The response of the system in Figure 62.7 shows the reduction in the tracking error and the control activity. If the plant description changes slightly,
the basic MVC produces an unstable closed-loop system due to an unstable zero. The GMVC will, however, stabilize the system, as shown in Figure 62.8.
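The E and F polynomials quoted in this example come from the standard minimum-variance Diophantine identity C(z^-1) = A(z^-1) E(z^-1) + z^-k F(z^-1), with deg E = k - 1 for a plant delay of k. Since the plant polynomials of the example are not reproduced in the text, the sketch below solves the identity for a hypothetical first-order plant; A, C, and k are illustrative values, not the example's actual data.

```python
def solve_identity(a, c, k):
    """Solve C(z^-1) = A(z^-1) E(z^-1) + z^-k F(z^-1) for E and F.

    a, c: coefficient lists in increasing powers of z^-1 (a[0] == 1).
    k: plant delay; E has k coefficients (degree k - 1).
    """
    # Long division of C by A: the first k quotient coefficients give E,
    # and the remainder (shifted by z^-k) gives F.
    rem = list(c) + [0.0] * max(0, len(a) + k - len(c))
    e = []
    for i in range(k):
        q = rem[i] / a[0]
        e.append(q)
        for j, aj in enumerate(a):
            rem[i + j] -= q * aj
    f = rem[k:]
    while len(f) > 1 and abs(f[-1]) < 1e-12:  # trim trailing zeros
        f.pop()
    return e, f

# Hypothetical plant: A = 1 - 0.9 z^-1, C = 1 + 0.5 z^-1, delay k = 2.
e, f = solve_identity([1.0, -0.9], [1.0, 0.5], 2)
```

For these illustrative polynomials the division yields E(z^-1) = 1 + 1.4 z^-1 and F(z^-1) = 1.26, and one can check the identity by multiplying back: A·E + z^-2·F reproduces C exactly.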
62.7 Defining Terms

62.7.1 Expectation Operator
If g(ξ) is a function of a random variable ξ described by the probability density p(ξ), the expectation operator of g is defined as

E[g(ξ)] = ∫ g(ξ) p(ξ) dξ.

Figure 62.2 The open-loop response.
The conditional expectation will be obtained if the probability density function p(ξ) is replaced by the conditional probability
Figure 62.4 Basic MVC. (a) output, (b) input.
Figure 62.5 Basic MVC with set point.
p(ξ|η), where η is another random variable:

E[g(ξ)|η] = ∫ g(ξ) p(ξ|η) dξ.

The conditional probability can be obtained from Bayes' equation,

p(ξ|η) = p(η|ξ) p(ξ) / p(η).
62.7.5 White Noise
A stochastic process v(t) is called white noise if its realizations in time are uncorrelated, i.e., the autocorrelation function E[v(t)v(t + δ)] has a nonzero value only for δ = 0. The process is called Gaussian white noise if, in addition to the above assumption, the probability density function is Gaussian at each instant.
62.7.2 Linear Quadratic Gaussian (LQG)
A dynamic optimization problem can be described by a linear model of the system (either continuous-time or discrete-time) with additive disturbances which are stochastic processes with Gaussian probability density functions. The performance index to be minimized is a finite- or infinite-horizon sum involving quadratic terms on the state of the system and/or on the control.
62.7.3 Minimum Phase
A system is minimum phase if the open-loop transfer function does not include right half-plane zeros (in the continuous-time case). In the discrete-time case, this corresponds to all zeros of the polynomial in z^-1 lying outside the unit circle. Practically, minimum-phase behavior of the system implies that the step response of the system starts in the "right direction": the right-hand side limit of the time derivative of the system's step response has the same sign as the steady-state gain.
62.7.4 Steady-State Offset
The difference between the set point (desired) value of the output and the actual steady-state value of the output is called steady-state offset. It is desirable to reduce the steady-state offset (to zero, if possible). It is well known that systems with integral action have zero steady-state offset.
References
[1] Astrom, K.J., Computer control of a paper machine - an application of linear stochastic control theory, IBM J. Res. Dev., 11, 389-404, 1967.
[2] Astrom, K.J., Introduction to Stochastic Control Theory, Academic, 1970.
[3] Astrom, K.J. and Wittenmark, B., Computer Controlled Systems, Prentice Hall, 1984.
[4] Blachuta, M. and Ordys, A.W., Optimal and asymptotically optimal linear regulators resulting from a one-stage performance index, Int. J. Syst. Sci., 18(7), 1377-1385, 1987.
[5] Clarke, D.W. and Gawthrop, P.J., Self tuning controller, Proc. IEEE, 122(9), 929-934, 1975.
[6] Clarke, D.W. and Gawthrop, P.J., Self tuning control, Proc. IEEE, 126(6), 1979.
[7] Clarke, D.W. and Hastings-James, R., Design of digital controllers for randomly disturbed systems, Proc. IEEE, 118, 1503-1506, 1971.
[8] Flunkert, H.U. and Unbehauen, H., A nonlinear adaptive minimum-variance controller with application to
THE CONTROL HANDBOOK
Figure 62.6 WMVC with set point.
Figure 62.7 GMVC.
Figure 62.8 Nonminimum-phase system.
a turbo-generator set, Proc. 32nd Control and Decision Conf., San Antonio, Texas, 1993, 1597-1601.
[9] Goodwin, G.C. and Sin, K.S., Adaptive Filtering, Prediction and Control, Prentice Hall Int., 1984.
[10] Grimble, M.J., Generalized minimum variance control law revisited, Optimal Control Appl. Methods, 9, 63-77, 1988.
[11] Isermann, R., Lachmann, K.-H., and Matko, D., Adaptive Control Systems, Prentice Hall, 1992.
[12] Kaminskas, V., Janickiene, D., and Vitkute, D., Self-tuning constrained control of a power plant, Proc. IFAC Control of Power Plants and Power Systems, Munich, Germany, 1992, 87-92.
[13] Warwick, K., Relationship between Astrom control and the Kalman linear regulator - Caines revisited, Optimal Control Appl. Methods, 11, 223-232, 1990.
Further Reading
The standard minimum-variance controller was introduced in [1], where a detailed derivation can be found. The weighted minimum variance and the generalized minimum variance are discussed in [7] and [6], respectively. More general discussion of MVC and its relations to the LQG problem can be found in the excellent books [2] and [9]. Also, the book [11] surveys all minimum-variance control techniques.
Dynamic Programming
P. R. Kumar
Department of Electrical and Computer Engineering and Coordinated Science Laboratory, University of Illinois, Urbana, IL
63.1 Introduction .......................................................... 1097
    Example: The Shortest Path Problem • The Dynamic Programming Method • Observations on the Dynamic Programming Method
63.2 Deterministic Systems with a Finite Horizon ....................... 1099
    Infinite Control Set U • Continuous-Time Systems
63.3 Stochastic Systems .................................................... 1100
    Countably Infinite State and Control Spaces • Stochastic Differential Systems
63.4 Infinite Horizon Stochastic Systems ................................. 1101
    The Discounted Cost Problem • The Average Cost Problem • Connections of Average Cost Problem with Discounted Cost Problems and Recurrence Conditions • Total Undiscounted Cost Criterion
References .................................................................... 1103
Further Reading ............................................................. 1104
63.1 Introduction
Dynamic programming is a recursive method for obtaining the optimal control as a function of the state in multistage systems. The procedure first determines the optimal control when there is only one stage left in the life of the system. Then it determines the optimal control when there are two stages left, etc. The recursion proceeds backward in time. This procedure can be generalized to continuous-time systems, stochastic systems, and infinite horizon control. The cost criterion can be a total cost over several stages, a discounted sum of costs, or the average cost over an infinite horizon. A simple example illustrates the main idea.
63.1.1 Example: The Shortest Path Problem
A bicyclist wishes to determine the shortest path from Bombay to the Indian East Coast. The journey can end at any one of the cities N3, C3, or S3. The journey is to be done in 3 stages. Stage zero is the starting stage. The bicyclist is in Bombay, labelled C0 in Figure 63.1. Three stages of travel remain. The bicyclist has to decide whether to go north, center, or south. If the bicyclist goes north, she reaches the city N1 after travelling 600 kms. If the bicyclist goes to the center, she reaches C1 after travelling 450 kms. If she goes south, she reaches S1 after travelling 500 kms. At stage one, she will therefore be in one of the cities N1, C1, or S1. From wherever she is, she has to decide which city from among N2, C2, or S2 she will travel to next. The distances between cities N1, C1, and S1 and N2, C2, S2 are shown in Figure 63.1. At stage two, she will be in one of the cities N2, C2, or S2, and she will then have to decide which of the cities N3, C3, or S3 to travel to. The journey ends in stage three with the bicyclist in one of the cities N3, C3, or S3.

Figure 63.1 A shortest path problem.

© 1996 by CRC Press, Inc.
63.1.2 The Dynamic Programming Method
We first determine the optimal decision when there is only one stage remaining. If the bicyclist is in N2, she has three choices: north, center, or south, leading respectively to N3, C3, or S3. The corresponding distances are 450, 500, and 650 kms. The best choice is to go
north, and the shortest distance from N2 to the East Coast is 450 kms. Similarly, if the bicyclist is in C2, the best choice is to go center, and the shortest distance from C2 to the East Coast is 550 kms. From S2, the best choice is to go south, and the shortest distance to the East Coast is 400 kms. We summarize the optimal decisions and the optimal costs, when only one stage remains, in Figure 63.2.
Figure 63.2 Optimal solution when one stage remains.

Figure 63.3 Making a decision in city N1.

Figure 63.4 Optimal solution when two stages remain.
Now we determine the optimal decision when two stages remain. Suppose the bicyclist is in N1. If she goes north, she will reach N2 after cycling 600 kms. Moreover, from N2, she will have to travel a further 450 kms to reach the East Coast, as seen from Figure 63.2. From N1, if she goes north, she will therefore have to travel a total of 1050 kms to reach the East Coast. If instead she goes to the center from N1, then she will have to travel 650 kms to reach C2, and from C2 she will have to travel a further 550 kms to reach the East Coast. Thus, she will have to travel a total of 1200 kms to reach the East Coast. The only remaining choice from N1 is to go south. Then she will travel 750 kms to reach S2, and from there she has a further 400 kms to reach the East Coast. Thus, she will have to travel 1150 kms to reach the East Coast. The consequences of each of these three choices are shown in Figure 63.3. Thus the optimal decision from N1 is to go north and travel to N2. Moreover, the shortest distance from N1 to the East Coast is 1050 kms. Similarly, we determine the optimal paths from C1 and S1, as well as the shortest distances from them to the East Coast. The optimal solution, when two stages remain, is shown in Figure 63.4. Now we determine the optimal decision when there are three stages remaining. The bicyclist is in C0. If she goes north, she travels 600 kms to reach N1, and then a further 1050 kms is the minimum needed to reach the East Coast. Thus the total distance she will have to travel is 1650 kms. Similarly, if she goes to the center, she will travel a total distance of 1450 kms. On the other hand, if she goes
south, she will travel a total distance of 1400 kms to reach the East Coast. Thus the best choice from Co is to travel south to S1, and the shortest distance from Co to the East Coast is 1400 kms. This final result is shown in Figure 63.5.
Figure 63.5 Optimal solution when three stages remain.
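The backward sweep just carried out takes only a few lines of code. One caveat: the stage-one distances from C1 and S1, and the unused legs from C2 and S2, appear only in Figure 63.1, which is not reproduced here; the entries marked "assumed" below are hypothetical values chosen to be consistent with the totals quoted in the text.

```python
# Backward dynamic programming for the bicyclist's shortest path.
dist = {
    "Co": {"N1": 600, "C1": 450, "S1": 500},
    "N1": {"N2": 600, "C2": 650, "S2": 750},
    "C1": {"N2": 600, "C2": 450, "S2": 700},   # assumed (only in Figure 63.1)
    "S1": {"N2": 800, "C2": 600, "S2": 500},   # assumed (only in Figure 63.1)
    "N2": {"N3": 450, "C3": 500, "S3": 650},
    "C2": {"N3": 650, "C3": 550, "S3": 700},   # only C2 -> C3 = 550 is quoted
    "S2": {"N3": 700, "C3": 600, "S3": 400},   # only S2 -> S3 = 400 is quoted
}

V = {city: 0 for city in ("N3", "C3", "S3")}   # cost-to-go is 0 at the coast
policy = {}
# Sweep backward: one stage remaining, then two, then three.
for stage in (("N2", "C2", "S2"), ("N1", "C1", "S1"), ("Co",)):
    for city in stage:
        nxt = min(dist[city], key=lambda n: dist[city][n] + V[n])
        policy[city] = nxt
        V[city] = dist[city][nxt] + V[nxt]

print(V["Co"], policy["Co"])   # -> 1400 S1, matching the text
```

Stage by stage, the sweep reproduces Figures 63.2, 63.4, and 63.5: V is 450/550/400 at the stage-two cities, 1050/1000/900 at the stage-one cities, and 1400 at C0, with the optimal first decision being south.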
63.1.3 Observations on the Dynamic Programming Method
We observe a number of important properties.
1. In order to solve the shortest path from just one city C0 to the East Coast, we actually solved the shortest path from all of the cities C0, N1, C1, S1, N2, C2, S2 to the East Coast.
2. Hence, dynamic programming can be computationally intractable if there are many stages and many possible decisions at each stage.
3. Dynamic programming determines the optimal decision as a function of the state and the number of stages remaining. Thus, it gives an optimal closed-loop control policy.
4. Dynamic programming proceeds backward in time. After determining the optimal solution when there are s stages remaining, it determines the optimal solution when there are (s + 1) stages remaining.
5. Fundamental use is made of the following relationship in obtaining the backward recursion:

Optimal distance from state x to the end = min over decisions d_i [ distance travelled on the current stage when decision d_i is made from state x + optimal distance from the state x' to the end, where x' is the state to which decision d_i takes you ].

Hence, if one passes through state x' from x, an optimal continuation is to take the shortest path from state x' to the end. This can be summarized by saying "segments of optimal paths are optimal in themselves." This is called the Principle of Optimality.

63.2 Deterministic Systems with a Finite Horizon
We can generalize the solution method to deterministic, time-varying control systems. Consider a system whose state at time t is denoted by x(t) ∈ X. If a control input u(t) ∈ U is chosen at time t, then the next state x(t + 1) is determined as

x(t + 1) = f(x(t), u(t), t).

The system starts in some state x(t0) = x0 at time t0. For simplicity, suppose that there are only a finite number of possible control actions, i.e., U is finite. The cost incurred by the choice of control input u(t) at time t is c(x(t), u(t), t). We have a finite time horizon T. The goal is to determine the controls to minimize the cost function

Σ_{t=t0}^{T-1} c(x(t), u(t), t) + d(x(T)).   (63.1)

THEOREM 63.1 Deterministic system, finite horizon, finite control set. Define

V(x, T) := d(x),   (63.2)

and

V(x, t - 1) := min_{u∈U} { c(x, u, t - 1) + V(f(x, u, t - 1), t) }.   (63.3)

Let u*(x, t - 1) be a value of u which minimizes the right-hand side (RHS) above. Then,
1. V(x, t) = min_{u(·)} Σ_{n=t}^{T-1} c[x(n), u(n), n] + d[x(T)], when x(t) = x, that is, it is the optimal cost-to-go from state x at time t to the end.
2. u*(x, t) is an optimal control when one is in state x at time t.

PROOF 63.1 Define V(x, t) as the optimal cost-to-go from state x at time t to the end. V(x, T) = d(x), because there is no decision to make when one is at the end. Now we obtain the backward recursion of dynamic programming. Suppose we are in state x at time t - 1. If we apply control input u, we will incur an immediate cost c(x, u, t - 1), and move to the state f(x, u, t - 1), from which the optimal cost-to-go is V(f(x, u, t - 1), t). Because some action u has to be taken at time t - 1,

V(x, t - 1) ≤ min_{u∈U} { c(x, u, t - 1) + V(f(x, u, t - 1), t) }.

Moreover, if V(x, t - 1) were strictly smaller than the RHS above, then one would obtain a contradiction to the minimality of V(·, t). Hence equality holds, and (63.3) is satisfied. By evaluating the cost of the control policy given by u*(x, t), one can verify that it gives the minimal cost V(x0, t0).

We have proven the Principle of Optimality. We have also proven that, for the cost criterion Equation 63.1, there is an optimal control policy that takes actions based only on the value of the current state and current time and does not depend on past states or actions. Such a control policy is called a Markovian policy.

63.2.1 Infinite Control Set U
When the control set U is infinite, the RHS in Equation 63.3 may not have a minimizing u. Thus the recursion requires an "inf" in place of the "min." There may be no optimal control law, only nearly optimal ones.

63.2.2 Continuous-Time Systems
Suppose one has a continuous-time system

dx(t)/dt = f(x(t), u(t), t),   (63.4)

and the cost criterion is

∫_{t0}^{T} c(x(t), u(t), t) dt + d(x(T)).

Several technical issues arise. The differential Equation 63.4 may not have a solution for a given u(t), or it may have a solution but only until some finite time. Ignoring such problems, we examine what type of equations replace the discrete-time recursions. Clearly, V(x, T) = d(x). Considering a control input u which is held constant over an interval [t, t + Δ), and approximating the resulting state (all possible only under more assumptions),
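The recursion of Equations 63.2 and 63.3 is straightforward to implement. A minimal sketch on a toy problem follows; the system, costs, and horizon are illustrative inventions, not data from the text, and a brute-force search over all control sequences is used afterward to confirm that the recursion returns the optimal cost.

```python
from itertools import product

def finite_horizon_dp(states, controls, f, c, d, t0, T):
    """Backward recursion of Equations 63.2-63.3:
    V(x, T) = d(x);  V(x, t-1) = min_u { c(x,u,t-1) + V(f(x,u,t-1), t) }."""
    V = {(x, T): d(x) for x in states}
    u_star = {}
    for t in range(T, t0, -1):
        for x in states:
            best = min(controls,
                       key=lambda u: c(x, u, t - 1) + V[(f(x, u, t - 1), t)])
            u_star[(x, t - 1)] = best
            V[(x, t - 1)] = c(x, best, t - 1) + V[(f(x, best, t - 1), t)]
    return V, u_star

# Toy time-invariant example (hypothetical data): drive an integrator
# x(t+1) = clip(x + u) toward zero under a quadratic stage cost.
states = list(range(-2, 3))
controls = (-1, 0, 1)
clip = lambda y: max(-2, min(2, y))
f = lambda x, u, t: clip(x + u)
c = lambda x, u, t: x * x + u * u
d = lambda x: x * x
V, u_star = finite_horizon_dp(states, controls, f, c, d, 0, 3)
```

Note that the recursion visits every state at every time, which is exactly Observation 1 above: solving the problem for one initial state solves it for all of them.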
V(x, t) = inf_{u∈U} { c(x, u, t)Δ + V(x + f(x, u, t)Δ, t + Δ) + o(Δ) }
        = inf_{u∈U} { c(x, u, t)Δ + V(x, t) + (∂V/∂t)(x, t)Δ + (∂V/∂x)(x, t) f(x, u, t)Δ + o(Δ) }.

Thus, one looks for the partial differential equation

(∂V/∂t)(x, t) + inf_{u∈U} { c(x, u, t) + (∂V/∂x)(x, t) f(x, u, t) } = 0,  V(x, T) = d(x),   (63.5)

to be satisfied by the optimal cost-to-go. This is called the Hamilton-Jacobi-Bellman equation. To prove that the optimal cost-to-go satisfies such an equation requires many technical assumptions. However, if a smooth V exists that satisfies such an equation, then it is a lower bound on the cost, at least for Markovian control laws. To see this, consider any control law u(x, t), and let x(t) be the trajectory resulting from an initial condition x(t0) = x0 at time t0. Then, (d/dt)[V(x(t), t)] + c(x(t), u(x(t), t), t) ≥ 0, and so,

V(x(t0), t0) ≤ V(x(T), T) + ∫_{t0}^{T} c(x(t), u(t), t) dt = d(x(T)) + ∫_{t0}^{T} c(x(t), u(t), t) dt.

Moreover, if u*(x, t) is a control policy which attains the "inf" in Equation 63.5, then it attains equality above and is, hence, optimal. In many problems, the optimal cost-to-go V(x, t) may not be sufficiently smooth, and one resorts to the Pontryagin Minimum Principle rather than the HJB Equation 63.5.

63.3 Stochastic Systems
Consider the simplest case where time is discrete, and both the state-space X and the control set U are finite. Suppose that the system is time invariant and evolves probabilistically as a controlled Markov chain, i.e.,

Prob[x(t + 1) = j | x(t) = i, u(t) = u] = p_ij(u).   (63.7)

The elements of the matrix P(u) = [p_ij(u)] are the transition probabilities. Given a starting state x(t0) = x0 at time t0, one wishes to minimize the expected cost

E { Σ_{t=t0}^{T-1} c[x(t), u(t), t] + d(x(T)) }.

A nonanticipative control policy γ is a mapping from past information to control inputs, i.e., γ = (γ_t0, γ_{t0+1}, ..., γ_{T-1}), where γ_t : (x(t0), u(t0), x(t0 + 1), u(t0 + 1), ..., x(t)) ↦ u(t). Define the conditionally expected cost-to-go when policy γ is applied,

V_t^γ := E { Σ_{n=t}^{T-1} c[x(n), u(n), n] + d(x(T)) | x(t0), u(t0), ..., x(t) }.

Analogously to the deterministic case, set Equation 63.2, and recursively define

V(x, t - 1) = min_{u∈U} [ c(x, u, t - 1) + Σ_j p_xj(u) V(j, t) ].   (63.8)

Now let us compare V_t^γ with V(x(t), t) when policy γ is used. Clearly V_T^γ = V(x(T), T) = d(x(T)) a.s. Now suppose, by induction, that V_t^γ ≥ V(x(t), t) a.s. Then

V_{t-1}^γ = c(x(t-1), u(t-1), t-1) + E { Σ_{n=t}^{T-1} c[x(n), u(n), n] + d(x(T)) | x(t0), u(t0), ..., x(t-1) }
= c(x(t-1), u(t-1), t-1) + E { E [ Σ_{n=t}^{T-1} c[x(n), u(n), n] + d(x(T)) | x(t0), u(t0), ..., x(t) ] | x(t0), u(t0), ..., x(t-1) }
= c(x(t-1), u(t-1), t-1) + E [ V_t^γ | x(t0), u(t0), ..., x(t-1) ]
≥ c(x(t-1), u(t-1), t-1) + E [ V(x(t), t) | x(t0), u(t0), ..., x(t-1) ]
= c(x(t-1), u(t-1), t-1) + Σ_j p_{x(t-1), j}[u(t-1)] V(j, t)
≥ V(x(t-1), t-1).   (63.9)

Thus V(x, t) is a lower bound on the expected cost of any nonanticipative control policy. Suppose, moreover, that u*(x, t) attains the minimum on the RHS of Equation 63.8 above. Consider the Markovian control policy γ* = (γ*_t0, γ*_{t0+1}, ..., γ*_{T-1}) with

γ*_t(x(t0), u(t0), ..., x(t)) = u*(x(t), t).   (63.10)

For γ*, it is easy to verify that the inequalities in Equation 63.9 are equalities, and so it is optimal.
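Equation 63.8 differs from the deterministic recursion only in that the successor value is averaged under the transition probabilities. A sketch on a two-state, two-control chain follows; all of the transition probabilities, costs, and the horizon are hypothetical numbers chosen only to exercise the recursion.

```python
# Finite-horizon DP for a controlled Markov chain, Equation 63.8.
P = {0: [[0.9, 0.1], [0.4, 0.6]],    # P[u][i][j] = Prob(x(t+1)=j | x(t)=i, u(t)=u)
     1: [[0.2, 0.8], [0.7, 0.3]]}
c = {0: [1.0, 3.0], 1: [2.0, 0.5]}   # c[u][i]: one-stage cost in state i under u
d = [0.0, 5.0]                       # terminal cost d(x)
T = 3
states, controls = (0, 1), (0, 1)

V = {(x, T): d[x] for x in states}   # Equation 63.2
u_star = {}
for t in range(T, 0, -1):            # backward recursion, Equation 63.8
    for x in states:
        q = {u: c[u][x] + sum(P[u][x][j] * V[(j, t)] for j in states)
             for u in controls}
        u_star[(x, t - 1)] = min(q, key=q.get)
        V[(x, t - 1)] = q[u_star[(x, t - 1)]]
```

Enumerating all Markovian policies and computing their exact expected costs confirms that V(x0, t0) is the optimal cost, as Theorem 63.2 asserts.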
THEOREM 63.2 Stochastic finite state, finite control system, finite horizon. Recursively define V(x, t) from Equations 63.2 and 63.8. Let u*(x, t - 1) attain the minimum on the RHS of (63.8), and consider the Markovian control policy γ* defined in Equation 63.10. Then,
(i) V(x0, t0) is the optimal cost.
(ii) The Markovian control policy γ* is optimal over the class of all nonanticipative control policies.
63.3.1 Countably Infinite State and Control Spaces
The result (i) can be extended to countably infinite state and control spaces by replacing the "min" above with an "inf". However, the "inf" need not be attained, and an optimal control policy may not exist. If one considers uncountably infinite state and control spaces, then further highly technical issues arise. A policy needs to be a measurable map. Moreover, V(x, t) will also need to be a measurable function because one must take its expected value. One must impose appropriate conditions to insure that the "inf" over an uncountable set still gives an appropriately measurable function, and further that one can synthesize a minimizing u(x, t) that is measurable.

63.3.2 Stochastic Differential Systems
Consider a system described by a stochastic differential equation,

dx(t) = f(x(t), u(t), t) dt + g(x(t), u(t), t) dw(t),

where w is a standard Brownian motion, with a cost criterion

E { ∫_{t0}^{T} c(x(t), u(t), t) dt + d(x(T)) }.

If V(x, t) denotes the optimal cost-to-go from a starting state x when there are t time units remaining, one expects from Ito's differentiation rule that it satisfies the stochastic version of the Hamilton-Jacobi-Bellman equation:

(∂V/∂t)(x, t) + inf_{u∈U} { c(x, u, t) + (∂V/∂x)(x, t) f(x, u, t) + (1/2) g^2(x, u, t) (∂^2V/∂x^2)(x, t) } = 0.

The existence of a solution to such partial differential equations is studied using the notion of viscosity solutions.

63.4 Infinite Horizon Stochastic Systems
We will consider finite state, finite control, controlled Markov chains, as in Equation 63.7. We study the infinite horizon case.

63.4.1 The Discounted Cost Problem
Consider an infinite horizon total discounted cost of the form

E Σ_{t=0}^{∞} β^t c[x(t), u(t)],

where 0 < β < 1 is a discount factor. The discounting guarantees that the summation is finite. We are assuming that the one-stage cost function c does not change with time. Let W(x, ∞) denote the optimal value of this cost, starting in state x. The "∞" denotes that there is an infinity of stages remaining. Let W(x, N) denote the optimal value of the finite horizon cost E Σ_{t=0}^{N-1} β^t c[x(t), u(t)], when starting in state x at time 0. Note that the index N refers to the number of remaining stages. From the finite horizon case, we see that

W(x, N) = min_{u∈U} [ c(x, u) + β Σ_j p_xj(u) W(j, N - 1) ].   (63.12)

Note the presence of the factor β multiplying W(j, N - 1) above. Denote by R^X the class of all real valued functions of the state, i.e., V ∈ R^X is a function of the form V : X → R. Define an operator T : R^X → R^X by its action on V as

TV(x) = min_{u∈U} [ c(x, u) + β Σ_j p_xj(u) V(j) ],  for all x ∈ X.   (63.13)

Using the operator T, one can rewrite Equation 63.12 as W(·, N) = TW(·, N - 1). Also, W(·, 0) = O, where O is the identically zero function, i.e., O(x) = 0 for all x. Note that T is a monotone operator, i.e., if V(x) ≤ V'(x) for all x, then TV(x) ≤ TV'(x) for all x. Let us suppose now that c̄ ≥ c(x, u) ≥ 0 for all x, u. (This can be achieved by simply adding a large enough constant to each original one-step cost.) Due to this assumption, TO ≥ O. Hence, by monotonicity, T^(N) O ≥ T^(N-1) O ≥ ... ≥ TO ≥ O. Thus T^(N) O(x) converges, for every x, to a finite number (since T^(N) O ≤ c̄/(1 - β)). Let

W(x) := lim_{N→∞} T^(N) O(x).

Now W(x, ∞) ≥ W(x, N) for all N, because c(x, u) ≥ 0. Hence W(x, ∞) ≥ lim_{N→∞} W(x, N) = W(x). Moreover, W(x, ∞) ≤ W(x, N) + c̄ β^N / (1 - β), because one can employ an optimal policy for an N-stage problem. Thus W(x, ∞) ≤ lim_{N→∞} [ W(x, N) + c̄ β^N / (1 - β) ] = W(x). Hence W(x) = W(x, ∞), and is, therefore, the optimal infinite horizon cost. Hence, we would like to characterize W(x). Since T is continuous, TW(x) = lim_{N→∞} T^(N+1) O(x) = W(x). Hence W is a fixed point of T, i.e.,

W(x) = min_{u∈U} [ c(x, u) + β Σ_j p_xj(u) W(j) ].   (63.14)

Simple calculations show that T is a contraction with respect to the ‖·‖∞ norm, i.e., max_x |TV(x) - TV'(x)| ≤ β max_x |V(x) - V'(x)|. Thus T has a unique fixed point. Since T is a contraction, we also know that lim_{N→∞} T^(N) V = W for any V. Suppose now that u*(x) attains the minimum in Equation 63.14 above. Consider the policy γ* which always chooses control u*(x) whenever the state is x. Such a control policy is called stationary. If one applies the stationary control policy γ*, then the expected cost over N stages, starting in state x, is T_γ*^(N) O(x). Thus the infinite horizon cost of γ* is lim_{N→∞} T_γ*^(N) O = W.
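The construction W = lim T^N O can be carried out numerically by applying T repeatedly, starting from the zero function. A sketch on a hypothetical two-state chain follows (the transition probabilities, costs, and discount factor are made-up illustration data).

```python
# Iterate the operator T of Equation 63.13 from the zero function O;
# the iterates T^N O converge to the fixed point W of Equation 63.14.
beta = 0.9
P = {0: [[0.9, 0.1], [0.4, 0.6]],   # hypothetical transition probabilities
     1: [[0.2, 0.8], [0.7, 0.3]]}
c = {0: [1.0, 3.0], 1: [2.0, 0.5]}  # hypothetical one-stage costs c[u][x]

def T(V):
    """One application of the dynamic programming operator T."""
    return [min(c[u][x] + beta * sum(P[u][x][j] * V[j] for j in range(2))
                for u in (0, 1))
            for x in range(2)]

W = [0.0, 0.0]                 # the identically zero function O
for _ in range(500):           # T is a beta-contraction, so this converges
    W = T(W)
```

After enough iterations W is numerically a fixed point, W = TW, and one can also check the contraction inequality sup|TV - TV'| ≤ β sup|V - V'| directly on arbitrary pairs of functions.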
THEOREM 63.3 Stochastic finite state, finite control system, discounted cost criterion. Let T : R^X → R^X denote the operator (Equation 63.13).
1. T is a contraction.
2. Let W = lim_{N→∞} T^(N) V for any V. Then W is the unique solution of Equation 63.14.
3. W(x) is the optimal cost when starting in state x.
4. Let γ*(x) be the value of u which attains the minimum on the RHS in Equation 63.14. Then γ* is a stationary control policy which is optimal in the class of all nonanticipative policies.

The procedure for determining W(x) as lim_{N→∞} T^(N) O is called value iteration. Another procedure, which determines the optimal policy in a finite number of steps, is policy iteration. For a stationary policy γ, define T_γ : R^X → R^X by

T_γ V(x) = c(x, γ(x)) + β Σ_j p_xj(γ(x)) V(j).

Policy Iteration Procedure
1. Let γ0 be any stationary policy.
2. Solve the linear system of equations T_γ0 W_γ0 = W_γ0 to determine its cost W_γ0.
3. If TW_γ0 ≠ W_γ0, then let γ1(x) be the value of u which attains the minimum in

min_{u∈U} [ c(x, u) + β Σ_j p_xj(u) W_γ0(j) ].

4. Then γ1 is a strict improvement of γ0 (since W_γ1 = lim_{N→∞} T_γ1^(N) W_γ0 ≤ W_γ0, with strict inequality somewhere).
5. By repeating this procedure, one obtains a sequence of strictly improving policies, that must terminate in a finite number of steps, because the total number of stationary policies is finite.

63.4.2 The Average Cost Problem
We consider the average cost per unit time over an infinite horizon,

lim sup_{N→∞} (1/N) E Σ_{t=0}^{N-1} c[x(t), u(t)].

Then the dynamic programming equation needs to be modified slightly.

THEOREM 63.4 Suppose a constant J* exists, and a function V : X → R exists, so that

J* + V(x) = min_{u∈U} [ c(x, u) + Σ_j p_xj(u) V(j) ].   (63.15)

1. Then J* is the optimal value of the average cost criterion, starting from any state x.
2. Let γ*(x) be a value of u which minimizes the RHS in Equation 63.15 above. Then γ* is a stationary policy which is optimal in the class of all nonanticipative policies.

PROOF 63.2 Consider any nonanticipative policy. Then

J* + V(x(t)) ≤ c(x(t), u(t)) + Σ_j p_{x(t), j}(u(t)) V(j).

Noting that E Σ_j p_{x(t), j}(u(t)) V(j) = E{V(x(t + 1))}, we see that

lim sup_{N→∞} (1/N) E Σ_{t=0}^{N-1} c(x(t), u(t)) ≥ J* + lim sup_{N→∞} [ V(x(0))/N - E{V(x(N))}/N ] = J*.

Thus J* is a lower bound on the average cost, starting from any state x. Moreover, if γ* is the policy under consideration, then equality holds above, and so it is indeed optimal.

The question, still a topic of active research, is when does a solution [J*, V(·)] exist for Equation 63.15? Let us consider the following simplifying assumption. For every stationary policy γ, the Markov chain P_γ = [p_ij(γ(i))] is irreducible. By this is meant that there exists a unique steady state distribution π_γ, satisfying π_γ P_γ = π_γ, Σ_i π_γ(i) = 1, which is strictly positive, i.e., π_γ(i) > 0 for all i. Then the average cost J_γ starting from any state is a constant and satisfies J_γ = Σ_i π_γ(i) c[i, γ(i)] = π_γ c_γ (where c_γ := [c_γ(1), ..., c_γ(|X|)]^T). Hence, if e = (1, ..., 1)^T, then π_γ(J_γ e - c_γ) = 0. Hence (J_γ e - c_γ) is orthogonal to the null space of (P_γ - I)^T, and is, therefore, in the range space of (P_γ - I). Thus a V_γ exists so that J_γ e - c_γ = (P_γ - I) V_γ, which simply means that

J_γ + V_γ(i) = c(i, γ(i)) + Σ_j p_ij(γ(i)) V_γ(j), for all i.   (63.16)

Note that V_γ(·) is not unique, because V_γ(·) + a is also a solution for any a. Let us therefore fix V_γ(ī) = 0 for some state ī. One can obtain a policy iteration algorithm, as well as prove the existence of J* and V(·), as shown below.

Policy Iteration Procedure
1. Let γ0 be any stationary policy.
2. Solve Equation 63.16, with γ replaced by γ0, to obtain (J_γ0, V_γ0). If (J_γ0, V_γ0) does not satisfy Equation 63.15, then let γ1(x) attain the minimum in

min_{u∈U} [ c(x, u) + Σ_j p_xj(u) V_γ0(j) ].

3. Then γ1 is a strict improvement of γ0. (This follows because π_γ1(i) > 0 for all i.)
4. Because the policy space is finite, this procedure terminates in a finite number of steps. At termination, Equation 63.15 is satisfied.

63.4.3 Connections of Average Cost Problem with Discounted Cost Problems and Recurrence Conditions
The average cost problem can be regarded as a limit of discounted cost problems when β ↗ 1. We illustrate this for systems with countable state space and finite control set.

THEOREM 63.5 Connection between discounted and average cost. Let W_β(x) denote the optimal discounted cost E Σ_{t=0}^{∞} β^t c(x(t), u(t)) when starting in the state x. Suppose that |W_β(x) - W_β(x')| ≤ M for all x, x', and all β ∈ (1 - ε, 1), for some ε > 0. For an arbitrary state x̄ ∈ X, let β_n ↗ 1 be a subsequence so that the following limits exist:

lim_{n→∞} (1 - β_n) W_βn(x) =: J*,  and  lim_{n→∞} (W_βn(x) - W_βn(x̄)) =: V(x).

Then,
1. (J*, V(·)) is a solution of the average cost Equation 63.15.
2. If a stationary policy γ* is optimal for a sequence of discount factors β_n with β_n ↗ 1, then γ* is optimal for the average cost problem.

PROOF 63.3 The dynamic programming Equation 63.14 for the discounted cost problem can be rewritten as,

(1 - β) W_β(x̄) + (W_β(x) - W_β(x̄)) = min_{u∈U} [ c(x, u) + β Σ_j p_xj(u) (W_β(j) - W_β(x̄)) ].

Taking limits along β_n ↗ 1 yields the results.

The existence of (J*, V(·)) satisfying the average cost dynamic programming Equation 63.15 is guaranteed under certain uniform recurrence conditions on the controlled Markov chain.

THEOREM 63.6 Uniformly bounded mean first passage times. Let τ denote the first time after time 1 that the system enters some fixed state x̄, i.e., τ := min{ t ≥ 1 : x(t) = x̄ }. Suppose that the mean first passage times are uniformly bounded, i.e.,

E(τ | x(0) = x and γ is used) ≤ M < +∞,

for all states x and all stationary policies γ. Then a solution (J*, V(·)) exists to Equation 63.15.

PROOF 63.4 Under a stationary policy γ_β which is optimal for the discount factor β, W_β(x) ≤ c̄M + E(β^τ) W_β(x̄) ≤ c̄M + W_β(x̄), where c̄ := max_{x,u} c(x, u). Moreover, by Jensen's inequality, E(β^τ) ≥ β^{E(τ)} ≥ β^M, so that W_β(x) ≥ β^M W_β(x̄) = W_β(x̄) + W_β(x̄)(β^M - 1) ≥ W_β(x̄) - c̄M, since W_β(x̄) ≤ c̄/(1 - β) and 1 - β^M ≤ M(1 - β). Hence -c̄M ≤ W_β(x) - W_β(x̄) ≤ c̄M, and the result follows from the preceding Theorem.

63.4.4 Total Undiscounted Cost Criterion
Consider a total infinite horizon cost criterion of the form

E Σ_{t=0}^{∞} c[x(t), u(t)].

In order for the infinite summation to exist, one often assumes that either

c(x, u) ≥ 0 for all x, u (the positive cost case),

or

c(x, u) ≤ 0 for all x, u (the negative cost case).

These two cases are rather different. In both cases, one exploits the monotonicity of the operator T.
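The policy iteration idea can be sketched in a few lines for the discounted problem. The chain and costs below are hypothetical, and policy evaluation here finds the fixed point W_γ = T_γ W_γ by iterating the contraction T_γ, which is equivalent to solving the linear system of step 2 of the procedure.

```python
# Policy iteration for a hypothetical two-state discounted problem.
beta = 0.9
P = {0: [[0.9, 0.1], [0.4, 0.6]],
     1: [[0.2, 0.8], [0.7, 0.3]]}
c = {0: [1.0, 3.0], 1: [2.0, 0.5]}
states, controls = range(2), (0, 1)

def evaluate(pol):
    """Cost of a stationary policy: fixed point of T_pol, by iteration."""
    W = [0.0, 0.0]
    for _ in range(500):
        W = [c[pol[x]][x] + beta * sum(P[pol[x]][x][j] * W[j] for j in states)
             for x in states]
    return W

pol = {x: 0 for x in states}           # step 1: start from any stationary policy
while True:
    W = evaluate(pol)                  # step 2: policy evaluation
    improved = {x: min(controls,
                       key=lambda u: c[u][x]
                       + beta * sum(P[u][x][j] * W[j] for j in states))
                for x in states}       # step 3: policy improvement
    if improved == pol:                # steps 4-5: stop at a fixed policy
        break
    pol = improved
```

At termination the policy is greedy with respect to its own cost, so that cost satisfies the Bellman Equation 63.14, and the policy is therefore optimal among stationary policies.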
References
[1] Blackwell, D., Discounted dynamic programming. Ann. Math. Statist., 36, 226-235, 1965.
[2] Strauch, R., Negative dynamic programming. Ann. Math. Statist., 37, 871-890, 1966.
[3] Blackwell, D., Positive dynamic programming. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 415-418, 1965.
[4] Blackwell, D., On stationary policies. J. Roy. Statist. Soc. Ser. A, 133, 33-38, 1970.
[5] Ornstein, D., On the existence of stationary optimal strategies. Proc. Amer. Math. Soc., 20, 563-569, 1969.
[6] Blackwell, D., Discrete dynamic programming. Ann. Math. Statist., 33, 719-726, 1962.
[7] Hernandez-Lerma, O. and Lasserre, J.B., Weak conditions for average optimality in Markov control processes. Syst. Contr. Lett., 22, 287-291, 1994.
[8] Bertsekas, D.P. and Shreve, S.E., Stochastic Optimal Control: The Discrete Time Case, Academic, New York, 1978.
[9] Lions, P.L., Optimal control of diffusion processes and Hamilton-Jacobi equations, Part I: the dynamic programming principle and applications. Comm. Partial Differential Equations, 10, 1101-1174, 1983.
[10] Crandall, M., Ishii, H., and Lions, P.L., User's guide to viscosity solutions of second order partial differential equations. Bull. Amer. Math. Soc., 27, 1-67, 1992.
[11] Bellman, R., Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
[12] Bertsekas, D.P., Dynamic Programming: Deterministic and Stochastic Models. Prentice Hall, Englewood Cliffs, NJ, 1987.
[13] Kumar, P.R. and Varaiya, P.P., Stochastic Systems: Estimation, Identification and Adaptive Control. Prentice Hall, Englewood Cliffs, NJ, 1986.
[14] Ross, S.M., Applied Probability Models with Optimization Applications. Holden-Day, San Francisco, 1970.
Further Reading
For discounted dynamic programming, the classic reference is Blackwell [1]. For the positive cost case, we refer the reader to Strauch [2]. For the negative cost case, we refer the reader to Blackwell [3], [4] and Ornstein [5]. For the average cost case, we refer the reader to Blackwell [6] for early fundamental work, and to Hernandez-Lerma and Lasserre [7] and the references contained there for recent developments. For a study of the measurability issues which arise when considering uncountable sets, we refer the reader to Bertsekas and Shreve [8]. For continuous time stochastic control of diffusions, we refer the reader to Lions [9], and to Crandall, Ishii, and Lions [10] for a guide to viscosity solutions. For the several ways in which dynamic programming can be employed, see Bellman [11]. Some recent textbooks which cover dynamic programming are Bertsekas [12], Kumar and Varaiya [13], and Ross [14].
Stability of Stochastic Systems
Kenneth A. Loparo and Xiangbo Feng
Department of Systems Engineering, Case Western Reserve University, Cleveland, OH
64.1 Introduction .......................................................... 1105
64.2 Lyapunov Function Method ......................................... 1107
64.3 The Lyapunov Exponent Method and the Stability of Linear Stochastic Systems .................................................... 1113
64.4 Conclusions ............................................................ 1124
References .................................................................. 1124
64.1 Introduction

In many applications where dynamical system models are used to capture the behavior of real world systems, stochastic components and random noises are included in the model. The stochastic aspects of the model are used to capture the uncertainty about the environment in which the system is operating and about the structure and parameters of the model of the physical process being studied. The analysis and control of such systems then involves evaluating the stability properties of a random dynamical system. Stability is a qualitative property of the system and is often the first characteristic of a dynamical system (or model) studied. We know from our study of classical control theory that, before we can consider the design of a regulatory or tracking control system, we need to make sure that the system is stable from input to output. Therefore, the study of the stability properties of stochastic dynamical systems is important, and considerable effort has been devoted to the study of stochastic stability. Significant results have been reported in the literature with applications to physical and engineering systems. A comprehensive survey on the topic of stochastic stability was given by [27]. Since that time there have been many significant developments of the theory and its applications in science and engineering. In this chapter, we present some basic results on the study of stability of stochastic systems. Because of limited space, only selected topics are presented and discussed. We begin with a discussion of the basic definitions of stochastic stability and the relationships among them. Kozin's survey provides an excellent introduction to the subject and a good explanation of how the various notions of stochastic stability are related. It is not necessary that readers have an extensive background in stochastic processes or other related mathematical topics.
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.

In this chapter, the various results and methods will be stated as simply as possible and no proofs of the theorems are given. When necessary, important mathematical concepts, which are the foundation of some of the results, will be discussed briefly. This will
provide a better appreciation of the results, the application of the results, and the key steps required to develop the theory further. Those readers interested in a particular topic or result discussed in this chapter are encouraged to go to the original paper and the references therein for more detailed information.

There are at least three times as many definitions for the stability of stochastic systems as there are for deterministic systems. This is because in a stochastic setting there are three basic types of convergence: convergence in probability, convergence in mean (or moment), and convergence in an almost sure (sample path, probability one) sense. Kozin [27] presented several definitions for stochastic stability, and these definitions have been extended in subsequent research works. Readers are cautioned to examine carefully the definition of stochastic stability used when interpreting any stochastic stability results. We begin with some preliminary definitions of stability concepts for a deterministic system. Let x(t; x₀, t₀) denote the trajectory of a dynamic system initial from x₀ at time t₀.

DEFINITION 64.1 Lyapunov Stability
The equilibrium solution, assumed to be 0 unless stated otherwise, is said to be stable if, given ε > 0, δ(ε, t₀) > 0 exists so that, for all ||x₀|| < δ,

sup_{t≥t₀} ||x(t; x₀, t₀)|| < ε.
DEFINITION 64.2 Asymptotic Lyapunov Stability
The equilibrium solution is said to be asymptotically stable if it is stable and if δ' > 0 exists so that ||x₀|| < δ' guarantees that

lim_{t→∞} ||x(t; x₀, t₀)|| = 0.

If the convergence holds for all initial times, t₀, it is referred to as uniform asymptotic stability.

DEFINITION 64.3 Exponential Lyapunov Stability
The equilibrium solution is said to be exponentially stable if it is asymptotically stable and if there exist a δ > 0, an α > 0, and a β > 0 so that ||x₀|| < δ guarantees that

||x(t; x₀, t₀)|| ≤ β||x₀|| e^{−α(t−t₀)},  t ≥ t₀.
If the convergence holds for all initial times, t₀, it is referred to as uniform exponential stability.

These deterministic stability definitions can be translated into a stochastic setting by properly interpreting the notion of convergence, i.e., in probability, in moment, or almost surely. For example, in Definition 64.1 for Lyapunov stability, the variable of interest is sup_{t≥t₀} ||x(t; x₀, t₀)||, and we have to study the various ways in which this (now random) variable can converge. We denote the fact that the variable is random by including the variable ω, i.e., x(t; x₀, t₀, ω), and we will make this more precise later. Then,

DEFINITION 64.4 Ip: Lyapunov Stability in Probability
The equilibrium solution is said to be stable in probability if, given ε, ε' > 0, δ(ε, ε', t₀) > 0 exists so that, for all ||x₀|| < δ,

P{ sup_{t≥t₀} ||x(t; x₀, t₀, ω)|| > ε' } < ε.

Here, P denotes probability.

DEFINITION 64.5 Im: Lyapunov Stability in the pth Moment
The equilibrium solution is said to be stable in the pth moment, p > 0, if, given ε > 0, δ(ε, t₀) > 0 exists so that ||x₀|| < δ guarantees that

E{ sup_{t≥t₀} ||x(t; x₀, t₀, ω)||^p } < ε.

Here, E denotes expectation.

DEFINITION 64.6 Ia.s.: Almost Sure Lyapunov Stability
The equilibrium solution is said to be almost surely stable if

P{ lim_{||x₀||→0} sup_{t≥t₀} ||x(t; x₀, t₀, ω)|| = 0 } = 1.

Note, almost sure stability is equivalent to saying that, with probability one, all sample solutions are Lyapunov stable. Similar statements can be made for asymptotic stability and for exponential stability. These definitions are introduced next for completeness and because they are often used in applications.

DEFINITION 64.7 IIp: Asymptotic Lyapunov Stability in Probability
The equilibrium solution is said to be asymptotically stable in probability if it is stable in probability and if δ' > 0 exists so that ||x₀|| < δ' guarantees that, for any ε > 0,

lim_{δ→∞} P{ sup_{t≥δ} ||x(t; x₀, t₀, ω)|| > ε } = 0.

If the convergence holds for all initial times, t₀, it is referred to as uniform asymptotic stability in probability.

DEFINITION 64.8 IIm: Asymptotic Lyapunov Stability in the pth Moment
The equilibrium solution is said to be asymptotically pth moment stable if it is stable in the pth moment and if δ' > 0 exists so that ||x₀|| < δ' guarantees that

lim_{δ→∞} E{ sup_{t≥δ} ||x(t; x₀, t₀, ω)||^p } = 0.

DEFINITION 64.9 IIa.s.: Almost Sure Asymptotic Lyapunov Stability
The equilibrium solution is said to be almost surely asymptotically stable if it is almost surely stable and if δ' > 0 exists so that ||x₀|| < δ' guarantees that, for any ε > 0,

lim_{δ→∞} P{ sup_{t≥δ} ||x(t; x₀, t₀, ω)|| > ε } = 0.

Weaker versions of the stability definitions are common in the stochastic stability literature. In these versions the stochastic stability properties of the system are given in terms of particular instants, t, rather than the semi-infinite time interval [t₀, ∞) as in the majority of the definitions given above. Most noteworthy are the concepts of the pth moment and almost sure exponential stability:

DEFINITION 64.10 IIIm: pth Moment Exponential Lyapunov Stability
The equilibrium solution is said to be pth moment exponentially stable if there exist a δ > 0, an α > 0, and a β > 0 so that ||x₀|| < δ guarantees that

E{ ||x(t; x₀, t₀, ω)||^p } ≤ β||x₀||^p e^{−α(t−t₀)},  t ≥ t₀.

DEFINITION 64.11 IIIa.s.: Almost Sure Exponential Lyapunov Stability
The equilibrium solution is said to be almost surely exponentially stable if there exist a δ > 0, an α > 0, and a β > 0 so that ||x₀|| < δ guarantees that

P{ ||x(t; x₀, t₀, ω)|| ≤ β||x₀|| e^{−α(t−t₀)} } = 1.
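The sample-path flavor of these definitions can be probed numerically. The sketch below is ours, not from the handbook: it uses Euler-Maruyama simulation of the scalar Ito equation dx = ax dt + σx dw studied later in this chapter, and estimates the quantities appearing in Definitions 64.4 and 64.5. The parameter values, horizon T, and exceedance threshold are illustrative assumptions.

```python
import numpy as np

# Monte Carlo sketch (ours, not from the handbook) of the stochastic
# stability definitions: estimate P{sup_t |x_t| > eps'} (Definition 64.4)
# and E{sup_t |x_t|^2} (Definition 64.5 with p = 2) for the scalar Ito
# equation dx = a*x dt + s*x dw.  All parameter values are illustrative.
rng = np.random.default_rng(0)

def simulate(a, s, x0, T=10.0, dt=1e-3, n_paths=2000):
    """Euler-Maruyama paths; returns final states and running sup |x_t|."""
    x = np.full(n_paths, x0)
    sup_abs = np.abs(x).copy()
    for _ in range(int(T / dt)):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = x + a * x * dt + s * x * dw
        sup_abs = np.maximum(sup_abs, np.abs(x))
    return x, sup_abs

# a = -1, s = 0.5 satisfies a < -s^2/2, so the system is stable in every
# sense used in this section.
x_T, sup_abs = simulate(a=-1.0, s=0.5, x0=0.1)
p_exceed = np.mean(sup_abs > 1.0)   # estimates P{sup_t |x_t| > 1}
m2_sup = np.mean(sup_abs**2)        # estimates E{sup_t |x_t|^2}
print(p_exceed, m2_sup)
```

Shrinking x₀ drives both estimates toward zero, which is the empirical content of stability in probability and in the second moment.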
EXAMPLE 64.1:

Consider the scalar Ito equation

dx = ax dt + σx dw,   (64.1)

where w is a standard Wiener process. The infinitesimal generator for the system is given by

L = ax d/dx + (1/2)σ²x² d²/dx².   (64.2)

The solution process x_t for t ≥ 0 is given by

x_t = x₀ exp{ (a − (1/2)σ²)t + σ ∫₀ᵗ dw }.

Hence, the asymptotic exponential growth rate of the process is

λ = lim_{t→∞} (1/t) log(x_t/x₀) = (a − (1/2)σ²) + lim_{t→∞} (σ/t) ∫₀ᵗ dw = a − (1/2)σ².

The last equality follows from the fact that the Wiener process, with increment dw, is a zero-mean ergodic process. We then conclude that the system is almost surely exponentially stable in the sense that

P_{x₀}{ lim_{t→∞} x_t = 0, at an exponential rate a.s. } = 1,

if, and only if, a < (1/2)σ². Next we compare this with the second-moment stability result; see also Example 64.2 in the next section, where the same conclusion follows from a Lyapunov analysis of the system. From our previous calculation, x_t² for t ≥ 0 is given by

x_t² = x₀² exp{ 2(a − (1/2)σ²)t + 2σ ∫₀ᵗ dw }.

Then, E{x_t²} = e^{(2a+σ²)t} E{x₀²}, and we conclude that the system is exponentially second-moment stable if, and only if, (a + (1/2)σ²) < 0, or a < −(1/2)σ². Therefore, unlike deterministic systems, even though moment stability implies almost sure stability, almost sure (sample path) stability need not imply moment stability of the system. For the system Equation 64.1, the pth moment is exponentially stable if, and only if, a < (1/2)σ²(1 − p), where p = 1, 2, 3, ....

In the early stages (1940s to 1960s) of the study of the stability of stochastic systems, investigators were primarily concerned with moment stability and stability in probability; the mathematical theory for the study of almost sure (sample path) stability was not yet fully developed. During the initial development of the theory and methods of stochastic stability, some confusion existed about the stability concepts, their usefulness in applications, and the relationships among the different concepts of stability. Kozin's survey clarified some of the confusion and provided a good foundation for further work. During the past twenty years, almost sure (sample path) stability studies have attracted increasing attention from researchers. This is not surprising because the sample paths, rather than moments or probabilities associated with trajectories, are observed in real systems, and the stability properties of the sample paths can be most closely related to their deterministic counterpart, as argued by Kozin [27]. Practically speaking, moment stability criteria, when used to infer sample path stability, are often too conservative to be useful in applications.

One of the most fruitful and important advances in the study of stochastic stability is the development of the Lyapunov exponent theory for stochastic systems. This is the stochastic counterpart of the notion of characteristic exponents introduced in Lyapunov's work on asymptotic (exponential) stability. This approach provides necessary and sufficient conditions for almost sure asymptotic (exponential) stability, but significant computational problems must be solved. The Lyapunov exponent method uses sophisticated tools from stochastic process theory and other related branches of mathematics, and it is not a mature subject area. Because this method has the potential of providing testable conditions for almost sure asymptotic (exponential) stability for stochastic systems, this chapter will focus attention on this approach for analyzing the stability of stochastic systems. In this chapter, we divide the results into two categories, the Lyapunov function method and the Lyapunov exponent method.
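The gap between almost sure and moment stability in Example 64.1 is easy to see numerically. The sketch below is ours: it samples the explicit solution x_t = x₀ exp{(a − σ²/2)t + σw(t)} at a fixed time. With the illustrative choice a = 0, σ = 1, the pathwise exponent a − σ²/2 = −1/2 is negative, so almost every path has decayed, while E{x_t²} = x₀²e^{(2a+σ²)t} is astronomically large.

```python
import numpy as np

# Sketch (ours) of Example 64.1: a = 0, s = 1 gives almost sure decay
# (a < s^2/2) together with exponential growth of the second moment
# (a > -s^2/2).  We sample the closed-form solution at time t, so there
# is no discretization error.
rng = np.random.default_rng(1)
a, s, x0, t = 0.0, 1.0, 1.0, 50.0

w_t = rng.normal(0.0, np.sqrt(t), size=200_000)       # W_t ~ N(0, t)
x_t = x0 * np.exp((a - 0.5 * s**2) * t + s * w_t)     # exact solution

frac_decayed = np.mean(np.abs(x_t) < 1e-3)            # nearly all paths tiny
second_moment = x0**2 * np.exp((2 * a + s**2) * t)    # E{x_t^2} = e^50
print(frac_decayed, second_moment)
```

Note that a sample average of x_t² would badly underestimate e^{50} unless some of the rare, enormous paths land in the sample; that heavy-tail effect is exactly why sample-path and moment stability part company.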
64.2 Lyapunov Function Method

The Lyapunov function method, i.e., Lyapunov's second (direct) method, provides a powerful tool for the study of stability properties of dynamic systems because the technique does not require solving the system equations explicitly. For deterministic systems, this method can be interpreted briefly as described in the following paragraph.

Consider a nonnegative continuous function V(x) on Rⁿ with V(0) = 0 and V(x) > 0 for x ≠ 0. Suppose for some m ∈ R, the set Q_m = {x ∈ Rⁿ : V(x) < m} is bounded and V(x) has continuous first partial derivatives in Q_m. Let the initial time t₀ = 0 and let x_t = x(t, x₀) be the unique solution of the initial value problem:

ẋ(t) = f(x(t)),  x(0) = x₀,   (64.3)
for x₀ ∈ Q_m. Because V(x) is continuous, the open set Q_r for r ∈ (0, m] defined by Q_r = {x ∈ Rⁿ : V(x) < r} contains the origin and monotonically decreases to the singleton set {0} as r → 0⁺. If the total derivative V̇(x) of V(x) (along the solution trajectory x(t, x₀)), which is given by

V̇(x) = f'(x) · ∂V/∂x ≝ −k(x),   (64.4)
satisfies −k(x) ≤ 0 for all x ∈ Q_m, where k(x) is continuous, then V(x_t) is a nonincreasing function of t, i.e., V(x₀) < m implies V(x_t) < m for all t ≥ 0. Equivalently, x₀ ∈ Q_m implies that x_t ∈ Q_m for all t ≥ 0. This establishes the stability of the zero solution of Equation 64.3 in the sense of Lyapunov, and V(x) is called a Lyapunov function for Equation 64.3. Let us further assume that k(x) > 0 for x ∈ Q_m\{0}. Then V(x_t), as a function of t, is strictly monotone decreasing. In this case, V(x_t) → 0 as t → +∞ from Equation 64.4. This implies that x_t → 0 as t → +∞. This fact can also be seen through an integration of Equation 64.4, i.e.,

0 ≤ V(x₀) − V(x_t) = ∫₀ᵗ k(x_s) ds < +∞  for t ∈ [0, +∞).   (64.5)

It is evident from Equation 64.5 that x_t → {x ∈ Q_m : k(x) = 0} = {0} as t → +∞. This establishes the asymptotic stability for system Equation 64.3. The Lyapunov function V(x) may be regarded as a generalized energy function of the system Equation 64.3. The above argument illustrates the physical intuition that if the energy of a physical system is always decreasing near an equilibrium state, then the equilibrium state is stable. Since Lyapunov's original work, this direct method for stability study has been extensively investigated. The main advantage of the method is that one can obtain considerable information about the stability of a given system without explicitly solving the system equation. One major drawback of this method is that for general classes of nonlinear systems no systematic way exists to construct or generate a suitable Lyapunov function, and the stability criteria obtained with the method, which usually provide only a sufficient condition for stability, depend critically on the Lyapunov function chosen.

The first attempts to generalize the Lyapunov function method to stochastic stability studies were made by [8] and [24]. A systematic treatment of this topic was later obtained by [30], [31] and [22] (primarily for white noise stochastic systems). The key idea of the Lyapunov function approach for a stochastic system is the following: Consider the stochastic system defined on a probability space (Ω, F, P), where Ω is the set of elementary events (sample space), F is the σ-field which consists of all subsets of Ω that are measurable, and P is a probability measure:

ẋ(t) = f(x(t), ω),  t ≥ 0,  x(0) = x₀.   (64.6)

It is not reasonable to require that V̇(x_t) ≤ 0 for all ω, where x_t = x(t, x₀, ω) is a sample solution of Equation 64.6 initial from x₀. What can be expected to insure "stability" is that the time derivative of the expectation of V(x_t), denoted by LV(x_t), is nonpositive. Here, L is the infinitesimal generator of the process x_t. Suppose that the system is Markovian so that the solution process is a strong, time-homogeneous Markov process. Then L is defined by

LV(x₀) = lim_{Δt→0} [ E_{x₀}{V(x_{Δt})} − V(x₀) ] / Δt,   (64.7)

where the domain of L is defined as the space of functions V(x) for which Equation 64.7 is well-defined. This is a natural analog of the total derivative of V(x) along the solution trajectory x_t in the deterministic case. Now suppose that, for a Lyapunov function V(x) which satisfies the conditions stated above, LV(x) ≤ −k(x) ≤ 0. It follows that

E_{x₀}{V(x_t)} − V(x₀) = E_{x₀}{ ∫₀ᵗ LV(x_s) ds } ≤ 0,   (64.8)

and for t, s ≥ 0,

E{ V(x_{t+s}) | x_τ, τ ≤ t } ≤ V(x_t).   (64.9)

Equation 64.9 means that V(x_t) is a supermartingale, and, by the martingale convergence theorem, we expect that V(x_t) → 0 a.s. (almost surely) as t → +∞. This means that x_t → 0 a.s. as t → +∞. A similar argument can be obtained by using Equation 64.8, an analog of Equation 64.5. It is reasonable to expect from Equation 64.8 that x_t → {x ∈ Rⁿ : k(x) = 0} almost surely. These are the key ideas behind the Lyapunov function approach to the stability analysis of stochastic systems. Kushner [30], [31], [32] used the properties of strong Markov processes and the martingale convergence theorem to study the Lyapunov function method for stochastic systems with solution processes which are strong Markov processes with right continuous sample paths. A number of stability theorems are developed in these works and the references therein. Kushner also presented various definitions of stochastic stability. The reader should note that his definition of stability "with probability one" is equivalent to the concept of stability in probability introduced earlier here. Also, asymptotic stability "with probability one" means stability "with probability one" and sample path stability, i.e., x_t → 0 a.s. as t → +∞. The key results of Kushner are based on the following supermartingale inequality, which follows directly from Equation 64.8 where x₀ is given:

P_{x₀}{ sup_{0≤t<∞} V(x_t) ≥ λ } ≤ V(x₀)/λ,  λ > 0.   (64.10)
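The difference-quotient definition of the generator in Equation 64.7 can be checked directly by simulation. The sketch below is ours: for V(x) = x² and the scalar Ito equation dx = ax dt + σx dw, Ito calculus gives LV(x) = (2a + σ²)x², and a one-step Monte Carlo estimate of (E_x{V(x_Δt)} − V(x))/Δt should land near that value. The parameter values are illustrative.

```python
import numpy as np

# Monte Carlo check (ours) of the generator definition, Equation 64.7:
# LV(x) = lim_{dt->0} (E_x{V(x_dt)} - V(x)) / dt.
# For dx = a*x dt + s*x dw and V(x) = x^2, Ito calculus gives
# LV(x) = (2*a + s**2) * x**2.
rng = np.random.default_rng(2)
a, s, x0, dt = -1.0, 0.7, 2.0, 1e-4

# One Euler step of length dt from x0, over many independent samples.
dw = rng.normal(0.0, np.sqrt(dt), size=2_000_000)
x_dt = x0 + a * x0 * dt + s * x0 * dw

LV_mc = (np.mean(x_dt**2) - x0**2) / dt    # difference quotient estimate
LV_exact = (2 * a + s**2) * x0**2          # closed-form generator value
print(LV_mc, LV_exact)
```

Since LV(x) = (2a + σ²)x² is negative whenever 2a + σ² < 0, the quadratic V(x) = x² then certifies the supermartingale property used in the stability argument above.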
From this, the following typical results were obtained by Kushner. For simplicity, we assume that x_t ∈ Q_m almost surely for some m > 0 and state these results in a simpler way.

THEOREM 64.1 (Kushner)

1. Stability "with probability one": If LV(x) ≤ 0 and V(x) > 0 for x ∈ Q_m\{0}, then the origin is stable "with probability one".
2. Asymptotic stability "with probability one": If LV(x) = −k(x) ≤ 0 with k(x) > 0 for x ∈ Q_m\{0} and k(0) = 0, and if for any small d > 0 a c_d > 0 exists so that k(x) ≥ d for x ∈ {Q_m : ||x|| ≥ c_d}, then the origin is stable "with probability one" with

P_{x₀}{ lim_{t→∞} x_t = 0 } ≥ 1 − V(x₀)/m.

In particular, if the conditions are satisfied for arbitrarily large m, then the origin is asymptotically stable "with probability one".
3. Exponential asymptotic stability "with probability one": If V(x) ≥ 0, V(0) = 0 and LV(x) ≤ −αV(x) on Q_m for some α > 0, then the origin is stable "with probability one" and

E_{x₀}{V(x_t)} ≤ V(x₀) e^{−αt},  t ≥ 0.

In particular, if the conditions are satisfied for arbitrarily large m, then the origin is asymptotically stable "with probability one" and V(x_t) → 0 almost surely at an exponential rate.

Many interesting examples are also developed in Kushner's work to demonstrate the application of the stability theorems. These examples also illustrate some construction procedures for Lyapunov functions for typical systems.

EXAMPLE 64.2: (Kushner)

Consider the scalar Ito equation

dx = ax dt + σx dw,   (64.11)

where w is a standard Wiener process. The infinitesimal generator for the system is given by

L = ax d/dx + (1/2)σ²x² d²/dx².   (64.12)

If the Lyapunov function V(x) = x², then

LV(x) = (2a + σ²)x².   (64.13)

If σ² + 2a < 0, then with Q_m = {x : x² < m²}, from 1) of the previous theorem, the zero solution is stable "with probability one". Let m → +∞. By 2) of the theorem,

lim_{t→∞} x_t = 0  a.s.,   (64.14)

where x_t is the solution process of Equation 64.11. By 3) of the theorem,

x_t² → 0 almost surely at an exponential rate α,   (64.15)

for some α > 0.

REMARK 64.1 As calculated in the previous section, the asymptotic exponential growth rate of the process is

λ = (a − (1/2)σ²) + lim_{t→∞} (σ/t) ∫₀ᵗ dw = a − (1/2)σ².

We conclude that the system is almost surely exponentially stable in the sense that

P_{x₀}{ lim_{t→∞} x_t = 0, at an exponential rate a.s. } = 1,

if, and only if, a < (1/2)σ². Compare this with the stability result a < −(1/2)σ², given above, using the Lyapunov function method. Note that a < −(1/2)σ² is actually the stability criterion for second-moment stability, which is a conservative estimate of the almost sure stability condition a < (1/2)σ².

Has'minskii [22] and the references cited therein provide a comprehensive study of the stability of diffusion processes interpreted as the solution process of a stochastic system governed by an Ito differential equation of the form,

dx = b(t, x) dt + Σ_{r=1}^{k} σ_r(t, x) dξ_r(t),   (64.16)

where the ξ_r(t) are independent standard Wiener processes and the coefficients b(t, x) and σ_r(t, x) satisfy Lipschitz and growth conditions. In this case, the infinitesimal generator L of the system (associated with the solution processes) is a second-order partial differential operator on functions V(t, x) which are twice continuously differentiable with respect to x and continuously differentiable with respect to t. L is given by

LV(t, x) = ∂V/∂t + Σᵢ bᵢ(t, x) ∂V/∂xᵢ + (1/2) Σ_{i,j} a_{ij}(t, x) ∂²V/∂xᵢ∂x_j,   (64.17)

where a_{ij}(t, x) = Σ_r σ_rⁱ(t, x)σ_rʲ(t, x). The key idea of Has'minskii's approach is to establish an inequality like Equation 64.10 developed in Kushner's work. Below are some typical results obtained by Has'minskii; the reader is referred to [22] for a more detailed development. Let U be a neighborhood of 0 and U₁ = {t > 0} × U. The collection of functions V(t, x) defined in U₁, which are twice continuously differentiable in x except at the point x = 0 and continuously differentiable in t, is denoted by C₂(U₁). A function V(t, x) is said to be positive definite in the Lyapunov sense if V(t, 0) = 0 for all t ≥ 0 and V(t, x) ≥ w(x) > 0 for x ≠ 0 and some continuous function w(x).

THEOREM 64.2 (Has'minskii)

1. The trivial solution of Equation 64.16 is stable in probability (same as our definition) if there exists V(t, x) ∈ C₂(U₁), positive definite in the Lyapunov sense, so that LV(t, x) ≤ 0 for x ≠ 0.
2. If the system Equation 64.16 is time homogeneous, i.e., b(t, x) = b(x) and σ_r(t, x) = σ_r(x), and if the nondegeneracy condition,

Σ_{i,j} a_{ij}(x) λᵢλ_j ≥ m(x)||λ||²  for λ = (λ₁, ..., λ_n)' ∈ Rⁿ,   (64.18)

is satisfied with continuous m(x) > 0 for x ≠ 0, then a necessary and sufficient condition for the trivial solution to be stable in probability is that a twice continuously differentiable function V(x) exists, except perhaps at x = 0, so that L₀V(x) ≤ 0, where L₀ is the generator of the time homogeneous system.
3. If the system Equation 64.16 is linear, i.e., b(t, x) = b(t)x and σ_r(t, x) = σ_r(t)x, then the system is exponentially p-stable (the pth moment is exponentially stable), i.e.,

E_{x₀}{||x(t, x₀, s)||^p} ≤ A||x₀||^p exp{−α(t − s)},  p > 0,

for some constant α > 0 if, and only if, a function V(t, x) exists, homogeneous of degree p in x, so that for some constants kᵢ > 0, i = 1, 2, 3, 4,

k₁||x||^p ≤ V(t, x) ≤ k₂||x||^p,  LV(t, x) ≤ −k₃||x||^p,

and the partial derivatives of V(t, x) with respect to x are bounded in norm by k₄||x||^{p−1}.

Besides the many stability theorems, Has'minskii also studied other asymptotic properties of stochastic systems and presented many interesting examples to illustrate the stabilizing and destabilizing effects of random noise in stochastic systems. Just as in the case of deterministic systems, the Lyapunov function approach has the advantage that one may obtain considerable information about the qualitative (asymptotic) behavior of trajectories of the system, in particular the stability properties of interest, without solving the system equation. However, no general systematic procedure exists to construct a candidate Lyapunov function. Even though theorems like Has'minskii's provide necessary and sufficient conditions, one may never find the "appropriate" Lyapunov function in practice. Usually the nature of the stability condition obtained via a Lyapunov function critically depends on the choice of this function. Various techniques have been proposed by investigators to construct a suitable family of Lyapunov functions to obtain the "best" stability results. In the following, we summarize some of these efforts.

There are several works related to the stability of linear systems of the form,

ẋ(t) = [A + F(t)]x(t),   (64.19)

with A a stable (Hurwitz) matrix and F(t) a stationary and ergodic matrix-valued random process. This problem was first considered by Kozin [26] by using the Gronwall-Bellman inequality rather than a Lyapunov function technique. Kozin's results were found to be too conservative. Caughey and Gray [12] were able to obtain better results through the use of a very special type of quadratic form Lyapunov function. Later, Infante [23] extended these stability theorems by using the extremal property of the so-called regular pencil of a quadratic form. The basic idea behind Infante's work is the following: Consider the quadratic form Lyapunov function V(x) = x'Px. Then the time derivative along the sample paths of the system Equation 64.19 is

V̇(x_t) = x_t'[F'(t)P + PF(t)]x_t − x_t'Qx_t,   (64.20)

where Q is any positive definite matrix and P is the unique solution of the Lyapunov equation

A'P + PA = −Q.   (64.21)

If k(t) = V̇(x_t)/V(x_t), then

V(x_t) = V(x₀) exp{ ∫₀ᵗ k(s) ds }.   (64.22)

From the extremal properties of matrix pencils,

k(t) ≤ λ_max[(A + F(t))' + Q(A + F(t))Q⁻¹].   (64.23)

Here, λ_max(K) denotes the largest eigenvalue of the matrix K. By the ergodicity property of the random matrix process F(t), we have the almost sure stability condition

E{λ_max[(A + F(t))' + Q(A + F(t))Q⁻¹]} < 0.

Man [39] tried to generalize the results of Infante. However, several obvious mistakes can be observed in the derivation of the theorem and the two examples given in this work. Following Infante, Kozin and Wu [29] used the distributional property of the random coefficient matrix F(t) to obtain improved results for two specific second-order systems in the form,

ÿ + [2β + f(t)]ẏ + [c + g(t)]y = 0,   (64.24)

where f(t) and g(t) are stationary and ergodic random processes.
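Infante's quadratic-form construction starts from the Lyapunov equation A'P + PA = −Q for a Hurwitz A. The sketch below is ours, with illustrative matrices: it solves the equation by vectorization with Kronecker products and confirms that the resulting P is positive definite, so V(x) = x'Px qualifies as a Lyapunov function for the unperturbed system.

```python
import numpy as np

# Sketch (ours) of the Lyapunov-equation step in Infante's construction:
# solve A'P + PA = -Q by vectorization.  Using vec(A'P) = (I kron A')vec(P)
# and vec(PA) = (A' kron I)vec(P), the equation becomes a linear system.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])      # Hurwitz: eigenvalues -1 and -2
Q = np.eye(2)                     # any positive definite choice

n = A.shape[0]
K = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
P = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)
P = 0.5 * (P + P.T)               # symmetrize against round-off

residual = A.T @ P + P @ A + Q    # should be ~0
eigs_P = np.linalg.eigvalsh(P)    # should be strictly positive
print(P, eigs_P)
```

For this A and Q the solution is P = [[1.25, 0.25], [0.25, 0.25]], whose eigenvalues are positive; along the unperturbed system ẋ = Ax one then has V̇(x) = −x'Qx < 0, the deterministic part of Equation 64.20.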
Parthasarathy and Evan-Iwanowski [48] presented an effective computational procedure, using an optimization technique, to apply the Lyapunov-type procedure to higher order systems in the form of Equation 64.19 with F(t) = k(t)G. Here G is a constant matrix and k(t) is a scalar (real-valued) stationary and ergodic random process. After proper parameterization of the quadratic form Lyapunov function, the Davidon-Fletcher-Powell optimization algorithm was used to optimize the stability region, which depends on the system data and k(t). A fourth-order system was studied using the suggested procedure, and various simulation results were obtained to show that this procedure yielded a stability region that was not unduly conservative. Because an optimization procedure was used and the solution of a Lyapunov equation was required, this procedure demanded an extensive computational effort.

Wiens and Sinha [56] proposed a more direct method for higher order systems defined as an interconnection of a set of second-order subsystems. Consider the system,

M ÿ + [C₀ + C(t)]ẏ + [K₀ + K(t)]y = 0,   (64.25)

where M, C₀, and K₀ are nonsingular n × n matrices and C(t), K(t) are n × n stationary and ergodic matrix-valued random processes. The technique for constructing a Lyapunov function suggested by Walker [55] for deterministic systems was used to construct a quadratic form Lyapunov function V(x̄) for the deterministic counterpart of Equation 64.25, that is,

V(x̄) = x̄' [P₁  P₃; P₃'  P₂] x̄,  Pᵢ' = Pᵢ > 0 for i = 1, 2,   (64.26)

with a time derivative along sample paths analogous to Equation 64.20, where A₀ is properly defined and the following choices are taken:

P₃ = P₂M⁻¹C₀,  P₁ = P₂M⁻¹K₀ + (1/2)(M⁻¹C₀)'P₂(M⁻¹C₀).

Then, Infante's approach was used to obtain the following theorem:

THEOREM 64.3 (Wiens and Sinha)
The system Equation 64.25 is almost surely asymptotically stable (Definition IIa.s. in Section 64.1) in the large if a positive definite matrix P₂ exists so that
1. P₂M⁻¹K₀ is positive definite,
2. the symmetric parts of P₂M⁻¹C₀ and (M⁻¹C₀)'P₂(M⁻¹K₀) are positive definite, and
3. E{λ_max[(A₀ + C_t + K_t)P⁻¹]} < 0, where C_t and K_t are the matrix processes induced by C(t) and K(t) in the state-space form of Equation 64.25.

When applying their results to the second order system Equation 64.24, Wiens and Sinha [56] obtained the stability criteria

E{f²(t)} < 4β²c   (64.27)

and

E{g²(t)} < 4β²c/(c + 2β²).   (64.28)

These results are similar to those obtained by Caughey and Gray [12]. Equation 64.27 is the "optimal" result of Infante, but Equation 64.28 is not. This is not surprising, because no optimization procedure was used in deriving the stability criteria. The usefulness of the theorem was demonstrated by Wiens and Sinha by applying their method to higher order systems (n = 4 and 6), which yielded stability regions of practical significance.

Another research direction for applying the Lyapunov function method is the study of stochastic feedback systems for which the forward path is a time-invariant linear system and the random noise appears in the feedback path as a multiplicative feedback gain. Lyapunov functions are constructed by methods analogous to those for deterministic systems, e.g., the Lyapunov equation, the path-integral technique and the Kalman-Yacubovich-Popov method. However, most of the results obtained can be derived by directly using a quadratic form Lyapunov function together with the associated Lyapunov equation. Kleinman [25] considered a stochastic system in the form,

dx = Ax dt + Bx dξ,   (64.29)

where ξ is a scalar Wiener process with E{[ξ(t) − ξ(τ)]²} = σ²|t − τ|. By using the moment equation and a quadratic Lyapunov function, he showed that a necessary and sufficient condition for zero to be "stable with probability one" (this is the same as Kushner's definition) is that

A ⊗ I + I ⊗ A + σ²B ⊗ B

is a stable matrix, where "⊗" denotes the Kronecker product of matrices. However, this result should be carefully interpreted, as discussed by Willems [58].
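Kleinman's second-moment test is easy to make executable. Writing R(t) = E{x(t)x'(t)}, the standard moment equation for Equation 64.29 gives dR/dt = AR + RA' + σ²BRB'; vectorizing yields the system matrix I⊗A + A⊗I + σ²(B⊗B), and the second moment decays exactly when that matrix is Hurwitz. The sketch below is ours (the Kronecker-sum form and test matrices are our assumptions based on this standard derivation), checked against the scalar condition of Example 64.1.

```python
import numpy as np

# Sketch (ours) of Kleinman's second-moment test for dx = A x dt + B x dxi:
# vectorizing the moment equation dR/dt = A R + R A' + sigma^2 B R B'
# gives vec(dR/dt) = M vec(R) with M = I kron A + A kron I + sigma^2 B kron B,
# so the second moment decays iff M is a stable (Hurwitz) matrix.
def kleinman_matrix(A, B, sigma):
    I = np.eye(A.shape[0])
    return np.kron(I, A) + np.kron(A, I) + sigma**2 * np.kron(B, B)

def mean_square_stable(A, B, sigma):
    return bool(np.max(np.linalg.eigvals(kleinman_matrix(A, B, sigma)).real) < 0)

# Scalar sanity check (A = a, B = 1): the test reduces to 2a + sigma^2 < 0,
# i.e., a < -sigma^2/2, the second-moment condition of Example 64.1.
A = np.array([[-1.0]])
B = np.array([[1.0]])
print(mean_square_stable(A, B, sigma=1.0))   # 2(-1) + 1 < 0: stable
print(mean_square_stable(A, B, sigma=2.0))   # 2(-1) + 4 > 0: unstable
```
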
Willems [58], [59] studied the feedback system in the form,

dx = [A₀ − kbc]x dt − bcx dξ,   (64.30)

with ξ a scalar-valued Wiener process with E{[ξ(t) − ξ(τ)]²} = σ²|t − τ|; i.e., a system consisting of a linear time-invariant plant in the forward path with rational transfer function H(s) and a minimal realization (A₀, b, c), and a multiplicative feedback gain which is the sum of a deterministic constant k and a stochastic noise component. Suppose the system is written in a companion form with state x = (y, Dy, ..., D^{n−1}y)', where D = d/dt and y = cx is the output. The closed-loop transfer function is

G(s) = H(s)/(1 + kH(s)) = q(s)/p(s),   (64.31)

with p(s) and q(s) relatively prime because the realization is assumed to be minimal. Then, the input-output relation can be written in the form,

p(D)y + q(D)(y dξ/dt) = 0.   (64.32)

Willems observed that if a positive-definite quadratic form V(x) was to be used as a Lyapunov function then, following Kushner, LV(x) is the sum of a "deterministic part," obtained by setting the noise equal to zero, and the term (σ²/2)(q(D)y)² ∂²V/∂(D^{n−1}y)². Here, L is the infinitesimal generator of the system. To guarantee the negativity of LV(x) to assure stability, using Equation 64.32, Willems used a construction technique for a Lyapunov function that was originally developed by Brockett [10]: V(x) is built from an auxiliary polynomial h(s) = sⁿ + h_{n−1}s^{n−1} + ... + h₀, the unique solution (assume p(s) is strictly Hurwitz) of a polynomial factorization determined by p(s), chosen so that the deterministic part of LV(x) is negative semidefinite. By applying Kushner's results, Willems obtained the following theorem:

THEOREM 64.4 (Willems)

1. The origin is mean-square stable in the large (in the sense that R(t) = E{x(t)x'(t)} ≤ M < +∞ for t ≥ 0, x₀ ∈ Rⁿ, and sup_{t≥0} ||R(t)|| → 0 as ||R(0)|| → 0) and stable "with probability one" (in Kushner's sense) in the large, if p(s) is strictly Hurwitz and

σ² ∫₀^∞ g²(t) dt < 1.   (64.33)

Moreover, the following identity holds:

∫₀^∞ g²(t) dt = (1/2πj) ∫_{−j∞}^{+j∞} G(s)G(−s) ds,

where g(t) is the impulse response of the closed-loop (stable) deterministic system and G(s) is the Laplace transform of g(t).
2. If the inequality Equation 64.33 holds, the stability of the origin as indicated in condition 1 above is asymptotic.

Brockett and Willems [11] studied the linear feedback system given by

ẋ = [A − BK(t)C]x,   (64.34)

where (A, B, C) is a completely symmetric realization, i.e., A = A' ∈ R^{n×n}, B = C' ∈ R^{n×m}, and K(t) ∈ R^{m×m} is a stationary ergodic matrix process. By using a simple quadratic form as a Lyapunov function, the specific properties of a completely symmetric system, and the well-known Kalman-Yacubovich-Popov Lemma, they obtained the following stability theorem:

THEOREM 64.5 (Brockett and Willems) For the system Equation 64.34,

1. If K(t) = K'(t) almost surely and

E{λ_max[A − BK(t)B']} < 0,

then the origin is almost surely asymptotically stable (in the sense that lim_{t→+∞} x(t) = 0 a.s.). In particular, if m = 1, which is analogous to a single-input, single-output system, the condition can be expressed in terms of Z₁, the largest zero of the open-loop transfer function g(s) = C(sI − A)⁻¹B, and the density function p(·) of K(0).
2. If m = 1, the origin is almost surely asymptotically stable if a constant δ > 0 exists so that (a) E{min(δ, K(t))} > 0, (b) the poles of g(s) lie in Re{s} < −q_{n−1}δ, and (c) the locus of g(iω − q_{n−1}δ), −∞ < ω < +∞, does not encircle or intersect the closed disc centered at (−1/(2δ), 0) with radius 1/(2δ) in the complex plane.
Mahalanabis and Purkayastha [38], as well as Socha [53], applied techniques similar to those of Willems and of Willems and Brockett to study nonlinear stochastic systems. Readers should consult their papers and the references cited therein for more details on this approach. An extension of Kushner's and Has'minskii's work uses the Lyapunov function technique to study large-scale stochastic systems. The development of large-scale system theory in the past two decades is certainly a major impetus for these studies. The Lyapunov function technique of Kushner and Has'minskii, based upon the use of scalar, positive-definite functions, is not effective for large-scale systems. As in the deterministic case, the difficulties are often overcome if a vector positive-definite function is used. Michel and Rasmussen [40], [41], [51] used vector-valued positive-definite functions for studying various stability properties of stochastic large-scale systems. Their approach was to construct a Lyapunov function for the complete system from those of the subsystems. Stability properties were studied by investigating the stability properties of the lower-order subsystems and the interconnection structure. Ladde and Siljak [33] and Siljak [52] established quadratic mean stability criteria by using a vector positive-definite Lyapunov function. This is an extension to the stochastic case of the comparison principle developed by Ladde for deterministic systems. In these works, a linear comparison system was used. Bitsris [9] extended their work by considering a nonlinear comparison system. White noise systems were studied in all of the above works. Socha [54] investigated a real noise system where the noise satisfies the law of large numbers. Has'minskii's result [22] was extended to a large-scale system in this work.
Interested readers are referred to the original papers and references cited therein.
64.3 The Lyapunov Exponent Method and the Stability of Linear Stochastic Systems

One of the major advanced contributions to the study of stochastic stability during the past two decades is the application of the Lyapunov exponent concept to stochastic systems. This method uses sophisticated mathematical techniques to study the sample behavior of stochastic systems and often yields necessary and sufficient conditions for almost sure (sample) stability in the sense that lim_{t→∞} ||x(t, x₀, ω)|| = 0 a.s.
This method can potentially be applied in science and engineering for the development of new theoretical results, and we expect it to be the focus of much of the future work. In this section we present a summary of this method and selected results which have been obtained to date. After the introduction of the concept of Lyapunov exponents by A.M. Lyapunov [37] at the end of the last century, this concept has formed the foundation for many investigations into the stability properties of deterministic dynamical systems. However, it is only recently that the Lyapunov exponent method has been used to study almost sure stability of stochastic systems. The setup is as follows. Consider the linear stochastic system in continuous time defined by

(Σω)    ẋ(t) = A(t, ω)x(t),   x(0) = x₀.    (64.35)

Let x(t, x₀, ω) denote the unique sample solution of (Σω) initial from x₀ for almost all ω ∈ Ω. (We always denote the underlying probability space by (Ω, F, P).) The Lyapunov exponent λ̄ω(x₀) determined by x₀ is defined by the random variable

λ̄ω(x₀) = lim sup_{t→+∞} (1/t) log ||x(t, x₀, ω)||    (64.36)

for (Σω). Here, x₀ can be a random variable. For simplicity, we will concern ourselves primarily with the case where x₀ is nonrandom and fixed. In the case A(t, ω) = A(t), a deterministic R^{d×d}-valued continuous bounded function, Lyapunov [37] proved the following fundamental results for the exponent λ(x₀) of the system (Σ):

1. λ(x₀) is finite for all x₀ ∈ R^d\{0}.
2. The set of real numbers which are Lyapunov exponents for some x₀ ∈ R^d\{0} is finite with cardinality p, 1 ≤ p ≤ d; −∞ < λ₁ < λ₂ < … < λ_p < +∞, λᵢ ∈ R for all i.
3. λ(c x₀) = λ(x₀) for x₀ ∈ R^d\{0} and c ∈ R\{0}; λ(α x₀ + β y₀) ≤ max{λ(x₀), λ(y₀)} for x₀, y₀ ∈ R^d\{0} and α, β ∈ R, with equality if λ(x₀) < λ(y₀) and β ≠ 0.
4. The sets Lᵢ = {x ∈ R^d\{0} : λ(x) ≤ λᵢ} ∪ {0}, i = 1, 2, …, p, are linear subspaces of R^d, and {Lᵢ}ᵢ₌₀^p is a filtration of R^d, i.e.,

{0} = L₀ ⊂ L₁ ⊂ … ⊂ L_p = R^d,

where dᵢ = dim(Lᵢ) − dim(Lᵢ₋₁) is called the multiplicity of the exponent λᵢ for i = 1, 2, …, p, and the collection {(λᵢ, dᵢ)}ᵢ₌₁^p is referred to as the Lyapunov spectrum of the system (Σ). We have the relation

Σᵢ₌₁^p dᵢλᵢ ≥ lim sup_{t→+∞} (1/t) log |det Φ(t)| ≥ lim inf_{t→+∞} (1/t) log |det Φ(t)|,    (64.37)

where Φ(t) is the transition matrix of (Σ). The system is said to be (forward) regular if the two inequalities in Equation 64.37 are equalities. For a forward regular system the lim sup and lim inf can be replaced by lim.
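The defining limit in Equation 64.36 is easy to evaluate numerically. The sketch below is an illustrative check, not from the text: the matrix, step size, and horizon are arbitrary choices. It integrates a deterministic linear system ẋ = Ax with a fourth-order Runge-Kutta scheme and estimates (1/t) log ||x(t)||, renormalizing the state at each step to avoid underflow; for a constant A the estimate should approach the largest real part of the eigenvalues of A (here −1).

```python
import math

# Illustrative 2x2 system: A has eigenvalues -1 and -3, so the top
# Lyapunov exponent of x' = A x is -1.
A = [[-1.0, 2.0], [0.0, -3.0]]

def matvec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def rk4_step(v, dt):
    # One RK4 step for the linear ODE x' = A x.
    k1 = matvec(A, v)
    k2 = matvec(A, [v[i] + 0.5*dt*k1[i] for i in range(2)])
    k3 = matvec(A, [v[i] + 0.5*dt*k2[i] for i in range(2)])
    k4 = matvec(A, [v[i] + dt*k3[i] for i in range(2)])
    return [v[i] + dt*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i])/6.0 for i in range(2)]

def lyapunov_estimate(x0, T=60.0, dt=0.01):
    # Accumulate log-growth with per-step renormalization (keeps ||x|| = 1).
    x, log_growth, t = x0[:], 0.0, 0.0
    while t < T:
        x = rk4_step(x, dt)
        t += dt
        n = math.hypot(x[0], x[1])
        log_growth += math.log(n)
        x = [x[0]/n, x[1]/n]
    return log_growth / t

est = lyapunov_estimate([1.0, 1.0])
```

The finite-time estimate differs from the true exponent by an O(1/T) term coming from the initial condition's coordinates in the eigenbasis, which is why a moderately long horizon is used.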
THE CONTROL HANDBOOK
For the stochastic system (Σω) with ω ∈ Ω fixed, the relationship Equation 64.36 implies that if λ̄ω(x₀) < 0, then the sample solution x(t, x₀, ω) will converge to zero at the exponential rate |λ̄ω(x₀)| and, if λ̄ω(x₀) > 0, then the sample solution x(t, x₀, ω) cannot remain in any bounded region of R^d indefinitely. From this we see that λ̄ω(x₀) contains information about the sample stability of the system. As we will see later, in many cases a necessary and sufficient condition for sample stability is obtained. Arnold and Wihstutz [5] have recently given a detailed survey of research work on Lyapunov exponents. The survey is mathematically oriented and presents a summary of general properties and results on the topic. Readers are encouraged to refer to Arnold and Wihstutz [5] for more details. Here, we are interested in the application of Lyapunov exponents to stability studies of stochastic systems. The fundamental studies of the Lyapunov exponent method when applied to stochastic systems are by Furstenberg, Oseledec, and Has'minskii. We will briefly review here Oseledec's and Furstenberg's work and focus attention on the work of Has'minskii, because there the idea and methodology of the Lyapunov exponent method for the study of stochastic stability are best illustrated. The random variables λ̄ω(x₀) defined by Equation 64.36 are simply nonrandom constants under certain conditions, for example, stationarity and ergodicity conditions. This is a major consequence of the multiplicative ergodic theorem of Oseledec [46]. Oseledec's theorem deals with establishing conditions for the regularity property of stochastic systems. Lyapunov used the regularity of (Σ) to determine the stability of a perturbed version of the system (Σ) from the stability of (Σ). Regularity is usually very difficult to verify for a particular system. However, Oseledec proved an almost sure statement about regularity.
Because of our interest in the stability of stochastic systems, the theorem can be stated in the following special form for (Σω); see Arnold et al. [2].
THEOREM 64.6 Multiplicative ergodic theorem (Oseledec)
Suppose A(t, ω) is stationary with finite mean, i.e., E{||A(0, ω)||} < ∞. Then for (Σω), we have:

1. State space decomposition: For almost all ω ∈ Ω, an integer r = r(ω) exists with 1 ≤ r(ω) ≤ d, real numbers λ₁(ω) < λ₂(ω) < … < λ_r(ω), and linear subspaces (Oseledec spaces) E₁(ω), …, E_r(ω) with dimensions dᵢ(ω) = dim[Eᵢ(ω)] so that

R^d = E₁(ω) ⊕ … ⊕ E_r(ω)

and

lim_{t→+∞} (1/t) log ||Φ(t, ω)x₀|| = λᵢ(ω),  if x₀ ∈ Eᵢ(ω),

where Φ(t, ω) is the transition matrix of (Σω) and "⊕" denotes the direct sum of subspaces.

2. Domain of attraction of Eᵢ(ω):

lim_{t→+∞} (1/t) log ||Φ(t, ω)x₀|| = λᵢ(ω)  iff x₀ ∈ Lᵢ(ω)\Lᵢ₋₁(ω),

where Lᵢ(ω) = ⊕_{j=1}^{i} Eⱼ(ω).

3. Center of gravity of exponents:

Σᵢ₌₁^{r(ω)} dᵢ(ω)λᵢ(ω) = lim_{t→+∞} (1/t) log |det Φ(t, ω)|,

where the limit is measurable with respect to I, the σ-algebra generated by the invariant sets of A(t, ω).

4. Invariance property: If A(t, ω) is ergodic as well, then the random variables r(ω), λᵢ(ω), and dᵢ(ω), i = 1, 2, …, r, are independent of ω and are nonrandom constants.

Note that under the current assumptions all the limits above are actually lim (not merely lim sup), and (3) is equivalent to the almost sure regularity of sample systems of the form (Σω). Oseledec's theorem is a very general result and the above is a special version for the system (Σω). A detailed statement of the theorem is beyond the scope of this chapter. As far as stability is concerned, the sign of the top exponent λ_r is of interest. We present a simple example to illustrate the application of the theorem.

EXAMPLE 64.3:

Consider the randomly switched linear system

ẋ(t) = A_{y(t)} x(t),   x(0) = x₀ ∈ R²,    (64.38)

where A₁, A₋₁ ∈ R^{2×2} and y(t) ∈ {−1, 1} is the random telegraph process with mean time between jumps a⁻¹ > 0. The phase curves of these two linear systems are shown in Figure 64.1.
Because E = span {el = (1, ) is a common eigenspace of eA-I' and e A l t ,by the law of large numbers, it is clear that for any.yo E E\(U). -
h,(xo) = - ; a
+ 3c 1
as. for .yo E E\{O).
(64.46)
By (3) of the theorem, Equations 64.45 and 64.46, we obtain
(with h l < h 2 ) .We can ic!entify the following objects in Oseledec's theclrern:
Figure 64.1
Pha.se curves.
r
=
2,
dl
=
1,
d2
=
1 ,
E1(w)=E . f o r w ~ Q \ N ~ , EL(w)= E' = span {e;! = (0, I ) ~ ) , for w E Q\N1, Cl(w) = E, C2(w) =IR2, form E S2\N1, (64.46)
where NI is an 3-set with P ( N 1 )= 0 and
It is easy to verify that, for x₀ ∈ R²\{0},

λ̄ω(x₀) = a · lim sup_{k→∞} (1/k) log || e^{A_{y_k}τ(k)} ⋯ e^{A_{y_1}τ(1)} x₀ ||   a.s.

and

λ̄ω(R²) = lim sup_{t→+∞} (1/t) log |det Φ(t, ω)| = a · lim_{k→∞} (1/k) log |det( e^{A_{y_k}τ(k)} ⋯ e^{A_{y_1}τ(1)} )|   a.s.,

where {y_k : k ≥ 1} is the embedded Markov chain of y(t) and {τ(k) : k ≥ 1} is a sequence of i.i.d. random variables exponentially distributed with parameter a, which is also independent of {y_k : k ≥ 1}. Because y(t) is stationary and ergodic with unique stationary (invariant) distribution P{y(t) = 1} = P{y(t) = −1} = 1/2, Oseledec's theorem applies. The theorem says that an F-set N₀ exists with P(N₀) = 0 and two real numbers λ₁ and λ₂ (maybe λ₁ = λ₂) so that

λ̄ω(x₀) ∈ {λ₁, λ₂}  for x₀ ∈ R²\{0} and ω ∈ Ω\N₀.    (64.44)

By Equations 64.40, 64.41, and 64.43 and the law of large numbers,

λ̄ω(R²) = −(a + b) a lim_{k→∞} (1/k)(τ^{(1)} + ⋯ + τ^{(i_k)}) + 2c a lim_{k→∞} (1/k)(τ^{(1)} + ⋯ + τ^{(k−i_k)}) = −(a + b)/2 + c   a.s.    (64.45)

Here i_k is the number of times that y_j takes the value −1 in k steps, and {τ^{(j)} : j ≥ 1} is a sequence of i.i.d. random variables exponentially distributed with parameter a.

From Equation 64.49 it follows that the system Equation 64.38 is almost surely exponentially stable, in the sense that

P{ ||x(t, ω, x₀)|| → 0 as t → +∞, at an exponential rate } = 1,

iff λ₂ < 0 (iff c < b). Note that the system is almost surely exponentially unstable iff λ₁ > 0 (iff c > 0). We remark here that we can compute λ₁ and λ₂ directly using Equation 64.46 and the center-of-gravity-of-exponents relation for this simple system. An interesting question is what happens if b = a. In this case, the top exponent is c/2 − a/4, and only the top exponent in Oseledec's theorem is physically "realizable," in the sense that an x₀ ∈ R² exists so that λ̄ω(x₀) attains it.
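A randomly switched system of this kind is also easy to probe by Monte Carlo. The sketch below uses illustrative diagonal matrices (not the matrices of Example 64.3, whose entries are not reproduced here) together with exponential holding times of parameter a; for commuting diagonal matrices the exponents are exactly the stationary time-averages of the diagonal entries, here (0.5 − 1.0)/2 = −0.25 for both coordinates, which the simulation should reproduce.

```python
import math
import random

random.seed(1)

# Illustrative switching matrices (diagonal, so each coordinate evolves as
# exp(integral of the corresponding diagonal entry of A_{y(t)})).
A_plus = (0.5, -1.0)    # diagonal entries when y = +1
A_minus = (-1.0, 0.5)   # diagonal entries when y = -1

def switched_exponent(a=1.0, n_segments=100_000):
    # Telegraph process: y alternates sign at jump times; holding times Exp(a).
    y = 1
    t_total = 0.0
    log_x = [0.0, 0.0]          # log of each coordinate of x(t), x(0) = (1, 1)
    for _ in range(n_segments):
        tau = random.expovariate(a)
        d = A_plus if y == 1 else A_minus
        log_x[0] += d[0] * tau
        log_x[1] += d[1] * tau
        t_total += tau
        y = -y
    return max(log_x) / t_total  # estimate of the top Lyapunov exponent

est = switched_exponent()
```

Because the two constituent matrices here commute, there is no "noise-induced" contribution and the exponent is just an ergodic average; non-commuting matrices (as in Example 64.3) are what make λ₁, λ₂ genuinely nontrivial.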
Actually, it follows from Feng and Loparo [14], [15] that λω(x₀) = c/2 − a/4 a.s. for any x₀ ∈ R²\{0} in this case.

Motivated by Bellman [7], Furstenberg and Kesten [16], as a starting point, studied products of random matrices and generalized the classical law of large numbers for i.i.d. sequences of random variables. They showed that, under some positivity conditions on the entries of the matrices, the random variable n⁻¹ log ||X_n ⋯ X₁|| tends to a constant almost surely for a sequence of i.i.d. random matrices {X_k}_{k=1}^∞. Furstenberg [17], [18] went one step further and showed that, if Xᵢ ∈ G for all i, where G is a noncompact semisimple connected Lie group with finite center, there exists a finite-dimensional vector space of functions Φ (rather than just log ||·||) so that n⁻¹Φ(X_n ⋯ X₁) tends to a constant α* (known as Furstenberg's constant) almost surely as n → +∞. These studies also considered the action of the group G on certain manifolds, for example, P^{d−1} ⊂ R^d (the projective sphere in R^d, i.e., P^{d−1} is obtained by identifying s and −s on the unit sphere S^{d−1} in R^d). Furstenberg showed that, under a certain transitivity or irreducibility condition, there exists a unique invariant probability measure ν for the Markov chain Z_n = X_n ∗ ⋯ ∗ X₁ ∗ Z₀, which evolves on P^{d−1}, with respect to μ, the common probability measure of Xᵢ ∈ G (here "∗" denotes the action of G on P^{d−1}), and, for any x₀ ∈ R^d\{0}, the Lyapunov exponent

λ̄ω(x₀) = lim_{n→+∞} (1/n) log || X_n ∗ ⋯ ∗ X₁ ∗ x₀ ||,    (64.50)

independent of x₀ ∈ R^d\{0}. Furstenberg's works developed deep and significant results, and Equation 64.50 is a prototype for computing the Lyapunov exponent in most of the later work which followed. The following simple discrete-time example due to Has'minskii best illustrates Furstenberg's idea.

EXAMPLE 64.4: (Has'minskii)

Consider a discrete-time system x_{n+1} = A_n(ω)x_n, with {A_n(ω)}_{n=1}^∞ a sequence of i.i.d. random matrices. Let ρ_n = log ||x_n|| and φ_n = ||x_n||⁻¹x_n. Because {A_n}_{n≥1} are independent, {A_n, φ_{n−1}}_{n≥1} forms a Markov chain in R^{d×d} × S^{d−1}. If {A_n, φ_{n−1}} is ergodic, the law of large numbers gives

λ̄ω(x₀) = lim_{n→∞} (1/n) ρ_n = lim_{n→∞} (1/n) Σᵢ₌₁ⁿ log ||Aᵢ φᵢ₋₁|| = ∫_{S^{d−1}} ∫ log ||Aφ|| μ(dA) ν(dφ),

independent of x₀ ∈ R^d\{0}, where μ is the common probability measure of {Aᵢ}ᵢ≥₁ and ν is the unique invariant probability measure of φ with respect to μ (i.e., ν is such that if φ₀ is distributed according to ν on S^{d−1}, then φ₁ = ||A₁φ₀||⁻¹A₁φ₀ has the same distribution ν on S^{d−1}).

Has'minskii [20] generalized Furstenberg's idea to a linear system of Ito stochastic differential equations and obtained a necessary and sufficient condition for the almost sure stability of the system

dx(t) = Ax(t) dt + Σᵢ₌₁^m Bᵢ x(t) dξᵢ(t),   x(0) = x₀,  t ≥ 0,    (64.51)

where ξᵢ(t) are independent standard Wiener processes. The process x(t) is then a Markov diffusion process with the infinitesimal generator

Lu = ⟨Ax, u_x⟩ + (1/2) Σᵢ,ⱼ₌₁^d σᵢⱼ(x) ∂²u/∂xᵢ∂xⱼ,    (64.52)

where u is a real-valued, twice continuously differentiable function, u_x = ∂u/∂x, ⟨·, ·⟩ is the standard inner product on R^d, and Σ(x) = [σᵢⱼ(x)] is the diffusion matrix determined by the Bᵢ. By introducing the polar coordinates ρ = log ||x|| and φ = ||x||⁻¹x for x ≠ 0, and applying Ito's differential rule, one obtains equations for (ρ, φ) (Equations 64.53 to 64.55), where

Q(φ) = φ′Aφ + (1/2) tr Σ(φ) − φ′Σ(φ)φ,

and H₀(φ), Hᵢ(φ) are the projections onto S^{d−1} of the linear vector fields Ax and Bᵢx on R^d\{0}. Then from Equation 64.55 we see that φ(t) is independent of ρ(t) and φ(t) is a Markov diffusion process on S^{d−1}. Has'minskii assumed a nondegeneracy condition of the form

(H)   α′Σ(x)α ≥ m ||α||² ||x||²,   ∀α ∈ R^d and x ∈ R^d, for some constant m > 0,

which guarantees that the angular process φ(t) is an ergodic process on S^{d−1} with a unique invariant probability measure ν. Then the time average of the right-hand side of Equation 64.54 is equal to the ensemble average, i.e.,

J = ∫_{S^{d−1}} Q(φ) ν(dφ),

independent of x₀ ∈ R^d\{0}.

THEOREM 64.7 (Has'minskii) Under condition (H), J < 0 implies that the system Equation 64.51 is almost surely stable. If J > 0, then the system Equation 64.51 is almost surely unstable and

P{ lim_{t→+∞} ||x(t, x₀, ω)|| = +∞ } = 1.

If J = 0, then

P{ lim_{t→+∞} ||x(t, x₀, ω)|| < +∞ } > 0.
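Has'minskii's discrete-time formula is easy to exercise numerically: propagate a unit vector through i.i.d. random matrices and average log ||A_i φ_{i−1}||. The matrix family below is an illustrative choice of my own, not one from the text: for i.i.d. upper-triangular matrices the Lyapunov exponents are the averages of the log-diagonal entries, so with diagonal entries a ∈ {2, 1/2} (equally likely) and 0.8, the top exponent is max(E log a, log 0.8) = 0.

```python
import math
import random

random.seed(7)

def sample_matrix():
    # Illustrative i.i.d. triangular matrices [[a, 1], [0, 0.8]] with
    # a in {2, 1/2} equally likely: E[log a] = 0, log 0.8 < 0,
    # so the top Lyapunov exponent of the product is 0.
    a = 2.0 if random.random() < 0.5 else 0.5
    return ((a, 1.0), (0.0, 0.8))

def top_exponent(n=200_000):
    phi = (1.0, 0.0)     # unit vector phi_{i-1}
    acc = 0.0            # running sum of log ||A_i phi_{i-1}||
    for _ in range(n):
        M = sample_matrix()
        v0 = M[0][0]*phi[0] + M[0][1]*phi[1]
        v1 = M[1][0]*phi[0] + M[1][1]*phi[1]
        norm = math.hypot(v0, v1)
        acc += math.log(norm)
        phi = (v0/norm, v1/norm)
    return acc / n

est = top_exponent()
```

The per-step renormalization is exactly the (ρ_n, φ_n) decomposition of Example 64.4: the angular chain φ_n carries the invariant measure ν, and the exponent is the ergodic average of the log-stretch.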
To determine the sign of J, the invariant probability measure ν must be found. This can be done, in principle, by solving the so-called Fokker-Planck (forward) equation for the density p of ν. For simplicity, we consider the two-dimensional case (d = 2). It can be shown from Equation 64.55 that the angular process φ(t) satisfies an Ito equation

dφ(t) = Φ(φ) dt + Ψ(φ) dζ(t),

with ζ(t) a standard Wiener process, where Φ and Ψ are appropriate functions of φ. The process φ(t) has a generator whose Fokker-Planck equation for p (Equation 64.59) must be solved subject to normalization and consistency constraints (Equation 64.60). Condition (H) guarantees that Equation 64.59 is nonsingular in the sense that Ψ(φ) ≠ 0 for any φ ∈ S^{d−1}, and thus admits a unique solution satisfying Equation 64.60. However, even for simple but nontrivial systems, an analytic solution can usually not be obtained. The problem is even more complicated if (H) is not satisfied; this is more often than not the case in practice. In this situation, singularity on S¹ is a result of Ψ(φ) = 0 for some φ ∈ S¹, and the Markov diffusion process φ(t) may not be ergodic on the entire sphere S¹. One must examine each ergodic component to determine a complete description of the invariant probability measures. The law of large numbers is then used to compute the exponent for each x₀ belonging to the ergodic components. Has'minskii presented two second-order examples to illustrate the treatment of ergodic components and the usefulness of the theorem. In one of the examples it was shown that an unstable linear deterministic system could be stabilized by introducing white noise into the system. This settled a long-standing conjecture about the possibility of stabilizing an unstable linear system by noise. Has'minskii's work was fundamental to the later developments of the Lyapunov exponent method for studying stochastic stability.

Following Has'minskii, Kozin and his co-workers studied the case when the nondegeneracy condition (H) fails. This work made extensive use of one-dimensional diffusion theory to obtain analytic and numerical results for second-order linear white noise systems. Kozin and Prodromou [28] applied Has'minskii's idea to the random harmonic oscillator (Equation 64.61) and the damped oscillator (Equation 64.62), where the forcing term represents a Gaussian white noise process. After transforming the systems into an Ito representation, they observed that condition (H) is not satisfied, with singularities where Ψ(φ) = 0. Kozin and Prodromou then used one-dimensional diffusion theory to study the singularities and the sample path behavior of the angular process φ(t) near the singularities. The angular process traversed the circle in a clockwise direction and was never trapped at any point, nor did it remain at a singularity +π/2 or −π/2 for any positive time interval. Thus, the angular process φ(t) was ergodic on the entire circle S¹. From the law of large numbers, they obtained a formula for J, independent of x₀, in which R and L are Markov times for φ to travel the right and left circles, respectively. Using the so-called speed measure and scale measure, they were able to show the positivity of J for Equation 64.61 for σ² ∈ (0, +∞). This proved the almost sure instability of the random harmonic oscillator with white noise parametric excitation. For the damped oscillator Equation 64.62, they failed to obtain analytic results. The almost sure stability region in terms of the parameters (ζ, σ) was determined numerically.

Mitchell [42] studied two second-order Ito differential equations where the condition (H) is satisfied. By solving the Fokker-Planck equation, he obtained necessary and sufficient conditions for almost sure stability in terms of Bessel functions. Mitchell and Kozin [43] analyzed the linear second-order stochastic system,

ẋ₁ = x₂  and  ẋ₂ = −ω²x₁ − 2ζω x₂ − f₁x₁ − f₂x₂.    (64.63)

Here the fᵢ are stationary Gaussian random processes with wide-bandwidth power spectral density. After introducing a correction term due to Wong and Zakai, Equation 64.63 was written as an Ito equation and Has'minskii's idea was again applied. Because the angular process φ(t) is singular, one-dimensional diffusion theory was used to give a careful classification of the singularities on S¹ and the behavior of the sample path φ(t) near the singularities. The ergodic components were also examined in detail. The difficulty of the analytic determination of the invariant measures corresponding to the ergodic components was again experienced. An extensive numerical simulation was conducted. The almost sure stability regions were represented graphically for different parameter sets of the system. In this work, the authors provided a good explanation of the ergodic properties of the angular process φ(t) and also illustrated various concepts, such as stabilization and destabilization of a linear system by noise, by using examples corresponding to different parameter sets. Interestingly, they provided an example where the sample solutions of the system behaved just like a deterministic one. Examples were also used to show that the regions for second-moment stability may be small when compared with the regions for a.s. stability, i.e., moment
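The gap between moment stability and sample stability is easy to exhibit with the scalar linear Ito equation dx = −a x dt + σ x dW, an illustrative system of my own choosing rather than one of the examples cited above: the sample exponent is −a − σ²/2, while E{x²(t)} grows like exp((−2a + σ²)t), so for σ² > 2a > 0 every sample path decays a.s. even though the second moment diverges.

```python
import math
import random

random.seed(3)

a, sigma = 0.1, 1.0
lyap_exact = -a - sigma**2 / 2          # sample (a.s.) exponent: -0.6
moment_rate = -2*a + sigma**2           # growth rate of E{x^2}:  +0.8

def sample_exponent(T=1000.0, dt=0.01):
    # Exact log-solution on a grid: log x(t) = (-a - sigma^2/2) t + sigma W(t).
    log_x, n = 0.0, int(T/dt)
    for _ in range(n):
        dw = random.gauss(0.0, math.sqrt(dt))
        log_x += (-a - sigma**2/2)*dt + sigma*dw
    return log_x / T

est = sample_exponent()
```

The simulated sample exponent is negative while the second-moment rate is positive, which is exactly the sense in which moment-stability criteria can be far more conservative than a.s. criteria.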
stability may be too conservative to be practically useful. This is one of the reasons why a.s. stability criteria are important. A more detailed study of the case when the nondegeneracy condition is not satisfied was presented by Nishioka [45] for the second-order white noise system given by
with ξᵢ(t), i = 1, 2, being independent standard Wiener processes. Results similar to those of Kozin and his co-workers were obtained. Has'minskii's results were later refined by Pinsky [49], using the so-called Fredholm alternative and the stochastic Lyapunov function f(ρ, φ) = ρ + h(φ), in the following manner. From the system Equation 64.51 with the generator Equation 64.52, Pinsky observed that the action of L on f(ρ, φ) separates L into the radial part L_ρ and the angular part L_φ. Furthermore, L_ρ(ρ) = v(φ) is a function of φ only, and L_φ is a second-order partial differential operator which generates a Markov diffusion process on S^{d−1}. If Has'minskii's nondegeneracy condition (H) is satisfied, then L_φ is ergodic and satisfies a Fredholm alternative, i.e., the equation
L_φ h(φ) = g(φ)    (64.66)

has a unique solution, up to an additive constant, provided that

∫_{S^{d−1}} g(φ) m(dφ) = 0,    (64.67)

with m being the unique invariant probability measure of the process φ(t) on S^{d−1}. Define

q̄ = ∫_{S^{d−1}} v(φ) m(dφ),    (64.68)

and choose h(φ) as a solution of L_φ h(φ) = q̄ − v(φ). From Ito's formula, Pinsky obtained

ρ(t) + h(φ(t)) = ρ(0) + h(φ(0)) + ∫₀ᵗ Lf(ρ(s), φ(s)) ds + M_t = ρ(0) + h(φ(0)) + q̄ t + M_t,    (64.69)

where

M_t = ∫₀ᵗ H(φ(s)) dξ(s)

for an appropriate function H on S^{d−1}. M_t is a zero-mean martingale with respect to the σ-algebra generated by the process {φ(t), t ≥ 0}, and lim_{t→+∞} M_t/t = 0 a.s. Thus, upon dividing both sides of Equation 64.69 by t and taking the limit as t → +∞, the exponent is λ = q̄.

Loparo and Blankenship [35], motivated by Pinsky, used an averaging method to study a class of linear systems with general jump process coefficients, with A′ = −A and y = (y₁ … y_m)′ a jump process with bounded generator Q. Using the transformation z(t) = e^{−At}x(t), ρ(t) = log ||z(t)|| = log ||x(t)||, and φ(t) = ||z(t)||⁻¹z(t), the system was transformed into a system of equations involving (ρ, φ) which are homogeneous in y = (y₁ … y_m)′ but inhomogeneous in time. Thus an artificial process τ(t) was introduced so that [ρ(t), φ(t), τ(t), y(t)] is a time-homogeneous Markov process. Using the averaging method and the Fredholm alternative, a second-order partial differential operator L̄ was constructed; L̄ is the generator of a diffusion process on R × S^{d−1}. Applying Pinsky's idea they obtained

q = ∫_{S^{d−1}} v(θ) m(dθ),    (64.72)

where m is the unique invariant probability measure on S^{d−1} for the diffusion φ(t) generated by L̄. Unfortunately, q is not the top Lyapunov exponent for the system and cannot be used to determine the a.s. stability properties of the system. This can be easily verified by considering the random harmonic oscillator

ü(t) + k²(1 + b y(t)) u(t) = 0,    (64.73)

with y(t) ∈ {−1, +1} a telegraph process with mean time between jumps λ⁻¹ > 0. The formula for q is

q = − k²λb² / (8(k² + λ²))   a.s.,    (64.74)

independent of x₀ ∈ R²\{0}. This is only the first term in an expansion for the Lyapunov exponent λω(x₀); see the discussion below. Equations 64.72 and 64.74 do not provide sufficient conditions for almost sure stability or instability.

Following Loparo and Blankenship's idea, Feng and Loparo [13] concentrated on the random harmonic oscillator system Equation 64.73. By integration of the Fokker-Planck equation, the positivity of λω(x₀) = λ was established for any k, λ ∈ (0, +∞) and b ∈ (−1, 1)\{0}. Thus, the system Equation 64.73 is almost surely unstable for any k, λ, and b as constrained above. To compute the Lyapunov exponent λ explicitly (it is known that λω(x₀) = λ = constant a.s. for any x₀ ∈ R²\{0}), the following procedure was used. As usual, introducing polar coordinates ρ = log ||x|| and φ = ||x||⁻¹x, where x = (ku, u̇)′, the system becomes

ρ̇(t) = y g₀(φ)  and  φ̇(t) = h(φ, y)    (64.75)

for some smooth functions g₀ and h on S¹. The Markov process (ρ(t), φ(t), y(t)) has generator
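The almost sure instability of Equation 64.73 asserted above can be probed by direct simulation: between jumps of y(t) the system is a harmonic oscillator with frequency ω± = k√(1 ± b), whose flow is known in closed form, so the state can be propagated exactly over the exponential holding times. The sketch below (parameter values are illustrative choices, not from the text) estimates λ as the long-run average log-growth and should return a small positive number.

```python
import math
import random

random.seed(0)

def oscillator_exponent(k=1.0, b=0.5, rate=1.0, n_segments=100_000):
    # u'' + k^2 (1 + b y) u = 0, with y in {-1, +1} a telegraph process of
    # jump rate `rate`; exact propagation of (u, u') between jumps.
    u, v = 1.0, 0.0              # state (u, u'), renormalized each segment
    y = 1
    log_growth, t_total = 0.0, 0.0
    for _ in range(n_segments):
        tau = random.expovariate(rate)
        w = k * math.sqrt(1.0 + b*y)          # frequency on this segment
        c, s = math.cos(w*tau), math.sin(w*tau)
        u, v = u*c + (v/w)*s, -u*w*s + v*c    # exact harmonic flow
        n = math.hypot(u, v)
        log_growth += math.log(n)
        u, v = u/n, v/n
        t_total += tau
        y = -y                                 # telegraph process alternates
    return log_growth / t_total

lam = oscillator_exponent()
```

The positivity of the estimate illustrates the Feng-Loparo result: random switching of the frequency pumps energy into the oscillator, however small b is, so the sample paths grow exponentially.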
where L₁ = Q + h(φ, y) ∂/∂φ satisfies a Fredholm alternative. When L acts on the function F = ρ + h(φ) + f₁(y, φ), where f₁(y, φ) is a correction term introduced in the work of Blankenship and Papanicolaou,

LF = q₀ + L₂h(φ).    (64.76)

In Equation 64.76 the correction term is chosen as the solution of L₁f₁ = −y g₀(φ), and L₂f₁ = v(φ) is a function of φ only. Here, m̄ is the uniform measure on S¹, h(φ) is the solution of the Fredholm problem with right-hand side v(φ) − q₀, and q₀ = ∫_{S¹} v(θ) m̄(dθ). It follows that L₂h(φ) = y g₁(φ), and a martingale convergence argument gives

λ = lim_{t→+∞} (1/t) ρ(t) = q₀ + lim_{t→+∞} (1/t) ∫₀ᵗ (y g₁)(φ(s)) ds = q₀ + λ₁,

where λ₁ is the Lyapunov exponent of the system

ρ̇₁(t) = y g₁(φ)  and  φ̇(t) = h(φ, y).    (64.77)

Noting the similarity between the systems given by Equations 64.75 and 64.77, the above procedure can be repeated for Equation 64.77. Hence

λ = q₀ + λ₁ = q₀ + (q₁ + λ₂) = Σ_{k=0}^∞ q_k.    (64.78)

The absolute convergence of the above series for any k, λ ∈ (0, +∞) and b ∈ (−1, 1) was proved by using Fourier series methods, and a general term q_k was obtained. To third order (b⁶), Feng and Loparo obtained

λ = k²λb² / (8(k² + λ²)) + 5k⁴λb⁴ / (64(k² + λ²)²) + q₂,

where q₂, the term of order b⁶, involves the factors λ² + 4k² and λ² + 9k².

In the formulas for the exponent obtained by most researchers, e.g., Has'minskii with Equation 64.56, an invariant probability measure is always involved in determining and evaluating the exponent. This is a very difficult problem, which was overcome in the work of Feng and Loparo by sequentially applying the Fredholm alternative for the simple, but nontrivial, random harmonic oscillator problem. Due to the difficulty involved with the analytic determination of the invariant probability measure, several researchers have attacked the problem by using analytic expansion and perturbation techniques; some of these efforts have been surveyed by Wihstutz [57].

Auslender and Mil'shtein [6] studied a second-order system perturbed by a small white noise disturbance, as given by

dx^ε(t) = B x^ε(t) dt + ε Σ_{j=1}^k σⱼ x^ε(t) dξⱼ(t),    (64.79)

where ξⱼ(t) are independent, standard Wiener processes and 0 < ε ≪ 1 models the small noise. Assuming that Has'minskii's nondegeneracy condition (H) is satisfied, the angular process φ^ε(t) = ||x^ε(t)||⁻¹x^ε(t) is ergodic on S¹, and the invariant density p^ε satisfies the Fokker-Planck equation, Equation 64.58. The exponent λ(ε) is given by Equation 64.56. In this case, a small parameter ε is involved. Auslender and Mil'shtein computed λ(ε) to second order (ε²) and estimated the remainder term to obtain an expansion for λ(ε) to order ε² as ε ↓ 0+. For different eigenstructures of B, they obtained the following results, in which σᵣ = (σᵢⱼʳ),

Σ(φ) = (1/2) Σᵣ₌₁^k ⟨σᵣh(φ), σᵣh(φ)⟩ − Σᵣ₌₁^k ⟨σᵣh(φ), h(φ)⟩²,

and h(φ) = (cos φ, sin φ)′:

1. B = [[a, 0], [0, b]], two distinct real eigenvalues:

λ(ε) = a + Σ((0, 1)′) ε² + ε⁴ρ(ε) + ρ₀(ε),

where |ρ(ε)| ≤ m < ∞ and |ρ₀(ε)| ≤ c e^{−c₁/ε²} for some constants c and c₁ > 0.

2. B = [[a, −b], [b, a]], a, b > 0, a complex conjugate pair of eigenvalues:

λ(ε) = a + (ε²/8) Σᵣ₌₁^k [ (σᵣ²² − σᵣ¹¹)² + (σᵣ²¹ + σᵣ¹²)² ] + ε⁴R(ε),

where |R(ε)| ≤ m < +∞.

3. B = [[a, 0], [0, a]], one real eigenvalue of geometric multiplicity 2:

λ(ε) = a + ε² ∫₀^{2π} Σ(φ) p(φ) dφ,

where p(φ) is the density determined by a Fokker-Planck equation and is independent of ε.

4. B = [[a, 1], [0, a]], one real eigenvalue of geometric multiplicity 1: here the leading correction to λ(ε) is of order ε^{2/3}, with a coefficient expressed through the gamma function Γ(x); O(ε) denotes a quantity of the same order as ε (i.e., lim_{ε→0} O(ε)/ε = constant).

In the above work, an intricate computation is required to obtain λ(ε) as an expansion in powers of ε. A much easier way to obtain an expansion is direct use of perturbation analysis of the linear operator associated with the forward equation L*p = 0, where L* is the formal adjoint of L. Pinsky [50] applied this technique to the random oscillator problem, where γ is a positive constant which determines the natural frequency of the noise-free system, σ is a small parameter which signifies the magnitude of the disturbance, and ξ(t) is a finite-state Markov process with state space M = {1, 2, …, N}; F(·) is a function satisfying E{F(ξ)} = 0. After introducing polar coordinates in the form u√γ = ρ cos φ and u̇ = ρ sin φ, Pinsky obtained a pair of equations in which (φ, ξ) is a time-homogeneous Markov process with generator L, and Q is the generator of the process ξ(t). The Fokker-Planck equation of the density p_∞(φ, ξ) for the invariant probability measure of (φ, ξ) is given by L*p_∞ = 0, and the exponent is computed by averaging against p_∞. Assume an approximation of p_∞ in the form of a power series in σ (Equation 64.83). Then, it follows that

G pᵢ + L* pᵢ₋₁ = 0,  for i = 1, 2, …, n,    (64.84)

by setting the coefficients of the term σⁱ in Equation 64.83 equal to zero, with normalization and consistency constraints

∫_{S¹×M} p₀ = 1  and  ∫_{S¹×M} pᵢ = 0,  i ≥ 1.    (64.85)

By evaluating the pᵢ from Equations 64.84 and 64.85, Pinsky obtained the expansion (Equation 64.86) and proved the convergence of the series expansion by using the properties of the finite-state Markov process. Here the mean sojourn time of ξ(t) in state k enters the bound, and ψᵢ, i = 1, 2, …, N, are normalized eigenfunctions of the linear operator Q. For the case when F(ξ) = ξ is a telegraph process, Pinsky obtained a refined expansion of Equation 64.86, consisting of the first two terms in Equation 64.78, when σ → 0+.

Contemporaneous with Pinsky, Arnold, Papanicolaou, and Wihstutz [4] used a similar technique to study the random harmonic oscillator in a form in which γ, σ, and β are parameters modeling the natural frequency, the magnitude of the noise, and the time scaling of the noise. Here ξ(t) is only assumed to be an ergodic Markov process on a smooth connected Riemannian manifold M, with generator Q and unique invariant probability measure ν of ξ on M, and F(·) is a function satisfying E{F(ξ)} = 0. A summary of their results appears next.

1. Small noise (σ ↓ 0+, γ and β fixed): λ(σ) is of order σ², with the σ² coefficient expressed through f(ω), the power spectral density of F(ξ(t)).
2. Large noise (σ → +∞, γ = γ₀ + σγ₁, β = 1): a corresponding expansion of λ(σ) for large σ was obtained.
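Several of the expansions above are driven by the second-order statistics of the noise; in particular, the small-noise coefficient involves the power spectral density f(ω) of F(ξ(t)). For a two-state telegraph process with jump rate a the autocorrelation is E{ξ(t)ξ(t+τ)} = e^{−2a|τ|} (a Lorentzian spectral density), a standard fact that the following simulation sketch checks; the grid and parameters are illustrative choices.

```python
import math
import random

random.seed(5)

def telegraph_autocorr(a=1.0, lag=0.5, dt=0.01, T=8000.0):
    # Sample a telegraph path on a grid and estimate E[xi(t) xi(t+lag)].
    n = int(T/dt)
    k = int(lag/dt)
    path = [0]*n
    xi = 1
    for i in range(n):
        path[i] = xi
        if random.random() < a*dt:   # flip with probability ~ a*dt per step
            xi = -xi
    m = n - k
    acc = 0.0
    for i in range(m):
        acc += path[i]*path[i+k]
    return acc/m

rho = telegraph_autocorr()
expected = math.exp(-2*1.0*0.5)      # e^{-1}, the theoretical value
```

The exponential decay rate 2a of the autocorrelation is what sets the bandwidth of the Lorentzian f(ω), and hence the size of the σ²-order term in the small-noise exponent expansions.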
3. Fast noise (β ↓ 0+, σ and γ fixed): if γ > 0, λ(β) is of order β, with a coefficient proportional to σ²/(4γ) times a constant determined by the noise statistics; if γ < 0, λ(β) = √(−γ) + O(β), β ↓ 0+.
4. Slow noise (β → +∞, σ and γ fixed): if γ > σ max(F), the limit of λ(β) is expressed through the average, with respect to ν, of a function ψ(ξ) = tan⁻¹((σF(ξ) − γ)/√(…)).

Pardoux and Wihstutz [47] studied the two-dimensional linear white noise system Equation 64.79 by using similar perturbation techniques. Instead of using the Fokker-Planck equation, perturbation analysis was applied to the backward (adjoint) equation and a Fredholm alternative was used. Pardoux and Wihstutz were able to obtain a general scheme for computing the coefficients of the series expansion of the exponent for all powers of ε. Their results are the same as 1), 2), and 3) of Auslender and Mil'shtein if only the first two terms in the expansion are considered.

There is another approach for studying the problem, based on differential geometric concepts and nonlinear control theory, which has been suggested by Arnold and his co-workers. For a linear system, after introducing polar coordinates, the angular process φ(t) is separated from the radial process and is governed by the nonlinear differential equation (64.88), where ξ(t) is the noise process and h is a smooth function of ξ and φ. To determine the Lyapunov exponent λω(x₀), the ergodicity of the joint process [ξ(t), φ(t)] needs to be determined. This question is naturally related to the concept of reachable sets of φ(t) and invariant sets of Equation 64.88 on S^{d−1} for certain "controls" ξ(t). From this perspective, geometric nonlinear control theory is applied.

Arnold, Kliemann, and Oeljeklaus [3] considered the system given by

ẋ(t) = A[ξ(t)]x(t),   x(0) = x₀ ∈ R^d,

where A : M → R^{d×d} is an analytic function with domain M, an analytic connected Riemannian manifold, which is the state space of a stationary ergodic diffusion process ξ(t) satisfying a Stratonovich equation driven by independent standard Wiener processes. The following Lie algebraic conditions on the vector fields were posed by Arnold et al.:

A) dim LA(X₁, …, X_r)(ξ) = dim M, ∀ξ ∈ M.
B) dim LA(h(ξ, ·), ξ ∈ M)(φ) = d − 1, ∀φ ∈ P^{d−1}.

Here h(ξ, φ) is the smooth function given in Equation 64.89 for the angular process, and P^{d−1} is the projective sphere in R^d. Condition A) guarantees that a unique invariant probability density ρ of ξ on M exists, solving the Fokker-Planck equation Q*ρ = 0, with Q the generator of ξ(t). Condition B) is equivalent to the accessibility of the angular process φ governed by Equation 64.88, or the fact that the system group G acts transitively on P^{d−1}.
[ nn,=I e'lAc51);r, E R, C,
E
M,
I
finite ,
la
acts transitively on S^{d−1}; i.e., for any x, y ∈ S^{d−1} there exists g ∈ G so that g·x = y. Under these conditions, they showed that a unique invariant control set C in P^{d−1} exists which is an ergodic set, and that all trajectories φ(t) entering C remain there forever with probability one. Hence, a unique invariant probability measure μ of (ξ, φ) on M × P^{d−1} exists with support M × C. The following theorem was obtained.

THEOREM 64.8 (Arnold et al.) For the system defined by Equations 64.89 and 64.90, suppose A) and B) above hold. Then
1. λ = ∫_{M×C} q(ξ, φ) μ(dξ, dφ), with q(ξ, φ) = φᵀA(ξ)φ,  (64.91)

is the top exponent in Oseledec's theorem.

2. For each x₀ ∈ ℝᵈ\{0}, λ(x₀) = λ a.s.

3. For the fundamental matrix Φ(t) of Equation 64.89,

lim_{t→+∞} (1/t) log ‖Φ(t)‖ = λ  a.s.  (64.93)
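The identity behind Equations 64.91 and 64.93 can be checked numerically: along any trajectory, the radial part satisfies d/dt log‖x(t)‖ = q(ξ(t), φ(t)) with φ = x/‖x‖, so (1/t) log‖x(t)‖ and the time average of q coincide pathwise, and by the theorem both converge to λ. The sketch below is a hypothetical illustration (two arbitrarily chosen 2×2 matrices switched by a seeded telegraph process, explicit Euler steps), not a system from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state noise process and coefficient matrices A(1), A(2).
As = {1: np.array([[-0.5, 1.0], [0.0, -1.5]]),
      2: np.array([[-1.0, 0.0], [2.0, -0.2]])}

dt, n, rate = 1e-3, 200_000, 1.0      # horizon t = 200, switching rate 1
state = 1
x = np.array([1.0, 0.2])              # full linear system x' = A(xi) x
phi = x / np.linalg.norm(x)           # angular process on the unit circle
q_int = 0.0                           # integral of q(xi, phi) = phi' A phi

for _ in range(n):
    A = As[state]
    x = x + dt * (A @ x)
    q = phi @ A @ phi
    q_int += q * dt                   # radial equation: d/dt log||x|| = q
    phi = phi + dt * (A @ phi - q * phi)
    phi /= np.linalg.norm(phi)        # keep phi on the sphere
    if rng.random() < rate * dt:      # telegraph switching of xi(t)
        state = 3 - state

lam_from_x = np.log(np.linalg.norm(x)) / (n * dt)
lam_from_q = q_int / (n * dt)
print(lam_from_x, lam_from_q)         # agree up to discretization error
```

The agreement of the two estimates is exact in continuous time; only the Euler discretization separates them here.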
THE CONTROL HANDBOOK
Equation 64.91 is in the same form as Has'minskii's. The fact that λ(x₀) = λ = λ_max in Oseledec's theorem says that only the top exponent λ_max is realizable; i.e., observing the sample solutions of the system with probability one, one can only observe the top exponent. The difficulty is still how to determine C. Arnold [1], motivated by the results of Molchanov [44] on the undamped random oscillator, studied the relationship between the sample and moment stability properties of systems defined by Equations 64.89 and 64.90 and obtained very interesting results. Besides the conditions A) and B) above, another Lie algebraic condition was needed to guarantee that the generator L of (ξ, φ) is elliptic. Define the Lyapunov exponent for the pth moment, for p ∈ ℝ, as
g(p, x₀) = limsup_{t→+∞} (1/t) log E(‖x(t, x₀, ω)‖^p).
By Jensen's inequality, g(p, x₀) is a finite convex function of p with the following properties:

1. g(0, x₀) = 0.
2. g(p, x₀) ≥ λp, where λ(x₀) = λ a.s.
3. g(p, x₀)/p is increasing as a function of p, with x₀ ∈ ℝᵈ\{0} fixed.
4. g′(0⁻, x₀) ≤ λ ≤ g′(0⁺, x₀); here g′ denotes the derivative of g with respect to p.

Figure 64.2

Applying these results to the damped harmonic oscillator, with x = (y, ẏ)ᵀ ∈ ℝ², Arnold obtained the relation between g(p) and the moment exponent of the undamped oscillator discussed below.
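For a finite-state example, g(p) can be computed exactly, which makes these properties easy to verify. In the hypothetical scalar switched system below, ẋ = a_{y(t)}x with y(t) a two-state Markov chain of generator Q, the vector m_i(t) = E[|x(t)|^p; y(t) = i] satisfies ṁ = (p·diag(a) + Qᵀ)m, so g(p) is the largest real part of an eigenvalue of p·diag(a) + Qᵀ, and the sample exponent is λ = g′(0) = Σᵢ πᵢaᵢ. (The matrices and rates here are illustrative choices, not taken from the text.)

```python
import numpy as np

# Hypothetical scalar switched system dx/dt = a_{y(t)} x, with y(t) a
# two-state Markov chain of generator Q. Since m_i(t) = E[|x|^p, y(t)=i]
# satisfies dm/dt = (p*diag(a) + Q^T) m, the moment Lyapunov function is
# g(p) = max Re eig(p*diag(a) + Q^T).
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
a = np.array([1.0, -3.0])

def g(p):
    return max(np.linalg.eigvals(p * np.diag(a) + Q.T).real)

pi = np.array([2.0, 1.0]) / 3.0      # stationary distribution: pi @ Q = 0
sample_exponent = pi @ a             # lambda = 2/3*1 + 1/3*(-3) = -1/3

print(g(0.0))                        # property 1: g(0) = 0
print((g(1e-6) - g(-1e-6)) / 2e-6)   # g'(0) = lambda = -1/3
```

Here g(0) = 0, g is convex, and g′(0) recovers the sample exponent, matching properties 1–4 above.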
Then linear operator theory was used to study the strongly continuous semigroup T_t(p) of the generator L(p) = L + pq(ξ, φ). Here, L is the generator of (ξ, φ) and q(ξ, φ) = φᵀA(ξ)φ. Arnold showed that g(p, x₀) is actually independent of x₀, i.e., g(p, x₀) = g(p), and that g(p) is an analytic function of p with g′(0) = λ and g(0) = 0. g(p) can be characterized by the three shapes shown in Figure 64.2 (excluding the trivial case g(p) ≡ 0). If g(p) ≡ 0, then λ = 0. If g(p) ≢ 0, then besides 0, g(p) has at most one zero p₀ ≠ 0, and it follows that

1. λ = 0 iff g(p) > 0 for all p ≠ 0.
2. λ > 0 iff g(p) < 0 for some p < 0.
3. λ < 0 iff g(p) < 0 for some p > 0.
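The second zero p₀ of g can be located numerically, illustrating rule 3. The sketch below reuses the hypothetical two-state scalar switched system (generator Q, rates a chosen only for illustration), for which g(p) is the top eigenvalue of p·diag(a) + Qᵀ; since λ = g′(0) < 0, g dips below zero on (0, p₀) and a bisection finds p₀.

```python
import numpy as np

# Hypothetical scalar switched system dx/dt = a_{y(t)} x; its moment
# Lyapunov function is g(p) = max Re eig(p*diag(a) + Q^T).
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
a = np.array([1.0, -3.0])

def g(p):
    return max(np.linalg.eigvals(p * np.diag(a) + Q.T).real)

lam = (g(1e-6) - g(-1e-6)) / 2e-6    # sample exponent g'(0) = -1/3 < 0

# locate the second zero p0 != 0 of g by bisection on [0.01, 5],
# where g(0.01) < 0 < g(5)
lo, hi = 0.01, 5.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
p0 = 0.5 * (lo + hi)
print(lam, p0)                       # lam < 0 and g < 0 exactly on (0, p0)
```

For these numbers p₀ = 1/3; g is negative on (0, p₀) and positive afterwards, consistent with rule 3 (λ < 0 iff g(p) < 0 for some p > 0).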
In the case tr A(ξ) ≡ 0, more information can be obtained about the second zero p₀ ≠ 0 of g(p). Under the conditions that ξ(t) is a reversible process and that φ(t) is reachable on P^{d−1}, Arnold showed that p₀ = −d, where d is the system dimension. If tr A(ξ) ≢ 0, then the system of Equation 64.89 is equivalent to one with coefficient matrix d⁻¹ tr A(ξ)·I + A₀(ξ), where tr A₀(ξ) = 0. Observing that d⁻¹ tr A(ξ)·I commutes with A₀(ξ), it follows that

g(p) = α₀p + g₀(p),

where α₀ = d⁻¹ tr{E[A(ξ)]} and g₀(p) is the exponent for the pth moment of ẏ(t) = A₀(ξ)y(t). Therefore g₀(−d) = 0.

For the damped oscillator this gives g(p) = −βp + g₀(p), where g₀(p) is the moment exponent for the undamped oscillator, with g₀(−2) = 0. Thus, it follows that

1. For the undamped oscillator, Equation 64.97, if β = 0, then λ > 0 and g(p) > 0 for p > 0. This implies almost sure sample instability and pth moment instability for p > 0.
2. We can stabilize the undamped oscillator by introducing positive damping β > 0 so that g(p) < 0 for p ∈ (0, p₁), where p₁ is some positive real number.

We remark here that the random oscillator problem has attracted considerable attention from researchers during the past thirty years. Equation 64.97 occurs in many science and engineering applications, such as mechanical or electrical circuits, solid state theory, wave propagation in random media, and electric power systems. It is also of particular interest from a theoretical viewpoint because of the simple structure of the model. From result 3) above, we see that pth moment stability for some p > 0 will imply almost sure sample stability. One natural question to ask is the converse, i.e., does almost sure sample stability have any implication for the pth moment stability? Result 2) above for the random oscillator gives a partial answer to this question. One may ask whether p₁ in 2) can be arbitrarily large or, equivalently, when does sample stability (λ < 0) imply pth moment stability for all p > 0? If the limit

γ = lim_{p→+∞} g₀(p)/p

exists and is finite, then we may choose the damping β > γ sufficiently large so that g(p) < 0 for all p > 0. Hence, γ is the quantity which characterizes the questioned property.

An interesting problem in the stability study of stochastic systems is when an unstable linear system can be stabilized by introducing noise into the system. This question has attracted the attention of researchers for a long time. Has'minskii [20] presented an example giving a positive answer to the question. Arnold, Crauel, and Wihstutz [2] presented a necessary and sufficient condition for stabilization of a linear system ẋ(t) = Ax(t) by stationary ergodic noise F(t) ∈ ℝ^{d×d}, in the sense that the new system ẋ(t) = [A + F(t)]x(t) will have a negative top exponent and hence is almost surely stable. This result is presented next.
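Before the formal statement, the lower bound behind this result, λ_max[A + F(t)] ≥ d⁻¹ tr(A), can be observed in simulation. The sketch below is a hypothetical 2×2 example (diagonal A, a seeded telegraph-type skew perturbation standing in for a zero-mean F(t), Euler integration with renormalization); for symmetric A and skew-symmetric F the estimated exponent must fall between tr(A)/2 = −2 and λ_max(A) = −1, up to discretization and finite-time error.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[-1.0, 0.0],
              [0.0, -3.0]])       # tr(A)/d = -2, lambda_max(A) = -1
J = np.array([[0.0, -1.0],
              [1.0, 0.0]])        # skew-symmetric rotation generator
k = 10.0                          # perturbation magnitude
M = {1.0: A + k * J, -1.0: A - k * J}

dt, n, dwell = 2e-4, 500_000, 500   # horizon t = 100; sign held 0.1 units
x = np.array([1.0, 0.3])
log_growth, sign = 0.0, 1.0
for i in range(n):
    if i % dwell == 0:            # telegraph-type sign with E[sign] = 0
        sign = 1.0 if rng.random() < 0.5 else -1.0
    x = x + dt * (M[sign] @ x)    # perturbed system, F(t) = k*sign(t)*J
    r = np.linalg.norm(x)
    log_growth += np.log(r)
    x /= r                        # renormalize to avoid underflow
lam_est = log_growth / (n * dt)
print(lam_est)                    # theory: between tr(A)/2 = -2 and -1
```

The strong rotation smears the direction of x over the circle, driving the exponent down toward the average tr(A)/2, but no zero-mean perturbation can push it below that value.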
THEOREM 64.9 (Arnold, Crauel, and Wihstutz) Given the system ẋ(t) = Ax(t) with A ∈ ℝ^{d×d} a constant matrix, the perturbed system is given by ẋ(t) = [A + F(t)]x(t), where F(t) is a stationary, ergodic, and measurable stochastic process in ℝ^{d×d} with finite mean. Then

1. For any choice of F(t) with E(F) = 0, the top Lyapunov exponent λ_max[A + F(t)] of the perturbed system satisfies λ_max[A + F(t)] ≥ d⁻¹ tr(A).
2. For any fixed ε > 0, an F(t) exists with E{F} = 0 so that λ_max[A + F(t)] ≤ d⁻¹ tr(A) + ε.

In particular, a linear system ẋ(t) = Ax(t) can be stabilized by a zero-mean stochastic process if, and only if, tr(A) < 0.

Note that 2) above implies that the undamped random harmonic oscillator cannot be stabilized by a random linear feedback control of the ergodic type. Li and Blankenship [34] generalized Furstenberg's results on products of random matrices and studied linear stochastic systems with Poisson process coefficients of the form of Equation 64.99, where the N_i(t) are independent Poisson processes (counting processes). The solution of Equation 64.99 can be written as a product of i.i.d. matrices in ℝ^{d×d} acting on the initial point x₀ ∈ ℝᵈ\{0}:

x(t) = e^{A(t − t_{N(t)})} D_{ρ_{N(t)}} e^{Aτ_{N(t)}} ⋯ D_{ρ_1} e^{Aτ_1} x₀

for t ∈ [t_{N(t)}, t_{N(t)+1}), where D_i = I + B_i for i = 1, 2, ..., m; N(t) = N₁(t) + ⋯ + N_m(t) is also a Poisson process with mean interarrival time λ⁻¹; {τ_j; j ≥ 1} are the interarrival times of N(t); t_j = τ₁ + ⋯ + τ_j are the occurrence times of N(t); and ρ_j is an indicator process which encodes which N_i(t) undergoes an increment at t_j.

Using the fact that {X_j(ω) = D_{ρ_j} exp(Aτ_j)}_{j≥1} is an i.i.d. sequence, they generalized Furstenberg's results from semisimple Lie groups to general semigroups (note that the D_i may be singular) in the following sense. Let M = S^{d−1} ∪ {0}, and let μ be the probability distribution of X₁ induced by

μ(Γ) = P{D_{ρ_1} e^{Aτ_1} ∈ Γ}, Γ ∈ B(ℝ^{d×d}),

where B(ℝ^{d×d}) is the Borel σ-algebra on ℝ^{d×d}. Let ν be an invariant probability on M with respect to μ; i.e., if x₀ ∈ M is distributed according to ν, then x₁ = X₁x₀/‖X₁x₀‖ has the same distribution ν. If Q₀ denotes the collection of the so-called extremal invariant probabilities on M with respect to μ, then Li and Blankenship obtained the following result.

THEOREM 64.10 (Li and Blankenship) For all ν ∈ Q₀,

λ(x₀) = lim_{t→+∞} (1/t) log (‖x(t, x₀)‖/‖x₀‖) = r_ν a.s.

for all x₀ ∈ E_ν, where λ_i⁻¹ is the mean interarrival time of N_i(t), λ⁻¹ is the mean interarrival time of N(t), and E_ν is an ergodic component corresponding to ν ∈ Q₀. There are only a finite number of different values of r_ν, ν ∈ Q₀, say r₁ < r₂ < ⋯ < r_ℓ, ℓ ≤ d. Furthermore, if ⋃_{ν∈Q₀} E_ν contains a basis for ℝᵈ, then the system of Equation 64.99 is asymptotically almost surely stable if r_ℓ < 0, and Equation 64.99 is almost surely unstable if r₁ > 0. In the case where r₁ < 0 and r_ℓ > 0, the stability of the system depends on the initial state x₀ ∈ ℝᵈ\{0}.

Here, the difficulty is still determining the extremal invariant probabilities. Li and Blankenship also discussed large deviations and the stabilization of the system by linear state feedback. Motivated by Oseledec's and Furstenberg's work, Feng and Loparo [14], [15] studied the linear system given by

ẋ(t) = A(y(t))x(t), t ≥ 0, x(0) = x₀ ∈ ℝᵈ,  (64.100)

where y(t) ∈ N = {1, 2, ..., n} is a finite-state Markov process with infinitesimal generator Λ = (λ_ij) and initial probabilities (p₁, ..., p_n), and A(i) = A_i ∈ ℝ^{d×d}. By extensively using properties of finite-state Markov processes and a sojourn time description of the process y(t), they related the Oseledec spaces of the system to invariant subspaces of the constituent systems ẋ = A_i x and obtained a spectrum theorem for the Lyapunov exponents of the stochastic system of Equation 64.100. The exponents in the spectrum theorem given
below are exactly those exponents in Oseledec's theorem which are physically realizable. They also showed that the spectrum obtained is actually independent of the choice of the initial probabilities (p₁, ..., p_n) of y(t); thus, stationarity is not required. The theorem is stated next.

THEOREM 64.11 (Feng and Loparo) For the system of Equation 64.100, suppose p_i > 0 and λ_ij > 0 for all i, j ∈ N = {1, 2, ..., n}. Then there exist k real numbers λ₁ < λ₂ < ⋯ < λ_k, k ≤ d, and an orthonormal basis {e₁, ..., e_d} for ℝᵈ such that

ℒ_j = span{e₁, ..., e_{i_j}}

is an i_j-dimensional subspace of ℝᵈ and ℒ₁ ⊂ ℒ₂ ⊂ ⋯ ⊂ ℒ_k = ℝᵈ is a filtration of ℝᵈ, i.e., ℒ_j = ℒ_{j−1} ⊕ span{e_{i_{j−1}+1}, ..., e_{i_j}}, where "⊕" denotes the direct sum of subspaces. Then

1. ℒ_j = {x ∈ ℝᵈ\{0} : λ(x) ≤ λ_j a.s.}, and λ(x) = λ_j a.s. iff x ∈ ℒ_j\ℒ_{j−1}, for j = 1, 2, ..., k.
2. ℒ_j is an A_i-invariant subspace of ℝᵈ for all i = 1, 2, ..., n and j = 1, 2, ..., k.
3. All of the results above are independent of the initial probabilities (p₁, ..., p_n) chosen.
4. If y(t) is stationary, λ_k is the top exponent in Oseledec's theorem.

Stability results can be obtained directly from the theorem; e.g., if λ_i < 0 and λ_{i+1} > 0, then

P( lim_{t→+∞} ‖x(t, x₀, ω)‖ = 0 ) = 1, if x₀ ∈ ℒ_i,

and

P( lim_{t→+∞} ‖x(t, x₀, ω)‖ = +∞ ) = 1, if x₀ ∈ ℝᵈ\ℒ_i.

64.4 Conclusions

In this chapter we have introduced the concept of stability for stochastic systems and presented two techniques for stability analysis. The first extends Lyapunov's direct method of stability analysis for deterministic systems to stochastic systems. The main ingredient is a Lyapunov function and, under certain technical conditions, the stability properties of the stochastic system are determined by the "derivative" of the Lyapunov function along sample solutions. As in the deterministic theory of Lyapunov stability, a major difficulty in applications is constructing a proper Lyapunov function for the system under study. It is well known that, although moment stability criteria for a stochastic system may be easier to determine, often these criteria are too conservative to be practically useful. Therefore, it is important to determine the sample path (almost sure) stability properties of stochastic systems; it is the sample paths, not the moments, that are observed in applications. Focusing primarily on linear stochastic systems, we presented the concepts and theory of the Lyapunov exponent method for the sample stability of a stochastic system. Computational difficulties in computing the top Lyapunov exponent, or its algebraic sign, must still be resolved before this method will be of practical value.

References

[1] Arnold, L., A formula connecting sample and moment stability of linear stochastic systems, SIAM J. Appl. Math., 44, 793-802, 1984.
[2] Arnold, L., Crauel, H., and Wihstutz, V., Stabilization of linear systems by noise, SIAM J. Control Optimiz., 21(3), 451-461, 1983.
[3] Arnold, L., Kliemann, W., and Oeljeklaus, E., Lyapunov exponents of linear stochastic systems, in Lecture Notes in Math., No. 1186, Springer, Berlin-Heidelberg-New York-Tokyo, 1985.
[4] Arnold, L., Papanicolaou, G., and Wihstutz, V., Asymptotic analysis of the Lyapunov exponent and rotation number of the random oscillator and application, SIAM J. Appl. Math., 46(3), 427-449, 1986.
[5] Arnold, L. and Wihstutz, V., Eds., Lyapunov Exponents, Lecture Notes in Math., No. 1186, Springer, Berlin-Heidelberg-New York-Tokyo, 1985.
[6] Auslender, E.I. and Mil'shtein, G.N., Asymptotic expansion of the Lyapunov index for linear stochastic systems with small noise, Prob. Math. Mech. USSR, 46, 277-283, 1983.
[7] Bellman, R., Limit theorem for non-commutative operators I, Duke Math. J., 491-500, 1954.
[8] Bertram, J.E. and Sarachik, P.E., Stability of circuits
with randomly time-varying parameters, Trans. IRE, PGIT-5, Special Supplement, p. 260, 1959.
[9] Bitsris, G., On the stability in the quadratic mean of stochastic dynamical systems, Int. J. Control, 41(4), 1061-1075, 1985.
[10] Brockett, R.W., Finite Dimensional Linear Systems, John Wiley & Sons, New York, 1970.
[11] Brockett, R.W. and Willems, J.C., Average value criteria for stochastic stability, in Lecture Notes in Math., No. 294, Curtain, R.F., Ed., Springer, New York, 1972.
[12] Caughey, T.K. and Gray, A.H., Jr., On the almost sure stability of linear dynamic systems with stochastic coefficients, J. Appl. Mech., 32, 365, 1965.
[13] Feng, X. and Loparo, K.A., Almost sure instability of the random harmonic oscillator, SIAM J. Appl. Math., 50(3), 744-759, 1990.
[14] Feng, X. and Loparo, K.A., A nonrandom spectrum for Lyapunov exponents of linear stochastic systems, Stochastic Anal. Appl., 9(1), 25-40, 1991.
[15] Feng, X. and Loparo, K.A., A nonrandom spectrum theorem for products of random matrices and linear stochastic systems, J. Math. Syst. Estimation Control, 2(3), 323-338, 1992.
[16] Furstenberg, H. and Kesten, H., Products of random matrices, Ann. Math. Statist., 31, 457-469, 1960.
[17] Furstenberg, H., Noncommuting random products, Trans. Amer. Math. Soc., 108, 377-428, 1963.
[18] Furstenberg, H., A Poisson formula for semi-simple Lie groups, Ann. of Math., 77, 335-386, 1963.
[19] Has'minskii, R.Z., A limit theorem for solutions of differential equations with random right-hand sides, Theory Prob. Appl., 11, 390-406, 1966.
[20] Has'minskii, R.Z., Necessary and sufficient conditions for the asymptotic stability of linear stochastic systems, Theory Prob. Appl., 12, 144-147, 1967.
[21] Has'minskii, R.Z., Stability of Systems of Differential Equations under Random Perturbation of Their Parameters, "Nauka", Moscow, 1969.
[22] Has'minskii, R.Z., Stochastic Stability of Differential Equations, Sijthoff and Noordhoff, Maryland, 1980.
[23] Infante, E.F., On the stability of some linear nonautonomous random systems, J. Appl. Mech., 35, 7-12, 1968.
[24] Kats, I.I. and Krasovskii, N.N., On the stability of systems with random parameters, Prikl. Mat. Mekh., 24, 809, 1960.
[25] Kleinman, D.L., On the stability of linear stochastic systems, IEEE Trans. Automat. Control, AC-14, 429-430, 1969.
[26] Kozin, F., On almost sure stability of linear systems with random coefficients, J. Math. Phys., 43, 59, 1963.
[27] Kozin, F., A survey of stability of stochastic systems, Automatica, 5, 95-112, 1969.
[28] Kozin, F. and Prodromou, S., Necessary and sufficient conditions for almost sure sample stability of linear Ito equations, SIAM J. Appl. Math., 21(3), 413-424, 1971.
[29] Kozin, F. and Wu, C.M., On the stability of linear stochastic differential equations, J. Appl. Mech., 40, 87-92, 1975.
[30] Kushner, H.J., Stochastic Stability and Control, Academic Press, New York, 1967.
[31] Kushner, H.J., Introduction to Stochastic Control Theory, Holt, Rinehart and Winston, New York, 1971.
[32] Kushner, H.J., Stochastic stability, in Lecture Notes in Math., No. 294, Curtain, R.F., Ed., Springer, New York, 97-124, 1972.
[33] Ladde, G.S. and Siljak, D.D., Connective stability of large scale systems, Int. J. Syst. Sci., 6(8), 713-721, 1975.
[34] Li, C.W. and Blankenship, G.L., Almost sure stability of linear stochastic systems with Poisson process coefficients, SIAM J. Appl. Math., 46(5), 875-911, 1986.
[35] Loparo, K.A. and Blankenship, G.L., Almost sure instability of a class of linear stochastic systems with jump parameter coefficients, in Lyapunov Exponents, Lecture Notes in Math., No. 1186, Springer, Berlin-Heidelberg-New York-Tokyo, 1985.
[36] Loparo, K.A. and Feng, X., Lyapunov exponent and rotation number of two-dimensional linear stochastic systems with telegraphic noise, SIAM J. Appl. Math., 53(1), 283-300, 1992.
[37] Lyapunov, A.M., Problème générale de la stabilité du mouvement, Comm. Soc. Math. Kharkov, 2, 1892; 3, 1893. Reprint: Ann. of Math. Studies, 17, Princeton Univ. Press, Princeton, 1949.
[38] Mahalanabis, A.K. and Purkayastha, S., Frequency domain criteria for stability of a class of nonlinear stochastic systems, IEEE Trans. Automat. Control, AC-18(3), 266-270, 1973.
[39] Man, F.T., On the almost sure stability of linear stochastic systems, J. Appl. Mech., 37(2), 541, 1970.
[40] Michel, A.N., Stability analysis of stochastic large scale systems, Z. Angew. Math. Mech., 55, 93-105, 1975.
[41] Michel, A.N. and Rasmussen, R.D., Stability of stochastic composite systems, IEEE Trans. Automat. Control, AC-21, 89-94, 1976.
[42] Mitchell, R.R., Sample stability of second order stochastic differential equations with nonsingular phase diffusion, IEEE Trans. Automat. Control, AC-17, 706-707, 1972.
[43] Mitchell, R.R. and Kozin, F., Sample stability of second order linear differential equations with wide band noise coefficients, SIAM J. Appl. Math., 27, 571-605, 1974.
[44] Molchanov, S.A., The structure of eigenfunctions of one-dimensional unordered structures, Math. USSR Izvestija, 12, 69-101, 1978.
[45] Nishioka, K., On the stability of two-dimensional linear stochastic systems, Kodai Math. Sem. Rep., 27, 221-230, 1976.
[46] Oseledec, V.I., A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems, Trans. Moscow Math. Soc., 19, 197-231, 1969.
[47] Pardoux, E. and Wihstutz, V., Two-dimensional linear stochastic systems with small diffusion, SIAM J. Appl. Math., 48, 442-457, 1988.
[48] Parthasarathy, A. and Evan-Iwanowski, R.M., On the almost sure stability of linear stochastic systems, SIAM J. Appl. Math., 34(4), 643-656, 1978.
[49] Pinsky, M.A., Stochastic stability and the Dirichlet problem, Comm. Pure Appl. Math., 27, 311-350, 1974.
[50] Pinsky, M.A., Instability of the harmonic oscillator with small noise, SIAM J. Appl. Math., 46(3), 451-463, 1986.
[51] Rasmussen, R.D. and Michel, A.N., On vector Lyapunov functions for stochastic dynamic systems, IEEE Trans. Automat. Control, AC-21, 250-254, 1976.
[52] Siljak, D.D., Large-Scale Dynamic Systems, North Holland, Amsterdam, 1978.
[53] Socha, L., Application of Yakubovich criteria for stability of nonlinear stochastic systems, IEEE Trans. Automat. Control, AC-25(2), 1980.
[54] Socha, L., The asymptotic stochastic stability in large of the composite systems, Automatica, 22(5), 605-610, 1986.
[55] Walker, J.A., On the application of Lyapunov's direct method to linear lumped-parameter elastic systems, J. Appl. Mech., 41, 278-284, 1974.
[56] Wiens, G.J. and Sinha, S.C., On the application of Lyapunov's direct method to discrete dynamic systems with stochastic parameters, J. Sound Vib., 94(1), 19-31, 1984.
[57] Wihstutz, V., Parameter dependence of the Lyapunov exponent for linear stochastic systems. A survey, in Lyapunov Exponents, Lecture Notes in Math., No. 1186, Springer, Berlin-Heidelberg-New York-Tokyo, 1985.
[58] Willems, J.L., Lyapunov functions and global frequency domain stability criteria for a class of stochastic feedback systems, in Lecture Notes in Math., No. 294, Curtain, R.F., Ed., Springer, New York, 1972.
[59] Willems, J.L., Mean square stability criteria for stochastic feedback systems, Int. J. Syst. Sci., 4(4), 545-564, 1973.
Stochastic Adaptive Control

T.E. Duncan and B. Pasik-Duncan
Department of Mathematics, University of Kansas, Lawrence, KS

65.1 Introduction ............................................... 1127
65.2 Adaptive Control of Markov Chains ......................... 1127
     A Counterexample to Convergence • A Counterexample to Convergence
65.3 Continuous Time Linear Systems ............................ 1130
65.4 Continuous Time Nonlinear Systems ......................... 1133
References .................................................... 1136

65.1 Introduction

Stochastic adaptive control¹ has become a very important area of activity in control theory. This activity has developed especially during the past 25 years or so because many physical phenomena for which control is required are best modeled by stochastic systems that contain unknown parameters. Stochastic adaptive control is the control problem of an unknown stochastic system. Typical stochastic systems are described by Markov chains, discrete time linear systems, and continuous time linear and nonlinear systems. In each case a complete solution of a stochastic adaptive control problem means that a family of strongly consistent estimators of the unknown parameter is given and an adaptive control that achieves the optimal ergodic cost for the unknown system is given. The problem of determining the unknown parameter is an identification problem. Two well-known identification schemes are least squares and maximum likelihood. In some cases these two schemes give the same family of estimates. Adaptive controls are often formed by the certainty equivalence principle; that is, a control is determined by computing the optimal control assuming that the parameter estimate is the true parameter value. Each type of stochastic system typically has special features which require special procedures for the construction of a good family of estimates or a good adaptive control.

A brief review of a few notions from stochastic analysis is given to facilitate the subsequent discussion. Let (Ω, F, P) be a complete probability space; that is, Ω is the sample space and F is a σ-algebra of subsets of Ω that contains all of the subsets of P-measure zero. Let (F_t, t ∈ ℝ₊) be an increasing family of sub-σ-algebras of F that contain all sets of P-measure zero. Such a family is often called a filtration. A Markov process is a family of random variables (X_t, t ∈ ℝ₊) on (Ω, F, P) such that, if t > s,

P(X_t ∈ A | X_u, u ≤ s) = P(X_t ∈ A | X_s)  almost surely (a.s.)

A standard n-dimensional Brownian motion (or Wiener process) (B_t, t ∈ ℝ₊) on (Ω, F, P) is a family of ℝⁿ-valued random variables such that B₀ = 0; for t > s, B_t − B_s is Gaussian with zero mean and covariance (t − s)I; and the process has independent increments. The process (Y_t, F_t; t ∈ ℝ₊) is an (F_t, t ∈ ℝ₊) martingale if Y_t is integrable for all t ∈ ℝ₊ and, if t > s, E[Y_t | F_s] = Y_s a.s.

¹Supported in part by NSF Grant DMS 9305936.
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
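The defining properties of the Wiener process can be checked on simulated paths. The sketch below (seeded generator, generous statistical tolerances) verifies that B_t − B_s has mean 0 and variance t − s, and is uncorrelated with B_s, consistent with independent increments.

```python
import numpy as np

rng = np.random.default_rng(42)

dt, n_steps, n_paths = 0.01, 100, 20_000
# Each row is one standard Brownian path on [0, 1], built from Gaussian
# increments with variance dt, so B(t_k) with t_k = (k+1)*dt.
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

Bs, Bt = B[:, 49], B[:, 99]        # B(0.5) and B(1.0)
inc = Bt - Bs                      # increment over (0.5, 1.0]
print(inc.mean(), inc.var())       # approx 0 and t - s = 0.5
print(np.mean(Bs * inc))           # approx 0: independent increments
```

The sample moments converge at rate n_paths^(−1/2), so the printed values are statistical approximations, not exact identities.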
65.2 Adaptive Control of Markov Chains

In this section the problem of adaptive control of Markov chains is considered. A Markov chain is described by its transition probabilities. It is assumed that the transition probabilities from the state i to the state j depend on a parameter which is unknown. The adaptive control procedure consists of two stages. The first is the estimation of the unknown parameter: the Markov chain is observed at each time t, and on the basis of this observation the unknown parameter is estimated. The second stage is the construction of the control as if the estimate were the true value of the parameter. The control should be optimal in the sense that an appropriate cost function is minimized. It is important to know when the sequence of estimates converges to the true value of the parameter and when the sequence of controls, as functions of the estimates, converges to the control that is a function of the true value of the parameter. Is it possible to achieve the optimal cost? Some results in adaptive control of Markov chains that were obtained by Mandl, Varaiya, Borkar, Kumar, Becker, and many others are surveyed.

Let S denote a finite state space of the Markov chain and U the finite set of available controls. For each parameter α ∈ I, p(i, j, u, α) is the probability of going from state i to state j using the control u. Precisely speaking, the transition probability at time t depends upon the control action u_t taken at t and upon
a parameter α:

Prob(x_{t+1} = j | x_t = i) = p(i, j, u_t, α).

At each time t, x_t is observed and, based upon its value, u_t is selected from the set U. The parameter α has the constant value α₀, which is not known in advance. However, it is known that α₀ belongs to a fixed set I. For each parameter α ∈ I, φ(α, ·) : S → U is a feedback control law. At time t we apply the action u_t = φ(α̂_t, x_t), where x_t is the state of the system at time t and α̂_t is the maximum likelihood estimate of the unknown parameter. The estimate α̂_t thus satisfies

P{x₀, ..., x_t | x₀, u₀, ..., u_{t−1}, α̂_t} = ∏_{s=0}^{t−1} p(x_s, x_{s+1}, u_s, α̂_t)
  ≥ ∏_{s=0}^{t−1} p(x_s, x_{s+1}, u_s, α) = P{x₀, ..., x_t | x₀, u₀, ..., u_{t−1}, α}  (65.1)

for all α ∈ I. If the likelihood function is maximized at more than one value of α, then a unique value is assumed to be chosen according to some fixed priority ordering. Having determined α̂_t according to Equation 65.1, the control action is selected to be u_t = φ(α̂_t, x_t). The assumptions are the following:

1. S and U are finite sets.
2. I is a finite set [1] or a compact set [13], [9].
3. There is an ε > 0 such that for every i, j either p(i, j, u, α) > ε for all u, α or p(i, j, u, α) = 0 for all u, α.
4. For every i, j there is a sequence i₀, i₁, ..., i_{r+1} such that for all u and α, p(i_{s−1}, i_s, u, α) > 0, s = 1, ..., r + 1, where i₀ = i and i_{r+1} = j.

Assumption 1 is motivated by the fact that often a computer is used which can read only a finite number of (possibly quantized) states and can guarantee only a finite number of (possibly quantized) actions. Assumption 3 guarantees that the probability measures Prob{x₀, ..., x_t | x₀, u₀, ..., u_{t−1}, α}, α ∈ I, are mutually absolutely continuous. Since the estimation procedure will, in finite time, eliminate from future consideration those parameter values which do not yield a measure with respect to which the measure induced by α₀ is absolutely continuous, this assumption is not restrictive. Assumption 4 guarantees that the Markov chain generated by the transition probabilities p(i, j, φ(α, i), α) has a single ergodic class; in other words, assumption 4 guarantees that all states communicate with each other. Some such condition is clearly needed for identification.

Many questions arise in the problems of adaptive control of Markov chains.

1. Does the sequence {α̂_t}_{t=1}^∞ converge almost surely?
2. Does the sequence {α̂_t}_{t=1}^∞ converge to the true value α₀ almost surely?
3. Does the sequence {u_t}_{t=1}^∞, where u_t = φ(α̂_t, x_t), converge almost surely?
4. Does the sequence {u_t}_{t=1}^∞, where u_t = φ(α̂_t, x_t), converge to u_t = φ(α₀, x_t) almost surely?
5. Does the average cost C_t = (1/t) Σ_{k=0}^{t−1} c(x_k, u_k) converge almost surely?
6. Does the average cost C_t = (1/t) Σ_{k=0}^{t−1} c(x_k, u_k) converge to J(α₀) almost surely? Here J(α₀) is the optimal cost achievable for the parameter α₀, which is assumed not to depend on the initial state x₀.
7. At what rate do these quantities converge, if they do?

Some examples are given to show that the answers to 2, 4, and 6 may be negative. The answer to the most important question, "Does the sequence of the parameter estimates {α̂_t} converge to the true parameter value α₀?", can be "no". The following example illustrates such a situation.

65.2.1 A Counterexample to Convergence

Consider the two-state system with the unknown parameter α ∈ {0.01, 0.02, 0.03} and with the true value α₀ = 0.02 [1]. The feedback law is u = φ(0.01, ·) = φ(0.03, ·) = 2 and φ(0.02, ·) = 1. The transition probabilities are such that p(1, 2, 1, α) = 0.5 + α, while p(i, j, 2, α) does not depend on α. The initial state is x₀ = 1. Suppose u₀ = 1. Then at t = 1 there are two possibilities:

1. x₁ = 1, with p(1, 1, u₀, 0.01) = 0.49, p(1, 1, u₀, 0.02) = 0.48, p(1, 1, u₀, 0.03) = 0.47, so that the estimate is α̂₁ = 0.01; or
2. x₁ = 2, with p(1, 2, u₀, 0.01) = 0.51, p(1, 2, u₀, 0.02) = 0.52, p(1, 2, u₀, 0.03) = 0.53, so that the estimate is α̂₁ = 0.03.

In either case u₁ = 2. Since p(i, j, 2, α) does not depend on α, it follows that the estimate will remain unchanged. Thus, α̂_t ≡ 0.01 if x₁ = 1 or α̂_t ≡ 0.03 if x₁ = 2, and so α₀ cannot be a limit point of {α̂_t}. The same situation occurs when questions 4 and 6 are asked. The example shows that the answers to the questions
4. Does the sequence {φ(α̂_t, ·)} converge to φ(α₀, ·) almost surely?

and

6. Does C_t = (1/t) Σ_{s=0}^{t−1} c(x_s, u_s) converge to J(α₀) almost surely? (Recall that J(α₀) denotes the optimal cost for α₀.)

can be negative. Kumar and Becker [11] also give an example illustrating that the sequence of parameter estimates need not converge to the true value.
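For a finite parameter set, the maximization in Equation 65.1 amounts to comparing log-likelihoods accumulated along the observed trajectory. Below is a minimal sketch; the two-state transition model and the particular trajectory are hypothetical illustrations, not the examples from the text.

```python
import math

def ml_estimate(states, controls, candidates, p):
    """Return the alpha in `candidates` maximizing the likelihood of
    Equation 65.1, i.e. sum_s log p(x_s, x_{s+1}, u_s, alpha)."""
    def loglik(alpha):
        return sum(math.log(p(states[s], states[s + 1], controls[s], alpha))
                   for s in range(len(states) - 1))
    return max(candidates, key=loglik)

# Hypothetical model: from state 1 under control 1 the chain moves to
# state 2 with probability alpha; every other transition has probability 0.5.
def p(i, j, u, alpha):
    if i == 1 and u == 1:
        return alpha if j == 2 else 1.0 - alpha
    return 0.5

# A handcrafted trajectory with 7 transitions 1->2 and 3 transitions 1->1
states = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1]
controls = [1] * (len(states) - 1)

alpha_hat = ml_estimate(states, controls, [0.3, 0.5, 0.7], p)
print(alpha_hat)   # 0.7, the candidate nearest the empirical frequency 7/10
```

The certainty-equivalence control would then be u_t = φ(alpha_hat, x_t) for a given feedback law φ.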
65.2.2 A Counterexample to Convergence

Consider the two-state system S = {1, 2} with the unknown parameter set I = {1, 2, 3} and the true value α₀ = 1 [11]. The transition probabilities are the following:

p(1, 1, 1, 1) = p(1, 1, 1, 3) = 0.5,  p(1, 1, 1, 2) = 0.9,
p(1, 1, 2, 1) = p(1, 1, 2, 2) = 0.8,  p(1, 1, 2, 3) = 0.2,
p(2, 1, u, α) = 1 for all u and α.

The cost function is the following:
It is easy to calculate that the optimal feedback law is φ(1, i) = 1 for i = 1, 2 and φ(2, i) = φ(3, i) = 2 for i = 1, 2. Assume that x₀ = 1 and u₀ = 1. With probability 0.5, x₀ goes to the state x₁ = 1, but then α̂₁ = 2. Hence, u₁ = φ(α̂₁, x₁) = 2. It is easy to verify that α̂_t = 2 for all t ≥ 1, so the probability that lim_{t→∞} α̂_t ≠ α₀ is at least 0.5.

Borkar and Varaiya [1] proved that with probability 1 the sequence {α̂_t} converges to a random variable α* such that p(i, j, φ(α*, i), α*) = p(i, j, φ(α*, i), α₀) for all i, j ∈ S. It means that the transition probabilities are the same for the limit of the parameter estimates as for the true value, so that the two parameters α* and α₀ are indistinguishable.
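The stuck-estimate phenomenon of this counterexample is easy to reproduce numerically. The sketch below uses the transition probabilities quoted above; the tie-breaking rule, horizon, and number of runs are illustrative assumptions, since the text does not specify them.

```python
import random
from math import log

# p(i, j, u, alpha) from the counterexample: P1 stores the probability of
# moving to state 1 from (state i, control u) under parameter alpha.
P1 = {
    (1, 1, 1): 0.5, (1, 1, 2): 0.9, (1, 1, 3): 0.5,
    (1, 2, 1): 0.8, (1, 2, 2): 0.8, (1, 2, 3): 0.2,
}
for u in (1, 2):
    for a in (1, 2, 3):
        P1[(2, u, a)] = 1.0            # state 2 returns to state 1 w.p. 1

def phi(alpha, i):
    # optimal feedback law computed in the text: phi(1, .) = 1, else 2
    return 1 if alpha == 1 else 2

def prob(i, u, a, j):
    p = P1[(i, u, a)]
    return p if j == 1 else 1.0 - p

def run(T=200, alpha0=1):
    x, u = 1, 1                        # x0 = 1 and u0 = 1, as in the text
    loglik = {1: 0.0, 2: 0.0, 3: 0.0}
    alpha_hat = 1
    for _ in range(T):
        x_next = 1 if random.random() < P1[(x, u, alpha0)] else 2
        for a in (1, 2, 3):            # update each parameter's likelihood
            loglik[a] += log(prob(x, u, a, x_next))
        # ML estimate; ties broken toward the smaller index (an assumption)
        alpha_hat = max((1, 2, 3), key=lambda a: (loglik[a], -a))
        x = x_next
        u = phi(alpha_hat, x)          # certainty equivalence control
    return alpha_hat

random.seed(0)
runs = [run() for _ in range(2000)]
frac_wrong = sum(r != 1 for r in runs) / len(runs)
print(frac_wrong)   # bounded away from 0: the estimate sticks at alpha = 2
```

On roughly half of the sample paths the estimate locks onto the wrong but closed-loop indistinguishable value α̂ = 2, matching the 0.5 lower bound derived above.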
THEOREM 65.1 (Borkar and Varaiya). If p(i, j, u, α) ≥ ε > 0 for all i, j ∈ S, u ∈ U, α ∈ I, then there is a set N of zero measure, a random variable α*, and a finite random time T such that for ω ∉ N and t ≥ T(ω):

1. lim_{t→∞} α̂_t(ω) = α*(ω),

2. p(i, j, φ(α*(ω), i), α*(ω)) = p(i, j, φ(α*(ω), i), α₀) for all i, j ∈ S.

THEOREM 65.2 (P. R. Kumar). There exists a subset N ⊂ Ω with probability measure zero such that: ω ∉ N and {α̂_t(ω)} converges to α*(ω) imply p(i, j, φ(α*(ω), i), α*(ω)) = p(i, j, φ(α*(ω), i), α₀) for every i, j ∈ S.

The result of Borkar and Varaiya holds for every ω for which {α̂_t(ω)} converges. Kumar showed that there is an example where, with probability 1, the sequence {α̂_t} diverges even though assumptions 1, 2, 3, and 4 in Section 65.2 are satisfied. It is easy to see that Theorem 65.2 requires the convergence of the parameter estimates. So, the result of Kumar is that the conclusion of Borkar and Varaiya holds for almost every ω for which the parameter estimates converge. Even if the parameter estimates do not converge, another result of his is that if the regulator converges to a limiting feedback regulator (see definitions in [9]), then the true parameter and every limit point of the parameter estimates are indistinguishable, in the sense that under the limiting feedback law their closed-loop transition probabilities coincide.

As was mentioned before, Mandl was the first to consider the problem of adaptive control of Markov chains. He assumed that the set of controls U and the set of unknown parameters I are compact, that p(i, j, u, α) > 0, and that c(i, j, u) is continuous, where c(i, j, u), i, j ∈ S, is the reward from a transition from i to j when the control parameter equals u. The parameter estimates are obtained by minimizing

F_t(α) = Σ_{s=0}^{t−1} f(x_s, x_{s+1}, u_s, α),  α ∈ I, t = 1, 2, ....

This is called a minimum contrast estimator of α₀. The function f(i, j, u, α), u ∈ U, α ∈ I, i, j ∈ S, is the so-called contrast function. For f to be a contrast function the following conditions have to be satisfied:

1. Σ_{j=1}^{r} p(i, j, u, α')(f(i, j, u, α) − f(i, j, u, α')) ≥ 0 for u ∈ U, i ∈ S, α', α ∈ I,

2. if α, α' ∈ I, α ≠ α', then there exists an i ∈ S such that the inequality in condition 1 is strict for all u ∈ U.
THE CONTROL HANDBOOK

The result obtained by Borkar and Varaiya (Theorem 65.1) was extended to the case where the set of parameters is compact. Mandl [13] assumed that the set of parameters is compact, but he considered this situation under a condition that can be restrictive.

EXAMPLE 65.1:

Assume:

a) for i, j ∈ S either p(i, j, u, α) > 0 for u ∈ U, α ∈ I, or p(i, j, u, α) = 0 for u ∈ U, α ∈ I,

b) if α, α' ∈ I, α ≠ α', then there exists an i ∈ S for which

[p(i, 1, u, α), ..., p(i, r, u, α)] ≠ [p(i, 1, u, α'), ..., p(i, r, u, α')]

for u ∈ U (i.e., the transition vectors are different).

The contrast function leading to the maximum likelihood estimates is

f(i, j, u, α) = −ln p(i, j, u, α) whenever p(i, j, u, α) ≠ 0, and f(i, j, u, α) = 1 whenever p(i, j, u, α) = 0.

The condition b) of this example is the so-called Identifiability Condition.

THEOREM 65.3 (Mandl). If U and I are compact; p(i, j, u, α), c(i, j, u), φ(α, i), and f(i, j, u, α) are continuous functions; p(i, j, u, α) > 0; and the Identifiability Condition is satisfied, then

1. for any control, lim_{t→∞} α̂_t = α₀ almost surely,

2. if the control law is u_t = φ(α̂_t, x_t) = π_t(x₀, u₀, ..., x_t), then

lim_{t→∞} (1/t) Σ_{s=0}^{t−1} c(x_s, x_{s+1}, u_s) = J(α₀) almost surely.

The remarkable result 1 of Mandl [13] suggests that the Identifiability Condition might be too restrictive in some practical situations. To see this, Borkar and Varaiya consider the familiar Markovian system

x_{t+1} = a x_t + b u_t + ξ_t,

where x_t is a real-valued variable and {ξ_t} is a sequence of i.i.d. random variables. The unknown parameter is α = (a, b). Then, for the linear control law u_t = −g x_t and two parameter values α = (a, b) and α' = (a', b') such that a/b = a'/b' = g,

p(x_t, x_{t+1}, u_t = −g x_t, α) = p(x_t, x_{t+1}, u_t = −g x_t, α')

for all x_t, x_{t+1}, and so the Identifiability Condition cannot hold. However, in some problems, for example in the control of queueing systems, such a condition is satisfied. Others have considered special applications such as renewal processes and queueing systems, and more general state spaces. These results and improvements of the previous theorems can be found in the references given in the survey of Kumar [10] and the book by Chen and Guo [3].

Kumar and Becker [11] introduced a new family of optimal adaptive controllers that consists of an estimator and a controller and simultaneously performs the following three tasks:

1. The successive estimates of the unknown parameter made by the adaptive controller converge, in a Cesaro sense, to an estimate of the true but unknown parameter.

2. The adaptive control law converges, in a Cesaro sense, to a control law that is optimal for the true but unknown parameter, i.e.,

lim_{t→∞} (1/t) Σ_{s=0}^{t−1} 1{u_s = φ(α*, x_s)} = 1.

3. The long-term average cost achieved is, almost surely, exactly equal to the optimal long-term average cost achievable if the true parameter were known.

THEOREM 65.4 (Kumar and Becker). Let

1. p(i, j, u, α) > 0 for all i, j ∈ S, u ∈ U, α ∈ I.

2. c(i, j, u) > 0 for i, j ∈ S, u ∈ U.

3. I is a finite set.

4. o(t) is a function for which lim_{t→∞} o(t) = +∞ and lim_{t→∞} (1/t) o(t) = 0.

The estimate α̂_t and the control law u_t are chosen so that

α̂_t = arg max_{α ∈ I} [ Π_{s=0}^{t−1} p(x_s, x_{s+1}, u_s, α) ] J(α)^{−o(t)}  for t = 0, 2, 4, ...,
α̂_t = α̂_{t−1}  for t = 1, 3, 5, ...,

and u_t = φ(α̂_t, x_t). Then

5. lim_{t→∞} (1/t) Σ_{s=0}^{t−1} c(x_s, x_{s+1}, u_s) = J(α₀) almost surely.

6. lim_{t→∞} (1/t) Σ_{s=0}^{t−1} 1{α̂_s = α*} = 1 for some α* almost surely.

7. p(i, j, φ(α*, i), α*) = p(i, j, φ(α*, i), α₀) almost surely.

8. φ(α*, ·) is optimal for α₀ almost surely.

Generalizations of this result are contained in the references given in [3] and [11].
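The effect of the cost-biasing factor J(α)^{−o(t)} can be illustrated in isolation. In the sketch below the log-likelihood and cost values are hypothetical; it only shows how the bias resolves likelihood ties in favor of parameters with smaller optimal cost.

```python
import math

def biased_estimate(loglik, J, o_t):
    # maximize  log-likelihood(alpha) - o(t) * ln J(alpha),
    # i.e., the logarithm of  likelihood(alpha) * J(alpha)^(-o(t))
    return max(loglik, key=lambda a: loglik[a] - o_t * math.log(J[a]))

loglik = {1: -10.0, 2: -10.0, 3: -10.5}   # hypothetical log-likelihoods
J = {1: 1.0, 2: 2.0, 3: 0.2}              # hypothetical optimal costs J(alpha)

print(biased_estimate(loglik, J, o_t=0.0))  # -> 1 (pure ML; tie goes to 1)
print(biased_estimate(loglik, J, o_t=5.0))  # -> 3 (bias favors small cost)
```

Because o(t) → ∞ while o(t)/t → 0, the bias is strong enough to keep the estimator from settling on a pessimistic indistinguishable parameter, yet weak enough not to override the data asymptotically.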
65.3 Continuous Time Linear Systems

Another important and commonly used class of systems is the class of continuous time linear systems. The models are assumed to evolve in continuous time rather than discrete time because this assumption is natural for many models, and it is important for the study of discrete time models when the sampling rates are large and for the analysis of numerical round-off errors. The stochastic systems are described by linear stochastic differential equations. It is assumed that there are observations of the complete state. The general approach to adaptive control that is described here exhibits a splitting, or separation, of the problems of identification of the unknown parameters and adaptive control. Maximum likelihood (or, equivalently, least squares) estimates are used for the identification of the unknown constant parameters. These estimates are given recursively and are shown to be strongly consistent. The adaptive control is usually constructed by the so-called certainty equivalence principle; that is, the optimal stationary controls are computed by replacing the unknown true parameter values by the current estimates of these values. Since the optimal stationary controls can be shown to be continuous functions of the unknown parameters, the self-tuning property is verified. It is shown that the family of average costs using the control from the certainty equivalence principle converges to the optimal average cost. This verifies the self-optimizing property. A model for the adaptive control of continuous time linear stochastic systems with complete observations of the state can be described by the following stochastic differential equation
dX(t) = (A₀ + Σ_{i=1}^{p} α_i A_i) X(t) dt + B U(t) dt + dW(t),   (65.2)

where X(t) ∈ Rⁿ and U(t) ∈ Rᵐ.
each t ≥ Δ, and (K(t), t ∈ [0, Δ)) is a deterministic function. For such an adaptive control, it is elementary to verify that there is a unique strong solution of Equation 65.2. The delay Δ > 0 accounts for some time that is required to compute the adaptive control law from the observation of the solution of Equation 65.2. Let (U(t), t ≥ 0) be an admissible adaptive control and let (X(t), t ≥ 0) be the associated solution of Equation 65.2. Let A(t) = (a_{ij}(t)) and Â(t) = (â_{ij}(t)) be L(Rᵖ)-valued processes such that
To verify the strong consistency of a family of least squares estimates it is assumed that

(3.A4) lim inf_{t→∞} |det A(t)| > 0  a.s.
A_i ∈ L(Rⁿ), i = 0, ..., p, B ∈ L(Rᵐ, Rⁿ), (W(t), t ∈ R⁺) is a standard Rⁿ-valued Wiener process, and X(0) = X₀ ∈ Rⁿ. It is assumed that

(3.A1) 𝒜 ⊂ Rᵖ is compact and α ∈ 𝒜.
(3.A2) (A(α), B) is reachable for each α ∈ 𝒜.
(3.A3) The family (A_i, i = 1, ..., p) is linearly independent.

Let (F_t, t ∈ R⁺) be a filtration such that X_t is measurable with respect to F_t for all t ∈ R⁺ and (W(t), F_t, t ∈ R⁺) is a Brownian martingale. The ergodic, quadratic control problem for Equation 65.2 is to minimize the ergodic cost functional

lim sup_{t→∞} (1/t) J(X₀, U, α, t)   (65.4)

where
and t ∈ (0, ∞], X(0) = X₀; Q ∈ L(Rⁿ) and P ∈ L(Rᵐ) are self-adjoint and P⁻¹ exists; (X(t), t ∈ R⁺) satisfies Equation 65.2; and (U(t), t ∈ R⁺) is adapted to (F_t, t ∈ R⁺). It is well known [8] that if α is known, then there is an optimal linear feedback control U(t) = K X(t) such that
where K = −P⁻¹B*V and V is the unique, symmetric, nonnegative definite solution of the algebraic Riccati equation
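For a fixed (known or estimated) parameter, this gain can be computed numerically. The sketch below assumes the standard form of the algebraic Riccati equation, AᵀV + VA − VBP⁻¹BᵀV + Q = 0 (the displayed equation itself is not reproduced above), and the system matrices are illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative system matrices (not from the text).
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)    # state weight: self-adjoint, nonnegative definite
P = np.eye(1)    # control weight: self-adjoint, with P^{-1} existing

# V: unique symmetric nonnegative definite solution of the Riccati equation
# A^T V + V A - V B P^{-1} B^T V + Q = 0.
V = solve_continuous_are(A, B, Q, P)

# Optimal stationary feedback U(t) = K X(t) with K = -P^{-1} B^T V.
K = -np.linalg.solve(P, B.T @ V)

print(np.linalg.eigvals(A + B @ K).real)  # closed loop is stable (all < 0)
```

Replacing A here by A(α̂(t)) gives the certainty equivalence control described in this section.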
For an unknown α the admissible adaptive control policies (U(t), t ∈ R⁺) are linear feedback controls
The estimate of the unknown parameter vector at time t, α̂(t), for t > 0, is the minimizer of the quadratic functional of α, L(t, α), given by
where U(s) = K(s)X(s) is an admissible adaptive control. The following result [5] gives the strong consistency of these least squares estimators.
THEOREM 65.5  Let (K(t), t ≥ 0) be an admissible adaptive feedback control law. If (3.A1)-(3.A4) are satisfied and α₀ ∈ 𝒜°, the interior of 𝒜, then the family of least squares estimates (α̂(t), t > 0), where α̂(t) is the minimizer of Equation 65.9, is strongly consistent, that is,

lim_{t→∞} α̂(t) = α₀  a.s.   (65.10)

where α₀ is the true parameter vector. The family of estimates (α̂(t), t > 0) can be computed recursively because this process satisfies the equation

dα̂(t) = A⁻¹(t) (Ā(t)X(t), dX(t) − A(α̂(t))X(t) dt − B U(t) dt)   (65.11)

where (Ā(t)x, y) = ((A_i x, y)), i = 1, ..., p. Now the performance of some admissible adaptive controls is described.
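A discretized illustration of such least squares identification is sketched below. The scalar state, the zero control, and the Euler-Maruyama discretization are all simplifying assumptions, not the setting of Theorem 65.5: for dX = α₀X dt + dW, the continuous-observation least squares estimate is α̂(T) = (∫₀ᵀ X dX)/(∫₀ᵀ X² dt).

```python
import numpy as np

rng = np.random.default_rng(1)
alpha0, dt, T = -1.0, 0.01, 200.0   # true parameter, time step, horizon
n = int(T / dt)

# Euler-Maruyama simulation of dX = alpha0 * X dt + dW.
X = np.empty(n + 1)
X[0] = 1.0
dW = rng.normal(0.0, np.sqrt(dt), n)
for k in range(n):
    X[k + 1] = X[k] + alpha0 * X[k] * dt + dW[k]

# Discretization of alpha_hat(T) = (int X dX) / (int X^2 dt).
alpha_hat = np.sum(X[:-1] * np.diff(X)) / (np.sum(X[:-1] ** 2) * dt)
print(alpha_hat)   # approaches alpha0 = -1 as T grows (strong consistency)
```

The estimation error decays like the reciprocal square root of the accumulated "energy" ∫₀ᵀ X² dt, which is the scalar analogue of assumption (3.A4).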
PROPOSITION 65.1  Assume that (3.A1)-(3.A4) are satisfied and that

lim sup_{t→∞} (1/t) ∫₀ᵗ ‖K(s)‖² ds < ∞  a.s.,   (65.13)

where (K(t), t ≥ 0) is an L(Rⁿ, Rᵐ)-valued process that is uniformly bounded and such that, for a fixed Δ > 0, K(t) is measurable with respect to σ(X_u, u ≤ t − Δ) for each t ≥ Δ, and where (X(t), t ≥ 0) is the solution of Equation 65.2 with the admissible adaptive control (K(t)X(t), t ≥ 0) and α = α₀ ∈ 𝒜,
and V is the solution of the algebraic Riccati equation 65.7 with α = α₀. Then

lim inf_{T→∞} (1/T) J(X₀, U, α₀, T) ≥ tr V  a.s.   (65.14)
If U is an admissible adaptive control U(t) = K(t)X(t) such that

lim_{t→∞} K(t) = K₀  a.s.   (65.15)

where K₀ = −P⁻¹B*V, then

lim_{T→∞} (1/T) J(X₀, U, α₀, T) = tr V  a.s.   (65.16)
COROLLARY 65.1  Under the assumptions of Proposition 65.1, if Equation 65.15 is satisfied, then Equations 65.13 and 65.16 are satisfied.

In Equation 65.20, Q₁ ≥ 0 and Q₂ > 0 and U is an admissible control. Since both A and B are unknown, it is necessary to ensure sufficient excitation in the control to obtain consistency of a family of estimates. This is accomplished by a diminishing excitation control (dither) that is asymptotically negligible for the ergodic cost functional (65.20). Let (v_n, n ∈ N) be a sequence of Rᵐ-valued independent, identically distributed random variables that is independent of the Wiener process (W(t), t ≥ 0). It is assumed that E[v_n] = 0, E[v_n v_n*] = I for all n ∈ N, and there is an a > 0 such that ‖v_n‖² ≤ a a.s. for all n ∈ N. Let ε ∈ (0, 1/4) and fix it. Define the Rᵐ-valued process (V(t), t ≥ 0) as
The previous results can be combined for a complete solution of the stochastic adaptive control problem (Equations 65.2 and 65.4) [5].
THEOREM 65.6  Assume that (3.A1)-(3.A4) are satisfied. Let (α̂(t), t > 0) be the family of least squares estimates where α̂(t) is the minimizer of Equation 65.9. Let (K(t), t ≥ 0) be an admissible adaptive control law such that
A family of least squares estimates (θ̂(t), t ≥ 0) is used to estimate the unknown θ = [A, B]ᵀ. The estimate θ̂(t) is given by
where θ̂(0) and a > 0 are arbitrary. The diminishingly excited control is

U(t) = U_d(t) + V(t)   (65.25)
where V(α) is the solution of Equation 65.7 for α ∈ 𝒜. Then the family of estimates (α̂(t), t > 0) is strongly consistent,
lim_{t→∞} K(t) = K₀  a.s.   (65.17)

where K₀ = −P⁻¹B*V(α₀), and

lim_{T→∞} (1/T) J(X₀, U, α₀, T) = tr V  a.s.   (65.18)

where U_d is a "desired" control. For A stable, (A, C) controllable, and some measurability and asymptotic boundedness of the control, the family of estimates (θ̂(t), t ≥ 0) is strongly consistent [2].
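The two requirements on the dither, enough excitation for consistency but Cesaro negligibility for the ergodic cost, can be checked numerically. The piecewise-constant form V(t) = v_n n^{−ε} on [n, n+1) used below is an illustrative assumption (the text's definition of V is not reproduced), with bounded zero-mean v_n and ε ∈ (0, 1/4).

```python
import numpy as np

rng = np.random.default_rng(0)
eps, N = 0.2, 100_000                  # eps in (0, 1/4)
v = rng.uniform(-1.0, 1.0, N)          # bounded, zero-mean i.i.d. dither
V = v / np.arange(1, N + 1) ** eps     # assumed diminishing excitation

energy = np.cumsum(V ** 2)             # running excitation energy

# Cesaro negligibility: (1/T) * integral of ||V||^2 tends to 0 ...
print(energy[-1] / N)
# ... while the total excitation energy diverges (sum n^{-2 eps} = infinity),
# which is what preserves consistency of the least squares estimates.
print(energy[-1])
```

The exponent matters: with 2ε < 1 the cumulative energy grows like T^{1−2ε}, unbounded but sublinear, which is exactly the balance the diminishing excitation control exploits.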
Now another model for the adaptive control of an unknown linear stochastic system is given. Let (X(t), t ≥ 0) be a controlled diffusion that is a solution of the stochastic differential equation
dX(t) = A X(t) dt + B U(t) dt + C dW(t)   (65.19)

where X(t) ∈ Rⁿ, U(t) ∈ Rᵐ, and (W(t), t ≥ 0) is a standard p-dimensional Wiener process. The probability space is (Ω, F, P) and (F_t, t ≥ 0) is an increasing family of sub-σ-algebras of F such that F₀ contains all P-null sets, (W(t), F_t, t ≥ 0) is a continuous martingale, and X(t) is F_t-measurable for all t ≥ 0. The linear transformations A, B, C are assumed to be unknown. Since the adaptive control does not depend on C, it suffices to estimate the pair (A, B). For notational simplicity let θᵀ = [A, B]. For the adaptive control problem, it is required to minimize the ergodic cost functional

lim sup_{t→∞} J(t, U),  J(t, U) = (1/t) ∫₀ᵗ (Xᵀ(s) Q₁ X(s) + Uᵀ(s) Q₂ U(s)) ds.   (65.20)

THEOREM 65.7  Let ε ∈ (0, 1/4) be given in Equation 65.21. For Equation 65.19, if A is stable, (A, C) is controllable, and the control (U(t), t ≥ 0) is given by Equation 65.25, where U_d(t) is F((t − Δ) ∨ 0)-measurable for t ≥ 0, Δ > 0 is fixed, and

∫₀ᵗ ‖U_d(s)‖² ds = O(t^δ)  a.s.

as t → ∞ for some δ ∈ [0, 1 − 2ε), then θ̂(t) → θ a.s. as t → ∞, where θ = [A, B]ᵀ and θ̂(t) satisfies Equation 65.22.
Now a self-optimizing adaptive control is constructed for the unknown linear stochastic system (65.19) with the quadratic ergodic cost functional (65.20). The adaptive control switches between a certainty equivalence control and the zero control. The family of admissible controls U(Δ) is defined as follows:
U(Δ) = { U : U_d(t) ∈ F((t − Δ) ∨ 0) and U'(t) ∈ σ(V(s), (t − Δ) ∨ 0 ≤ s ≤ t) for all t ≥ 0, ‖X(t)‖² = o(t), and ∫₀ᵗ (‖X(s)‖² + ‖U(s)‖²) ds = O(t) a.s. as t → ∞ },   (65.28)

where P is the minimal solution of the algebraic Riccati equation 65.30 using A and B. Define the Rᵐ-valued process (U⁰(t), t ≥ Δ) by the equation

U⁰(t) = −Q₂⁻¹ B̂ᵀ(t − Δ) P(t − Δ) e^{Â(t−Δ)Δ} X(t − Δ)   (65.29)

where Â(t) and B̂(t) are the least squares estimates of A and B given by Equation 65.22, and P(t) is the minimal solution of the algebraic Riccati equation

Âᵀ(t) P(t) + P(t) Â(t) + Q₁ − P(t) B̂(t) Q₂⁻¹ B̂ᵀ(t) P(t) = 0   (65.30)

if Â(t) is stable, and P(t) = 0 otherwise. To define the switching in the adaptive control, two sequences of stopping times (σ_n, n = 1, 2, ...) and (τ_n, n = 1, 2, ...) are given as follows: τ_n is the first time after σ_n at which Â(t − Δ) is stable and ‖X(t − Δ)‖² satisfies the prescribed growth bound (65.31), and σ_{n+1} is the first time after τ_n at which either Â(s − Δ) fails to be stable for some s ∈ [τ_n, t) or the growth bound is violated (65.32).
65.4 Continuous Time Nonlinear Systems

The stochastic adaptive control of nonlinear systems has become one of the most important areas of activity in control theory and is significantly more difficult than for linear systems. Some reasons for these difficulties are that the processes are typically not Gaussian, so it is difficult to establish the existence and uniqueness of invariant measures, and typically the optimal control (if it exists) is not known explicitly. To overcome these difficulties, two classes of problems are considered. The first is a financial model that describes a portfolio allocation where transaction costs are included, and the second is a fairly general controlled diffusion where it is only required to find an almost self-optimizing adaptive control, that is, a control that is within a given ε > 0 of the optimal ergodic cost for the known system.
MODEL I. A well-known model for optimal portfolio selection allows the investor to choose to invest in two assets: a bond B with a fixed rate of growth r, and a stock S whose growth is governed by a Brownian motion with drift μ and variance σ². The investor controls his assets by transferring money between the stock and the bond. While it is possible to buy stocks without any cost, there is a brokerage or transaction fee that must be paid if stocks are sold: if an amount of stock with value x is sold, then the bond only increases by an amount λx, where λ ∈ (0, 1) is fixed. For such a formulation it is usually not reasonable to assume that the average rate of return of the stock, μ, is known. Let U(t) and Z(t) denote the total amount of money transferred from S to B and from B to S, respectively, at time t (U(0) = Z(0) = 0). The processes S and B can be described by the following stochastic differential equations
dS(t) = μ S(t) dt + σ S(t) dW(t) + dZ(t) − dU(t),   (65.36)

dB(t) = r B(t) dt + λ dU(t) − dZ(t),   (65.37)

where (W(t), t ≥ 0) is a real-valued standard Brownian motion. Let

Y(t) = S(t) + B(t)   (65.38)

be the total wealth of the investor at time t.

For the linear system of Section 65.3, the adaptive control (U*(t), t ≥ 0) is given by

U*(t) = U_d(t) + V(t),   (65.33)

where

U_d(t) = 0 if t ∈ [σ_n, τ_n) for some n ≥ 0, and U_d(t) = U⁰(t) if t ∈ [τ_n, σ_{n+1}) for some n ≥ 1.   (65.34)

The adaptive control U* is self-optimizing [2].

THEOREM 65.8  If A is stable and (A, C) is controllable, then the adaptive control (U*(t), t ≥ 0) given by Equation 65.33 belongs to U(Δ) and is self-optimizing for Equations 65.19 and 65.20, that is,

inf_{U ∈ U(Δ)} lim sup_{t→∞} J(t, U) = lim_{t→∞} J(t, U*) = tr(Cᵀ P C) + tr(Bᵀ P R(Δ) P B Q₂⁻¹)  a.s.   (65.35)

The control problem for the investment model is to find a pair of optimal controls (U*, Z*) such that the expected rate of growth

lim inf_{t→∞} (1/t) E[ln Y(t)]   (65.39)
is maximized. It is shown in [14] that an optimal policy exists for the known system, and for this policy the limit inferior in Equation 65.39 can be replaced by the limit

lim_{t→∞} (1/t) ln Y(t)  a.s.   (65.40)

For this optimal control pair, the limit superior in Equation 65.46 is actually a limit, and the optimal cost is given by Equation 65.53.
This control problem is solved by considering the process

X(t) = S(t)/B(t)   (65.41)

and two new control processes

L(t) = ∫₀ᵗ S⁻¹(s) dU(s)   (65.42)

and

R(t) = ∫₀ᵗ B⁻¹(s) dZ(s)   (65.43)

which represent the cumulative percentages of stocks and bonds withdrawn until time t. By the Itô formula the process (X(t), t ≥ 0) satisfies a stochastic differential equation in which the functions j(x) = x + λx² and k(x) = 1 + x appear.

Since it is assumed that the average rate of return of the stock is unknown, a in Equation 65.45 is unknown. A family of estimates (â(t), t ≥ 0) based on the observations is defined, and an adaptive control is defined from the certainty equivalence principle by using a family of adaptive double bounds (65.54), where A(â_k) and B(â_k) are the solutions of Equations 65.47 and 65.48 with a = â_k, and A₀ and B₀ are arbitrary constants such that 0 < A₀ < B₀. This family of estimates and the adaptive control solve the stochastic adaptive control problem [4].
The maximization of Equation 65.40 is equivalent to the minimization of

I(R, L) = lim sup_{t→∞} (1/t) E[ ∫₀ᵗ h(X(s)) ds − ∫₀ᵗ g(X(s)) dL(s) ]   (65.46)

where g(x) = (1 − λ)x(x + 1). It is shown in [14] that the optimal policy for this control problem is a double-bound control policy that keeps the process (X(t), t ≥ 0) in the interval [A, B], where A and B are determined by the equations

B = A⁻¹ ((a − 1)A + a) / ((2 − a)A + 1 − a)

and a is the unknown parameter of Equation 65.45. The optimal controls R* and L* are nonzero only on the boundary of the interval (local times) and are given explicitly as

R* = lim_{ε→0} R_ε,   (65.49)

L* = lim_{ε→0} L_ε,   (65.50)

where the limits exist in the sense of weak* convergence of measures and
THEOREM 65.9  The family of estimates (â(t), t > 0) given by Equation 65.54 is strongly consistent, and the adaptive control that uses the adaptive bounds given in Equations 65.55 and 65.56 is self-optimizing; that is, it achieves the optimal cost (65.53).
MODEL II. The second unknown nonlinear system that is considered is a controlled diffusion described by a stochastic differential equation. Since the stochastic differential equation is fairly general, it is not known that a suitable optimal control exists. Furthermore, since the family of controls for the ergodic control problem consists of suitably measurable functions, only weak solutions of the stochastic differential equation can be defined. For the adaptive control problem only almost optimal adaptive controls are required. A family of estimates is given, but no claim of consistency is made. Let (X(t; α, u), t ≥ 0) be a controlled diffusion process that satisfies the following stochastic differential equation
where X(t; α, u) ∈ Rⁿ, (W(t), t ≥ 0) is a standard Rⁿ-valued Wiener process, u(t) ∈ U ⊂ Rᵐ, U is a compact set, α ∈ 𝒜 ⊂ R^q, and 𝒜 is a compact set. The functions f and σ satisfy a global Lipschitz condition, σ(x)σ*(x) ≥ cI > 0 for all x ∈ Rⁿ, and h is a bounded Borel function on Rⁿ × 𝒜 × U. The family U of admissible controls is
U = { u : Rⁿ → U, u Borel measurable }.
The probability space for the controlled diffusion is denoted (Ω, F, P). The solution of the stochastic differential equation is a weak solution that can be obtained by an absolutely continuous transformation of the measure of the solution of
THEOREM 65.10  If 4.A1 and 4.A2 are satisfied, then for each ε > 0 there is a δ > 0 such that if α, β ∈ 𝒜 and |α − β| < δ, then ‖μ(·; α, u) − μ(·; β, u)‖ < ε, where ‖ · ‖
which has one and only one strong solution by the Lipschitz continuity of f and σ. For a Borel set A, let T_A be the first hitting time of A. Let Γ₁ and Γ₂ be two spheres in Rⁿ with centers at 0 and radii 0 < r₁ < r₂, respectively. Let τ be the random time defined by the equation

τ = T_{Γ₂} + T_{Γ₁} ∘ θ_{T_{Γ₂}},   (65.59)

where θ_T is the positive time shift by T_{Γ₂} that acts on C(R⁺, Rⁿ). The random variable τ is the first time that the process (X(t), t ≥ 0) hits Γ₁ after hitting Γ₂. The following assumptions are used subsequently:

(4.A1) sup_{α∈𝒜} sup_{u∈U} sup_x E_x^{α,u}[τ²] < ∞.
(4.A2) There is an L₀ such that for all α, β ∈ 𝒜,

(4.A3) For each (x, α, u) ∈ Rⁿ × 𝒜 × U, E_x^{α,u}[T_{Γ₁}] < ∞.

A family of measures (m_x(·; α, u); x ∈ Rⁿ, α ∈ 𝒜, u ∈ U) on the Borel σ-algebra of Rⁿ, B(Rⁿ), is defined by the equation
is the variation norm and μ is given by Equation 65.61.
The control problem for Equation 65.57 is to find an admissible control that minimizes the ergodic cost functional

J(u; x, α) = lim sup_{t→∞} (1/t) E_x^{α,u} [ ∫₀ᵗ k(X(s), u(X(s))) ds ]   (65.63)
where k : Rⁿ × U → R is a fixed, bounded Borel function. An adaptive control is constructed that is almost optimal with respect to this cost functional (65.63). An identification procedure is defined by a biased maximum likelihood method [11], where the estimates are changed at random times. The parameter set 𝒜 is covered by a finite, disjoint family of sets. The adaptive control is obtained by choosing from a finite family of controls, each of which is an almost optimal control for a distinguished point in one of the sets of the finite cover of 𝒜. Fix ε > 0 and choose δ(ε) > 0 so that a control that is ε/2 optimal for α ∈ 𝒜 is ε optimal for β ∈ 𝒜 where |α − β| < δ(ε). It can be verified that this is possible. By the compactness of 𝒜 there is a finite cover of 𝒜, (B(α_i, δ), i = 1, 2, ..., r), where α_i ∉ B(α_j, δ) for i ≠ j and B(α, δ) is the open ball with center α and radius δ > 0. Define (A_i(ε), i = 1, ..., r) by the equations
where i = 2 , . .., r and A1(&)= B ( a l , 6 ) r l A. Let e : A + (a1, .. .,a,} be defined by where l o is the indicator function of D E B ( R n ) and r is given by Equation 65.59 assuming 4.AA. The measure m,(.; a , u ) is well defined for each (a,u ) E d x U. If 4.A1 is satisfied, then thereis an invariant measure, p ( . ; a, U ) on B ( R n ) for the process ( X ( t ; a, u ) , t 2 0 ) that is given by the equation
and let Λ : 𝒜 → R be defined by Λ(α) = J*(α), where

J*(α) = inf_{u∈U} J(u; x, α).   (65.67)
It can be assumed that Λ is lower semicontinuous by modifying (A_i(ε), i = 1, ..., r), if necessary, on their boundaries. Here D ∈ B(Rⁿ), and q(·; α, u) is an invariant measure for the embedded Markov chain (x₀ ∈ Γ₁, X(τ_n; α, u), n ∈ N), where the τ_n, n ≥ 1, are the successive excursion times generated by τ with τ₀ = τ. Given ε > 0, choose N ∈ N such that

2 sup_{α∈𝒜} sup_{u∈U} ‖w(·; α, u)‖ ≤ (ε/4) m N   (65.68)
References
and m > 0 satisfies

m ≤ inf_{x∈Γ₁} inf_{u∈U} inf_{α∈𝒜} E_x^{α,u}[τ].
Define a sequence of stopping times (σ_n, n ∈ N) as follows:
where n ≥ 1, σ₁ = τ_N, and N is given in Equation 65.68. The unknown parameter α₀ is estimated at the random times (σ_n, n ∈ N) by a biased maximum likelihood method [11]; that is, α̂(σ_n) is a maximizer of
L_n(α) = ln M(σ_n; α, α₀, η) + z(σ_n) ln(1/Λ(α))

where M(σ_n; α, α₀, η) is the likelihood function evaluated at σ_n with the control η, and z : R → R⁺ satisfies z(t)/t → 0 and z(t)/t^β → ∞ for some β ∈ (1/2, 1) as t → ∞. The family of estimates (α̂(t), t ≥ 0) is defined as follows: choose ᾱ ∈ 𝒜, let α̂(t) = ᾱ for 0 ≤ t < σ₁,
and for n ≥ 1 let

α̂(σ_n) = arg max_α L_n(α),  α̂(t) = α̂(σ_n) for σ_n ≤ t < σ_{n+1}.   (65.72)
Using the family of estimates (α̂(t), t ≥ 0) and an approximate certainty equivalence principle, the following adaptive control is defined:
η(s; ε) = u_{e(α̂(s))}(X(s))   (65.73)
where u_{α_i}, for i = 1, ..., r, is a fixed ε/2 optimal control corresponding to the value α_i, and e(·) is given by Equation 65.65. This adaptive control achieves ε-optimality [7].
THEOREM 65.11  If (4.A1)-(4.A3) are satisfied, then for each ε > 0

lim sup_{t→∞} (1/t) ∫₀ᵗ k(X(s), η(s; ε)) ds ≤ J*(α₀) + 2ε  a.s.
where η is given by Equation 65.73, J*(α₀) is given by Equation 65.67, and α₀ is the true parameter value. The results can be extended to a partially observed discrete time Markov process. Consider the following scenario: the Markov process is completely observed in a fixed recurrent domain and partially observed in the complement of this domain. Then the adaptive control problem can be formulated and solved [6].
[1] Borkar, V. and Varaiya, P., Adaptive control of Markov chains, I: Finite parameter set, IEEE Trans. Autom. Control, AC-24, 953-958, 1979.
[2] Chen, H. F., Duncan, T. E., and Pasik-Duncan, B., Stochastic adaptive control for continuous time linear systems with quadratic cost, J. Appl. Math. Optim., to appear.
[3] Chen, H. F. and Guo, L., Identification and Stochastic Adaptive Control, Birkhäuser Verlag, 1991.
[4] Duncan, T. E., Faul, M., Pasik-Duncan, B., and Zane, O., Computational methods for the stochastic adaptive control of an investment model with transaction fees, Proc. 33rd IEEE Conf. Decision Control, Orlando, 2813-2816, 1994.
[5] Duncan, T. E. and Pasik-Duncan, B., Adaptive control of continuous-time linear stochastic systems, Math. Control, Sig., Syst., 3, 45-60, 1990.
[6] Duncan, T. E., Pasik-Duncan, B., and Stettner, L., Some aspects of the adaptive control of a partially observed discrete time Markov process, Proc. 32nd Conf. Decision Control, 3523-3526, 1993.
[7] Duncan, T. E., Pasik-Duncan, B., and Stettner, L., Almost self-optimizing strategies for the adaptive control of diffusion processes, J. Optim. Th. Appl., 81, 479-507, 1994.
[8] Fleming, W. and Rishel, R., Deterministic and Stochastic Optimal Control, Springer-Verlag, 1975.
[9] Kumar, P. R., Adaptive control with a compact parameter set, SIAM J. Control Optim., 20, 9-13, 1982.
[10] Kumar, P. R., A survey of some results in stochastic adaptive control, SIAM J. Control Optim., 23, 329-380, 1985.
[11] Kumar, P. R. and Becker, A., A new family of optimal adaptive controllers for Markov chains, IEEE Trans. Autom. Control, 27, 137-146, 1982.
[12] Kumar, P. R. and Varaiya, P., Stochastic Systems: Estimation, Identification and Adaptive Control, Prentice Hall, 1986.
[13] Mandl, P., Estimation and control in Markov chains, Adv. Appl. Prob., 6, 40-60, 1974.
[14] Taksar, M., Klass, M. J., and Assaf, D., A diffusion model for portfolio selection in the presence of brokerage fees, Math. Oper. Res., 13, 277-294, 1988.
SECTION XIV Control of Distributed Parameter Systems
Controllability of Thin Elastic Beams and Plates

66.1 Dynamic Elastic Beam Models ..................................... 1139
66.2 The Equations of Motion .......................................... 1140
    Initially Straight and Untwisted Linear Shearable 3-Dimensional Beams (Equations of Motion, Boundary Conditions, Initial Conditions) • Initially Straight and Untwisted, Nonshearable Nonlinear 3-Dimensional Beams (Equations of Motion, Boundary Conditions, Initial Conditions) • Nonlinear Planar, Shearable Straight Beams (Equations of Motion, Boundary Conditions, Initial Conditions) • Planar, Nonshearable Nonlinear Beam • The Rayleigh Beam Model • The Euler-Bernoulli Beam Model • The Bresse System • The Timoshenko System
66.3 Exact Controllability ............................................ 1142
    Hyperbolic Systems • Quasi-Hyperbolic Systems • The Euler-Bernoulli Beam
66.4 Stabilizability .................................................. 1145
    Hyperbolic Systems • Quasi-Hyperbolic Systems • The Euler-Bernoulli Beam
66.5 Dynamic Elastic Plate Models .................................... 1146
    Linear Models • A Nonlinear Model: The von Kármán System • Boundary Conditions
J. E. Lagnese
Department of Mathematics, Georgetown University, Washington, DC

G. Leugering
Fakultät für Mathematik und Physik, University of Bayreuth, Postfach, Bayreuth, Germany
66.6 Controllability of Dynamic Plates .............................. 1148
    Controllability of Kirchhoff Plates • Controllability of the Reissner-Mindlin System • Controllability of the von Kármán System
66.7 Stabilizability of Dynamic Plates .............................. 1153
    Stabilizability of Kirchhoff Plates • Stabilizability of the Reissner-Mindlin System • Stabilizability of the von Kármán System
References ........................................................... 1155
66.1 Dynamic Elastic Beam Models

Consider the deformation of a thin, initially curved beam of length ℓ and constant cross-section of area A which, in its undeformed reference configuration, occupies the region

Ω = { r = r₀(x₁) + x₂e₂(x₁) + x₃e₃(x₁) : x₁ ∈ [0, ℓ], (x₂, x₃) ∈ A },

where r₀ : [0, ℓ] → R³ is a smooth function representing the centerline, or reference line, of the beam at rest. The orthonormal triads e₁(·), e₂(·), e₃(·) are chosen as smooth functions of x₁ so that e₁ is the direction of the tangent vector to the centerline, i.e., e₁(x₁) = (dr₀/dx₁)(x₁), and e₂(x₁), e₃(x₁) span the orthogonal cross section at x₁. The meanings of the variables x_i are as follows: x₁ denotes arc length along the undeformed centerline, and x₂ and x₃ denote lengths along lines orthogonal

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
to the reference line. The set Ω can then be viewed as obtained by translating the reference curve r₀(x₁) to the position x₂e₂ + x₃e₃ within the cross-section perpendicular to the tangent of r₀. At a given time t, let R(x₁, x₂, x₃, t) denote the position vector, after deformation, of the particle which is at r(x₁, x₂, x₃) in the reference configuration. We introduce the displacement vector V := R − r. The position vector R(x₁, 0, 0, t) of the deformed reference line x₂ = x₃ = 0 is denoted by R₀. Accordingly, the displacement vector of a particle on the reference line is W := V(x₁, 0, 0) = R₀ − r₀. The position vector R may be approximated to first order by

R(x₁, x₂, x₃, t) ≈ R₀(x₁, t) + x₂E₂(x₁, t) + x₃E₃(x₁, t),
where Ei are the tangents at R with respect to xi, respectively. Note, however, that the triad Ei is not necessarily orthogonal, due to shearing.
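The triad construction above can be sketched numerically. In the snippet below the centerline is a hypothetical circular arc of radius R parametrized by arc length (an illustrative choice, not an example from the text), so e1 = dr0/dx1 is automatically a unit vector and e2, e3 span the cross section:

```python
import math

def frame(R, x1):
    """Orthonormal triad (e1, e2, e3) along the circular-arc centerline
    r0(x1) = (R sin(x1/R), R (1 - cos(x1/R)), 0), parametrized by arc length x1."""
    s = x1 / R
    e1 = (math.cos(s), math.sin(s), 0.0)    # unit tangent dr0/dx1
    e2 = (-math.sin(s), math.cos(s), 0.0)   # in-plane normal to the centerline
    e3 = (0.0, 0.0, 1.0)                    # completes the right-handed triad
    return e1, e2, e3

e1, e2, e3 = frame(2.0, 1.0)
print(abs(sum(c * c for c in e1) - 1.0) < 1e-12)        # e1 is a unit vector
print(abs(sum(a * b for a, b in zip(e1, e2))) < 1e-12)  # e1 and e2 are orthogonal
```

For a non-planar centerline one would differentiate r0 numerically instead; the arc-length parametrization is what keeps e1 unit length without normalization.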
THE CONTROL HANDBOOK

The deformation of r(·) into R(·, t) will be considered as a succession of two motions: (1) a rotation carrying the triad ei(x1) to an intermediate orthonormal triad êi(x1, t), followed by (2) a deformation carrying êi(x1, t) into the nonorthogonal triad Ei(x1, t). The two triads êi and Ei then differ on account of a strain Ē to be specified below. We choose to orient the intermediate (right-handed) triad êi so that it serves as a moving orthonormal reference frame.

A strain Ē, related to the deformation carrying the triad êi into the triad Ei, is defined by
whereas the approximate bending strains are

The remaining strains are defined by requiring the symmetry Ēij = Ēji. If distortion of the planar cross sections is neglected, Ē23 = 0. The normal to the cross section is then N = E2 × E3 = ê1 − 2Ē21 ê2 − 2Ē31 ê3. Let ϑi denote the angles associated with the orthogonal transformation carrying the orthonormal basis ei into êi, whereas the rotation of ei into Ei is represented by the angles θi (dextral mutual rotations). Up to quadratic approximations we obtain

These angles are interpreted as the global rotations. It is obvious from these relations that the shear strains vanish if, and only if, the angles θi, ϑi coincide, i = 1, 2, 3, or, what is the same, if the normal N to the cross section coincides with E1. This is what is known as the Euler-Bernoulli hypothesis. To complete the representation of the reference strains in terms of the angles above and the displacements Wi := W · ei, we compute up to quadratic approximations in all rotations and linear approximations in all strains Ēij. To this end we introduce curvatures and twist for the undeformed reference configuration by Frénet-type formulae. (The index separated by a comma indicates a partial derivative with respect to the corresponding variable xi.) The reference strains are then approximated by

The approximations given above comprise the theory of rods with infinitesimal strains and moderate rotations; see Wempner [27].

66.2 The Equations of Motion

Under the assumptions of the previous section, the total strain (or potential) energy is given by

where E, G, I22, I33, and I = I22 + I33 are Young's modulus, the shear modulus, and the moments of the cross section, respectively. The kinetic energy is given by

where ˙ = d/dt. Controls may be introduced as distributed, pointwise, or boundary forces and couples (F, M, f, m) through the total work. Controls may also be introduced in geometric boundary conditions, but we shall not do so here. Typically, a beam will be rigidly clamped at one end (say at x1 = 0) and simply supported or free at the other end, x1 = ℓ. If the space of test functions V0 := {f ∈ H¹(0, ℓ) | f(0) = 0} is introduced, the requirement that the end x1 = 0 be rigidly clamped is mathematically realized by the "geometric" constraints Wi, θi ∈ V0. If x1 = 0 is simply supported, rather than clamped, the appropriate geometric boundary conditions are Wi ∈ V0 for i = 2 and i = 3 only. Let

L = ∫0T [K(t) + W(t) − U(t)] dt

be the Lagrangian. Then, by Hamilton's principle (see, for example, [26]), the dynamics of the deformation satisfy the stationarity conditions δLW = 0, δLθ = 0, the variations being in V0. The equations of motion then follow by integration by parts, collecting terms, etc., in the usual manner. Due to space limitations, neither the most general beam model nor the entirety of all meaningful boundary conditions can be described here. Rather, a partial list of beam models which, in part, have been studied in the literature and which can easily be extracted from the formulas above will be provided. We focus on the typical situation of a beam which is clamped at x1 = 0 and free (resp., controlled) at x1 = ℓ.
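As a minimal worked instance of Hamilton's principle in this setting, keep only the longitudinal terms (a sketch in the notation above; the energies shown are the one-dimensional specializations):

```latex
% Longitudinal motion only: kinetic and strain energies
%   K(t) = \tfrac12 \int_0^\ell m_0 \dot W_1^2 \, dx_1, \qquad
%   U(t) = \tfrac12 \int_0^\ell EA\,(W_1')^2 \, dx_1 .
% Stationarity of L = \int_0^T (K - U)\,dt under variations v \in V_0 gives
\delta \mathcal{L}
 = \int_0^T\!\!\int_0^\ell \left( m_0 \dot W_1 \dot v - EA\, W_1' v' \right) dx_1\, dt = 0 .
% Integrating by parts in t and in x_1 (v(0) = 0, variations vanish at t = 0, T):
\int_0^T\!\!\int_0^\ell \left( -m_0 \ddot W_1 + (EA\, W_1')' \right) v \, dx_1\, dt
 - \int_0^T EA\, W_1'(\ell, t)\, v(\ell, t)\, dt = 0 ,
% hence the longitudinal equation m_0 \ddot W_1 = (EA\, W_1')' together with the
% natural boundary condition EA\, W_1'(\ell, t) = 0 (or = f_1 under an end load).
```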
66.2.1 Initially Straight and Untwisted Linear Shearable 3-Dimensional Beams

66.2.2 Equations of motion

m0 Ẅ1 = [EA W1′]′   (longitudinal motion),
m0 Ẅ2 = [GA(W2′ − θ3)]′   (vertical motion),
m0 Ẅ3 = [GA(W3′ + θ2)]′   (lateral motion),
ρ0 I1 θ̈1 = [GI θ1′]′   (torsional motion),
ρ0 I33 θ̈2 = [EI33 θ2′]′ − GA(W3′ + θ2)   (shear around ê2),
ρ0 I22 θ̈3 = [EI22 θ3′]′ + GA(W2′ − θ3),   (shear around ê3)   (66.1)

where ′ = d/dx1.

66.2.3 Boundary conditions

The geometric boundary conditions are

Wi(0) = 0, θi(0) = 0, i = 1, 2, 3,   (66.2)

while the dynamical boundary conditions are

EA W1′(ℓ, t) = f1,
GA(W2′ − θ3)(ℓ, t) = f2,
GA(W3′ + θ2)(ℓ, t) = f3,
GI θ1′(ℓ, t) = m1,
EI33 θ2′(ℓ, t) = m2,
EI22 θ3′(ℓ, t) = m3.   (66.3)

66.2.4 Initial conditions

66.2.5 Initially Straight and Untwisted, Nonshearable Nonlinear 3-Dimensional Beams

66.2.6 Equations of motion

These are comprised of four equations which describe the longitudinal, lateral, vertical, and torsional motions, respectively:

m0 Ẅ1 = [EA(W1′ + ½(W2′)² + ½(W3′)²)]′,
m0 Ẅ2 − [ρ0 I22 Ẅ2′]′ + [EI22 W2″]″ = [(EA(W1′ + ½(W2′)² + ½(W3′)²)) W2′]′,
m0 Ẅ3 − [ρ0 I33 Ẅ3′]′ + [EI33 W3″]″ = [(EA(W1′ + ½(W2′)² + ½(W3′)²)) W3′]′,
ρ0 I1 θ̈1 = [GI θ1′]′.   (66.4)

66.2.7 Boundary conditions

66.2.8 Initial conditions

66.2.9 Nonlinear Planar, Shearable Straight Beams

66.2.10 Equations of motion

The three equations which describe the longitudinal, vertical, and shear motions, respectively, are

m0 Ẅ1 = [Eh(W1′ + ½(W3′)²)]′,
m0 Ẅ3 = [Gh(θ2 + W3′)]′ + [Eh(W1′ + ½(W3′)²) W3′]′,
ρ0 I33 θ̈2 = [EI33 θ2′]′ − Gh(θ2 + W3′).   (66.7)

66.2.11 Boundary conditions
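Before moving on, the linear model above admits a quick closed-form check: its longitudinal subsystem, with clamped-free boundary conditions and f1 = 0, has explicit eigenpairs. The numerical data below (EA, m0, ℓ) are illustrative assumptions, not values from the text:

```python
import math

# m0 * W1_tt = (EA * W1')' with W1(0) = 0 and EA * W1'(l) = 0 (free end).
EA, m0, l = 2.0e7, 7.8, 1.5          # hypothetical stiffness, mass density, length
c = math.sqrt(EA / m0)               # longitudinal wave speed

def omega(k):
    """k-th clamped-free eigenfrequency, k = 1, 2, ..."""
    return (2 * k - 1) * math.pi * c / (2 * l)

def mode(k, x):
    """Eigenfunction; W1(0) = 0 and W1'(l) = 0 hold by construction."""
    return math.sin((2 * k - 1) * math.pi * x / (2 * l))

# the mode satisfies EA * W'' + m0 * omega^2 * W = 0 pointwise
k, x = 2, 0.73
beta = (2 * k - 1) * math.pi / (2 * l)
residual = -EA * beta**2 * mode(k, x) + m0 * omega(k)**2 * mode(k, x)
print(abs(residual) < 1e-6 * EA)     # True: exact up to round-off
```

The odd multiples of π/(2ℓ) arise precisely because one end is clamped and the other is traction free.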
66.2.12 Initial conditions

66.2.13 Planar, Nonshearable Nonlinear Beam

The equations of motion for the longitudinal and vertical motions, respectively, are

m0 ẅ1 = [Eh(w1′ + ½(w3′)²)]′,
m0 ẅ3 − [ρ0 I22 ẅ3′]′ + [EI22 w3″]″ = [Eh(w1′ + ½(w3′)²) w3′]′.   (66.10)

REMARK 66.1 The model in this section is attributed to Hirschhorn and Reiss [7]. If the longitudinal motion is neglected and the quadratic term in w3′ is averaged over the interval [0, ℓ], the model reduces to one of Woinowski-Krieger type. The corresponding partial differential equations are quasi-linear and hard to handle. If, however, one replaces θ2 by ϑ2 in the expression for the strain Ē11 (which is justified for small strains), then a semilinear partial differential equation with a cubic nonlinearity in ϑ2 is obtained.

We dispense with displaying the boundary and initial conditions for this and subsequent models, as these can be immediately deduced from the previous ones. This model has been derived in Lagnese and Leugering [16]. The models above easily reduce to the classical beam equations as follows. We first concentrate on the nonshearable beams.

66.2.14 The Rayleigh Beam Model

Here the longitudinal motion is not coupled to the remaining motions. The equation for vertical motion is

m0 ẅ3 − [ρ0 I22 ẅ3′]′ + [EI22 w3″]″ = 0.   (66.11)

66.2.15 The Euler-Bernoulli Beam Model

This is obtained by ignoring the rotational inertia of cross sections in the Rayleigh model:

m0 ẅ3 + [EI22 w3″]″ = 0.   (66.12)

With regard to shearable beams, two systems are singled out.

66.2.16 The Bresse System

This is a model for a planar, linear shearable beam with initial curvature involving couplings of the longitudinal, vertical, and shear motions:

m0 Ẅ1 = [Eh(W1′ − κ3 W3)]′ − κ3 Gh(θ2 + W3′ + κ3 W1),
m0 Ẅ3 = [Gh(θ2 + W3′ + κ3 W1)]′ + κ3 Eh(W1′ − κ3 W3),
ρ0 I33 θ̈2 = [EI33 θ2′]′ − Gh(θ2 + W3′ + κ3 W1).   (66.13)

This system was first introduced by Bresse [6].

66.2.17 The Timoshenko System

This is the Bresse model for a straight beam (κ3 = 0), so that the longitudinal motion uncouples from the other two equations, which are

m0 Ẅ3 = [Gh(θ2 + W3′)]′,
ρ0 I33 θ̈2 = [EI33 θ2′]′ − Gh(θ2 + W3′).   (66.14)

REMARK 66.2 The models above can be taken to be the basic beam models. In applications it is also necessary to account for damping and various other (local or nonlocal) effects due to internal variables, such as viscoelastic damping of Boltzmann (nonlocal in time) or Kelvin-Voigt (local in time) type, structural damping, and so-called shear-diffusion or spatial-hysteresis damping; see Russell [24] for the latter. It is also possible to impose large (and usually fast) rigid motions on the beam; we refer to [9]. A comprehensive treatment of elastic frames composed of beams of the types discussed above is given in [17]. With respect to control applications, one should also mention the modeling of beams with piezoceramic actuators; see Banks et al. [3].
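The classical models above admit well-known modal sanity checks. As a hedged illustration (the clamped-free configuration is an assumption chosen for concreteness, not a case worked in the text), the eigenvalues of the clamped-free Euler-Bernoulli beam solve cos β cosh β = −1, which can be bisected for numerically:

```python
import math

def char(beta):
    """Characteristic function of the clamped-free Euler-Bernoulli beam:
    roots of cos(beta) * cosh(beta) + 1 = 0 give the eigenvalues."""
    return math.cos(beta) * math.cosh(beta) + 1.0

def root(a, b):
    """Plain bisection; assumes char changes sign on [a, b]."""
    fa = char(a)
    for _ in range(200):
        m = 0.5 * (a + b)
        if fa * char(m) <= 0.0:
            b = m
        else:
            a, fa = m, char(m)
    return 0.5 * (a + b)

beta1 = root(1.0, 3.0)   # first root
beta2 = root(4.0, 5.0)   # second root
# the natural frequencies are then omega_k = beta_k**2 * sqrt(EI22 / (m0 * l**4))
print(round(beta1, 4), round(beta2, 4))   # -> 1.8751 4.6941
```

The rapid growth of cosh forces the higher roots toward the odd multiples of π/2, which is the usual asymptotic approximation for this beam.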
66.3 Exact Controllability

66.3.1 Hyperbolic Systems

Consider the models (Equations 66.1, 66.7, 66.13, and 66.14). We concentrate on the linear equations first. All of these models can be put into the form

M z̈ = [K(z′ + Cz)]′ − CᵀK(z′ + Cz) + f,   (66.15)

with positive definite matrices M, K depending continuously on x. In particular, for the model (Equation 66.1), z = (W, θ)ᵀ,

M = diag(m0, m0, m0, ρ0 I1, ρ0 I33, ρ0 I22),
K = diag(EA, GA, GA, GI, EI33, EI22),

C26 = −1, C35 = 1, Cij = 0 otherwise. In the case of the Bresse beam,

C12 = −κ3, C21 = κ3, C23 = 1, Cij = 0 otherwise.

The Timoshenko system is obtained by setting κ3 = 0. If C = 0, Equation 66.15 reduces to the one-dimensional wave equation. We introduce the spaces
H = L²(0, ℓ; Rq), V = {z ∈ H¹(0, ℓ; Rq) | z(0) = 0},   (66.18)

where q is the number of state variables and H^k(0, ℓ; Rq) denotes the Sobolev space consisting of Rq-valued functions defined on the interval (0, ℓ) whose distributional derivatives up to order k are in L²(0, ℓ; Rq). (The reader is referred to [1] for general information about Sobolev spaces.) We further introduce the energy forms
(z, ẑ) := ∫0ℓ M z · ẑ dx.

Indeed, the norm induced by ‖z‖ = (z, z)^{1/2} is equivalent to the usual Sobolev norm.
Let z0 ∈ V, z1 ∈ H, f ∈ L¹(0, T; H), u ∈ L²(0, T; Rq), T > 0. It can be proven that a unique function z ∈ C(0, T; V) ∩ C¹(0, T; H) exists which satisfies Equations 66.15, 66.16, and 66.17 in the following weak sense:

and

Because of space limitations, only boundary controls and constant coefficients will be considered. Distributed controls are easy to handle, while pointwise controls have much in common with boundary controls except for the liberty of their location. Thus we set f = 0 in Equation 66.15. The problem of exact controllability in its strongest sense can be formulated as follows: Given initial data Equation 66.17 and final data (zT0, zT1) in V × H and given T > 0, find a control u ∈ L²(0, T; Rq) so that the solution z of Equation 66.15 satisfies Equations 66.16, 66.17 and z(T) = zT0, ż(T) = zT1. It is, in principle, possible to solve the generalized eigenvalue problem

and write the solution of Equations 66.15, 66.16, 66.17 using Fourier's method of separation of variables. The solution together with its time derivative can then be evaluated at T, and the control problem reduces to a trigonometric (or complex exponential) moment problem. The controllability requirement is then equivalent to the basis properties of the underlying set of complex exponentials {exp(iλk t) | k ∈ Z, t ∈ (0, T)}. If that set constitutes a Riesz basis in its L²(0, T)-closure, then exact controllability is achieved. For conciseness, we do not pursue this approach here and, instead, refer to Krabs [12] for further reading.

The approach we want to consider here, while equivalent to the former one, does not resort to knowledge of eigenvalues and eigenelements. The controllability problem, as in finite dimensions, is a question of characterizing the image of a linear map, the control-to-state map. Unlike in finite dimensions, however, it is not sufficient here to establish a uniqueness result for the homogeneous adjoint problem, that is, to establish injectivity of the adjoint of the control-to-state map. In infinite-dimensional systems this implies only that the control-to-state map has dense range, which in turn is referred to as approximate controllability. Rather, we need some additional information on the adjoint system, namely, uniformity in the sense that the adjoint map is uniformly injective with respect to all finite energy initial (final) conditions. In particular, given a bounded linear map L between Hilbert spaces X, Y, the range of L is all of Y if, and only if, L*, the adjoint of L, satisfies ‖φ‖ ≤ ν‖L*φ‖ for some positive ν and all φ ∈ Y. This inequality also implies that the right inverse L*(LL*)⁻¹ exists as a bounded operator (see the finite-dimensional analog, where "bounded" is generic). It is clear that this implies norm-minimality of the controls constructed this way. This result extends to more general space setups. It turns out that an inequality like this is needed to ensure that the set of complex exponentials is a Riesz basis in its L²-closure.

As will be seen shortly, such an inequality is achieved by nonstandard energy estimates, which constitute the basis for the so-called HUM method introduced by Lions. It is thus clear that this inequality is the crucial point in the study of exact controllability. In order to obtain such estimates, we consider smooth enough solutions φ of the homogeneous adjoint final value problem
M φ̈ = [K(φ′ + Cφ)]′ − CᵀK(φ′ + Cφ),   (66.20)
φ(0, t) = 0, K(φ′ + Cφ)(ℓ, t) = 0,   (66.21)
φ(·, T) = φ0, φ̇(·, T) = φ1.   (66.22)

Let m be a smooth, positive, increasing function of x. Multiply Equation 66.20 by mφ′ and integrate by parts over (x, t). After some calculus, we obtain the following crucial identity, valid for any sufficiently smooth solution of Equation 66.20:
where p = ∫0ℓ m M φ̇ · (φ′ + Cφ) dx and e(x, t) denotes the energy density given by
and
Set the total energy at time t,
and denote the norm of the matrix C by ν. If we choose m in Equation 66.23 so that m′(x) − ν m(x) ≥ c0 > 0 for all x ∈ (0, ℓ), we obtain the estimates

γ(T) E(T) ≤ ∫0T |φ̇(ℓ, t)|² dt ≤ Γ(T) E(T)   (66.25)

for some positive constants γ(T), Γ(T), where T is sufficiently large (indeed, T > 2 × the "optical length" of the beam is sufficient). The second inequality in Equation 66.25 requires the multiplier above and some estimation, whereas the first requires the multiplier m(x) = −1 + 2x/ℓ and is straightforwardly proved. These inequalities can also be obtained by the method of characteristics which, in addition, yields the smallest possible control time [17]. It is then shown that the norm of the adjoint of the control-to-state map (which takes the control u into the final values z(T), ż(T), for zero initial conditions), applied to φ0, φ1, is exactly equal to ∫0T |φ̇(ℓ, t)|² dt. By the above argument, the original map is onto from the control space L²(0, T; Rq) to the finite energy space V × H.

THEOREM 66.1 Let (z0, z1), (zT0, zT1) be in V × H and T > 0 sufficiently large. Then a unique control u ∈ L²(0, T; Rq) exists, with minimal norm, so that z satisfies Equations 66.15, 66.16, 66.17 and z(T) = zT0, ż(T) = zT1.

REMARK 66.3 Controllability results for the fully nonlinear planar shearable beam Equation 66.7, and also for the Woinowski-Krieger-type approximation, are not known at present. The semilinear model is locally exactly controllable, using the implicit function theorem. The argument is quite similar to the one commonly used in finite-dimensional control theory.

66.3.2 Quasi-Hyperbolic Systems

In this section we discuss the (linearized) nonshearable models with rotational inertia, namely, Equations 66.4, 66.10, 66.11. We first discuss the linear subsystems. Observe that in that situation all equations decouple into wave equations governing the longitudinal and torsional motions and equations of the type of Equation 66.11. Hence it is sufficient to consider the latter. For simplicity, we restrict ourselves to constant coefficients. The system is then

m0 ẅ − ρ0 I ẅ″ + EI w⁗ = f.   (66.26)

Setting γ² := I/A and rescaling time, this system can be brought into a nondimensional form. We define spaces

and forms

(u, v) = ∫0ℓ u v dx,
(u, v)H = (u, v) + γ²(u′, v′),
(u, v)V = γ²(u″, v″).

Let W0 ∈ V, W1 ∈ H, and u ∈ L²(0, T; R²), T > 0. It may be proved that there is a unique W ∈ C(0, T; V) ∩ C¹(0, T; H) satisfying Equation 66.26 in an appropriate variational sense.

REMARK 66.4 The nonlinear models can be treated using the theory of nonlinear maximal monotone operators; see Lagnese and Leugering [16].

To produce an energy identity analogous to Equation 66.23, we multiply the first equation of Equation 66.26 by xW′ − αW, where α ≥ 0 is a free parameter, and then we integrate over (0, ℓ) × (0, T). If we introduce the auxiliary functions p1 = ∫0ℓ Ẇ(xW′ − αW) dx and p2 = γ² ∫0ℓ Ẇ′(xW′ − αW)′ dx, and p = p1 + p2, we find after some calculus
With the total energy now defined by
this identity can now be used to derive the energy estimate,
for some positive constants γ(T) and Γ(T), which is valid for sufficiently smooth solutions φ of the homogeneous system, Equation 66.26, and for sufficiently large T > 0 (again T is related to the "optical length," i.e., to the wave velocities). The first estimate is more standard and determines the regularity of the solutions. It is again a matter of calculating the control-to-state map and its adjoint. After some routine calculation, one verifies that the norm of the adjoint, applied to the final data for the backwards running homogeneous equation, coincides with the time integral in the energy estimate. This leads to the exact controllability of the system, Equation 66.26, in the space V × H, with V and H as defined in Equation 66.27, using controls u = (u1, u2) ∈ L²(0, T; R²).

REMARK 66.5 As in [18], for γ → 0, the controllability results for Rayleigh beams carry over to the corresponding results for Euler-Bernoulli beams. It is, however, instructive and, in fact, much easier to establish controllability of the Euler-Bernoulli beam directly with control only in the shear force.
for W as above and φ solving the backwards running adjoint equation (i.e., Equation 66.29 with final conditions φT0 and φT1). Integrating with respect to time over (0, T) yields
The same argument as above yields the conclusion of exact controllability of the system (Equation 66.29), in the space V × H, where V and H are defined in Equation 66.30, using controls u ∈ L²(0, T; R).
REMARK 66.6 It may be shown that the control time T for the Euler-Bernoulli system can actually be taken arbitrarily small. That is typical for this kind of model (Petrovskii-type systems) and is closely related to the absence of a uniform wave speed. The reader is referred to the survey article [15] for general background information on controllability and stabilizability of beams and plates.
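Before turning to the Euler-Bernoulli beam, the map-theoretic criterion used throughout this section (surjectivity of L from ‖φ‖ ≤ ν‖L*φ‖, with minimum-norm control u = L*(LL*)⁻¹y) can be illustrated in finite dimensions. The 2 × 3 matrix below is an arbitrary stand-in for the control-to-state map, not an object from the text; the linear algebra is pure Python:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve2(M, b):
    """Cramer's rule for a 2 x 2 system M x = b."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - b[1] * M[0][1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

L = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 1.0]]            # rank 2, hence onto R^2
y = [1.0, 2.0]                   # target state

G = matmul(L, transpose(L))      # the 2 x 2 Gramian L L^T
lam = solve2(G, y)               # (L L^T)^{-1} y
u = [sum(L[i][j] * lam[i] for i in range(2)) for j in range(3)]   # u = L^T lam

Lu = [sum(L[i][j] * u[j] for j in range(3)) for i in range(2)]
print([round(v, 10) for v in Lu])   # -> [1.0, 2.0]: the control reaches y
```

Any other preimage of y differs from u by an element of the kernel of L and is therefore longer, mirroring the norm-minimality assertion above.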
66.3.3 The Euler-Bernoulli Beam

We consider the nondimensional form of Equation 66.12, namely,
We introduce the spaces
and the corresponding energy functional
Again, we are going to use multipliers to establish energy identities. The usual choice is m(x) = x. Upon introducing p = ∫0ℓ x Ẇ W′ dx and multiplying the first equation by mW′, followed by integration by parts, we obtain

p(T) − p(0) + ½ ∫0T∫0ℓ Ẇ² dx dt + (3/2) ∫0T∫0ℓ (W″)² dx dt = (ℓ/2) ∫0T Ẇ²(ℓ, t) dt + ℓ ∫0T W′(ℓ, t) u(t) dt.   (66.32)

Using this identity for the homogeneous system solved by φ, we obtain the energy estimates
66.4 Stabilizability

We proceed to establish uniform exponential decay for the solutions of the various beam models by linear feedback controls at the boundary x = ℓ. There is much current work on nonlinear and constrained feedback laws; however, the results are usually very technical and, therefore, do not seem suitable for reproduction in these notes. In the linear case it is known that, for time-reversible systems, exact controllability is equivalent to uniform exponential stabilizability. In contrast to the finite-dimensional case, however, we have to distinguish between various concepts of controllability, such as exact, spectral, or approximate controllability. Accordingly, we have to distinguish between different concepts of stabilizability, as uniform exponential decay is substantially different from nonuniform decay. Because of space limitations, we do not dwell on the relation between controllability, stabilizability, and even observability. The procedure we follow is based on Liapunov functions and is the same in all of the models. Once again the energy identities, Equations 66.23 and 66.32 and the identity of Section 66.3.2, are crucial. We take the hyperbolic case as an exemplar and outline the procedure in that case.
66.4.1 Hyperbolic Systems

Apply Equation 66.23 to the solution of Equation 66.15 (with f = 0) and Equation 66.16, with the control u in Equation 66.16 replaced by the linear feedback law u(t) = −k ż(ℓ, t), k > 0. Recall that p = ∫0ℓ m M ż · (z′ + Cz) dx. Then, using Equation 66.23,
where again γ(T) and Γ(T) depend on T > 0, with T sufficiently large. One way to obtain the adjoint control-to-state map is to consider the identity
where E is given by Equation 66.24. Therefore, introducing the function Fε(t) := E(t) + ε p(t), one finds
However, Ė(t) = −k |ż(ℓ, t)|², and therefore the boundary term can be compensated for by choosing ε sufficiently small. This results in the estimate Ḟε(t) ≤ −c1 ε E(t) for some c1 > 0, which in turn implies
p(t). By following the same procedure as above, we obtain the decay estimate Equation 66.34 for the closed-loop system, Equations 66.26 and 66.35, where the energy functional E is given by Equation 66.28.
It is also straightforward to see that Fε(t) satisfies
REMARK 66.8 One can show algebraic decay for certain monotone nonlinear feedbacks. In addition, the nonlinear system, Equation 66.10, exhibits those decay rates as well; see Lagnese and Leugering [16].
The latter implies ∫t∞ E(s) ds ≤ (1/λ)E(t), with λ = c1 ε, for t ≥ 0. One defines η(t) := ∫t∞ E(s) ds and obtains the differential inequality η̇ + λη ≤ 0. A standard Gronwall argument implies η(t) ≤ exp(−λt) η(0), that is,
Now, because E(t) is nonincreasing,
66.4.3 The Euler-Bernoulli Beam

Here we consider the system Equation 66.29 and close the loop by setting u(t) = −k Ẇ(ℓ, t), k > 0. By utilizing the estimate Equation 66.32 and proceeding in much the same way as above, the decay estimate Equation 66.34 can be established for the closed-loop system, where E is given by Equation 66.31.
66.5 Dynamic Elastic Plate Models

Let Ω be a bounded, open, connected set in R² with a Lipschitz continuous boundary consisting of a finite number of smooth curves. Consider a deformable three-dimensional body which, in equilibrium, occupies the region
and this, together with the choice t = 1/λ, gives
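Written out, the chain of estimates behind the exponential decay reads as follows (a sketch; λ and the comparability of Fε and E are as above):

```latex
% Sketch of the decay argument. Suppose \int_t^\infty \mathcal{E}(s)\,ds
% \le (1/\lambda)\,\mathcal{E}(t) and set \eta(t) := \int_t^\infty \mathcal{E}(s)\,ds.
\dot\eta(t) = -\mathcal{E}(t) \le -\lambda\,\eta(t)
\quad\Longrightarrow\quad
\eta(t) \le e^{-\lambda t}\,\eta(0).
% Since \mathcal{E} is nonincreasing, for t \ge 1/\lambda,
\mathcal{E}(t)
\le \lambda \int_{t-1/\lambda}^{t} \mathcal{E}(s)\,ds
\le \lambda\,\eta\!\left(t - \tfrac{1}{\lambda}\right)
\le \lambda e\, e^{-\lambda t}\,\eta(0)
\le e\, e^{-\lambda t}\,\mathcal{E}(0),
% which is the uniform exponential decay
% \mathcal{E}(t) \le M e^{-\omega t}\,\mathcal{E}(0) with M = e, \omega = \lambda.
```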
Hence, we have the following result.

THEOREM 66.2 Let V and H be given by Equation 66.18. Given initial data z0, z1 in V × H, the solution to the closed-loop system, Equations 66.15, 66.16, and 66.17, with u(t) = −k ż(ℓ, t), satisfies
for some positive constants M and ω.

REMARK 66.7 The linear feedback law can be replaced by a monotone nonlinear feedback law with certain growth conditions. The corresponding energy estimates, however, are beyond the scope of these notes. Ultimately, the differential inequality above is to be replaced by a nonlinear one, and the exponential decay has then (in general) to be replaced by an algebraic decay; see [17]. The decay rate can be optimized using "hyperbolic estimates" as in [17]. Also, the dependence on the feedback parameter can be made explicit.
66.4.2 Quasi-Hyperbolic Systems

We consider the problem of Equation 66.26 with feedback controls

u1(t) = −k1 Ẇ′(ℓ, t), u2(t) = k2 Ẇ(ℓ, t),   (66.35)

with positive feedback gains k1 and k2. The identity of Section 66.3.2 is used to calculate the derivative of the function
When the quantity h is very small compared to the diameter of Ω, the body is referred to as a thin plate of uniform thickness h, and the planar region
is its reference surface. Two-dimensional mathematical models describing the deformation of the three-dimensional body Equation 66.36 are obtained by relating the displacement vector associated with the deformation of each point within the body to certain state variables defined on the reference surface. Many such models are available; three are briefly described below.
66.5.1 Linear Models

Let W(x1, x2, x3, t) denote the displacement vector at time t of the material point located at (x1, x2, x3), and let w(x1, x2, t) denote the displacement vector of the material point located at (x1, x2, 0) in the reference surface. Further, let n(x1, x2, t) be the unit normal vector to the deformed reference surface at the point (x1, x2, 0) + w(x1, x2, t). The direction of n is chosen so that n · k > 0, where i, j, k is the natural basis for R³.
Kirchhoff Model

The basic kinematic assumption of this model is
which means that a filament in its equilibrium position, orthogonal to the reference surface, remains straight, unstretched, and orthogonal to the deformed reference surface. It is further assumed that the material is linearly elastic (Hookean), homogeneous, and isotropic, that the transverse normal stress is small compared to the remaining stresses, and that the strains and the normal vector n are well approximated by their linear approximations. Write w = w1 i + w2 j + w3 k. Under the assumptions above, there is no coupling between the in-plane displacements w1, w2 and the transverse displacement w3 := w. The former components satisfy the partial differential equations of linear plane elasticity, and the latter satisfies the equation

ρh ∂²w/∂t² − Ip Δ(∂²w/∂t²) + D Δ²w = F,   (66.38)

where Δ = ∂²/∂x1² + ∂²/∂x2² is the harmonic operator in R², ρ is the mass density per unit of reference volume, Ip = ρh³/12 is the polar moment of inertia, D is the modulus of flexural rigidity, and F is the transverse component of an applied force distributed over Ω. The "standard" Kirchhoff plate equation is obtained by omitting the term −Ip Δ(∂²w/∂t²), which accounts for the rotational inertia of cross sections, from Equation 66.38.
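A minimal sanity check on the standard Kirchhoff equation: for a simply supported rectangular plate (a configuration chosen here for convenience, not one treated in the text), the double-sine modes diagonalize Δ², so the natural frequencies follow in closed form. All numerical data below are hypothetical:

```python
import math

# Standard Kirchhoff plate (rotational inertia omitted):
# for w = sin(m*pi*x/a) * sin(n*pi*y/b), Delta^2 w = Lam^2 * w with
# Lam = pi^2 * ((m/a)**2 + (n/b)**2), so D * Lam^2 = rho * h * omega_mn^2.

E, nu, h, rho = 7.0e10, 0.3, 0.01, 2700.0   # hypothetical aluminum-like data
D = E * h**3 / (12.0 * (1.0 - nu**2))       # modulus of flexural rigidity
a, b = 1.0, 0.8                             # plate dimensions

def omega(m, n):
    lam = math.pi**2 * ((m / a)**2 + (n / b)**2)
    return lam * math.sqrt(D / (rho * h))

# residual of D*Delta^2 w - rho*h*omega^2*w for the (2, 1) mode
m, n = 2, 1
lam = math.pi**2 * ((m / a)**2 + (n / b)**2)
residual = D * lam**2 - rho * h * omega(m, n)**2
print(abs(residual) < 1e-6 * D * lam**2)    # True: exact up to round-off
```

The ordering omega(1,1) < omega(2,1) < ... reflects the stiffening of the plate with increasing mode number.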
Reissner-Mindlin System

The basic kinematic assumption of this model is
where |U + k| = 1. Equation 66.39 means that a filament in its equilibrium position, orthogonal to the reference surface, remains straight and unstretched but not necessarily orthogonal to the deformed reference surface. Write U = U1 i + U2 j + U3 k.
In the linear approximation, U3 = 0, so that the state variables of the problem are w and u. The latter variable accounts for transverse shearing of cross sections. It is further assumed that the material is homogeneous and Hookean. The stress-strain relations assume that the material is isotropic in directions parallel to the reference surface but may have different material properties in the transverse direction. As in the previous case, there is no coupling between w1, w2 and the remaining state variables in the linear approximations. The equations of motion satisfied by w3 := w and u, referred to as the Reissner or Reissner-Mindlin system, may be written

ρh ∂²w/∂t² − Gh div(u + ∇w) = F,
Ip ∂²u/∂t² − (h³/12) div σ(u) + Gh(u + ∇w) = C,   (66.40)
where Gh is the shear modulus and σ(u) = (σij(u)) is the stress tensor associated with u, i.e.,

σij(u) = λ (div u) δij + 2μ εij(u),

where εij(u) = ½(∂ui/∂xj + ∂uj/∂xi) denotes the linearized strain, λ and μ are the Lamé parameters of the material, div(u + ∇w) := ∇ · (u + ∇w), and C = C1 i + C2 j is a distributed force couple.
66.5.2 A Nonlinear Model: The von Kármán System

Unlike the two previous models, this is a "large deflection" model. It is obtained under the same assumptions as the Kirchhoff model except for the linearization of the strain tensor. Rather, in the general strain tensor,
the quadratic terms involving W3 are retained, an assumption formally justified if the planar strains are small relative to the transverse strains. The result is a nonlinear plate model in which the in-plane components of displacement w1 and w2 are coupled to the transverse displacement w3 := w. Under some further simplifying assumptions, w1 and w2 may be replaced by a single function G, called an Airy stress function, related to the in-plane stresses. The resulting nonlinear equations for w and G are

ρh ∂²w/∂t² − Ip Δ(∂²w/∂t²) + D Δ²w − [w, G] = F,
Δ²G + (Eh/2)[w, w] = 0,   (66.41)

where E is Young's modulus and where

[u, v] := (∂²u/∂x1²)(∂²v/∂x2²) + (∂²u/∂x2²)(∂²v/∂x1²) − 2(∂²u/∂x1∂x2)(∂²v/∂x1∂x2).
One may observe that [w, w]/2 is the Gaussian curvature of the deformed reference surface x3 = w(x1, x2). The "standard" dynamic von Kármán plate system is obtained by setting Ip = 0 in Equation 66.41.

REMARK 66.9 For a derivation of various plate models, including thermoelastic plates and viscoelastic plates, see [18]. For studies of junction conditions between two or more interconnected (not necessarily coplanar) elastic plates, the reader is referred to the monographs [21] and [17] and references therein.
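The observation about Gaussian curvature can be verified numerically. The sketch below (a hypothetical shallow quadratic surface; derivatives by central differences) checks that [w, w]/2 = w,11 w,22 − (w,12)² matches the curvature at the origin in the small-slope regime:

```python
# For w = (ax*x^2 + by*y^2)/2 the exact second derivatives are
# w_xx = ax, w_yy = by, w_xy = 0, and the small-slope Gaussian
# curvature at the origin is ax * by.

def w(x, y, ax=0.02, by=0.05):
    return 0.5 * (ax * x * x + by * y * y)

h = 1e-3
x0, y0 = 0.0, 0.0
wxx = (w(x0 + h, y0) - 2 * w(x0, y0) + w(x0 - h, y0)) / h**2
wyy = (w(x0, y0 + h) - 2 * w(x0, y0) + w(x0, y0 - h)) / h**2
wxy = (w(x0 + h, y0 + h) - w(x0 + h, y0 - h)
       - w(x0 - h, y0 + h) + w(x0 - h, y0 - h)) / (4 * h**2)

bracket_half = wxx * wyy - wxy**2   # [w, w]/2 from the definition of [., .]
gauss = 0.02 * 0.05                 # exact curvature at the origin
print(abs(bracket_half - gauss) < 1e-8)   # True
```

For larger slopes the full curvature carries the factor (1 + |∇w|²)⁻², which is exactly what the von Kármán approximation drops.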
66.5.3 Boundary Conditions

Let Γ denote the boundary of Ω. The boundary conditions are of two types: geometric conditions, which constrain the geometry of the deformation at the boundary, and mechanical (or dynamic) conditions, which represent the balance of linear and angular momenta at the boundary.
Boundary Conditions for the Reissner-Mindlin System
66.6 Controllability of Dynamic Plates
Geometric conditions are given by
In the models discussed in the last section, some or all of the applied forces and moments F, C, f, c, and the geometric data ŵ, û, may be considered as controls which must be chosen in order to affect the transient behavior of the solution in some specified manner. These controls may be either open loop or closed loop. Open-loop controls are usually associated with problems of controllability, that is, of steering the solution to, or nearly to, a specified state at a specified time. Closed-loop controls are usually associated with problems of stabilizability, that is, of asymptotically driving the solution towards an equilibrium state of the system. In fact, for infinite-dimensional systems, of which the above plate models are representative, there are various related but distinct concepts of controllability (spectral, approximate, exact) and of stabilizability (weak, strong, uniform), distinctions which disappear in finite-dimensional approximations of these models (see [2, Chapter 4]). Stabilizability problems will be discussed in the next section. With regard to controllability, exact controllability is the most stringent requirement because it requires a complete description of the configuration space (reachable set) of the solution. This is equivalent to steering any initial state of the system to any other permissible state within a specified interval of time. The notion of spectral controllability involves exactly controlling the span of any set of finitely many of the eigenmodes of the system. Approximate controllability involves steering an arbitrary initial state to a given, but arbitrary, neighborhood of a desired configuration within a specified time. Among the possible controls, distinctions are made between distributed controls, such as F and C, which are distributed over all or a portion of the face of the plate, and boundary controls, such as f, c, ŵ, û, which are distributed over all or a portion of the edge of the plate.
Within the class of boundary controls, a further distinction is made between mechanical controls, f and c, and geometric controls, ŵ and û. Because mechanical controls correspond to forces and moments, they are, in principle, physically implementable; these are the only types of controls which will be considered here. In addition, only boundary control problems will be considered in detail; however, some remarks regarding distributed control problems will be provided.
The case ŵ = û = 0 corresponds to a rigidly clamped boundary. The mechanical boundary conditions are given by

Gh ν · (u + ∇w) = f,   (h³/12) σ(u)ν = c,   (66.43)
where ν is the unit exterior pointing normal vector to the boundary of Ω, c = c1 i + c2 j is a boundary force couple, and f is the transverse component of an applied force distributed over the boundary. The problem consisting of the system of Equations 66.40 and boundary conditions Equations 66.42 or 66.43, together with the initial conditions

has a unique solution if the data of the problem are sufficiently regular. The same is true if the boundary conditions are Equation 66.42 on one part of Γ and Equation 66.43 on the remaining part, or if they consist of the first (resp., second) of the two expressions in Equation 66.42 and the second (resp., first) of the two expressions in Equation 66.43.
Boundary Conditions for the Kirchhoff Model
The geometric boundary conditions are
and the mechanical boundary conditions may be written
Boundary Conditions for the von Kármán System
The state variables w and G are not coupled in the boundary conditions. The geometric and mechanical boundary conditions for w are those of the Kirchhoff model. The boundary conditions satisfied by G are those which arise if there are no in-plane applied forces along the boundary.

66.6.1 Controllability of Kirchhoff Plates
Assume that Γ = Γ₀ ∪ Γ₁, where Γ₀ and Γ₁ are disjoint, relatively open subsets of Γ with Γ₁ ≠ ∅. The problem under consideration consists of the partial differential Equation 66.38, boundary conditions Equation 66.45 on Γ₀, boundary conditions Equation 66.46 on Γ₁, and initial conditions
In this system, the distributed force F, the geometric quantities ŵ, û, and the initial data (w⁰, w¹) are assumed as given data, while f, c are the controls, chosen from a certain class C of admissible controls. The configuration space, or the reachable set, at
time T is
where, for example, w(T) stands for the function {w(x, T) : x ∈ Ω} and ẇ = ∂w/∂t. If z denotes the solution of the uncontrolled problem, i.e., the solution with f = 0 and c = 0, then R_T = [z(T), ż(T)] + R_T⁰, where R_T⁰ denotes the configuration space when all of the given data are zero. Therefore, to study the reachable set it may be assumed without loss of generality that the data F, ŵ, û, w⁰, w¹ vanish. The problem under consideration is, therefore,
of C lead to a sufficiently rich configuration space. For the problem under consideration, it is very difficult to determine the precise relation between the control and configuration spaces. For example, there is no simple characterization of those inputs for which the corresponding solution has finite energy at each instant. On the other hand, when standard control spaces with simple structure, such as L² spaces, are utilized, the regularity properties of the solution are, in general, difficult to determine. (This is in contrast to the situation which occurs in the analogous boundary control problem for Rayleigh beams, where it is known that finite energy solutions correspond exactly to inputs which are L² in time.) In order to make the ideas precise, it is necessary to introduce certain function spaces based on the energy functionals K and U. Let L²(Ω) denote the space of square integrable functions defined on Ω, and let Hᵏ(Ω) be the Sobolev space consisting of functions in L²(Ω) whose derivatives up to order k (in the sense of distributions) belong to L²(Ω). Let
The quantity
defines a Hilbert norm on H which is equivalent to the standard induced H¹(Ω) norm. Similarly, define
If w is a solution of Equation 66.49, its kinetic energy at time t
where the quantities in the integrand are evaluated at time t . The strain energy of this solution at time t is given by
A pair of functions (w⁰, w¹) defined on Ω is called a finite energy pair if w⁰ ∈ V and

    ∫_Ω ( ρh (w¹)² + I_p |∇w¹|² ) dΩ < ∞.
A solution w of Equation 66.49 is called a finite energy solution if [w(t), ẇ(t)] is a finite energy pair for each t ≥ 0 and is continuous with respect to t into the space of finite energy pairs. This means that the solution has finite kinetic and strain energies at each instant which vary continuously in time. Many choices of the control space C are possible, each of which will lead to a different configuration space R_T⁰. One requirement on the choice of C is that the solution w corresponding to a given input f, c be reasonably well behaved. Another is that the choice
The quantity

defines a seminorm on V. In fact, as a consequence of Korn's lemma, ‖·‖_V is actually a norm equivalent to the standard induced H¹(Ω) norm whenever Γ₀ ≠ ∅. Such will be assumed in what follows to simplify the discussion. The Hilbert space V is dense in H and the injection V ↪ H is compact. Let H be identified with its dual space and let V* denote the dual space of V. Then H ⊂ V* with compact injection. A finite energy solution of Equations 66.49–66.51 is characterized by the statements w(t) ∈ V, ẇ(t) ∈ H for each t, and the mapping t ↦ (w(t), ẇ(t)) is continuous into the space V × H. The space V × H is sometimes referred to as finite energy space. Write c = c₁i + c₂j. In order to assure that the configuration space is sufficiently rich, the control space is chosen as

    C = {(f, c) : f ∈ L²[Γ₁ × (0, T)], cᵢ ∈ L²[Γ₁ × (0, T)]}.    (66.53)

The penalty for this simple choice is that the corresponding solution, which may be defined in a certain weak sense and is unique, is not necessarily a finite energy solution. In fact, it can be shown by variational methods that, if w is the solution of Equations 66.49–66.52 corresponding to an input (f, c) ∈ C, then w(t) ∈ H, ẇ(t) ∈ V*, and the mapping t ↦ [w(t), ẇ(t)] is continuous into H × V*. (A more refined analysis of the regularity of the solution may be found in [20].)
THE CONTROL HANDBOOK
Approximate Controllability
The system (66.49)–(66.52) is called approximately controllable at time T if R_T⁰ is dense in H × V*. To study this problem, introduce the control-to-state map C_T defined by C_T : C → H × V*, C_T(f, c) = [w(T), ẇ(T)]. Then the system, Equations 66.49–66.52, is approximately controllable at time T exactly when range(C_T) is dense in H × V*. The linear operator C_T is bounded; so, therefore, is its dual operator C_T* : H × V → C. Thus, proving the approximate controllability of Equations 66.49–66.52 is equivalent to showing that
The quantity C_T*(φ¹, φ⁰) may be explicitly calculated (see [14]). It is given by the trace
where φ is the solution of the final value problem

    ρh ∂²φ/∂t² − I_p Δ(∂²φ/∂t²) + D Δ²φ = 0,    (66.54)
REMARK 66.11 (Distributed control.) Consider the problem of approximate controllability using a distributed control rather than boundary controls. Let ω be a nonempty, open subset of Ω and let the control space be

    C = {F : F ∈ L²(Ω × (0, T)), F = 0 in Ω\ω}.

Consider the system consisting of Equation 66.38 and (for example) the homogeneous boundary conditions,
Assume that the initial data are zero and let

    R_T⁰ = {(w(T), ẇ(T)) : F ∈ C}.

For any input F taken from C the corresponding solution may be shown to be a finite energy solution. Let H and V be defined as above with Γ₀ = Γ. The control-to-state map C_T : F ↦ (w(T), ẇ(T)) maps C boundedly into V × H and its dual is given by

    C_T*(φ¹, φ⁰) = φ|_{ω×(0,T)},

where φ is the solution of Equation 66.54 with final data, Equation 66.57, and homogeneous boundary conditions. If φ|_{ω×(0,T)} = 0 and T is sufficiently large, an application of Holmgren's theorem gives φ ≡ 0 in Ω × (0, T) and, therefore, the system is approximately controllable in time T. When Ω is convex, the optimal control time is T₀ = 2√(I_p/D) d(Ω, ω).
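The equivalence used above, density of range(C_T) if and only if ker C_T* is trivial, is the infinite-dimensional version of a familiar finite-dimensional fact: for ẋ = Ax + Bu the steering map is onto precisely when the controllability Gramian (equivalently the Kalman matrix) has trivial null space. The following is a minimal numerical sketch with a hypothetical two-dimensional oscillator standing in for the plate dynamics; the matrices and horizon are illustrative, not from the chapter.

```python
import numpy as np

# Finite-dimensional analog of "range(C_T) dense  <=>  ker(C_T*) = {0}":
# a hypothetical oscillator x' = Ax + Bu (not the plate system itself).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])

# Kalman test: rank [B, AB] = 2 means every state is reachable.
K = np.hstack([B, A @ B])
rank = np.linalg.matrix_rank(K)

# Controllability Gramian W = int_0^T e^{At} B B^T e^{A^T t} dt by quadrature;
# ker(W) = {0} (all eigenvalues positive) is the dual-map condition.
T, nsteps = 2.0, 4000
dt = T / nsteps
E = np.eye(2)                                        # running e^{At}
dE = np.eye(2) + dt * A + 0.5 * (dt * A) @ (dt * A)  # one-step propagator
W = np.zeros((2, 2))
for _ in range(nsteps):
    W += dt * (E @ B) @ (E @ B).T
    E = E @ dE
print(rank, np.linalg.eigvalsh(W).min())
```

Both tests agree here; removing the actuator (B = 0) would make ker W nontrivial, the finite-dimensional shadow of a nonzero solution φ vanishing on ω × (0, T).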
Therefore, the system, Equations 66.49–66.52, is approximately controllable if the only solution of Equations 66.54–66.57 which also satisfies

is the trivial solution. However, the boundary conditions, Equations 66.56 and 66.58, together imply that φ satisfies Cauchy data on Γ₁ × (0, T); that is, φ and its derivatives up to order three vanish on Γ₁ × (0, T). If T is large enough, a general uniqueness theorem (Holmgren's theorem) then implies that φ ≡ 0 in Ω × (0, T). This implies approximate controllability in time T.

THEOREM 66.3 There is a T₀ > 0 so that the system Equations 66.49–66.52 is approximately controllable in time T > T₀.

REMARK 66.10 The optimal time T₀ depends on the material parameters and the geometry of Ω and Γ₁. If Ω is convex, then T₀ = 2√(I_p/D) d(Ω, Γ₁).

Exact Controllability
Again consider the system Equations 66.49–66.52 with the control space given by Equation 66.53. If D is a subspace of H × V*, the system is exactly controllable to D at time T if D ⊂ R_T⁰. The exact controllability problem, in the strictest sense, consists of explicitly identifying R_T⁰ or, in a less restricted sense, of explicitly identifying dense subspaces D of H × V* contained in R_T⁰. To obtain useful explicit information about R_T⁰ it is necessary to restrict the geometry of Γ₀ and Γ₁. It is assumed that there is a point x₀ so that
Condition (Equation 66.60) is a "nontrapping" assumption of the sort found in early work on scattering of waves from a reflecting obstacle. Without some such restriction on Γ₀, the results described below would not be valid. It is further assumed that
This is a technical assumption needed to assure that solutions of the uncontrolled problem have adequate regularity up to the boundary (cf. Remark 66.13 below).

THEOREM 66.4 Under the assumptions (Equations 66.60 and 66.61), there is a T₀ > 0 so that
Then [w(T), ẇ(T)] depends on (φ⁰, φ¹). A linear mapping Λ is defined by setting
if T > T₀. The inclusion stated in Equation 66.62 may be stated in equivalent form in terms of the control-to-state mapping C_T. In fact, let C₀ = {(f, c) ∈ C : C_T(f, c) ∈ V × H}, and consider the restriction of C_T to C₀. This is a closed, densely defined linear operator from C₀ into V × H. The inclusion (Equation 66.62) is the same as the assertion C_T(C₀) = V × H, which is the same as proving that the dual of this operator has a bounded inverse from V* × H to C₀, that is,
where φ is the solution of Equations 66.54–66.57. For large T, the "observability estimate" (Equation 66.63) was proved in [18] under the additional geometric assumption
However, this hypothesis may be removed by application of the results of [20].

REMARK 66.12 It is likely that the optimal control time T₀ for exact controllability is the same as that for approximate controllability, but that has not been proved.

REMARK 66.13 The hypothesis (Equation 66.61) may be replaced by the assumption that the sets Γ₀ and Γ₁ meet in a strictly convex angle (measured in the interior of Ω).

REMARK 66.14 The conclusion of Theorem 66.4 is false if the space of controls is restricted to finite-dimensional controllers of the form,
The inequality (Equation 66.63) may be used to show that, for any (w⁰, w¹) ∈ V × H, the pair (φ⁰, φ¹) may be chosen so that Λ(φ⁰, φ¹) = (w¹, −w⁰). The argument is based on the calculation (using integrations by parts)

that is,

According to Equation 66.63, for T large enough, the right-hand side of Equation 66.66 defines a Hilbert norm ‖(φ⁰, φ¹)‖_F and a corresponding Hilbert space F which is the completion of sufficiently smooth pairs (φ⁰, φ¹) with respect to ‖·‖_F. The identity (Equation 66.66) shows that Λ is exactly the Riesz isomorphism of F onto its dual space F*. Because (w⁰, w¹) ∈ R_T⁰ precisely when (w¹, −w⁰) ∈ range(Λ), it follows that

The inequality (Equation 66.63) implies that H × V* ⊃ F. Therefore H × V ⊂ F*, which is the conclusion of Theorem 66.4. If (w⁰, w¹) ∈ V × H, then the minimum norm control is given by Equation 66.65, where (φ⁰, φ¹) = Λ⁻¹(w¹, −w⁰). This procedure for constructing the minimum norm control is the basis of the Hilbert Uniqueness Method introduced in [22], [23].
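The Riesz-isomorphism step (φ⁰, φ¹) = Λ⁻¹(w¹, −w⁰) has a transparent finite-dimensional counterpart: for ẋ = Ax + Bu the controllability Gramian plays the role of Λ, and solving W η = x_T yields the minimum-norm steering control. The following sketch uses a hypothetical double integrator; all names and numbers are illustrative stand-ins, not the plate system.

```python
import numpy as np

# Minimum-norm steering for a toy double integrator x1' = x2, x2' = u,
# mimicking the HUM recipe: solve the Gramian ("Riesz") equation W eta = x_T,
# then apply u(t) = B^T e^{A^T (T-t)} eta.
T = 1.0
x_target = np.array([1.0, 0.0])

# For A = [[0,1],[0,0]], e^{At} = [[1,t],[0,1]] exactly, so the Gramian
# W = int_0^T e^{At} B B^T e^{A^T t} dt = [[T^3/3, T^2/2], [T^2/2, T]].
W = np.array([[T**3 / 3, T**2 / 2], [T**2 / 2, T]])
eta = np.linalg.solve(W, x_target)        # analog of Lambda^{-1}(w1, -w0)

def u(t):
    # B^T e^{A^T (T - t)} eta  with  B = [0, 1]^T
    tau = T - t
    return tau * eta[0] + eta[1]

# Verify by integrating x' = Ax + Bu from rest on a fine Euler grid.
n = 20000
dt = T / n
x1 = x2 = 0.0
for k in range(n):
    t = k * dt
    x1, x2 = x1 + dt * x2, x2 + dt * u(t)
print(x1, x2)   # steers (0, 0) to approximately (1, 0)
```

The Gramian solve is the whole content of the "Riesz isomorphism" in this setting; in the plate problem the same inversion must be carried out in the infinite-dimensional space F.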
REMARK 66.16 In the situation where I_p = 0 in Equations 66.49 and 66.51, the only change is that H = L²(Ω) rather than the space defined above, and the optimal control time is known to be T₀ = 0, i.e., exact controllability holds in arbitrarily short time (cf. [28]).
where αᵢ, βᵢ are given L²(Γ₁) functions and cᵢ, fᵢ are L²(0, T) controls, i = 1, . . . , N (see [25]).

REMARK 66.15 Given a desired final state (w⁰, w¹) ∈ V × H, there are many ways of constructing a control pair (f, c) so that w(T) = w⁰ and ẇ(T) = w¹. The unique control of minimum L²(Γ₁ × (0, T)) norm may be constructed as follows. Set Σ₁ = Γ₁ × (0, T). Let φ be the solution of Equations 66.54–66.57 and let w be the solution of Equations 66.49–66.52 with
    f = φ|_{Σ₁}   and   c = −∇φ|_{Σ₁}.    (66.65)
REMARK 66.17 If distributed controls are used rather than boundary controls as in Remark 66.11, Equation 66.62 is not true, in general, but is valid if ω is a neighborhood of Γ.
66.6.2 Controllability of the Reissner-Mindlin System
The controllability properties of the Reissner-Mindlin system are similar to those of the Kirchhoff system. As in the last subsection, only the boundary control problem is considered in detail.
Again, we work within the context of L² controls and choose Equation 66.53 as the control space. As above, it may be assumed without losing generality that the data of the problem, F, C, ŵ, û, w⁰, w¹, u⁰, u¹, vanish. The problem under consideration is, therefore,
With this setup, Theorem 66.3 and a slightly weaker version of Theorem 66.4 may be proved for the Reissner-Mindlin system. The proofs again consist of an examination of the control-to-state map C_T : C → H × V* defined by C_T(f, c) = [w(T), ẇ(T)]. The dual mapping C_T* : H × V → C is given by
where

    Ghν · (u + ∇w) = f,    (h³/12) σ(u)ν = c    on Γ₁, t > 0,

and φ, ψ satisfy

    ρh ∂²ψ/∂t² − Gh div(φ + ∇ψ) = 0,
    (h³/12) ∂²φ/∂t² − (h³/12) div σ(φ) + Gh(φ + ∇ψ) = 0.    (66.71)

For convenience, it is assumed that Γ₀ ≠ ∅. Set w = u + wk and introduce the configuration space (reachable set) R_T⁰ at time T for this problem.
To describe R_T⁰, certain function spaces based on the kinetic and strain energy functionals of the above problem must be introduced. Let

    H = {v = u₁i + u₂j + wk = u + wk : uᵢ, w ∈ L²(Ω)},
It is a consequence of Korn's lemma and Γ₀ ≠ ∅ that ‖·‖_V is a norm equivalent to the induced H¹(Ω) norm. The space V is dense in H and the embedding V ↪ H is compact. A solution of Equations 66.67–66.70 is a finite energy solution if w(t) ∈ V, ẇ(t) ∈ H for each t, and the mapping t ↦ (w(t), ẇ(t)) is continuous into V × H. This means that the solution has finite kinetic and strain energies at each instant. As with the Kirchhoff model, solutions corresponding to inputs taken from C are not necessarily finite energy solutions. However, it is true that w(t) ∈ H, ẇ(t) ∈ V*, and the mapping t ↦ [w(t), ẇ(t)] is continuous into H × V*, where the concept of a solution is defined in an appropriate weak sense. The system (Equations 66.67–66.70) is called approximately controllable if R_T⁰ is dense in H × V*. The exact controllability problem consists of explicitly identifying dense subspaces of H × V* contained in R_T⁰.
Approximate controllability amounts to showing that C_T*(φ¹, φ⁰) = 0 only when (φ¹, φ⁰) = 0. However, if φ is the solution of Equations 66.71–66.74 and satisfies φ|_{Γ₁×(0,T)} = 0, then φ and its first derivatives vanish on Γ₁ × (0, T). If T is large enough, Holmgren's theorem then implies that φ ≡ 0. With regard to the exact controllability problem, to prove the inclusion (Equation 66.62) for the Reissner system amounts to establishing the observability estimate (cf. Equation 66.63),
For sufficiently large T, this estimate has been proved in [18] under assumptions, Equations 66.60, 66.61, and 66.64. The following exact controllability result is a consequence of Equation 66.75.
THEOREM 66.5 Under assumptions Equations 66.60, 66.61, and 66.64, there is a T₀ > 0 so that

    V × H ⊂ R_T⁰    (66.76)

if T > T₀. If Ω is convex, an explicit value of the optimal control time for approximate controllability is known.
The above system is usually analyzed by uncoupling w from G. This is done by solving the second of the two equations of the system, subject to the boundary conditions, Equation 66.79, for G in terms of w. One obtains G = −(Eh/2)𝒢[w, w], where 𝒢 is an appropriate Green's operator for the biharmonic equation. One then obtains for w the following equation with a cubic nonlinearity:
REMARK 66.18 The optimal control time for exact controllability is probably the same, but this has not been proved. See, however, [10], [11, Chapter 5]. Assumption, Equation 66.64, is probably unnecessary, but this has not been established. Remark 66.13 is valid also for the Reissner-Mindlin system. The remarks concerning approximate and exact controllability of the Kirchhoff model utilizing distributed controls remain true for the Reissner-Mindlin system.
66.6.3 Controllability of the von Kármán System
The global controllability results which hold for the Kirchhoff and Reissner-Mindlin models cannot be expected to hold for nonlinear partial differential equations in general, or for the von Kármán system in particular. Rather, only local controllability is to be expected (although global controllability results may obtain for certain semilinear systems; cf. [19]); that is, in general the most that can be expected is that the reachable set (assuming zero initial data) contains some ball S_r centered at the origin in the appropriate energy space, with the control time depending on r. The problem to be considered is
    ρh ∂²w/∂t² − I_p Δ(∂²w/∂t²) + D Δ²w − [w, G] = 0,
    Δ²G + (Eh/2)[w, w] = 0,

and, after elimination of G,

    ρh ∂²w/∂t² − I_p Δ(∂²w/∂t²) + D Δ²w + (Eh/2)[w, 𝒢[w, w]] = 0.
(66.81)

The problem for w consists of Equation 66.81 together with the boundary conditions, Equations 66.77, 66.78, and initial conditions Equation 66.80. The function spaces H and V based on the kinetic and strain energy functionals, respectively, related to the transverse displacement w, are the same as those introduced in discussing controllability of the Kirchhoff model, as is the reachable set R_T⁰ corresponding to vanishing data. Let
With this notation, a local controllability result analogous to Theorem 66.4 can be established.
THEOREM 66.6
Under assumptions, Equations 66.60, 66.61, and 66.64, there is an r > 0 and a time T₀(r) > 0 so that
Curiously, a result for the von Kármán system analogous to Theorem 66.3 is not known. Theorem 66.6 is proved by utilizing the global controllability of the linearized (i.e., Kirchhoff) problem, together with the implicit function theorem, in a manner familiar in the control theory of finite-dimensional nonlinear systems.
REMARK 66.19 If the underlying dynamics are modified by introducing a dissipative term b(x)ẇ, b(x) > 0, into Equation 66.81, it may be proved, under assumptions, Equations 66.60 and 66.61, that the conclusion, Equation 66.82, is valid for every r > 0. However, the optimal control time T₀ will continue to depend on r, so that such a result is still local.
66.7 Stabilizability of Dynamic Plates
It is assumed that Γ₀ ≠ ∅. Note that there is no initial data for G.
The problem of stabilization is concerned with the description of feedback controls which assure that the trajectories of the system converge asymptotically to an equilibrium state of the system. For infinite-dimensional systems in general, and distributed parameter systems in particular, there are various distinct notions of asymptotic stability: weak, strong, and uniform (distinctions which, incidentally, disappear in finite-dimensional approximations of the system). The differences in the various types of
stability are related to the topology in which convergence to an equilibrium takes place. The most robust notion of stability is that of uniform stability, which guarantees that all possible trajectories starting near an equilibrium of the system converge to that equilibrium at a uniform rate. In this concept, convergence is usually measured in the energy norm associated with the system. This is the classical viewpoint of stability. Strong stability, on the other hand, guarantees asymptotic convergence of each trajectory (in the energy norm) but at a rate which may become arbitrarily small, depending on the initial state of the system. The concept of weak stability is similar; however, in this case, convergence to an equilibrium takes place in a topology weaker than that associated with the energy norm. In the discussion which ensues, only uniform and strong asymptotic stability will be considered.
66.7.1 Stabilizability of Kirchhoff Plates
Consider the Kirchhoff system consisting of Equation 66.49, boundary conditions, Equations 66.50 and 66.51, and initial conditions

    w(x, 0) = w⁰(x),    ∂w/∂t (x, 0) = w¹(x),    x ∈ Ω.    (66.83)
It is assumed that Γᵢ ≠ ∅, i = 0, 1. The boundary inputs c, f are the controls. The boundary outputs are
The problem is to determine the boundary inputs in terms of the boundary outputs to guarantee that the resulting closed-loop system is asymptotically stable in some sense. The total energy of the system at time t is
Thus the feedback laws Equation 66.85 are dissipative with respect to the total energy functional E. Let H and V be the Hilbert spaces based on the energy functionals K and U, respectively, as introduced above. If (w⁰, w¹) ∈ V × H, the Kirchhoff system, Equations 66.49–66.51, with initial conditions Equation 66.83, boundary outputs Equation 66.84, and feedback law Equation 66.85, is well-posed: it has a unique finite energy solution w. The system is called uniformly asymptotically stable if there is a positive, real-valued function α(t) with α(t) → 0 as t → ∞, so that

    ‖(w(t), ẇ(t))‖_{V×H} ≤ α(t) ‖(w⁰, w¹)‖_{V×H}.
Therefore, E(t) ≤ α²(t)E(0). If such a function α exists, it is necessarily of exponential type: α(t) = Ce^{−ωt} for some ω > 0. The system is strongly asymptotically stable if, for every initial state (w⁰, w¹) ∈ V × H, the corresponding solution satisfies
or, equivalently, that E(t) → 0 as t → ∞. Uniform and strong asymptotic stability are not equivalent concepts: strong asymptotic stability does not imply uniform asymptotic stability. For strong stability there is the following result.
THEOREM 66.7 Assume that kᵢ ≥ 0, i = 0, 1. Then the closed-loop Kirchhoff system is strongly asymptotically stable.

The proof of this theorem amounts to verifying that, under the stated hypotheses, the problem has no spectrum on the imaginary axis. The latter is a consequence of the Holmgren uniqueness theorem (see [13, Chapter 4] for details). For the closed-loop Kirchhoff system to be uniformly asymptotically stable, the geometry of Γᵢ must be suitably restricted.
THEOREM 66.8 Assume that kᵢ > 0, i = 0, 1, and that Γᵢ satisfy Equations 66.60 and 66.61. Then the closed-loop Kirchhoff system, obtained by introducing the proportional feedback law Equation 66.85, is uniformly asymptotically stable.

The proof of Theorem 66.8 follows from the estimate

    E(T) ≤ C_T ∫₀ᵀ ∫_{Γ₁} [ k₀ ẇ² + k₁ |∇ẇ|² ] dΓ dt,    T large,    (66.86)

where ẇ = ∂w/∂t. The proof of Equation 66.86 is highly nontrivial. From Equation 66.86 and a direct calculation of dE/dt, it follows that E(T) ≤ γE(0) for some γ < 1 independent of the solution, which implies the conclusion of the theorem.

REMARK 66.20 Theorem 66.8 was first proved in [13] under the additional assumption Equation 66.64, but the latter condition may be removed by applying the results in [20]. Assumption, Equation 66.61, may be weakened; see Remark 66.13. If I_p = 0, the conclusion holds even when k₁ = 0.
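A finite-dimensional caricature of Theorem 66.8: a chain of masses and springs, clamped at one end, with the dissipative proportional feedback −k₀·(velocity) applied at the free end. This lumped model is an assumed stand-in for the plate, chosen only to make the energy-decay mechanism visible; all parameters are illustrative.

```python
import numpy as np

# Chain of N unit masses and unit springs, clamped at the left end, with a
# single damper (force = -k0 * velocity) at the free right end -- the lumped
# counterpart of the boundary feedback f = -k0 * w_t on Gamma_1.
N, k_spring, k0, dt, steps = 20, 1.0, 0.5, 0.01, 20000

q = np.sin(np.pi * np.arange(1, N + 1) / (2 * N))   # smooth initial displacement
p = np.zeros(N)                                     # initially at rest

def accel(q, p):
    a = np.empty(N)
    for i in range(N):
        left = q[i - 1] if i > 0 else 0.0           # clamped left boundary
        right = q[i + 1] if i < N - 1 else q[i]     # free right end: no spring beyond
        a[i] = k_spring * (left - 2.0 * q[i] + right)
    a[-1] -= k0 * p[-1]                             # dissipative boundary feedback
    return a

def energy(q, p):
    stretch = np.diff(np.concatenate(([0.0], q)))   # spring extensions
    return 0.5 * np.sum(p**2) + 0.5 * k_spring * np.sum(stretch**2)

E0 = energy(q, p)
for _ in range(steps):                              # symplectic Euler in time
    p += dt * accel(q, p)
    q += dt * p
print(energy(q, p) / E0)    # a small fraction: energy absorbed at the boundary
```

The damping acts at a single point, yet the whole chain loses energy because vibrations repeatedly travel to the damped end, the discrete shadow of the geometric conditions (66.60) and (66.61), which guarantee that all rays of the plate reach the controlled edge.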
REMARK 66.21 In place of the linear relationship Equation 66.85, one may consider a nonlinear feedback law
are defined as for the Kirchhoff system, and the loop is closed using the proportional feedback law Equation 66.85. The following result has been proved in [4].
Here y(·) is a real-valued function and z(·) : ℝ² → ℝ² satisfies x · z(x) > 0 for all x ∈ ℝ²\{0}.

THEOREM 66.9 Assume that kᵢ > 0, i = 0, 1, and that Γᵢ satisfy Equations 66.60, 66.61, and 66.64. Then there is an r > 0 so that
    E(t) ≤ C e^{−ωt} E(0)    (66.88)
The closed-loop system is then dissipative. In addition, suppose that both y(·) and z(·) are continuous, monotone increasing functions. The closed-loop system is then well-posed in finite energy space. Under some additional assumptions on the growth of y(·), z(·) at 0 and at ∞, the closed-loop system has a decay rate which, however, will be algebraic rather than exponential, and will depend on a bound on the initial data; cf. [13, Chapter 5] and [16].
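The slower, algebraic decay under nonlinear monotone damping can be seen already in a scalar analog. Here y(s) = s³ is a hypothetical choice satisfying s·y(s) > 0; successive halvings of the energy take progressively longer, in contrast to the fixed halving time of exponential decay.

```python
import numpy as np

# Scalar analog of nonlinear boundary damping: x'' + y(x') + x = 0 with the
# monotone damping y(s) = s^3.  Energy E = (x^2 + x'^2)/2 decays roughly like 1/t.

def rhs(state):
    x, v = state
    return np.array([v, -x - v**3])

def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def energy(state):
    x, v = state
    return 0.5 * (x * x + v * v)

state, dt, t = np.array([1.0, 0.0]), 0.01, 0.0
E = {0: energy(state)}
for k in range(10000):
    state = rk4_step(state, dt)
    t += dt
    if k + 1 in (5000, 10000):
        E[round(t)] = energy(state)

# The decay ratio over equal time windows creeps toward 1: algebraic, not exponential.
print(E[0], E[50], E[100])
```

For exponential decay the ratios E(50)/E(0) and E(100)/E(50) would be equal; here the second ratio is much closer to 1, the signature of an algebraic rate depending on the initial energy.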
66.7.2 Stabilizability of the Reissner-Mindlin System The system, (Equations 66.67-66.69) is considered, along with the initial conditions
    u(x, 0) = u⁰(x),    ∂u/∂t (x, 0) = u¹(x),    x ∈ Ω.    (66.87)
The boundary inputs c, f are the controls. The boundary outputs are
The total energy of the system at time t is
which suggests that the loop be closed by introducing the proportional feedback law

    f = −k₀ y,    c = −k₁ z,    kᵢ ≥ 0,   k₀ + k₁ > 0,
so that the closed-loop system is dissipative. In fact, it may be proved that the conclusions of Theorems 66.7 and 66.8 above hold for this system; see [13], where this is proved under the additional geometric assumption, Equation 66.64.
66.7.3 Stabilizability of the von Kármán System
Consider the system consisting of Equation 66.81, boundary conditions Equations 66.77 and 66.78, and initial conditions Equation 66.83. The inputs, outputs, and total energy of this system
provided E(0) < r, where ω > 0 does not depend on r.

REMARK 66.22 If I_p = 0 and Γ₀ = ∅, the estimate Equation 66.88 was established in [13, Chapter 5] for every r > 0, with constants C, ω independent of r, but under a modified feedback law for c.
REMARK 66.23 If the underlying dynamics are modified by introducing a dissipative term b(x)ẇ, b(x) > 0, into Equation 66.81, it is proven in [5] that, under assumptions, Equations 66.60 and 66.61, the conclusion Equation 66.88 is valid for every r > 0, where both constants C, ω depend on r. This result was later extended in [8] to the case of nonlinear feedback laws (cf. Remark 66.21).
References

[1] Adams, R.A., Sobolev Spaces, Academic Press, New York, 1975.
[2] Balakrishnan, A.V., Applied Functional Analysis, 2nd ed., Springer, New York, 1981.
[3] Banks, H.T. and Smith, R.C., Models for control in smart material structures, in Identification and Control in Systems Governed by Partial Differential Equations, Banks, H.T., Fabiano, R.H., and Ito, K., Eds., SIAM, 1992, pp. 27–44.
[4] Bradley, M. and Lasiecka, I., Local exponential stabilization for a nonlinearly perturbed von Kármán plate, Nonlinear Analysis: Theory, Methods and Appl., 18, 333–343, 1992.
[5] Bradley, M. and Lasiecka, I., Global decay rates for the solutions to a von Kármán plate without geometric constraints, J. Math. Anal. Appl., 181, 254–276, 1994.
[6] Bresse, J.A.C., Cours de mécanique appliquée, Mallet-Bachelier, 1859.
[7] Hirschhorn, M. and Reiss, E., Dynamic buckling of a nonlinear Timoshenko beam, SIAM J. Appl. Math., 37, 290–305, 1979.
[8] Horn, M.A. and Lasiecka, I., Nonlinear boundary stabilization of a von Kármán plate equation, in Differential Equations, Dynamical Systems and Control Science: A Festschrift in Honor of Lawrence Markus, Elworthy, K.D., Everitt, W.N., and Lee, E.B., Eds., Marcel Dekker, New York, 1993, pp. 581–604.
[9] Kane, T.R., Ryan, R.R., and Banerjee, A.K., Dynamics of a beam attached to a moving base, AIAA J. Guidance, Control Dyn., 10, 139–151, 1987.
[10] Komornik, V., A new method of exact controllability in short time and applications, Ann. Fac. Sci. Toulouse, 10, 415–464, 1989.
[11] Komornik, V., Exact Controllability and Stabilization: The Multiplier Method, Masson–John Wiley & Sons, Paris, 1994.
[12] Krabs, W., On Moment Theory and Controllability of One-Dimensional Vibrating Systems and Heating Processes, Lecture Notes in Control and Information Sciences, Vol. 173, Springer, New York, 1992.
[13] Lagnese, J.E., Boundary Stabilization of Thin Plates, Studies in Applied Mathematics, Vol. 10, SIAM, Philadelphia, 1989.
[14] Lagnese, J.E., The Hilbert uniqueness method: A retrospective, in Optimal Control of Partial Differential Equations, Hoffmann, K.H. and Krabs, W., Eds., Springer, Berlin, 1991, pp. 158–181.
[15] Lagnese, J.E., Recent progress in exact boundary controllability and uniform stabilizability of thin beams and plates, in Distributed Parameter Systems: New Trends and Applications, Chen, G., Lee, E.B., Littman, W., and Markus, L., Eds., Marcel Dekker, New York, 1991, pp. 61–112.
[16] Lagnese, J.E. and Leugering, G., Uniform stabilization of a nonlinear beam by nonlinear boundary feedback, J. Diff. Eqns., 91, 355–388, 1991.
[17] Lagnese, J.E., Leugering, G., and Schmidt, E.J.P.G., Modeling, Analysis and Control of Dynamic Elastic Multi-Link Structures, Birkhäuser, Boston, 1994.
[18] Lagnese, J.E. and Lions, J.L., Modelling, Analysis and Control of Thin Plates, Collection RMA, Vol. 6, Masson, Paris, 1988.
[19] Lasiecka, I. and Triggiani, R., Exact controllability of semilinear abstract systems with application to waves and plates boundary control problems, Appl. Math. Opt., 23, 109–154, 1991.
[20] Lasiecka, I. and Triggiani, R., Sharp trace estimates of solutions to Kirchhoff and Euler-Bernoulli equations, Appl. Math. Opt., 28, 277–306, 1993.
[21] Le Dret, H., Problèmes variationnels dans les multi-domaines: Modélisation des jonctions et applications, Collection RMA, Vol. 19, Masson, Paris, 1991.
[22] Lions, J.L., Contrôlabilité exacte, perturbations et stabilisation de systèmes distribués, Tome I: Contrôlabilité exacte, Collection RMA, Vol. 8, Masson, Paris, 1988.
[23] Lions, J.L., Exact controllability, stabilization and perturbations for distributed systems, SIAM Review, 30, 1–68, 1988.
[24] Russell, D.L., On mathematical models for the elastic beam with frequency-proportional damping, in Control and Estimation in Distributed Parameter Systems, Banks, H.T., Ed., SIAM, 1992, pp. 125–169.
[25] Triggiani, R., Lack of exact controllability for wave and plate equations with finitely many boundary controls, Diff. Int. Equations, 4, 683–705, 1991.
[26] Washizu, K., Variational Methods in Elasticity and Plasticity, 3rd ed., Pergamon, Oxford, 1982.
[27] Wempner, G., Mechanics of Solids with Applications to Thin Bodies, Sijthoff and Noordhoff, Rockville, MD, 1981.
[28] Zuazua, E., Contrôlabilité exacte d'un modèle de plaques vibrantes en un temps arbitrairement petit, C. R. Acad. Sci., 304, 173–176, 1987.
Control of the Heat Equation
Thomas I. Seidman, Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, MD
67.1 Introduction ......................................................... 1157
67.2 Background: Physical Derivation .................................... 1158
67.3 Background: Significant Properties ................................. 1159
    The Maximum Principle and Conservation • Smoothing and Localization • Linearity • Autonomy, Similarity, and Scalings
67.4 Some Control-Theoretic Problems ................................... 1163
    A Simple Heat Transfer Problem • Exact Control • System Identification
67.5 More Advanced System Theory ....................................... 1165
    The Duality of Observability/Controllability • The One-Dimensional Case • Higher Dimensional Geometries
References .................................................................. 1168
67.1 Introduction

We must begin¹ by distinguishing between the considerations for many practical problems in controlled heat transfer and the theoretical considerations for applying the essential ideas developed for the control theory of differential equations to systems governed by partial differential equations, here the linear heat equation,
    ∂v/∂t = Δv + ψ    for t > 0,  x = (x, y, z) ∈ Ω ⊂ ℝ³.    (67.1)

Many of the former set of problems relate to optimal design, rather than dynamic control, and many of the concerns are about fluid flow in a heat exchanger or phase changes (e.g., condensation) or other issues beyond the physical situations described by Equation 67.1. Some properties of Equation 67.1 are relevant to these problems and we shall touch on these, but the essential concerns are outside the scope of this chapter. This chapter will focus on the distinctions between 'lumped parameter systems' (with finite-dimensional state space, governed by ordinary differential equations) and 'distributed parameter systems' governed by partial differential equations such as Equation 67.1, so that the state, for each t, is a function of position in the spatial region Ω. Though Equation 67.1 may be viewed abstractly as an ordinary differential equation²

    dv/dt = Δv + ψ    for t > 0,    (67.3)

abstract ordinary differential equations such as Equation 67.3 are quite different from the more familiar ordinary differential equations with finite-dimensional states. The intuition appropriate for the parabolic partial differential Equation 67.3 is quite different from that appropriate for the wave equation

    ∂²w/∂t² = Δw + ψ₂    for t > 0,
describing a very different set of physical phenomena with very different properties (although in Section 67.5.3 we describe an interesting relationship for the corresponding theories of observation and control). The first two sections of this chapter provide background on relevant properties of Equation 67.1, including examples and implications of these general properties for practical heat conduction problems. We then discuss the system-theoretic properties of Equation 67.1 or 67.3. We will emphasize, in particular, the considerations arising when the inputloutput occurs with no direct analog in the theory of lumped parameter systems - not through the equation itself, but through the boundary conditions appropriate to the partial differential Equation 67.1. This
¹This research has been partially supported by the National Science Foundation under grant ECS-8814788 and by the U.S. Air Force Office of Scientific Research under grant AFOSR-91-0008.
²Now v(t) denotes the state, as an element of an infinite-dimensional space of functions on Ω, and Δ is the Laplace operator, given in the three-dimensional case by Δv = ∂²v/∂x² + ∂²v/∂y² + ∂²v/∂z², with specification of the relevant boundary conditions.
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
THE CONTROL HANDBOOK

mode of interaction is quite plausible for physical implementation because it is difficult to influence or to observe directly the behavior of the system in the interior of the spatial region.
67.2 Background: Physical Derivation

Unlike situations involving ordinary differential equations with finite-dimensional state space, one cannot work with partial differential equations without developing an appreciation for the characteristic properties of the particular kind of equation. For the classical equations, such as Equation 67.1, this is closely related to physical interpretations. Thus, we begin with a discussion of some interpretations of Equation 67.1 and only then note the salient properties needed to understand its control. Although Equation 67.1 is the heat equation, governing conductive heat transfer, our intuition will be aided by noting that this same equation also governs molecular diffusion for dilute solutions, certain dispersion phenomena, and the evolution of the probability distribution in the stochastic theory of Brownian motion.

For heat conduction, the fundamental notions of heat content Q and temperature T are related³ by

    [heat content] = [heat capacity] · [temperature]   (67.6)
or, in symbols,

    Q = ρcT   (67.7)

where Q is the heat density (per unit volume), ρ is the mass density, T is the temperature, and c is the 'incremental heat capacity'. The well-known physics of the situation is that heat will flow by conduction from one body to another at a rate proportional to the difference of their temperatures. Within a continuum one has a heat flux vector q⃗ describing the heat flow: q⃗·n⃗ dA is the rate (per unit time) at which heat flows through any (imaginary) surface element dA, oriented by its unit normal n⃗. This is given by Fourier's Law,

    q⃗ = −k grad T = −k∇T   (67.8)

with a (constant⁴) coefficient of heat conduction k > 0.

For any (imaginary) region B in the material, the total rate of heat flow out of B is then ∫_∂B q⃗·n⃗ dA (where n⃗ is the outward normal to the bounding surface ∂B) and, by the Divergence Theorem, this equals the volume integral of div q⃗. Combining this with Equations 67.7 and 67.8, and using the arbitrariness of B, gives the governing⁵ heat equation,

    ρc ∂T/∂t = ∇·(k∇T) + Φ   (67.9)

where Φ is a possible source term for heat.

Let us now derive the equation governing molecular diffusion, the spread of a substance in another (e.g., a 'solute' in a 'solvent') caused by the random collisions of molecules. Assuming a solution dilute enough that one can neglect the volume fraction occupied by the solute compared with the solvent, we may present our analysis in terms of the concentration (relative density) C of the relevant chemical component. Analogous to the previous derivation, a material flux vector J⃗ is given by Fick's Law:

    J⃗ = −D∇C   (67.10)

where D > 0 is the diffusion coefficient⁶. As in Equation 67.9, this law for the flux immediately leads to the conservation equation

    C_t = ∇·(D∇C) + Φ   (67.11)
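The conservation structure of Equations 67.9–67.11 (the rate of change of total content equals source minus boundary outflow) survives discretization if one updates cell averages by interface fluxes. The following is a minimal numerical sketch, not from the chapter; the grid sizes and initial data are illustrative. It checks that a flux-form update of a one-dimensional insulated rod conserves total heat exactly (to rounding) while the temperature profile evens out.

```python
import numpy as np

# Finite-volume sketch of the conservation law T_t = -div(flux), flux = -alpha*T_x
# (Equation 67.9 with rho*c = 1 and no source), insulated at both ends.
n, dx, dt, alpha = 50, 1.0 / 50, 1e-4, 1.0   # illustrative grid and diffusivity

T = np.sin(np.pi * np.arange(n) * dx) + 2.0   # arbitrary initial temperature
total0 = T.sum() * dx                          # initial total heat

for _ in range(200):
    flux = -alpha * np.diff(T) / dx                # Fourier's law at interior interfaces
    flux = np.concatenate(([0.0], flux, [0.0]))    # insulated: zero boundary flux
    T -= dt * np.diff(flux) / dx                   # conservative update

total = T.sum() * dx
# total heat is unchanged (up to rounding) while the profile has smoothed
```

Because each interface flux is added to one cell and subtracted from its neighbor, the interior fluxes cancel in the sum and only the (zero) boundary fluxes can change the total, which is exactly the Divergence Theorem argument above in discrete form.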
³More precisely, because the mass density ρ and the incremental heat capacity c (i.e., the amount of heat needed to raise the temperature of a unit mass of material by 1°C when it is already at temperature θ) are each temperature dependent, the heat content in a region B with temperature distribution T(·) is obtained by integrating ρc over B and over the relevant temperature range. For present purposes we assume that (except, perhaps, for the juxtaposition of regions with dissimilar materials) ρc is constant. This means that we assume the temperature variation is not so large as to require working with the more complicated nonlinear model of Equation 67.5. In particular, we will not treat situations involving phase changes, such as condensation or melting.
⁴This coefficient k is, in general, also temperature dependent as well as a material property. Our earlier assumption about ρc is relevant here, permitting us to assume k is a constant.
where Φ is now a source term for this component, say, by some chemical reaction. A different mechanism for the spread of some substance in another depends on the effect of comparatively small relative velocity fluctuations of the medium, e.g., gusting in the atmospheric spread of the plume from a smokestack or the effect of path variation through the interstices of packed soil in the spread of a pollutant in groundwater flow. Here one again has a material flux for the concentration, given now by Darcy's Law, which appears identical to Equation 67.10. The situation may well be more complicated here, however, because anisotropy (D is then a matrix) and/or various forms of degeneracy may exist (e.g., D becoming 0 when C = 0); nevertheless, we still get Equation 67.11 with this dispersion coefficient D. Here, as earlier, we will assume a constant scalar D > 0. Assuming constant coefficients in each case, we can simplify Equation 67.9 or 67.11 by writing these as

    v_t = D Δv + ψ   (67.12)
⁵It is essential to realize that q⃗, given in Equation 67.8, refers to heat flow relative to the material. If there is spatial motion of the material itself, then this argument remains valid provided the regions B move correspondingly, i.e., Equation 67.9 holds in material coordinates. When this is referred to stationary coordinates, the heat is transported in space by the material motion, i.e., we have convection.
⁶More detailed treatments might consider that D depends on the temperature, etc., of the solvent and also on the existing concentration, even for dilute concentrations. As earlier, these effects are insignificant for the situations considered, so that D is constant.
where v stands either for the temperature T or the concentration C, the subscript t denotes a partial derivative, and, in considering Equation 67.9, D stands for the thermal diffusivity α = k/ρc. We may, of course, always choose units to make D = 1 in Equation 67.12 so that it becomes Equation 67.3.

It is interesting and important for applications to know the range of magnitudes of the coefficient D in Equation 67.12 in fixed units, say, cm²/sec. For heat conduction, typical values of the coefficient D = α are, approximately,

    8.4 for diamond,
    1.1 for copper,
    0.2–0.6 for steam (rising with temperature),
    0.17 for cast iron,
    0.086 for bronze,
    0.011 for ice,
    7.8 × 10⁻³ for glass,
    4 × 10⁻³ for soil,
    1.4–1.7 × 10⁻³ for water,
    6.2 × 10⁻⁴ for hard rubber.

For molecular diffusion, typical figures for D might be

    about 0.1 for gaseous diffusion,
    0.28 for the diffusion of water vapor in air,
    2 × 10⁻⁵ for air dissolved in water,
    1.2 × 10⁻⁵ for a dilute solution of water in ethanol,
    8.4 × 10⁻⁶ for ethanol in water,
    1.5 × 10⁻⁸ for solid diffusion of carbon in iron,
    1.6 × 10⁻¹⁰ for hydrogen in glass.

Finally, for the dispersion of a smoke plume in a mildly stable atmosphere (say, a 15 mph breeze), D might be approximately 10⁵ cm²/sec. As expected, dispersion is a far more effective spreading mechanism than molecular diffusion.

Assuming one knows the initial state of the physical system,

    v(0, x) = v₀(x)   for x ∈ Ω,   (67.13)

where Ω is the region of ℝ³ we wish to consider, we cannot determine the system evolution unless we also know (or can determine) the source term Φ and, unless Ω is all of ℝ³, can furnish adequate information about the interaction at the boundary ∂Ω. The simplest setting is no interaction at all: the physical system is insulated from the rest of the universe so no flux crosses the boundary. Formally, this requires that q⃗·n⃗ = 0 or, from Equation 67.10 or 67.8 with the scaling of Equation 67.12,

    ∂v/∂n = 0   on Σ = (0, T) × ∂Ω.   (67.14)

More generally, the flux might be more arbitrary but known, so we have the inhomogeneous Neumann condition:

    −∂v/∂n = g₁   on Σ = (0, T) × ∂Ω.   (67.15)

An alternative⁷ set of data would involve knowing the temperature (concentration) at the boundary, i.e., having the Dirichlet condition,

    v = g₀   on Σ = (0, T) × ∂Ω.   (67.17)

If we have Equation 67.1 on Q = (0, T) × Ω with ψ specified on Q and the initial condition Equation 67.13 specified on Ω, then either⁸ of the boundary conditions Equation 67.15 or 67.17 suffices to determine the evolution of the system on Q, i.e., for 0 < t ≤ T. We refer to either of these as the direct problem. The mathematical theory supports our physical interpretation: an important property of this problem is that it is well-posed, i.e., a unique solution exists for each choice of the data and small changes in the data produce⁹ correspondingly small changes in the solution.

67.3 Background: Significant Properties

In this section we note some of the characteristic properties of the 'direct problem' for the partial differential Equation 67.1 and, related to these, introduce the representation formulas underlying the mathematical treatment.

67.3.1 The Maximum Principle and Conservation

One characteristic property, going back to the physical derivation, is that Equation 67.1 is a conservation equation. In the simplest form, when heat or material is neither created nor destroyed in the interior (ψ = 0) and if the region is insulated as in Equation 67.14, then [total heat or material] = ∫_Ω v dV is constant in time. More

⁷Slightly more plausible physically would be to assume that the ambient temperature or concentration g would be known or determinable 'just outside' ∂Ω and then to use the flux law (proportionality to the difference) directly,

    −D ∂v/∂n = λ(v − g)   on Σ,   (67.16)

with a flux transfer coefficient λ > 0. Note that, if λ ≈ 0 (negligible heat or material transport), then we effectively get Equation 67.14. On the other hand, if λ is very large (v − g = −(D/λ)∂v/∂n with D/λ ≈ 0), then v will immediately tend to match g at ∂Ω, giving Equation 67.17; see Section 67.4.1.
⁸We may also have, more generally, a partition of ∂Ω into Γ₀ ∪ Γ₁ with data given in the form Equation 67.17 on Σ₀ := (0, T) × Γ₀ and in the form Equation 67.15 on Σ₁ = (0, T) × Γ₁.
⁹Making this precise, i.e., specifying the appropriate meanings of 'small', becomes rather technical and, unlike the situation for ordinary differential equations, can be done in several ways, each useful for different situations. We will see, on the other hand, that some other problems which arise in system-theoretic analysis turn out to be 'ill-posed', i.e., do not have this well-posedness property; see, e.g., Section 67.4.3.
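The practical meaning of the enormous spread in the magnitudes of D quoted above can be made tangible through the standard diffusive scaling: the characteristic time for diffusion to act over a distance L is roughly t ~ L²/D. The following is a small illustrative sketch (the selection of materials follows the lists above; the 10 cm length is an arbitrary choice for illustration).

```python
# Characteristic diffusion times t ~ L^2 / D for a few of the coefficient
# values quoted above (in cm^2/sec).  Illustrative only.
D = {
    "copper (heat)": 1.1,
    "water (heat)": 1.5e-3,
    "water vapor in air (diffusion)": 0.28,
    "hydrogen in glass (diffusion)": 1.6e-10,
    "smoke plume (dispersion)": 1e5,
}

L = 10.0  # a distance of 10 cm
time_scale = {name: L**2 / d for name, d in D.items()}  # seconds

# heat crosses 10 cm of copper in about a minute and a half; hydrogen would
# take tens of thousands of years to diffuse 10 cm through glass, while the
# plume disperses over 10 cm in about a millisecond
```

This is exactly the scaling invariance discussed in Section 67.3.4 below: doubling the distance quadruples the required time, for any value of D.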
generally, one has

    d/dt ∫_Ω v dV = ∫_Ω ψ dV − ∫_∂Ω g₁ dA   (67.18)

for v satisfying Equations 67.3 and 67.15. Another important property is the Maximum Principle:¹⁰

    Let v satisfy Equation 67.3 with ψ ≥ 0 on Q_t := (0, t) × Ω. Then the minimum value of v(t, x) on Q̄_t is attained either initially (t = 0) or at the boundary (x ∈ ∂Ω). Unless v is a constant, this value cannot also occur in the interior of Q_t; if it is a boundary minimum with t > 0, then one must have a strictly negative outward normal derivative ∂v/∂n < 0 at that point. Similarly, if v satisfies Equation 67.3 with ψ ≤ 0, then its maximum is attained for t = 0 or at x ∈ ∂Ω, etc.

One simple argument for this rests on the observation that, at an interior minimum, one would necessarily have v_t = ∂v/∂t ≤ 0 and also Δv ≥ 0. The Maximum Principle shows, for example, that the mathematics of Equation 67.1 is consistent with the physical requirement that a concentration cannot become negative and with the fact that, because heat flows 'from hotter to cooler', it is impossible to develop a 'hot spot' except by providing a heat source.

67.3.2 Smoothing and Localization

The dominant feature of Equation 67.1 is that solutions rapidly smooth out, with the peaks and valleys of the initial data averaging. We will see this in more mathematical detail later, but comment now on three points: approach to steady state, infinite propagation speed, and localization and geometric reduction.

The first simply means that if neither ψ nor the data g₀ vary in time, then the solution v of Equations 67.3–67.17 on (0, ∞) × Ω tends, as t → ∞, to the unique solution v̄ of the (elliptic) steady-state equation

    −[∂²/∂x² + ∂²/∂y² + ∂²/∂z²] v̄ = ψ,   v̄|_∂Ω = g₀.   (67.19)

The same would hold if we were to use Equation 67.15 rather than Equation 67.17 except that, as is obvious from Equation 67.18, we must then impose the consistency condition

    ∫_Ω ψ dV = ∫_∂Ω g₁ dA   (67.20)

for there to be a steady state at all, and then must note that the solution of the steady-state equation only becomes unique when one supplements Equation 67.20 by specifying, from the initial conditions Equation 67.13, the value of ∫_Ω v̄ dV.

Unlike the situation with the wave Equation 67.4, the mathematical formulation Equation 67.1, etc., implies an infinite propagation speed for disturbances: e.g., the effect of a change in the boundary data g₀(t, x) at some point x* ∈ ∂Ω occurring at a time t = t* is immediately felt throughout the region, affecting the solution for every x ∈ Ω at every t > t*. One can see that this is necessary to have the Maximum Principle, for example, but it is certainly nonphysical. This phenomenon results from idealizations in our derivation and becomes consistent with our physical intuition when we note that this 'immediate influence' is extremely small: there is a noticeable delay before a perturbation has a noticeable effect at a distance.

Consistent with the last observation, we note that the behavior in any subregion will, to a great extent, be affected only very slightly (in any fixed time) by what happens at parts of the boundary which may be very far away; this is a 'localization' principle. For example, if we are only interested in what is happening close to one part of the boundary, then we may treat the far boundary as 'at infinity'. To the extent that there is little spatial variation in the data at the nearby part of the boundary, we may then approximate the solution quite well by solving the problem on a half-space with spatially constant boundary data, dependent only on time. Taking coordinates so that the boundary becomes the plane x = 0, this solution will be independent of the variables y, z if the initial data and source term are. Equation 67.1 then reduces to a one-dimensional form

    v_t = v_xx + ψ   (67.21)

for t > 0 and, now, x > 0 with, e.g., specification of v(t, 0) = g₀(t) and of v(0, x) = v₀(x). Similar dimensional reductions occur in other contexts: one might get Equation 67.21 for 0 < x < L where L gives the thickness of a slab in appropriate units, or one might get a two-dimensional form corresponding to a body which is long compared to its constant cross-section and with data relatively constant longitudinally. In any case, our equation will be Equation 67.3, with the dimensionally suitable interpretation of the Laplace operator. Even if the initial data does depend on the variables to be omitted, our first property asserts that this variation will disappear so that we may still get a good approximation after waiting through an initial transient. On the other hand, one usually cannot accept this approximation near, e.g., the ends of the body where 'end effects' due to those boundary conditions may become significant.

¹⁰This should not be confused with the Pontryagin Maximum Principle for optimal control.
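Both the Maximum Principle and the approach to steady state are easy to observe numerically. The following is a minimal sketch, not from the chapter: an explicit finite-difference run of the one-dimensional form v_t = v_xx on 0 < x < 1 with homogeneous Dirichlet data (so the steady state is v̄ = 0); all grid sizes and the 'hot spot' initial data are illustrative.

```python
import numpy as np

# Explicit finite differences for v_t = v_xx, v(t,0) = v(t,1) = 0, psi = 0.
n = 101
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
dt = 0.4 * dx**2          # satisfies the explicit stability limit dt <= dx^2 / 2

v = np.maximum(0.0, 1.0 - 10.0 * np.abs(x - 0.3))   # a 'hot spot', peak value 1
vmax0 = v.max()

max_ok = True
for _ in range(2000):
    v[1:-1] += dt * (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2
    v[0] = v[-1] = 0.0                                # homogeneous Dirichlet data
    # Maximum Principle, discrete form: values stay between the initial and
    # boundary extremes (the update is a convex combination of neighbors)
    max_ok = max_ok and (v.max() <= vmax0 + 1e-12) and (v.min() >= -1e-12)

# the hot spot has spread out and decayed toward the steady state v = 0
```

The discrete update with dt ≤ dx²/2 writes each new value as a convex combination of its neighbors, which is precisely why no interior maximum or 'hot spot' can be created, mirroring the argument in the text.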
67.3.3 Linearity

We follow Fourier in using the linearity of the heat equation, expressed as a 'superposition principle' for solutions, to obtain a general representation for solutions as an infinite series. Let {[e_k, λ_k] : k = 0, 1, ...} be the pairs of eigenfunctions and eigenvalues for −Δ on Ω, i.e.,

    −Δe_k = λ_k e_k on Ω (with BC)   for k = 0, 1, ...,   (67.22)

where "BC" denotes one of the homogeneous conditions

    e_k = 0  or  ∂e_k/∂n = 0   on ∂Ω   (67.23)

based on considering Equation 67.17 or 67.15. It is always possible to take these so that

    (e_j, e_k) = 1 if j = k, and 0 otherwise,   (67.24)

with 0 ≤ λ₀ < λ₁ ≤ ... → ∞. Note λ₀ > 0 for Equation 67.17 and λ₀ = 0 for Equation 67.15. From Equation 67.22, each function e^{−λ_k t} e_k(x) satisfies Equation 67.1 so, superposing, we see that

    v(t, x) = Σ_k c_k e^{−λ_k t} e_k(x)   (67.25)

gives the "general solution" with the coefficients (c_k) obtained from Equation 67.13 by

    c_k = (v₀, e_k),   (67.26)

assuming Equation 67.24. Note that (·, ·) denotes the L²(Ω) inner product, (f, g) = ∫_Ω f(x)g(x) dᵐx (for m-dimensional Ω, with m = 1, 2, 3). The expansion Equation 67.26, and so Equation 67.25, is valid if the function v₀ is in the Hilbert space L²(Ω), i.e., if ∫ |v₀|² < ∞. The series Equation 67.26 need not converge pointwise unless one assumes more smoothness for v₀. Because it is known that, asymptotically as k → ∞,

    λ_k ~ Ck^{2/m}   with C = C(Ω),   (67.27)

the factors e^{−λ_k t} decrease quite rapidly for any fixed t > 0, and Equation 67.25 then converges nicely to a smooth function. Indeed, this is just the 'smoothing' noted above: this argument can be used to show that solutions of Equation 67.1 are analytic (representable locally by convergent power series) in the interior of Ω for any t > 0 and that this does not depend on having homogeneous boundary conditions.

The same approach can be used when there is a source term ψ as in Equation 67.9 but we still have homogeneous boundary conditions as, e.g., g₀ = 0 in Equation 67.17. The more general representation is

    v(t, x) = Σ_k γ_k(t) e_k(x)   where   γ_k(t) = c_k e^{−λ_k t} + ∫₀ᵗ e^{−λ_k(t−s)} ψ_k(s) ds,   ψ_k(s) := (ψ(s, ·), e_k),   (67.28)

for the solution of Equation 67.9. When ψ is constant in t, this reduces to

    v(t, ·) = v̄ + Σ_k [c_k − ψ_k/λ_k] e^{−λ_k t} e_k,   (67.29)

which shows that v(t, ·) → v̄, as in Equation 67.19 with g₀ = 0, and demonstrates the exponential rate of convergence with the transient dominated by the principal terms, corresponding to the smaller eigenvalues. This last must be modified slightly when using Equation 67.15, because λ₀ = 0.

Another consequence of linearity is that the effect of a perturbation is simply additive: if v̂ solves Equation 67.9 with data ψ̂ and v̂₀ and one perturbs this to obtain a new perturbed solution ṽ for the data ψ̂ + ψ and v̂₀ + v₀ (and unperturbed boundary data), then the solution perturbation v = ṽ − v̂ itself satisfies Equation 67.9 with data ψ and v₀ and homogeneous boundary conditions. If we now multiply the partial differential equation by v and integrate, we get

    (1/2) d/dt ‖v‖² = ∫_Ω vΔv + ∫_Ω ψv ≤ ‖ψ‖ ‖v‖,

using the Divergence Theorem to see that ∫ vΔv = −∫ |∇v|² ≤ 0, with no boundary term because the boundary conditions are homogeneous, and the Cauchy–Schwarz Inequality to get ∫ ψv ≤ ‖ψ‖ ‖v‖, where ‖·‖ is the L²(Ω)-norm: ‖v‖ = [∫_Ω |v|²]^{1/2}. We can then apply the Gronwall inequality¹¹ to derive, for example, the energy inequality

    ‖v(t, ·)‖ ≤ ‖v₀‖ + ∫₀ᵗ ‖ψ(s, ·)‖ ds.

This is one form of the well-posedness property asserted at the end of the last section.

67.3.4 Autonomy, Similarity, and Scalings

Two additional useful properties of the heat equation are autonomy and causality. The first means that the equation itself is time independent so that a time-shifted setting gives the time-shifted solution. For the pure initial-value problem, i.e., Equation 67.1 with g = 0 in Equation 67.17 or 67.15, 'causality' means that v(t, ·) is determined by its 'initial data' at any previous time t₀, so

    v(t, ·) = S(t − t₀) v(t₀, ·),   (67.30)

where S(t) is the solution operator for Equation 67.1 for elapsed time t ≥ 0. This operator S(t) is a useful linear operator in a variety of settings, e.g., L²(Ω) or the space C(Ω̄) of continuous functions with the topology of uniform convergence. A comparison with Equation 67.25 shows that

    S(t) : Σ_k c_k e_k ↦ Σ_k c_k e^{−λ_k t} e_k.   (67.31)

¹¹If a function φ ≥ 0 satisfies φ(t) ≤ C + M ∫₀ᵗ φ(s) ds for 0 ≤ t ≤ T, then it satisfies φ(t) ≤ Ce^{Mt} there.
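The eigenfunction representation of Equation 67.25 is directly computable. The following is a minimal sketch, not from the chapter, for the one-dimensional Dirichlet case on (0, 1), where the eigenpairs are λ_k = (kπ)² and e_k(x) = √2 sin(kπx); the initial data and truncation level are illustrative. It also exhibits the exponential decay dominated by the principal eigenvalue π².

```python
import numpy as np

# Truncated version of Equation 67.25 for v_t = v_xx on (0,1), v = 0 at x = 0,1.
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
v0 = x * (1.0 - x)                      # illustrative initial data

K = 50                                  # truncation level
ks = np.arange(1, K + 1)
lam = (ks * np.pi) ** 2                                 # eigenvalues (k*pi)^2
e = np.sqrt(2.0) * np.sin(np.outer(ks, np.pi * x))      # e_k(x) on the grid
c = (e * v0).sum(axis=1) * dx           # c_k = (v0, e_k); endpoints vanish here

def v(t):
    """v(t, x) = sum_k c_k * exp(-lam_k * t) * e_k(x)."""
    return (c * np.exp(-lam * t)) @ e

# for t > 0 the factors exp(-lam_k * t) kill the high modes almost at once
# ('smoothing'), and the solution decays essentially like exp(-pi^2 * t)
```

For moderate t, only the first mode survives to any visible extent, so the peak value of v shrinks by a factor of almost exactly exp(−π² Δt) over any interval Δt, which is the "transient dominated by the principal terms" noted in the text.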
From Equation 67.31, we have the fundamental 'semigroup property'

    S(s + t) = S(t) ∘ S(s)   for t, s ≥ 0.   (67.32)

This means that, if one initiates Equation 67.1 with any initial data v₀ at time 0 and so obtains v(s, ·) = S(s)v₀ after a time s and v(s + t, ·) = S(s + t)v₀ after a longer time interval of length s + t, then, as in (67.30), 'causality' gives v(s + t, ·) = S(t)v(s, ·). It is possible to verify that this operator function is strongly continuous at t = 0:

    S(t)v₀ → v₀ as t → 0 for each v₀,

and differentiable for t > 0. Equation 67.1 tells us that

    (d/dt)[S(t)v₀] = A[S(t)v₀],   (67.33)

where the Laplace operator A here includes specifying the appropriate boundary conditions; we refer to A in (67.33) as 'the infinitesimal generator of the semigroup S(·)'. In terms of S(·), a new solution representation for Equation 67.9 is

    v(t, ·) = S(t)v₀ + ∫₀ᵗ S(t − s)ψ(s, ·) ds.   (67.34)

S corresponds to the 'Fundamental Solution of the homogeneous equation' for ordinary differential equations and Equation 67.34 is the usual 'variation of parameters' solution for the inhomogeneous Equation 67.9; compare also with Equation 67.25. We may also treat the system with inhomogeneous boundary conditions by introducing the Green's operator G : g ↦ w, defined by solving

    Δw = 0 on Ω,   Bw = g on ∂Ω,   (67.35)

with Bw either w or ∂w/∂n, considering Equation 67.17 or 67.15. Using the fact that u = v − w then satisfies u_t = Δu + (ψ − w_t) with homogeneous boundary conditions, Equations 67.34 and 67.33 yield, after an integration by parts,

    v(t, ·) = S(t)v₀ + ∫₀ᵗ S(t − s)ψ(s, ·) ds − ∫₀ᵗ AS(t − s)Gg₀(s) ds.   (67.36)

The autonomy/causality above corresponds to the invariance of Equation 67.1 under time shifting and we now note the invariance under some other transformations. For this, we temporarily ignore considerations related to the domain boundary and take Ω to be the entire three-dimensional space ℝ³. In considering Equation 67.12 (with ψ = 0) with constant coefficients, we have ensured that we may shift solutions arbitrarily in space. Not quite as obvious mathematically is the physically obvious fact that we may rotate in space. In particular, we may consider solutions which spatially depend only on the distance from the origin so that v = v(t, r) with r = |x| = √(x² + y² + z²). Equation 67.12 with ψ = ψ(t, r) is then equivalent to

    v_t = D[v_rr + (2/r)v_r] + ψ,   (67.37)

which involves only a single spatial variable. For the two-dimensional setting x = (x, y) as in Section 67.3.2, this becomes

    v_t = D[v_rr + (1/r)v_r] + ψ.   (67.38)

More generally, for the d-dimensional case, Equations 67.37 and 67.38 can be written as

    v_t = D[v_rr + ((d − 1)/r)v_r] + ψ.   (67.39)

The apparent singularity of these equations as r → 0 is an effect of the use of polar coordinates. As in Section 67.3.3, we may seek a series representation like Equation 67.25 for solutions of Equation 67.39, with the role of the eigenfunction expansion Equation 67.22 now played by Bessel's equation; we then obtain an expansion in Bessel functions with the exponentially decaying time dependence e^{−λ_k t}, as earlier.

Finally, we may also make a combined scaling of both time and space. If, for some constant c, we set

    t̂ = Dc²t,   x̂ = cx,   (67.40)

then, for any solution v of Equation 67.12 with ψ = 0, the function v̂(t̂, x̂) = v(t, x) will satisfy Equation 67.1 in the new variables. This corresponds to the earlier comment that we may make D = 1 by appropriately choosing units. Closely related to the above is the observation that the function

    k(t, x) = (4πDt)^{−d/2} exp(−|x|²/4Dt)   (67.41)

satisfies Equation 67.39 for t > 0, and a simple computation shows¹² that

    ∫ k(t, x) dᵈx = 1   for each t > 0,   (67.42)

so that k(t, ·) becomes a δ-function as t → 0. Thus, k(t − s, x − y) is the impulse response function for an impulse at (s, y). Taking d = 3,

    v(t, x) = ∫_ℝ³ k(t, x − y) v₀(y) d³y   (67.43)

is a superposition of solutions (now by integration, rather than by summation) so linearity insures that v is itself a solution; also, v(t, ·) → v₀ as t → 0,

¹²k(t, ·) is a multivariate normal distribution (Gaussian) with standard deviation √(2Dt) → 0 as t → 0.
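The properties of the fundamental solution claimed in Equations 67.41–67.43 are easy to verify numerically. The following is a minimal one-dimensional sketch, not from the chapter; the grid, time, and 'box' initial data are illustrative choices.

```python
import numpy as np

# One-dimensional fundamental solution k(t,x) = exp(-x^2/4Dt)/sqrt(4*pi*D*t)
# (Equation 67.41 with d = 1) and the superposition formula (Equation 67.43).
D = 1.0

def kernel(t, x):
    return np.exp(-x**2 / (4.0 * D * t)) / np.sqrt(4.0 * np.pi * D * t)

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

mass = kernel(0.5, x).sum() * dx          # Equation 67.42: total mass is 1

v0 = np.where(np.abs(x) < 1.0, 1.0, 0.0)  # a 'box' of material
# superposition by integration: v(t,x) = integral of k(t, x-y) v0(y) dy
v = np.array([np.sum(kernel(0.5, xi - x) * v0) * dx for xi in x])
# the box spreads and smooths, its peak drops below 1, and (since the domain
# is effectively all of R here) the total material is preserved
```

Note the peak of v stays strictly below the initial maximum, as the Maximum Principle requires, while the integral of v matches that of v₀: smoothing conserves the total, it only redistributes it.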
where the specific interpretation of this convergence depends on how smooth v₀ is assumed to be. Thus, Equation 67.43 provides another solution representation although, as noted, it ignores the effect of the boundary for a physical region which is not all of ℝ³. For practical purposes, following the ideas of Section 67.3.2, Equation 67.43 will approximate the solution as long as √(2Dt) is quite small¹³ compared to the distance from the point x to the boundary of the region.
67.4 Some Control-Theoretic Problems

In this section we provide three elementary examples to see how the considerations above apply to some control-theoretic questions. The first relates to a simplified version of a practical heat transfer problem and is treated with the use of rough approximations, to see how such heuristic treatment can be used for practical results. The second describes the problem of control to a specified terminal state, which is a standard problem in the case of ordinary differential equations but which involves some new considerations in this distributed parameter setting. The final example is a 'coefficient identification' problem: using boundary interaction (input/output) to determine the function q = q(x) in an equation of the form v_t = v_xx − qv, generalizing Equation 67.21.
67.4.1 A Simple Heat Transfer Problem

Consider a slab of thickness a and diffusion coefficient D heated at a constant rate ψ. On one side the slab is insulated (v_x = 0) and on the other it is in contact with a stream of coolant (diffusion coefficient D′) moving in an adjacent duct with constant flow rate F in the y-direction. Thus, the slab occupies {(x, y) : 0 < x < a, 0 < y < L} and the duct occupies {(x, y) : a < x < ā, 0 < y < L} with a, ā ≪ L and no dependence on z. If the coolant enters the duct at y = 0 with input temperature u₀, how hot will the slab become? For this purpose, assume a steady state, that the coolant flow is turbulent enough to insure perfect mixing (and so constant temperature) across the duct, and that, to a first approximation, the longitudinal transfer of heat is entirely by the coolant flow so conduction in the slab is
¹³When x is too close to the boundary for this to work well, it is plausible to think of ∂Ω as 'almost flat' on the relevant spatial scale and then to extend v₀ by reflection across it, as an odd function for Equation 67.17 with g₀ = 0 or as an even function for Equation 67.15 with g₁ = 0. For (67.17) with, say, g₀ = g₀(t) locally, there would then be a further correction by adding a term involving k₁, where k₁ = k₁(t, x̄) is as in Equation 67.41 for d = 1 and x̄ is the distance from x to the boundary; compare (67.41). There are also comparable correction formulas for more complicated settings.
only in the transverse direction (0 < x < a). The source term in the slab gives heat production aψ per unit distance in y and this must be carried off by the coolant stream for a steady state. We might, as noted earlier, shift to material coordinates in the stream to obtain an equation there. More simply, we just observe that, when the coolant has reached the point y, it has absorbed the amount aψy of heat per second and, for a flow rate F (choosing units so that ρc in Equation 67.7 is 1), the coolant temperature increases from u₀ to [u₀ + aψy/F] =: u(y).

Now consider the transverse conduction in the slab, where v_t = Dv_xx + ψ with v_t = 0 for steady state. As v_x = 0 at the outer boundary x = 0, the solution has the form v = v* − (ψ/2D)x², where v* is exactly what we wish to determine. If we assume a simple temperature match of slab to coolant (v = u(y) at x = a), then v*(y) − (ψ/2D)a² = v(a, y) = u(y) = u₀ + aψy/F, so that

    v*(y) = u₀ + aψy/F + (ψ/2D)a²   and   v(x, y) = u₀ + aψy/F + (ψ/2D)(a² − x²).   (67.46)

A slight correction of this derivation is worth noting: for the coolant flow we expect a boundary layer (say, of thickness δ) of 'stagnant' coolant at the duct wall and within this layer u_x ≈ constant = −[v(a) − u|_flow]/δ. Also, D′u_x = flux = aψ by Fourier's Law so, instead of matching v(a) = u(y), we have v(a) = u(y) + (δ/D′)aψ, which increases v*, v by (δ/D′)aψ as a correction to Equation 67.46; this correction notes the reduction of heat transfer obtained by replacing the boundary condition Equation 67.17 by Equation 67.16 with λ = D′/δ. Much more complicated corrections are needed to consider conduction within the duct, especially with a velocity profile other than the plug flow assumed here.

In this derivation we neglected longitudinal conduction in the slab, omitting the v_yy term in Equation 67.12. Because Equation 67.46 gives v_yy = 0, this is consistent with the equation. It is, however, inconsistent with reasonable boundary conditions at the ends of the slab (y = 0, L), with 'end effects' as well as some evening out of v*. Although this was derived in steady state, we could use Equation 67.46 for an optimal control problem (especially if ψ varies, but slowly) with the flow rate F as control.
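The closed-form estimate of Equation 67.46 (with the boundary-layer correction) is trivial to evaluate. The following is a minimal sketch; all numerical values (u₀, ψ, a, D, F, L, δ, D′) are made-up illustrative quantities, not from the text.

```python
# Steady-state slab temperature per Equation 67.46 plus the stagnant-layer
# correction (delta / D') * a * psi.  All parameter values are illustrative.
def slab_temperature(y, x=0.0, u0=20.0, psi=5.0, a=0.5, D=0.1, F=2.0,
                     delta=0.0, Dprime=1.0):
    """v(x, y) = u0 + a*psi*y/F + (psi/2D)(a^2 - x^2) [+ (delta/D')*a*psi]."""
    v = u0 + a * psi * y / F + (psi / (2.0 * D)) * (a**2 - x**2)
    return v + (delta / Dprime) * a * psi   # boundary-layer correction

# hottest point: the insulated face x = 0 at the coolant outlet y = L
L = 10.0
v_star_outlet = slab_temperature(L)
```

As the formula makes explicit, the hottest point is on the insulated face at the outlet end, and increasing the flow rate F (the control variable suggested in the text) lowers the temperature everywhere linearly in y.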
67.4.2 Exact Control

Consider the problem of using ψ as control to reach a specified 'target state' w = w(x) at time T. Basing the discussion on the representation¹⁴ Equation 67.28 permits treating each component independently: the condition that v(T, ·) = w becomes the
¹⁴For definiteness, one may think of the one-dimensional heat Equation 67.21 with homogeneous Dirichlet boundary conditions at
-
sequence of 'moment equations:
choice of integration method; we have noted that Ilek Il [,,I $'I2 because the differential operator A is already of order 2. This means that one might have to refine the mesh progressively to get such a uniform relative error for large k.
for each k. This does not determine the control uniquely, when one exists, so we select by optimality, minimizing the norm ) Q = (0, T ) x S2. This is equivalent of ljr in L ~ ( Q with to requiring that ljrk(t) is a constant times eAk(T-'). Noting that le-Ak(T-s)~2 d s = [ l - e-2"T] /2kk, the conditions (67.48)yield the formula
67.4.3 System Identification Finally, we consider a one-dimensional example governed by an equation of the form15
1;
(67.49) This formula converges if (and only if) the specified target state w is attainable by some control in L ~ ( & ) . Let us now see what happens when we attempt to use Equation 67.49. One must truncate the expansion (say, at k = K) and then find each coefficient a k = wk - eke-'kT with an error bound ~k by using an algorithm of numerical integration on 52 to compute the inner products (ek, w ) . [For simplicity we assume that we know the relevant eigenvalues and eigenfunctions,as for Equation 67.2 1 and a variety of higher-dimensional geometries.] Denoting the optimal control by \V and the approximation obtained by Q K , we can bound the total error by
but with D and the specific coefficient function q ( . ) unknown or known with inadequate accuracy. Assume that u = 0 at x = 1, but that interaction is possible at the end x = 0, where one can manipulate the temperature and observe the resulting heat f l u ; for simplicity, we assume that vo = 0. Thus, we consider the inputloutput pairing: g H f , defined through Equation 67.51 with
By linearity, causality, and autonomy of the Equation 67.51, this pairing takes the convolution form
f (t) =
b(t
- s ) g ( s )d s
where a ( . ) is an impulse-response function. Much as we obtained Equation 67.25 and 67.36, For an attainable target w the second sum is small for large K, corresponding to convergence of Equation 67.49. A fixed error bound lsk I 5 E for the coefficient computation makes the first ~ Equation 67.27 which becomes sum ofthe order of K'+ ( 2 / d )by large as K increases. To make the total error Equation 67.50 small requires picking K and then choosing E dependent on this . last choice, or using a relative error condition: I E I~ 5 ~ l a k lThis seems quite plausible for numerical integration with floatingpoint arithmetic, but one problem remains. Neglecting vo, a plausible form of the error estimate for a method of numerical integration might be
where h characterizes a mesh size and the subscript on ‖·‖_[ν] indicates derivatives of order up to ν, with ν depending on the
and

with a_k := −λ_k e_k′(0) (z, e_k), where

D z″ − q z = 0,  z(0) = 1,  z(1) = 0,    (67.54)

−D e_k″ + q e_k = λ_k e_k,  e_k(0) = 0 = e_k(1),
noting that z and {(λ_k, e_k)} are unknown because q is unknown. Viewing Equation 67.53 as an integral equation for a(·), Equation 67.53 determines a for appropriate choices of the input g(·). The simplest would be if g is a δ-function (impulse), so that the observed f would be a; otherwise we must first solve a Volterra equation of the first kind, already an ill-posed problem. The function a(·) contains all the information about the unknown q, and large differences in q may produce only very small perturbations of a. Thus, this identification problem cannot be 'well-posed', regardless of g(·).
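The ill-posedness of the first-kind Volterra step is easy to see numerically. The sketch below uses a hypothetical stand-in kernel (not the actual a(·) of the text, which depends on the unknown q) chosen to vanish to infinite order at t = 0, as a boundary heat-flux response does; it discretizes f(t) = ∫_0^t a(t − s)g(s) ds by a rectangle rule and shows the condition number of the resulting triangular system exploding as the mesh is refined:

```python
import numpy as np

def heat_impulse_kernel(t):
    # Prototype impulse-response a(t) ~ t^(-3/2) exp(-1/(4t)): smooth and
    # vanishing to infinite order at t = 0 (an assumed stand-in kernel).
    return t**-1.5 * np.exp(-1.0 / (4.0 * t))

def volterra_first_kind_matrix(n, T=1.0):
    # Rectangle-rule discretization of f(t) = int_0^t a(t - s) g(s) ds
    # as a lower-triangular Toeplitz system f = K g.
    h = T / n
    t = h * np.arange(1, n + 1)
    col = h * heat_impulse_kernel(t)        # first column: a(h), a(2h), ...
    K = np.zeros((n, n))
    for i in range(n):
        K[i, : i + 1] = col[: i + 1][::-1]
    return K

for n in (20, 100):
    K = volterra_first_kind_matrix(n)
    print(n, np.linalg.cond(K))             # condition number grows with n
```

Because the kernel is tiny near t = 0, the diagonal of the triangular system collapses under refinement, so any deconvolution of g from a noisy f amplifies the noise without bound; this is the discrete face of the ill-posedness noted above.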
x = 0, 1. The eigenvalues and normalized eigenfunctions are then
so the expansions, starting at k = 1 for convenience, are standard Fourier sine series.
15 For example, such an equation might arise for a rod with heat loss to the environment, appearing through a boundary condition at the surface of the rod as in Equation 67.16, with g = constant and the coefficient varying along the rod, reduced to a simplified one-dimensional form.
None of the coefficients a_k will be 0 so, given a(·), Equation 67.55 uniquely determines the eigenvalues {λ_k} appearing as exponents. Note that λ_k ≈ Dπ²k², so D = lim_k λ_k/π²k² is then determined. We can then show (by an argument involving analytic continuation, Fourier transforms, and properties of the corresponding wave equation) that a(·) uniquely determines q(·), as desired. The discussion above does not suggest how to compute D, q(·) from the observations. Typically one seeks nodal values for a discretization of q. This can be done, for example, by history matching, an approach often used for such identification problems, in which one solves the direct problem with a guessed q to obtain a resulting 'f = f(q)' and proceeds to find the q which makes this best match the observed f. With some further a priori information about the function q, say, a known bound on the derivative q′, the uniqueness result, although nonconstructive, ensures convergence of these computations to the correct result as the discretization is refined. The auxiliary a priori information converts this to a well-posed problem, although badly conditioned, so that the practical difficulties do not entirely disappear.
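For the special case of a constant coefficient q, the determination of D and q from the decay exponents is explicit, since the eigenproblem −D e″ + q e = λ e, e(0) = e(1) = 0 then gives λ_k = D k²π² + q, which is linear in the unknowns. A minimal numpy sketch with synthetic data (illustrative only, not the history-matching iteration described above):

```python
import numpy as np

# Hypothetical "observed" decay exponents lam_k, as would be extracted from
# the impulse response a(t); for constant q the Dirichlet eigenvalues of
# -D e'' + q e = lam e on (0, 1) are lam_k = D (k pi)^2 + q.
D_true, q_true = 0.7, 2.5
k = np.arange(1, 6)
lam = D_true * (k * np.pi) ** 2 + q_true

# Least-squares fit of (D, q): the model lam_k = D (k pi)^2 + q is linear
# in the unknowns, so a two-column design matrix suffices.
A = np.column_stack([(k * np.pi) ** 2, np.ones_like(k, dtype=float)])
(D_est, q_est), *_ = np.linalg.lstsq(A, lam, rcond=None)
print(D_est, q_est)   # recovers 0.7 and 2.5 from exact data
```

With noisy exponents the same least-squares fit still applies; for a spatially varying q one is back to the nonlinear history-matching problem of the text.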
We begin by computing the relevant adjoint problem, taking the boundary control problem as
Bu = g on Σ  and  u = u₀ at t = 0,    (67.55)
in which the control function g is the data for the boundary conditions, defined on Σ = (0, T) × ∂Ω. As for Equation 67.35, the operator B will correspond to either Equation 67.17 or 67.15; in specifying B, g(·) must be 0 outside some fixed 'patch', i.e., a relatively open subset U ⊂ Σ, viewed as an 'accessible' portion of Σ, and referred to as a problem of 'patch control'. Note that
where Equation 67.36 gives
u_T = S(T)u₀ − ∫_0^T A S(T − s) G g(s) ds.
For the adjoint problem, we consider
67.5 More Advanced System Theory

In this section we consider the system-theoretic results available for the heat equation, especially regarding observability and controllability. We emphasize how, although the relevant questions are parallel to those in 'lumped parameter' control theory, new technical difficulties can occur because of the infinite-dimensional state space; this means that this section depends more heavily on the results of Functional Analysis16 and special mathematics for the partial differential equations involved. One new consideration is that the geometry of the region Ω is relevant here. We will concentrate primarily on problems in which input/output interaction (for control and for observation) is restricted to the boundary, partly because this is physically reasonable and partly because only for a system governed by a partial differential equation can one consider 'control via the boundary conditions'.
where B̂ gives the 'complementary' boundary data: B̂v := ∂v/∂n if B corresponds to Equation 67.17 and B̂v := v|_∂Ω if B corresponds to Equation 67.15. Using the Divergence Theorem,
∫_Ω u_T v_T − ∫_Ω u₀ v₀ = ∫_Q (uv)_t
where we write v_T, v₀ for v(T, ·), v(0, ·), respectively, and set Q = (0, T) × Ω. Thus, with subscripts indicating the domain for the inner product of L²(·), we have the identity
67.5.1 The Duality of Observability/Controllability

For the finite-dimensional case, controllability for a problem and observability for the adjoint problem are dual. One can control ẋ = Ax + Bg (g(·) = control) from one arbitrary state to another if, and only if, only the trivial solution of the adjoint equation −ẏ = A*y can give [observation] = B*y(·) ≡ 0. A similar result holds for the heat equation with boundary I/O, but we must be careful in our statement.
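In the finite-dimensional case this duality is just a transpose relation: the Kalman controllability matrix of (A, B) is the transpose of the observability matrix of (A*, B*), so the two rank tests always agree. A quick numpy check (illustrative, with a randomly generated system):

```python
import numpy as np

def ctrb(A, B):
    # Kalman controllability matrix [B, AB, ..., A^(n-1) B].
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def obsv(A, C):
    # Kalman observability matrix [C; CA; ...; C A^(n-1)].
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 1))
rc = np.linalg.matrix_rank(ctrb(A, B))
ro = np.linalg.matrix_rank(obsv(A.T, B.T))
print(rc, ro)   # always equal: (A, B) controllable iff (A*, B*) observable
```

The subtleties discussed next for the heat equation arise precisely because this clean rank/transpose argument has no direct infinite-dimensional analogue.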
16 [1] is a general reference for Functional Analysis, directed toward distributed parameter system theory.
from which we wish to draw conclusions. First, consider the reachable set R = {u_T : g any element of L²(U); u₀ = 0}, which is the range of the operator L : L²(U) → L²(Ω). If this were not dense, i.e., if we did not have R̄ = L²(Ω), then (by the Hahn-Banach Theorem) there would be some nonzero v*_T orthogonal to all u_T (∈ R̄), so Equation 67.58 would give (g, η*)_U = 0 for all g, whence η* = 0, violating detectability (i.e., that η* = 0 only if v* = 0). Conversely, a violation of detectability would give a nonzero v* with v*_T ≠ 0 orthogonal to R̄. Thus, detectability is equivalent to approximate controllability. This last means that one could control arbitrarily closely to any target state, even if it cannot be reached
THE CONTROL HANDBOOK

exactly. This is a meaningless distinction for finite-dimensional linear systems although significant for the heat equation, because solutions of Equation 67.1 are analytic in the interior of Ω so that only very special targets can be exactly reachable. Detectability means that the map v_T ↦ v ↦ η is 1-1 so, inverting, η ↦ v_T ↦ v₀ is well-defined: one can predict (note the time reversal in Equation 67.57) v₀ from observation of η on U. In the finite-dimensional case, any linear map such as Λ : η ↦ v₀ would necessarily be continuous (bounded), but here this is not automatically the case; note that the natural domain of Λ is the range of v_T ↦ η and, if one had continuity, this would extend to the closure M = M̄_U ⊂ L²(U). For bounded Λ : M → L²(Ω), there is a bounded adjoint operator Λ* : L²(Ω) → M and, if we were to set g = Λ*u₀ in Equation 67.55,
with the values of ρ_k explicitly computable as an infinite product
(u_T, v_T)_Ω = (u₀, v₀)_Ω − (Λ*u₀, η)_U = 0 for every v_T ∈ L²(Ω).
This would imply u_T = 0, so that g = Λ*u₀ is a nullcontrol from u₀. Indeed, it turns out that this g is the optimal nullcontrol in minimizing the L²(U)-norm. Conversely, if there is some nullcontrol g for each u₀, there will be a minimum-norm nullcontrol ĝ, and the linear map C : u₀ ↦ ĝ is then continuous by the Closed-Graph Theorem; further, its adjoint Λ = C* is the observation operator η ↦ v_T. Thus, bounded observability for the adjoint problem is equivalent to nullcontrollability for Equation 67.55 which, from Equation 67.56, is equivalent to the condition that the range of L contains the range of S(T). Suppose we have nullcontrollability for arbitrarily small T > 0, always taking U = (0, T) × U for some fixed patch U ⊂ ∂Ω. A simple argument shows that the reachable set R must then be entirely independent of T and of the initial state u₀. No satisfactory characterization of R is available, although there are various known sufficient conditions for some w ∈ R.
67.5.2 The One-Dimensional Case

From the discussion above, it suffices to prove bounded observability for the one-dimensional heat equation to obtain nullcontrollability as well. This was not realized when these results were first proved, so that each was originally proved independently. We will consider the observability problem with Ω = (0, 1), U = (0, T) × {0} and
for which we know Equation 67.47. From Equation 67.25,
and v(T, ·)
[For convenience, we have re-reversed time compared with Equation 67.57 and, from Equation 67.25, c_k = √2 ∫_0^1 sin(kπx) v₀(x) dx, although we won't need any information about v₀.] The form of Equation 67.60 is a Dirichlet series; this becomes a power series in ζ = e^{−π²t} with only the k² powers appearing: e^{−λ_k t} = e^{−k²π²t} = ζ^{k²}. The theory of such series centers on the Müntz-Szász Theorem (extending the Weierstrass Approximation Theorem) which, for our purposes, shows that only special functions can have L²-convergent expansions (Equation 67.60) when Σ 1/λ_k < ∞. An estimate for Equation 67.60 takes the form |c_k| ≤ ρ_k ‖η‖_{L²(0,T)}    (67.62)
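The Müntz condition is immediate for the heat-equation exponents λ_k = k²π², since Σ_k 1/λ_k = (1/π²) Σ_k 1/k² = (1/π²)(π²/6) = 1/6 < ∞. A two-line numerical confirmation of this convergence:

```python
import numpy as np

# Muntz condition for the exponents lam_k = (k pi)^2 of the Dirichlet
# series: sum_k 1/lam_k = (1/pi^2) * (pi^2/6) = 1/6, a finite value, so
# the Muntz-Szasz theorem applies to expansions in exp(-lam_k t).
k = np.arange(1, 200001)
partial = np.cumsum(1.0 / (k * np.pi) ** 2)
print(partial[-1])   # approaches 1/6 = 0.1666...
```

Finiteness of this sum is exactly what makes the family {e^{−λ_k t}} so far from dense, and hence makes the coefficient estimates (67.62) possible.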
= Σ_k c_k e^{−λ_k T} e_k(·).    (67.61)
(convergent when Σ_k 1/λ_k is convergent); 1/ρ_k is the distance in L²(0, ∞) from exp[−λ_k t] to span{exp[−λ_i t] : i ≠ k}. Schwartz has further shown that, for functions given as in Equation 67.60, ‖η‖_{L²(0,∞)} ≤ C_T ‖η‖_{L²(0,T)}.    (67.64)
Combining these estimates
The sequence ρ_k increases moderately rapidly as k → ∞, but the exponentials exp[−k²π²T] decay even more rapidly, so that the sum giving C′_T is always convergent and Equation 67.65 provides a bound (‖Λ‖ ≤ C_T C′_T < ∞) for the observation operator17 Λ : M = M_{[0,T]} → L²(Ω) : η ↦ v(T, ·), even when T > 0 is arbitrarily small. A different way of looking at this is that the linear functional M → R : η ↦ c_k must be given by a function g_k ∈ L²(0, T) so that
If g_k(·) is defined on R (as 0 off [0, T]), we may take the Fourier transform and note that Equation 67.66 asserts the 'interpolation conditions'
so that it suffices to construct functions ĝ_k satisfying Equation 67.67, together with the properties required by the Paley-Wiener Theorem, to get the inverse Fourier transform in L²(0, T)
17 A recent estimate by Borwein and Erdélyi makes it possible to obtain comparable results when U has the form U = E × {0} with E any subset of [0, T] having positive measure; one consequence of this is a bang-bang principle for time-optimal constrained boundary control.
with ‖g_k‖ = ρ_k. This approach leads to the sharp asymptotic estimate
ln ‖Λ‖ = O(1/T) as T → 0,    (67.68)
showing how much more difficult18 observability or controllability becomes for small T, even though one does have these for every T > 0. A variant on this considers the interior point observation η(t) := v(t, a). The observability properties now depend on number-theoretic properties of 0 < a < 1; e.g., for rational a = m/n one gets no information at all about c_k when k is a multiple of n, because then sin kπa = 0. It can be shown that one has bounded observability (with arbitrarily small T > 0) for a in a set of full measure, whereas the complementary set for which this fails is uncountable in each subinterval.19 Finally, we note that an identical treatment for all of the material of this subsection would work more generally for Equation 67.21 and with other boundary conditions.
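The loss of information at rational observation points is elementary to check numerically: the mode weights sin(kπa) vanish exactly at every multiple of the denominator. A small sketch:

```python
import numpy as np

# Interior observation at x = a gives eta(t) = sum_k c_k exp(-lam_k t) sin(k pi a).
# For rational a = m/n, sin(k pi a) = 0 whenever k is a multiple of n,
# so those modes are completely invisible to the observation.
a = 2.0 / 5.0   # m = 2, n = 5
for k in range(1, 11):
    print(k, np.sin(k * np.pi * a))   # vanishes (to rounding) at k = 5, 10
```

For irrational a no weight vanishes, but the weights can come arbitrarily close to zero; how fast they do so is the number-theoretic issue behind the full-measure statement above.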
call 'g'; using this in Equation 67.55 necessarily (by uniqueness) gives the restriction to Q of ū, which vanishes at T. Thus, this g is a nullcontrol for Equation 67.55. Once we have a nullcontrol g for each u₀, it follows, as noted earlier, that the nullcontrol operator C for this setting is well-defined and continuous; by duality, one also has bounded observability. It is unnecessary that U be all of Σ here: this construction works if the 'unused' part of Σ is contained in Σ̂ \ U. At this point we note a deep connection between the theories for the wave and heat equations:
observability for the wave equation w_tt = Δw for some [Ω, U] and some T* > 0 implies observability for the heat equation u_t = Δu for arbitrary T > 0 in the same geometric setting [Ω, U].
67.5.3 Higher Dimensional Geometries
We now have the condition ∫_U g_k z_i exp[−λ_i t] = δ_{i,k} (with z_k := B̂e_k), corresponding to Equation 67.66 for this higher dimensional problem. When U has the form (0, T) × U for a patch U ⊂ ∂Ω, one can take the Fourier transform (in t only) and get, as with Equation 67.67,
For higher dimensional cases, we can obtain observability, for any T > 0, with U = (0, T) × Û for a 'cylindrical' region Ω := (0, 1) × Ω̃, by using the method of 'separation of variables' to reduce this to a sequence of independent one-dimensional problems: noting that
Russell observed [5] that, if h_k were the function for the boundary observability problem for the wave equation w_tt = Δw on Ω corresponding to g_k, then
(where δ* = 1, 0 according as the occurrences on the left do or do not match), so that the spatial dependence is identical and the time dependence closely related. This suggests constructing ĝ_k as
we get
with φ_ℓ(t) := (v(t, 0, ·), ê_ℓ),
where Λ₁ is the observability operator for Equation 67.59. It is easy to verify that this gives ‖Λ‖ ≤ ‖Λ₁‖ < ∞, and we have nullcontrollability by duality. For more general regions, when U is all of Σ := (0, T) × ∂Ω, we may shift to the context of nullcontrollability for Equation 67.55 and rely on a simple geometric observation. Suppose Ω ⊂ Ω̂ ⊂ R^d, where Ω̂ is some conveniently chosen region (e.g., a cylinder, as above) for which we already know that we have nullcontrollability for Equation 67.55, i.e., with Ω, U replaced by Q̂ = (0, T) × Ω̂ and Û = Σ̂ = (0, T) × ∂Ω̂, respectively. Given any initial data u₀ ∈ L²(Ω) for Equation 67.55, we extend it as 0 to all of Ω̂ and, as has been assumed, let û be a (controlled) solution of Equation 67.55, vanishing on all of Ω̂ at time T. The operator B acting (at Σ) on û will have some value, which we now
18 This may be compared to the corresponding estimate ‖Λ‖ = O(T^{−(κ+1/2)}) for the finite-dimensional setting, with κ the minimal index giving the rank condition there.
19 Because the 'bad set' has measure 0, one might guess that observation using local integral averages (as a 'generalized thermometer') should always work but, somewhat surprisingly, this is false.
(noting that the rhs is an even analytic function of σ and so is an analytic function of σ²), so that Equation 67.70 implies the desired Equation 67.69. This does not quite work, since it would not give g_k ∈ L²(U), but it can be modified, multiplying by a suitable function R = R(τ) on the right, to get g_k as an inverse Fourier transform. We note that it is the construction of the modifier R(·) which gives the asymptotics of Equation 67.68; it is also this multiplication which makes the implication above irreversible. The relation used here is also usable to obtain uniqueness results for a higher dimensional version of the identification problem of Section 67.4.3 from known corresponding results for the wave equation. One gets the observability for the wave equation (hence, for the heat equation) for suitable20 [Ω, U] by an argument [5], [7] based on Scattering Theory (existence of a uniform decay rate for the portion of the total energy remaining in Ω when this is embedded in a larger region Ω̂). In particular, one can take Ω̂
20 The finite propagation speed associated with the wave equation implies the existence of a minimum time T*, depending on Ω, for observability/nullcontrollability.
as the complement of a 'star-shaped' obstacle at which Bw = 0, provided ∂Ω \ U ⊂ ∂Ω̂ = ∂(obstacle). On the other hand, there is a significant geometric restriction on [Ω, U] for which one can observe or control the wave equation: there can be no 'trapped waves' (which continually reflect off the unused boundary ∂Ω \ U without ever intersecting U), which, roughly, requires that ∂Ω \ U should be 'visible' from some single point outside Ω̄. This argument then gives observability/controllability for the heat equation for a variety of geometric settings, but there is a price: the geometric restriction noted above. This is quite reasonable for the wave equation but seems irrelevant to the behavior of the heat equation, and it was long conjectured that accessibility of an arbitrary patch U ⊂ ∂Ω would suffice for observation or control of Equation 67.1. Quite recently, Lebeau and Robbiano have settled this conjecture by demonstrating, using an argument based on new Carleman-type estimates for a related elliptic equation, patch nullcontrollability using an arbitrary patch21 U for an arbitrary Ω.
References

[1] Curtain, R.F. and Pritchard, A.J., Functional Analysis in Modern Applied Mathematics, Academic, New York, 1977.
[2] Krabs, W., On Moment Theory and Controllability of One-Dimensional Vibrating Systems and Heating Processes, (Lecture Notes in Control and Inf. Sci. #173), Springer, Berlin, 1992.
[3] Lasiecka, I. and Triggiani, R., Differential and Algebraic Riccati Equations with Application to Boundary/Point Control Problems: Continuous Theory and Approximation Theory, (Lecture Notes in Control and Inf. Sci. #164), Springer, Berlin, 1991.
[4] Lions, J.-L., Exact controllability, stabilization, and perturbations, SIAM Review, 30, 1-68, 1988.
[5] Russell, D.L., A unified boundary controllability theory for hyperbolic and parabolic partial differential equations, Stud. Appl. Math., LII, 189-211, 1973.
[6] Seidman, T.I., Boundary observation and control for the heat equation, in Calculus of Variations and Control Theory, Russell, D.L., Ed., Academic, New York, 1976, pp. 321-351.
[7] Seidman, T.I., Exact boundary controllability for some evolution equations, SIAM J. Control Opt., 16, 979-999, 1978.
21 They have given this result for the case of boundary controllability described here and also for the case of distributed control (Equation 67.3 with homogeneous boundary conditions and control function ψ vanishing outside a patch U open in Q = (0, T) × Ω).
Observability of Linear Distributed-Parameter Systems

68.1 Comparison with the Finite Dimensional Case ... 1169
68.2 General Formulation in the Distributed-Parameter Case ... 1170
68.3 Observation of a Heat Conduction Process ... 1171
David L. Russell, Department of Mathematics, Virginia Tech, Blacksburg, VA
68.4 Observability Theory for Elastic Beams ... 1173
References ... 1175
68.1 Comparison with the Finite Dimensional Case
that interval. This observability property, in turn, is equivalent to the positive definiteness of the matrix integral
The general question of observability in the finite dimensional setting concerns a system
ẋ = f(x, ...),  x ∈ Rⁿ,
where the ellipsis indicates that the system might involve additional parameters, controls, disturbances, etc., and an output (measurement, observation) function
Observability theory is concerned, first of all, with the question of distinguishability, i.e., whether distinct system trajectories x(t), x̂(t) necessarily give rise to distinct outputs y(t), ŷ(t) over a specified time interval. The observability question, properly speaking, concerns actual identification of the trajectory x(t), equivalently, the initial state x₀, from the available observations on the system, with an ultimate view to the possibility of reconstruction of the trajectories, or perhaps the initial or current states, which, in application, are generally not directly available from the outputs, typically consisting of a fairly small set of recorded instrument readings on the system. In the linear case, observability and distinguishability are equivalent, and both questions can be treated in the context of a vector/matrix system
together with an output vector y related to x via
and it is a standard result [1] that observability obtains on an interval [0, T], T > 0, just in case y ≡ 0 implies that x ≡ 0 on

© 1996 by CRC Press, Inc.
where Φ(t, s) is the fundamental solution matrix of the system with Φ(s, s) = I. The initial state x₀ can then be reconstructed from the observation y(t) by means of the reconstruction operator (cf. [18])
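The Gramian-based reconstruction is easy to carry out numerically in the finite-dimensional case. The sketch below uses a hypothetical two-state constant-coefficient system (not an example from the text) and recovers x₀ from the scalar output y(t) = Cx(t) by forming the observability Gramian M = ∫₀ᵀ Φ(t)ᵀCᵀCΦ(t) dt and solving M x₀ = ∫₀ᵀ Φ(t)ᵀCᵀ y(t) dt:

```python
import numpy as np

# Hypothetical stable system: xdot = A x, y = C x, with unknown x0.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])    # eigenvalues -1, -2
C = np.array([[1.0, 0.0]])
x0 = np.array([1.0, -0.5])                  # "true" initial state

evals, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)
def Phi(t):
    # e^{A t} via diagonalization (A has distinct real eigenvalues here).
    return (V * np.exp(evals * t)) @ Vinv

T, n = 2.0, 2000
ts = np.linspace(0.0, T, n)
w = np.full(n, T / (n - 1)); w[0] *= 0.5; w[-1] *= 0.5   # trapezoid weights
M = np.zeros((2, 2))
b = np.zeros(2)
for t, wt in zip(ts, w):
    P = Phi(t)
    y = C @ P @ x0                          # the (simulated) observation y(t)
    M += wt * P.T @ C.T @ C @ P             # observability Gramian
    b += wt * P.T @ C.T @ y
x0_est = np.linalg.solve(M, b)
print(x0_est)                               # recovers [1.0, -0.5]
```

Positive definiteness of M is exactly the observability condition of the text; if C annihilated an eigenvector of A, M would be singular and the solve would fail.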
In the constant coefficient case A(t) ≡ A, observability (on any interval of positive length) is equivalent to the algebraic condition that no eigenvector of A should lie in the null space of C; there are many other equivalent formulations. In the (infinite dimensional) distributed-parameter setting, it is not possible to provide any comparably concise description of the general system to be studied or universal definition of the terms involved, but we will make an attempt in this direction later in the chapter. To set the stage for that, let us begin with a very simple example that illustrates many of the complicating factors involved. The term distributed-parameter system indicates a system whose state parameters are distributed over a spatial region rather than constituting a discrete set of dependent variables. For our example, we take the spatial region to be the interval [0, 1] and represent the state by a function w(x, t), x ∈ [0, 1], t ∈ [0, ∞). Let us think of w(x, t) as representing some physical property (temperature, concentration of a dissolved chemical, etc.) in a fluid moving from left to right through a conduit whose physical extent corresponds to 0 ≤ x ≤ 1 with a uniform unit velocity. If we assume no diffusion process is involved, it is straightforward to see that w(x, t) will be, in some sense that we do not elaborate
on at the moment, a solution of the first-order partial differential equation (PDE)
to which we need to adjoin a boundary condition
and an initial condition or initial state given, without loss of generality, at t = 0,
w(x, 0) = w₀(x), x ∈ [0, 1].    (68.7)
It can be shown (cf. [4], e.g.) that with appropriate regularity assumptions on v(t) and w₀(x) there is a unique solution w(x, t) of Equations 68.5, 68.6, and 68.7 for x ∈ [0, 1], t ∈ [0, ∞). We will consider two different types of measurement. The first is a point measurement at a given location x₀,
while the second is a distributed measurement, which we will suppose to have the form
for some piecewise continuous function c(x), defined and not identically equal to zero on [0, 1]. Let us suppose that we have an initial state w₀(x) defined on [0, 1], while the boundary input (Equation 68.6) is identically equal to 0. Let us examine the simplest form of observability, distinguishability, for the resulting system. In this case, the solution takes the form, for t ≥ 0,
For a point observation at xo, the output obtained is clearly
We see that if xo < 1, the data segment consisting of the values
is lost from the data. Consequently, we do not have distinguishability in this case because initial states w₀(x), ŵ₀(x) differing only on the indicated interval cannot be distinguished on the basis of the observation y(t) on any interval [0, T]. On the other hand, for x₀ = 1 the initial state is simply rewritten, in reversed order, in the values of y(t), 0 ≤ t ≤ 1 and, thus, we have complete knowledge of the initial state w₀(x) and, hence, of the solution w(x, t) determined by that initial state, provided the length of the observation interval is at least unity. If the length of the interval is less than one, we are again lacking some information on the initial state. It should be noted that this is already a
departure from the finite dimensional case; we have here a time-independent system for which distinguishability is dependent on the length of the interval of observation. Now let us consider the case of a distributed observation; for definiteness, we consider the case wherein c(x) is the characteristic function of the interval [1 − δ, 1] for 0 < δ ≤ 1. Thus,
If y(t) = 0 on the interval [0, 1], then by starting with t = 1 and decreasing to t = 0 it is easy to see that w₀(x) = 0 and thus w(x, t) ≡ 0; thus, we again have the property of distinguishability if the interval of observation has length ≥ 1. But now an additional feature comes into play, again not present in the finite dimensional case. If we consider trigonometric initial states w₀(x) = sin ωx, ω > 0, we easily verify that |y(t)| ≤ 2/ω, t ≥ 0, a bound tending to zero as ω → ∞. Thus, it is not possible to bound the supremum norm of the initial state in terms of the corresponding norm of the observation, whatever the length of the observation interval. Thus, even though we have the property of distinguishability, we lack observability in a stronger sense, to the extent that we cannot reconstruct the initial state from the indicated observation in a continuous (i.e., bounded) manner. This is again a departure from the finite dimensional case wherein we have noted, for linear systems, that distinguishability/observability is equivalent to the existence of a bounded reconstruction operator.
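The O(1/ω) decay of the windowed observation is easy to confirm numerically. A short sketch (assuming the sine initial states, zero inflow, and the window [1 − δ, 1] of the discussion above):

```python
import numpy as np

# Pure transport w_t + w_x = 0 with zero inflow at x = 0 and
# w(x, 0) = sin(omega x): for t <= 1 the solution is
# w(x, t) = sin(omega (x - t)) for x >= t, and 0 otherwise.
# Windowed observation: y(t) = int_{1-delta}^{1} w(x, t) dx, |y| <= 2/omega.
delta = 0.25

def y_obs(t, omega, m=4001):
    x = np.linspace(1.0 - delta, 1.0, m)
    w = np.where(x >= t, np.sin(omega * (x - t)), 0.0)
    h = x[1] - x[0]
    return h * (0.5 * w[0] + w[1:-1].sum() + 0.5 * w[-1])   # trapezoid rule

peaks = []
for omega in (10.0, 100.0, 1000.0):
    peak = max(abs(y_obs(t, omega)) for t in np.linspace(0.0, 1.0, 201))
    peaks.append(peak)
    print(omega, peak)   # stays below 2/omega, up to quadrature error
```

The initial states all have supremum norm 1 while the observations shrink like 1/ω, which is precisely the failure of a bounded reconstruction operator noted in the text.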
68.2 General Formulation in the Distributed-Parameter Case

To make progress toward some rigorous definitions we have to introduce a certain degree of precision into the conceptual framework. In doing this we assume that the reader has some background in the basics of functional analysis [17]. Accordingly, then, we assume that the process under study has an associated state space, W, which we take to be a Banach space with norm ‖w‖_W [in specific instances, this is often strengthened to a Hilbert space with inner product (w, ŵ)_W]. The process itself is described by an operator differential equation in W,
where A is a (typically unbounded, differential) closed linear operator with domain D(A) constituting a dense subspace of W, satisfying additional conditions (cf. [6], e.g.) so that the semigroup of bounded operators e^{At} is defined for t > 0, strongly continuous in the sense that the state trajectory associated with an initial state w₀ ∈ W,
is continuous in W for t ≥ 0. Many time-independent PDEs can be represented in this way, with A corresponding to the "spatial"
normal vector to Γ, external with respect to Ω, and this rate is proportional to the difference between the surface temperature w at the same point on the surface and the ambient temperature, which we will assume, for simplicity, to be zero. The system under study therefore consists of Equation 68.14 together with a Dirichlet-Neumann boundary condition
(Equation 68.18), i.e., such that not all c_k = 0, can give rise to an observation (Equation 68.20) that is identically zero. This immediately requires us to give attention to the boundary values η_k(X, t) = e^{−λ_k t} φ_k(X) corresponding to w₀(X) = φ_k(X). Clearly, the boundary observation corresponding to a general initial state (Equation 68.18) is then
y(X, t) = Σ_{k=1}^∞ c_k η_k(X, t),  X ∈ Γ, t ∈ [0, T],    (68.21)
where σ is a positive constant of proportionality. Using X to stand for the triple x, y, z, the (unknown) initial condition is
The standard theory of parabolic PDEs [22] guarantees the existence of a unique solution w(X, t), X ∈ Ω, t ∈ [0, T] of Equation 68.14 with the boundary/initial data of Equations 68.15 and 68.16. The available measurement data are
y(X, t) = w(X, t), X ∈ Γ, t ∈ [0, T].    (68.17)
The question then is whether it is possible to reconstruct the temperature distribution w(X, τ), X ∈ Ω at a particular instant τ ∈ [0, T] on the basis of the measurement (Equation 68.17). This question has been extensively studied, sometimes indirectly via the dual question of boundary controllability. Brevity requirements constrain us to cite only [13], [14] here, but we will indicate below some of the mathematical issues involved. The Laplacian operator appearing on the right-hand side of Equation 68.14, defined on a dense domain in L²(Ω) incorporating the boundary condition of Equation 68.15, is known to be a positive self-adjoint differential operator with positive eigenvalues λ_k, k = 1, 2, 3, ..., and has corresponding normalized eigenfunctions φ_k, k = 1, 2, 3, ..., forming an orthonormal basis for the Hilbert space L²(Ω). An initial state w₀ in that space has an expansion, convergent in that space,
Corresponding to this expansion, the system of Equations 68.14 to 68.16 has the solution
with the corresponding measurement, or observation,
y(X, t) = Σ_{k=1}^∞ c_k e^{−λ_k t} φ_k(X),  X ∈ Γ, t ∈ [0, T],    (68.20)
involving known λ_k, φ_k, but unknown c_k. Let us first consider the question of distinguishability: Can two solutions w(X, t), ŵ(X, t), corresponding to initial states w₀, ŵ₀, produce the same observation y(X, t) via Equation 68.17? From the linear homogeneous character of the system it is clear that this is equivalent to asking whether a nonzero initial state
Can y(X, t), taking this form, be identically zero if the c_k are not all zero? The statement that this is not possible, hence that the system is distinguishable, is precisely the statement that the η_k(X, t), X ∈ Γ, t ∈ [0, T], are weakly independent in the appropriate boundary space, for simplicity, say L²(Γ × [0, T]). As a result of a variety of investigations [9], we can assert that this is, indeed, the case for any T > 0. In fact, these investigations show that a stronger result, spectral observability, is true. The functions η_k(X, t) are actually strongly independent in L²(Γ × [0, T]), by which we mean that there exist biorthogonal functions, not necessarily unique, in L²(Γ × [0, T]) for the η_k(X, t), i.e., functions ζ_k(X, t) ∈ L²(Γ × [0, T]), k = 1, 2, 3, ..., such that
The existence of these biorthogonal functions implies that any finite number of the coefficients ck can be constructed from the observation y(X, t) via
Indeed, we can construct a map from the output space, here assumed to be L²(Γ × [0, T]), namely,
(68.24) carrying the observation Y into the "K-approximation" to the state w(·, τ), τ ∈ [0, T]. Since the sum (Equation 68.18) is convergent in L²(Ω), this property of spectral observability is a form of approximate observability in the sense that it permits reconstruction of the initial state of Equation 68.18, or a corresponding subsequent state w(·, τ), within any desired degree of accuracy. Unfortunately, there is, in general, no way to know how large K should be in Equation 68.24 in order to achieve a specified accuracy, nor is there any way to obtain a uniform estimate on the effect of errors in measurement of Y. The question of τ-observability, for τ ∈ [0, T], is a more demanding one; it is the question as to whether the map S_{τ,K} defined in Equation 68.24 extends by continuity to a bounded linear map S_τ taking the observation Y into the corresponding state w(·, τ). It turns out [21] that this is possible for any τ > 0, but it is not possible for τ = 0; i.e., the initial state can never be continuously reconstructed from measurements of this kind. Fortunately, it is ordinarily the terminal state w(·, T) that is the
more relevant, and reconstruction is possible in this case. The proof of these assertions (cf. [7], [21], e.g.) relies on delicate estimates of the norms of the biorthogonal functions ζ_k(X, t) in the space L²(Γ × [0, T]). The boundedness property of the operator S_τ is important not only in regard to convergence of approximate reconstructions of the state w(·, τ) but also in regard to understanding the effect of an error δY. If such an error is present, the estimate obtained for w(·, τ) will clearly be
ŵ(·, τ) = S_τ(Y + δY) = w(·, τ) + S_τ δY,    (68.25)
and the norm of the reconstruction error thus does not exceed ‖S_τ‖ ‖δY‖. A further consequence of this clearly is the importance, since reconstruction operators are not, in general, unique, of obtaining a reconstruction operator of least possible norm. If we denote the subspace of L²(Γ × [0, T]) spanned by the functions η_k(X, t), as described following Equation 68.21, by E(Γ × [0, T]), it is easy to show that the biorthogonal functions ζ_k described via Equation 68.22 are unique and have least possible norm if we require that they should lie in E(Γ × [0, T]). The particular reconstruction operator Ŝ_τ, constructed as the limit of the operators (Equation 68.24) with the least-norm biorthogonal functions ζ_k, may then be seen to have least possible norm. In applications, reconstruction operators of the type we have described here are rarely used; one normally uses a state estimator (cf. [18], e.g.) that provides only an asymptotic reconstruction of the system state, but the performance of such a state estimator is still ultimately limited by considerations of the same sort as we have discussed here. It is possible to provide similar discussions for the "wave" counterpart of Equation 68.14, i.e.,
with a variety of boundary conditions, including Equation 68.15. A very large number of such studies have been made, but they have normally been carried out in terms of the dual control system [19] rather than in terms of the linear observed system. In some cases, methods of harmonic analysis similar to those just described for the heat equation have been used [8], but the most definitive results have been obtained using methods derived from the scattering theory of the wave equation and from related methods such as geometrical optics [2] or multiplier methods [11]. These studies include treatment of cases wherein the observation/(dual) control process is restricted to a subset Γ₁ ⊂ Γ having certain geometrical properties. Other contributions [21] have shown the study of the wave equation to be pivotal in the sense that results for related heat and elastic processes can be inferred, via harmonic analysis, once the wave equation results are in place.
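The least-norm property of biorthogonal functions discussed above can be illustrated in finite dimensions. The sketch below is not from the chapter; it represents "functions" by sampled vectors (all names and numbers are illustrative). The rows of the Moore-Penrose pseudoinverse are biorthogonal to the given vectors and lie in their span, and any other biorthogonal choice can only be longer:

```python
import numpy as np

# Represent four independent "functions" by sampled column vectors.
rng = np.random.default_rng(1)
F = rng.standard_normal((50, 4))

# Rows of the pseudoinverse are biorthogonal to the columns of F and lie
# in the span of those columns, so they have least possible norm.
Fp = np.linalg.pinv(F)
assert np.allclose(Fp @ F, np.eye(4))   # xi_j(f_k) = delta_jk

# Any other biorthogonal functional adds a component orthogonal to the
# span and can only be longer.
z = rng.standard_normal(50)
z -= F @ np.linalg.lstsq(F, z, rcond=None)[0]   # project z off span(F)
alt = Fp[0] + z                                  # still biorthogonal
assert np.allclose(alt @ F, np.eye(4)[0])
assert np.linalg.norm(alt) > np.linalg.norm(Fp[0])
```

The same Pythagorean argument underlies the infinite-dimensional statement: restricting the biorthogonal functions to the closed span E(Γ × [0, T]) removes exactly the orthogonal component that could inflate their norms.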
68.4 Observability Theory for Elastic Beams

There are several different models for elastic beams, even when we restrict attention to small deformation linear models. These
include the Euler-Bernoulli, Rayleigh and Timoshenko models. The oldest and most familiar of these is the Euler-Bernoulli model, consisting of the PDE
ρ(x) ∂²w/∂t² + ∂²/∂x²( EI(x) ∂²w/∂x² ) = 0,  0 < x < L,    (68.27)

wherein ρ(x) denotes the mass per unit length and EI(x) is the so-called bending modulus. We are concerned with solutions in a certain weak sense, which we will not elaborate upon here, corresponding to a given initial state

w(·, 0) = w₀ ∈ H²[0, L],  ∂w/∂t(·, 0) = v₀ ∈ L²[0, L],    (68.28)

where H²[0, L] is the standard Sobolev space of functions with square integrable second derivatives on the indicated interval. Additionally, one needs to give boundary conditions at x = 0 and at x = L; these vary with the physical circumstances. For the sake of brevity, we will confine our discussion here to the cantilever case, where the left-hand end is assumed "clamped" while the right-hand end is "free"; the appropriate boundary conditions are then
w(0, t) = 0,  ∂w/∂x(0, t) = 0,    (68.29)

∂²w/∂x²(L, t) = 0,  ∂/∂x( EI(x) ∂²w/∂x² )(L, t) = 0.    (68.30)
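For a concrete look at the spectrum, the following sketch assumes a uniform beam with ρ = EI = 1 and L = 1 (a normalization the chapter does not impose). Under the clamped-free conditions the eigenvalues are governed by the classical cantilever characteristic equation cos β cosh β = −1, and the natural frequencies grow like β_k²:

```python
import math

# Characteristic equation for the clamped-free (cantilever) beam:
# nontrivial eigenfunctions exist when cos(beta)*cosh(beta) = -1.
def char_eq(beta):
    return math.cos(beta) * math.cosh(beta) + 1.0

def bisect(f, a, b, iters=200):
    # Simple bisection; assumes f changes sign on [a, b].
    fa = f(a)
    for _ in range(iters):
        m = 0.5 * (a + b)
        if fa * f(m) <= 0.0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

# The k-th root lies near (k - 1/2)*pi, which provides a bracket.
betas = [bisect(char_eq, (k - 0.5) * math.pi - 1.0, (k - 0.5) * math.pi + 1.0)
         for k in range(1, 6)]

# With rho = EI = 1 and L = 1, omega_k = sqrt(lambda_k) = beta_k**2.
omegas = [b * b for b in betas]
gaps = [omegas[i + 1] - omegas[i] for i in range(len(omegas) - 1)]
```

The roots β_k approach (k − 1/2)π, so the frequency gaps ω_{k+1} − ω_k grow without bound; this quadratic growth is the property exploited in the observability analysis that follows.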
In many applications a mechanical structure, such as a manipulator arm, is clamped to a rotating base that points the arm/beam in various directions in order to carry out particular tasks. Each "slewing" motion results in a degree of vibration of the structure, which, for most practical purposes, can be thought of as taking place within the context of the model of Equations 68.27 to 68.30. In order to attenuate the undesired vibration, it is first of all necessary to carry out an observation procedure in order to determine the oscillatory state of the system preparatory to, or in conjunction with, control operations. A number of different measurement options exist whose feasibility depends on the operational situation in hand. We will cite three of these. In the first instance, one might attach a strain gauge to the beam near the clamped end. The extension or compression of such a (normally piezoelectric) device provides a scaled physical realization of the mathematical measurement

y(t) = ∂²w/∂x²(0, t).    (68.31)
Alternatively, one can place an accelerometer near the free end of the beam to provide a measurement equivalent to

y(t) = ∂²w/∂t²(L, t),    (68.32)

or one can use a laser device, relying on the Doppler effect, to measure

y(t) = ∂w/∂t(L, t).    (68.33)
THE CONTROL HANDBOOK

Each of these measurement modes provides a scalar valued function y(t) carrying a certain amount of information on the system state w(x, t); the problem, as before, is to reconstruct the initial state, or the current state at time T, from the record y(t), 0 ≤ t ≤ T. The mathematical theory of this reconstruction is in many ways similar to the one we have just described for the heat equation, but with some significant differences. The most notable of these is immediately apparent from the equation itself; it is invariant under reversal of the time direction. This means that there is no inherent difference between initial and terminal states or, indeed, any intermediate state. We should expect this to show up in the mathematics, and it does. The differential operator defined by

(Aφ)(x) = (1/ρ(x)) ( EI(x) φ″(x) )″
on the subspace of H⁴[0, L] resulting from imposition of the cantilever boundary conditions, is an unbounded positive self-adjoint operator with positive eigenvalues, listed in increasing order, λ_k, k = 1, 2, 3, .... The corresponding normalized eigenfunctions φ_k(x) form an orthonormal basis for L²[0, L]. Defining ω_k = √λ_k, the solution of Equations 68.27 to 68.30 takes the form

w(x, t) = Σ_{k=1}^∞ ( c_k cos ω_k t + d_k sin ω_k t ) φ_k(x),    (68.36)

where, with w₀ and v₀ as in Equation 68.28,

c_k = w_{0,k},  d_k = v_{0,k}/ω_k,    (68.37)

w_{0,k} and v_{0,k} being the expansion coefficients of w₀ and v₀ with respect to the basis {φ_k(x)}. The norm of the state w₀, v₀ in the state space H²[0, L] ⊕ L²[0, L] is equivalent to the norm of the double sequence {w_{0,k}, v_{0,k}} in a weighted ℓ² Hilbert space; that norm is

( Σ_{k=1}^∞ ( ω_k² w_{0,k}² + v_{0,k}² ) )^{1/2}.

Any one of the measurement modes discussed earlier now takes the form

y(t) = Σ_{k=1}^∞ ( γ_k c_k cos ω_k t + δ_k d_k sin ω_k t ),    (68.38)

with γ_k and δ_k depending on the particular measurement mode being employed. Thus, in the cases of Equations 68.31, 68.32 and 68.33, respectively, γ_k and δ_k are expressed in terms of φ_k″(0), −ω_k² φ_k(L), and ω_k φ_k(L) (Equations 68.39 to 68.41). Just as in the earlier example of the heat equation, everything depends on the independence properties of these functions in L²[0, T], where T is the length of the observation interval, in relation to the asymptotic growth of the ω_k, k → ∞; the basic character of that growth is that the ω_k are distinct, increasing with k if ordered in the natural way, and proportional, asymptotically, to k² (Equation 68.43). But now we are concerned only with the scalar exponential (trigonometric) functions

cos ω_k t,  sin ω_k t,  k = 1, 2, 3, ....    (68.42)

The relationship between properties of the functions of Equation 68.42 and the asymptotic and/or separation properties of the exponents ω_k is one of the questions considered in the general topic of nonharmonic Fourier series, whose systematic study began in the 1930s with the work of Paley and Wiener [16]. The specific result that we make use of is due to A. E. Ingham [10]. Combined with other, more or less elementary, considerations, it implies that if the ω_k are real and satisfy a separation condition, for some positive integer K and positive number γ,

ω_{k+1} − ω_k ≥ γ,  k ≥ K,    (68.44)

then the functions of Equation 68.42 are uniformly independent and uniformly convergent in L²[0, T], provided that T > 2π/γ; this means there are numbers b, B, 0 < b < B, such that, for any square summable sequences of coefficients c_k, d_k, k = 1, 2, 3, ..., we have

b Σ_k (c_k² + d_k²) ≤ ∫₀ᵀ [ Σ_k ( c_k cos ω_k t + d_k sin ω_k t ) ]² dt ≤ B Σ_k (c_k² + d_k²).    (68.45)
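These inequalities can be checked numerically in a finite truncation. The sketch below uses hypothetical quadratically growing frequencies ω_k = ((k − 1/2)π)², not derived from a particular beam; it estimates the constants b and B from the extreme eigenvalues of the Gram matrix of the functions cos ω_k t, sin ω_k t on [0, T], and then recovers modal coefficients from a simulated scalar observation by least squares:

```python
import numpy as np

# Hypothetical quadratically growing frequencies (not from a specific beam).
omega = np.array([((k - 0.5) * np.pi) ** 2 for k in range(1, 5)])
T = 1.0
t = np.linspace(0.0, T, 20001)

# Columns are cos(omega_k t) and sin(omega_k t), sampled on [0, T].
F = np.concatenate([np.cos(np.outer(t, omega)),
                    np.sin(np.outer(t, omega))], axis=1)

# Gram matrix G_ij = integral_0^T f_i(t) f_j(t) dt via trapezoid weights.
dt = t[1] - t[0]
wts = np.full(t.size, dt)
wts[0] = wts[-1] = dt / 2.0
G = F.T @ (wts[:, None] * F)

# The extreme eigenvalues play the roles of b and B in the inequality;
# a strictly positive smallest eigenvalue certifies uniform independence.
eigs = np.linalg.eigvalsh(G)
b, B = eigs[0], eigs[-1]

# Bounded reconstruction in this truncation: recover the coefficients of
# a simulated observation y(t) by least squares.
rng = np.random.default_rng(0)
coef_true = rng.standard_normal(F.shape[1])
y = F @ coef_true
coef_rec, *_ = np.linalg.lstsq(F, y, rcond=None)
```

The ratio B/b controls how much a measurement error can be amplified in the recovered coefficients, which is the finite-dimensional shadow of the bounded invertibility discussed next.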
Since the asymptotic property of Equation 68.43 clearly implies, for any γ > 0, that Equation 68.44 is satisfied if K = K(γ) is sufficiently large, inequalities of Equation 68.45, with b = b(γ), B = B(γ), must hold for any T > 0. It follows that the linear map, or operator,
defined on the (necessarily closed, in view of Equation 68.45) subspace of L²[0, T] spanned by the exponential functions in question, is both bounded and boundedly invertible. Applied to the observation y(t), t ∈ [0, T], corresponding to an initial state of Equation 68.36, the boundedness of C, together with the relationship of Equation 68.37 between the w_{0,k}, v_{0,k} and the c_k, d_k and the form of the norm, makes it not hard to see that there exists a bounded reconstruction operator from the observation
(Equation 68.38) to the initial state (Equation 68.36), provided the coefficients γ_k, δ_k in Equation 68.38 satisfy an inequality of the form

|γ_k| ≥ C ω_k,  |δ_k| ≥ D ω_k,    (68.47)

for some positive constants C and D. Using the trigonometric/exponential form of the eigenfunctions φ_k(x), one easily verifies this to be the case for each of the observation modes of Equations 68.39 to 68.41; indeed, the condition is overfulfilled in the case of the accelerometer measurement (Equation 68.40). Thus, bounded observability of elastic beam states via scalar measurements can be considered to be typical.

At any later time t = τ, the system state resulting from the initial state of Equation 68.36 has the form

w_τ(x) = Σ_{k=1}^∞ ( cos ω_k τ w_{0,k} + (sin ω_k τ / ω_k) v_{0,k} ) φ_k(x),
v_τ(x) = Σ_{k=1}^∞ ( −ω_k sin ω_k τ w_{0,k} + cos ω_k τ v_{0,k} ) φ_k(x).    (68.48)

Using Equation 68.48 with the form of the norm in the weighted ℓ² space, one can see that the map from the initial state w₀, v₀ ∈ H²[0, L] ⊕ L²[0, L] to w_τ, v_τ in the same space is bounded and boundedly invertible; this is another way of expressing the time reversibility of the system. It follows that the state w_τ, v_τ can be continuously reconstructed from the observation y(t), t ∈ [0, T], in precisely the same circumstances as we can reconstruct the state w₀, v₀.

References

[1] Anderson, B.D.O. and Moore, J.B., Linear Optimal Control, Electrical Engineering Series, Prentice Hall, Englewood Cliffs, NJ, 1971, chap. 8.
[2] Bardos, C., LeBeau, G., and Rauch, J., Contrôle et stabilisation dans des problèmes hyperboliques, in Contrôlabilité Exacte, Perturbations et Stabilisation de Systèmes Distribués, Lions, J.-L., Masson, Paris, 1988, appendix 2.
[3] Beck, J.V. and Arnold, K.J., Parameter Estimation in Engineering and Science, John Wiley & Sons, New York, 1977.
[4] Courant, R. and Hilbert, D., Methods of Mathematical Physics, Vol. 2: Partial Differential Equations, Interscience Publishers, New York, 1962.
[5] Dolecki, S. and Russell, D.L., A general theory of observation and control, SIAM J. Control Opt., 15, 185-220, 1977.
[6] Dunford, N. and Schwartz, J.T., Linear Operators, Vol. 1: General Theory, Interscience, New York, 1958.
[7] Fattorini, H.O. and Russell, D.L., Exact controllability theorems for linear parabolic equations in one space dimension, Arch. Ration. Mech. Anal., 43, 272-292, 1971.
[8] Graham, K.D. and Russell, D.L., Boundary value control of the wave equation in a spherical region, SIAM J. Control, 13, 174-196, 1975.
[9] Ho, L.F., Observabilité frontière de l'équation des ondes, C. R. Acad. Sci. Paris, 302, 1986.
[10] Ingham, A.E., Some trigonometrical inequalities with applications to the theory of series, Math. Z., 41, 367-379, 1936.
[11] Lagnese, J., Controllability and stabilizability of thin beams and plates, in Distributed Parameter Control Systems, Chen, G., Lee, E.B., Littman, W., Markus, L., Eds., Lecture Notes in Pure and Applied Math. 128, Marcel Dekker, New York, 1991.
[12] LaSalle, J.P. and Lefschetz, S., Stability by Liapunov's Direct Method, with Applications, Academic Press, New York, 1961.
[13] Lions, J.L., Optimal Control of Systems Governed by Partial Differential Equations, Grund. Math. Wiss. 170, Springer-Verlag, New York, 1971.
[14] Lions, J.L., Contrôlabilité Exacte, Perturbations et Stabilisation de Systèmes Distribués, Tomes 1, 2, Recherches en Mathématiques Appliquées 8, 9, Masson, Paris, 1988.
[15] Luenberger, D.G., An introduction to observers, IEEE Trans. Autom. Control, 16, 596-602, 1971.
[16] Paley, R.E.A.C. and Wiener, N., Fourier Transforms in the Complex Domain, Colloq. Pub. 19, American Mathematical Society, Providence, RI, 1934.
[17] Pazy, A., Semigroups of Linear Operators and Applications to Partial Differential Equations, Applied Mathematical Sciences 44, Springer-Verlag, New York, 1983.
[18] Russell, D.L., Mathematics of Finite Dimensional Control Systems: Theory and Design, Marcel Dekker, New York, 1979.
[19] Russell, D.L., Controllability and stabilizability theory for linear partial differential equations: recent progress and open questions, SIAM Rev., 20, 639-739, 1978.
[20] Russell, D.L., Decay rates for weakly damped systems in Hilbert space obtained with control-theoretic methods, J. Diff. Eq., 19, 344-370, 1975.
[21] Russell, D.L., A unified boundary controllability theory for hyperbolic and parabolic partial differential equations, Stud. Appl. Math., LII, 189-211, 1973.
[22] Showalter, R.E., Hilbert Space Methods for Partial Differential Equations, Pitman Publishing Ltd., San Francisco, 1977.
PART C APPLICATIONS OF CONTROL
SECTION XV Process Control
Water Level Control for the Toilet Tank: A Historical Perspective
Bruce G. Coury
The Johns Hopkins University Applied Physics Laboratory, Laurel, MD
69.1 Introduction 1179
69.2 Control Technology in the Toilet 1180
69.3 Toilet Control 1181
69.4 The Concept of Flushing 1183
69.5 The Need for a Sanitation System 1184
69.6 Concerns About Disease 1185
69.7 Changing Attitudes About Health and Hygiene 1186
69.8 The Indoor Toilet 1186
69.9 Historical Approaches to Systems 1187
69.10 Summary 1189
References 1190
69.1 Introduction

Control technologies are a ubiquitous feature of everyday life. We rely on them to perform a wide variety of tasks without giving much thought to the origins of that technology or how it became such an important part of our lives. Consider the common toilet, a device that is found in virtually every home and encountered every day. Here is a device, hidden away in its own room and rarely a major topic of conversation, that plays a significant role in our life and depends on several control systems for its effective operation. Unlike many of the typical household systems which depend on control technologies and are single purpose, self-contained devices (e.g., coffee makers), the toilet is a relatively sophisticated device that uses two types of ancient controllers. For most of us, the toilet is also the primary point of daily contact with a large distributed system for sanitation management; a system that is crucially dependent on control and has extremely important social, cultural, environmental, and political implications. In addition, we have some very clear expectations about the performance of that technology and well-defined limits for the types of tolerable errors should the technology fail. To imagine life without properly functioning modern toilets conjures up images of a lifestyle that most of us would find unacceptable and very unpleasant. One need not look too far back in time to discover what life was really like without toilets linked to a major sanitation system. The toilet as we know it today is a recent technological development, coming into widespread use only at the end of the 19th century. Prior to that time, indoor toilets were relatively rare in all but

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
the most wealthy homes, and the disposal and management of sewage was a rather haphazard affair. Only as a result of the devastating cholera epidemics of the mid-19th century in the United States and Europe were significant efforts made to control and manage waste products. Surprisingly, the technologies for sanitation had been available for quite some time. For instance, the toilet has a rich technological history with roots dating back to antiquity. The Greeks and Romans had developed the types of control technologies necessary for the operation of modern toilets, but the widespread adoption of that technology did not occur for another 1800 years. One wonders, then, why a control technology so obviously useful took centuries to be adopted. To ponder the factors that motivate developing and adopting a technology is the essence of a historical analysis of that technology and is the first step in tracing the roots of technological development. One quickly realizes that the answers to such questions are rather complex, requiring one to explore not only the development of the technology, but also the economic, political, and social influences that shaped the development and implementation of that technology. In this respect, toilets and their controls are an ideal topic. First, it is a technology that is common, familiar, and classic in its control engineering. Although writing about the control technology of toilets is fraught with many pitfalls (one can easily fall victim to bathroom humor), toilets are a technology that people encounter every day that is part of a very large and highly distributed system. Second, the history of the development of control technologies for toilets has many dimensions. There is the obvious chronicle of the evolution from latrine to commode to water closet to
toilet. The history will quickly reveal that the development of toilet technology was inextricably linked to the development of an entire water supply and sanitation system. Perhaps more important, we will see that the history of toilet technology and the entire sanitation system must be considered in the context of significant parallel social, cultural, and political trends that describe changing attitudes towards health and sanitation. Hopefully, by the end of this article, the reader will begin to grasp the complexity of technological development and the myriad forces that shape the development and use of a technology. To consider the historical development of technology requires a specific framework of analysis. To focus only on the development of toilet technology and the adaptation of sensory and control mechanisms to the operation of toilets is to miss the larger context of the history of technological development. As Hughes [4] points out in his studies of the historical development of systems (he concentrates on electric power systems), "systems embody the physical, intellectual, and symbolic resources of the society that constructs them." As we shall see with the toilet, its development and use was driven by the threat of disease, the appalling conditions resulting from uncontrolled waste disposal, changing attitudes towards health and cleanliness, and the fashionable features of a "modern" home made possible by an abundant water supply. The discussion begins with a description of the toilet technology relevant to this chapter, the valves for maintaining the water level in the toilet tank. What may come as a surprise to the reader is the similarity of modern toilet technology to the water closets of the late 19th century and how the same technological challenges have persisted for more than 150 years.
We will then turn to the historical roots of those technologies and trace the events of the last few centuries that resulted in today's toilet and sanitation system.
69.2 Control Technology in the Toilet

Lift the lid from the top of a toilet tank and inside is the basic control technology for the entry to a widely dispersed sanitation system. Although simple in operation (e.g., as shown in Figures 69.1, 69.2, and 69.3), the mechanisms found in the toilet tank serve one very important function: maintaining a constant water level in the tank. Modern toilets operate on the concept of flushing, where a preset amount of water is used to remove waste from the toilet bowl. The tank stores water for use during flushing and controls the amount of water in storage by two valves: one controlling the amount of water entering the tank and another controlling the amount of water leaving the tank. Push the handle on the tank to activate the flush cycle. Pushing the handle pops open the valve at the bottom of the tank (the toilet flush valve), allowing the water to rush into the toilet bowl (assuming well-maintained and nonleaking equipment). Notice that this valve (in most toilet designs) also floats, thereby keeping the valve open after the handle is released. As the water level drops, the float attached to the second valve (the float valve) also descends, opening that valve and allowing water to enter the tank.
Figure 69.1 U.S. Patent drawing of the design of a water closet showing the basic components of the toilet tank, flush and float valves, and the toilet basin. Source: U.S. Patent No. 349,348 filed by P. Harvey, Sept. 21, 1886.
When the water level drops below a certain level, the flush valve closes, allowing the tank to refill. The water level rises, carrying the float with it until there is sufficient water in the tank to close the float valve. Within a toilet tank we find examples of classic control devices. There are valves for controlling input and output, activation devices for initiating a control sequence, feedback mechanisms that sense water level and provide feedback to the control devices, and failure modes that minimize the cost of disruptions in control. The toilet tank is comprised of two primary, independent control mechanisms, the valve for controlling water flow into the tank and the valve for controlling water flow out of the tank, both relying on the same variable (tank water level) for feedback control. Using water level as the feedback parameter is very useful. If all works well, the water level determines when valves should be opened and closed for proper toilet operation. The amount of water in the tank is also the measure of performance required to minimize the adverse effects of a failure. Should one of the valves fail (e.g., the float valve in the open position), the consequences of the failure (water running all over the floor) are minimized (water pours down the inside of the
Figure 69.2 U.S. Patent drawing of the flush valve and the float valve for the toilet tank. Source: U.S. Patent No. 549,378 filed by J.F. Lymburner and M.F. Lassance, Nov. 5, 1895.

Figure 69.2 (Continued.)
flush valve tube). Not all of the solutions to the failsafe maintenance of water level are the same. In the design shown in Patrick Harvey's U.S. Patent dated 1886 (Figure 69.1), a tank-within-a-tank approach is used so that the overflow from the cistern due to float valve failure goes directly to the waste pipe. The traditional approach to the same problem is shown in Figures 69.2 and 69.3; although the two designs are almost 100 years apart (Figures 69.2a and 69.2b are from the U.S. Patent filed by Joseph F. Lymburner and Mathias F. Lassance in 1895; Figure 69.3 is from the U.S. Patent filed in 1992 by Olof Olson), each uses a hollow tube set at a predetermined height for draining overflow should the float valve fail. In other words, the toilet is a control device characterized by relatively fail-safe operation.
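The two-valve flush cycle described above can be caricatured as an on/off (bang-bang) feedback loop on the tank water level. The following sketch uses made-up rates and thresholds purely to show the valve logic, not parameters from the chapter:

```python
# Illustrative on/off model of the toilet tank cycle; all constants are
# hypothetical, chosen only to demonstrate the valve logic.
FULL = 1.0          # float-valve shutoff level
FLUSH_CLOSE = 0.1   # level at which the flush valve stops floating and reseats
IN_RATE = 0.05      # refill rate while the float valve is open (per step)
OUT_RATE = 0.20     # drain rate while the flush valve is open (per step)

def simulate_flush(steps=200):
    level = FULL
    flush_open = True            # the handle has just been pushed
    history = []
    for _ in range(steps):
        if flush_open:
            level -= OUT_RATE
            if level <= FLUSH_CLOSE:
                flush_open = False   # valve no longer floats; it reseats
        if level < FULL:
            level += IN_RATE         # float has descended: inlet valve open
        level = max(0.0, min(level, FULL))
        history.append(level)
    return history

h = simulate_flush()
```

The level drains while the flush valve floats open, the float valve refills the tank as soon as the level falls, and the cycle ends with the level restored to the float-valve shutoff point, mirroring the sequence traced in the patent drawings.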
69.3 Toilet Control

The toilet flush valve and the float valve operate on two different principles. The float valve has separate mechanisms for sensing the level of water in the tank (the float) and for controlling the input of water into the tank (the valve). The flush valve, on the other hand, combines both mechanisms for sensing and control into a single device. Both types of control technology have their origins in antiquity. The requirement for control of the level and flow of water was recognized by the Greeks in the design of water clocks. Water clocks were a rather intriguing device built on the principle that a constant flow of water could be used to measure the passage of time. In the design by Ktesibious (Figure 69.4), a mechanician serving under King Ptolemy II Philadelphus (285-247 B.C.), water flows into a container through an orifice of predetermined size, in which the water level slowly rises as time passes [6]. Riding on the surface of the water is a float (labeled P in the diagram) attached to a mechanism that indicates the time of day; as the float rises, the pointer moves up the time scale. To assure an adequate supply of water to the clock, the orifice controlling the flow of water into the clock container was attached to a holding tank. Ktesibious recognized that accurate time keeping requires precise control over water flow and maintenance of a constant water level in the container. Consequently, it was necessary to develop a method of control assuring a constant flow rate of water into the container. The Greeks clearly understood the relationship
Figure 69.3 U.S. Patent drawing of a modern design for the flush valve and float valve for the toilet tank. Source: U.S. Patent No. 5,142,710 filed by O. Olson, Sept. 1, 1992.
between flow rate and relative pressure, and recognized that the flow rate of water into the clock would decrease as the holding tank emptied. By maintaining a constant water level in the holding tank, the problem of variations in flow rate could be solved. There are, however, a number of possible passive or active solutions to the constant water level problem. A passive approach to the problem could use either a constantly overflowing holding tank or an extremely large reservoir of water relative to the size of the water clock container. Active control solutions, on the other hand, would use some form of water level sensing device and a valve to regulate the amount of water entering the holding tank (thereby reducing the need for a large reservoir or a messy overflow management system). Ktesibious chose the active control solution by designing a float valve for the inlet to the holding tank that assured a constant water level. From the descriptions of Ktesibious' water clock [6], the valve was a solid cone (labeled G in Figure 69.4) that floated on the surface of the water in the holding tank serving the orifice of the water clock. The valve stopped the flow of water into the holding tank when the level of the water forced the tip of the cone into a similarly shaped valve seat. Notice again that the functions of sensing the level of water and controlling the flow of water are both contained in the float valve. The flush valve of a modern toilet uses a similar principle. Combining both sensing and control in the same device, the flush valve is a modern variation of Ktesibious' cone-shaped float that controls the amount of water flowing out of the toilet tank. Unlike the Greek original, however, the control sequence of flush
Figure 69.4 Control technology in the water clock designed by Ktesibious (285-247 B.C.) showing the use of a float valve for regulating water flow. Source: Mayr, O., The Origins of Feedback Control, MIT Press, Cambridge, MA, 1969.
valve action is initiated by the mechanism attached to the flushing handle on the outside of the toilet tank. Thus, the toilet flush valve is a discrete control device that seeks to control the output of water after some external event has resulted in a drop in water level. Subsequent developments of the float valve by Heron in the first century A.D. improved on the relationship between float and valve. Described in the Pneumatics, Heron developed float valves to maintain constant fluid levels in two separate vessels (e.g., as in Figure 69.5). In his design, he employed a float connected to a valve by a series of jointed levers. The rise and fall of the float in one vessel would close or open a valve in another vessel. This design effectively separates the sensing of fluid level from the actions to control the flow of fluid, and most closely resembles the float valve for regulating water level in the toilet tank. Although the purpose of the toilet float valve is to maintain a constant level of water in the toilet tank, variations in that water level are usually step functions resulting from discrete, external events. The technology can be applied to continuous control, as is so audibly illustrated when the toilet flush valve leaks. The float valve for the water clock and the holding vessels illustrates a number of important concepts critical to feedback and control. First, feedback is a fundamental component of control. Without a means for sensing the appropriate performance parameter, automatic regulation of input and output cannot be
Figure 69.5 Control technology in the holding vessels designed by Heron (100 A.D.) showing the use of a float connected to a valve for maintaining a constant fluid level. Source: Mayr, O., The Origins of Feedback Control, MIT Press, Cambridge, MA, 1969.

accomplished. Second, separation of the mechanisms for sensing and control provides a more sophisticated form of control technology. In the example of Heron's holding vessels, using a float to sense fluid level that was connected by adjustable levers to a valve for controlling fluid flow allowed adjustments to be made to the float level. Thus, the level of fluid required to close the valve could be varied by changing the required float level. The float valve in Ktesibious' design, on the other hand, was not adjustable; to change the water level resulting in valve closure required substituting an entirely new float valve of the proper dimensions.

69.4 The Concept of Flushing

The Greeks and Romans did not immediately recognize the relevance of this control technology to the development of toilet-like devices. The lack of insight was certainly not due to the inability to understand the relevance of water to waste removal. Flowing streams have always been a source of waste removal, especially in ancient history when people tended to settle near water. By the time the Greeks and Romans built public baths, the use of water to cleanse the body and remove excrement was an integral part of the design. Even the frontier Roman forts in Britain had elaborate latrines that used surface water to flush away human wastes [8]. There was, however, no explicit mechanism for controlling the flow of water other than devices for diverting streams, pipes for routing the flow, or reservoirs to assure an ample supply of water [10]. The use of a purpose-built flushing mechanism (where water is stored in a holding tank until called upon for the removal of waste) was slow to develop and was relatively rare until more recent history (although Reynolds discusses the possibility for such a water closet at Knossos). For instance, 15th century evidence of such a mechanism for a latrine was found during excavation of St. Albans the Abbot in England [10]. In the most simple form of an 18th century flushing toilet, a cistern captured and stored rain water until a valve was opened (usually by pulling on a lever), allowing the water to flush away waste products. In most early applications of flushing, the amount of water used was determined by the size of the cistern, the source of water, and the patience of the user. Although such devices persisted until the latter part of the 19th century, they were considered to be a rather foul and obnoxious solution to the problem.

In general, the collection and disposal of human waste was accomplished without sophisticated technology until the mid-1800s. When a special purpose device for collecting human waste existed in a home, it was usually a commode or chamber pot (although the close stools built for royalty in 16th century Europe could be quite throne-like in appearance, complete with seat and arms and covered with velvet). Otherwise, residents trekked outside to a latrine or privy in the backyard or garden. The first evidence of a recognizable predecessor of the modern toilet is found in the British patents filed by Alexander Cummings in 1775 and Joseph Bramah in 1778 (although Sir John Harrington's valve closet of 1596 had some characteristics similar to 18th century water closets). Cummings proposed a valve closet that had an overhead supply cistern, a handle activated valve for the flush mechanism, and a valve that opened in the bottom of the basin to allow waste to escape. All valves and flushing mechanisms were activated and controlled by the user. Bramah's contribution was a crank activated valve for emptying the basin. No control devices that relied on water level were evident, and the valve separating the basin from the discharge pipe was not a very effective barrier against noxious odors and potentially dangerous sewer gases [10]. An early version of an American water closet of similar design is shown in Figure 69.6. Patented by Daniel Ryan and John
Figure 69.6 U.S. Patent drawing of an early water closet design showing manual operation of the flushing cycle. Source: U.S. Patent No. 10,620 filed by D. Ryan and J. Flanagan, Mar. 7, 1854.

Flanagan in 1854, the basic components of a manually operated water closet are shown. The lever (labeled G) operates the water closet. When the lever is depressed, it pushes up the sliding tube
F until the opening a coincides with the pipe C from the toilet bowl. At the same time, the valve labeled V is opened to allow water to enter the basin and flush away waste. In all situations (commode, latrine, privy, or water closet), no organized or structured system for collecting and managing the disposal of human waste products existed. Human household waste was typically dumped directly into vaults or cesspits (some of which were in the basements of homes) or heaved into ditches or drains in the street. The collection and disposal of that waste was usually handled by workers who came to the home at night with shovel and bucket to empty out the vault or cesspit and cart off the waste for disposal (usually into the most convenient ditch or source of flowing water). When a system of sewers did exist, household waste was specifically excluded. Even the Romans, who were quite advanced in building complex water and sanitation systems, used cesspits in the gardens of private homes to collect human waste products [8]. Not until the link between sanitation and health became evident and a universal concern arose in response to the health hazards of uncontrolled sewage did the need arise to develop an infrastructure to support the collection and removal of human waste. At this point, it should be evident to the reader that the development of toilet technology did not proceed along an orderly evolutionary path from control principle to technological development. For instance, the development of critical water level control technologies by the Greeks and Romans did not immediately lead to the use of that technology in a sanitation system (despite the fact that the need for the removal of waste products had been well established for a long time). Nor did the existence of critical technologies immediately lead to the development of more sophisticated devices for the collection and disposal of human excrement.
Even though the conditions surrounding the disposal of human waste products were recognized as deplorable for many centuries, little was done to develop technological solutions that remotely resembled the modern toilet until other factors came into play. Understanding those factors will lead us to a better understanding of the development of toilet control technology.
THE CONTROL HANDBOOK

69.5 The Need for a Sanitation System
The development of sanitation systems was a direct response to health concerns. From a modern perspective, sanitation and the treatment of human waste were a haphazard affair through the early 19th century. Descriptions of conditions in British and American cities portray open sewers, waste piling up in streets and back alleys, cesspits in yards or even under living room floors, and outdoor privies shared by an entire neighborhood. Mortality rates during that period were shocking: the death rate per 1,000 children under five years of age was 240 in the English countryside and 480 in the cities [10]. In 1849, sewers delivered more than 9 million cubic feet of untreated sewage into the Thames, the same river used as a source of water for human consumption. In his comprehensive treatment of the conditions in 19th-century Newark, NJ, Galishoff [3] provides a vivid account of the situation in an American city. Streets were unpaved and lined with household and commercial wastes. Common sewers were open drains that ran down the center of streets, and many residents built drains from their homes to the sewer to carry away household wastes. Prior to the construction of a sewer system in 1854, Newark's "privy and cesspool wastes not absorbed by the soil drained into the city's waterways and into ditches and other open conduits" [3]. When heavy rains came, the streets turned into filthy quagmires, with the runoff filling cellars and lower levels of homes and businesses in low-lying areas. States Galishoff [3], "Despite greater and greater accumulations of street dirt (consisting mainly of garbage, rubbish, and manure), gutters and streets were cleaned only twice a year. An ordinance compelling property owners to keep streets abutting their parcels free of obstructions was ignored."
Given such horrible conditions, it is not surprising that cholera reached epidemic proportions during the period 1832-1866; 14,000 Londoners died of cholera in 1849, another 10,000 in 1854, and more than 5,000 in the last major outbreak in 1866 [10]. Newark was struck hard in 1832, 1849, and 1866, with minor outbreaks in 1854 and 1873. Although Newark's experience was typical of many American cities (the number of deaths represented approximately 0.5 percent of a city's population), and only Boston and Charleston, SC, escaped relatively unharmed, cholera struck some cities devastating blows. Lexington, KY, for instance, lost nearly a third of its population when the disease swept through that city in 1833 [3].
Until legislative action was taken, the private sector was responsible for constructing sewage systems through most of the 19th century, especially in the United States. This arrangement resulted in highly variable service and waste treatment conditions. For example, Boston had a well-developed system of sewers by the early 1700s, whereas the sewers in the city of New York were built piecemeal over a six-decade period after 1700 [1]. At that time, and until the recognition of the link between disease and contaminated water in the mid-19th century, the disposal of general waste water and the disposal of human waste were treated as separate concerns. Private citizens were responsible for the disposal of the waste from their own privies, outbuildings, and cesspools. If the waste was removed (and in the more densely populated sections of American cities, such was hardly ever the case), it usually was dumped into the nearest body of water. As a result, American cities suffered the same fate as European cities during the cholera years of 1832-1873. Needless to say, lawmakers were motivated to act, and the first public health legislation was passed. London enacted its Public Health Act in 1848, and efforts were made to control drainage, close cesspits, and repair, replace, and construct sewers. Resolution of the problem was a Herculean task; not until the 1870s did the situation improve sufficiently for the death rate to decline significantly. The Newark Board of Health was created in 1857 to oversee sanitary conditions. It, too, faced an uphill battle and was still considered ineffectual in 1875.
69.6 Concerns About Disease
Fundamental to the efforts to control sewage was the realization of a direct link between contaminated water and the spread of disease. This realization did not occur until the medical profession established the basis for the transmission of disease. Prior to the mid-1800s, the most common notion of the mechanism for spreading disease was based on atmospheric poisoning (referred to as the miasmic theory of disease) caused by the release of toxic gases during the fermentation of organic matter. In this view, diseases such as cholera were spread by the toxic gases released by stagnant pools of sewage and accumulations of garbage and waste. Urban congestion and squalor appeared to confirm the theory; the highest incidence of cholera (as well as of most other communicable diseases) occurred in the most densely populated and poorest sections of a city, where sanitation was virtually nonexistent. Some, even in the medical community, believed that sewers were especially dangerous in congested urban areas because of the large volumes of sewage that could accumulate and emit deadly concentrations of sewer gases. The higher incidence of disease in the poorest sections of a city also contributed to the notion that certain factors predisposed specific segments of the populace to infection, a convenient way to single out immigrants and other less fortunate members of the community for discriminatory treatment in the battle against disease. Such attitudes also contributed to the slow growth of the public health movement in the United States, because cleaning up the squalor in the poorest sections of a city could potentially place a significant economic burden on the city government and business community [3]. By associating the disease with a particular class of people, costly measures for improving sanitation could be ignored.
Class differences in the availability of and access to water and sanitation facilities persisted in Newark from 1850-1900, with much of the resources for providing water and sanitation services directed towards the central business district and the more affluent neighborhoods [3]. The medical profession slowly realized that "atmospheric poisoning" was not the mechanism for spreading diseases. During the 19th century, medical science was evolving, and notions about the spread and control of disease were being formalized into a coherent public health movement. Edwin Chadwick, as secretary of the British Poor Law Commission, reported to Parliament on the social and environmental conditions of poor health in 1848. The influence of his report was so great that Chadwick is credited with initiating the "Great Sanitary Awakening" [9]. In 1854, John Snow, a London anesthesiologist, determined that a contaminated water supply led to the deaths of 500 people when he traced the source of the infected water to a single common well. Snow had argued earlier that cholera was spread by a "poison" that attacked the intestines and was carried in human waste. His study of the 500 deaths caused by the contaminated well provided strong support for his theory. The first recorded study in the United States of the spread of disease by contaminated water was conducted by Austin Flint in 1855. In that study he established that the source of a typhoid epidemic in North Boston, NY, was a contaminated well [9].
During the 1860s, the miasmic theory of disease was slowly displaced by the germ theory. Louis Pasteur established the role of microorganisms in fermentation. At about the same time, Joseph Lister introduced antiseptic methods in surgery. These changes in medicine were due to an increasing awareness of the link between microorganisms and disease and the role of sterilization and cleanliness in preventing the spread of disease. The germ theory of disease and the role of bacteria in the spread of disease were established in the 1880s by Robert Koch. His research unequivocally demonstrated that bacteria were the cause of many types of disease. By firmly establishing the link between bacteria and disease, germ theory provided the basis for understanding the link between contaminated water and health. Once it was discovered that disease could spread as a result of groundwater seepage and sewage leachate, uncontrolled dumping of waste was no longer acceptable. To eliminate water contamination, potable water had to be separated from sewage. Such an objective required that an extensive, coordinated sewer system be constructed to provide for the collection of waste products at the source. Water and drainage systems did exist in some communities and wealthy households in the mid-18th century (e.g., [10]), but, in general, widespread water and sanitation systems were slow to develop. One major force behind the development of sanitation systems was the rapid growth of cities in the years leading up to the Civil War. For example, Newark became one of the nation's largest industrial cities when its population grew from 11,000 to 246,000 during the period 1830-1900. In the three decades following 1830, Newark's population increased by more than 60,000 people. Much of the increase was due to a large influx of Irish and German immigrants.
After 1900, the population of Newark almost doubled again (to 435,000 in 1918) after undergoing a new wave of immigration from eastern and southern Europe [3]. These trends paralleled urban growth in other parts of the nation; the total urban population in the United States, 322,371 in 1800, had grown to more than 6 million by 1860 and exceeded 54 million by 1920 [7]. As a consequence of such rapid growth, the demand for water for household and industrial use also increased. In the United States during the late 1700s, private companies were organized in a number of cities to provide clean water. The companies relied on a system of reservoirs, pumping stations, and aqueducts. For instance, by 1840 Philadelphia had developed one of the best water systems in the nation, delivering more than 1.6 million gallons of water per day to 4,800 customers. The enormous population growth during and after the Civil War (and the concomitant increase in congestion) overwhelmed the capacity of these early systems and outstripped the private sector's ability to meet the demands for water and sanitation. Worsening conditions finally forced the issue, resulting in large-scale public works projects to meet the demand for water. In 1842, for example, the Croton Aqueduct was completed to provide New York City with a potential capacity of 42 million gallons of water per day. That system became obsolete by the early 1880s, when demand approached 370 million gallons per day for 3.5 million people and a more extensive municipal system had to be built using public funds [1].
The abundance of water allowed for the development of indoor plumbing, thereby increasing the volume of waste water. In the late 1800s, most of the increase in domestic water consumption was due to the installation of bathroom fixtures [3]. In Chicago, per capita water consumption increased from 33 gallons per day in 1856 to 144 in 1872. By 1880, approximately one-fourth of all Chicago households had water closets [7]. Unfortunately, early sewage systems were constructed without allowance for human and household wastes. As demand increased and both residential and industrial effluent was being diverted into storm drainage systems, it became clear that municipalities would have to build separate sewers to accommodate household wastes and adopt the "water-carriage" system of waste removal (rather than rely on cesspools and privy vaults). The first separate sewer system was built in Memphis, TN, in 1880. Because of the enormous economic requirements of such large-scale sewer systems, cities assumed the responsibility for building sewer systems and embarked on some of the largest construction projects of the 19th century [1], [7]. In Newark, more than 200 miles of sewers were constructed during the period 1894-1920 at a cost exceeding several million dollars, providing service to 95 percent of the improved areas of the city [3]. The technology to treat sewage effectively (other than filtering and irrigation), however, would not be widely available for another 10 years.
69.7 Changing Attitudes About Health and Hygiene
The major source of the increase in household water consumption was personal hygiene. Accompanying the efforts to curb the spread of disease was a renewed interest in bathing. Although a popular activity among the Greeks and Romans, the frequency of bathing and an emphasis on cleanliness have ebbed and flowed throughout history. By the Dark Ages, cleanliness had fallen out of favor, only to become acceptable once again during the time of the Crusades. Through the 16th and 17th centuries bathing was rare, except among members of the upper class, and then only on an infrequent basis. By the first half of the 19th century, bathing was largely a matter of appearance, and daily sponge bathing was uncommon in middle-class American homes until the mid-1800s [5]. As awareness of the causes of disease increased at the end of the 19th century, cleanliness became a means to prevent the spread of disease. Bathing and personal hygiene to prevent disease, state Lupton and Miller [5], "was aggressively promoted by health reformers, journalists, and the manufacturers of personal care products." Thus, hygienic products and the popular press became important mechanisms for defining the importance of cleanliness for Americans in the latter half of the 19th century. In this period of technological development, social forces significantly influenced the construction, adaptation, use, and acceptance of technologies related to hygiene. The importance of cleanliness and the appropriate solutions to the hygiene problem were defined for people by specific agents of change, namely, health professionals, journalists, and commercial interests. Thus, the meaning of and need for health care products and home
sanitary systems were defined by a number of influential social groups. In the history of technology literature, this is referred to as social constructionism, where the meaning of a technology (especially in terms of its utility or value) is defined through the interaction of relevant social groups [2]. In the social constructionist's view, technological development is cast in terms of the problems relevant for each social group, with progress determined by the influence exerted by a particular social group to resolve a specific problem. For instance, health professionals were instrumental in defining the need for cleanliness and sanitation, with journalists communicating the message to the middle class in the popular press. Through marketing and advertisements, those groups concerned with producing plumbing and bathroom fixtures significantly influenced the standards (and products) for personal hygiene in the home. By increasing awareness of the need for a clean and sanitary home and defining the dominant attitudes towards health standards, these influential social forces defined for Americans the importance of the components of a bathroom [5]. With the arrival of indoor plumbing, both hot and cold water could be piped into the home. Previous to the introduction of indoor plumbing, appliances for bathing and defecating were portable and similar in appearance to furniture (to disguise their purpose when placed in a room). The fixed nature of pipes and the use of running water required that the formerly portable equipment become stationary. To minimize the cost of installing pipes and drains, the bath, basin, and commode were placed in a single room; consequently, the "bathroom" became a central place in the home for bathing and the elimination of waste.
Early designs of bathroom furniture retained the wood construction of the portable units, but the desire for easy cleaning and a nonporous surface that would not collect dirt and grime led to the use of metal and china. The bathroom became, in effect, a reflection of the modern desire for an antiseptic approach to personal hygiene. Porcelain-lined tubs, china toilets, and tiled floors and walls (typically white in color) emphasized the clinical design of the bathroom and the ease with which dirt could be found and removed. As Lupton and Miller [5] point out, the popular press and personal hygiene guides of the period compared the design of the modern bathroom to a hospital. The cleanliness of the bathroom became a measure of household standards for hygiene and sanitation. By the late 1880s, indoor plumbing and bathrooms were a standard feature in homes and a prerequisite for a middle-class lifestyle.
69.8 The Indoor Toilet
The growing market in bathroom fixtures stimulated significant technological development. Before 1860, there were very few U.S. patents filed for water closets and related toilet components (the Ryan and Flanagan design shown in Figure 69.6 is one of only three patents for water closets filed in the period 1847-1855). Once the toilet moved into the home and became a permanent fixture in the bathroom, the technology rapidly developed. The primary concern in the development of the technology was the
amount of water used by the toilet and its subsequent impact on the water supply and sewage collection and treatment systems. The fact that water could be piped directly to the cistern of the water closet (previous designs of water closets had required that they be filled by hand) necessitated some mechanism to control the flow of water to the toilet and minimize the impact of toilet use on the water supply. In addition, there was a more systemwide concern to provide some automatic way to control the amount of water used by the toilet and discharged into the sewerage system (thereby eliminating the unacceptable continuous-flow method of flushing used by the Greeks and Romans). As a consequence, much effort was devoted to developing mechanisms for controlling water flow into and out of the toilet. During the period 1870-1920, the number of U.S. patents filed for toilet technology rapidly escalated. In the 1880s, more than 180 U.S. patents were issued for water closets. For many of the inventors concerned with the design of toilet technology, the flush and float valves became the center of attention. In 1879, William Ross of Glasgow, Scotland, filed the U.S. Patent "Improvement in Water-Closet Cisterns" shown in Figure 69.7.
In the patent documents, Ross put forward an eloquent statement of the design objectives for the valves in the cistern: "The object I have in view is to provide cisterns for supplying water to water-closets, urinals, and other vessels with valves for controlling the inlet and outlet of water, the valves for admitting water to such cisterns being balanced, so that they can be operated by small floats, and also being of simple, light, and durable construction, while the valves for governing the outflow of water from such cisterns allow a certain amount of water only to escape, and thus prevent waste, and are more simple and efficient in their action than those used before for the same purpose, and are portable and self-contained, not requiring outer guiding cylinders or frames, as heretofore." It is interesting to note that the basic design put forward by Ross for the float and flush valves has changed little in the past 116 years. By 1880 the basic design of the toilet, as we know it today, was well established (as shown in Figures 69.1 and 69.2). A cistern with flush and float valves and a basin using a water trap are clearly evident. Some of the designs were quite complex. The patent filed in 1897 by David S. Wallace of the Denver, CO, Wallace Plumbing Improvement Co. depicts a design (shown in Figure 69.8) that would allow both manual and automatic control of the open and close cycle of the flush valve. Such an approach would allow the toilet user to adjust the amount of water to be used during the flushing cycle (a concept that would be resurrected almost 100 years later as a water-saving device). Not all inventors chose to use the same approach to the flush valve. David Craig and Henry Conley proposed a quite different solution in 1895 (Figure 69.9). In operation, the valve was quite simple, using a ball to close the water outlet from the toilet tank.
The flushing operation is initiated by pulling on the cord, which raises the ball inside the tube d', allowing the ball to roll to the f' end of the tube and the tank to empty. The ball then rolls slowly back down the tube, settling into the collar c and closing the water outlet to the toilet bowl. This approach to control was not widely adopted. Throughout the history of the development of toilet control
Figure 69.7 U.S. Patent drawing of the flush and float valves, including drawings of the operation of the flush valve, for a toilet tank. Source: U.S. Patent No. 211,260 filed by W. Ross, Jan. 7, 1879.
technology, much of the technological effort was directed towards improving the mechanisms for assuring correct closure of the flush valve. More recently, however, concerns about conservation of water have motivated technologists to consider new designs that increase the effectiveness of the flush valve. The design by Olson (Figure 69.3) is a good example of recent concerns in the development of control technologies for the toilet. The design allows the user to control the amount of water used during the flushing cycle. Characterized as a water-saving system that uses only several pints of water rather than the several gallons of a conventional toilet, the actual amount of water used during the flushing cycle is actively controlled by the user. Recall that this is similar in concept to the approach taken by Wallace in 1897 (Figure 69.8). Notice, too, that the approach to control remains the same; once the flushing cycle has been initiated, the level of water in the toilet tank is controlled by the float valve.
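The float-valve behavior described above is a simple on-off (bang-bang) feedback loop: the flush valve empties the tank, then the float valve admits water until the level reaches a set point and shuts the inflow off. The short simulation below is an illustrative sketch only; the function name, tank capacity, and flow rates are hypothetical values chosen for clarity, not figures taken from any patent discussed in this chapter.

```python
def simulate_flush_cycle(capacity=6.0, inflow=0.2, outflow=0.5, dt=1.0):
    """Simulate one flush cycle of a toilet tank with on-off float-valve control.

    capacity -- set-point water level (hypothetical units)
    inflow   -- refill rate while the float valve is open
    outflow  -- drain rate while the flush valve is open
    """
    level = capacity      # tank starts full
    flush_open = True     # user has just initiated the flush
    history = []
    for _ in range(200):  # discrete time steps
        if flush_open:
            # flush valve open: water drains from tank to bowl
            level = max(0.0, level - outflow * dt)
            if level == 0.0:
                flush_open = False  # flapper/ball reseats once the tank empties
        elif level < capacity:
            # float valve open: inflow continues while level is below set point
            level = min(capacity, level + inflow * dt)
        history.append(level)
        if not flush_open and level >= capacity:
            break  # float valve has closed the inflow; cycle complete
    return history

levels = simulate_flush_cycle()
print(f"cycle length: {len(levels)} steps, final level: {levels[-1]:.1f}")
```

The key design point mirrors the text: the user only initiates the cycle; the water level itself, sensed by the float, decides when refilling stops, which is what distinguishes these cisterns from the earlier continuous-flow flushing.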
Figure 69.8 U.S. Patent drawing of the flush and float valves that allows both manual and automatic control of the open and close cycle of the flush valve. Source: U.S. Patent No. 577,899 filed by D.S. Wallace, Mar. 2, 1897.

69.9 Historical Approaches to Systems
It should be clear that understanding the historical development of toilet technology requires understanding the historical development of an entire system of sanitation. Historians have recognized for some time that the history of technology must be considered in terms of entire systems, and considerable recent effort has been devoted to understanding the ways in which systems, and the technology required to support them, develop over time. The Thomas Hughes [4] model for the historical development of systems has been especially influential. Hughes extensively studied the development of electric power networks in the period 1880-1930 and, as a result of that research, constructed a four-phase model of the historical development of large-scale distributed systems. Each phase of development is defined by its own characteristics and a group of professionals who play a dominant role in developing the technologies employed in that phase. In the first phase, invention predominates, with significant technical effort and intellectual resources devoted to developing the technologies comprising the system. The second phase is primarily concerned with the transfer of technologies from one region and society to another. Although inventors still play a predominant role in this phase, entrepreneurs and financiers become increasingly important to the growth and survival of the
Figure 69.9 U.S. Patent drawing of a variation on the flush valve for the toilet tank. The ball labeled d provides the means for closing and sealing the outlet to the toilet basin during the flush cycle. Source: U.S. Patent No. 543,570 filed by D. Craig and H. Conley, Jul. 30, 1895.
system. In the third phase of the model, regional systems grow into national systems and the scalability of regional technologies becomes a major concern. It is in this phase that Hughes links system growth and development to reverse salients and critical problems. Borrowed from military terminology, a reverse salient describes a situation where a section of an advancing line (in this context, the development of some aspect of system technology) is slowed or halted while the remaining sections continue movement in the expected direction. As Hughes [4] states, "A reverse salient appears in an expanding system when a component of the system does not march along harmoniously with other components" and reveals "imbalances or bottlenecks within the system. The imbalances were seen as drags on the movement of the system towards its goals, especially those of lower costs or larger size."
A reverse salient has two major effects on technological development: the growth of technology slows, and action must be taken for sustained growth to continue. It is the occurrence of a reverse salient, and the effort expended to eliminate that bottleneck, which, according to Hughes, captures the essence of technological development. Innovations in technology occur because a reverse salient and the imbalance in system development motivate inventors, industrial scientists, and engineers to concentrate their efforts on finding solutions to those problems. The imbalance defines the problem for the engineers, and the removal of that problem leads to innovations in technology, management policy, and methods of control. Hughes cites a number of examples of inventions and innovations in electrical technologies to illustrate the concept of a reverse salient. One such example is the complex information and control networks established in the 1920s to collect data, monitor plant performance, and control plant output. These networks developed in response to an increase in demand for electricity and a need for better scheduling of plant utilization. In the fourth phase of the model, the system develops substantial momentum with "mass, velocity, and direction." Much of that momentum can be characterized by the significant investment in capital equipment and infrastructure necessary to maintain the system. At this stage of development, the system's "culture" develops, defined by the professionals, corporate entities, government agencies, investors, and workers whose existence depends upon developing and perpetuating the system. The development of a comprehensive water and sanitation system (with the toilet as an integral part of it) fits well into Hughes' model. Once the need for a system to meet the demand for potable water and provide for the collection and disposal of human waste was identified, rapid technological development occurred. The necessary components of the system were developed and employed in building regional water supply and sewerage systems during the last half of the 19th century (phases one and two of the model).
During that period there was considerable transfer of knowledge about requisite sanitation methods from one community to another and among the various medical, engineering, and public health professionals concerned with sanitation policy and the construction of water supply and sewerage systems. As cities grew and demand outstripped capacity, much of the effort was directed towards transforming small-scale private systems into the large-scale municipal systems that could accommodate the requirements of rapidly expanding urban populations (phase three). Because the means for supplying water typically preceded systems of disposal, the collection and disposal of household and industrial waste products were a continual source of problems. When indoor plumbing provided the means to distribute water throughout the home, the market for sinks, baths, and toilets blossomed and the rate of domestic water consumption increased dramatically. In general, water supply preceded waste disposal, so that an increase in the ability to supply water resulted in significantly greater volumes of waste water than could be handled by existing sewer systems. The increase in waste water created a reverse salient that could not be ignored and required a massive municipal response to construct the necessary collection and disposal system. The ability to pipe water directly to bathroom fixtures also created a reverse salient that was directly related to the control problem in the toilet tank. Both the amount of water flowing into the toilet
and the amount of water used by the toilet during flushing had to be controlled to minimize the impact on the water supply and sewer systems (in fact, the control technology in the toilet tank became the means for linking the two systems). As a result, significant effort was expended in the two decades following 1875 to refine the operation of the toilet flush and float valves, especially to minimize the potential for incomplete valve closure and resultant water leakage. As we have seen, many designs were proposed, but the solutions have remained fairly consistent through the years. Once work began on the basic infrastructure for supplying water and disposing of wastes, the development of sanitation systems was well into Hughes' fourth stage. The increase in the ability to supply water created demand for water and bathroom technologies and the subsequent need for waste water disposal. A public health movement and public works facilities and services grew out of the concern for clean water and sanitary living conditions. Underlying the desire for clean water and sanitary conditions were changing attitudes towards health, cleanliness, and the causes of disease. Once urban congestion reached a point where the health and well-being of people living in cities was threatened by disease, change occurred and new norms for cleanliness and standards of living were adopted. One of the major technological developments arising from that period of change was the modern toilet.
69.10 Summary

The goal of this chapter was to place the development of a control-based technology in its historical context. In the process of attaining that goal, an attempt has been made to show how the development of such a technology, even in its simplest form, has many dimensions. The toilet and the basic control technology that resides in its water tank found their way into the home as the result of a number of important social, political, economic, and health reasons. The chronology shown in Figure 69.10 captures the multitude of events that occurred between 1830 and 1905 in the areas of disease, health, medicine, public works, and toilet technology. The chronology focuses on events in Newark, NJ during that period. Prior to 1832 and the first cholera epidemic in Newark, there was little activity in the areas of public health or public works. Until the community was motivated to do something about the appalling conditions, no significant efforts were made to supply clean water and effectively remove waste products. As far as toilet technology was concerned, there was very little development prior to 1854. The development of control technology for the toilet rapidly followed the development of water supply and sewer systems, so that, by the 1880s, considerable progress had been made in controlling water flow in the toilet tank. There is no one single event that can be identified as the motivating factor for putting control technologies into the toilet. Clearly, the availability of an adequate water supply, indoor plumbing, and a sewer system for carrying away waste were prerequisites for developing water flow control in the toilet tank.
THE CONTROL HANDBOOK

Figure 69.10 Chronology of events in Newark, NJ between 1830 and 1905 in the areas of disease, health, medicine, public works, and toilet technology.
However, those factors were not sufficient motivation for moving the toilet into the home, nor adequate stimulus for the development of toilet technology. The threat of disease, changing attitudes towards health and sanitation, and a redefinition of the requirements of a modern home also contributed to the demand for bathroom technology. History has shown us how the development of a control technology, as well as the attitudes towards health, cleanliness, and standards for a middle class home, can be influenced by a number of strong, interacting forces. As Hughes' model points out, however, once development attained momentum, there was no stopping toilet technology.
References

[1] Armstrong, E.L., History of Public Works in the United States 1776-1976, American Public Works Association, Chicago, IL, 1976.
[2] Bijker, W.E., Hughes, T.P., and Pinch, T.J., The Social Construction of Technological Systems, The MIT Press, Cambridge, MA, 1987.
[3] Galishoff, S., Newark: The Nation's Unhealthiest City, 1832-1895, Rutgers University Press, New Brunswick, NJ, 1975.
[4] Hughes, T.P., Networks of Power: Electrification in Western Society, 1880-1930, The Johns Hopkins University Press, Baltimore, MD, 1983.
[5] Lupton, E. and Miller, J.A., The Bathroom, The Kitchen, and the Aesthetics of Waste, MIT List Visual Arts Center, Cambridge, MA, 1992.
[6] Mayr, O., The Origins of Feedback Control, MIT Press, Cambridge, MA, 1969.
[7] Melosi, M.V., Pollution and Reform in American Cities, 1870-1930, University of Texas Press, Austin, TX, 1980.
[8] Reynolds, R., Cleanliness and Godliness, Doubleday, New York, 1946.
[9] Winslow, C.E.A., Man and Epidemics, Princeton University Press, Princeton, NJ, 1952.
[10] Wright, L., Clean and Decent, Revised ed., Routledge & Kegan Paul, London, 1980.
Temperature Control in Large Buildings
Clifford C. Federspiel, Johnson Controls, Inc., Milwaukee, WI
John E. Seem, Johnson Controls, Inc., Milwaukee, WI
70.1 Introduction .... 1191
70.2 System and Component Description .... 1191
    A Prototypical System · Components · Controls
70.3 Control Performance Specifications .... 1196
    Time-Domain Specifications · Performance Indices
70.4 Local-Loop Control .... 1197
    Mathematical Models · Feedback Control Algorithms · Tuning
70.5 Logical Control .... 1199
    Discrete-State Devices · Modulating Devices
70.6 Energy-Efficient Control .... 1200
    Economizer Cooling · Electrical Demand Limiting · Predicting Return Time from Night or Weekend Setback · Thermal Storage · Setpoint Resetting
70.7 Configuring and Programming .... 1203
    Textual Programming · Graphical Programming · Question-and-Answer Sessions
References .... 1203
70.1 Introduction

Heating, ventilating, and air-conditioning (HVAC) systems in large buildings are large-scale processes consisting of interconnected electromechanical and thermo-fluid subsystems. The primary function of HVAC systems in large commercial buildings is to maintain a comfortable indoor environment and good air quality for occupants under all anticipated conditions with low operational costs and high reliability. The importance of well-designed controls for HVAC systems becomes apparent when one considers the impact of HVAC systems on the economy, the environment, and the health of building occupants. One of the most significant operational costs of HVAC systems is energy. There is a direct cost associated with purchasing energy, and an indirect environmental cost of generating energy. It has been estimated that HVAC systems consume 18% of the total energy used in the U.S. each year [1]. Energy utilization in buildings is strongly influenced by the HVAC control systems, so the potential impact of HVAC controls on the economy and the environment is significant. In addition to the impact on national energy consumption and the environment, HVAC systems have an impact on human health because people spend more than 90% of their lives indoors. Based on data from residential and commercial buildings, the Environmental Protection Agency (EPA) has placed indoor radon first and indoor air pollution fourth on a list of 31 environmental problems that pose cancer risks to humans [2].
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
Additionally, indoor air pollution other than radon was assessed as a high noncancer risk. Therefore, HVAC control systems can have a significant impact on human health because they can play a central role in controlling indoor air pollution. In this chapter, building temperature controls and the associated flow and pressure controls are described. The emphasis of the chapter is on the current practice of control system design for large commercial buildings. Therefore, the controls described in this chapter may not reflect those for residential buildings or industrial buildings.
70.2 System and Component Description

Before one can design a control system for a process, it is necessary to understand how the process works. In this section, a prototypical HVAC system for a large building is described. Important features of subsystems that affect the control system design and performance are also described. Also, a description of the control system design is provided.
70.2.1 A Prototypical System

In large buildings, there is typically a central plant that supplies chilled and hot water or steam to a number of heating and cooling coils in air-handling units, terminal units, and perimeter heating coils. The central plant uses one or more boilers, chillers, and cooling towers to perform this task. Primary and secondary pumping systems such as shown in Figure 70.1 may be used to lower the pumping power requirements. Chillers generally require a constant water flow rate, so the primary pumps are run at constant speed. The speed of the secondary pumps may be modulated to reduce pumping costs.

Figure 70.1 Schematic diagram of a piping distribution system. The primary system is constant-flow while the secondary systems are variable-flow.

Air-handling units distribute conditioned air to a number of zones; a zone is a volume within the building that is distinct from other volumes. Often, but not always, physical partitions, walls, or floors separate zones. In large buildings, air-handling units typically supply cool air to all zones during all seasons. Core zones in buildings must be cooled even when the outdoor temperature is well below freezing because of the large number of heat sources such as lights, equipment, and people, and because core zones are insulated from weather conditions by perimeter zones. As Figure 70.2 shows, air-handling units have a number of discrete components. Not all of the components shown in the figure are found in every air-handling unit, and some air-handling units may contain additional components not shown in the figure.

Figure 70.2 Schematic diagram of an air-handling unit.

There are two basic types of air-handling units: constant-air-volume (CAV) units and variable-air-volume (VAV) units. In CAV air-handling units, the supply and return fan capacities are constant, while in VAV air-handling units, the fan capacities are modulated.

Terminal units, such as the one shown in Figure 70.3, are used to control the temperature in each zone. Terminal units are typically installed in the supply air duct for each zone, where they are used to modulate the flow rate and/or the temperature of air supplied to a zone.

Figure 70.3 Schematic diagram of one type of VAV terminal unit. The zone is cooled by modulating the air flow rate, and the zone is heated by modulating the valve on the reheat coil.
70.2.2 Components

An HVAC system consists of a number of interconnected components that each have an impact on the behavior of the system and on the ability of the control system to affect that behavior. Components that have a key influence on the control system design are described next.
Actuators

Actuators are used to modulate the flow through dampers and valves. The two most common types of actuators are electric motors and pneumatic actuators. Most of the electric motors used in HVAC controls are synchronous ac motors, some of which are stepper motors. To reduce motor power consumption and cost, gear ratios are used such that the load reflected onto the motor through the gear reduction is small. Pneumatic actuators are usually linear-motion devices consisting of a spring-loaded piston in a cylinder. Air pressure in the cylinder pushes against the spring and any other forces acting on the piston. The spring returns the piston to the "normal" position in case of a loss of pressure. Pressure is supplied to the piston from a main, typically at a pressure of 20 psi. Typical hardware for throttling the main pressure is a dual set of solenoid valves, an electric-to-pneumatic transducer, or a pilot positioner. Pneumatic actuators are prone to substantial friction and hysteresis nonlinearities. A detailed description of pneumatic actuators can be found in [3].
Heat Exchangers

It is important to understand the steady-state behavior of heat exchangers because it affects the process gain. There are different ways to mathematically model the steady-state behavior, including the log mean temperature difference (LMTD) method and the effectiveness-number of heat transfer units (ε-NTU) method. These relations are used to design and size heat exchangers. For control system design, one needs to model and understand the relationship between the heat transfer rate and the controlled flow rate. The steady-state characteristics of heat exchangers can be derived from the ε-NTU relations, which can be found in most texts on heat transfer (e.g., [4]). For example, consider a cross-flow, water-to-air heat exchanger with unmixed flow on both sides in which the air flow rate is constant, the water flow rate is modulated, and the water is heating the air. For such a heat exchanger, the ε-NTU relation is [4]

ε = 1 − exp[(NTU^0.22 / C_r)(exp(−C_r NTU^0.78) − 1)]    (70.1)
where ε is the heat exchanger effectiveness, C_r is the ratio of the minimum to maximum heat transfer capacity rates, and NTU is the number of heat transfer units. All three of these quantities are nondimensional. Mathematically, they are defined as

ε ≡ q / q_max    (70.2)
q_max = C_min (T_w,i − T_a,i)    (70.3)
C_min = min[C_a, C_w]    (70.4)
C_max = max[C_a, C_w]    (70.5)
C_r = C_min / C_max    (70.6)
C_a = m_a c_p,a    (70.7)
C_w = m_w c_p,w    (70.8)
NTU = U A / C_min    (70.9)

where q denotes the heat transfer rate in units of energy per unit time, q_max denotes the maximum achievable heat transfer rate, C_min is the minimum heat transfer capacity rate of the two fluids in units of energy per unit time per unit temperature, T_w,i is the temperature of the water entering the heat exchanger, T_a,i is the temperature of the air upstream of the heat exchanger, C_max is the maximum heat transfer capacity rate of the two fluids, C_a is the heat transfer capacity rate of the air, C_w is the heat transfer capacity rate of the water, m_a is the mass flow rate of the air, c_p,a is the specific heat at constant pressure of the air in units of energy per unit mass per unit temperature, m_w is the mass flow rate of the water, c_p,w is the specific heat at constant pressure of the water, U is the heat transfer coefficient at the surface of the heat exchanger in units of energy per unit time per unit temperature per unit area, and A is the area of the heat exchanger (U and A must correspond to the same surface). The steady-state relation between the water flow rate and the heat transfer is

q = q_max ε    (70.10)

Figure 70.4 shows the characteristic of Equation 70.10 normalized by the heat transfer at the maximum water flow rate when NTU_max = 2, m_a = 5.8 kg/s (10000 ft³/min), and m_w,max = 3 kg/s. The large slope at low flow and small slope at high flow is typical of heat exchanger characteristics. This fact is important when control valves are coupled with heat exchangers, and it will be discussed in more detail later. The dynamic behavior of heat exchangers is complex and may be difficult to model accurately because heat exchangers are nonlinear, distributed-parameter subsystems. Qualitatively, a heat exchanger introduces a phase lag and time delay into the overall system. This lag and delay vary as the flow rate through the heat exchanger is modulated.

Figure 70.4 Heat exchanger characteristic for a cross-flow heat exchanger with unmixed flow in both fluids (horizontal axis: fraction of full flow, f).
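The concave characteristic in Figure 70.4 can be sketched numerically from Equations 70.1-70.10. The sizing values below (UA chosen so that NTU = 2 at full water flow, and the capacity rates) are hypothetical stand-ins for illustration, not the exact parameters behind the figure:

```python
import math

def effectiveness(ntu, cr):
    """epsilon-NTU relation for a cross-flow exchanger, both fluids unmixed (Eq. 70.1)."""
    if cr < 1e-12:  # limiting case: one capacity rate much larger than the other
        return 1.0 - math.exp(-ntu)
    return 1.0 - math.exp((ntu ** 0.22 / cr) * (math.exp(-cr * ntu ** 0.78) - 1.0))

def heat_transfer(f, ua, c_air, c_w_full):
    """Heat transfer per unit inlet temperature difference at water-flow fraction f."""
    c_w = max(f, 1e-9) * c_w_full                    # water capacity rate scales with flow
    c_min, c_max = min(c_air, c_w), max(c_air, c_w)  # Eqs. 70.4-70.5
    eps = effectiveness(ua / c_min, c_min / c_max)   # Eqs. 70.9 and 70.6
    return eps * c_min                               # q = eps * q_max with the dT factored out

# Hypothetical sizing: NTU = 2 at full water flow, as in the Figure 70.4 discussion.
c_air, c_w_full = 5.8, 3.0          # capacity rates in consistent units
ua = 2.0 * min(c_air, c_w_full)
q_full = heat_transfer(1.0, ua, c_air, c_w_full)
for f in (0.1, 0.25, 0.5, 1.0):
    print(f"f = {f:4.2f}   q/q_full = {heat_transfer(f, ua, c_air, c_w_full) / q_full:.2f}")
```

The printed curve rises faster than linearly at low flow and flattens near full flow, which is the large-slope/small-slope behavior described above.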
Dampers and Valves

The flow through a damper or valve can be determined from

Q = f C_v sqrt(Δp_v / g_s)    (70.11)

where Q is the flow rate, f is the flow characteristic of the valve, C_v is the flow coefficient, Δp_v is the pressure drop across the device, and g_s is the specific gravity of the fluid. The flow characteristic is the relation between the position of the damper blades or valve plug and the fraction of full flow. When the pressure difference across the damper or valve is held constant for all positions, the flow characteristic is referred to as the inherent characteristic and is denoted by f_i. The flow coefficient, C_v, is the ratio of the flow rate in the fully open position to the square root of the ratio of the pressure drop to the specific gravity. The C_v increases as the size of the device increases. For water valves in which the flow rate is measured in units of gallons per minute and pressure is measured in units of pounds per square inch, typical values for the C_v are 1 for a ½-inch valve, 37 for a 2½-inch valve, and 150 for a 4-inch valve.

Figure 70.5 shows an example of the inherent characteristic of a damper or valve. The quantity f_i is the fraction of full flow, and L is the fraction of the fully open position of the damper blades or valve stem. The important features of a damper's inherent characteristic are (1) the flow rate at the closed position (dampers leak, although seals may be used to reduce leakage) and (2) the shape of the inherent characteristic between the fully closed and fully open positions. Parallel-blade dampers have a faster-opening inherent characteristic than opposed-blade dampers.

Figure 70.5 Inherent characteristic of a damper or valve.

Like dampers, one of the important features of a valve's inherent characteristic is the behavior when the lift is nearly zero. Unlike dampers, many valves do not leak when fully closed. However, when the lift is nearly zero, the inherent characteristic changes dramatically as shown in Figure 70.5. When L < L_min, the inherent characteristic is dependent on the valve construction, seat material, manufacturing tolerances, etc. The valve may shut off quickly and completely as shown in the figure, or it may shut off quickly but not completely. If the valve seat has rubberized seals, then L_min will be larger than without rubberized seals. All actuation devices have a limited resolution for positioning the valve stem. Pneumatic actuators, which are commonly used in HVAC systems, have low resolution because they are prone to stiction and hysteresis. Although the characteristic of the valve may be a continuous function of the lift for flow rates less than f_max, the limited resolution of the valve stem positioning implies that the characteristic may be accurately modeled as a discontinuity at zero lift. The magnitude of the discontinuity is characterized by the rangeability of the valve. The inherent rangeability of a valve is defined as the ratio of the maximum to the minimum controllable flow. Mathematically, it is

R_i = 1 / f_i,min    (70.12)

where f_i,min is the smallest controllable fraction of full flow. Control valves in HVAC systems typically have an inherent rangeability between 20 and 50.

The other important feature of a valve's inherent characteristic is the relationship between L and f_i for L_min ≤ L ≤ 1. Valves are often designed to have one of the following inherent characteristics: quick-opening, linear, or equal-percentage. Assuming the inherent characteristic is discontinuous at L = 0 (i.e., L_min = 0) due to the limited resolution of the lift, the inherent characteristic of a linear valve is

f_i = L    (70.13)

and the inherent characteristic of an equal-percentage valve is

f_i = R_i^(L − 1)    (70.14)

In most systems, the pressure drop across a damper or valve varies with the flow rate because of additional pressure losses in ducts or pipes. The authority of a damper or valve is defined to be

a = Δp_v / Δp_s    (70.15)

where Δp_v is the pressure drop across the damper or valve at the fully open position and Δp_s is the total system pressure drop (e.g., the combined pressure drop across a valve, piping, and heat exchanger). The relationship between the lift and the fraction of full flow when a damper or valve is installed in a system is called the installed characteristic. The installed characteristic is dependent on both the lift and the authority. Mathematically, the installed characteristic is

f = f_i / sqrt(a + (1 − a) f_i²)    (70.16)

Figure 70.6 shows the installed characteristics of equal-percentage valves for several different values of authority, assuming that the inherent characteristic is discontinuous when the lift is zero. When the authority is unity, the installed characteristic is the same as the inherent characteristic. Note that the installed value of f_min is dependent on the authority. This means that the rangeability of an installed valve or damper is not the same as the inherent rangeability.

Figure 70.6 Installed characteristics of equal-percentage valves with R_i = 25.

The installed rangeability is the ratio of the maximum installed flow to the minimum controllable installed flow and is mathematically defined as

R_installed = 1 / f_min,installed    (70.17)

where f_min,installed is the smallest controllable fraction of the maximum installed flow. For an equal-percentage valve, the installed rangeability is

R_installed = sqrt(a R_i² + (1 − a))    (70.18)

See [5] for complete details on valve characteristics and terminology.

Valves need to be properly sized to ensure proper operation. The size of the valve affects the authority of the valve; oversized valves have a low authority. Undersized valves cannot deliver sufficient flow rates to meet the maximum load conditions, but oversized valves require less pumping power. In [6] it is recommended that valves be sized so that the authority is between 25 and 33%. In the HVAC industry, oversized valves are more prevalent than undersized valves because large safety factors are used to ensure adequate capacity.

The inherent rangeability and the authority affect flow control loops at low flow rates because it is impossible to smoothly modulate the flow rate below the value of f_min,installed. Low authority and/or low rangeability make this problem worse. Problems caused by low authority and low rangeability are worse when valves are coupled with heat exchangers. Consider the equal-percentage characteristic such as shown in Figure 70.6. When an equal-percentage valve is coupled with the cross-flow heat exchanger described in Section 70.2.2, the resulting steady-state characteristic of the valve and heat exchanger combination is as shown in Figure 70.7 for two different combinations of authority and inherent rangeability. One can define the heat transfer rangeability for a valve and heat exchanger combination as the ratio of the maximum to minimum controllable heat transfer at steady state. Mathematically, the heat transfer rangeability is

R_q = q(f_max) / q(f_min,installed)    (70.19)

Note that the value of R_q corresponding to a = 0.1 and R_i = 25 is 2.2. The implication of the heat transfer rangeability is that a heating or cooling load less than q_min cannot be matched without cycling. Since the steady-state characteristic of heat exchangers has a large slope at low flow rates, the value of R_q will be very low, and cycling will occur unless the inherent rangeability of the valve and the authority of the valve are high.
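The interaction between the inherent characteristic, the authority, and the installed rangeability can be sketched numerically. The helper below assumes the equal-percentage law f_i = R_i^(L−1) and the installed-characteristic relation of Equation 70.16; the specific values (R_i = 25, a = 0.1) mirror the example in the text:

```python
import math

R_I = 25.0   # inherent rangeability; HVAC control valves are typically 20-50

def f_inherent(lift):
    """Equal-percentage inherent characteristic (Eq. 70.14), discontinuous at zero lift."""
    return 0.0 if lift <= 0.0 else R_I ** (lift - 1.0)

def f_installed(lift, authority):
    """Installed characteristic (Eq. 70.16) for authority a = dp_valve / dp_system."""
    fi = f_inherent(lift)
    return fi / math.sqrt(authority + (1.0 - authority) * fi * fi)

# Low authority distorts the characteristic toward quick-opening (cf. Figure 70.6).
for a in (1.0, 0.5, 0.1):
    print(f"authority {a:4.2f}: flow fraction at half lift = {f_installed(0.5, a):.2f}")

# Installed rangeability (Eqs. 70.17-70.18): the smallest controllable installed flow
# corresponds to the smallest controllable inherent flow, f_i,min = 1/R_I.
a = 0.1
fi_min = 1.0 / R_I
f_min_installed = fi_min / math.sqrt(a + (1.0 - a) * fi_min ** 2)
print(f"installed rangeability at a = {a}: {1.0 / f_min_installed:.1f}")  # well below R_I
```

For a = 0.1 the installed rangeability comes out near 8, far below the inherent value of 25, which is why low authority aggravates low-flow cycling.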
Fans and Pumps

Steady-state relations between head, flow rate, speed, and power consumption are nonlinear. The following are idealized similarity relations for pumps moving incompressible fluids [7]:

Q2/Q1 = (w2/w1)(D2/D1)³    (70.20)
H2/H1 = (w2/w1)²(D2/D1)²    (70.21)
P2/P1 = (ρ2/ρ1)(w2/w1)³(D2/D1)⁵    (70.22)

where Q denotes flow rate, w denotes speed, D denotes the characteristic size, H denotes head, P denotes power consumption, and ρ denotes density. These relations have two important consequences for control systems. The first is that the cubic relation between power and speed implies that capacity modulation by speed modulation is more efficient than modulation by throttling; throttling results in only minor reductions in power consumption. The second implication of the similarity relations is that the gain of pressurization loops will be nonlinear due to the quadratic relation between pressure and speed. Since the load on a modulating pump or fan varies with time due to changes in the position of control valves or dampers, the quadratic relation varies with time.
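For a fixed impeller size and fluid density, the similarity relations reduce to the familiar fan/pump affinity laws Q ~ w, H ~ w², P ~ w³. The sketch below uses illustrative numbers to show why speed modulation is so much cheaper than throttling:

```python
def affinity_scale(q, h, p, speed_ratio):
    """Scale flow, head, and power for a fixed-size pump or fan (Eqs. 70.20-70.22
    with D and rho held constant): Q ~ w, H ~ w^2, P ~ w^3."""
    return q * speed_ratio, h * speed_ratio ** 2, p * speed_ratio ** 3

# Illustrative operating point: halving the speed halves the flow
# but cuts the ideal power draw to one eighth.
q2, h2, p2 = affinity_scale(q=100.0, h=60.0, p=8.0, speed_ratio=0.5)
print(q2, h2, p2)  # 50.0 15.0 1.0
```

The cubic power term is the whole story behind variable-speed secondary pumping: a modest turndown in flow yields a disproportionately large reduction in pumping power.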
Figure 70.7 Steady-state installed heat transfer characteristic for an equal-percentage valve with R_i = 25 and a = 0.1 (horizontal axis: lift, L).

Sensors and Transmitters
One feature of sensors that affects control performance is the response time of the sensors. In [6] it is recommended that measurement and transmission time constants be less than one tenth the largest process time constant. In some cases, such as flow control loops, the process dynamics are extremely fast, so it may not be possible to meet this recommendation. Design decisions for HVAC systems and the associated controls are strongly influenced by initial cost or installed cost. Therefore, many buildings have just the minimum number of sensors required for operation. In other words, many buildings are "sensory-starved." It is often possible to reduce operational costs when additional sensors are included in the design. Additional sensory information can be used by control systems to improve energy utilization and to increase diagnostic capabilities.
70.2.3 Controls

HVAC control systems are partially decentralized. Typically there is a separate controller for each subsystem. For example, there may be a controller for the central plant, a controller for each air-handling unit, and a controller for each terminal unit. Each subsystem may have multiple inputs and multiple outputs. The commands to the process inputs are usually generated through a logical coordination of single-input, single-output (SISO) proportional-integral-derivative (PID) algorithms. There may be multiple PID algorithms operating in a controller at one time. Historically, pneumatic devices were used exclusively to compute and execute control commands. Today, many control systems in buildings are still partially or completely pneumatic, but the trend is toward the use of distributed digital controllers. The most modern control systems for HVAC equipment contain a network of digital controllers connected on a communication trunk or bus. There are several different methods for connecting the various controllers. Figure 70.8 shows an architecture that uses peer-to-peer communications. The digital controllers and operator workstations all communicate on the same network. There may be hundreds of digital controllers on a single network. Due to the historic use of pneumatic controls, many of the digital control systems in buildings are programmed to emulate pneumatic controllers. The communications networks are used primarily for alarm reporting, monitoring, and status communications.

Figure 70.8 Communication architecture with peer-to-peer communications. Operator workstations and digital controllers (connected to actuators and sensors) share a common communication trunk.

70.3 Control Performance Specifications

In many engineering design specifications for HVAC systems, the performance of the control system is not rigorously specified. Often there is little reference made to quantifiable measures of performance. In this section, guidelines for certain measures of performance are described. These guidelines are based on common practice and inferences from industry standards. Since most HVAC subsystems are controlled using a decentralized SISO design, this section is focused on control performance specifications for SISO systems.

70.3.1 Time-Domain Specifications

In Chapter 10.1, time-domain specifications for feedback control systems are described in detail. In this section, some time-domain specifications for certain control loops are recommended.

Zone Temperature

For zone temperature control, the steady-state error has a strong impact on the comfort of occupants. In [8] the ASHRAE standard for an acceptable range of zone temperatures is described. If temperatures remain within this range, then 80% of the occupants should be comfortable. The temperature range is 7°F and the center of the range is dependent on the season. The implication of this standard is that the steady-state error for zone temperature control is dependent on the setpoint. If the setpoint is chosen as the center of the ASHRAE acceptable temperature range, then the maximum allowable steady-state error is 3.5°F. It is common practice for building operators to raise the zone temperature setpoints when it is hot outside and lower the setpoints when it is cold outside to conserve energy. If the setpoint is raised or lowered to the limit of the ASHRAE standard, then the steady-state error for the zone temperature controls should be zero.

Flow Control

Flow control loops are cascaded with temperature control loops in pressure-independent VAV boxes and are used in air-handling units to regulate the outdoor air flow rate. Typically, it is expected that flow control loops will be controllable from 5 to 100% of the maximum rated flow. Flow controllers may be operated manually during commissioning or when troubleshooting. During manual operation, the settling time of flow controllers is important because it is expensive to have an operator wait for transients to settle out. A reasonable settling time is twice the stroke time of the actuator. For outdoor air flow control loops and cooling based on the use of cool outdoor air (economizer cooling), overshooting may be a problem when the outdoor air temperature is below freezing. If the controls overshoot, then freeze protection devices may be activated, which will shut down an air-handling unit. The maximum allowable overshoot on an outdoor air flow control or economizer control loop depends on the outdoor air temperature. When the outdoor air temperature is well below freezing, overshoot may be intolerable.

Pressurization Control

For pressure control loops, overshoot can be dangerous. Overshooting pressurization controls can trip pressure relief devices or rupture ducts or pipes. The maximum allowable percent overshoot on a pressurization control loop depends on how close the loop is being controlled to the high limit. For example, if the high limit on a duct pressurization loop is 2.5 inches of water and the setpoint is 2 inches, then the maximum allowable percent overshoot for a safety factor of 5 is 5%. However, if the setpoint is 1 inch, then the maximum allowable percent overshoot for the same safety factor is 30%.
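The worked pressurization numbers follow from a simple margin calculation. The helper below is a hypothetical formalization of that rule: it divides the margin to the high limit by the safety factor and expresses the result as a percentage of the setpoint:

```python
def max_percent_overshoot(high_limit, setpoint, safety_factor):
    """Largest tolerable overshoot, as a percent of setpoint, that keeps the
    peak a factor `safety_factor` inside the margin to the high limit."""
    margin = (high_limit - setpoint) / safety_factor
    return 100.0 * margin / setpoint

# The duct-pressurization examples from the text (all pressures in inches of water):
print(round(max_percent_overshoot(2.5, 2.0, 5), 6))  # 5.0
print(round(max_percent_overshoot(2.5, 1.0, 5), 6))  # 30.0
```

Note how the allowable overshoot grows as the setpoint backs away from the high limit, which is exactly the dependence the specification describes.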
70.3.2 Perfo1:mance Indices It is often easier to specify control performance in terms of timedomain specificationsbecause the physical meaning of the quantities such as percent overshoot and steady-state error are clear and are particularly relevant to certain control problems. However, it is often easier to design a cantrol system to optimize a single index of the control system performance. Three commonly used performance indices for designing single-loop contro~lalgorithms are the integrated absolute error (IAE),the integrated squared error (ISE),and the integrated time absolute error (ITAE).These indices are defined as
For example, one can get a first-order delayed transfer function from Equation 70.26 by letting p2 + co,or one <.anget a secondorder Type 1 system (an integrating system) by letting p2 + 0. The control systems are typically designed based on the worstcase model parameters.
70.4.2 Feedback Control Algorithrns Figure 70.9 shows the basic structure of a feedback controller. The objective of the feedback controller is to maintain the process output at the setpoint. The feedback controller uses the error to determine the control signal. The control signal is used to adjust the manipulated variable.
setpoint
9 4
enor ~
Figure 70.9
ISE
=
Lrn
e2(t)dt
(70.24)
These indices are rarely used to determine parameters of control loops in the field, but are commonly used to design automated tuning algorithms or adaptive control algorithms. They are also used as benchmarks for simulation of new control algorithms. For low-order linear systems, design relations that minimize one of these indices are available for certain types of linear SISO controllers. For more information, see Chapter 10.2.
70.4 Local-Loop Control This section descri6es algorithms for modulating valves, dampers, and other devices to control temperature, pressure, or flow. The emp'hasis is on SISO systems and PID control algorithms.
Controller
~
t
Structure of feedback controller.
According to [3],proportional-only (P) control is used in most pneumatic and older electric systems. The control signal for a P-controller is determined from
where u(t) is the controller output at time t, ū is a constant bias, Kp is the controller gain, and e(t) is the error signal at time t. Equation 70.27 describes the ideal behavior of a P-controller. In an actual application, the actuator usually has upper and lower limits (e.g., a valve has limits of completely open or closed). P-controllers exhibit a steady-state offset if the process is a Type 0 system (i.e., a nonintegrating process) and if the controller output is not biased. The amount of steady-state offset can be reduced by setting the bias value equal to the expected nominal steady-state value of the controller output. The steady-state offset of Type 0 systems can be eliminated by using a proportional plus integral (PI) controller. The time-domain representation of a PI controller is
70.4.1 Mathematical Models Despite the fact that many of the components comprising an HVAC system are most accurately modeled as nonlinear distributed-parameter systems, low-order linear models are used to determine parameters for the local-loop controllers. Many of the most commonly used process transfer function models for designing the local-loop controllers can be represented by the following transfer function:
u(t) = ū + Kp e(t) + KI ∫0^t e(τ) dτ    (70.28)

where KI is the integration gain. There is a disadvantage of integral control action under certain conditions. If the actuator saturates and the error continues to be integrated, then the integral term will become very large and it will take a long time to bring the integral term back to a normal value. Integrating a sustained error while the actuator is saturated is commonly called integral windup or reset windup. Integral windup can occur during start-up, after a large setpoint change, or during a large disturbance in which the process output cannot be controlled to the setpoint. In [6, 9, 10], methods for
THE CONTROL HANDBOOK
reducing the effect of integral windup are described. The feature of reducing the effect of integrator windup is called anti-reset windup. An anti-reset windup strategy will significantly improve the performance of a controller with integral action. For some systems, the performance of a PI controller can be improved by adding a derivative (D) term. The classical equation for a PID control algorithm is

u(t) = ū + Kp e(t) + KI ∫0^t e(τ) dτ + KD de(t)/dt    (70.29)
where KD is the derivative gain. In commercial controllers, one does not use the classical form of the PID control algorithm because of a phenomenon known as derivative kick. If the classical PID controller is used, then the controller output goes through a large change following a step change in the setpoint. This phenomenon is called derivative kick, and it can be eliminated by using

u(t) = ū + Kp e(t) + KI ∫0^t e(τ) dτ − KD dy(t)/dt    (70.30)

where y(t) is the process output.
Equation 70.30 determines the derivative contribution based on the process output rather than the error. Derivative action generally improves the response of control loops in buildings. However, it is not commonly used in HVAC control loops because a three-parameter PID controller is more difficult to tune than a two-parameter PI controller and because derivative action makes the controller more sensitive to noise. Therefore, the additional improvement in the response time of the controller attributed to derivative action is generally outweighed by the additional cost of tuning and by the increased sensitivity to noise. Figure 70.10 shows a block diagram for a practical feedback control algorithm in a digital controller. Next, we describe the purpose of the analog filter, noise spike filter, digital filter, and the deadzone nonlinearity.
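The practical modifications just discussed — actuator limits, anti-reset windup, and derivative action on the process output rather than the error — can be combined in a discrete-time PID sketch. This is a hedged illustration only (the class name, gains, and the conditional-integration style of anti-windup are illustrative choices, not a specific commercial algorithm):

```python
class PID:
    """Positional PID sketch: output clamping, conditional-integration
    anti-reset windup, and derivative on the measurement (Eq. 70.30)."""

    def __init__(self, kp, ki, kd, u_min=0.0, u_max=100.0, bias=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.u_min, self.u_max, self.bias = u_min, u_max, bias
        self.integral = 0.0          # running integral of the error
        self.y_prev = None           # last process output sample

    def update(self, setpoint, y, dt):
        e = setpoint - y
        # Differentiate the measurement, not the error, so that a
        # setpoint step does not produce derivative kick.
        dydt = 0.0 if self.y_prev is None else (y - self.y_prev) / dt
        self.y_prev = y
        u = (self.bias + self.kp * e
             + self.ki * (self.integral + e * dt) - self.kd * dydt)
        if self.u_min <= u <= self.u_max:
            self.integral += e * dt  # integrate only while unsaturated
        return min(self.u_max, max(self.u_min, u))

pid = PID(kp=2.0, ki=0.5, kd=0.1)
u1 = pid.update(100.0, 0.0, 1.0)   # large error: output saturates at 100%
```

Because the integrator is frozen while the output is clamped, a sustained error during saturation does not wind the integral term up.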
the analog filter should be selected based on the sampling period for the A/D converter. In the HVAC industry, a sampling period of 1 second is typical.

Noise Spike Filter Noise spike filters improve the performance of digital control systems by removing outliers in the process output measurement. Outliers can be caused by a brief communication failure, instrument malfunction, electrical noise, or brief power failures. If noise spike filters are not used, then an outlier in the process output measurement will introduce a large disturbance into the control system. Also, if integration is used in the feedback controller, it may take a long time to recover from the disturbance. In [6, 9], simple noise spike filters are described.

Digital Filters The sampling time for digital feedback controllers should be selected based on the dynamics of the control system. When controlling slow loops, such as the temperature in a room, a sampling period of 1 minute may be adequate for the feedback controller. However, the A/D converter may have a fixed sampling period of 1 second. For this case a combination of analog and digital filters should be used to remove aliases. The analog filter removes higher-frequency noise than the digital filter. Also, the cutoff frequency of the digital filter is adjusted in software. Thus, digital filters are especially useful for digital control algorithms that use an adjustable sampling period.

Deadzone Nonlinearity The deadzone nonlinearity shown in Figure 70.11 is used to reduce actuator movement for small errors.
Figure 70.10 Structure of a digital feedback controller.

Figure 70.11 Input-output characteristic curve for a dead-zone nonlinearity.
Analog Filter The purpose of the analog filter is to remove high-frequency noise before sampling by the analog-to-digital (A/D) converter. This filter is commonly called an anti-aliasing filter or an analog prefilter. It is especially important to use an anti-aliasing filter when controlling flow or pressure in a duct or pipe because turbulence can generate a large amount of noise. The bandwidth of
The deadzone nonlinearity can be used to increase actuator life without sacrificing control performance. The input-output characteristic curve for the deadzone nonlinearity is shown in Figure 70.11. The output from the deadzone block can be determined from the following relationship:

ē = e − Dzone   when e > Dzone
ē = 0           when |e| ≤ Dzone
ē = e + Dzone   when e < −Dzone    (70.31)
where ē is the modified error, Dzone is the size of the dead zone, and e is the error. The dead zone nonlinearity is especially useful when controlling processes with noisy signals, such as the flow through pressure-independent VAV boxes. The size of the dead zone should be selected based on the noise in the sampled signal.
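The deadzone relationship of Equation 70.31 translates directly into code (the function name is illustrative):

```python
def deadzone(e, dzone):
    """Modified error of Equation 70.31: zero inside the dead zone,
    shifted toward zero outside it, so small errors produce no
    actuator motion."""
    if e > dzone:
        return e - dzone
    if e < -dzone:
        return e + dzone
    return 0.0
```

Feeding the modified error (rather than the raw error) to the controller keeps the actuator still whenever the error lies inside the dead band.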
should stroke the controlled device to reduce the effect of hysteresis on the step test. Figure 70.12 shows how to reduce the effect of hysteresis for a step change in the positive direction. Initially, the controller output should move in a negative direction.
70.4.3 Tuning Tuning methods are used to determine appropriate values of controller parameters. Since the HVAC industry is cost driven, people installing and commissioning systems do not have a long time to tune systems. Consequently, a number of systems use the default parameters shipped with the controller. For some systems, the default parameters may not be appropriate. HVAC systems are nonlinear and time-varying. Thus, fixed-gain controllers may be poorly tuned some of the time even if they were properly tuned when they were installed. Control loops for HVAC systems may require retuning during the year for processes that have nonlinear characteristics dependent upon the season. If a feedback control loop is not retuned, the control response may be "poor" (e.g., the closed-loop response for the controlled variable is oscillatory). An operator would have to retune the controller to maintain "good" control. In buildings, it is important to tune loops because properly tuned loops improve indoor air quality, decrease energy consumption, and increase equipment life. Computer control systems can automate control loop tuning. One automated method for tuning is autotuning. Autotuning involves automatic tuning of a loop on command from an operator or a computer. With autotuning, the operator or control system must issue a new command to determine new control parameters. Since HVAC processes are nonlinear and time-varying, loops may require retuning after the system load changes. Another automated method for determining control parameters is adaptive control. With adaptive control, the control parameters are continually being updated or adjusted. In the HVAC industry, people installing and commissioning control systems often have limited analytical skills. Thus, it is necessary that autotuning and adaptive control methods be user friendly. Also, the methods should be robust and have low computational and memory requirements.
Prior to tuning a system, the operator should make sure that all elements of the control system are working properly. The controller should be reading the signal for the process output, and the actuator should be working properly. If the control system uses an electric-to-pneumatic (E-P) converter to drive the actuator, then the zero and span of the E-P converter should be properly adjusted. Several manual tuning and autotuning methods are based on the response to a step change in the controller output. Open-loop step tests should be run in the high-gain region of the process. Running the step test in the high-gain region helps ensure that the control system remains stable for other regions of operation. (The process gain is equal to the ratio of the steady-state change in the process output to the change in the controller output following a step change in the controller output.) For systems with a large amount of hysteresis in the controlled device, the user
Figure 70.12 Controller output during a step test.
Then, the controller output should move in the positive direction until it is back to the initial position. The step change in controller output should take place after the process output returns to a nearly steady-state condition. A similar procedure should be used for a step change in the negative direction.
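Once a clean step response has been recorded, the process gain (and, for a roughly first-order process, a 63.2% time constant) can be read off numerically. The sketch below uses synthetic data; the first-order model, parameter values, and function name are illustrative assumptions:

```python
import math

def step_test_fit(t, y, u_step):
    """Static gain and 63.2% time constant from a recorded open-loop
    step response y(t) that starts at steady state (y[0])."""
    dy = y[-1] - y[0]                         # steady-state output change
    gain = dy / u_step                        # process gain
    tau = next(tk for tk, yk in zip(t, y)
               if (yk - y[0]) / dy >= 0.632)  # one first-order time constant
    return gain, tau

# Synthetic first-order response: gain 2.0, time constant 5 s, 10% step
t = [0.05 * k for k in range(2001)]
y = [2.0 * 10.0 * (1.0 - math.exp(-tk / 5.0)) for tk in t]
gain, tau = step_test_fit(t, y, u_step=10.0)
```

Because the step is run in the high-gain region, the fitted gain is the largest the loop will see, which keeps controller settings conservative elsewhere.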
70.5 Logical Control Logical operations play a central role in the control of HVAC systems. Logical operations are used to ensure the safe startup and shutdown of subsystems such as boilers, chillers, and air-handling units. They are also used to adjust the capacity of subsystems with discrete states and to sequence modulating controls.
70.5.1 Discrete-State Devices Discrete-state devices are common in HVAC control systems. These devices may be mechanical devices, such as a pump, or electronic or software devices, such as an alarm. It is common to use combinations of discrete-state devices to modulate the capacity of a subsystem. For example, condenser water temperature is often controlled by sequencing fans, which may have different capacities and different discrete speeds. When sequencing discrete-state devices to modulate capacity, there is a trade-off between the switching rate and the maximum deviation from the setpoint. The faster the switching rate, the smaller the deviations. This trade-off is affected by the capacitance or inertia of the system. If the system has a large capacitance, then the switching rate required to maintain the same maximum deviation from setpoint is lower.
70.6 Energy-Efficient Control
70.5.2 Modulating Devices Often the capacity of a single modulating device such as a control valve is insufficient to maintain control over the entire range of operating conditions. In such cases, multiple modulating devices are operated in sequence. A common example is the control of supply air temperature in an air-handling unit or similar subsystem through the sequencing of the cooling valve, outdoor air dampers, and heating valve. When it is sufficiently cool outdoors, the desired sequence involves modulating only one device at a time and placing the outdoor air damper control between the heating valve control and the cooling valve control. This sequence guarantees that the system does not heat and cool at the same time and that the relatively inexpensive cooling with outdoor air (called economizer control) is used before resorting to the relatively expensive cooling provided with the cooling valve. A common method of sequencing modulating devices is to span the output of a single controller over the range of the modulating devices. For example, the heating valve, economizer, and cooling valve are often controlled as shown in Figure 70.13. While
Most of the local control loops in an HVAC system do not directly affect the zone temperatures. The purpose of these loops is to ensure the safe, reliable, and efficient operation of the subsystems. In this section, strategies for controlling such loops are described.
70.6.1 Economizer Cooling The use of outdoor air for cooling is commonly referred to as economizer cooling or free cooling. In [11] simulations were used to estimate the energy savings with economizer cooling for two different building types in five different locations. Energy savings for cooling varied from 5 to 52%. Figure 70.14 is a psychrometric chart that shows the different regions of operation for the air-handling unit as a function of the thermodynamic state of the outdoor air.
Figure 70.13 Sequencing control for heating, economizer, and cooling by spanning the output of a single controller.

Figure 70.14 Psychrometric chart showing the decision boundaries for temperature-based and enthalpy-based economizer control.
this method can result in a logically correct control sequence, it has a basic problem. It is difficult to tune a single fixed-gain PID controller for three different modes of operation (heating, cooling, and economizer) because the system dynamics in each mode are different. A way to get better control performance from a sequence of modulating devices is to use separate controllers for each device. While this method improves the control performance, it complicates the sequencing logic. A problem with graphically representing sequencing logic as in Figure 70.13 is that the representation of the logic is incomplete. For example, the economizer may be disabled due to the outdoor air temperature, an override by the ventilation controls, or an override by the mixed-air temperature controls. Other methods of graphically representing sequencing logic are described in Chapter 18.
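Spanning a single controller output across the three devices, as in Figure 70.13, can be sketched as follows (the 0-33-67-100% breakpoints and the function name are illustrative assumptions, not values from the figure):

```python
def sequence_span(u):
    """Map one controller output u (0-100%) to heating valve, economizer
    damper, and cooling valve commands, with one device active at a time,
    in the spirit of Figure 70.13."""
    heat = econ = cool = 0.0
    if u < 33.0:
        heat = 100.0 * (33.0 - u) / 33.0   # heating wide open at u = 0
    elif u < 67.0:
        econ = 100.0 * (u - 33.0) / 34.0   # damper opens through midrange
    else:
        cool = 100.0 * (u - 67.0) / 33.0   # cooling wide open at u = 100
    return heat, econ, cool
```

Since only one branch is active for any u, the sketch preserves the guarantee that the system never heats and cools simultaneously.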
The thermodynamic state of the return air is also shown in Figure 70.14. There are two commonly used strategies for switching between 100% outdoor air with mechanical cooling and minimum outdoor air with mechanical cooling. One strategy compares the outdoor air temperature with a changeover temperature (e.g., the return air temperature), and the other strategy compares the outdoor air enthalpy with a changeover enthalpy (e.g., the return air enthalpy). With the temperature economizer strategy, the controls switch from minimum outdoor air to 100% outdoor air when the outdoor air temperature drops below the changeover temperature. With the enthalpy economizer cycle, the controls switch from minimum outdoor air to 100% outdoor air when the enthalpy of the outdoor air drops below the changeover enthalpy. For example, in region 1 of Figure 70.14, the temperature economizer cycle uses 100% outdoor air while the enthalpy economizer cycle uses the minimum amount of outdoor air. Also, in region 2 of Figure 70.14, the temperature economizer uses the minimum amount of outdoor air while the enthalpy economizer cycle uses 100% outdoor air.
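The two changeover tests can be sketched as a single decision function. Names, units, and the example numbers are illustrative; a real implementation would compute moist-air enthalpy from measured dry-bulb temperature and humidity:

```python
def use_full_outdoor_air(t_oa, h_oa, t_changeover, h_changeover, strategy):
    """Economizer changeover decision: switch to 100% outdoor air when
    the outdoor condition drops below the changeover value under the
    chosen strategy ('temperature' or 'enthalpy')."""
    if strategy == "temperature":
        return t_oa < t_changeover
    if strategy == "enthalpy":
        return h_oa < h_changeover
    raise ValueError("strategy must be 'temperature' or 'enthalpy'")

# Outdoor air cooler but moister than the changeover state (illustrating
# region 1 of Figure 70.14): the two strategies disagree.
temp_says = use_full_outdoor_air(18.0, 55.0, 22.0, 50.0, "temperature")
enth_says = use_full_outdoor_air(18.0, 55.0, 22.0, 50.0, "enthalpy")
```

The disagreement in this example is exactly the region-1 behavior described above: the temperature cycle admits full outdoor air while the enthalpy cycle does not.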
Accurate humidity sensors should be used with the enthalpy economizer cycle because moderate sensor errors can result in significant energy penalties. Problems with low-cost humidity sensors are drift and slow recovery from saturated conditions. It is often assumed that the energy required for cooling will always be less with the enthalpy economizer cycle. However, this is not true. In [11] it was shown that in dry climates the temperature economizer cycle may use less energy than the enthalpy economizer cycle.
70.6.2 Electrical Demand Limiting Owners or operators of commercial buildings are commonly billed for electric power based upon energy consumed (i.e., kWh) and peak consumption over a demand interval. The rate structure of utilities varies considerably. For example, some utilities use a 30-minute demand interval and other utilities use a 5-minute demand interval. Computer control systems can reduce the charges for peak demand by shutting off nonessential equipment during times of peak consumption. The strategy of shutting off the nonessential electric loads is commonly called demand limiting. Demand-limiting control strategies can reduce electric bills by 15 to 20% [12]. The building operator selects the desired electric demand target and enters into the computer control system a table of nonessential loads that can be turned off. Also, the minimum and maximum times each load can remain off should be entered into the computer control system. A computer control system measures the electrical demand and determines the electrical loads to turn off to maintain the electrical consumption below the target. In [13] an adaptive demand-limiting strategy with three important features is developed. First, the algorithm is easy to use because statistical methods automatically determine the characteristics of electric energy consumption for a particular building. Second, the algorithm controls the energy consumption just below the target level when there are enough nonessential loads to turn off. Third, the algorithm has low computational and memory requirements. Figure 70.15 shows the target energy consumption, actual energy consumption data with the demand-limiting algorithm, and estimated energy consumption without demand-limiting control from a manufacturing facility. Notice that the demand stays just below the target. Figure 70.16 shows the calculated load to shed from the demand-limiting algorithm and the actual load shed.
To limit the electrical demand below the target level, the actual load shed is larger than the calculated load to shed from the demand-limiting algorithm. The actual load shed also remains larger than the calculated load to shed because of the minimum off time for the sheddable loads.
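A greatly simplified shed-selection sketch follows. This is a greedy heuristic, not the adaptive statistical algorithm of [13]; the load names and sizes are invented for illustration:

```python
def loads_to_shed(demand_kw, target_kw, sheddable):
    """Greedy choice of nonessential loads to shed so that measured
    demand drops below the target. sheddable maps load name -> kW."""
    excess = demand_kw - target_kw
    shed = []
    # Shed the largest loads first until the excess is covered.
    for name, kw in sorted(sheddable.items(), key=lambda item: -item[1]):
        if excess <= 0.0:
            break
        shed.append(name)
        excess -= kw
    return shed

shed = loads_to_shed(520.0, 500.0,
                     {"fountain_pump": 5.0, "lobby_lighting": 15.0,
                      "spare_ahu_fan": 30.0})
```

Because the loads come in discrete sizes, the shed amount typically overshoots the calculated excess, which matches the behavior shown in Figure 70.16.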
70.6.3 Predicting Return Time from Night or Weekend Setback For buildings that are not continuously occupied, energy costs can be reduced by adjusting the setpoint temperatures during unoccupied times. One common strategy for adjusting setpoint
Figure 70.15 Electric consumption with and without demand-limiting control.
Figure 70.16 Actual amount of load shed and desired amount to shed.
temperatures is called night setback. Night setback involves lowering the setpoint temperature for heating and raising the setpoint temperature for cooling. In [14] it is shown that night setback can result in energy savings of 12% for heavyweight buildings and 34% for lightweight buildings. When using a night setback strategy, space conditions need to be brought to comfortable conditions prior to the time when occupants return. Ideally, the space conditions would just be in the comfortable range as the building occupants return. In [15] seven different methods for predicting return time from night or weekend setback were compared. For nights that required cooling to return, the following equation can be used to predict the return time
where Z is the estimate of the return time from night or weekend setback, ai is a coefficient, and Tz,i is the initial room temperature at the beginning of the return period. For nights that require heating, the return time can be estimated from
where Rh,u is the setpoint temperature for heating during unoccupied times, Rh,o is the setpoint temperature for heating during occupied times, and To is the outdoor air temperature at the beginning of return from night setback. Recursive linear least squares with exponential forgetting is used to determine the coefficients in Equations 70.32 and 70.33.
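The coefficient-update step can be sketched as a generic recursive least squares routine with exponential forgetting (pure Python with generic regressors; the actual regressor forms of Equations 70.32 and 70.33 are not reproduced here, and the forgetting factor is illustrative):

```python
def rls_update(theta, P, x, y, lam=0.98):
    """One step of recursive least squares with exponential forgetting.
    theta: coefficient list, P: covariance matrix (list of lists),
    x: regressor vector, y: observed return time,
    lam: forgetting factor (0 < lam <= 1)."""
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [pxi / denom for pxi in Px]                    # gain vector
    err = y - sum(theta[i] * x[i] for i in range(n))   # prediction error
    theta = [theta[i] + k[i] * err for i in range(n)]
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P

# Recover y = 2 + 3*x from noiseless observations (lam = 1: no forgetting)
theta, P = [0.0, 0.0], [[1e6, 0.0], [0.0, 1e6]]
for xv in [0.0, 1.0, 2.0, 3.0, 4.0]:
    theta, P = rls_update(theta, P, [1.0, xv], 2.0 + 3.0 * xv, lam=1.0)
```

With lam below 1, older nights are discounted exponentially, which is what lets the predictor track seasonal changes in building behavior.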
70.6.4 Thermal Storage Thermal storage systems use electrical energy to cool some storage media at off-peak hours, then use the cooling capacity of the media during on-peak hours. Storage systems are used for some of the same reasons as the electrical demand-limiting algorithm described previously; they allow the electrical demand at peak hours to be reduced. They also allow the chiller load to be shifted to times when electrical energy is inexpensive. Thermal storage systems may also allow one to use lower-capacity chillers. The two most common storage media are chilled water and ice. Ice storage tanks are becoming more popular because the energy density of ice is much greater than that of water.
For example, a nearly optimal strategy for the control of ice storage systems is developed in [17].
70.6.5 Setpoint Resetting One way to improve the operational efficiency of an HVAC system is to make adjustments to the setpoints of the local-loop controllers. This is commonly called setpoint resetting. In this section, two common setpoint resetting strategies are described.
Chilled Water Temperature Figure 70.17 shows a schematic diagram of a chiller.
Water and Ice Storage Control Strategies The simplest control strategy for water or ice storage systems is chiller priority control. With chiller priority control, the cooling load is matched by the chiller until the load exceeds the chiller capacity. Then the storage media is used to augment the chiller capacity. During the off-peak hours, the storage media is fully recharged. Chiller priority control is sometimes referred to as demand-limiting control. Chiller priority control is simple to implement, but it is substantially suboptimal except when cooling loads are high. Storage priority control strategies make better use of the storage capacity than chiller priority control. The objective of storage priority control is to use mainly stored energy during the on-peak period when energy is expensive and to use all of the stored energy by the end of the on-peak period. An example of storage priority control is load-limiting control [16]. During the off-peak unoccupied hours, the storage media is charged as much as possible. During the on-peak unoccupied hours, the building is cooled using chiller priority control. During the on-peak hours, the chiller is run at a constant capacity such that the storage media is completely used by the end of the on-peak period. Storage priority control strategies such as load-limiting control require a forecast of the cooling load for the on-peak period. Optimal control of thermal storage systems requires determining a sequence of control commands for charging and discharging the storage media such that the total cost of supplying chilled water is minimized subject to operating constraints. To determine the truly optimal sequence requires perfect knowledge of the behavior of the process and perfect knowledge of future cooling loads. Even with perfect knowledge of the process behavior and of future cooling loads, determining the optimal sequence is computationally intensive (e.g., it requires solving a dynamic programming problem).
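The constant-capacity calculation at the heart of load-limiting control can be sketched as follows (units, numbers, and the function name are illustrative):

```python
def load_limiting_chiller_rate(forecast_loads, storage_available):
    """Constant chiller rate over the on-peak period that just exhausts
    the storage by the period's end. forecast_loads: forecast cooling
    load for each on-peak hour (tons); storage_available: usable
    storage capacity (ton-hours)."""
    total_load = sum(forecast_loads)
    hours = len(forecast_loads)
    # The chiller covers whatever the storage cannot; never negative.
    return max(0.0, (total_load - storage_available) / hours)

rate = load_limiting_chiller_rate([100.0, 120.0, 80.0], 150.0)
```

The dependence on `forecast_loads` makes explicit why load-limiting control requires a cooling-load forecast for the on-peak period.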
Consequently, most of the practical work on optimal control of thermal storage systems involves the development of heuristic strategies that are nearly optimal.
Figure 70.17 Schematic diagram of a chiller.
The chiller power consumption is a function of the refrigerant flow rate and the pressure difference across the compressor. The larger the pressure difference across the compressor, the larger the chiller power consumption. The temperature difference between the water leaving the condenser and the water leaving the evaporator is strongly correlated with the pressure difference across the compressor. Therefore, the chiller power can be reduced if the leaving water temperature difference can be reduced. One way to reduce the leaving water temperature difference is to reset the chilled water supply setpoint to a higher value whenever possible. There are numerous strategies for doing this. One is to control the position of the most-open valve in the chilled water distribution circuit to a nearly open position, as shown in Figure 70.18 and described in [18].
Figure 70.18 Block diagram of chilled water reset control.
An advantage of this strategy over others is that it allows the chilled water temperature to go as high as possible without forcing
a cooling valve into saturation. A potential disadvantage is that, if the controller is not properly tuned, the chiller controls are sensitive to disturbances on a single cooling coil. Therefore, the controller should be tuned so that the bandwidth of the resetting loop is much smaller than the bandwidth of the slowest cooling coil control loop.
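An incremental version of the most-open-valve reset strategy can be sketched as follows (the target position, step size, and names are illustrative assumptions; the small step is one way to keep the reset loop much slower than the coil loops, as recommended above):

```python
def reset_chw_setpoint(setpoint, valve_positions, target=90.0, step=0.1):
    """Nudge the chilled water supply setpoint so that the most-open
    cooling-coil valve (positions in %) sits near a nearly open
    target position."""
    most_open = max(valve_positions)
    if most_open < target:
        return setpoint + step   # coils have margin: raise the setpoint
    if most_open > target:
        return setpoint - step   # a valve is near saturation: lower it
    return setpoint

new_sp = reset_chw_setpoint(7.0, [40.0, 60.0, 85.0])   # degrees C
```

Raising the setpoint whenever every valve has margin shrinks the leaving water temperature difference and hence the chiller power.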
Duct Static Pressure In VAV systems, the most common method of modulating the total supply air flow rate is to modulate the supply fan so as to regulate the static pressure at some point in the supply air duct. The operator must select a static pressure setpoint. If the setpoint is too low, one or more terminal units will open completely and still not be able to deliver the required quantity of supply air. If the setpoint is too high, the fan power consumption will be greater than necessary to supply air to the terminal units. Therefore, there is an optimal static pressure that minimizes the fan power yet supplies the necessary amount of air to all of the terminal units. One method to control the fan efficiently is to reset the static pressure setpoint based on the position of the terminal unit that is open the most. In [19], a strategy is described in which the setpoint is increased by a fixed amount when one or more of the terminal units is saturated fully open and decreased by a fixed amount when none of the terminal units is saturated. It is reported that this strategy is able to reduce the required fan power by 19 to 42% over a fixed-setpoint control strategy.
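The fixed-increment reset strategy of [19] can be sketched as follows (the step size, limits, and units are illustrative, not values from the reference):

```python
def reset_static_pressure(setpoint, n_saturated, step=10.0,
                          sp_min=50.0, sp_max=500.0):
    """Raise the duct static pressure setpoint by a fixed step when at
    least one terminal unit damper is saturated fully open; otherwise
    lower it. Setpoint and limits in Pa."""
    if n_saturated >= 1:
        setpoint += step
    else:
        setpoint -= step
    return min(sp_max, max(sp_min, setpoint))

sp = reset_static_pressure(200.0, n_saturated=2)
```

Called once per reset interval, this drives the setpoint down until some terminal unit just saturates, which is the minimum pressure that still satisfies every zone.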
70.7 Configuring and Programming There are several different ways that digital HVAC controllers are configured or programmed. In this section, three common methods are described.
70.7.1 Textual Programming Perhaps the oldest approach to configuring and programming controllers is textual programming. For example, BASIC, FORTRAN, or C may be used to program logical controls, local-loop controls, and supervisory controls and diagnostics. The advantage of using textual programming is that it allows the programmer great flexibility in constructing the control software. Textual programming requires substantial programming expertise. Since many controllers must be custom-programmed, the expertise required for textual programming often makes its cost prohibitive.
70.7.2 Graphical Programming Control algorithms and logic may be programmed graphically through the construction of diagrams. One common type of control diagram is the ladder logic diagram. Originally, ladder logic diagrams were designed to resemble electrical relay circuits and, therefore, could perform only limited types of operations. Today, additional functions can be included in ladder logic diagrams so that the controllers can perform functions such as arithmetic and data manipulation operations. Another common method of graphical programming involves the construction of block diagrams. Function blocks are connected by lines or wires. The blocks may perform basic logical operations or more complex functions (e.g., implement PID control algorithms). Also, the programmer may construct new blocks for a specialized purpose. The advantage of using graphical programming is that the controller can be configured and programmed for a nonstandard system or subsystem. Specialized control functions can often be constructed quickly, and the required level of programming expertise may be lower than for other programming methods.
70.7.3 Question-and-Answer Sessions Many HVAC subsystems are designed in such a way that standardized controls can be used. In such cases, the systems are preprogrammed. To be made fully operational, only information about the size or type of subsystem and the desired control strategy need be supplied. Therefore, these types of controllers may be configured through a simple question-and-answer session. Control strategies such as those described in Section 70.6 may be preprogrammed for a variety of different equipment and can be selected to match the particular system. The obvious disadvantage of using preprogrammed controls is that they are inflexible. The user cannot make changes to the way in which the controller operates. However, there are significant advantages to preprogrammed controls. One advantage is that they reduce commissioning time. Furthermore, they do not require programming expertise. Another advantage is that preprogrammed controls can be tested in a laboratory so that they are guaranteed to be safe, reliable, and efficient. If the controls are custom-programmed for a specific system, rigorous testing cannot be done.
References
[1] Hirst, E., Clinton, J., Geller, H., and Kroner, W., Energy Efficiency in Buildings: Progress and Promise, O'Hara, F.M., Jr., Ed., American Council for an Energy Efficient Economy, 1986, 3.
[2] Environmental Protection Agency, Unfinished Business: A Comparative Assessment of Environmental Problems, Vol. 1, February 1987.
[3] Haines, R.W. and Hittle, D.C., Control Systems for Heating, Ventilating, and Air Conditioning, Van Nostrand Reinhold, New York, 1993.
[4] Incropera, F.P. and DeWitt, D.P., Fundamentals of Heat and Mass Transfer, Wiley, New York, 1990.
[5] Instrument Society of America, ISA Handbook of Control Valves, Pittsburgh, 1976.
[6] Seborg, D.E., Edgar, T.F., and Mellichamp, D.A., Process Dynamics and Control, Wiley, New York, 1989.
[7] White, F.M., Fluid Mechanics, McGraw-Hill, New York, 1979.
[8] 1993 ASHRAE Handbook: Fundamentals, ASHRAE, Atlanta, 1993.
[9] Clarke, D.W., PID algorithms and their computer implementation, Oxford University Engineering Laboratory Report No. 1482/83, Oxford University, 1983.
[10] Shinskey, F.G., Process Control Systems: Application, Design, and Tuning, McGraw-Hill, New York, 1988.
[11] Spitler, J.D., Hittle, D.C., Johnson, D.L., and Pedersen, C.O., A comparative study of the performance of temperature-based and enthalpy-based economy cycles, ASHRAE Trans., 93(2), 13-22, 1987.
[12] Stein, B., Reynolds, J.S., and McGuinness, W.J., Mechanical and Electrical Equipment for Buildings, 7th ed., Wiley, New York, 1986.
[13] Seem, J.E., Adaptive demand limiting control using load shedding, Int. J. Heat., Vent., Air-Cond. Refrig. Res., 1(1), 21-34, January 1995.
[14] Bloomfield, D.P. and Fisk, D.J., The optimization of intermittent heating, Building and Environment, Vol. 12, 1977, 43-55.
[15] Seem, J.E., Armstrong, P.R., and Hancock, C.E., Algorithms for predicting recovery time from night setback, ASHRAE Trans., 95(2), 1989.
[16] Braun, J.E., A comparison of chiller-priority, storage-priority, and optimal control of an ice-storage system, ASHRAE Trans., 98(1), 1992.
[17] Drees, K., Modeling and Control of Area Constrained Ice Storage Systems, Master's thesis, Mechanical Engineering Department, Purdue University, 1994.
[18] 1991 ASHRAE Handbook: HVAC Applications, ASHRAE, Atlanta, 1991, chap. 41.
[19] Lorenzetti, D.M. and Norford, L.K., Pressure setpoint control of adjustable speed fans, J. Sol. Energy Eng., 116, 158-163, August 1994.
Control of pH
F. Greg Shinskey Process Control Consultant, North Sandwich, NH
71.1 Introduction .......................................................... 1205
71.2 The pH Measuring System ............................................. 1205
     The pH Scale • Electrodes
71.3 Process Titration Curves ............................................. 1206
     The Strong Acid-Strong Base Curve • Distance from Target • Weak Acids and Bases
71.4 Designing a Controllable Process ..................................... 1209
     A Dynamic Model of Mixing • Vessel Design • Effective Backmixing • Reagent Delivery Systems • Protection Against Failure
71.5 Control System Design ................................................ 1213
     Valve or Pump Selection • Linear and Nonlinear Controllers • Controller Tuning Procedures • Feedforward pH Control • Batch pH Control
71.6 Defining Terms ....................................................... 1216
References ................................................................. 1217
71.1 Introduction

The pH loop has been generally recognized as the most difficult single loop in process control, for many reasons. First, the response of pH to reagent addition tends to be nonlinear in the extreme. Second, the sensitivity of pH to reagent addition in the vicinity of the set point also tends to be extreme, in that a change of one pH unit can result from a fraction of a percent change in addition. Third, the two relationships above are often subject to change, especially when treating wastewater. And finally, reagent flow requirements may vary over a range of 1000:1 or more, especially when treating wastewater. As a result of these unusual characteristics, many, if not most, pH control loops are unsatisfactory, either limit-cycling or slow to respond to upsets, or both. Considerable effort and ingenuity have gone into designing advanced controls, nonlinear, feedforward, and adaptive, to solve these problems. Not enough has gone into designing the neutralization process to be controllable. Therefore, after describing the pH control problem and before developing control-system design guidelines, process design is covered in substantial detail. If the process is designed to be controllable, the control system can be more effective and simpler, and therefore easier to operate and maintain.
71.2 The pH Measuring System

Most of the unique characteristics of the pH control loop center around its measuring system, with its unusual combination of high sensitivity and wide range. The fundamentals of the

0-8493-8570-9/96 © 1996 by CRC Press, Inc.
measuring system are firm and not likely to change with time or technology. However, the methods and devices used to measure pH can be expected to continue to improve, as obstacles are overcome. Still, failures can occur, and recognizing the possible failure modes is important in anticipating the response of the controls to them.
71.2.1 The pH Scale

The pH measuring system is expressly designed to report the activity of hydrogen ions in an aqueous solution. (While pH measurements can be made in some nonaqueous media such as methanol, the results are not equivalent to those obtained in water and require independent study.) The true concentration of the hydrogen ions may differ from its measured activity at pH levels below 2, where ion mobility may be impeded. Most pH loops have control points in the pH 2-12 range, however, where activity and concentration are essentially identical. Consequently, the pH measurement will be used herein as an indication of the concentration of hydrogen ions [H+] in solution. That concentration is expressed in terms of gram-ions of hydrogen per liter of solution. A Normal solution will contain one gram-ion per liter. Laboratory solutions are typically prepared using the Normal scale, e.g., 1.0 N, 0.01 N, etc., representing the number of replaceable gram-ions of hydrogen or hydroxyl ions per liter. When a pH measuring electrode is placed in a solution with a reference electrode and a liquid junction between them, a potential difference E may be measured between the two electrodes:
THE CONTROL HANDBOOK
where Eref is the potential of the reference cell and Ej that of the liquid junction. The commonly accepted method of expressing base-10 logarithms is to use p to indicate the negative power. Thus, pH = -log[H+] and [H+] = 10^-pH. The potential developed by the pH electrode is therefore 59.16 millivolts per pH unit. Actually, this coefficient of 59.16 is precise only at 25°C, varying with T/298, where T is the absolute temperature in Kelvin. The reference cell is designed to match the internal cell of the pH electrode, buffered to pH 7. Therefore at pH 7 there is no potential difference between the measuring and reference electrodes, and temperature has no effect there; this is the isopotential point. Temperature compensation may be applied either automatically or manually, as desired. The liquid-junction potential represents any voltage developed by diffusion of ions across the junction between the solution being measured and that of the reference electrode. Potassium chloride is selected for the reference solution because its ions have approximately the same ionic mobility, thereby minimizing the junction potential. For most solutions, Ej is less than 1 mV, and is not considered a major source of error. The net result of the above is a linear scale of pH vs. millivolts, with zero voltage corresponding to pH 7. The significance of pH 7 is that it represents the neutral point for water, where hydrogen and hydroxyl ions are equal. The two ions are related via the equilibrium constant, which is 10^-14 at 25°C:
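The scale-and-slope arithmetic above can be sketched in a few lines of code. This is an illustrative fragment, not part of the handbook; the 298.15 K reference temperature and the function names are assumptions:

```python
def electrode_slope_mv(temp_c: float) -> float:
    """Nernst slope in mV per pH unit: 59.16 mV at 25 deg C, scaled by T/298."""
    return 59.16 * (temp_c + 273.15) / 298.15

def ph_to_mv(ph: float, temp_c: float = 25.0) -> float:
    """Electrode potential relative to the pH 7 isopotential point."""
    return electrode_slope_mv(temp_c) * (7.0 - ph)

def mv_to_ph(mv: float, temp_c: float = 25.0) -> float:
    """Invert the measurement: millivolts back to pH at a given temperature."""
    return 7.0 - mv / electrode_slope_mv(temp_c)
```

At 25°C, for example, a solution at pH 4 would develop about +177 mV (three pH units times 59.16 mV per unit), and a reading at the isopotential point pH 7 gives zero millivolts at any temperature.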
    [H+][OH-] = 10^-14    (71.2)

The equilibrium constant is also a function of temperature, causing a given solution to change pH with temperature [1]. This effect is only significant above pH 6, and is not normally included in the temperature compensation applied to the electrodes.
71.2.2 Electrodes

The pH measuring electrode in most common use is the glass electrode. It consists of a glass bulb containing a solution buffered to pH 7, with an internal reference cell identical to that in the reference electrode. This combination produces the null voltage at pH 7. The sensitive portion of the electrode is a thin glass membrane saturated with water. The membrane must be thin enough to keep its electrical impedance in a reasonable range and yet resist breakage under normal use. The impedance of the glass can exceed 100 megohms, requiring extremely high-impedance voltage-measuring devices to produce accurate results. In view of these special requirements, ordinary voltmeters and wiring cannot be used to measure pH, as circuits are very sensitive to electrical leakage and grounding. The reference electrode forms the other connection between the solution to be measured and the voltage measuring device. It must have a stable voltage (where that of the measuring electrode varies with solution pH) and a low impedance.
The extremely high impedance of the glass electrode makes it susceptible to short-circuiting. Short-circuiting reduces the measured millivolts relative to the true pH of the solution, producing pH readings closer to 7 than is actually the case. This is an unfortunate failure mode, for 7 is the most common set point, especially in wastewater treatment, and will not cause alarm. Rovira [2] describes a microprocessor-based pH transmitter with the capability of checking the impedance of the glass electrode and thereby warning of this type of failure. There is only one adjustment available to calibrate or standardize a pH measuring system. If the electrodes are in good working order, calibration against one buffer should produce accurate results against other buffers without further adjustment. If it does not, then the millivoltage change per pH unit is not 59.16 at 25°C, indicating that the glass electrode is damaged or that there is an electrical leak. Coating of the pH electrode introduces a time lag between the ions in the process solution and those at the surface of the electrode. Coating can be verified by the appearance of the electrodes, but be aware that a film as thin as a millimeter could produce a time constant of several minutes. McMillan [1] recommends increasing the velocity of the process fluid past the electrodes to 7 ft/sec to minimize fouling, although velocities exceeding 10 ft/sec may cause excessive noise and erosion.
71.3 Process Titration Curves

What we recognize as the nonlinearity in pH control loops is the process titration curve, the relationship between measured pH and the amount of acid or base reagent added to a solution. The reagents are usually strong acids or bases of known concentration added to a solution that contains variable concentrations of possibly unknown substances. Laboratory titrations are conducted batchwise by adding reagent incrementally to a measured volume of process solution. In a process plant, pH control can be applied to a batch of solution, but more often is applied to a flowing stream, matching reagent flow to it. In a continuous operation, only that portion of the titration curve near the set point may be visible, but the slope of the curve in this region determines the steady-state process gain, and the amount of reagent required to reach that point determines the process load. Therefore, titration curves are as important in continuous processing as in batch. Laboratory titrations are usually performed using standard solutions prepared to 0.01 N or similar concentration. Process reagents are usually much more concentrated, but their concentration must be known in the same terms to estimate required delivery based on laboratory titrations of process samples. The normality of 32% HCl is 10.17 N, that of 98% H2SO4 is 36.0 N, that of 25% NaOH is 7.93 N, and that of 10% Ca(OH)2 is 2.86 N. For other solutions, normality can be calculated by multiplying the weight percent concentration of the reagent by 10np/M, where n is the number of replaceable H+ or OH- ions per molecule, p is its density in g/ml, and M is its molecular weight.
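The normality rule can be written as a one-line function. This is an illustrative sketch, not from the handbook; the NaOH density used in the comment is an assumed handbook value:

```python
def normality(wt_pct: float, n_ions: int, density_g_per_ml: float, mol_wt: float) -> float:
    """Normality = weight-percent * 10 * n * rho / M, per the rule above."""
    return wt_pct * 10.0 * n_ions * density_g_per_ml / mol_wt

# 25% NaOH (n = 1, M = 40.0), assuming a density of about 1.27 g/ml:
# normality(25.0, 1, 1.27, 40.0) -> roughly 7.9 N, consistent with the
# 7.93 N quoted above for 25% caustic.
```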
71.3.1 The Strong Acid-Strong Base Curve

A "strong" agent is one which ionizes completely in solution. Examples of strong acids are HCl and HNO3, and of bases are the alkalies NaOH and KOH. Surprisingly, H2SO4 and HF are not classified as strong acids, despite their tendency to dehydrate and etch glass, respectively. A common basic reagent, Ca(OH)2, similarly is not strong, and also has limited solubility. Consider a reaction between NaOH and HCl:

    HCl + NaOH + H2O -> H+ + OH- + Na+ + Cl-    (71.3)

The sum of the negative and positive charges must be equal:

    [Na+] + [H+] = [Cl-] + [OH-]    (71.4)

With both reagents being completely ionized, the concentration of the acid xA can be substituted for [Cl-] and the concentration of the base xB for [Na+]:

    xB - xA = [OH-] - [H+]    (71.5)
Due to the equilibrium of hydrogen and hydroxyl ions in water, the former may be substituted for the latter, using Equation 71.2:

    xB - xA = 10^-14/[H+] - [H+]    (71.6)

Placed in terms of pH, the relationship becomes

    xB - xA = 10^(pH-14) - 10^-pH    (71.7)

This is the fundamental pH relationship in the absence of buffering. Although buffers are common to a limited extent in most solutions, Equation 71.7 describes the most severe titration curve that a control system must handle. It is shown for different scales of normality in Figure 71.1. Although the curves appear to have different breakpoints and therefore to be different curves, they are, in fact, a single curve shown at two different ranges of sensitivity. Both curves have the same characteristic in that a base-acid mismatch of one-tenth of full scale leaves the pH only one unit from the end of the curve. Similarly, a mismatch of 1/100 of full scale leaves the pH two units from the end of the curve. This logarithmic relationship holds everywhere that one of the terms on the right of Equation 71.7 is negligible compared to the other, i.e., everywhere except in the range of pH 6-8.

71.3.2 Distance from Target

The essence of the pH-control problem is the matching of acid and base at neutrality. In the strong acid-strong base system, neutrality occurs at pH 7, and therefore a perfect match will bring this result. In some process work, pH must be controlled within ±0.1 unit, but wastewater limits are usually pH 6-9, a much broader target. Consider how difficult it can be to control wastewater pH in the absence of buffering. If the wastewater enters at pH 3, as in the inner curve of Figure 71.1, a mismatch between acid and base of 10% will leave the pH at 4 or 10; a mismatch of 1.0% will leave it at 5 or 9; a mismatch of 0.1% is required to produce a pH in the 6-8 range. The size of the target does not change with the scale, but the size of the reagent valve does. For example, if the wastewater were to enter one pH unit farther away from neutral, ten times as much reagent would be required to neutralize it. The accuracy of delivery must then be ten times as great percentagewise to control to the same target. For example, a solution represented by the outer curve in Figure 71.1 would require a percentage accuracy in reagent delivery 100 times as great as for the inner curve. A graphic illustration of the problem can be envisioned by comparing the size of the target to our distance from it. Think of the range of pH 6-8 as a target one foot in diameter, with the mismatch xB - xA essentially zero at the edge. If the entering solution is at pH 5 or 9, the distance to the target is ten times its radius, or five feet away. From this distance, the target could be easily hit by throwing darts at it. If the entering pH were 3 or 11, the same target would be 500 feet away, requiring an expert with a high-powered rifle. If the pH entering were 1 or 13, the same target would be ten miles away, requiring a guided missile. To maximize the accuracy of reagent delivery, dead band in the control valves must be eliminated through the use of valve positioners. Dead band has been observed to cause limit-cycling in control of pH and reduction-oxidation potential, which was eliminated by installing positioners on the valves.

71.3.3 Weak Acids and Bases

Most acids and bases do not completely ionize in solution, but establish an equilibrium between the concentration of the undissociated molecule and its ions. A monoprotic agent is one that has a single ionizable hydrogen or hydroxyl group. An example is acetic acid, which dissociates into hydrogen and acetate ions in accordance with the following relationship:

    [H+][Ac-] = KA [HAc]    (71.8)
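Equation 71.7 can be solved exactly for pH by treating the charge balance as a quadratic in [H+]. The following sketch is illustrative, not from the handbook:

```python
import math

def ph_strong(delta_n: float) -> float:
    """pH of a strong acid-strong base mixture, given the net reagent
    concentration xB - xA in normality. Solves
    [H+]^2 + (xB - xA)*[H+] - 1e-14 = 0 for the positive root."""
    d = delta_n
    h = (-d + math.sqrt(d * d + 4e-14)) / 2.0
    return -math.log10(h)

# ph_strong(0.0) returns 7.0; a 0.001 N excess of acid (delta_n = -0.001)
# gives pH 3, and a 0.0001 N excess of base gives pH 10, reproducing the
# steep unbuffered curve of Figure 71.1.
```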
where KA is the ionization constant for acetic acid, 10^-4.75. A pH electrode senses only the [H+], leaving the undissociated [HAc] unmeasured. However, any adjustment to solution pH made by adding either base or another acid will shift the equilibrium. For example, increasing [H+] with a strong acid will convert some of the [Ac-] into [HAc], leaving more of it unmeasured. Conversely, reducing [H+] with a base will shift the equilibrium the other way, ionizing more of the [HAc]. This behavior tends to reduce the change in pH to additions of acids and bases, and is called buffering. Adding acetic acid to the former solution of HCl and NaOH adds acetate ions to the charge balance:

    [H+] + [Na+] = [OH-] + [Cl-] + [Ac-]    (71.9)
Let the total concentration of acetic acid be represented by

    xAw = [HAc] + [Ac-]    (71.10)

The [HAc] in the above expression can be replaced by its value from Equation 71.8:

    xAw = [Ac-](1 + [H+]/KA)    (71.11)
Figure 71.1    Titration curve for strong base against strong acid, two scales. (Horizontal axis: Base - Acid, N.)
Replacing all of the ions in the charge balance with equivalent values of the three ingredients, and [H+] with pH, gives

    xB - xA = 10^(pH-14) - 10^-pH + xAw/(1 + 10^(pKA-pH))    (71.12)

where pKA (like pH) is the negative base-10 logarithm of the ionization constant. The titration curve for a monoprotic weak base is derived in a fashion similar to that of the acid. As Equation 71.12 simply had the weak-acid term added to the strong acid-strong base relationship of Equation 71.7, so the following weak-base term may be added in the same way:

    -xBw/(1 + 10^(pKB-14+pH))    (71.13)

where xBw represents the concentration of weak base whose ionization constant has the negative logarithm pKB. Ammonia is a weak base having a pKB of 4.75, identical to the pKA value for acetic acid. Most of the weak acids and bases encountered in industry have two ions, and are hence classified as diprotic. A very common diprotic weak acid is carbon dioxide, encountered in groundwater in the form of carbonate and bicarbonate ions. Dissociation of CO2 and water into H+ and HCO3- ions has a pKA1 value of 6.35, so that carbonic acid is weaker than acetic acid by a factor of about 40. A second equilibrium takes place between HCO3- ions and CO3 2- ions, with a pKA2 value of 10.25. Because this second value exceeds 7, the carbonate ion effectively acts as a weak base having a pKB value of 14 - 10.25, or 3.75. Derivation of the equations representing the titration curves for diprotic agents follows the same procedures used for monoprotic agents. The result is adding to Equation 71.7 the following for weak diprotic acids:

    xAw(1 + 0.5 x 10^(pKA2-pH)) / [1 + 10^(pKA2-pH)(1 + 10^(pKA1-pH))]    (71.14)
where xAw represents the concentration of the weak acid. Figure 71.2 is a titration curve for water containing 0.001 N CO2, equivalent to a total alkalinity of 50 ppm as CaCO3, the customary method of reporting water hardness. On the larger scale of ±0.01 N, the bicarbonate shows its presence with a slight moderation of the titration curve around pH 6.35. This makes the pH of wastewaters easier to control than it would be otherwise, and most water does contain significant amounts of bicarbonate ions. The second breakpoint associated with the carbonate ionization is not visible on this scale. On the smaller scale of ±0.001 N, the bicarbonate buffering is even more pronounced. This characteristic makes bicarbonate useful for neutralizing stomach acids. The neutral point for this solution is pH 8; however, only half as much strong base as acid is required to reach the neutral point. The reason for this is that the value of pKA2 exceeds 7, resulting in carbonates being weak bases. Na2CO3 is a useful reagent for neutralizing acids, because it produces a titration curve well-buffered in the region of pH 6-7; unfortunately, twice as much sodium is required per mol of acid than if NaOH were used, making it too costly for most waste treatment applications. Carbonates are common in cleaning solutions, and therefore in wastewater, but their concentration may fluctuate considerably, as most cleaning is done batchwise. Other cleaning agents such as phosphates and silicates, ammonia and amines, and soaps and detergents of all kinds are weak agents. Consequently, the degree of buffering in industrial wastewater may be expected to be quite variable. The term added to the pH relationship for weak diprotic bases is similar to that for acids:

    -xBw(1 + 0.5 x 10^(pKB2-14+pH)) / [1 + 10^(pKB2-14+pH)(1 + 10^(pKB1-14+pH))]    (71.15)
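The weak-acid terms can be evaluated numerically to reproduce curves like Figure 71.2. This is an illustrative sketch, not from the handbook, assuming the term forms given above with concentrations expressed in normality (equivalents per liter):

```python
def weak_monoprotic_acid_term(ph: float, x_aw: float, pka: float) -> float:
    """Ionized concentration of a weak monoprotic acid (the term added in
    Equation 71.12)."""
    return x_aw / (1.0 + 10.0 ** (pka - ph))

def weak_diprotic_acid_term(ph: float, x_aw: float, pka1: float, pka2: float) -> float:
    """Charge contribution of a weak diprotic acid in equivalents, per the
    diprotic term above (Equation 71.14)."""
    a1 = 10.0 ** (pka1 - ph)
    a2 = 10.0 ** (pka2 - ph)
    return x_aw * (1.0 + 0.5 * a2) / (1.0 + a2 * (1.0 + a1))

# For 0.001 N CO2 (pKA1 = 6.35, pKA2 = 10.25): near pH 8.3 only the
# bicarbonate equivalent is ionized, so the term is about 0.0005; at very
# high pH it approaches the full 0.001 N, consistent with the limits of
# the titration curve in Figure 71.2.
```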
A list of ionization constants for weak acids and bases is given in Table 71.1. For a more complete list, consult Weast [7].
Figure 71.2    Titration of 0.001 N CO2 by strong base on two scales. (Horizontal axis: Base - Acid, N.)

TABLE 71.1    Ionization Constants for Common Acids and Bases.

    Acid                    pKA1    pKA2
    Acetic acid             4.75
    Boric acid              9.1
    Carbon dioxide          6.35    10.25
    Chromic acid            0.7     6.2
    Ferric ion              2.5     4.7
    Ferrous ion
    Formic acid
    Hydrogen fluoride
    Hydrogen sulfide
    Hypochlorous acid
    Oxalic acid
    Phosphoric acid
    Sulfuric acid
    Sulfur dioxide

    Base                    pKB1    pKB2
    Aluminate ion
    Ammonia                 4.75
    Barium hydroxide
    Calcium hydroxide
    Cupric hydroxide
    Ethylamine
    Hydrazine
    Hydroxylamine
    Magnesium hydroxide
    Methylamine
    Phenol
    Silicon hydroxide
    Urea
    Zinc hydroxide
Any of the agents in the table could be listed either as an acid or a base. For example, the ferric ion is listed as an acid with pKA values of 2.5 and 4.7; it could have been listed as ferric hydroxide with pKB values complementary at 9.3 and 11.5; the effect is the same. Most metal ions are weak agents, producing titration curves with multiple breakpoints like the inner curve of Figure 71.2. Therefore, effluents from metal treating operations, like pickling and plating, tend to be heavily buffered at all times, and their pH is generally easy to control. While it is possible to create titration curves from the equations and pK values above, it is strongly recommended that several samples be taken at different times of day from the stream whose pH is to be controlled, and titrated to identify how extreme the curves are and how much they may vary. This information will be useful in sizing valves and reducing the likelihood of surprises when the control system is commissioned.
71.4 Designing a Controllable Process

The extreme gain changes presented by most titration curves, and their variability, pose special problems for the pH control system. The typically very high gain of the titration curve in the region about the set point is its most demanding feature, causing most pH loops to cycle about the set point. Uniform cycling develops when the gain of the control loop is unity. Loop gain is the product of all of the gains in the loop: process and controller, steady-state and dynamic. The high steady-state gain of the typical titration curve requires a proportionately low controller gain if cycling is to be avoided. But a low controller gain means poor response to disturbances in the flow and composition of the stream whose pH is being controlled. To maximize the controller gain, the process gains need to be minimized. The steady-state gain terms for the process consist of the titration curve converting concentration to pH, the reagent valve converting percent controller output to reagent concentration, and the pH transmitter converting pH to percent controller input. Other than selecting a buffering reagent such as Na2CO3 or CO2, which may not be feasible, we must accept the process titration curve as it exists. Similarly, the valve size is determined
by process needs, as is the transmitter span. But the process has a dynamic gain, too, and this can be minimized by proper process design. While this is always good practice, it is especially critical in pH control. Furthermore, in most processes, the design is determined by other considerations such as reaction rate, heat transfer, etc. But most acid-base reactions are zero order and generate negligible amounts of heat, so these considerations do not apply. The process designer is then free to choose vessel size, layout, mixing, piping, etc., and should choose those which favor dynamic response.
71.4.1 A Dynamic Model of Mixing

Consider an acid-base reaction performed by combining the process stream and reagent in a static mixer as shown in Figure 71.3 (but without the circulating pump, whose function is described later). A static mixer consists of a pipe fitted with internal baffles which alternately split and rotate the fluids, resulting in a homogeneous blend at the exit. The principal problem with the static mixer is that it presents pure deadtime to the controller. A step change in the reagent flow will produce no observable response in the pH at the exit until the new mixture emerges, at which time the full extent of the step appears at once. The dynamic gain of the mixer is 1.0 because the step at the inlet appears as a step at the exit, not diminished or spread out over time. The time between the initiation of the step and its appearance in the response of the pH transmitter is the deadtime caused by transportation through the mixer. It is equal to the length of the mixer divided by the velocity of the mix, or the volume of the mixer divided by the flow of the mix. (The connection between the mixer exit and the pH electrodes must be included in this calculation.) In addition to the high dynamic gain of the mixer, its deadtime is also variable, as a function of flow. This variability presents an additional problem for the controller, in that its optimum integral time is contingent on the process deadtime. If tuned for the highest flow (and hence shortest deadtime), the controller will integrate too fast for the process to respond at lower flow rates, causing cycling to develop. Therefore, the controller must be tuned at the lowest expected flow (where deadtime is the longest) for stability across the flow range. However, control action will then be slower than desired at higher flow rates. And if flow should ever stop, the control loop will be open.
For the above reasons, static mixers are not recommended for pH control, and transportation delay of effluent to the pH electrodes should be eliminated or minimized. The undesirable properties of the static mixer can be mitigated, however, by adding the circulating pump also shown in Figure 71.3. The pump applies suction downstream of the pH electrodes, recirculating treated product at a flow rate Fa relative to the discharge rate F. This reduces the deadtime through the process from V/F to V/(F + Fa), where V is the volume of the process, and also places an upper limit on the deadtime in the loop at V/Fa. But it reduces the dynamic gain of the process as well. Consider the case where Fa = F, and the feed is being mixed with enough reagent introduced stepwise to change the product concentration by one unit in the steady state. As the step in reagent flow is introduced, however, the reagent is being diluted with twice as much flow as without recirculation, so that its effect on product concentration after the lapse of the deadtime is reduced by a factor of two. At this point, this half-strength product is recirculated and mixed in equal volumes with full-strength feed, causing the blend to reach 75 percent of full strength after the lapse of another deadtime. Then the 75% product is recirculated to produce 87.5% product after the next deadtime, etc. The product concentration approaches its steady-state value, covering 50% of the remaining distance each deadtime. The dynamic gain of the process has effectively been reduced by 50%. Increasing recirculation Fa to 4F reduces the deadtime to 0.2 V/F and the dynamic gain to 0.2, as illustrated by the staircase response in Figure 71.4. The effect of recirculation is quite predictable. Next consider a backmixed tank with approximately equal diameter and depth of liquid, as shown in Figure 71.5, and let the agitator be sized to recirculate liquid at a rate Fa equal to 4F. The same step response will be obtained as with the static mixer, except that the tank itself will not produce the same plug-flow profile as the baffled static mixer. As a result, the duration of the molecule travel from the reagent valve to the pH sensor has a wider distribution. Some will take longer than the deadtime estimated as if it were a static mixer, and some will take less time, although the average will be the same. The resulting step response will be better described by the smooth curve passing through the midpoints of all of the steps in Figure 71.4. This has the advantage of reducing the deadtime by a factor of two. Therefore, for the backmixed vessel, deadtime td can be estimated as

    td = 0.5 V/(F + Fa)    (71.16)
The average time that a molecule remains in the vessel is called its residence time, calculated as V/F. Observe that the exponential curve of Figure 71.4 passes through 63.2% response at time V/F. This curve represents the familiar step response of a first-order lag, which is 63.2% complete when the elapsed time from the beginning of the curve is equal to the time constant of the lag. The difference between the residence time and the deadtime is therefore the time constant of the vessel's first-order lag:

    tau1 = V/F - td    (71.17)
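The recirculation arithmetic above can be sketched in code. This is an illustrative fragment, not from the handbook; the vessel numbers in the comments are hypothetical, chosen only to show the deadtime and lag split:

```python
def staircase_response(recirc_ratio: float, n_steps: int = 8) -> list:
    """Step response of a plug-flow mixer with recirculation Fa = ratio * F.
    Each deadtime, the product becomes a flow-weighted blend of
    full-strength feed (1.0) and the previously recirculated product."""
    f, fa = 1.0, recirc_ratio
    c, out = 0.0, []
    for _ in range(n_steps):
        c = (f * 1.0 + fa * c) / (f + fa)
        out.append(c)
    return out

def mixing_dynamics(volume: float, feed: float, recirc: float) -> tuple:
    """Deadtime td = 0.5 V/(F + Fa) and first-order lag tau1 = V/F - td
    for a backmixed vessel, per the estimates above."""
    td = 0.5 * volume / (feed + recirc)
    tau1 = volume / feed - td
    return td, tau1

# staircase_response(1.0) climbs 0.5, 0.75, 0.875, ... as described above,
# and staircase_response(4.0) starts at 0.2, matching a dynamic gain of 0.2.
# A hypothetical 500-gal vessel with 100 gpm feed and 400 gpm recirculation
# gives td = 0.5 min and tau1 = 4.5 min.
```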
A first-order lag has a phase angle phi1 and a dynamic gain G1, which are functions of its time constant and the frequency or period at which the loop is cycling. In process control, the period, the duration of a complete cycle, is easier to measure than the frequency in cycles per minute or radians per minute, which in practice is calculated from the observed period. Therefore, the phase lag and dynamic gain are given here as functions of the period of oscillation tau_o [5, p. 28]:

    phi1 = -arctan(2 pi tau1/tau_o)    (71.18)

    G1 = 1/sqrt[1 + (2 pi tau1/tau_o)^2]    (71.19)
The period of oscillation is a dependent variable, produced where the sum of the dynamic phase lags in the loop reaches 180 degrees. Process deadtime also contributes to the loop phase lag, but its dynamic gain is always unity:

    phi_d = -360 td/tau_o degrees,  Gd = 1    (71.20)
Figure 71.3    Controlling pH using a static mixer features variable deadtime and should not be used without recirculation.

Figure 71.4    The steps are produced by recirculation through a plug-flow process; the curve is produced by a backmixed tank. (Vertical axis: percent response, 0-100; horizontal axis: time.)
If a proportional controller is used, which has no dynamic phase lag of its own, the period of oscillation determined by the process dynamics alone is then called the natural period, tau_n. Solution of phi1 + phi_d = -180 degrees for tau_n by trial and error can give the natural period of the process; then its dynamic gain G1 can be expressed as a function of the ratio of td/tau1 produced by a given degree of mixing. Table 71.2 displays the results of these estimates for a selected set of td/tau1 ratios.
TABLE 71.2    Natural Period and Dynamic Gain with Mixing.

                                          Fa/F
    td/tau1   tau_n/(V/F)   G1      Plug-Flow   Backmixed
    inf       2.0           1.000   0
    2.0       1.83          0.657   0.5
    1.0       1.55          0.441   1.0         0
    0.5       1.14          0.262   2.0         0.5
    0.2       0.62          0.117   5.0         2.0
    0.1       0.35          0.061   10.0        4.5
Reducing the td/tau1 ratio by recirculation has two advantages: it shortens the natural period relative to the residence time of the vessel, and it lowers its dynamic gain as well. This allows an increase in controller gain (commensurate with the reduction in process dynamic gain) and a reduction in its integral time (commensurate with the reduction in natural period). For a given upset, then, the deviation from set point will be smaller and of shorter duration. The corresponding recirculation ratios Fa/F for both the plug-flow (static) mixer and the backmixed tank are included in the table.
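The trial-and-error solution for the natural period can be sketched with a bisection search. This is an illustrative reconstruction, not the author's code, assuming the standard first-order phase and gain relations; it reproduces the rows of Table 71.2:

```python
import math

def natural_period_and_gain(td: float, tau1: float) -> tuple:
    """Find the natural period tau_n where lag plus deadtime phase sums to
    -180 degrees, then evaluate the first-order lag's dynamic gain G1."""
    def total_phase(tau_o: float) -> float:
        phase_lag = -math.degrees(math.atan(2.0 * math.pi * tau1 / tau_o))
        phase_dead = -360.0 * td / tau_o
        return phase_lag + phase_dead

    lo, hi = 1e-6, 1e6  # total_phase rises monotonically with tau_o
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_phase(mid) < -180.0:
            lo = mid
        else:
            hi = mid
    tau_n = 0.5 * (lo + hi)
    g1 = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * tau1 / tau_n) ** 2)
    return tau_n, g1

# With V/F = 1 and td/tau1 = 0.5 (td = 1/3, tau1 = 2/3), this returns
# tau_n close to 1.14 and G1 close to 0.262, matching Table 71.2.
```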
71.4.2 Vessel Design

The principal elements to be determined in designing a vessel are its volume, its dimensions, and the location of inlet and outlet ports. Its volume is determined by the residence time needed for the reaction to go to completion at the maximum rate of feed. Since acid-base reactions are essentially zero order, the residence time on the basis of reaction rate is indeterminate. Then the limiting factor may be the response time of the pH electrode, which could be from a few seconds to a minute or more depending on the thickness of the glass and the presence of any coating [1]. Neutralization reactions have been successfully conducted within
Figure 71.5
This is the preferred location of ports and of the pH measurement in a backmixed tank.
the casing of a centrifugal pump, using the impeller as a mixer. When reacting species are not in the liquid phase, however, time must be allowed for mass transfer between phases. The most common example of this type is the use of a slurry of hydrated lime as a reagent. The Ca(OH)2 already in solution reacts immediately, but must be replaced by mass transfer from the solid phase, which takes time. Neutralization of acid wastes with lime slurry has been successfully controlled in vessels having a residence time of 15 minutes. If the residence time is too short, the effluent will leave the vessel with some lime still undissolved, which will raise the pH when dissolution is complete. In the absence of mass transfer limitations such as this, a residence time of 3-5 minutes is ordinarily adequate. If the pH of the feed stream is within the 2-12 range, neutralization can ordinarily be controlled successfully within a single vessel. Should the pH of the feed fall outside this range, it should first be pretreated in a smaller vessel, where the pH is controlled in the 2-3 range for acid feeds or 11-12 for basic feeds, and then brought to neutral in a second, larger vessel. The reason for recommending a smaller vessel for pretreatment is to separate the periods of the two pH loops. A control loop is most sensitive to cyclic disturbances of its own period, and can actually amplify those cycles rather than attenuate them. Given equal recirculation ratios in the two vessels, the periods of the two pH loops will be proportional to their residence times. The sensitivity of the second loop to cycles in the first can be reduced by a factor of ten if the ratio of their residence times is 3:1 [5, pp. 130-134]. The vessel should be of equal dimensions in all directions, to minimize deadtime between inlet and outlet. Trying to control pH in a long narrow channel is ineffective, no matter how thoroughly mixed, because it is essentially all deadtime.
If the long narrow channel is already fixed, or if pH is to be controlled in a pond (e.g., a flocculation basin or an ash pond at a power plant), then a cubic section should be walled off at the inlet as shown in Figure 71.6, with pH controlled at the exit from that section. The remainder of the channel or pond can be used to attenuate fluctuations, precipitate solids, etc.
The vessel where pH is to be controlled should be either cylindrical or cubic, with inlet and outlet ports diametrically opposed as shown in Figure 71.5. This avoids short-circuiting the vessel, allowing use of its full capacity. The pH electrodes should be located directly in front of the exit port, to give an accurate representation of the product quality. While faster response may be obtained by moving them closer to the inlet, the measurement then no longer represents product quality. However, avoid placing the electrodes within the exit pipe itself, as that adds deadtime to the loop, which varies inversely with feed rate; it also results in an open loop whenever feed is discontinued. Reagent should be premixed with the feed at the feed point, rather than introduced anywhere else, because reagent introduced at other locations has produced erratic results. The feed ordinarily enters through a free fall, whose turbulence affords adequate premixing if the reagent is introduced there. Feed could be introduced near the bottom of the vessel and product withdrawn from the top on the other side, but this increases the deadtime between reagent addition and electrode response. Agitators are designed to pump downward (assuming aeration is not the purpose): the general pattern of flow in Figure 71.5 carries reagent down the shaft of the mixer and over to the electrodes in a short path. If feed and reagent are introduced near the bottom, however, the pattern of flow takes reagent up the wall of the vessel, then down the center, and finally up the wall on the other side where the electrodes are located; this path is twice as long, doubling the deadtime in the loop. Equation 71.16 applies to the arrangement shown in Figure 71.5.
71.4.3 Effective Backmixing
For pH control, high-velocity mixing is most effective. The vessel should be fitted with a direct-drive axial turbine (or marine propeller in smaller sizes) to maximize circulation rate. Low-speed radial mixers intended to keep solids in suspension increase deadtime excessively. If a vortex forms, the solution is rotating like a wheel, without
Figure 71.6 A large settling basin should be partitioned for pH control at the inlet; a pH alarm at the outlet starts recirculation of off-specification product.
distributing the reagent uniformly. In smaller cylindrical vessels, tilting the shaft of the mixer or offsetting it from center can eliminate vortex formation. In smaller rectangular vessels, the corners may be sufficient to prevent it, but in large, especially cylindrical, vessels, vertical baffles are needed. Horsepower requirements per gallon of vessel capacity decrease as the vessel capacity increases, from 2.3 Hp/1,000 gal for vessels of 1,000 gal to 0.8 Hp/1,000 gal for 100,000 gal. This level of agitation gives τd/τ1 ratios of about 0.05 [3, p. 160].
71.4.4 Reagent Delivery Systems Although Figure 71.5 shows the reagent valve located atop the vessel, it may be located in a more protected environment. However, its discharge line must not be allowed to empty when the valve is closed, as this adds deadtime to the loop. To avoid this problem, place a loop seal (pigtail) in the reagent pipe at the point of discharge, even if the discharge is below the surface of the solution. If the reagent is a slurry such as lime, it must be kept in circulation at all times to avoid plugging. A circulation pump must convey the slurry from its agitated supply tank to the reaction vessel and back. At the reaction vessel a "Y" connection then brings slurry up to a ball control valve located at the high point in its line, from which reagent drops into the vessel [3, p. 178]. Then the ball valve will not plug on closing, as solids will settle away from it. Often, feed is supplied by a pump driven by a level switch in a sump. In this case, feed rate is constant while it is flowing, but there are also periods of no flow. To protect the system against using reagents needlessly, the air supply to their control valves should be cut off whenever the feed pump is not running. At the same time, the pH controller should be switched to the manual mode to avoid integral windup, which would cause overshoot when the pump starts. When the pump is restarted, the controller
is returned to the automatic mode at the same output it had when last in automatic; its set point should not be initialized while in manual.
71.4.5 Protection Against Failure Failures are unavoidable. The pH electrodes could fail, reagent could plug or run out, or the feed could temporarily overload the reagent delivery system. Because of the damage which could result from off-specification product, protection is essential. A very effective strategy is shown in Figure 71.6. A second pH transmitter is located at the outfall, downstream of the point where pH is controlled. It actuates an alarm if the product is out of specification, alerting operators to a failure. At the same time, the alarm logic closes the discharge valve and starts a recirculation pump to treat the effluent again. The process must have sufficient extra capacity to operate under these conditions for an hour or more, to allow time for diagnosis and repair of the failure.
71.5 Control System Design If all of the above recommendations are carefully implemented, control system design is a relatively simple matter. But if they are not, even the most elaborately designed control system may not produce satisfactory results. This section considers those elements of design which will provide the simplest, most effective system.
71.5.1 Valve or Pump Selection The first order of business is estimating the maximum reagent flow required to meet the highest demand. (This does not necessarily correspond to the highest feed rate, because in some
THE CONTROL HANDBOOK
plants, the highest feed rate is stormwater which may require no reagent at all.) Then the minimum reagent demand must also be estimated, and divided into the maximum to determine the rangeability required for both conditions. If full flow of reagent is insufficient to match the demand, the pH will violate specifications and the product will have to be impounded and retreated, as described above. If the minimum controllable flow of reagent is too high to match the lowest demand, the pH will tend to limit-cycle between the values produced by the two conditions of minimum controllable flow and zero flow. The cycle will tend to be sawtoothed, with a relatively long period. Its amplitude could be within specification limits, however. Control valves typically have a rangeability of 35-100:1, higher than metering pumps at 10-20:1. A linear valve will produce a linear relationship between controller output and reagent flow if it operates under constant pressure drop, as, for example, when supplied from a head tank. For higher rangeability, large and small valves must be operated in sequence, with the one not in use completely closed. Proper sequencing requires that the valves have equal-percentage (logarithmic) characteristics, yielding an overall characteristic which is also equal percentage and must be linearized by a matching characterizer. This technique is detailed by Shinskey [5, pp. 400-401]. Ball valves used to manipulate slurry flow ordinarily have equal-percentage characteristics. If the pH of the feed could be on either side of the set point, then both acid and base valves must be sequenced, with both closed at 50% controller output. Since both valves must fail closed, the base valve should be fitted with a reverse-acting positioner, opening the valve as the controller output moves from 50% to zero.
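The rangeability arithmetic above, and the equal-percentage valve characteristic it relies on, can be sketched as follows. This is a minimal illustration: the flow figures are hypothetical, and R = 50 is simply a value inside the typical 35-100:1 control-valve range quoted in the text.

```python
def equal_percentage(x, R=50.0):
    """Fraction of full flow delivered by an equal-percentage
    (logarithmic) valve at stem position x in [0, 1]; R is the
    valve rangeability. Note that the model delivers 1/R, not
    zero, at x = 0, which is why the minimum controllable flow
    matters when matching the lowest reagent demand."""
    return R ** (x - 1.0)

def required_rangeability(q_max, q_min):
    """Maximum reagent demand divided by minimum demand."""
    return q_max / q_min

# Hypothetical demands: 200 gal/h at peak load, 4 gal/h minimum.
need = required_rangeability(200.0, 4.0)   # 50:1
single_valve_ok = need <= 50.0             # borderline: consider sequencing two valves
```

If the required rangeability exceeds what one valve can deliver, the text's remedy applies: sequence a large and a small equal-percentage valve and linearize the combined characteristic.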
71.5.2 Linear and Nonlinear Controllers In the case of a well-buffered titration curve, or when controlling at a set point away from the region of highest gain, a linear pH controller may be used quite satisfactorily. Where neither of these mitigating factors applies, however, a linear controller will not give satisfactory results. Many pH loops limit-cycle between pH 4-10 or 5-9 using a linear PID controller having a proportional band below 400% (proportional gain > 0.25). If the proportional band is increased (gain lowered) until cycling stops, the controller will not respond satisfactorily to load upsets, allowing large deviations to last for exceptionally long durations. The reason for this behavior is the very low gain of the titration curve beyond pH 4-10. When a disturbance forces the pH out of the 4-10 range, the loop gain is so low that very little correction is applied by the controller, amounting to integration of the large deviation over time. Eventually, reagent flow will match the load, but it is likely to cause overshoot as shown for the step load change in Figure 71.7. The long durations during which the pH is out of specification could cause damage to the environment downstream unless protection is provided, in which case the protective system may severely curtail production. The simplest characterizer for pH control is a three-piece function which produces a low gain within an adjustable zone around the set point, with normal gain beyond. The width of the zone
and the gain within it must be adjustable to match the titration curve. (Note that this is not the same as switching the controller gain when the deviation changes zones, as this would bump the output.) Although an imperfect match for the typical curve, this compensation provides the required combination of effective response to load changes and stability at the set point. The step load response in Figure 71.8 for the same process as in Figure 71.7 shows the effectiveness of the three-piece compensator. The width of the low-gain zone is ±2.2 pH around the set point, and its gain is 0.05 compared to a gain of 1.0 outside the zone. The proportional band of the controller is a factor of 14 lower (gain 14 times higher) than that of the linear controller of Figure 71.7. This combination actually gives more stability (lower loop gain) around the set point. For very precise pH control of a process having a single well-defined titration curve, more accurate characterization is needed. In building such a characterizer, remember that input and output axes are reversed, pH being the input to the characterizer and equivalent concentration the output. The required characterizer simply has more points connected by straight lines than the three-piece characterizer described above. If the set point is always to remain in a fixed position, this characterizer can relate input and output in terms of deviation, as was done with the three-piece characterizer. However, if set-point adjustments are likely, then there must be two identical characterizers used, one applied to the pH measurement and the other to the set point. (In a digital controller, the same characterizer can be used for both.)
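A minimal sketch of such a three-piece characterizer follows. The zone width and gains are the illustrative values quoted above, and the segments are joined continuously so that crossing a breakpoint does not bump the output.

```python
def three_piece(deviation, zone=2.2, zone_gain=0.05, outer_gain=1.0):
    """Three-piece nonlinear characterizer acting on the pH deviation.

    Inside +/-zone pH units of set point, the output follows the
    deviation with a low gain; outside, the normal gain applies,
    with the segments joined continuously at the breakpoints."""
    if abs(deviation) <= zone:
        return zone_gain * deviation
    sign = 1.0 if deviation > 0 else -1.0
    return sign * (zone_gain * zone + outer_gain * (abs(deviation) - zone))
```

A deviation of 1 pH inside the zone produces only 0.05 units of output, while a deviation of 3.2 pH produces 1.11 units, which is the "low gain near set point, normal gain beyond" behavior described in the text.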
71.5.3 Controller Tuning Procedures When a characterizer is not required, tuning of the PID controller should be carried out following the same rules recommended for other loops. Shinskey [6] gives tuning rules in terms of gain, deadtime, and time constant, or the natural period and gain at that period. For a PI controller, integral time should be set at 4.0τd or τn; for a PID controller, integral time should be 2.0τd or 0.5τn, and derivative time 0.5τd or 0.12τn. Because pH loops tend to limit-cycle, the natural period is easy to obtain and should give better results. Derivative action is strongly recommended for pH control. Derivative action amplifies noise, however, so that if measurement noise is excessive, derivative action may not be useful, at least without filtering. A viable alternative is the PIτd controller, a PI controller with deadtime compensation [6]. When properly tuned, it provides integration without phase lag and without amplifying noise. Optimum integral time is the same as for the PID controller, and optimum controller deadtime is 25% less. Proportional gain should be adjusted for quick recovery from disturbances in the process load without excessive overshoot or cycling. Rules for fine-tuning based on observed overshoot and decay ratio in response to disturbances in the closed loop are also provided in Shinskey [6]. Most pH loops operate at the same set point all the time. If a loop need not contend with set point disturbances, it should not be tested by disturbing the set point. Instead, a simulated load change should be introduced by manually stepping the controller output from a steady state
Figure 71.7 A linear PID controller tuned to avoid cycling will be slow in responding to load changes.
Figure 71.8 A three-piece nonlinear characterizer adds fast recovery and stability to the PID loop.
at set point, and immediately transferring the controller to the automatic mode. The effect of such a disturbance will be identical to that of the step load changes introduced in Figures 71.7 and 71.8.
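The integral and derivative settings from Section 71.5.3 can be wrapped in a small helper. This is a sketch: the rule constants are a reconstruction of the partially garbled text, reading them as τI = 4.0τd or 1.0τn for PI, and τI = 2.0τd or 0.5τn with τD = 0.5τd or 0.12τn for PID.

```python
def integral_derivative_times(mode="PID", tau_d=None, tau_n=None):
    """Integral and derivative times from deadtime tau_d or natural
    period tau_n (same time units). Proportional gain is not
    computed here; per the text, it is adjusted on-line for quick
    recovery without excessive overshoot or cycling. Constants are
    the reconstructed rule values stated in the lead-in."""
    if tau_d is not None:
        ti = 4.0 * tau_d if mode == "PI" else 2.0 * tau_d
        td = 0.0 if mode == "PI" else 0.5 * tau_d
    elif tau_n is not None:
        ti = 1.0 * tau_n if mode == "PI" else 0.5 * tau_n
        td = 0.0 if mode == "PI" else 0.12 * tau_n
    else:
        raise ValueError("supply tau_d or tau_n")
    return ti, td
```

For a loop with 30 s of deadtime, for example, the PID rule gives an integral time of 60 s and a derivative time of 15 s.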
If a three-piece characterizer is used to compensate the titration curve, it should be adjusted to position its breakpoints about 1 pH unit inside the knees of the curve. The slope of the characterizer between the breakpoints should be in the order of 0.05; if set at 0.1, it will not provide enough gain change for effective compensation, and, if set at zero, it will cause slow cycling between the breakpoints. In the event the titration curve is unknown, the width of the low-gain zone should be initially set at zero and the proportional band decreased (gain increased) until cycling begins. The width should then be increased until the cycling abates. There is no hope of adjusting more than two breakpoints on-line.
An encounter with a particularly variable titration curve led the author to develop a self-tuning characterizer [4]. It was a three-piece characterizer with remotely adjustable width for the low-gain zone. In the presence of an oscillation, the zone would expand until the oscillation stopped; in the presence of prolonged deviation it would contract. This solved the immediate problem, but required a constant period of oscillation, because the PID settings of the controller and the dynamics of the tuner were manually set. Now there are autotuning and self-tuning PID controllers on the market. Autotuning generally means that the controller tests the process on request, either with a step in the open loop or by relay-cycling in the closed loop, to generate initial PID settings (which may or may not be effective). These controllers have no means for improving on the initial settings, however, or of automatically adapting them to observed variations in process parameters. A true self-tuning controller observes closed-loop responses to naturally occurring disturbances, adjusting the controller settings to converge to a response that is optimum or represents a desired overshoot or decay ratio. The self-tuning controller therefore is capable of automatically adapting its settings to variations in process parameters. If a linear self-tuning controller is applied to a nonlinear pH process, it will adjust its gain low enough to dampen oscillations about set point. When a load disturbance follows, it will begin to respond like the low-gain linear controller in Figure 71.7. Observing the slow recovery of the loop, it will then increase controller gain until the set point is crossed, at which point oscillation begins, requiring that the gain be reduced to its former value. Although the resulting response is an improvement over that in Figure 71.7, it is not as effective as that in Figure 71.8. Therefore, nonlinear characterization is recommended, fit to the most severe titration curve likely to be encountered, with self-tuning used to adapt the controller gain to more moderate curves. Self-tuning of the integral and derivative settings of the controller can adapt to a period increasing with fouling of the electrodes. While this can keep the loop stable as fouling proceeds, response to disturbances will nonetheless deteriorate due to the increasing time constants of the measurement and the controller. Much more satisfactory behavior will be achieved by keeping the electrodes clean. In the event that fouling is not continuous and cleaning is required only periodically, an alarm can be activated when the adapted integral time of the controller indicates that fouling is excessive.
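The adaptation logic of a self-tuning low-gain zone, expanding on oscillation and contracting on prolonged deviation, can be sketched in a few lines. The step sizes and limits here are hypothetical; the original scheme in [4] used continuously acting tuner dynamics rather than discrete steps.

```python
def adapt_zone(zone, oscillating, prolonged_deviation,
               grow=1.1, shrink=0.95, z_min=0.1, z_max=4.0):
    """One adaptation step for the width of the low-gain zone of a
    three-piece characterizer: widen while the loop oscillates,
    narrow during a prolonged deviation, hold otherwise. The
    multipliers and clamps are illustrative values, not from the
    original design."""
    if oscillating:
        zone *= grow
    elif prolonged_deviation:
        zone *= shrink
    return max(z_min, min(z_max, zone))
```

Called once per detection interval, this drives the zone toward the narrowest width that still suppresses the limit cycle.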
71.5.4 Feedforward pH Control Feedforward control uses a measurement of a disturbing variable to manipulate a correcting variable directly, without waiting for its effect on the controlled variable. In a pH loop, disturbing variables are the flow and composition of the feed. It is quite effective to set the flow of reagent in ratio to the feed flow, and thereby eliminate this source of upset. However, the ratio between the two flows depends on the composition of the feed, which can be quite variable. The feedback pH controller must adjust this ratio, as needed, to control pH and thereby compensate for variations in feed composition. This is done by multiplying the measured feed flow by the output of the pH controller, with the product being the required flow of reagent. The reagent flow must then respond linearly to this signal. This system works well when the rangeability of the reagent flow is 10:1 or less, because it depends on the rangeability of flowmeters and metering pumps. Feedforward systems have been implemented using the pH of the feed as an indication of its composition, but with very limited success. This measurement is only representative of composition if the titration curve is fixed, which is never true for wastewater. Efforts have been made to titrate the feed to determine its composition or estimate the shape of the titration curve, but this takes time and these complications reduce the reliability of the system. Far more satisfactory performance will be realized if the process has been designed to be controllable, and a nonlinear feedback controller with self-tuning is used.
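The multiplication described above can be sketched as follows. The design ratio constant and the flow units are hypothetical; only the structure (feed flow times feedback-trimmed ratio) comes from the text.

```python
def reagent_flow_setpoint(feed_flow, ph_controller_output,
                          ratio_at_full_output=0.02):
    """Feedforward-plus-feedback reagent demand: the measured feed
    flow is multiplied by the pH controller output (0 to 1), scaled
    by a hypothetical design ratio. The feedback controller trims
    the effective ratio to absorb feed-composition changes, while
    feed-flow upsets are canceled directly, before they can move
    the measured pH."""
    return feed_flow * ph_controller_output * ratio_at_full_output
```

Note the feedforward action: doubling the feed flow doubles the reagent set point immediately, with no pH deviation required to drive the correction.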
71.5.5 Batch pH Control Some pH adjustments are carried out batchwise: this is the case for small volumes, or solutions too precious or toxic to be allowed discharge without assurance of precise control. The solution is simply impounded until its composition is satisfactory. Batch processes require a different controller than their continuous counterparts, however. In a continuous process, flow in and out can be expected to vary over time, constituting the major source of load change. In a batch process, there is no flow out, and so the load is essentially zero, while composition changes with time as the neutralization proceeds. If a controller with integral action is used to control the endpoint on a batch process, the controller will wind up, an expression indicating that the integral term has reached or exceeded the output limit of the controller. The cause of windup is the typically large deviation from set point experienced while reagent is being added to bring the deviation to zero. When the deviation is then finally reduced to zero, this integral term will keep the reagent valve open, causing a large overshoot. In a zero-load process, overshoot is permanent, and so must be avoided. The recommended controller for a batch pH application is proportional-plus-derivative (PD), whose output bias is set to zero. The controller output will then be zero when the pH rests at the set point, and will reach zero early as the pH approaches the set point if it is moving. This controller can thereby avoid overshoot, when properly tuned. Proportional gain should be set to keep the reagent at full flow until the pH begins to approach the set point, and the derivative time set to avoid overshoot. An equal-percentage reagent valve will help to compensate for the nonlinearity of the titration curve, because its gain decreases as flow is reduced and, therefore, as the set point is approached.
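A discrete-time sketch of the zero-bias PD controller follows. The gains, sampling interval, and output clamp are illustrative, and the derivative is taken on the measurement, a common choice consistent with the scheme described, since the set point is fixed.

```python
class BatchPD:
    """Proportional-plus-derivative controller with zero output bias
    for batch endpoint control (illustrative sketch).

    Output is zero when pH rests at set point, and reaches zero
    early when the pH is still rising toward it, which is what
    prevents permanent overshoot in a zero-load batch process."""

    def __init__(self, kp, td, dt):
        self.kp, self.td, self.dt = kp, td, dt
        self.last_ph = None  # no rate estimate on the first sample

    def update(self, setpoint, ph):
        rate = 0.0 if self.last_ph is None else (ph - self.last_ph) / self.dt
        self.last_ph = ph
        # Derivative on measurement opposes the approach rate.
        u = self.kp * ((setpoint - ph) - self.td * rate)
        return max(0.0, min(1.0, u))  # clamp to valve range; bias is zero
```

With the pH far below a set point of 7 the valve is driven fully open, but once the pH is rising quickly the derivative term cuts the output to zero well before the set point is reached.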
71.6 Defining Terms
Activity: The effect of hydrogen ions on a hydrogen-ion electrode.
Autotuning: Tuning a controller automatically based on the application of rules to the results of a test.
Backmixing: Recirculating the contents of a vessel with its feed.
Concentration: The amount of an ingredient per unit of mixture.
Dead band: The largest change in signal which fails to cause a valve to move upon reversal of direction.
Diprotic: Denoting an acid or base which yields two hydrogen or hydroxyl ions per molecule.
Feedforward control: Conversion of a measurement of a disturbing variable into changes in the manipulated variable which will cancel its effect on the controlled variable.
Isopotential point: The pH at which changes in temperature have no effect on output voltage.
Monoprotic: Denoting an acid or base which yields one hydrogen or hydroxyl ion per molecule.
Normal: A Normal solution contains 1.0 gram-ion of hydrogen or hydroxyl ions per liter of solution.
Residence time: The average time that molecules spend in a vessel.
Self-tuning: Tuning a controller to improve the response observed to changes in set point or load.
Solution ground: A connection between the process solution and the instrument ground.
Standardize: To calibrate a pH instrument against a standard solution.
Static mixer: An in-line mixer with no moving parts.
Titration curve: A plot of pH vs. reagent added to a solution.
Valve positioner: A device which forces the position of a valve stem to match the control signal.
Windup: Saturation of the integral mode of a controller, causing overshoot of the set point.
References
[1] McMillan, G.K., Understanding Some Basic Truths of pH Measurements, Chem. Eng. Prog., October, 30-37, 1991.
[2] Rovira, W.S., Microprocessors Bring New Power and Flexibility to pH Transmitters, Contr. Eng., September, 24-25, 1990.
[3] Shinskey, F.G., pH and pIon Control in Process and Waste Streams, John Wiley & Sons, New York, 1973.
[4] Shinskey, F.G., Adaptive pH Controller Monitors Nonlinear Process, Contr. Eng., February, 57-59, 1974.
[5] Shinskey, F.G., Process Control Systems, 3rd ed., McGraw-Hill, New York, 1988.
[6] Shinskey, F.G., Manual Tuning Methods, in Feedback Controllers for the Process Industries, McGraw-Hill, New York, 1994, pp. 143-183.
[7] Weast, R.C., Handbook of Chemistry and Physics, The Chemical Rubber Company, Cleveland, OH, 1970.
Control of the Pulp and Paper Making Process
72.1 Introduction
72.2 Pulp and Paper Manufacturing Background
Wood Species and Raw Materials * Products * Manufacturing Processes * Mechanical Pulping * Chemical Pulping * Bleaching * Paper Making * The Pulp and Paper Mill * Competitive Marketplace
72.3 Steady-State Process Design and Product Variability
Mill Variability Audit Findings * Variability Audit Results * Impact on Papermaking * Final Product Variability * Variability Audit Typical Findings * Control Equipment * Loop Tuning * Control Strategy Redesign, Process Redesign, and Changes in Operating Practice * Pulp and Paper Mill Culture and Awareness of Variability
72.4 Control Equipment Dynamic Performance Specifications
72.5 Linear Control Concepts - Lambda Tuning - Loop Performance Coordinated with Process Goals
Design Example - Blend Chest Dynamics using Lambda Tuning * Pulp and Paper Process Dynamics * Lambda Tuning * Choosing the Closed-Loop Time Constant for Robustness and Resonance * Multiple Control Objectives * Algorithms
72.6 Control Strategies for Uniform Manufacturing
Minimizing Variability - Integrated Process and Control Design
72.7 Conclusions
72.8 Defining Terms
References
W.L. Bialkowski, EnTech Control Engineering Inc.
72.1 Introduction Pulp and paper products are consumed by almost every sector of modern society--a society that shows no sign of becoming "paperless", especially with the advent of the photocopier and laser printer. Paper is manufactured from wood, a naturally renewable resource, by a large industry with significant economic impact on the world economy. The manufacture of pulp and paper products represents a particularly challenging environment for the control engineer, as it is large in scale, highly nonlinear, highly stochastic, and dominated by time delays. This is not a linear time-invariant system, but rather a "near-chaotic" environment, with all of the implications that this has for process and product variability. The market for paper products, however, is increasingly quality conscious and customer driven, one in which product uniformity (the opposite of product variability) is increasingly being demanded by the customer as the prime ingredient of "quality" and the "price of admission". This chapter reflects the experience gained [3] in the pulp and paper industry over the last decade, while attempting to improve product and process variability throughout the industry. The control engineer's challenge is not an academic one, but rather it is nothing short of enhancing the ability of his or her company to continue existing in such a competitive world. This challenge extends far beyond linear control theory and involves the control engineer as a critical member of an interdisciplinary team, who brings the essential understanding of dynamics to the manufacturing problem. Without this understanding of dynamics, it is impossible to understand variability. In the pulp and paper industry the control engineer is likely to be the only person who understands dynamics. Hence the scope of interest must extend far beyond the narrow confines of linear control theory or the closed-loop dynamic performance of single loops, and must encompass the characteristics of the product required by the customer, the characteristics of the raw materials, the nonlinear and multivariable nature of the process, the effectiveness of the control strategy to remove variability, and the information and training needs of the rest of the team.
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
This chapter will attempt to present the essence of this challenge by developing the subject in the following sequence: pulp and paper background, the impact of steady-state design, mill variability results to date, actuator nonlinearities, linear control concepts ("Lambda Tuning"), algorithms in use, control strategies for uniform manufacturing, minimizing variability by integrating process and control design, and finally conclusions. The reader is assumed to have an electrical engineering background, a strong interest in automatic control, and no prior exposure to the pulp and paper industry.
shorter than cellulose. The cell structure is bonded together by lignin, the third wood ingredient, which acts very much like a glue. Typical wood composition by fraction on a dry basis is shown in Table 72.1.

TABLE 72.1 Wood Composition (Dry Basis)
Wood Species    Cellulose %    Hemicellulose %    Lignin %
Hardwood        50             30                 20
Softwood        50             20                 30
72.2 Pulp and Paper Manufacturing Background Paper products, ever present in our daily lives, are manufactured primarily from wood in pulp and paper mills, which are large industrial complexes, each containing thousands of control loops. Let us describe some of this background before developing the control engineering perspective.
72.2.1 Wood Species and Raw Materials Wood from trees, the primary raw material for pulp and paper products, is a renewable natural resource. In North America alone, there are nearly 1000 pulp and paper mills which depend on forest harvesting activities extending from northern Canada, most of the U.S., northern Mexico, and from Newfoundland in the east, to Alaska in the west. Some mills produce and sell pulp as a product. Others are integrated mills that produce pulp and then paper. Some paper mills buy pulp as a raw material, and make paper. Recycling mills take paper waste and make paper. The pulp and paper industry exists in many parts of the world. The prime ingredient of wood, which makes it useful as a raw material, is cellulose, a long-chain polymer of the sugar glucose, which constitutes much of the wall structure of the wood cell, or fiber. Trees grow as a result of the natural process of photosynthesis, in which the energy from the sun converts atmospheric CO2 into organic compounds, such as cellulose, with the release of O2 gas. Growth occurs in the tree underneath the bark, as layers of cells are deposited in the form of annular rings (the age of a tree can be easily determined by counting these annular rings). The wood cell, or fiber, is "cigar" shaped, roughly square in cross-section, and varies in length from 0.5 mm to well over 4 mm. The center of the cell is hollow. Fiber length is a key property for paper making and is primarily determined by the wood species, which divide into two main types: softwoods and hardwoods. Softwoods are mainly coniferous trees which have needles instead of leaves, such as pines and firs. These species have long fibers: typically 3 mm long (redwoods have 7 mm long fibers) and about 30 µm wide. Hardwoods are primarily the deciduous trees with leaves, such as maple, oak, and aspen. Their fibers tend to be short and "stubby" (typically 0.5 mm).
The fiber structure consists of many layers of cellulosic "fibrils": crystalline bundles of cellulose molecules. Wood also contains hemicellulose, polymers of other sugars with chain lengths
The process of removing fibers from solid wood is called "pulping". There are two processes for pulping: mechanical and chemical. Mechanical pulping involves "ripping" the fiber out of the solid wood structure by mechanical and thermal means. Mechanical pulp fibers include all of the above ingredients, hence the pulping process has a very high yield (90%+). However, due to the nature of the pulping process, there is substantial fiber damage with a resulting loss in strength. Chemical pulping involves chemically dissolving the lignin fraction (and usually the hemicellulose fraction as well). This liberates the cellulose fraction, still in the shape of the original fiber, with relatively little damage, and results in high strength and the potential for high brightness, once all traces of lignin have been removed and the pulp has been bleached. The yield of chemical pulp is low, about 50% to 60%, because the lignin and hemicellulose have been dissolved. Typically, these organic compounds are used as a fuel in a chemical pulp mill.
72.2.2 Products Pulp and paper products cover a broad spectrum. An incomplete list includes newspaper, paper for paperback books, photocopier paper, photographic paper, paper for books, coated paper for magazines, facial tissue, toilet tissue, toweling, disposable diapers, sanitary products, corrugated boxboard, linerboard, corrugating medium, food board (e.g., cereal boxes, milk cartons), Bristol board, bleached carton (cigarette cartons, beer cases), molded products (paper plates), insulation board, roofing felt, fiberboard, market pulp for paper manufacture, and pulp for chemical feed stock such as the manufacture of rayon, film, or food additives. Each of these products requires certain unique properties which must be derived from the raw material. For instance, newspaper must be inexpensive, strong enough to withstand the tension imposed by the printing press without breaking, and have good printing properties. It is often made from recycled paper with added mechanical or chemical softwood pulp for strength. Photocopier paper must have excellent brightness, a superb printing surface, and must not curl or jam in the photocopier. Highly bleached chemical pulp is used. This type of paper is made by blending hardwood pulp (for a very smooth printing surface) with softwood pulp (for strength). Additives
72.2. PULP AND PAPER MANUFACTURING BACKGROUND
such as clay and titanium dioxide are added to enhance the printing surface. Facial tissue must be soft and absorbent. It is made from chemically bleached pulp. Market pulp is sold by pulp grade, each grade having specifications for species, brightness, viscosity, cleanliness, etc. In most cases market pulps are made chemically and are usually bleached.
72.2.3 Manufacturing Processes Pulp and paper mills vary greatly in design, as they reflect the product being made. The simplest division is to separate pulping from papermaking. We will discuss each of the significant areas as unit operations. The degree of control and instrumentation which exists varies widely depending on both the age and design of the plant.
72.2.4 Mechanical Pulping Mechanical pulping can be achieved by grinding or refining. Groundwood mills mechanically grind whole logs against an abrasive surface. The pulping action is a combination of raising the temperature and mechanically "ripping" the fiber from the wood surface. This is done by feeding logs to the "pockets" of grinders, which are powered by large synchronous motors (typically 5,000 to 10,000 HP). A typical groundwood mill may have 10 to 20 grinders (50 to 150 control loops). The Thermal-Mechanical Pulping (TMP) process (100 to 300 control loops) is the more modern way of mechanical pulping. It consists of feeding wood chips into the "gap" of rotating pressurized machines, called chip refiners, to produce pulp directly. Normally there are two refining stages. The chips disintegrate inside the refiner as they pass between the "teeth" of the refiner plates. Each refiner is powered by a synchronous motor of 10,000 to 30,000 HP. The amount of refining can be controlled by adjusting the gap between the rotating plates. Mechanical pulps can be brightened (bleached) to some degree.
20 to 200 control loops), in which the pulp is washed in a multistage countercurrent washing process to remove the spent cooking liquor, including the dissolved organic compounds (lignin, hemicellulose) and the spent sodium compounds which must be removed from the pulp stream and reused. The two output streams consist of washed pulp (which goes to the bleach plant for bleaching, or to the paper machines if bleaching is not required) and the spent liquor, which is called black liquor and is returned to the Kraft liquor cycle. The Kraft liquor cycle starts with pumping black liquor from the brown stock washers to the multiple-effect evaporators (unit process with 20 to 100 control loops) where the black liquor solids are raised from about 15% to about 50% by evaporation with steam. These solids are then fed to the recovery boiler (unit process with 100 to 500 control loops) in which the black liquor solids are further concentrated and then the liquor is fired into the recovery boiler as a "fuel". The bottom of this boiler contains a large smoldering "char bed" of burning black liquor solids, below which the sodium compounds form a molten pool of "smelt", containing Na2CO3, Na2S, and Na2SO4. The smelt pours from the bottom of the recovery boiler into a dissolving tank, where it is dissolved in water and becomes green liquor. The upper part of the recovery boiler is conventional in design and produces about 50% of the total steam consumed by the mill. The green liquor is pumped to the causticizing area (unit process with 50 to 200 control loops) where "burned lime" (CaO) is reacted with the green liquor in order to reconstitute the white cooking liquor (first reaction: CaO + H2O = Ca(OH)2; second reaction: Na2CO3 + Ca(OH)2 = 2NaOH + CaCO3). The resulting white liquor is sent to the digester for cooking.
The calcium carbonate is precipitated and sent to the lime kiln (unit process with 50 to 200 control loops) where the CaCO3 plus heat produces burned lime (CaO) plus carbon dioxide gas. The burned lime is used in the causticizing area to reconstitute the white cooking liquor.
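The two causticizing reactions above fix the proportions of lime consumed and caustic and lime mud produced. As a rough illustration (the molar masses are standard values; the 100-ton figure is a made-up example, not mill data), the lime demand can be sketched as:

```python
# Illustrative stoichiometry for the causticizing reactions quoted above
# (CaO + H2O -> Ca(OH)2, then Na2CO3 + Ca(OH)2 -> 2 NaOH + CaCO3).
# The combined reactions consume one mole of CaO per mole of Na2CO3.

M = {"CaO": 56.08, "Na2CO3": 105.99, "NaOH": 40.00, "CaCO3": 100.09}

def lime_demand(na2co3_tons: float) -> dict:
    """Tons of burned lime consumed, and NaOH / CaCO3 produced, per the
    1:1 molar ratio of the combined causticizing reactions."""
    mol = na2co3_tons / M["Na2CO3"]          # ton-moles of Na2CO3
    return {"CaO": mol * M["CaO"],
            "NaOH": 2 * mol * M["NaOH"],
            "CaCO3": mol * M["CaCO3"]}

# Example: causticizing 100 t of Na2CO3 from the green liquor.
demand = lime_demand(100.0)
print({k: round(v, 1) for k, v in demand.items()})
```

Roughly half a ton of burned lime is consumed per ton of sodium carbonate converted, which is why the lime kiln is a substantial unit operation in its own right.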
72.2.5 Chemical Pulping Chemical pulping can be divided into two main processes, sulfite and Kraft. The sulfite process is an acid-based cooking process, whose use is in decline. The Kraft process is an alkali-based process in which the active chemicals are fully recycled in the Kraft liquor cycle. The Kraft process itself consists of eight individual unit operations. It starts with wood chips being fed to a digester house (unit process with 50 to 300 control loops) in which the chips are fed to a digester and impregnated with white cooking liquor (a solution of NaOH and Na2S), and "cooked" at about 175°C for about an hour. In this process the lignin and hemicellulose are dissolved. The spent cooking liquor is then extracted and the pulp is "blown" into the "blow" tank. Modern digesters are continuous vertical columns, with the chips descending down the column. The impregnation, cooking, and extraction processes take about three hours or so. Some digester houses use batch digesters instead, with 6 to 20 batch digesters. From the blow tank the pulp is pumped to the brown stock washers (unit process with
72.2.6 Bleaching After cooking, the pulp may need to be bleached. This is necessary for all products which require high brightness or the complete removal of lignin. A typical bleach plant (unit process with 50 to 300 control loops) consists of the sequential application of specific chemicals, each followed by a reaction vessel and a washing stage. A bleaching sequence which has been used often in the past involves pumping unbleached "brown stock" from the pulp mill and applying the following chemicals in turn: chlorine (to dissolve lignin), caustic (to wash out lignin by-products), chlorine dioxide (to brighten), caustic (to dissolve by-products), and finally chlorine dioxide to provide final brightening. Bleach plant technology is currently in a state of flux as a result of the environmental impact of chemicals such as chlorine. As a result, the industry is moving towards new chemicals, chiefly oxygen and hydrogen peroxide.
THE CONTROL HANDBOOK
72.2.7 Paper Making
Paper is made on a paper machine (unit process with 100 to 1000 control loops). Paper making involves refining various types of pulps (mechanical work to improve bonding tendencies) individually, and then blending them together in combination with specific additives. This "stock" is then screened and cleaned to remove any dirt, and is diluted to a very lean pulp slurry before being ejected onto a moving drainage medium, known as "the wire", Fourdrinier, or former, where a sheet of wet paper is formed. This wet paper is then pressed, dried, and calendered (pressed between smooth polished surfaces) to produce a smooth final product with a specified basis weight (weight per unit area), moisture content, caliper (thickness), smoothness, brightness, color, ash content, and many other properties. Figure 72.1 shows a diagram of a simple fine paper machine (say, for photocopier paper). Hardwood and softwood stocks are stored in High Density (HD) storage tanks at about 15% consistency (mass percentage of pulp in a slurry). As the stock is pumped from the HDs it is diluted to 4% and refined. Paper machine refiners have rotating discs which fibrillate or "fluff" the fiber, thereby giving it better bonding strength. The hardwood and softwood stocks are then blended together in a desired ratio in the blend chest (subject of a case study later), and sent to the machine chest, the final source of stock for the paper machine. From here the stock is diluted with "white water" (filtrate which has drained through the forming section), and is sent to the headbox (a pressurized compartment with a wide orifice or "slice lip" which allows the stock to be ejected as a wide jet). The stock discharges from the headbox at the required speed onto the "wire", a wide porous "conveyor belt" consisting of synthetic woven material. Most of the fiber stays on, and most of the water drains through to be recycled again as "white water".
72.2.8 The Pulp and Paper Mill
The pulp and paper mill consists of between one and some thirty unit operations, with a total of about 500 to 10,000 control loops. A typical mill produces 1000 tons/day of product (although the capacity of mills can vary from 50 to over 3000 tons). Each of the operations involving wood chips or fiber is essentially hydraulic in nature, and involves a sudden application of chemical, dilution water, mechanical energy, or thermal energy. Sometimes this is followed by a reaction vessel with a long residence time. As a result, the fast hydraulic dynamics are very important in determining the final pulp or paper uniformity. A typical mill is operated some 355 days per year, 24 hours per day, by a staff of a few hundred people, often with a unionized workforce. The traditional responsibility for control loop tuning lies with the instrumentation department, which may have between 5 and 30 instrument technicians. The number of control engineers varies from a dozen in some mills to none in others. This has been a very brief description of pulp and paper science, a well-established field with a rich literature. A good introduction to this work is available in the textbook series on pulp and paper manufacture edited by Kocurek [14].
72.2.9 Competitive Marketplace
Regardless of the imperfections which might exist in the pulp and paper manufacturing environment, the marketplace now demands ever greater product uniformity. Warnings and recommendations about control performance and control engineering skill have been reported [3] in pulp and paper industry literature. Such problems are not unique to pulp and paper, and probably apply to most process industries. For instance, similar concerns have been expressed in the chemical industry [7], where the following recent prediction was made: "In ten years the ability to produce highly consistent product will not be an issue because those companies that fail in this effort will be out of the market or possibly out of business. Among the remaining companies, the central issue ten years from now will be the efficient manufacture of products that conform to customer variability expectations". To a great extent the issues are not technological but human. A higher level of competency in process control is required to reduce variability [9], which requires awareness creation, training, and education. To this end, there has been extensive training of both instrument technicians and control engineers in recent years [4], with nearly 2000 industry engineers and technicians having received training in dynamics and control.
72.3 Steady-State Process Design and Product Variability Pulp and paper mills, like most of heavy industry, have been designed only in steady state. Design starts with a steady-state flow sheet, proceeds to mass and energy balances, following which major equipment is selected and the design proceeds to the detailed stage. It is in the detailed design stage that control of the process is often considered for the first time, when sensors and valves are selected and a general purpose control system is purchased. It is generally assumed that only single PID loops will be required to control the process adequately and that these controllers can handle whatever dynamics are encountered when the process starts up. Until now, the prime focus has been on production capacity. The last ten years have witnessed increasing concern about high product variability, as the quality "revolution" has started to engulf the pulp and paper industry. Customer loyalties can change and large long-term contracts can be canceled when the product does not perform well in the customer's plant. As an example, Japanese newspaper print-rooms are now demanding paper which is guaranteed to have only one web break or less per one thousand rolls of paper [13]. Such intense competition has not yet reached North America, where newsprint suppliers are still being rated by their customers in breaks per hundred rolls, with 3 being good - this is 30 breaks per thousand rolls! As competition intensifies, pressures such as these will create an increasing need for closer attention to variability. Such market forces are creating new demands. For instance, the consulting company, EnTech Control Engineering Inc., was formed in response to the need for an independent variability and control auditing service with design and training expertise to help pulp
Figure 72.1 A paper machine. (Diagram labels include steam, calender, hardwood, white water, and dilution header.)
and paper companies acquire the skills to become more effective competitors in the marketplace. About 200 mill variability audits have been carried out to date. The results of these process and product variability audits [3] raise serious questions about the quality of mill design, maintenance practice, control equipment (such as control valves) and control knowledge, as these issues pertain to the ability to achieve effective process dynamics and low product variability.
72.3.1 Mill Variability Audit Findings The results shown are fairly typical of audit results which may be obtained from any part of the pulp and paper industry prior to corrective action. The blend chest area of a paper machine has been chosen as a small case study to illustrate typical mill variability problems. A design discussion of the blend chest area is continued throughout later sections of the chapter. The blend chest is included in the paper machine sketch of Figure 72.1, and is also shown in more detail in Figure 72.2 below, which illustrates a fine paper machine (fine paper refers to high quality, publication grade paper) manufacturing paper, such as photocopy paper, using 70% hardwood and 30% softwood and typically producing about 500 tons per day. The hardwood pulp is needed for surface smoothness and printability, and the softwood pulp is needed for strength. These two pulp slurries (stocks) are pumped out of their respective high density (HD) storage chests at about 5% consistency (mass percentage fiber slurry concentration). The hardwood stock is pumped out of the hardwood HD chest and is diluted further to 4.5% consistency at the suction of the stock pump, by the addition of dilution water. The dilution water is supplied from a common dilution header (pipe) for this part of the paper machine. In the example of Figure 72.2, the dilution water addition is modulated by the consistency control loop,
NC104 (typical loop tag based on ISA terminology), regulating the consistency to a set point of about 4.5%. The consistency sensor is normally located some distance after the pump, per the manufacturer's installation instructions. A time delay of 5 seconds or longer is typical. After consistency control, the hardwood stock is pumped into the hardwood refiner. The purpose of the refiner is to "fibrillate" the fiber, increasing its surface area and tendency to bond more effectively in the forming section. The refining process is sensitive to the mass flow of fiber and the mechanical energy being supplied to the refiner motor (specific energy). After the refiner, the stock flow is controlled by flow controller FC105. The softwood line is identical to the hardwood line. The two flow controllers FC105 and FC205 are in turn part of a cascade control structure, in which both set points are adjusted together to maintain the blend chest level at the desired set point (typically at 70%), while maintaining the desired blend ratio (70% hardwood and 30% softwood).
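The level-to-flow cascade just described can be sketched in a few lines. This is an illustrative toy, not the mill's actual control code; the gains, base flow, and the position-form PI step are assumptions chosen only to show the structure:

```python
# Illustrative sketch of the LC301 cascade: a PI level controller produces a
# total-flow set point, which is split 70/30 into the FC105 (hardwood) and
# FC205 (softwood) flow set points so the blend ratio is always preserved.

def level_to_flow_setpoints(level_error, integral, kc, ti, dt,
                            base_total_flow, hardwood_ratio=0.70):
    """One PI step mapping chest level error (%) to the two flow set points.

    Returns (fc105_sp, fc205_sp, new_integral)."""
    integral += level_error * dt / ti          # accumulate integral action
    total_sp = base_total_flow + kc * (level_error + integral)
    fc105_sp = hardwood_ratio * total_sp       # hardwood share (70%)
    fc205_sp = (1.0 - hardwood_ratio) * total_sp   # softwood share (30%)
    return fc105_sp, fc205_sp, integral

# Level 2% below set point: both flow set points rise, keeping the 70/30 blend.
hw, sw, _ = level_to_flow_setpoints(2.0, 0.0, kc=5.0, ti=600.0, dt=1.0,
                                    base_total_flow=2500.0)
print(round(hw, 1), round(sw, 1))
```

The point of the structure is that the level controller never adjusts either flow alone: any correction is distributed through the ratio, so chest level upsets do not disturb the hardwood/softwood blend.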
72.3.2 Variability Audit Results Figure 72.3 shows the blend chest consistency NC302 on automatic control (upper plot) and on manual (lower plot) for a period of 15 minutes (900 sec). On automatic, the mean value is 3.82% consistency, and two standard deviations ("2Sig") are 0.05093% consistency. This number (which represents 95% of all readings assuming a Gaussian distribution) can also be expressed as 1.33% of mean. It is this number which is usually referred to as "variability" in the pulp and paper industry (approximately half the range of the variable expressed as a percent). For the manual data, the mean value is virtually unchanged (the loop was simply taken off control). The variability, however, is now 0.848%, 50% lower than on automatic. NC302 is controlled by a PI controller which has been tuned with default tuning settings
Figure 72.2 The blend chest area of a paper machine. (Diagram labels include the hardwood and softwood H.D. chests, dilution headers, consistency loop NC204, flow loops FC105 and FC205, refiners, blend chest with level loop LC301, the 70%/30% ratio split, and the line to the machine.)
Figure 72.3 Blend chest consistency NC302 time series - on and off control. (Automatic: mean = 3.82503, 2Sig = 0.05093 (1.33%). Manual: mean = 3.81786, 2Sig = 0.03238 (0.848%).)
consisting of a controller gain of 1.0 and an integral time of 0.5 minutes/repeat (standard PI controller form). These tuning parameters occur frequently, are entered as default values prior to the shipping of controller equipment, and are "favorite numbers" for people tuning by "trial and error". However, the variability problems of NC302 are not all tuning related; the control valve was found to have about 1% stiction. It is this tendency which induces the limit-cycling behavior seen in Figure 72.3 at second 225 and from seconds 500 to 800. Variability of 1.33% for a blend chest consistency is not very high, but, considering that paper customers are starting to demand paper variability approaching 1%, this represents potential for improvement, especially because the variability on manual control is 50% lower than on automatic. Figure 72.4 shows two open-loop step tests (bump tests) for the blend chest consistency loop. Also shown is the impact of these on the dilution header pressure control loop, PC309 (not shown in Figure 72.2). Excursions in the consistency control valve of 2% cause header pressure excursions of over 1% and pressure control valve excursions of about 1%. This example illustrates the degree of coupling that exists between these two variables and poses questions about the tendency of this dilution header design to cause interaction between all of the consistency loops which feed from it. Every time that one consistency loop demands more dilution water, it will cause a negative dilution header pressure upset and disturb the other consistency loops. Not visible in Figure 72.4 is the fact that the NC302 dilution control valve also has about 1% stiction, which was mentioned as the cause of the limit cycle in Figure 72.3. Figure 72.5 shows an expanded view of the first NC302 bump test of Figure 72.4. The top trace shows the controller output, while the lower shows the process variable.
A fit of a first-order plus deadtime process transfer function model is superimposed over this data. The lower left hand of Figure 72.5 shows the resulting parameters for the transfer function form G(s) = Kp e^(-Td s) / (tau s + 1).
The time delay of 8.3 seconds is excessively long. It results from locating the consistency sensor too far from the pump discharge (could be less than 5 sec.). The time constant of 18.8 seconds is also excessively long (could be 3 sec.). This parameter is primarily determined by the adjustment of a low-pass filter in the consistency sensor itself. Both of these issues contribute to the sluggish behavior of NC302. Figure 72.6 shows a series of open-loop bump tests performed on the hardwood flow FC105. There are seven steps performed on the controller output. The first five have a magnitude of about 0.7% (one negative, four positive), while the last two have a magnitude of about 1.5%. It is clear that steps 3, 4, and 5 produce no noticeable flow response. Step number six, of 1.5%, produces a flow response of about 50 gallons per minute, and the last step produces no response at all. The combined backlash and stiction in this valve assembly is about 5%. Clearly, it is impossible to control this flow to within 1%. Similar bump tests on the softwood flow FC205 indicated that the combined backlash and stiction are in the order of about 1.7%.
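The first-order plus deadtime model identified for NC302 (Kp = -0.0288, Td = 8.3 s, time constant 18.8 s, from the Figure 72.5 fit) can be exercised directly. The code below is a sketch for illustration, not the audit software:

```python
# Step response of the first-order-plus-deadtime (FOPDT) model fitted to
# NC302: nothing happens for Td seconds, then the output approaches
# Kp * du with time constant tau.
import math

def fopdt_step(kp, td, tau, du, t):
    """Deviation in the output at time t after a step of size du at t = 0."""
    if t < td:
        return 0.0                      # still inside the deadtime
    return kp * du * (1.0 - math.exp(-(t - td) / tau))

# A +2% step in the dilution valve: consistency eventually falls by
# Kp * du = -0.0576%, but only after the 8.3 s delay and 18.8 s lag.
for t in (0, 5, 10, 30, 60, 120):
    print(t, round(fopdt_step(-0.0288, 8.3, 18.8, 2.0, t), 4))
```

The sluggishness is visible immediately: more than half a minute passes before the consistency reaches most of its final deviation, which is why the loop cannot reject fast dilution-header disturbances.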
Figure 72.7 shows the softwood flow FC205 on automatic control. There is a limit cycle induced by the combined stiction and backlash of more than 1.5%. The controller output appears like a triangular wave of amplitude 1.7%. This is characteristic of stiction-induced limit cycles and is caused by the PI controller integral term attempting to induce valve motion. The valve, however, is stuck and will not move until the controller output has changed by about 1.7%. The valve will then release suddenly and move too far, thereby inducing the next cycle. The period of the limit cycle is about 1.4 minutes. The period is a function of the stiction magnitude, the PI controller tuning, and the process gain. Figure 72.8 shows what happens when FC205 is taken off control. The upper plot of Figure 72.8 is the data from the last half of Figure 72.7. The bottom half of Figure 72.8 shows FC205 on manual control. It is clear that the limit cycling stops and the variability drops significantly. This is quite typical of many flow control loops. Figure 72.9 shows the operation of the blend chest level controller, together with the two flows entering the chest. The upper left hand plot shows the blend chest level LC301. The upper right hand plot shows the LC301 power spectrum plotted on a log-log scale. There is a tendency for the level control to cycle with a period of about 17 minutes, as indicated by the time series plot and by the low-frequency lobe of the power spectrum which contains the fundamental frequency (period of 1024 seconds). Level controllers frequently cycle in industry because their tuning is not intuitive. There is a tendency for people to use tuning parameters such as the default settings of a gain of 1.0 and an integral time of 0.5 minutes. This represents far too aggressive an integral time and far too low a gain for an integrating process such as the blend chest, which requires quite different tuning.
The two lower plots show the softwood FC205 (left) and hardwood FC105 (right) flows. The variability of both is about 5%. However, this is where the similarity ends. Both flows limit cycle in their own characteristic way with large amplitude swings. The softwood flow is limit cycling in a manner similar to that shown in Figure 72.7 (period now 2.6 minutes). The hardwood flow FC105, on the other hand, appears to break into a violent limit cycle on a periodic basis. The amplitude of this limit cycle is more than 8% with a period of about one minute. The violent limit cycling stops periodically, followed by periods of little action lasting more than five minutes. It is interesting how such nonlinearities cause frequency shifts and multiple frequency behavior reminiscent of chaos theory.
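The stiction mechanism described above (integral action ramps the controller output while the valve sticks; the valve then breaks free and slips too far, starting the next cycle) can be reproduced with a toy simulation. All numbers here are assumptions chosen only to make the cycle visible; only the roughly 1.7% band is taken from the FC205 observations:

```python
# Minimal sketch of a stiction-induced limit cycle in a PI flow loop.
# The valve does not move until the controller output has traveled past
# the stiction band, and then slips all the way to the new target.

def simulate_stiction_cycle(n=2000, dt=1.0, kc=0.5, ti=10.0,
                            stiction=1.7, setpoint=50.0):
    u = valve = 48.0                    # controller output and valve, %
    err_prev = setpoint - valve
    flows, slips = [], 0
    for _ in range(n):
        flow = valve                    # fast flow loop: flow tracks the valve
        err = setpoint - flow
        u += kc * (err - err_prev) + (kc * dt / ti) * err  # velocity-form PI
        err_prev = err
        if abs(u - valve) > stiction:   # valve breaks free past the band...
            valve = u                   # ...and slips suddenly to u
            slips += 1
        flows.append(flow)
    return flows, slips

flows, slips = simulate_stiction_cycle()
# The loop never settles: the valve keeps slipping and the flow keeps cycling.
print(slips > 3, max(flows[-500:]) - min(flows[-500:]) > 1.0)
```

The controller output traces out the triangular ramps described in the text, and the flow jumps by roughly the stiction band on each slip, which is why retuning alone cannot remove this kind of cycle.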
72.3.3 Impact on Papermaking The impact of the variability described above is quite damaging to papermaking. The variability in hardwood and softwood flows causes a number of problems. First of all, flow variability causes variable consistency, which the consistency controller cannot correct due to its time-delay dominant dynamics. In addition, the consistency controller tends to have a resonant mode due to its time-delay dominant dynamics. The net result is that the variability in the mass flow of fiber is amplified by the joint action of consistency control and stock flow control. This vari-
Figure 72.4 Interaction between blend chest consistency NC302 and dilution header pressure PC309.
Figure 72.5 Blend chest consistency NC302 open-loop bump test. (First-order model fit: Kp = -0.0288, Td = 8.30 s, Tau = 18.8 s.)
Figure 72.6 Hardwood flow FC105 bump tests showing about 5% backlash/stiction.
Figure 72.7 Softwood flow FC205 with 1.2% stiction showing a limit cycle.
Figure 72.8 Softwood flow FC205 on and off control. (Automatic: 2Sig = 38.72 GPM (5.66%). Manual: 2Sig = 18.1 GPM (2.61%).)
Figure 72.9 Blend chest level LC301 on automatic control, with power spectrum and the two entering flows. (LC301: 2Sig = 0.4172 (0.596%); softwood flow FC205: 2Sig = 34.84 (4.99%); hardwood flow FC105: 2Sig = 97.75 (5.42%).)
ability is driven by the unique limit-cycle behavior of each flow controller. As a result the refining process will suffer, with the bonding strength developed by each fiber species varying in different ways. In addition, the stock blend (e.g., 70% hardwood, 30% softwood) uniformity is also compromised by the different limit cycles. At times, the hardwood/softwood ratio is off by 5% in Figure 72.8. Clearly this cannot help to deliver uniformly blended stock to the paper machine.
72.3.4 Final Product Variability Final paper product measurements include: basis weight (mass per unit area), moisture content, caliper (thickness), smoothness, gloss, opacity, color, and brightness. Of these, basis weight is the most important and the easiest to measure, as the sensor is essentially an analog sensor. Almost all of the other sensors are digital with limited bandwidth. The final quality sensors of most paper machines are located on the scanning frame, which allows the sensors to traverse the sheet at the dry end. Typically each traverse takes 20 to 60 seconds. During each traverse, sensor averages are calculated to allow feedback control in the "machine direction" (MD). At the same time, data vectors are built during the sheet traversing process to measure and control the "cross-direction" (CD) properties. To measure "fast" basis weight variability the traversing must be stopped by placing the scanning sensors in "fixed point" (MD and CD controls are suspended for the duration). While in fixed point, it is possible to collect data from the basis weight sensor (beta gauge) at data rates limited only by the bandwidth of the data collection equipment. Figure 72.10 shows a fixed-point basis weight data collection run. The average basis weight is 55 g/m2 (gsm), with variability of 3.29%. Clearly there is a strong tendency to cycle at about 0.2 Hz, or 5 seconds per cycle. This is evident in the time series plot (upper left), in the power spectrum plot (upper right), in the autocorrelation function (bottom left), and in the cumulative spectrum (bottom right), which shows that about 25% of the variance is caused by this cycle. The cause of the cycle is not related to the blend chest data but rather to variability in the headbox area. Nevertheless, the time series data indicates that it is worthwhile to identify and eliminate this cycle. Figure 72.11 shows similar data collected at a slower rate for a longer period of time.
Once again the mean basis weight is 55 gsm, and the variability is 3.7%. The time series data indicates a relatively confused behavior. From the cumulative spectrum it is clear that 50% of the variance is caused by variability slower than 0.02 Hz (50 seconds/cycle). From the power spectrum it is evident that there is significant variability from 0.003 Hz (5.6 minutes/cycle) through to about 0.0125 Hz (1.3 minutes/cycle). In addition there is considerable power at the fundamental period of 17 minutes. All of these frequencies fit the general behavior of the blend chest area, including the hardwood and softwood flows, as well as the blend chest level. Proof of cause and effect can be established by preventing the control-induced cycling (placing the control loops on manual) and collecting the basis weight data again. Figure 72.12 shows the behavior of basis weight over a period of about 2.5 hours based on scan average data (average of each traverse) collected
every 40 seconds. The mean is still 55 gsm, and the variability is now 1.39%. The Nyquist frequency of 0.758 cycles/minute corresponds to a scanning time of 40 seconds. There is significant power at 0.38 cycles/minute (2.6 minutes/cycle), and at 0.057 cycles/minute (17 minutes/cycle). Once again these frequencies correspond to some of the behavior in the blend chest area. The purpose of this small case study was only to illustrate typical behavior of automatic control loops in a fairly realistic setting. In practice, an operating unit process, such as the example paper machine, has several hundred control loops of which about a hundred would be examined carefully during a variability audit to investigate the causal interrelationships between these individual variables and the final product.
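The cycle periods quoted in this section are read off power spectra such as those in Figures 72.9 through 72.12. A minimal sketch of that computation, using synthetic data in place of the (unavailable) mill data, might look like this:

```python
# Sketch: locate the dominant cycle period in a time series from its power
# spectrum. The basis-weight trace below is synthetic, built to resemble the
# fixed-point run of Figure 72.10 (a 0.2 Hz cycle riding on noise).
import numpy as np

def dominant_period(signal, dt):
    """Return the period (s) of the largest nonzero-frequency spectral peak."""
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    power = np.abs(np.fft.rfft(signal - np.mean(signal))) ** 2
    peak = np.argmax(power[1:]) + 1        # skip the DC bin
    return 1.0 / freqs[peak]

rng = np.random.default_rng(0)
t = np.arange(0, 1024, 0.1)                # 1024 s sampled at 10 Hz
bw = 55.0 + 0.8 * np.sin(2 * np.pi * 0.2 * t) + 0.2 * rng.standard_normal(t.size)
print(round(dominant_period(bw, 0.1), 2))  # close to 5 seconds per cycle
```

In an audit the same spectral peak is traced upstream, loop by loop, until the unit operation producing that frequency is found, which is how the 17-minute level cycle was tied to the blend chest.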
72.3.5 Variability Audit Typical Findings The case study results are typical of audit results. As of now, our audit experience extends over approximately 200 audits, with 50 to 100 control loops analyzed in each audit. Audit recommendations frequently include 20 to 50 items covering process design, control strategy design, valve replacement, and loop tuning issues. Findings by loop fall into several categories, as listed in Table 72.2.
TABLE 72.2 Variability Audit Findings by Category.
Loop category: Loops which...
- reduce variability (have adequate/good design, equipment, and tuning)
- cycle and increase variability due to control equipment (valve backlash, stiction, etc.)
- cycle and increase variability due to poor tuning
- require control strategy redesign to work properly
- require process redesign or changes in operating practice to reduce variability
Total
These findings say that only 20% of the control loops surveyed actually reduce variability over the "short term" in their "as-found" condition. "Short term" means periods of about 15 minutes for fast variables, such as flow loops, and periods of one or two hours for loops with slow dynamics, such as basis weight. Let us examine the chief reasons for inadequate performance, which are control equipment, tuning, and design.
THE CONTROL HANDBOOK

Figure 72.10 Paper final product - Basis weight (0.01 to 5 Hz).
Figure 72.11 Paper final product - Basis weight (0.001 to 0.5 Hz).
Figure 72.12 Paper final product - Basis weight (0.0001 to 0.0125 Hz).

72.3.6 Control Equipment
The control equipment category accounts for about 30% of all loops and represents a major potential for improvement. Control valve nonlinear behavior is the primary problem. However, also included are other problems, such as installation (e.g., sensor location), excessive filtering, or equipment characteristics (e.g., inappropriate controller gain adjustment ranges or control mode features). The blend chest case study discussed several problems. The most serious were the control valve backlash/stiction problems exhibited by FC105 and FC205. To some extent, the blend chest consistency loop NC302 also suffered from control-valve-induced limit cycling. Lesser problems included the inappropriate consistency sensor location for NC302 (time delay too long) and the excessive filter time constant. The serious problems in this category are the control valve nonlinearities such as backlash, stiction, and issues relating to control valve positioners (local pneumatic feedback servos), which react to valve stiction with time delays that are inversely proportional to the change in the valve input signal. These nonlinearities cannot be modeled through "small-signal" linearization methods and are extremely destabilizing for feedback control. The only recourse is to eliminate such equipment from the process (the best way of linearizing). This is discussed further in the section on dynamic performance specifications.
72.3.7 Loop Tuning
This category accounts for 30% of all loops and represents a major potential for improvement. In the case study, the blend chest consistency loop NC302 and the blend chest level loop LC301 were in this category. Until recently, loop tuning has been an "art" in the process industries, done by trial and error. The responsibility for loop tuning in a pulp and paper mill normally rests with the instrumentation shop and is carried out by instrument technicians. Until recently, the only training that most instrument technicians had received was in the Quarter-Amplitude-Damping method
of Ziegler and Nichols [21]. This method is not robust (gain margin of 2.0 and phase margin of 30°) and tends to produce excessive cycling behavior. As a result, most people have taught themselves to "tune by feel". Recent training [4] throughout the industry has introduced many engineers and technicians to "Lambda Tuning", discussed in detail in a later section.
72.3.8 Control Strategy Redesign, Process Redesign, and Changes in Operating Practice
Control equipment and loop tuning problems can be corrected at relatively little cost. The remaining opportunity to gain another 20% advantage in variability reduction is spread over a number of areas including control strategy redesign, process redesign, and changes in operating practice. Issues relating to process and control design are discussed in a later section. Changing operating practice involves people and how they think about their operation. Making changes of this kind involves creating awareness and changing "culture".
72.3.9 Pulp and Paper Mill Culture and Awareness of Variability
The reader can be forgiven for the obvious question: Are people aware of the variability that they actually have? The answer is generally "No". Space does not permit a thorough discussion of
this subject, which determines a mill's ability to move forward (see [9]). In brief, there are many historical reasons why operating data is presented in a particular way. Even though most mills use digital "distributed control systems" (DCS) today, the psychology of the past has carried over to the present. In current DCSs, process data is manipulated in ways which tend to strip the true nature of the process variability from it. Some of these include slow input sampling rates, data aliasing, slow update rates for operator consoles and mill data archives, "report-by-exception" techniques (to minimize digital communication traffic), and display of data on a scale of 0-100% of span [11].
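Data aliasing deserves a brief illustration: when an archive stores values at less than twice the highest cycle frequency present, a fast cycle reappears as a slow phantom one. The sketch below (the helper name and the 3-minute archive rate are my own illustrative choices; the 0.38 cycles/minute figure is borrowed from the scan data example) computes the apparent frequency of such an undersampled cycle:

```python
def alias_frequency(f_signal, f_sample):
    """Apparent frequency (same units as inputs) of f_signal when sampled at f_sample."""
    # Fold the signal frequency into the baseband [0, f_sample/2].
    f = f_signal % f_sample
    return min(f, f_sample - f)

f_true = 0.38          # cycles/minute (the 2.6 minutes/cycle basis weight cycle)
f_s = 1.0 / 3.0        # archive stores one value every 3 minutes
f_apparent = alias_frequency(f_true, f_s)
# The 2.6-minute cycle masquerades as a slow drift of roughly 21 minutes/cycle.
```

This is one concrete way a slow archive "strips the true nature of the variability" from the data: the cycle does not disappear, it is merely disguised as a drift.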
72.4 Control Equipment Dynamic Performance Specifications
Nothing can be done to improve control performance when the control equipment does not perform adequately. The control valve has been identified in audit findings as the biggest single cause of process variability in the pulp and paper industry. Another problem identified in audits is that some automatic controllers have inappropriate design features, such as gain adjustment ranges which exclude legitimate yet effective settings. To help the industry move forward, dynamic performance specifications have been prepared. These include a control valve dynamic specification [10] and a controller dynamic specification [8]. Currently these specification documents are widely used in the pulp and paper industry, frequently for equipment purchase. As a result, most valve suppliers to the pulp and paper industry are actively using the valve specification as a performance guide to test their products [20]. In a number of cases new or improved control valve products have emerged. It is anticipated that future control valves will have combined backlash and stiction as low as 1% or lower, and time delays of fractions of a second. When these levels cannot be tolerated, mills are considering variable speed drive technology for pumps as an alternative to the control valve.
72.5 Linear Control Concepts - Lambda Tuning - Loop Performance Coordinated with Process Goals
In the pulp and paper industry, the term "Lambda Tuning" refers to a concept in which the designer/tuner specifies the loop performance by choosing the closed-loop time constant (called Lambda (λ)). The calculation of the parameters required by a specific control algorithm is determined by a "Lambda tuning rule" - a transformation of the closed-loop time constant, the process dynamics, and the control algorithm structure into the appropriate gains. Lambda tuning is considered a useful concept for designing and tuning control loops in continuous processes, especially when low product variability is the objective. The concept of specifying a desired performance for each loop is the key to coordinating the overall performance of a unit process with hundreds
of control loops. The idea of specifying a particular closed-loop performance is foreign to most people, who equate high bandwidth with performance and hence would tune loops to be as fast as practically possible. The resulting closed-loop dynamics are then a function only of the open-loop dynamics and the stability margins required by robustness considerations. Yet if every control loop in a paper machine application were independently tuned for maximum bandwidth, it is doubtful that the paper machine would run at all, let alone make saleable paper! The beginnings of the Lambda tuning concept can be traced back to the work of Newton, Gould, and Kaiser [16] and the analytical design of linear feedback controls, in which the concept of stochastic control systems for minimum bandwidth and minimization of mean-square error was put forward. Dahlin [6] applied the analytical design concept to controlling processes with time delay and first-order dynamics and used the notion of a user-selected bandwidth. This was done by specifying the desired closed-loop pole position, λ. Dahlin originated the expression "Lambda Tuning", meaning a single parametric dynamic specification for closed-loop performance. Later these ideas, generalized and extended by others, specified the control loop performance by choosing a closed-loop time constant, as opposed to a pole location. The evolution of the internal model control (IMC) concept further extended these ideas to processes of arbitrary dynamics, while considering the associated problem of loop robustness [15]. More recently these concepts have been applied to the development of "tuning rules" for simple PI and PID controllers [5]. Finally, Lambda tuning has been adopted as the preferred controller tuning approach in the pulp and paper industry [17]. The mathematics of "Lambda tuning" is well established in the literature cited. But almost nothing is said about selecting the closed-loop time constant λ.
Only the trade-off between performance and robustness [15] is recognized: robustness suffers as the loop performance specification becomes more aggressive. Yet it is precisely the ability to choose a specific closed-loop time constant for each loop that makes Lambda tuning so useful in the pulp and paper industry. It allows uniquely coordinated tuning of all of the loops which make up a unit process to enhance the manufacture of uniform product. This idea is explored in the following blend chest design example.
72.5.1 Design Example - Blend Chest Dynamics Using Lambda Tuning
For the now familiar blend chest example, let us establish the manufacturing objectives and translate these into control engineering objectives as desired closed-loop time constants for each loop.
Blend Chest Manufacturing Objectives
The manufacturing or papermaking objectives are:
1. provide a uniformly blended source of paper stock, at a desired blend ratio
2. insure that the stocks leaving both the hardwood and softwood chests are at uniform consistencies
3. insure uniform refining of both stocks, so that these fibers have uniformly developed bonding strength
4. insure that the blend chest level is maintained at the level set point, and never spills over or goes below, say, 40%, while allowing the paper machine production rate to change by, say, 20%
Uniform stock delivery from the hardwood and softwood chests is critically important for two reasons. First, the refiners depend on uniform stock delivery to insure that refining action is uniformly applied. Overrefined fibers provide higher bonding strength but reduce the drainage rate on the wire and cause high sheet moisture. Underrefined fibers cause loss of sheet strength, higher drainage rate, and lower moisture content in the sheet. Overrefined and underrefined fibers are discrete qualities of stock and do not "average out".
Blend Chest Control Objectives:
1. Maintaining tight control over hardwood (NC104) and softwood (NC204) consistencies is certainly the most important objective, because it will insure constant and uniform fiber delivery. These two loops are then of prime importance. At the high density storage chests, the disturbance energy is normally very high and somewhat unpredictable. Hence, these loops should be tuned for maximum practical bandwidth. However, the consistency loops have time-delay dominant dynamics and, as a result, a high-frequency resonance (discussed later) slightly above the cutoff frequency. This high-frequency resonance typically occurs at about 0.2 radians/second (about 30 seconds per cycle; the frequency depends on tuning) and should be minimized so as not to amplify process noise at this frequency and allow this resonance to propagate further down the paper machine line. The choice of the closed-loop time constant (λ) determines the extent of the resonance. For a λ of two times the time delay, the resonance will be about +2 dB (AR = 1.26), and a λ equal to the time delay will cause a resonance of +3.6 dB (AR = 1.51). Because the noise in the resonant band is then amplified by over 30%, the latter choice is too aggressive. The hardwood and softwood consistency loops, with about 5 seconds of time delay, should be tuned for a λ of about 15 seconds.
2. The blend chest has a residence time of 20 minutes when full. At the normal operating point of 70% full, there are 14 minutes of stock residence time and an air space equivalent to 6 minutes. We are told that the paper machine production rate changes will be no more than 20%. A paper machine stoppage would cause the blend chest to overflow in 6 minutes. A 20% reduction in flow would cause the chest to
overflow in 30 minutes. The process immediately downstream from the blend chest is the machine chest, which is itself level controlled. As a result, the actual changes in blend chest outflow will be subject to the tuning of the machine chest level controller. The purpose of the blend chest is to provide surge capacity. Fast tuning of the level will tightly couple disturbances in the outflow to the very sensitive upstream refiner and consistency control problems. Hence the level controller LC301 should be tuned as slowly as practically possible (minimum bandwidth), subject to the stipulation that the level must never make excursions greater than 30%. A tank level, with the inlet flows controlled by a Lambda tuned PI controller, has a load response to a step change in outflow in which the maximum excursion occurs in one closed-loop time constant. A good choice for the closed-loop time constant may then be 10 or 15 minutes. Either of these choices will maintain the level well within the required 30%. It is important to insure that the level controller never oscillates, because oscillation is damaging to all of the surrounding loops. Hence the tuning must insure that the closed-loop dominant poles remain on the negative real axis (discussed later).
3. The hardwood (FC105) and softwood (FC205) flows are the inner cascade loops for the level controller LC301, and their set points are adjusted via ratio stations which achieve the 70:30 blend ratio. To maintain a constant blend ratio under dynamic conditions, both flows must be tuned for the same absolute closed-loop time constant. Yet the hardwood and softwood lines probably differ in pumps, pipe diameters, pipe lengths, valve sizes, valve types, flow meters, and, hence, open-loop dynamics. Any tuning method in which the closed-loop dynamics are a function only of the open-loop dynamics (e.g., Ziegler-Nichols, 5% overshoot, etc.) will produce different closed-loop dynamics for the hardwood and softwood flows.
Only Lambda tuning can achieve the goal of identical closed-loop dynamics. The actual choice of closed-loop time constants for both flows is also critically important. Both of these flows can be tuned to closed-loop time constants under 10 seconds without too much trouble. However, too fast a time constant will disturb the consistency loops and the refiner operation through hydraulic coupling. This will be especially true when accounting for the nonlinear behavior of the valves. Too slow a time constant will interfere with the operation of the level controller, which requires the inner loops to be considerably faster than its closed-loop time constant. Analysis is required to determine the exact value of the flow time constant to insure adequate stability margins for the level control. However, for
a level λ of 15 minutes, it is likely that the flows can be tuned for λs in the two-minute range. This choice will insure relatively light coupling between the flow and the consistency loops when the latter are tuned for closed-loop time constants of 15 seconds.
4. The blend chest consistency loop NC302 can be tuned in exactly the same way as the other consistency loops. Alternatively, if a repeatable noise model structure for this loop can be identified by studying the noise power spectrum on manual, it may be possible to tune NC302 using minimum-variance principles. If the noise model is the fairly common integrating-moving-average (IMA) type (drifts plus noise) and if the corner frequency of the noise structure is slower than the frequency at which the consistency loop has significant resonance [12], λ can be chosen to match the corner frequency of the noise structure, thereby approximately canceling the low-frequency drifts and producing a "white" power spectrum.
Conclusion - Design of Blend Chest Dynamics with Lambda Tuning
The blend chest example illustrates six reasons for choosing specific closed-loop time constant values: 1) maximum nonresonant bandwidth for the loops considered the most important (hardwood and softwood consistencies); 2) minimum possible bandwidth for the least important loop (level); 3) equal closed-loop time constants for loops controlling parallel processes which must be maintained at a given ratio (hardwood and softwood flows); 4) a closed-loop time constant dictated by the dynamics of an upper cascade loop (flow loops); 5) a closed-loop time constant for loops of lesser importance sufficiently slower than adjacent coupled loops of greater importance (flows versus consistencies); and, finally, 6) a closed-loop time constant chosen to minimize variance by matching the regulator sensitivity function to the inverse of an IMA type noise structure, that is, by matching the regulator cut-off frequency to the IMA corner frequency.

72.5.2 Pulp and Paper Process Dynamics
Next, let us consider the types of process dynamics present in the pulp and paper industry. Most process dynamics can be described by one of the two general transfer functions shown below:

Gp(s) = Kp(βs + 1)e^(-sTd) / [s^a (τ1 s + 1)(τ2 s + 1)]    (72.2)

Gp(s) = Kp(βs + 1)e^(-sTd) / [s^a (τ² s² + 2ζτs + 1)]    (72.3)

where a = 0 or 1, β = lead time constant (positive or negative), τ1, τ2 = time constants (τ1 ≥ τ2), ζ = damping coefficient, and Td = deadtime. Typical parameter values for pulp and paper process variables are listed in Table 72.3.

72.5.3 Lambda Tuning
Lambda tuning employs the general principles used in the Internal Model Control (IMC) concept, which places the following requirements on a controller:
1. The controller should cancel the process dynamics: process poles with controller zeros, and process zeros with controller poles.
2. The controller should provide at least one free integrator (Type 1 loop) in the loop transfer function to insure that offsets from set point are canceled.
3. The controller must allow the speed of response to be specified (Lambda (λ) to be set by the designer/tuner).
For the process transfer function of Equation 72.2, these general principles call for PID control with a series filter (PID.F). For nonintegrating processes, this translates into the following controller transfer functions:
Gc(s) = (τ1 s + 1)(τ2 s + 1) / [Kp (λ + Td) s (βs + 1)]

where λ is the closed-loop time constant, the set point response bandwidth is set by the inverse sensitivity function, and the load response bandwidth by the sensitivity function. For integrating processes, it is normally necessary to specify control loops of Type 2 form, because a controller integrator is usually desirable to overcome offsets due to load disturbances. The controller required for this task again takes a PI/PID form with the integrator retained.

Whereas these controllers can be implemented in PID.F form, in most cases a PI controller will suffice, especially when the performance objectives are reasonably modest relative to the loop robustness limits. This gives the following advantages: the series filters need not be implemented (they are not a standard feature of current distributed control systems (DCSs)), the resulting controller is within the training scope of the instrument technician, and the large actuation "spikes" caused by derivative control are avoided. The form of the PI controller contained in most DCSs is

Gc(s) = Kc (1 + 1/(TR s)).
Equating the controller gain (Kc) and reset (or integral) time (TR) to the general process parameter values of Equations 72.3 and 72.4 yields the following tuning rules [5]. Consider the two most important cases of process dynamics, first-order plus deadtime and integrating plus deadtime, which together represent 85% of all process dynamics in the pulp and paper industry. The tuning rules are listed in Table 72.4 below. These two tuning rules form the basis for the bulk of the Lambda tuning for single loops in the pulp and paper industry. Whereas the choice of closed-loop time constant λ provides the
TABLE 72.3 Typical Dynamics of Pulp and Paper Process Variables.
[Table of typical parameter values (Kp, a, β, τ1, τ2, ζ, Td) for: stock flow; stock flow with air entrainment; stock flow with flexible pipe supports; stock consistency; stock pressure; headbox total head with "hornbostle"; headbox total head with variable speed fan pump; paper machine basis weight; paper machine moisture; pulp dryer basis weight; dryer steam pressure; chest level; chest level with controlled flows; bleach plant chlorine gas flow; bleach plant pulp brightness; bleach plant steam mixer temperature; bleach plant oxygen gas flow; bleach plant D1 stage 'J tube' brightness; digester chip bin level; boiler feed water deaerator level; boiler steam drum level; boiler master steam header control with bark boiler. Key: * varies; # nominal value, depends on equipment sizing; ! tuning dependent.]
mechanism for coordinating the tuning of many control loops, this must always be done with control loop stability and robustness in mind. Furthermore, variability audit experience indicates that the tendency of control loops to resonate should be avoided at all costs. Hence, closed-loop poles must always be located on the negative real axis, and time-delay-induced resonance must be limited to manageable quantities, never more than, say, +3 dB. Such design objectives should significantly limit propagation of resonances within pulp and paper processes. Let us examine these issues further.

TABLE 72.4 PI Tuning Rules for First-Order and Integrating Dynamics.
Process Dynamics                          Kc                           TR
1) Gp(s) = Kp e^(-sTd)/(τ1 s + 1)         τ1 / (Kp (λ + Td))           τ1
2) Gp(s) = Kp e^(-sTd)/s                  (2λ + Td) / (Kp (λ + Td)²)   2λ + Td
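As a sketch of how such rules are applied in practice, the helper functions below (names are mine) compute PI settings from the process parameters and a chosen λ. The first-order rule follows directly from the pole-zero cancellation argument in the text; the integrating-process rule is written in a common IMC-style form (TR = 2λ + Td), which is my assumption of one reasonable variant rather than a rule quoted verbatim:

```python
def lambda_pi_first_order(Kp, tau1, Td, lam):
    """PI settings for Gp(s) = Kp*exp(-s*Td)/(tau1*s + 1) with closed-loop lambda."""
    Kc = tau1 / (Kp * (lam + Td))   # cancel the process pole, set the loop gain
    Tr = tau1                       # reset time equals the process time constant
    return Kc, Tr

def lambda_pi_integrating(Kp, Td, lam):
    """PI settings for Gp(s) = Kp*exp(-s*Td)/s (assumed IMC-style variant)."""
    Tr = 2.0 * lam + Td
    Kc = Tr / (Kp * (lam + Td) ** 2)
    return Kc, Tr

# Consistency loop from the text: Kp = 1, tau1 = 3 s, Td = 5 s, lambda = 2*Td = 10 s.
Kc, Tr = lambda_pi_first_order(Kp=1.0, tau1=3.0, Td=5.0, lam=10.0)
# -> Kc = 0.2, Tr = 3.0 s
```

Note how the deadtime enters only through the gain: the reset time simply mirrors the process time constant being canceled.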
72.5.4 Choosing the Closed-loop Time Constant for Robustness and Resonance
First-Order Plus Deadtime
Consider a consistency control loop, as an example of a first-order plus deadtime process, with the following process parameters (time in seconds; ignore the negative gain): Kp = 1, τ1 = 3, and Td = 5. By applying the Lambda tuning rule of Table 72.4, the loop transfer function becomes

Gloop(s) = [Kc (τ1 s + 1) / (τ1 s)] [Kp e^(-sTd) / (τ1 s + 1)] = e^(-sTd) / [(λ + Td) s].    (72.6)
As long as pole-zero cancellation has been achieved (or nearly so), the resulting dynamics depend only on Td and the choice of λ. Let us consider what happens as λ is varied as a function of Td. Figure 72.13 shows the root locus plot for the cases λ/Td = 1, 2, 3, and 4. The deadtime is approximated by a first-order Padé approximation (open-loop poles are x's, zeros are o's, and closed-loop poles are open squares). Figure 72.13 shows the time-delay pole/zero pair of the Padé approximation at ±0.4, together with the controller integrator pole at the origin. The most aggressive of the tuning choices, λ/Td = 1 (λ = Td = 5), has a damping coefficient of about 0.7 and hence is just starting to become oscillatory. Figure 72.14 shows the sensitivity functions plotted for all four cases. The resonance for the case of λ = 2Td = 10 is fairly acceptable at +2 dB, or an amplitude ratio of 1.26, hence noise amplification by 26%. From the viewpoint of robustness, this tuning produces a gain margin of 4.7 and a phase margin of 71°, a fairly acceptable result considering that changes in loop gain by a factor of two can be expected and changes of 50% in both time constant and deadtime can also be expected. This tuning is thus recommended for general use. On the other hand, a choice of λ = Td = 5 is barely acceptable, having a resonance of +3.6 dB (amplitude ratio of 1.51), a gain margin of only 3.1, and a phase margin of
Figure 72.13 Consistency loop root locus plots for λ/Td = 1, 2, 3, 4.
61°. A summary of these results is contained in Table 72.5. These values represent the limiting conditions for the tuning of such loops based on robustness and resonance. The actual choices of closed-loop time constants are likely to be made using process related considerations, as outlined in the blend chest example.
Integrator Plus Deadtime
Now consider the integrator plus deadtime case. The loop transfer function is

Gloop(s) = Kc Kp (TR s + 1) e^(-sTd) / (TR s²).    (72.7)
Let us use a digester chip bin level example to illustrate this case. Wood chips are transported from the chip pile via a fixed speed conveyor belt to the chip bin. The transport delay is 200 seconds; let us approximate this as 3 minutes for the sake of this example. The chip bin level controller adjusts the speed of a variable speed feeder which deposits the chips on the belt. The process parameters are (time in minutes): Kp = 0.005 and Td = 3. From Equation 72.7 the loop transfer function is a function only of the deadtime and the closed-loop time constant. Let us consider two cases, λ/Td = 2 and 3. Figure 72.15 shows the root locus plot for the case of λ = 2Td = 6 minutes. The root locus plot shows the Padé pole/zero pair approximation for deadtime at ±2/3, the reset zero at -1/9,
and two poles at the origin. The closed-loop poles are located on a root locus segment which moves quickly into the right half-plane (RHP), and the poles have a damping coefficient of only 0.69. Even more aggressive tuning speeds up the controller zero and increases the loop gain. This causes the root locus segment to bend towards the RHP earlier and reduces the damping coefficient. Clearly this and faster tunings are far too aggressive. Figure 72.16 shows a root locus plot for λ = 3Td = 9 minutes. The slower controller zero and lower loop gain cause the root locus to encircle the controller zero fully before breaking away again. Figure 72.16 shows an expanded plot of the region near the origin. Even though there is still a branch containing a pair of poles which goes to the RHP, the system is quite robust to changes in loop gain. Changes in deadtime move the deadtime pole/zero pair closer to the origin as the deadtime lengthens, causing the eventual separation of the root locus branch as in Figure 72.15. From this analysis, it is clear that for reasons of robustness and resonance, the closed-loop time constant should not be made faster than three times the deadtime. Figure 72.17 shows the load response for the chip bin level control tuned for λ = 3Td = 9 minutes, for a step change in outflow from the chip bin. The level initially sinks until the controller manages to increase the flow into the chest to arrest the change for the first time. This occurs at 9.9 minutes in this case. For minimum-phase systems (no deadtime or RHP zeros), this point occurs exactly at λ. Otherwise the closed-loop time constant is a good approximation for
Figure 72.14 Consistency loop sensitivity functions for λ/Td = 1, 2, 3, 4.
Figure 72.17 Digester chip bin level load response to step change in outlet flow (Td = 3, λ = 9).
TABLE 72.5 First-Order Plus Deadtime Lambda Tuning Performance/Robustness Trade-Off.
Tuning ratio (λ/Td):                  1       2       3       4
Closed-loop time constant λ (sec.):   5       10      15      20
Bandwidth 1/(λ+Td) (rad/sec):         0.1     0.0667  0.05    0.04
Resonant peak (dB):                   +3.6    +2.0    +1.8    +1.5
Amplitude ratio:                      1.51    1.26    1.23    1.19
Gain margin:                          3.1     4.7     6.8     7.9
Phase margin (°):                     61      71      76      79
the time when the load disturbance will be arrested. The time to correct fully for the whole change is approximately six times longer.
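The chip bin load response just described can be reproduced with a small Euler simulation of the integrator-plus-deadtime loop under PI control. The tuning below uses a common IMC-style integrating rule (TR = 2λ + Td, Kc = TR/(Kp(λ+Td)²)), which is my assumption, so the arrest time only approximates the 9.9 minutes quoted; the qualitative behavior (a sag arrested on the order of λ, then a slow recovery) is the point:

```python
# Integrating process: dy/dt = Kp*(u_delayed - d), with transport delay Td on u.
Kp, Td, lam = 0.005, 3.0, 9.0            # time in minutes
Tr = 2.0 * lam + Td                      # assumed IMC-style reset time
Kc = Tr / (Kp * (lam + Td) ** 2)

dt, T = 0.01, 80.0                       # Euler step and horizon (minutes)
delay = int(Td / dt)
buf = [0.0] * delay                      # transport delay on controller action

y, integ, d = 0.0, 0.0, 1.0              # level deviation, integral, step load
ys = []
for _ in range(int(T / dt)):
    e = -y                               # set point is 0 in deviation variables
    integ += e * dt
    u = Kc * (e + integ / Tr)            # PI controller
    buf.append(u)
    u_delayed = buf.pop(0)
    y += Kp * (u_delayed - d) * dt       # integrating level dynamics
    ys.append(y)

ymin = min(ys)                           # maximum level excursion (negative)
t_arrest = (ys.index(ymin) + 1) * dt     # time at which the sag is arrested
```

The simulation also shows the slow return toward set point over several λ, consistent with the "six times longer" rule of thumb.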
72.5.5 Multiple Control Objectives
A control loop can have multiple control objectives, such as good set point tracking, good rejection of major load disturbances, and minimizing process variability. The tuning for each of these is unlikely to be the same. The first two relate to fairly large excursions, where speed of response is of the essence and process noise does not dominate. Minimizing process variability, on the other hand, is of major importance in the pulp and paper industry and is essentially a regulation problem around fixed set points with noise dominant. There is a strong need to be able to accommodate the different tuning requirements (different λs) of these cases, all of which apply to the same loop at different times. One way to approach this problem might be to use logic within the control system, which would gain schedule based on the magnitude of the controller error signal. For errors below some threshold, say 3%, the controller would be tuned with λs suitable for rejecting noise over a certain bandwidth. This tuning is likely to be fairly conservative. On the other hand, when the controller error is above some threshold, say 10%, the gains could correspond to an aggressive λ chosen to correct the disturbance as quickly as practically possible. In addition, it is useful to consider the use of set point feedforward control to help differentiate between set point changes and major load changes.
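The error-magnitude scheduling logic described above can be sketched as a small selector with hysteresis (all names, gains, and thresholds here are illustrative, not from the text; the hysteresis band prevents chattering between the two tunings):

```python
class ErrorScheduledTuning:
    """Pick conservative or aggressive PI settings from the controller error magnitude."""

    def __init__(self, quiet_gains, aggressive_gains, low_pct=3.0, high_pct=10.0):
        self.quiet = quiet_gains            # (Kc, Tr) for regulation around set point
        self.aggressive = aggressive_gains  # (Kc, Tr) for large upsets
        self.low, self.high = low_pct, high_pct
        self.mode = "quiet"

    def select(self, error_pct):
        e = abs(error_pct)
        if e >= self.high:
            self.mode = "aggressive"
        elif e <= self.low:
            self.mode = "quiet"
        # Between the thresholds, keep the previous mode (hysteresis).
        return self.quiet if self.mode == "quiet" else self.aggressive

sched = ErrorScheduledTuning(quiet_gains=(0.1, 30.0), aggressive_gains=(0.3, 10.0))
```

In a real DCS this selection would also need bumpless transfer between the two gain sets, which is omitted here.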
SISO Control Using PI, PID, PID.F, Cascade, and Feedforward
The foregoing discussion has centered on the PI controller and SISO control (98% of the control loops in pulp and paper). The reason for such wide use of single loop control is partly historical, as this is how the industry has evolved. There is also good technical justification, given the sheer complexity of a process with hundreds of control loops. The operator may arbitrarily place any of these in manual mode for operating reasons. Hence the concept of designing a workable MIMO system is almost unworkable given the complexity (note that the pilot of an aircraft cannot put the left aileron on manual, only the entire autopilot system). Furthermore, the concept of Lambda tuning allows the PI controller to provide an effective solution in most cases. PID control is needed when the controller must be able to cancel two process poles (e.g., the control of an underdamped second-order process, or the aggressive control of a second-order overdamped process). There are times when it may be advantageous to include a series filter with the PID algorithm (PID.F) (e.g., the control of a second-order process with a transfer function zero). The filter is then used to cancel the zero. There is extensive use in the pulp and paper industry of cascade control (e.g., basis weight and stock flow, paper moisture and steam pressure, bleach plant brightness and bleaching chemical flow, boiler steam drum level and feedwater flow). There is also quite frequent use of feedforward control (e.g., boiler steam drum feedforward from steam demand).
Deadtime Compensation
The pulp and paper industry uses deadtime compensator algorithms extensively (e.g., basis weight control, moisture control, Kappa number control, bleach plant brightness control). The common types are the Smith predictor [19] and the Dahlin algorithm [6]. It is interesting to note that the original work of Smith focused heavily on the set point response of the algorithm, for which the technique is well suited. However, almost all of the deadtime compensators in commission are employed as regulators, operating in steady state with the goal of disturbance rejection and minimizing variability. Although not a new discovery, Haggman and Bialkowski [12] have shown that the sensitivity function of these regulators, as well as of deadtime compensators using the IMC structure [15], is essentially the same as that of a Lambda tuned PI controller for a given bandwidth. There is no way of escaping the time-delay phase lag induced resonance ("water-bed" effect) as long as simple process output feedback is used. These algorithms offer no advantage for low-frequency attenuation or in providing less resonance when tuned for the same bandwidth. On the other hand, they are more complex and have a greater sensitivity to model mismatch than the PI controller. One exception is the Kalman filter-based deadtime compensator [2], which uses state feedback from an upstream state variable in the deadtime model. Because the feedback control structure is identical to that of a Smith predictor, the time-delay resonance is also present. However, the Kalman filter prevents the excitation of this resonance through the low-pass dynamics of the Kalman filter update cycle, which has equal bandwidth to the regulator,
THE CONTROL HANDBOOK
thereby attenuating noise in the region of the primary time-delay resonance by some -10 dB.
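The Smith predictor structure discussed above can be sketched in a few lines: the PI controller acts on the delay-free model output plus a mismatch correction (measured output minus delayed model output), so the deadtime is removed from the feedback loop when the model is perfect. The discrete first-order-plus-deadtime process and all numbers below are hypothetical; this is a structural sketch, not the chapter's implementation:

```python
import math

def simulate_smith_pi(n_steps=400, dt=1.0, Kp=1.0, tau=20.0, delay=10,
                      Kc=1.0, TR=20.0, setpoint=1.0):
    """Minimal Smith-predictor regulator: positional PI acting on the
    delay-free model output plus a mismatch correction. All numbers are
    hypothetical; delay is in samples and must be >= 1."""
    a = math.exp(-dt / tau)      # discrete pole of the first-order lag
    b = Kp * (1.0 - a)           # input coefficient matching the dc gain
    y = ym = 0.0                 # plant output and delay-free model output
    integral = 0.0
    buf_u = [0.0] * delay        # past controller outputs (plant input delay)
    buf_m = [0.0] * delay        # past model outputs (delayed-model branch)
    for _ in range(n_steps):
        # Smith predictor feedback signal: model + (measurement - delayed model)
        feedback = ym + (y - buf_m[0])
        e = setpoint - feedback
        integral += e * dt / TR
        u = Kc * (e + integral)  # positional PI law
        # pop the delayed signals, push the new ones
        u_delayed = buf_u.pop(0); buf_u.append(u)
        buf_m.pop(0); buf_m.append(ym)
        # first-order updates: the plant sees the delayed input, the model does not
        y = a * y + b * u_delayed
        ym = a * ym + b * u
    return y

final = simulate_smith_pi()
```

With a perfect model the mismatch term is zero and the PI sees only the delay-free lag, which is why the set point response is clean; as the chapter notes, this does nothing to remove the delay from the disturbance-rejection sensitivity function.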
Adaptive and MIMO Control With the exception of gain scheduling, adaptive control has not achieved wide use in the pulp and paper industry, because its algorithms are too complex for the current technical capability in most mills. Whereas there are occasional reports of quite advanced work being done (such as [1], on MIMO adaptive control of Kamyr digester chip level using generalized predictive control (GPC)), in most cases the results of such advanced work are very difficult to maintain for long in an operational state in a mill environment. As for the much simpler commercially available self-tuning controllers, most of these perform some type of identification (e.g., the relay method) and then implement a simple adaptation scheme which often takes the form of automating a Ziegler-Nichols-like tuning. This usually results in a lightly damped closed-loop system which is quite unsuitable for the control objectives previously discussed (e.g., to prevent propagation of resonances). Finally, there is fairly wide use of simple decoupling schemes which depend mainly on static decouplers (e.g., basis weight and moisture, multi-ply headbox control of fan pumps).
72.6 Control Strategies for Uniform Manufacturing

The relationship between algorithms, control strategy, and variability was put into a concise perspective by Downs and Doss [7], and some of their thoughts have been put into a pulp and paper context here. The control algorithm, however simple or complex, does not eliminate variability; it causes the variability to be shifted to another place in the process. For instance, consistency control shifts variability from the stock consistency to the dilution valve and the dilution header, from the fiber stream to the dilution stream. The control algorithm and its tuning determine the efficiency with which this is done. The control strategy determines the pathway that the variability will follow. The control strategy can be defined as:
1. defining control system objectives
2. selecting sensor types and their location
3. selecting actuator types and their location
4. deciding on input/output pairing for single loops
5. designing multivariable control where needed
6. designing process changes that make control more effective
The control strategy design hence determines where the variability will attempt to go. Let us revisit the paper machine blend chest example. Both the hardwood and softwood streams have consistency control as these stocks leave their respective stock chests. Both of these consistency controls modulate dilution valves which take dilution water from the same header. As a result, they interact (see Figure 72.4). Hence variability will move
from the hardwood consistency to the dilution header, to the softwood consistency, to the blend chest consistency, and back again. When these stock flows enter the blend chest, they will undergo mixing, which will act as a low-pass filter. Hence, the consistency loops have attenuated the low-frequency content while causing interaction between each other at high frequency. Then, the high-frequency content will be attenuated by the mixing action of the chest. However, on leaving the blend chest, the blend chest consistency controller also draws dilution water from the same header. Clearly, this design will compromise almost everything that has been gained at the high-frequency end of the spectrum. This example also illustrates that the control engineer, acting alone in the domain of the control algorithm, cannot achieve effective control over process variability. What is needed is an integration of process and control design.
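The low-pass action of the chest described above can be quantified: an ideally mixed chest with residence time tau behaves like a first-order lag, so a sinusoidal consistency disturbance of angular frequency w is attenuated by the amplitude ratio 1/sqrt(1 + (w*tau)^2). The residence time and disturbance periods below are hypothetical illustrations, not values from the chapter:

```python
import math

def chest_attenuation_db(period_s, residence_time_s):
    """Attenuation (dB) of a sinusoidal disturbance of the given period by
    an ideally mixed chest modeled as a first-order lag whose time constant
    equals its residence time. Example numbers are hypothetical."""
    w = 2.0 * math.pi / period_s
    amplitude_ratio = 1.0 / math.sqrt(1.0 + (w * residence_time_s) ** 2)
    return 20.0 * math.log10(amplitude_ratio)

# A chest with a 300 s residence time vs. fast and slow consistency disturbances
fast = chest_attenuation_db(period_s=10.0, residence_time_s=300.0)
slow = chest_attenuation_db(period_s=3600.0, residence_time_s=300.0)
```

This is why the chest removes the high-frequency interaction noise but passes slow consistency drift: the consistency loops, not the chest, must handle the low-frequency content.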
72.6.1 Minimizing Variability: Integrated Process and Control Design

To design a process which produces low-variability product, a true integration of the process and control design disciplines is required. The old way of designing the process in steady state, and adding the controls later, has produced the pulp and paper mills of today, which, as variability audits have already shown, are variable far in excess of potential. Control algorithm design follows a very general methodology and is largely based on linear dynamics. When thinking about control loop performance, the engineer pays no attention to the actual behavior of the process. For instance, the most important phenomena in the pulp and paper industry concern pulp slurries and the transport of fiber in two- or three-phase flow. The physics that govern these phenomena involve the principles of Bernoulli and Reynolds and are very nonlinear. The linear transfer function is a necessary abstraction to allow the control engineer to perform linear control design, the only analysis that can be done well. Yet in the final analysis, control only moves variability from one stream to another, where hopefully it will be less harmful. Yet the process design itself is not questioned. What about the strategy of creating new streams? Integrated process and control design must take a broader view of control strategy design. Control is only a high-pass attenuation mechanism for variability. Process mixing and agitation provide low-pass attenuation of variability via mixing internal process streams. Yet both of these techniques only attenuate by so many dB. But the customer demands that the absolute variability of the product be within some specified limit to meet the manufacturing needs of his process. Surely, eliminating sources of variability is the best method to ensure that no variability will be present which will need attenuation. These issues are in the domain of process design and control strategy design.
Control strategy design does not lend itself to elegant and general analysis. Each process is different and must be understood in its specific detail. Nonlinear dynamic simulation offers a powerful tool to allow detailed analysis of performance trade-offs and is the only available method for investigating the impact of different design decisions on variability. Such decisions must question current process design practice.
From the blend chest example, it is clear that a number of process design issues compromise variability, and that integrated process and control design could lead to the following design alternatives:
1. Eliminate all stock flow control valves, and use variable frequency pump drives. This will eliminate all of the nonlinear problems of backlash and stiction.
2. Redesign the existing blend chest with two compartments, each separately agitated. This will convert the existing agitation from a first-order low-pass filter to a second-order filter and provide a high-frequency attenuation asymptote at -40 dB/decade instead of -20 dB/decade.
3. Provide a separate dilution header for NC302, and use a variable frequency pump drive instead of the control valve. This will eliminate the high-frequency noise content of the existing header from disturbing NC302, and will also provide nearly linear control of the dilution water.
4. Replace the dilution header control valve of PC309 by a variable frequency pump drive. This will allow much faster tuning of this loop, substantially reducing the interaction between NC104 and NC204.
Each of these design alternatives should be evaluated using dynamic simulation before significant funds are committed. The alterations proposed above vary in capital cost from $10,000 to $1,000,000. Hence the simulation must have high fidelity, representing the phenomena of importance. In addition, network analysis techniques may determine how variability spectra propagate through a process and control strategy. Changes in process and control strategy design alter these pathways by creating new streams. The plant can be viewed as a network of connected nodes (process variables) with transmission paths (e.g., control loops, low-pass process dynamics, etc.) which allow variability spectra to propagate. These ideas will need time to develop.
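Design alternative 2 above can be checked numerically: splitting the chest into two separately agitated compartments, each with half the total residence time, changes the high-frequency roll-off from -20 to -40 dB/decade, which matters greatly for fast disturbances. The residence time and disturbance period below are hypothetical:

```python
import math

def series_lags_attenuation_db(period_s, total_residence_s, n_compartments):
    """Attenuation (dB) of n equal, ideally mixed compartments in series
    whose residence times sum to total_residence_s. Hypothetical values;
    each compartment is modeled as a first-order lag."""
    w = 2.0 * math.pi / period_s
    tau = total_residence_s / n_compartments
    ar_single = 1.0 / math.sqrt(1.0 + (w * tau) ** 2)
    return 20.0 * n_compartments * math.log10(ar_single)

# A 10 s disturbance into a chest with 300 s total residence time:
one_compartment = series_lags_attenuation_db(10.0, 300.0, 1)
two_compartments = series_lags_attenuation_db(10.0, 300.0, 2)
```

Well above the corner frequency each compartment contributes -20 dB/decade, so the pair rolls off at -40 dB/decade as claimed, and the two-compartment chest attenuates this fast disturbance far more than the single chest despite having the same total volume.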
72.7 Conclusions

This chapter has attempted to provide a general overview of control engineering in the pulp and paper industry in the mid 1990s. There is a brief introduction to wood, pulp, paper products, and the unit processes. The results of variability audits were then presented to show how much potential exists in this industry to improve product uniformity, especially when there is an increasingly strong demand for uniform product. The concept of specifying the closed-loop performance of each control loop to match the process needs is presented as the Lambda tuning concept. The use of Lambda tuning is illustrated in a paper mill example, and the performance and robustness of tuned control loops are explored. There is a general review of various algorithms and their use in the industry. Finally, the concept of integrating control strategy and design of both the process and control is presented. This is seen as a new avenue of thought which promises to provide a design methodology for the pulp and paper mills of the future, which will be far more capable of efficiently manufacturing highly uniform product than today's mills.
72.8 Defining Terms

AR: Amplitude ratio.
Backlash: Hysteresis or lost motion in an actuator.
Basis weight: The paper property of mass per unit area (gsm, lbs/3000 sq. ft., etc.).
Bump test: Step test.
β: Process transfer function zero time constant.
C: ISA symbol for control.
Cellulose: Long chain polymer of glucose, the basic building block of wood fiber.
Chest: Tank.
Consistency: Mass percentage of solids or fiber content of a pulp or stock slurry.
CPPA: Canadian Pulp and Paper Association, 1155 Metcalfe St., Montreal, Canada, H3B 4T6.
DCS: Distributed control system.
Deadtime: Time delay.
F: ISA symbol for flow.
Gc(s): Controller transfer function in the continuous (Laplace) domain.
Gp(s): Process transfer function in the continuous (Laplace) domain.
gsm: Grams per square meter.
HD Chest: High Density chest, a large production capacity stock tank with consistency typically in the 10 to 15% range and with a dilution zone in the bottom.
Hemicellulose: Polymers of sugars other than glucose, a constituent of wood fiber.
IMA: Integrating moving average noise structure.
ISA: Instrument Society of America.
ISA tags: ISA tagging convention (e.g., FIC177 means Flow Indicating Controller No. 177).
Kc: Controller gain.
Kp: Process gain.
L: ISA symbol for level.
Lambda tuning: Tuning which requires the user to specify the desired closed-loop time constant, Lambda.
Lambda (λ): The desired closed-loop time constant, usually in seconds.
Lignin: Organic compound which binds the wood fiber structure together.
Limit cycle: A cycle induced in a control loop by nonlinear elements.
N: ISA symbol for consistency.
P: ISA symbol for pressure.
PI: Proportional-Integral controller.
PID: Proportional-Integral-Derivative controller.
PID.F: Proportional-Integral-Derivative controller with series filter.
Positioner: Control valve accessory which acts as a local pneumatic feedback servo.
Pulping: The process of removing individual fibers from solid wood.
Refiner: A machine with rotating plates, used in pulping to disintegrate the wood chips into individual fibers through mechanical action, and in papermaking to "fibrillate" the fibers to enhance bonding strength.
RHP: Right half-plane of the s-plane, the unstable region.
Standard PI form: $G_c(s) = K_c \left(1 + \frac{1}{T_R s}\right)$.
Stiction: Static friction in an actuator.
Stock: Pulp slurry.
Td: Deadtime.
TR: Controller reset or integral time/repeat.
τ1, τ2: Process time constants (τ1 ≥ τ2).
TAPPI: Technical Association of the Pulp and Paper Industry, P.O. Box 105133, Atlanta, GA, USA, 30348-5113.
References
[1] Allison, B. J., Dumont, G. A., and Novak, L. H., Multi-Input Adaptive Control of Kamyr Digester Chip Level: Industrial Results and Practical Considerations, CPPA Proc., Control Systems '90, Helsinki, Finland, 1990.
[2] Bialkowski, W. L., Application of Kalman Filters to the Regulation of Dead Time Processes, IEEE Trans. Automat. Control, AC-28, 3, 1983.
[3] Bialkowski, W. L., Dreams Versus Reality: A View From Both Sides of the Gap, Keynote Address, Control Systems '92, Whistler, British Columbia, 1992; published in Pulp Paper Canada, 94, 11, 1993.
[4] Bialkowski, W. L., Haggman, B. C., and Millette, S. K., Pulp and Paper Process Control Training Since 1984, Pulp Paper Canada, 95, 4, 1994.
[5] Chien, I-L. and Fruehauf, P. S., Consider IMC Tuning to Improve Controller Performance, Hydrocarbon Proc., 1990.
[6] Dahlin, E. B., Designing and Tuning Digital Controllers, Instrum. Control Syst., 41(6), 77, 1968.
[7] Downs, J. J. and Doss, J. E., Present Status and Future Needs: A View from North American Industry, Fourth International Conf. Chem. Proc. Control, Padre Island, Texas, 1991, 17-22.
[8] EnTech, Automatic Controller Dynamic Specification (Version 1.0, 11/93) (EnTech Literature).
[9] EnTech, Competency in Process Control: Industry Guidelines (Version 1.0, 3/94) (EnTech Literature).
[10] EnTech, Control Valve Dynamic Specification (Version 2.1, 3/94) (EnTech Literature).
[11] EnTech, Digital Measurement Dynamics: Industry Guidelines (Version 1.0, 8/94) (EnTech Literature).
[12] Haggman, B. C. and Bialkowski, W. L., Performance of Common Feedback Regulators for First-Order and Deadtime Dynamics, Pulp Paper Canada, 95, 4, 1994.
[13] Kaminaga, H., One in a Thousand, Proc. CPPA Control Systems '94, Stockholm, Sweden, 1994.
[14] Kocurek, M. J., Series Ed., Pulp and Paper Manufacture, Joint (TAPPI, CPPA) Textbook Committee of the Pulp and Paper Industry, 1983 to 1993, Vol. 1 to 10.
[15] Morari, M. and Zafiriou, E., Robust Process Control, Prentice Hall, 1989.
[16] Newton, G. C., Gould, L. A., and Kaiser, J. F., Analytical Design of Linear Feedback Controls, John Wiley & Sons, 1957.
[17] Sell, N., Ed., Bialkowski, W. L., and Thomason, F. Y., contributors, Process Control Fundamentals for the Pulp & Paper Industry, TAPPI Textbook, to be published by TAPPI Press, 1995.
[18] Smith, C. A. and Corripio, A. B., Principles and Practice of Automatic Process Control, John Wiley & Sons, 1985.
[19] Smith, O. J. M., Closer Control of Loops with Dead Time, Chem. Eng. Prog., 53(5), 217-219, 1957.
[20] Taylor, G., The Role of Control Valves in Process Performance, Proc. CPPA, Tech. Section, Canadian Pulp and Paper Assoc., Montreal, 1994.
[21] Ziegler, J. G. and Nichols, N. B., Optimum Settings for Automatic Controllers, Trans. ASME, 759-768, 1942.
Control for Advanced Semiconductor Device Manufacturing: A Case History

73.1 Introduction .......................................... 1243
73.2 Modeling and Simulation ............................... 1246
73.3 Performance Analysis .................................. 1247
73.4 Models for Control .................................... 1248
73.5 Control Design ........................................ 1252
73.6 Proof-of-Concept Testing .............................. 1253
73.7 Technology Transfer to Industry ....................... 1255
73.8 Conclusions ........................................... 1259
References ................................................. 1259

T. Kailath, C. Schaper, Y. Cho, P. Gyugyi, S. Norman, P. Park, S. Boyd, G. Franklin, and K. Saraswat
Department of Electrical Engineering, Stanford University, Stanford, CA

M. Moslehi and C. Davis
Semiconductor Process and Design Center, Texas Instruments, Dallas, TX
73.1 Introduction
Capital costs for new integrated circuit (IC) fabrication lines are growing even more rapidly than had been expected even quite recently. Figure 73.1 was prepared in 1992, but a new Mitsubishi factory in Shoji, Japan, is reported to have cost $3 billion. Few companies can afford investments on this scale (and those that can perhaps prefer it that way). Moreover, these factories are inflexible. New equipment and new standards, which account for roughly 3/4 of the total cost, are needed each time the device feature size is reduced, which has been happening about every 3 years. It takes about six years to bring a new technology on line. The very high development costs, the high operational costs (e.g., equipment down time is extremely expensive, so maintenance is done on a regular schedule, whether it is needed or not), and the intense price competition compel a focus on high-volume, low-cost commodity lines, especially memories. Low-volume, high-product-mix ASIC (application-specific integrated circuit) production does not fit well within the current manufacturing scenario. In 1989, the Advanced Research Projects Agency (ARPA), Air Force Office of Scientific Research (AFOSR), and Texas Instruments (TI) joined in a $150 million cost-shared program called MMST (Microelectronics Manufacturing Science and Technology) to "establish and demonstrate (new) concepts for semiconductor device manufacture which will permit flexible, cost-effective manufacturing of application-specific logic integrated circuits in relatively low volume . . . during the mid 1990s and beyond".

This research was supported by the Advanced Research Projects Agency of the Department of Defense, under Contract F49620-93-1-0085, monitored by the Air Force Office of Scientific Research. 0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.

[Figure legend: Mitsubishi 64Mbit; NEC 16Mbit; Toshiba 16Mbit; Fujitsu 16 & 64 Mbit. Horizontal axis: Year.]

Figure 73.1 Capital cost for a new IC factory. (Source: Texas Instruments Technical Journal, 9(5), 8, 1992.)

The approach taken by MMST was to seek fast cycle time by performing all single-wafer processing using highly instrumented flexible equipment with advanced process controls. The goal of the equipment design and operation was to quickly adapt the equipment trajectories to a wide variety of processing specifications and to quickly reduce the effects of manufacturing disturbances associated with small lot sizes (e.g., 1, 5 or 24 wafers)
without the need for pilot wafers. Many other novel features were associated with MMST, including a factorywide CIM (computer integrated manufacturing) computer system. The immediate target was a 1000-wafer demonstration (including demonstration of "bullet wafers" with three-day cycle times) of an all single-wafer factory by May 1993. In order to achieve the MMST objectives, a flexible manufacturing tool was needed for the thermal processing steps associated with IC manufacturing. For a typical CMOS process flow, more than 15 different thermal processing steps are used, including chemical vapor deposition (CVD), annealing, and oxidation. The MMST program decided to investigate the use of Rapid Thermal Processing (RTP) tools to achieve these objectives. TI awarded Professor K. Saraswat of Stanford's Center for Integrated Systems (CIS) a subcontract to study various aspects of RTP. About a year later, a group of us at Stanford's Information Systems Laboratory got involved in this project. Manufacturing was much in the news at that time. Professor L. Auslander, newly arrived at ARPA's Material Science Office, soon came to feel that the ideas and techniques of control, optimization, and signal processing needed to be more widely used in materials manufacturing and processing. He suggested that we explore these possibilities, and after some investigation, we decided to work with CIS on the problems of RTP. RTP had been in the air for more than a decade, but for various reasons, its study was still in a research laboratory phase. Though there were several small companies making equipment for RTP, the technology still suffered from various limitations. One of these was an inability to achieve adequate temperature uniformity across the wafer during the rapid heating (e.g., 20°C to 1100°C in 20 seconds), hold (e.g., at 1100°C for 1-5 minutes), and rapid cooling phases.
This chapter is a case history of how we successfully tackled this problem, using the particular "systems way of thinking" very familiar to control engineers, but seemingly not known or used in semiconductor manufacturing. In a little over two years, we started with simple idealized mathematical models and ended with deployment of a control system during the May 1993 MMST demonstration. The system was applied to eight different RTP machines conducting thirteen different thermal operations, over a temperature range of 450°C to 1100°C and pressures ranging up to 1 atmosphere. Our first step was to analyze the performance of available commercial equipment. Generally, a bank of linear lamps was used to heat the wafer (see Figure 73.2). The conventional wisdom was that a uniform energy flux to the wafer was needed to achieve uniform wafer temperature distribution. However, experimentally it had been seen that this still resulted in substantial temperature nonuniformities, which led to crystal slip and misprocessing. To improve performance, various heuristic strategies were used by the equipment manufacturers, e.g., modification of the reactor through the addition of guard rings near the wafer edge to reflect more energy to the edge, modification of the lamp design by using multiple lamps with a fixed power ratio, and various types of reflector geometries. However, these modifications turned out to be satisfactory
Figure 73.2 RTP lamp configurations: (a) bank of linear lamps, (b) single arc lamp, (c) two-zone lamp array.
only for a narrow range of conditions. The systems methodology suggests attempting to determine the performance limitations of RTP systems. To do this, we proceeded to develop a simple mathematical model, based on energy transfer relations that had been described in the literature. Computer simulations with this model indicated that conventional approaches trying to achieve uniform flux across the wafer would never work; there was always going to be a large temperature roll-off at the wafer edge (Figure 73.3). To improve performance, we decided to study the case where circularly symmetric rings of lamps were used to heat the wafer. With this configuration, two cases were considered: (1) a single power supply in a fixed power ratio, a strategy being used in the field, and (2) independently controllable multiple power supplies (one for each ring of lamps). Both steady-state and dynamic studies indicated that it was necessary to use the (second) multivariable configuration to achieve wafer temperature uniformity within specifications. These modeling and analysis results are described in Sections 73.2 and 73.3, respectively. The simulation results were presented to Texas Instruments, which had developed prototype RTP equipment for the MMST program with two concentric lamp zones, but operated it in a scalar control mode using a fixed ratio between the two lamp zones. At our request, Texas Instruments modified the two-zone lamp by adding a third zone and providing separate power supplies for each zone, allowing for multivariable control. The process engineers in the Center for Integrated Systems (CIS) at Stanford then evaluated the potential of multivariable control by their traditional so-called "hand-tuning" methodology, which consists of having experienced operators determine the settings of the three lamp powers by manual iterative adjustment based on the results of test wafers.

Figure 73.3 Nonuniformity in temperature induced by uniform energy flux impinging on the wafer top surface (center temperatures: solid line, 600°C; dashed line, 1000°C; dotted line, 1150°C). R is the radius of the wafer; r is the radial distance from the center of the wafer.

Good results were achieved (see Figure 73.4), but it took 7-8 hours and a large number of wafers before the procedure converged. Of course, it had to be repeated the next day because of unavoidable changes in the ambient conditions or operating conditions. Clearly, an "automatic" control strategy was required. However, the physics-based equations used to simulate the RTP were much too detailed and contained far too many uncertain parameters for control design. The two main characteristics of the simulation model were (1) the relationship between the heating zones and the wafer temperature distribution and (2) the nonlinearities (T^4) of radiant heat transfer. Two approaches were used to obtain a reduced-order model. The first used the physical relations as a basis in deriving a lower-order approximate form. The resulting model captured the important aspects of the interactions and the nonlinearities, but had a simpler structure and fewer unknown parameters. The second approach viewed the RTP system as a black box. A novel model identification procedure was developed and applied to obtain a state-space model of the RTP system. In addition to identifying the dynamics of the process, these models were also studied to assess potential difficulties in performance and control design. For example, the models demonstrated that the system gain and time constants changed by a factor of 10 over the temperature range of interest. Also, the models were used to improve the condition number of the equipment via a change in reflector design. The development of control models is described in Section 73.4. Using these models, a variety of control strategies was evaluated. The fundamental strategy was to use feedforward in combination with feedback control. Feedforward control was used to get close to the desired trajectory, and feedback control was used to compensate for inevitable tracking errors. A feedback controller based on the Internal Model Control (IMC) design procedure was developed using the low-order physics-based model. An LQG feedback controller was developed using the black-box model. Gain scheduling was used to compensate for the nonlinearities. Optimization procedures were used to design the feedforward controller. Controller design is described in Section 73.5. Our next step was to test the controller experimentally on the Stanford RTP system.
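The reported order-of-magnitude variation in gain over the operating range is consistent with a back-of-envelope linearization of the T^4 radiation law: the small-signal "gain" of radiant emission is d(σT^4)/dT = 4σT^3, which grows steeply with temperature. The calculation below is my illustration, not a computation from the chapter's models:

```python
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def radiant_emission_gain(T_celsius):
    """Small-signal gain of blackbody emission, d(sigma*T^4)/dT = 4*sigma*T^3,
    evaluated at an absolute temperature converted from degrees Celsius."""
    T = T_celsius + 273.15
    return 4.0 * SIGMA * T ** 3

# Ratio of local radiative gains across the MMST operating range, 450C to 1100C
ratio = radiant_emission_gain(1100.0) / radiant_emission_gain(450.0)
```

Radiation alone contributes roughly a factor of seven; with convection and lamp dynamics added, a factor-of-10 spread in effective gains and time constants is plausible, which is the case for gain scheduling.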
After using step response and PRBS (pseudorandom binary sequence) data to identify models of the process, the controllers were used to ramp up the wafer temperature from 20°C to 900°C at approximately 45°C/s, followed by a hold for 5 minutes at 900°C. For these experiments, the wafer temperature distribution was sensed by three thermocouples bonded to the wafer. The temperature nonuniformity present during the ramp was less than ±5°C from 400°C to the processing temperature and better than ±0.5°C on average during the hold. These proof-of-concept experiments are described in Section 73.6.
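A PRBS excitation of the kind used here for identification is typically generated with a maximal-length linear-feedback shift register. The 7-bit example below (polynomial x^7 + x^6 + 1, period 127) is a generic sketch, not the sequence actually used in the experiments:

```python
def prbs7(length, seed=0x02):
    """Generate `length` bits of the PRBS7 sequence using a 7-bit
    Fibonacci LFSR with taps for x^7 + x^6 + 1 (maximal period 127).
    `seed` must be a nonzero 7-bit value."""
    state = seed
    out = []
    for _ in range(length):
        out.append(state & 1)                        # output the LSB
        newbit = ((state >> 6) ^ (state >> 5)) & 1   # taps at bits 7 and 6
        state = ((state << 1) | newbit) & 0x7F       # shift in the feedback bit
    return out

seq = prbs7(254)  # two full periods
```

Because its spectrum is nearly flat up to the clock frequency, a PRBS excites the process dynamics over a wide band while switching between only two power levels, which is convenient for identification on equipment with actuator limits.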
Figure 73.4 Temperature nonuniformity when the powers to the lamp were manually adjusted ("hand-tuning"). These nonuniformities correspond to a ramp and hold from nearly room temperature to 600°C at roughly 40°C/s. The upper curve (-o-) corresponds to scalar control (fixed power ratio to lamps). The lower curve (x-x) corresponds to multivariable control.
These results were presented to Texas Instruments, who were preparing their RTP systems for a 1000-wafer demonstration of the MMST concept. After upper-level management review, it was decided that the Stanford temperature control system would be integrated within their RTP equipment. The technology transfer involved installing and testing the controller on eight different RTP machines conducting thirteen different thermal operations used in two full-flow 0.35 μm CMOS process technologies (see Figure 73.5, taken from an article appearing in a semiconductor manufacturing trade journal). More discussion concerning the
[Reproduced trade-journal sidebar, "Stanford Completes RTP Tech Transfer to TI": a real-time multivariable temperature control technique for rapid thermal processing (RTP) has been transferred from Stanford University to TI for use in the Microelectronics Manufacturing Science and Technology (MMST) program. The transfer involved: implementation on seven RTP machines (three with a four-zone TI lamp and four with a six-zone lamp); configurability to 0, 1, 2, 3, or 4 advanced pyrometric temperature sensors; trajectories for autocalibration of temperature sensors; usage on 13 different processes from 400°C to 1100°C; flexibility for process calibration (as opposed to temperature calibration); incorporation of software interlocks for RTP over-temperature protection; implementation of signal processing strategies for noisy temperature sensors; and integration into the CIM MMST operational software environment. The basic idea behind the new control concept is to manipulate the power to the lamp array to control wafer temperature, thereby achieving improved process uniformity and repeatability, largely through software modules for feedback, feedforward, anti-overshoot, gain scheduling, and others.]

Figure 73.5
Description of technology transfer in Semiconductor International, 16(7), 58, 1993.
technology transfer and results of the MMST demonstration is given in Section 73.7. Finally, some overview remarks are offered in Section 73.8.
73.2 Modeling and Simulation

Three alternative lamp configurations for rapidly heating a semiconductor wafer are shown in Figure 73.2. In Figure 73.2(a), linear lamps are arranged above and below the wafer. A single arc lamp is shown in Figure 73.2(b). Concentric rings of single bulbs are presented in Figure 73.2(c). These designs can be modified with guard rings around the wafer edge, specially designed reflectors, and diffusers placed on the quartz window. These additions allowed fine-tuning of the energy flux profile to the wafer to improve temperature uniformity. To analyze the performance of these and related equipment designs, a simulator of the heat transfer effects was developed starting from physical relations for RTP available in the literature [1], [2]. The model was derived from a set of PDEs describing the radiative, conductive, and convective energy transport effects. The
basic expression is
\[
\frac{1}{r}\frac{\partial}{\partial r}\left(k r \frac{\partial T}{\partial r}\right)
+ \frac{1}{r^2}\frac{\partial}{\partial \theta}\left(k \frac{\partial T}{\partial \theta}\right)
+ \frac{\partial}{\partial z}\left(k \frac{\partial T}{\partial z}\right)
= \rho C_p \frac{\partial T}{\partial t} \qquad (73.1)
\]

where T is temperature, k is thermal conductivity, ρ is density, and C_p is specific heat. Both k and C_p are temperature dependent. The boundary conditions are given by
\[
k\frac{\partial T}{\partial z} = q_{bottom}(r, \theta), \quad z = 0, \qquad
-k\frac{\partial T}{\partial z} = q_{top}(r, \theta), \quad z = Z, \quad \text{and} \qquad
-k\frac{\partial T}{\partial r} = q_{edge}(\theta, z), \quad r = R,
\]

where q_edge, q_bottom, and q_top are heat flow per unit area into the wafer edge, bottom, and top, respectively, via radiative and convective heat transfer mechanisms, Z is the thickness of the wafer, and R is the radius of the wafer. These terms coupled the effects of the lamp heating zones to the wafer. Approximations were made to the general energy balance assuming axisymmetry and neglecting axial temperature gradients. The heating effects in RTP were developed by discretizing the wafer into concentric annular elements. Within each annular
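The discretization just described can be sketched with a simple explicit finite-difference step of the axisymmetric balance: radial conduction between annular elements plus a net source term per element. The scheme, the silicon-like property values, and the element count below are my illustration, not the chapter's actual model [2]:

```python
def step_radial_heat(T, dr, dt, k, rho, cp, q_net):
    """One explicit finite-difference step of an axisymmetric lumped energy
    balance on annular wafer elements: radial conduction plus a net
    volumetric source q_net[i] (W/m^3, e.g., absorbed lamp flux per unit
    thickness minus losses). Illustrative only; the chapter's model [2]
    also carries radiative and convective coupling terms."""
    N = len(T)
    alpha = k / (rho * cp)          # thermal diffusivity
    Tn = T[:]
    for i in range(N):
        r = (i + 0.5) * dr          # element-center radius
        # r * dT/dr at the inner and outer element faces; zero flux at the
        # wafer center and (in this sketch) an insulated outer edge
        f_in = 0.0 if i == 0 else (i * dr) * (T[i] - T[i - 1]) / dr
        f_out = 0.0 if i == N - 1 else ((i + 1) * dr) * (T[i + 1] - T[i]) / dr
        Tn[i] = T[i] + dt * (alpha * (f_out - f_in) / (r * dr)
                             + q_net[i] / (rho * cp))
    return Tn

# A hot spot at the center of a 10-element wafer, silicon-like properties
T = [300.0] * 10
T[0] = 400.0
T_next = step_radial_heat(T, dr=0.01, dt=0.1, k=150.0, rho=2330.0, cp=700.0,
                          q_net=[0.0] * 10)
```

Writing the conduction term in flux form makes the scheme conservative: the area-weighted mean temperature is unchanged by internal conduction, which is a useful sanity check on any such discretization.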
73.3. PERFORMANCE ANALYSIS wafer element, the temperature was assumed uniform [2]. The resulting model was given by a set of nonlinear vector differential equations:
satisfies 0 5 P, 5 P,!"ax. Using the finite difference model, the objective function of Equation 73.3 was approximated as
where qss( P ) is the steady-state temperature of element i with constant lamp power vector P and TSe' is a vector with all entriesequal to TSe'. A two-step numerical optimization procedure was then employed in which two minimax error problems were solved to determine the set of lamp powers that minimize Equation 73.3 [4] and 121. In Figure 73.6, the temperature deviation about the set points of 650ยฐC, 100O0C, and 1150ยฐC is shown. The deviation is less than f lยฐC, much better than for the case of uniform energy flux.
where N denotes the number of wafer elements and M denotes the number of radiant heating zones; K^rad is a full matrix describing the radiation emission characteristics of the wafer, K^cond is a tridiagonal matrix describing the conductive heat transfer effects across the wafer, K^conv is a diagonal matrix describing the convective heat transfer effects from the wafer to the surrounding gas, F is a full matrix quantifying the fraction of energy leaving each lamp zone that radiates onto the wafer surface, q^dist is a vector of disturbances, q^wall is a vector of energy flux leaving the chamber walls and radiating onto the wafer surface, and C is a diagonal matrix relating the heat flux to temperature transients. More details can be found in [2] and [3].
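The structure of this lumped model can be sketched as follows; every numerical value (element count, capacities, matrices) is a hypothetical stand-in, not an identified parameter:

```python
import numpy as np

# Hypothetical lumped wafer model: N annular elements, M lamp zones.
# C dT/dt = -K_rad sigma T^4 - K_cond T - K_conv (T - T_gas) + F P
N, M = 5, 4
sigma = 5.67e-8                       # Stefan-Boltzmann constant [W/m^2 K^4]
C = np.diag(np.full(N, 50.0))         # heat capacity matrix (stand-in)
K_rad = np.eye(N) * 1e-3              # radiation emission (stand-in)
K_cond = (np.diag(np.full(N, 2.0))    # tridiagonal conduction matrix
          - np.diag(np.full(N - 1, 1.0), 1)
          - np.diag(np.full(N - 1, 1.0), -1))
K_conv = np.eye(N) * 0.5              # convection to surrounding gas
F = np.full((N, M), 0.25)             # lamp-to-element view factors
T_gas = 300.0

def dT_dt(T, P):
    q = (-K_rad @ (sigma * T**4) - K_cond @ T
         - K_conv @ (T - T_gas) + F @ P)
    return np.linalg.solve(C, q)

# Explicit Euler integration of a constant-power heat-up.
T = np.full(N, 300.0)
P = np.full(M, 500.0)                 # watts per zone (hypothetical)
dt = 0.01
for _ in range(2000):
    T = T + dt * dT_dt(T, P)
```

With these stand-in numbers the wafer elements heat from ambient toward a bounded steady state; the point is only the shape of the right-hand side, not the values.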
73.3 Performance Analysis
We first used the model to analyze the case of uniform energy flux impinging on the wafer surface. In Figure 73.3, the temperature profile induced by a uniform input energy flux is shown for the cases where the center portion of the wafer was specified to be at either 600°C, 1000°C, or 1150°C. A roll-off in temperature is seen in the plots for all cases because the edge of the wafer required a different amount of energy flux than the interior due to differences in surface area. Conduction effects within the wafer helped to smooth the temperature profile. These results qualitatively agreed with those reported in the literature where, for example, sliplines at the wafer edge were seen because of the large temperature gradients induced by the uniform energy flux conditions. We then analyzed the multiple concentric lamp zone arrangement of Figure 73.2(c) to assess the capability of achieving uniform temperature distribution during steady-state and transients. We considered each of four lamp zones to be manipulated independently. The optimal lamp powers were determined to minimize the peak temperature difference across the wafer at a steady-state condition,

max |T^ss(r, P) − T^set|
0 ≤ r ≤ R
(73.3)
where T^set is the desired wafer temperature and T^ss(r, P) is the steady-state temperature at radius r with the constant lamp power vector P, subject to the constraint that each entry P_i of P
Figure 73.6 Optimal temperature profiles using a multizone RTP system (center temperatures: solid line 600°C; dashed line 1000°C; dotted line 1150°C).
In addition, an analysis of the transient performance was conducted because a significant fraction of the processing, and the potential for crystal slip, occurs during the ramps made to wafer temperature. We compared a multivariable lamp control strategy and a scalar lamp control strategy. Industry, at that time, employed a scalar control strategy. For the scalar case, the lamps were held in a fixed ratio of power while total power was allowed to vary. We selected the optimization criterion of minimizing

max_{t_0 ≤ t ≤ t_f} ||T(t) − T^ref(t)||_∞
which denotes the largest temperature error from the specified trajectory T^ref(t) at any point on the wafer at any time between an initial time t_0 and a final time t_f. The reference temperature trajectory was selected as a ramp from 600°C to 1150°C in 5 seconds. The optimization was carried out with respect to the power to the four lamp zones, in the case of the multilamp configuration, or to the total power for a fixed ratio that was optimal only at a 1000°C steady-state condition. The temperature at the center of the wafer matched the desired temperature trajectory almost exactly for both the multivariable and scalar control
THE CONTROL HANDBOOK
cases. However, the peak temperature difference across the wafer was much less for the multivariable case compared to the scalar (fixed-ratio) case as shown in Figure 73.7.
achieved [2]. At the time of these simulations, prototype RTP equipment was being developed at Texas Instruments for implementation in the MMST program. TI had developed an RTP system with two concentric lamp zones. Their system at that time was operated in a scalar control mode with a fixed ratio between the two lamp zones. Upon presenting the above results, the two-zone lamp was modified by adding a third zone and providing separate power supplies for each zone. This configuration allowed multivariable control. A resulting three-zone RTP lamp was then donated by TI to Stanford University. The chronology of this technology transfer is shown in Figure 73.9.
Figure 73.9 contents (chronology of technology transfer from Stanford to Texas Instruments): modeling of heat transfer for RTP; optimization and simulation of performance limits; comparison of multizone lamp configurations.
Figure 73.7 Peak temperature nonuniformity during ramp.
Development and simulation of controllers (6/91 to 3/92); experimental demonstration on Stanford RTM; usage for 1,000-wafer MMST marathon demo.
For the case of the fixed-ratio lamps, the peak temperature difference was more than 20°C during the transient, and the multivariable case resulted in a temperature deviation of about 2°C. The simulator suggested that this nonuniformity in temperature for the fixed-ratio case would result in crystal slip as shown in Figure 73.8, which shows the normalized maximum resolved stress
Figure 73.9 Chronology of the technology transfer to Texas Instruments.
Figure 73.8 Normalized maximum resolved stress during ramp.
(based on simulation) as a function of time. No slip was present in the multivariable case. This analysis of the transient performance concluded that RTP systems configured with multiple independently controllable lamps can substantially outperform any existing scalar RTP system: for the same temperature schedule, much smaller stress and temperature variation across the wafer was achieved; and for the same specifications for stress and temperature variation across the wafer, much faster rise times can be
A schematic of the Stanford RTP system and a picture of the three-zone arrangement are shown in Figures 73.10 and 73.11, respectively. "Hand-tuning" procedures were used to evaluate the performance of the RTP equipment at Stanford quickly. In this approach, the power settings to the lamp were manually manipulated in real time to achieve a desirable temperature response. In Figure 73.4, open-loop, hand-tuned results are shown for scalar control (i.e., fixed power ratio) and multivariable control as well as the error during the transient. Clearly, this comparison demonstrated that multivariable control was preferred to the scalar control method [5]. However, the hand-tuning approach was a trial-and-error procedure that was time-consuming and resulted in sub-optimal performance. An automated real-time control strategy is described in the following sections.
73.4 Models for Control
Two approaches were evaluated to develop a model for control design. In the first approach, the nonlinear physical model presented earlier was used to produce a reduced-order version. An energy balance equation on the ith annular element can be ex-
Figure 73.10 Schematic of the rapid thermal processor.
Stefan-Boltzmann constant, A_i is the surface area of the annular element, D_{i,j} is a lumped parameter denoting the energy transfer due to reflections and emission, h_i is a convective heat transfer coefficient, q_i^cond is heat transfer due to conduction, F_{i,j} is a view factor that represents the fraction of energy received by the ith annular element from the jth lamp zone, and P_j is the power from the jth lamp zone. To develop a simpler model, the temperature distribution of the wafer was considered nearly uniform and much greater than that of the water-cooled chamber walls. With these approximations, q_i^cond and q_i^wall were negligible. In addition, the term accounting for radiative energy transport due to reflections can be simplified by analyzing the expansion,
where δ_{i,j} = T_j − T_i. After eliminating the terms involving δ_{i,j} (since T_i ≫ δ_{i,j}), the resulting model was,
It was noted that Equation 73.8 was interactive because each lamp zone affects the temperature of each annular element, and noninteractive because the annular elements did not affect one another. The nonlinear model given by Equation 73.8 was then linearized about an operating point (T̄_i, P̄_j),
Figure 73.11 Picture of the Stanford three-zone RTM lamp.
where the deviation variables are defined as T̃_i := T_i − T̄_i and P̃_j := P_j − P̄_j. This equation can be expressed more conveniently as
pressed as [3] and [16]
ρ V_i C_p (dT_i/dt) = q_i^rad + q_i^cond + q_i^wall + q_i^gas + ε Σ_{j=1}^{M} F_{i,j} P_j    (73.6)

where the gain and time constant are given by
where ρ is density, V_i is the volume of the annular element, C_p is heat capacity, T_i is temperature, ε is total emissivity, and σ is the
K_{i,j} = ε F_{i,j} / (4 ε σ A_i T̄_i³ + Σ_{k=1}^{M} D_{i,k} + h_i A_i)    (73.11)
From Equation 73.11, the gain decreases as T̄ was increased. Larger changes in the lamp power were required at higher T̄ to achieve an equivalent rise in temperature. In addition, from Equation 73.12, the time constant decreases as T̄ is increased. Thus, the wafer temperature responded faster to changes in the lamp power at higher T̄. The nonlinearities due to temperature were substantial, as the time constant and gain vary by a factor of 10 over the temperature range associated with RTP. The identification scheme to estimate τ_i and K from experimental data is described in [7], [8]. A sequence of lamp power values was sent to the RTP system. This sequence was known as a recipe. The recipe was formulated so that reasonable spatial temperature uniformity was maintained at all instants in order to satisfy the approximation used in the development of the low-order model. The eigenvalues of the system were estimated at various temperatures using a procedure employing the TLS ESPRIT algorithm [9]. After the eigenvalues were estimated, the amplitude of the step response was estimated. This was difficult because of the temperature drift induced by the window heating; however, a least-squares technique can be employed. The gain of the system and view factors were then identified using a least-squares algorithm again. The results are shown in Figures 73.12 and 73.13 for the estimation of the effects of temperature on the gain and time constant, respectively.
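The step-response identification idea can be sketched on synthetic data: for a first-order element sampled with period Δt, T[k+1] = a T[k] + b u[k] with a = exp(−Δt/τ) and b = K(1 − a), so a least-squares fit of (a, b) recovers K and τ. The true values below are made up for illustration:

```python
import numpy as np

# Synthetic first-order response with hypothetical K and tau.
K_true, tau_true, dt = 2.5, 8.0, 0.1
a_true = np.exp(-dt / tau_true)
b_true = K_true * (1.0 - a_true)

rng = np.random.default_rng(0)
n = 400
u = np.ones(n)                       # unit step in lamp power
T = np.zeros(n)
for k in range(n - 1):
    T[k + 1] = a_true * T[k] + b_true * u[k]
T_meas = T + 0.001 * rng.standard_normal(n)   # small sensor noise

# Least squares on T[k+1] = a T[k] + b u[k].
Phi = np.column_stack([T_meas[:-1], u[:-1]])
a_hat, b_hat = np.linalg.lstsq(Phi, T_meas[1:], rcond=None)[0]

tau_hat = -dt / np.log(a_hat)        # invert the ZOH pole for tau
K_hat = b_hat / (1.0 - a_hat)        # DC gain of the fitted model
```

This is only the regression step; it stands in for (and is much simpler than) the TLS ESPRIT eigenvalue procedure cited in the text.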
(Figure 73.12 legend: o: relative gain from lamp 1 to TC 1; *: relative gain from lamp 2 to TC 2; +: relative gain from lamp 3 to TC 3.)
Figure 73.13 Time constant of the process model as a function of temperature and comparison with theory.
In order to obtain a complete description of the system, this relationship was combined with models describing sensor dynamics and lamp dynamics [4]. The sensor and lamp dynamics can be described by detailed models. However, for the purpose of model-based control system design, it was only necessary to approximate these dynamics as a simple time-delay relation,
The measured temperature at time t was denoted by T_{m,i}(t), and the time delay was denoted by θ. The resulting model expressed in z-transform notation was given by
Figure 73.12 Gain of the system relative to that at 900°C and comparison with theory.
The model was expressed in discrete-time format for use in designing a control system. Using the zero-order hold to describe the input sequence, the discrete-time expression of Equation 73.10 was given by

T̃(z) = G(z) K P̃(z)
operating temperatures.
(73.13)
where z denotes the z-transform,
and Δt denotes the sampling time. The system model was inherently stable since the poles lie within the unit circle for all
where d = θ/Δt was rounded to the nearest integer. The power was supplied to the lamp filaments by sending a 0 to 10 volt signal from the computer control system to the power supplies that drive the lamps. The relation between the voltage signal applied to the supplies and the power sent to the lamps was not necessarily linear. Consequently, it was important to model that nonlinearity, if possible, so that it could be accounted for by the control system. It was possible to determine the nonlinearity with a transducer installed on the power supply to measure the average power sent to the lamps. By applying a known voltage to the power supplies and then recording the average power output, the desired relationship can be determined. This function can be described by a polynomial and then inverted to remove the nonlinearity from the loop because the model is linear with respect to radiative power from the lamps (see Equation 73.16). We noted that the average power to the lamps may not equal the radiative power from the lamps. The offset was due to heating losses within the bulb filament. However, this offset was implicitly incorporated in the model when the gain matrix, K, was determined from experimental data. A second strategy that considered the RTP system as a black box was employed to identify a linear model of the process [11],
[12]. Among numerous alternatives, an ARX model was used to describe the system,
where T is an I × 1 vector describing temperature, P is an M × 1 vector describing percent of maximum zone power, A_k is an I × I matrix, B_k is an I × M matrix, and I is the number of sensors on the wafer measuring temperature. The steady-state temperature and power were denoted by T_ss and P_ss, respectively. Because the steady-state temperature was difficult to determine accurately because of drift, a slight modification to the model was made. Let T_ss = T̄ + ΔT̄. The least squares problem for model identification can then be formulated as
where T_bias = (I − Σ_k A_k) ΔT̄.
Substituting in the appropriate values for 700°C (with I = 3, M = 3),

D_o = [ 2.07  4.41  4.50
        1.11  4.78  4.91
        0.73  5.08  5.51 ].    (73.19)
Note that the magnitude of the first column of D_o is smaller than those of the second and third columns, which was due to the difference in the maximum power of each lamp: the first (center) lamp has 2 kW maximum power, the second (intermediate) lamp 12 kW maximum power, and the third (outer) lamp 24 kW maximum power. Also note the similarity of the second and third columns of D_o, which says that the second lamp will affect the wafer temperature in a manner similar to the third lamp. As a result, we have effectively two lamps rather than three (recall we have physically three lamps), which may cause difficulties in maintaining temperature uniformity in the steady state because of an inadequate number of degrees of freedom. This conclusion will be more clearly seen from an SVD (Singular Value Decomposition) analysis, about which more will be said later. To increase the independence of the control effects of the two outside lamps, a baffle was installed to redistribute the light energy from the lamps to the wafer. The same identification technique described earlier was used to identify the RTP system model, and the DC gain was computed from the identified model with the result
The strategy for estimating the unknown model parameters utilized a PRBS (pseudo-random binary sequence) to excite the system to obtain the necessary input-output data. The mean temperature that this excitation produced is designated as T̄. Some other issues that were accounted for during model identification included the use of a data subsampling method so that the ARX model could span a longer time interval and observe a larger change of the temperature. Subsampling was needed because the data collection rate was 10 Hz and over that interval the temperature changed very little. Consequently, the least-squares formulation, as an identification method, may contain inherent error sources due to the effect of the measurement sensor noise (which was presumed to be Gaussian distributed) and the quantization noise (which was presumed to be uniformly distributed with quantization level of 0.5°C). With the black box approach, the model order needed to be selected. The criterion used to determine the appropriateness of the model was to be able to make the prediction error smaller than the quantization level using as small a number of ARX model parameters as possible. For our applications, this order was three A matrices and three B matrices, with subsampling selected as four. We are now going to show the value of the identified models by using them to study an important characteristic of the RTP system, its DC gain. For the ARX model with coefficients {A_i, B_i}, the identified DC gain was given by the formula
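The DC gain of an ARX model follows from setting T and P constant in the model, giving D = (I − Σ_k A_k)^(-1) Σ_k B_k (the standard ARX steady-state gain). A minimal sketch with made-up coefficient matrices:

```python
import numpy as np

# Made-up ARX coefficients for a 3-sensor, 3-zone system (three A's and
# three B's, matching the model order selected above; numbers illustrative).
A = [0.5 * np.eye(3), 0.2 * np.eye(3), 0.1 * np.eye(3)]
B = [0.10 * np.ones((3, 3)), 0.05 * np.ones((3, 3)), 0.02 * np.ones((3, 3))]

# Constant T and P give T = (sum A) T + (sum B) P, so
# D = (I - sum A)^-1 (sum B).
D = np.linalg.solve(np.eye(3) - sum(A), sum(B))
```

For these coefficients every entry of D is 0.85, i.e., (0.10+0.05+0.02)/(1−0.8).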
We can observe that the second column of the new DC gain matrix was no longer similar to the third column, as it was in Equation 73.19. As a result, the three lamps heated the wafer in different ways. The first (center) lamp heated mostly the center of the wafer, and the third (outer) lamp heated mostly the edge of the wafer. On the other hand, the second (intermediate) lamp heated the wafer overall, acting like a bulk heater. Of course, the second lamp heated the intermediate portion of the wafer more than the center and edge of the wafer, but the difference was not so significant. Even if the idea of installing a baffle was partly motivated by the direct investigation of the DC gain matrix, it was in fact deduced from an SVD (Singular Value Decomposition) analysis of the DC gain matrix. The SVD of D_o in Equation 73.19 is given by
From this, we can conclude that [1, 1, 1] (u_1) is a strong output direction. Of course, u_1 is [0.54, 0.57, 0.62] and is not exactly equal to [1, 1, 1]. However, [0.54, 0.57, 0.62] was close to [1, 1, 1] in terms of direction in a 3-dimensional coordinate system and was denoted as the [1, 1, 1] direction here. Since [1, 1, 1] was the strong output direction, we can affect the wafer temperature in the [1, 1, 1] direction by a minimal input power change. This means that if we maintain the temperature uniformity at the reference temperature (700°C), we can maintain the uniformity near 700°C (say, at 710°C) with a small change in the input lamp power. The weak output direction (the vector u_3, approximately [1, −1, 1]) says that it is difficult to increase the temperature of the center and outer portions of the wafer while cooling down the intermediate portion of the wafer, which was, more or less, expected. The gain of the weak direction (σ_3) was two orders of magnitude smaller than that of the strong direction (σ_1). This meant that there were effectively only two lamps in the RTP system in terms of controlling the temperature of the wafer, even if there were physically three lamps. This naturally led to the idea of redesigning the RTP chamber to get a better lamp illumination pattern. Installing a baffle (see [10] for more details) into the existing RTP system improved our situation as shown in the SVD analysis of the new DC gain D_n in Equation 73.20. The SVD of D_n was given by
Compared to the SVD of the previous DC gain, the lowest singular value (σ_3) has been increased by a factor of 5, a significant improvement over the previous RTP system. In other words, only one-fifth of the power required to control the temperature in the weak direction, using the previous RTP system, was necessary for the same task with the new RTP system. As a result, we obtained three independent lamps by merely installing a baffle into the existing RTP system. Independence of the three lamps in the new RTP system was crucial in maintaining the temperature uniformity of the wafer.
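The SVD analysis of D_o can be reproduced directly from the matrix in Equation 73.19; the numerical checks below (u_1 nearly aligned with [1, 1, 1], and σ_1/σ_3 large) are exactly the claims made in the text:

```python
import numpy as np

# DC gain matrix D_o identified at 700 C (Equation 73.19).
D_o = np.array([[2.07, 4.41, 4.50],
                [1.11, 4.78, 4.91],
                [0.73, 5.08, 5.51]])

U, s, Vt = np.linalg.svd(D_o)
u1, u3 = U[:, 0], U[:, 2]

# Strong output direction u1 is nearly aligned with [1, 1, 1].
alignment = abs(u1 @ np.ones(3)) / np.sqrt(3.0)

# The weak direction's gain s[2] is far smaller than s[0]:
# effectively only two independent lamps.
ratio = s[0] / s[2]
```

Running this gives u_1 within a fraction of a degree of the [1, 1, 1] direction and a σ_1/σ_3 ratio on the order of 10², consistent with the discussion above.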
73.5 Control Design
The general strategy of feedback combined with feedforward control was investigated for RTP control. In this strategy, a feedforward value of the lamp power was computed (in response to a change in the temperature set point) according to a predetermined relationship. This feedforward value was then added to a feedback value, and the resultant lamp power was sent to the system. The concept behind this approach was that the feedforward power brings the temperature close to the desired temperature; the feedback value compensates for modeling errors and disturbances. The feedback value can be determined with a variety of design techniques, two of which are described below. Several approaches were investigated to determine the feedforward value. One approach was based on replaying the lamp powers of previous runs. Another approach was based on a model-based optimization.
The physics-based model was employed to develop a controller using a variation of the classical Internal Model Control (IMC) design procedure [13], [6]. The IMC approach consisted of factoring the linearized form of the nonlinear low-order model (see Equation 73.16) as,
where G̃₊(z) contains the time delay terms, z^(−d), all right half-plane zeros, and zeros that are close to (−1, 0) on the unit disk, and has unity gain. The IMC controller is then obtained by
where F(z) is a matrix of filters used to tune the closed-loop performance and robustness and to obtain a physically realizable controller. The inversion G̃₋(z)⁻¹ was relatively straightforward because the dynamics of the annular wafer elements of the linearized form of the nonlinear model were decoupled. The tuning matrix, or IMC filter, F(z) was selected to satisfy several requirements of RTP. The first requirement was related to repeatability, in which zero offset between the actual and desired trajectory was to be guaranteed at steady-state condition despite modeling error. The second requirement was related to uniformity, in which the closed-loop dynamics of the wafer temperature should exhibit similar behavior. The third requirement was related to robustness and implementation, in which the controller should be as low-order as possible. Other requirements were ease of operator usage and flexibility. One simple selection of F(z) that meets these requirements was given by the first-order filter
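A common discrete-time form of such a first-order filter is F(z) = (1 − a)/(z − a) on each diagonal entry; combining it with the inverse of a first-order minimum-phase factor G̃₋(z) = K(1 − φ)/(z − φ) gives a proper controller. The sketch below simulates this loop for one decoupled element with hypothetical values of K, τ, and a, not the identified RTP parameters:

```python
import math

# Hypothetical first-order plant (one annular element, no delay):
# y[k+1] = phi*y[k] + K*(1 - phi)*u[k]
dt, tau, K = 0.1, 8.0, 2.5
phi = math.exp(-dt / tau)
a = 0.8                      # filter tuning parameter (speed of response)

# IMC controller Q(z) = F(z) * Gminus(z)^-1
#                     = (1 - a)/(z - a) * (z - phi)/(K (1 - phi)),
# realized as u[k] = a*u[k-1] + (1-a)/(K*(1-phi)) * (e[k] - phi*e[k-1]).
r = 900.0                    # set-point step (deviation variable)
y = y_model = u = e_prev = 0.0
for _ in range(500):
    e = r - (y - y_model)    # IMC feedback: reference minus model mismatch
    u = a * u + (1.0 - a) / (K * (1.0 - phi)) * (e - phi * e_prev)
    e_prev = e
    y = phi * y + K * (1.0 - phi) * u              # plant
    y_model = phi * y_model + K * (1.0 - phi) * u  # internal model
```

With a perfect model the closed loop reduces to the filter itself, y following r with first-order dynamics set entirely by a, which is why a is interpreted as the speed of response in the text.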
where a is a tuning parameter, the speed of response. This provided us with a simple controller with parameters that could be interpreted from a physical standpoint. In this approach to control design, the nonlinear dependency of K and τ_i on temperature can be parameterized explicitly in the controller. Hence, a continuous gain-scheduling procedure can be applied. It was noted that, as temperature increased, the process gain decreased. Since the controller involved the inverse of the process model, the controller gain increased as temperature was increased. Consequently, the gain-scheduling provided consistent response over the entire temperature range. Thus, control design at one temperature should also apply at other temperatures. In addition to the IMC approach, a multivariable feedback control law was determined by an LQG design which incorporated integral control action to reduce run-to-run variations [12], [14]. The controller needed to be designed carefully because, in a nearly singular system such as the experimental RTP, actuator saturation and integrator windup can cause problems. To solve this problem partially, integral control was applied in only the (strongly) controllable directions in temperature error space,
helping to prevent the controller from trying to remove uncontrollable disturbances. For the LQG design, the black-box model was used. It can be expressed in the time domain as follows:
and the resulting equations ordered from y_1 to y_N:
These combined equations determine a linear relationship between the input and output of the system in the form Y = HU, where the notation Y and U is used to designate the n_o N × 1 and n_i N × 1 stacked vectors. The identified model of the system was augmented with new states representing the integral of the error along the m easiest-to-control directions, defined as ξ := [ξ_1 ... ξ_m]ᵀ. The new system model was, then,
where U_{1:m} is the first m columns of U, the output matrix from the SVD of the open-loop transfer matrix H(z)|_{z=1} = USVᵀ, and represented the (easily) controllable subspace. The output y then consisted of thermocouple readings and the integrator states (in practice the integrator states were computed in software from the measured temperature errors). Weights for the integrator states were chosen to provide a good transient response. A complete description of the control design can be found in [14], [15]. The goal of our feedforward control was to design, in advance, a reference input trajectory which will cause the system to follow a predetermined reference output trajectory, assuming no noise or modeling error. The approach built upon the analysis done in [16] and expressed the open-loop trajectory generation problem as a convex optimization with linear constraints and a quadratic objective function [14], [15]. The input/output relationship of the system can be described by a linear matrix equation. This relationship was used to convert convex constraints on the output trajectory into convex constraints on the input trajectory. The RTP system imposed a number of linear constraints. These were linear constraints imposed by the RTP hardware; specifically, the actuators had both saturation effects and a maximum rate of increase. Saturation constraints were modeled as follows: p^lin is defined as the n_i × 1 vector of steady-state powers at the point of linearization of the above model at time k. With an outer
feedback control loop running, to insure that the feedback controller has some room to work (for example, ±10% leeway), the total powers should be constrained to the operating range of 10 ≤ p^total ≤ 90, which translates into a constraint on U of (10 − p^lin) ≤ U ≤ (90 − p^lin). Maximum rates of increase (or slew rate limits) for our actuators were included also. These were due to the dynamics of the halogen bulbs in our lamp. We included this constraint as u_{k+1} − u_k ≤ 5, which can be expressed in a matrix equation of the form SU ≤ 5, where S has 1 on the main diagonal and −1 on the off-diagonal. The quality of our optimized trajectory can be measured in two ways: minimized tracking error (following a reference trajectory) and minimized spatial nonuniformity across the wafer. Because the tracking error placed an upper bound on the nonuniformity error, we concentrate on it here. We define the desired trajectory Y^ref as relative to the same linearized starting point used for system identification. The tracking error E can be defined as E = Y − Y^ref, where E again denotes the stacked error vector. We define our objective function to be the quadratic form F = EᵀE, and expand
Software programs exist, such as the FORTRAN program LSSOL [17], which can take the convex constraints and produce a unique solution, if one exists. After achieving successful results in simulation, the control system was implemented in a real-time computing environment linked to the actual RTP equipment. The computing environment included a VxWorks real-time operating system, a SUN IPC workstation, VME I/O boards, and a Motorola 68030 processor.
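The trajectory-generation step can be sketched with a bounded least-squares solver standing in for LSSOL (saturation bounds only; the slew-rate constraint SU ≤ 5 would need a general QP solver). The system, horizon, and reference below are hypothetical:

```python
import numpy as np
from scipy.optimize import lsq_linear

# Discrete first-order system y[k] = phi*y[k-1] + g*u[k]; stack N samples
# into Y = H U with H lower-triangular Toeplitz of impulse-response terms.
N, phi, g = 60, 0.9, 0.5
h = g * phi ** np.arange(N)               # impulse response h[0], h[1], ...
H = np.zeros((N, N))
for i in range(N):
    H[i, i::-1] = h[: i + 1]              # y[i] depends on u[0..i]

# Reference: ramp to 1.0 over the first 30 samples, then hold.
y_ref = np.minimum(np.arange(1, N + 1) / 30.0, 1.0)

# Solve min ||H U - Y_ref||^2 subject to saturation bounds on U.
res = lsq_linear(H, y_ref, bounds=(0.0, 1.0))
u_opt = res.x
err = np.abs(H @ u_opt - y_ref).max()
```

For this well-posed toy system the bounds are inactive and the reference is tracked essentially exactly; in the RTP problem the saturation and slew limits are what make the constrained formulation necessary.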
73.6 Proof-of-Concept Testing
The Stanford RTP system was used for multiprocessing applications in which sequential thermal process steps were performed within the same reactor. A schematic of the RTM is shown in Figure 73.10. A concentric three-zone 38-kW illuminator, constructed and donated by Texas Instruments, was used for wafer heating. The center zone consisted of a 2-kW bulb, the intermediate zone consisted of 12 1-kW bulbs, and the outer zone consisted of 24 1-kW bulbs. A picture of the three-zone lamp is presented in Figure 73.11. The reflector was water and air cooled. An annular gold-plated stainless steel opaque ring was placed on the quartz window to provide improved compartmentalization between the intermediate and outer zones. This improvement was achieved by reducing the radiative energy from the outer zone impinging on the interior location of the wafer and from the intermediate zone impinging on the edge of the wafer. The RTM was used for 4-inch wafer processing. The wafer was manually loaded onto three supporting quartz pins of low thermal mass. The wafer was placed in the center of the chamber, which was approximately 15 inches in diameter and 6 inches in height. Gas was injected via two jets. For the control experiments presented
Figure 73.14 Temperature trajectory of the three sensors over the first 100 seconds and the 5 minute process hold.
Figure 73.15 Powers of the three zones used to control temperature over the first 100 seconds.
below, temperature was measured with a thermocouple-instrumented wafer. Three thermocouples were bonded to the wafer along a common diameter at radial positions of 0 inch (center), 1 inch, and 1 7/8 inches. The experiments were conducted in an N2 environment at 1 atmosphere pressure. The control algorithms were evaluated for control of temperature uniformity in achieving a ramp from room temperature to 900°C at a ramp rate of 45°C/s followed by a hold for 5 minutes at 1 atm pressure and 1000 sccm (cc/min gas at standard conditions) N2 [4]. This trajectory typified low-temperature thermal oxidation or annealing operations. The ramp rate was selected to correspond to the performance limit (in terms of satisfying uniformity requirements) of the equipment. The control system utilized simultaneous IMC feedback and feedforward control. Gain scheduling was employed to compensate for the nonlinearities induced by radiative heating. The wafer temperature for the desired trajectory over the first 100 seconds is plotted in Figure 73.14 for the center, middle, and edge locations where the thermocouples are bonded. The ramp rate gradually increased to the specified 45°C/s and then decreased as the desired process hold temperature was approached. The corresponding lamp powers, that were manipulated by commands from the control system to achieve the desired
temperature trajectory, are shown in Figure 73.15. The time delay of the system can be seen by comparing the starting times of the lamp powers to the temperature response. Approximately a two-second delay existed in the beginning of the response. Of this delay, approximately 1.5 seconds was caused by a power surge interlock on the lamp power supplies, which only functions when the lamp power is below 15% of the total power. The remaining delay was caused by the sensor and filament heating dynamics. In the power profile plot, the rate limiting of the lamp powers is seen. This rate-limiting strategy was employed as a safety precaution to prevent a large inrush current to the lamps. However, these interlocks prevented higher values of ramp rates from being achieved. The nonuniformity of the controlled temperature trajectory was then analyzed. From the measurements of the entire five minute run (i.e., the first 100 seconds shown in Figure 73.14 along with an additional 400 second hold at 900°C not shown in the figure), the nonuniformity was computed by the peak-to-peak temperature error of the temperature measurements of the three thermocouples. The result is plotted in Figure 73.16. The maximum temperature nonuniformity of approximately 15°C occurred during the ramp around a mean temperature of 350°C. This nonuniformity occurred at a low temperature and does not affect processing or damage the wafer via slip. As the ramp progressed from this point, the nonuniformity decreased. The sig-
Figure 73.16 Temperature nonuniformity for the 5 minute run as measured by the three temperature sensors.
Figure 73.17 Temperatures of the quartz window and chamber base over the 5 minute run.
nificant sensor noise can be seen. The capability of the controller to hold the wafer temperature at a desired process temperature despite the presence of dynamic heating from extraneous sources was then examined. As seen in Figure 73.14, the control system held the wafer temperature at the desired value of 900°C. Although the sensors were quite noisy and had resolution of 0.5°C, the wafer temperature averaged over the entire hold portion for the three sensors corresponded to 900.9°C, 900.7°C, and 900.8°C, respectively. This result was desired because the uniformity of the process parameters, such as film thickness and resistivity, generally depends on the integrated or averaged temperature over time. The capability of the control system to hold the wafer temperature at the desired value, albeit slightly higher, is demonstrated by plotting the dynamics of the quartz window and chamber base of the RTM in Figure 73.17. The slow heating of these components of the RTM corresponded to slow disturbances to the wafer temperature. Because of the reduced gain of the controller to compensate for time delays, these disturbances impacted the closed-loop response by raising the temperature to a value slightly higher than the set point. However, without the feedback temperature control system, the wafer temperature would have drifted to a value more than 50°C higher than the set point, as opposed to less than 1°C in the measured wafer temperature.
73.7 Technology Transfer to Industry

After demonstrating the prototype RTP equipment at Stanford, the multivariable control strategy (including hardware and software) was transferred to Texas Instruments for application in the MMST program. This transfer involved integration on eight RTP reactors: seven on-line and one off-line. These RTP systems were eventually to be used in a 1000-wafer demonstration of two full-flow sub-half-micron CMOS process technologies in the MMST program at TI [18], [19], [20], [21], and [22]. Although there were similarities between the RTP equipment at TI and the three-zone RTP system at Stanford, there were also substantial differences. Two types of illuminators were used for MMST: a four-zone system constructed at Texas Instruments for MMST applications and a six-zone system manufactured by G2 Semiconductor Corp. Both systems utilized concentric zone heating; the TI system employed a circular arrangement of lamps, and the G2 system used a hexagonal arrangement of lamps. A thick (roughly 15 mm) quartz window was used in the TI system to separate the lamps from the reaction chamber, and a thin (3 mm) quartz window was used in the G2 system. Wafer rotation was not employed with the TI system but was used with the G2 system; the rotation rate was approximately 20 rpm. Moreover, six-inch wafer processing took place using up to four pyrometers for feedback. Most reactors employed different configurations of purge ring assemblies, guard rings, and susceptors (see Figure 73.5). The on-line RTP reactors configured with the IMC controller were used for thirteen different thermal processes: LPCVD Nitride, LPCVD Tungsten, Silicide react, Silicide anneal, sinter, LPCVD polysilicon, LPCVD amorphous silicon, germane clean, dry RTO, wet RTO, source/drain anneal, gate anneal, and tank anneal. These processes ranged from 450°C to 1100°C, from 1 to 650 torr pressure, and from 30 seconds to 5 minutes of processing time (see Figure 73.18).
Figure 73.18 List of processes controlled during the MMST program. The preheat temperature (T_ph) and time (t_ph) and the process temperature (T_pr) and time (t_pr) are given. The carrier gases and operating pressures are also presented.
There were several challenges in customizing the temperature control system for operation in an all-RTP factory environment [13]. These challenges included substantial differences among the eight reactors and thirteen processes, operation in a prototyping development environment, ill-conditioned processing equipment, calibrated pyrometers required for temperature sensing, equipment reliability tied to control power trajectories, multiple lamp-zone/sensor configurations, detection of equipment failures, and numerous operational and communication modes. Nonetheless, it was possible to develop a single computer control code with the flexibility to achieve all of the desired objectives. This was accomplished by developing a controller in a modular framework based on a standardized model of the process and equipment. The control structure remained the same while the model-based parameters of the controller differed from process to process and reactor to reactor. It was possible to read these parameters from a data file while holding the controller
code and logic constant. Consequently, it was only necessary to maintain and modify a single computer control code for the entire RTP factory. We present results here for an LPCVD-Nitride process that employed a TI illuminator and for an LPCVD-Poly process that employed a G2 illuminator. Additional temperature and process control results are presented in [13] and [6]. The desired temperature trajectory for the LPCVD-Nitride process involved a ramp to 850°C and then a hold at 850°C for roughly 180 seconds in a SiH4/NH3 deposition environment. Temperature was measured using four radially distributed 3.3 μm InAs pyrometers. The center and edge pyrometers were actively used for real-time feedback control, and the inner two pyrometers were used to monitor the temperature. The reasons for this arrangement were: (1) repeatable results were possible using only two pyrometers, (2) the benefits of using pyrometers for feedback could be assessed, and (3) fewer pyrometers had to be maintained during the marathon demonstration. In Figure 73.19, the center temperature measurement is shown for a 24-wafer lot process. The offsets in the plot during the ramps are merely due to differences in the starting points of the ramps. During the hold at 850°C, the reactive gases were injected, and the deposition took place. The standard deviation (computed over the 24 runs) of the temperature measurements during the deposition time was analyzed. In Figure 73.20, the standard deviation of the four sensor measurements is shown. The controlled sensors improved repeatability over the monitored sensor locations by a factor of seven. A three-sigma interpretation shows roughly that the controlled sensors held temperature to within ±0.3°C while the monitored sensors were repeatable to ±2.0°C. We analyzed the power trajectories to the lamp zones to evaluate the repeatability of the equipment. In Figure 73.21, the power to the center zone is presented for the 24 runs.
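The repeatability figures quoted above can be reproduced from logged data by computing, at each time point of the hold, the standard deviation across the 24 runs and then applying a three-sigma interpretation. A sketch with synthetic data (the array shapes and the noise level are illustrative, not measurements from the text):

```python
import numpy as np

# Synthetic temperature logs: 24 runs x 180 samples of the 850 C hold
# for one sensor. The 0.1 C noise level is illustrative only.
rng = np.random.default_rng(0)
runs = 850.0 + 0.1 * rng.standard_normal((24, 180))

# Run-to-run standard deviation at each time point of the deposition,
# then a single summary repeatability figure.
sigma_t = runs.std(axis=0, ddof=1)   # shape (180,)
sigma = sigma_t.mean()

# Three-sigma interpretation: runs stay within roughly +/- 3*sigma
# of the lot-mean trajectory at any instant of the deposition.
print(f"typical run-to-run sigma: {sigma:.3f} C")
print(f"three-sigma band: +/- {3 * sigma:.3f} C")
```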
The intermediate two zones were biased off the center and edge zones, respectively. From these results, it was clear that the lamp power decreased substantially during a nitride deposition run because the chamber and window heat more slowly than the wafer; because the chamber and window provide energy to the wafer, the energy required from the lamps to achieve a specified wafer temperature decreased as the chamber and window heated up. In addition, we noted the chamber and window heating effect from run to run by observing the lowered lamp energy requirements as the lot processing progressed. These observations can be used in developing fault detection algorithms. To study the effect of temperature control on the process parameters, we compared the thickness of the LPCVD poly film measured at the center of each wafer of a 24-wafer lot where multizone feedback temperature control was used with that of a lot where no real-time feedback temperature control (i.e., open-loop operation) was used. For the open-loop case, a predetermined lamp power trajectory was replayed for each wafer of the 24-wafer lot. The comparison is shown in Figure 73.22. It is clear that multizone feedback control is much better than open-loop control. In some sense, this comparison is a worst-case analysis, since the lamp powers themselves for both cases had no control, which is not usual in industry. In our experiments, variations in line voltage
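The run-to-run decline in lamp energy noted above suggests one simple fault detection check: fit the run-to-run trend of per-run integrated lamp energy and flag any run that departs from it. A sketch with hypothetical data and an illustrative threshold:

```python
import numpy as np

def flag_energy_outliers(energies, nsigma=3.0):
    """Fit a linear run-to-run trend to per-run integrated lamp energy
    and flag runs whose residual exceeds nsigma residual standard
    deviations. The nsigma threshold is an illustrative choice."""
    runs = np.arange(len(energies))
    slope, intercept = np.polyfit(runs, energies, 1)
    resid = energies - (slope * runs + intercept)
    sigma = resid.std(ddof=2)  # two fitted parameters
    return np.flatnonzero(np.abs(resid) > nsigma * sigma)

# Hypothetical lot: energy falls steadily as the chamber and window
# warm up, with one anomalous run (index 10) simulating a fault.
energies = 100.0 - 0.5 * np.arange(24)
energies[10] += 6.0
print(flag_energy_outliers(energies))  # -> [10]
```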
Figure 73.19 The center temperature of the LPCVD nitride process for a 24-wafer lot run.

Figure 73.20 Standard deviation of temperature measurements during the three minute deposition step of the LPCVD nitride process for the 24-wafer lot run.

Figure 73.21 Power to the center zone for the 24-wafer lot run of the LPCVD nitride process.

Figure 73.22 Comparison of multizone feedback control and open-loop lamp replay for LPCVD polysilicon process.
proceeded unfiltered through the reactor, causing unprovoked fluctuations in the lamp power and inducing strong temperature effects. However, the feedback temperature control system can compensate somewhat for these fluctuations. For the open-loop case, these fluctuations pass through directly and result in unacceptable repeatability.

73.8 Conclusions

A systems approach has been used for a study in semiconductor manufacturing. This methodology has included developing models, analyzing alternative equipment designs from a control perspective, establishing model identification techniques to develop a model for control design, developing a real-time control system and embedding it within a control processor, proof-of-concept testing with a prototype system, and then transferring the control technology to industry. This application has shown the role that control methodologies can play in semiconductor device manufacturing.
References

[1] Lord, H., Thermal and stress analysis of semiconductor wafers in a rapid thermal processing oven, IEEE Trans. Semicond. Manufact., 1, 141–150, 1988.
[2] Norman, S.A., Wafer Temperature Control in Rapid Thermal Processing, Ph.D. Thesis, Stanford University, 1992.
[3] Cho, Y., Schaper, C., and Kailath, T., Low order modeling and dynamic characterization of rapid thermal processing, Appl. Phys. A: Solids and Surfaces, A:54(4), 317–326, 1992.
[4] Norman, S.A., Schaper, C.D., and Boyd, S.P., Improvement of temperature uniformity in rapid thermal processing systems using multivariable control. In Mater. Res. Soc. Proc.: Rapid Thermal and Integrated Processing, Materials Research Society, 1991.
[5] Saraswat, K. and Apte, P., Rapid thermal processing uniformity using multivariable control of a circularly symmetric three-zone lamp, IEEE Trans. on Semicond. Manufact., 5, 1992.
[6] Saraswat, K., Schaper, C., Moslehi, M., and Kailath, T., Modeling, identification, and control of rapid thermal processing, J. Electrochem. Soc., 141(11), 3200–3209, 1994.
[7] Cho, Y., Fast Subspace Based System Identification: Theory and Practice, Ph.D. Thesis, Stanford University, CA, 1993.
[8] Cho, Y. and Kailath, T., Model identification in rapid thermal processing systems, IEEE Trans. Semicond. Manufact., 6(3), 233–245, 1993.
[9] Roy, R., Paulraj, A., and Kailath, T., ESPRIT - a subspace rotation approach to estimation of parameters of cisoids in noise, IEEE Trans. ASSP, 34(5), 1340–1342, 1986.
[10] Schaper, C., Cho, Y., Park, P., Norman, S., Gyugyi, P., Hoffmann, G., Balemi, S., Boyd, S., Franklin, G., Kailath, T., and Saraswat, K., Dynamics and control of a rapid thermal multiprocessor. In SPIE Conference on Rapid Thermal and Integrated Processing, September 1991.
[11] Cho, Y.M., Xu, G., and Kailath, T., Fast recursive identification of state-space models via exploitation of displacement structure, Automatica, 30(1), 45–59, 1994.
[12] Gyugyi, P., Cho, Y., Franklin, G., and Kailath, T., Control of rapid thermal processing: A system theoretic approach. In IFAC World Congress, 1993.
[13] Saraswat, K., Schaper, C., Moslehi, M., and Kailath, T., Control of MMST RTP: Uniformity, repeatability, and integration for flexible manufacturing, IEEE Trans. on Semicond. Manufact., 7(2), 202–219, 1994.
[14] Gyugyi, P., Application of Model-Based Control to Rapid Thermal Processing Systems, Ph.D. Thesis, Stanford University, 1993.
[15] Gyugyi, P.J., Cho, Y.M., Franklin, G., and Kailath, T., Convex optimization of wafer temperature trajectories for rapid thermal processing. In The 2nd IEEE Conf. Control Appl., Vancouver, 1993.
[16] Norman, S.A., Optimization of transient temperature uniformity in RTP systems, IEEE Trans. Electron Dev., January 1992.
[17] Gill, P.E., Hammarling, S.J., Murray, W., Saunders, M.A., and Wright, M.H., User's guide for LSSOL (Version 1.0): A FORTRAN package for constrained least-squares and convex quadratic programming, Tech. Rep. SOL 86-1, Operations Research Dept., Stanford University, Stanford, CA, 1986.
[18] Chatterjee, P. and Larrabee, G., Manufacturing for the gigabit age, IEEE Trans. on VLSI Technology, 1, 1993.
[19] Bowling, A., Davis, C., Moslehi, M., and Luttmer, J., Microelectronics manufacturing science and technology: Equipment and sensor technologies, TI Technical J., 9, 1992.
[20] Davis, C., Moslehi, M., and Bowling, A., Microelectronics manufacturing science and technology: Single-wafer thermal processing and wafer cleaning, TI Technical J., 9, 1992.
[21] Moslehi, M. et al., Single-wafer processing tools for agile semiconductor production, Solid State Technol., 37(1), 35–45, 1994.
[22] Saraswat, K. et al., Rapid thermal multiprocessing for a programmable factory for adaptable manufacturing of ICs, IEEE Trans. on Semicond. Manufact., 7(2), 159–175, 1994.
SECTION XVI Mechanical Control Systems
Automotive Control Systems

J. A. Cook, Ford Motor Company, Scientific Research Laboratory, Control Systems Department, Dearborn, MI
J. W. Grizzle, Department of EECS, Control Systems Laboratory, University of Michigan, Ann Arbor, MI
J. Sun, Ford Motor Company, Scientific Research Laboratory, Control Systems Department, Dearborn, MI
M. K. Liubakka, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
D. S. Rhode, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
J. R. Winkelman, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
P. V. Kokotović, ECE Department, University of California, Santa Barbara, CA

74.1 Engine Control ... 1261
    Introduction • Air-Fuel Ratio Control System Design • Idle Speed Control • Acknowledgments
References ... 1274
74.2 Adaptive Automotive Speed Control ... 1274
    Introduction • Design Objectives • The Design Concept • Adaptive Controller Implementation
74.3 Performance in Test Vehicles ... 1279
    Conclusions
Appendix ... 1281
References ... 1285
74.1 Engine Control
J. A. Cook, Ford Motor Company, Scientific Research Laboratory, Control Systems Department, Dearborn, MI
J. W. Grizzle, Department of EECS, Control Systems Laboratory, University of Michigan, Ann Arbor, MI
J. Sun, Ford Motor Company, Scientific Research Laboratory, Control Systems Department, Dearborn, MI

74.1.1 Introduction

Automotive engine control systems must satisfy diverse and often conflicting requirements. These include regulating exhaust emissions to meet increasingly stringent standards without sacrificing good drivability; providing increased fuel economy to satisfy customer desires and to comply with Corporate Average Fuel Economy (CAFE) regulations; and delivering these performance objectives at low cost, with the minimum set of sensors and actuators. The dramatic evolution in vehicle electronic control systems over the past two decades is substantially in response to the first of these requirements. It is the capacity and flexibility of microprocessor-based digital control systems, introduced in the 1970s to address the problem of emission control, that have resulted in the improved function and added convenience, safety, and performance features that distinguish the modern automobile [8]. Although the problem of automotive engine control may encompass a number of different power plants, the one with which this chapter is concerned is the ubiquitous four-stroke cycle, spark ignition, internal combustion gasoline engine. Mechanically, this power plant has remained essentially the same since Nikolaus Otto built the first successful example in 1876. In automotive applications, it consists most often of four, six, or eight cylinders wherein reciprocating pistons transmit power via a simple connecting rod and crankshaft mechanism to the wheels. Two complete revolutions of the crankshaft comprise the following sequence of operations.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
THE CONTROL HANDBOOK
The initial 180 degrees of crankshaft revolution is the intake stroke, where the piston travels from top-dead-center (TDC) in the cylinder to bottom-dead-center (BDC). During this time an intake valve in the top of the cylinder is opened and a combustible mixture of air and fuel is drawn in from an intake manifold. Subsequent 180-degree increments of crankshaft revolution comprise the compression stroke, where the intake valve is closed and the mixture is compressed as the piston moves back to the top of the cylinder; the combustion stroke when, after the mixture is ignited by a spark plug, torque is generated at the crankshaft by the downward motion of the piston caused by the expanding gas; and finally, the exhaust stroke, when the piston moves back up in the cylinder, expelling the products of combustion through an exhaust valve. Three fundamental control tasks affect emissions, performance, and fuel economy in the spark ignition engine: (1) air-fuel ratio (A/F) control, that is, providing the correct ratio of air and fuel for efficient combustion to the proper cylinder at the right time; (2) ignition control, which refers to firing the appropriate spark plug at the precise instant required; and (3) control of exhaust gas recirculation to the combustion process to reduce the formation of oxides of nitrogen (NOx) emissions.
Ignition Control

The spark plug is fired near the end of the compression stroke, as the piston approaches TDC. For any engine speed, the optimal time during the compression stroke for ignition to occur is the point at which the maximum brake torque (MBT) is generated. Spark timing significantly in advance of MBT risks damage from the piston moving against the expanding gas. As the ignition event is retarded from MBT, less combustion pressure is developed and more energy is lost to the exhaust stream. Numerous methods exist for energizing the spark plugs. For most of automotive history, cam-activated breaker points were used to develop a high voltage in the secondary windings of an induction coil connected between the battery and a distributor. Inside the distributor, a rotating switch, synchronized with the crankshaft, connected the coil to the appropriate spark plug. In the early days of motoring, the ignition system control function was accomplished by the driver, who manipulated a lever located on the steering wheel to change ignition timing. A driver who neglected to retard the spark when attempting to start a hand-cranked Model T Ford could suffer a broken arm if he experienced "kickback." Failing to advance the spark properly while driving resulted in less than optimal fuel economy and power. Before long, elaborate centrifugal and vacuum-driven distributor systems were developed to adjust spark timing with respect to engine speed and torque. The first digital electronic engine control systems accomplished ignition timing simply by mimicking the functionality of their mechanical predecessors. Modern electronic ignition systems sense crankshaft position to provide accurate cycle-time information and may use barometric pressure, engine coolant temperature, and throttle position along with engine speed and intake manifold pressure to schedule ignition events for the best fuel economy and drivability subject
to emissions and spark knock constraints. Additionally, ignition timing may be used to modulate torque to improve transmission shift quality and, in a feedback loop, as one control variable to regulate engine idle speed. In modern engines, the electronic control module activates the induction coil in response to the sensed timing and operating point information and, in concert with dedicated ignition electronics, routes the high voltage to the correct spark plug. One method of providing timing information to the control system is by using a magnetic proximity pickup and a toothed wheel driven from the crankshaft to generate a square wave signal indicating TDC for successive cylinders. A signature pulse of unique duration is often used to establish a reference from which absolute timing can be determined. During the past ten years there has been substantial research and development interest in using in-cylinder piezoelectric or piezoresistive combustion pressure sensors for closed-loop feedback control of individual cylinder spark timing to MBT or to the knock limit. The advantages of combustion-pressure-based ignition control are reduced calibration and increased robustness to variability in manufacturing, environment, fuel, and component aging. The cost is an increased sensor set and additional computing power.
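One way to recover the absolute reference from such a tooth train is to flag the signature pulse by its longer duration relative to the normal teeth. The sketch below illustrates the idea; the tooth counts, widths, and the detection ratio are illustrative assumptions, not values from the text.

```python
def find_reference_tooth(pulse_widths, ratio=1.5):
    """Return the index of the signature (reference) pulse in one
    revolution's worth of tooth pulse widths, or None if absent.

    The signature pulse is identified as one whose duration exceeds the
    median tooth duration by a chosen ratio; `ratio` is an illustrative
    threshold, not a value from the text.
    """
    widths = sorted(pulse_widths)
    median = widths[len(widths) // 2]
    for i, w in enumerate(pulse_widths):
        if w > ratio * median:
            return i
    return None  # no signature pulse in this window

# Example: seven normal teeth of ~1 ms and one 2.5 ms signature pulse.
teeth = [1.0, 1.02, 0.98, 2.5, 1.01, 0.99, 1.0, 1.03]
assert find_reference_tooth(teeth) == 3
```

Once the signature tooth is located, every subsequent tooth edge gives the controller crank position modulo one revolution, from which TDC for each cylinder can be scheduled.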
Exhaust Gas Recirculation

Exhaust gas recirculation (EGR) systems were introduced as early as 1973 to control NOx emissions. The principle of EGR is to reduce NOx formation during the combustion process by diluting the inducted air-fuel charge with inert exhaust gas. In electronically controlled EGR systems, this is accomplished using a metering orifice in the exhaust manifold to enable a portion of the exhaust gas to flow from the exhaust manifold through a vacuum-actuated EGR control valve and into the intake manifold. Feedback based on the difference between the desired and measured pressure drop across the metering orifice is employed to duty-cycle modulate a vacuum regulator controlling the EGR valve pintle position. Because manifold pressure rate and engine torque are directly influenced by EGR, the dynamics of the system can have a significant effect on engine response and, ultimately, vehicle drivability. Such dynamics are dominated by the valve chamber filling response time to changes in the EGR duty cycle command. The system can be represented as a pure transport delay, associated with the time required to build up sufficient vacuum to overcome pintle shaft friction, cascaded with first-order dynamics incorporating a time constant that is a function of engine exhaust flow rate. Typically, the EGR control algorithm is a simple proportional-integral (PI) or proportional-integral-derivative (PID) loop. Nonetheless, careful control design is required to provide good emission control without sacrificing vehicle performance. An unconventional method to accomplish NOx control by exhaust recirculation is to directly manipulate the timing of the intake and exhaust valves. Variable-cam-timing (VCT) engines have demonstrated NOx control using mechanical and hydraulic actuators to adjust valve timing and to affect the amount of internal EGR remaining in the cylinder after the exhaust stroke is completed. Early exhaust valve closing has the additional advantage that unburned hydrocarbons (HC) normally emitted to the exhaust stream are recycled through a second combustion event, reducing HC emissions as well. Although VCT engines eliminate the normal EGR system dynamics, the fundamentally multivariable nature of the resulting system presages a difficult engine control problem.
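The delay-plus-lag structure of the EGR plant and the PI loop around it can be exercised in simulation. The sketch below uses a pure transport delay cascaded with a first-order lag, as described above; all gains, the delay length, and the time constant are illustrative assumptions, not values from the text.

```python
from collections import deque

def simulate_egr_pi(setpoint, kp, ki, tau, delay_steps, dt, n_steps):
    """Simulate a PI loop around a first-order-plus-delay EGR valve model.

    Plant: transport delay (vacuum build-up against pintle friction)
    followed by a first-order lag with time constant tau. Constants are
    illustrative, not values from the text.
    """
    y = 0.0                           # measured pressure drop (normalized)
    integ = 0.0
    pipe = deque([0.0] * delay_steps, maxlen=delay_steps)
    history = []
    for _ in range(n_steps):
        err = setpoint - y
        integ += err * dt
        u = kp * err + ki * integ     # duty cycle command
        u = min(max(u, 0.0), 1.0)     # duty cycle saturates at [0, 1]
        pipe.append(u)
        u_delayed = pipe[0]           # command issued delay_steps ago
        y += dt / tau * (u_delayed - y)   # first-order valve/orifice lag
        history.append(y)
    return history

out = simulate_egr_pi(setpoint=0.5, kp=0.8, ki=0.6, tau=0.4,
                      delay_steps=5, dt=0.02, n_steps=2000)
```

With these moderate gains the loop settles at the commanded pressure drop; raising the proportional gain against a longer transport delay is the classic route to oscillation, which is why the text emphasizes careful design despite the simple PI structure.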
Air-Fuel Ratio Control

Historically, fuel control was accomplished by a carburetor that used a venturi arrangement and a simple float-and-valve mechanism to meter the proper amount of fuel to the engine. For special operating conditions, such as idle or acceleration, additional mechanical and vacuum circuitry was required to assure satisfactory engine operation and good drivability. The demise of the carburetor was occasioned by the advent of three-way catalytic converters (TWC) for emission control. These devices simultaneously convert oxidizing [HC and carbon monoxide (CO)] and reducing (NOx) species in the exhaust but, as shown in Figure 74.1, require precise control of A/F to the stoichiometric value to be effective. Consequently, the electronic fuel system of a modern spark ignition automobile engine employs individual fuel injectors located in the inlet manifold runners close to the intake valves to deliver accurately timed and metered fuel to all cylinders. The injectors are regulated by an A/F control system that has two primary components: a feedback portion, in which a signal related to A/F from an exhaust gas oxygen (EGO) sensor is fed back through a digital controller to regulate the pulse width command sent to the fuel injectors; and a feedforward portion, in which injector fuel flow is adjusted in response to a signal from an air flow meter. The feedback, or closed-loop portion of the control system, is fully effective only under steady-state conditions and when the EGO sensor has attained the proper operating temperature. The feedforward, or open-loop portion of the control system, is particularly important when the engine is cold (before the closed-loop A/F control is operational) and during transient operation [when the significant delay between the injection of fuel (usually during the exhaust stroke, just before the intake valve opens) and the appearance of a signal at the EGO sensor (possibly long after the conclusion of the exhaust stroke) inhibits good control]. First, in Section 74.1.2, the open-loop A/F control problem is examined with emphasis on accounting for sensor dynamics. Then, the closed-loop problem is addressed from a modern control systems perspective, where individual cylinder control of A/F is accomplished using a single EGO sensor.

Figure 74.1 Typical TWC efficiency curves.

Idle Speed Control

In addition to these essential tasks of controlling ignition, A/F, and EGR, the typical on-board microprocessor performs many other diagnostic and control functions. These include electric fan control, purge control of the evaporative emissions canister, turbocharged engine wastegate control, overspeed control, electronic transmission shift scheduling and control, cruise control, and idle speed control (ISC). The ISC requirement is to maintain constant engine RPM at closed throttle while rejecting disturbances such as the automatic transmission neutral-to-drive transition, air conditioner compressor engagement, and power steering lock-up. The idle speed problem is a difficult one, especially for small engines at low speeds where marginal torque reserve is available for disturbance rejection. The problem is made more challenging by the fact that significant parameter variation can be expected over the substantial range of environmental conditions in which the engine must operate. Finally, the ISC design is subject not only to quantitative performance requirements, such as overshoot and settling time, but also to more subjective measures of performance, such as idle quality and the degree of noise and vibration communicated to the driver through the body structure. The ISC problem is addressed in Section 74.1.3.

74.1.2 Air-Fuel Ratio Control System Design

Due to the precipitous falloff of TWC efficiency away from stoichiometry, the primary objective of the A/F control system is to maintain the fuel metering in a stoichiometric proportion to the incoming air flow [the only exception to this occurs in heavy load situations, where a rich mixture is required to avoid premature detonation (or knock) and to keep the TWC from overheating]. Variation in air flow commanded by the driver is treated as a disturbance to the system. A block diagram of the control structure is illustrated in Figure 74.2, and the two major subcomponents treated here are highlighted in bold outline. The first part of this section describes the development and implementation of a cylinder air charge estimator for predicting the air charge entering the cylinders downstream of the intake manifold plenum on the basis of available measurements of air mass flow rate upstream of the throttle. The air charge estimate is used to form the base fuel calculation, which is often then modified to account for any fuel-puddling dynamics and the delay associated with closed-valve fuel injection timing. Finally, a classical, time-invariant, single-input single-output (SISO) PI controller is normally used to correct for any persistent errors in the open-loop fuel calculation by adjusting the average A/F to perceived stoichiometry.

Figure 74.2 Basic A/F control loop showing major feedforward and feedback elements.
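The base fuel calculation and the PI trim toward perceived stoichiometry can be sketched as follows. The stoichiometric ratio of roughly 14.6 for gasoline is standard; the injector slope, the gains, and the multiplicative form of the trim are illustrative assumptions, not details from the text.

```python
def fuel_pulse_width(cac_lbm, afr_target=14.6, injector_slope=1e-5,
                     pi_correction=1.0):
    """Base fuel calculation: convert an estimated cylinder air charge
    into an injector pulse width.

    cac_lbm        -- estimated air charge for the cylinder (lbm)
    afr_target     -- stoichiometric air/fuel ratio (~14.6 for gasoline)
    injector_slope -- lbm of fuel delivered per ms of pulse (illustrative)
    pi_correction  -- multiplicative trim from the closed-loop controller
    """
    fuel_lbm = cac_lbm / afr_target * pi_correction
    return fuel_lbm / injector_slope   # pulse width in ms

class EgoPiTrim:
    """PI trim driven by a switching EGO sensor: the error is only the
    sign (rich/lean) of the sensor state. Gains are illustrative."""
    def __init__(self, kp=0.02, ki=0.005):
        self.kp, self.ki, self.integ = kp, ki, 0.0

    def update(self, ego_rich):
        err = -1.0 if ego_rich else 1.0   # rich: remove fuel; lean: add
        self.integ += self.ki * err
        return 1.0 + self.kp * err + self.integ
```

Because the sensor only switches, the trim produces the familiar limit-cycle behavior about stoichiometry rather than asymptotic regulation; the integrator absorbs persistent errors in the open-loop calculation, as described above.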
Even if the average A/F is controlled to stoichiometry, individual cylinders may be operating consistently rich or lean of the desired value. This cylinder-to-cylinder A/F maldistribution is due, in part, to injector variability. Consequently, fuel injectors are machined to close tolerances to avoid individual cylinder flow discrepancies, resulting in a high cost per injector. However, even if the injectors are perfectly matched, maldistribution can arise from individual cylinders having different breathing characteristics due to a combination of factors, such as intake manifold configuration and valve characteristics. It is known that such A/F maldistribution can result in increased emissions due to shifts in the closed-loop A/F setpoint relative to the TWC [9]. The second half of this section describes the development of a nonclassical, periodically time-varying controller for tuning the A/F in each cylinder to eliminate this maldistribution.
Hardware Assumptions

The modeling and control methods presented here are applicable to just about any fuel-injected engine. For illustration purposes only, it is assumed that the engine is a port fuel-injected V8 with independent fuel control for each bank of cylinders. The cylinders are numbered one through four, starting from the front of the right bank, and five through eight, starting from the front of the left bank. The firing order of the engine is 1-3-7-2-6-5-4-8, which is not symmetric from bank to bank. Fuel injection is timed to occur on a closed valve prior to the intake stroke (induction event). For the purpose of closed-loop control, the engine is equipped with a switching-type EGO sensor located at the confluence of the individual exhaust runners, just upstream of the catalytic converter. Such sensors typically incorporate a ZrO2 ceramic thimble employing a platinum catalyst on the exterior surface to equilibrate the exhaust gas mixture. The interior surface of the sensor is exposed to the atmosphere. The output
voltage is exponentially related to the ratio of O2 partial pressures across the ceramic, and thus the sensor is essentially a switching device indicating by its state whether the exhaust gas is rich or lean of stoichiometry.
Cylinder Air Charge Computation This section describes the development and implementation of an air charge estimator for an eight-cylinder engine. A very real practical problem is posed by the fact that the hot-wire anemometers currently used to measure mass air flow rate have relatively slow dynamics. Indeed, the time constant of this sensor is often on the order of an induction event for an engine speed of 1500 RPM and is only about four to five times faster than the dynamics of the intake manifold. Taking these dynamics into account in the air charge estimation algorithm can significantly improve the accuracy of the algorithm and have substantial benefits for reducing emissions.
Basic Model The air path of a typical engine is depicted in Figure 74.3. An associated lumped-parameter phenomenological model suitable for developingan on-line cylinder air charge estimator [2] is now described. Let P, V, T and rn be the pressure in the intake manifold (psi), volume of the intake manifold and runners (liters), temperature (OR),and mass (lbm) of the air in the intake manifold, respectively. Invoking the ideal gas law, and assuming that the manifold air temperature is slowly varying, leads to
where MA Fa is the actual mass air flowmetered in by the throttle, R is the molar gas constant, CyE(N, P, T E c ,Tj) is the average instantaneous air flow pumped out of the intake manifold by the cylinders, as a function of engine speed, N (RPM), manifold pressure, engine coolant temperature, TEC (OR), and air inlet temperature, Ti( O R ) . It is assumed that both MAFa and C y l ( N , P , TEC,Tj) have units of lbmls. The dependence of the cylinder pumping or induction funrtion on variations of the engine coolant and air inlet temperatures
74.1. ENGINE CONTROL
Figure 74.3 Schematic diagram of air path in engine (throttle, hot-wire probe, intake manifold runner, and intake valve).
is modeled empirically by [10], as
where the superscript "mapping" denotes the corresponding temperatures (°R) at which the function Cyl(N, P) is determined, based on engine mapping data. An explicit procedure for determining this function is explained in the next subsection. Cylinder air charge per induction event, CAC, can be determined directly from Equation 74.1. In steady state, the integral of the mass flow rate of air pumped out of the intake manifold over two engine revolutions, divided by the number of cylinders, is the air charge per cylinder. Since engine speed is nearly constant over a single induction event, and the time in seconds for two engine revolutions is 120/N, a good approximation of the inducted air charge on a per-cylinder basis is given by
CAC = (120/(nN)) Cyl(N, P, TEC, Ti) lbm,    (74.3)
where n is the number of cylinders. The final element to be incorporated in the model is the mass air flow meter. The importance of including this was demonstrated in [2]. For the purpose of achieving rapid on-line computations, a simple first-order model is used:

γ d(MAF_m)/dt + MAF_m = MAF_a,    (74.4)
where MAF_m is the measured mass air flow and γ is the time constant of the air meter. Substituting the left-hand side of Equation 74.4 for MAF_a in Equation 74.1 yields

dP/dt = (RT/V) [γ d(MAF_m)/dt + MAF_m − Cyl(N, P, TEC, Ti)].    (74.5)
Note that the effect of including the mass air flow meter's dynamics is to add a feedforward term involving the mass air flow rate to the cylinder air charge computation. When γ = 0, Equations 74.6 and 74.7 reduce to an estimator that ignores the air meter's dynamics or, equivalently, treats the sensor as being infinitely fast. Determining Model Parameters The pumping function Cyl(N, P) can be determined on the basis of steady-state engine mapping data. Equip the engine with a high-bandwidth manifold absolute pressure (MAP) sensor and exercise the engine over the full range of speed and load conditions while recording the steady-state value of the instantaneous mass air flow rate as a function of engine speed and manifold pressure. For this purpose, any external exhaust gas recirculation should be disabled. A typical data set should cover every 500 RPM of engine speed from 500 to 5,000 RPM and every half psi of manifold pressure from 3 psi to atmosphere. For the purpose of making these measurements, it is preferable to use a laminar air flow element as this, in addition, allows the calibration of the mass air flow meter to be verified. Record the engine coolant and air inlet temperatures for use in Equation 74.2. The function Cyl(N, P) can be represented as a table lookup or as a polynomial regressed against the above mapping data.
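Since the mapping grid is regular in speed and pressure, a table-lookup representation of Cyl(N, P) amounts to bilinear interpolation. A minimal sketch follows; the grid spacing mirrors the text, but the surface values and the helper name `cyl` are made up purely for illustration:

```python
import numpy as np

# Hypothetical mapping grid matching the text: every 500 RPM from 500 to 5000,
# every 0.5 psi from 3 psi up to a nominal atmospheric 14.5 psi.
speeds = np.arange(500.0, 5001.0, 500.0)        # RPM
pressures = np.arange(3.0, 14.51, 0.5)          # psi
# cyl_map[i, j] = steady-state mass air flow (lbm/s) at speeds[i], pressures[j];
# this smooth surface is invented for illustration, not engine data.
cyl_map = 1e-5 * np.outer(speeds, pressures)

def cyl(N, P):
    """Bilinear table lookup of the pumping function Cyl(N, P)."""
    i = int(np.clip(np.searchsorted(speeds, N) - 1, 0, len(speeds) - 2))
    j = int(np.clip(np.searchsorted(pressures, P) - 1, 0, len(pressures) - 2))
    tN = (N - speeds[i]) / (speeds[i + 1] - speeds[i])
    tP = (P - pressures[j]) / (pressures[j + 1] - pressures[j])
    return ((1 - tN) * (1 - tP) * cyl_map[i, j]
            + tN * (1 - tP) * cyl_map[i + 1, j]
            + (1 - tN) * tP * cyl_map[i, j + 1]
            + tN * tP * cyl_map[i + 1, j + 1])
```

In a production controller the same lookup would be driven by the regressed mapping data rather than a synthetic surface.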
The time constant of the air meter is best determined by installing the meter on a flow bench and applying step or rapid sinusoidal variations in air flow to the meter. Methods for fitting an approximate first-order model to the data can be found in any textbook on classical control. A typical value is γ = 20 ms. Though not highly recommended, a value for the time constant can be determined by on-vehicle calibration, if an accurate determination of the pumping function has been completed. This is explained at the end of the next subsection. Model Discretization and Validation The estimator modeled by Equations 74.6 and 74.7 must be discretized for implementation. In engine models, an event-based sampling scheme is often used [7]. For illustration purposes, the discretization is carried out here for a V8; the modifications required for other configurations will be evident. Let k be the recursion index and let Δt_k be the elapsed time in seconds per 45 degrees of crank-angle advancement, or one eighth of a revolution; that is, Δt_k = 7.5/N_k, where N_k is the current engine speed in RPM. Then Equation 74.6 can be Euler-integrated as

x_{k+1} = x_k + Δt_k (R T_k / V) [MAF_{m,k} − Cyl(N_k, x_k + γ (R T_k / V) MAF_{m,k}, TEC, Ti)].    (74.8)
To eliminate the derivative of MAF_m in Equation 74.5, let x = P − γ (RT/V) MAF_m. This yields

dx/dt = (RT/V) [MAF_m − Cyl(N, x + γ (RT/V) MAF_m, TEC, Ti)].    (74.6)

Cylinder air charge is then computed from Equation 74.3 as

CAC = (120/(nN)) Cyl(N, x + γ (RT/V) MAF_m, TEC, Ti).    (74.7)
THE CONTROL HANDBOOK
The cylinder air charge is calculated by

CAC_k = 2 Δt_k Cyl(N_k, x_k + γ (R T_k / V) MAF_{m,k}, TEC, Ti)    (74.10)

and need be computed only once per 90 crank-angle degrees. The accuracy of the cylinder air charge model can be easily validated on an engine dynamometer equipped to maintain constant engine speed. Apply very rapid throttle tip-ins and tip-outs, as in Figure 74.4, while holding the engine speed constant. If the model parameters have been properly determined, the calculated manifold pressure accurately tracks the measured manifold pressure. Figure 74.5 illustrates one such test at 1500 RPM. The
From Figure 74.5, it can be seen that the air meter time constant has been accurately identified in this test. If the value for γ in Equation 74.4 had been chosen too large, the computed manifold pressure would lead the measured manifold pressure; conversely, if γ were too small, the computed value would lag the measured value.
Eliminating A/F Maldistribution through Feedback Control A/F maldistribution is evidenced by very rapid switching of the EGO sensor on an engine-event-by-engine-event basis. Such cylinder-to-cylinder A/F maldistribution can result in increased emissions due to shifts in the closed-loop A/F setpoint relative to the TWC [9]. The trivial control solution, which consists of placing individual EGO sensors in each exhaust runner and wrapping independent PI controllers around each injector-sensor pair, is not acceptable from an economic point of view; technically, there would also be problems due to the nonequilibrium condition of the exhaust gas immediately upon exiting the cylinder. This section details the development of a controller that eliminates A/F maldistribution on the basis of a single EGO sensor per engine bank. The controller is developed for the left bank of the engine. Basic Model A control-oriented block diagram for the A/F system of the left bank of an eight-cylinder engine is depicted in Figure 74.6. The model evolves at an engine speed-
Figure 74.4 Engine operating conditions at nominal 1500 RPM.
dynamic responses of the measured and computed values match up quite well. There is some inaccuracy in the quasi-steady-state values at 12 psi; this corresponds to an error in the pumping function Cyl(N, P) at high manifold pressures, so, in this operating condition, it should be reevaluated.
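The event-based estimator described above (Equations 74.6 and 74.10) is a short recursion. A sketch follows; the value of RT/V, the placeholder pumping function, and the constant-speed input sequences are all hypothetical stand-ins for calibrated quantities:

```python
R_T_over_V = 50.0   # RT/V in the units of the text; value chosen only for illustration
gamma = 0.020       # air meter time constant (s); typical value cited in the text

def cyl(N, P):
    # Placeholder pumping function (lbm/s); the real one comes from mapping data.
    return 1e-5 * N * P

def air_charge_estimator(N_seq, mafm_seq, x0=0.0):
    """Euler-integrated cylinder air charge estimator at 45-degree events."""
    x = x0
    cac = []
    for N, mafm in zip(N_seq, mafm_seq):
        dt = 7.5 / N                              # seconds per 45 deg of crank angle
        P_hat = x + gamma * R_T_over_V * mafm     # reconstructed manifold pressure
        x = x + dt * R_T_over_V * (mafm - cyl(N, P_hat))   # state update
        cac.append(2.0 * dt * cyl(N, P_hat))      # Eq. 74.10 with n = 8 cylinders
    return x, cac
```

Driving the sketch at a constant 1500 RPM and constant measured air flow lets the state settle to the pressure at which the pumping function balances the metered flow, mimicking the quasi-steady-state behavior seen in the dynamometer validation.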
Figure 74.5 Comparison of measured and computed manifold pressure.

Figure 74.6 Control-oriented block diagram of A/F subsystem.
dependent sampling interval of 90 crank-angle degrees, consistent with the eight-cylinder geometry (one exhaust event occurs every 90 degrees of crankshaft rotation). The engine is represented by injector gains G1 through G8 and a pure delay z^{-d}, which accounts for the number of engine events that occur from the time that an individual cylinder's fuel injector pulse width is computed until the corresponding exhaust valve opens to admit the mixture to the exhaust manifold. The transfer function H(z) represents the mixing process of the exhaust gases from the individual exhaust ports to the EGO sensor location, including any transport delay. The switching-type EGO sensor is represented by a first-order transfer function followed by a preload (switch) nonlinearity.
Note that only cylinders 5 through 8 are inputs to the mixing model, H(z). This is due to the fact that the separate banks of the V8 engine are controlled independently. The gains and delays for cylinders 1 through 4 correspond to the right bank of the engine and are included in the diagram only to represent the firing order. Note furthermore that cylinders 5 and 6 exhaust within 90 degrees of one another, whereas cylinders 7 and 8 exhaust 270 degrees apart. Since the exhaust stroke of cylinder 6 is not complete before the exhaust stroke of cylinder 5 commences, for any exhaust manifold configuration, there is mixing of the exhaust gases from these cylinders at the point where the EGO sensor samples the exhaust gas. We adopt the notation that the sampling index, k, is a multiple of 90 degrees; that is, x(k) is the quantity x at k·90 degrees. Moreover, if we are looking at a signal's value during a particular point of an engine cycle, then we will denote this by x(8k + j), which is x at (8k + j)·90 degrees, or x at the j-th event of the k-th engine cycle. The initial time will be taken as k = 0 at TDC of the compression stroke of cylinder 1. The basic model for a V6 or a four-cylinder engine is simpler; see [3]. A dynamic model of the exhaust manifold mixing is difficult to determine with current technology. This is because a linear A/F measurement is required, and, currently, such sensors have a slow dynamic response in comparison with the time duration of an individual cylinder event. Hence, standard system identification methods break down. In [6], a model structure for the mixing dynamics and an attendant model parameter identification procedure, compatible with existing laboratory sensors, is provided. This is outlined next.
The key assumption used to develop a mathematical model of the exhaust gas mixing is that once the exhaust gas from any particular cylinder reaches the EGO sensor, the exhaust of that cylinder from the previous cycle (two revolutions) has been completely evacuated from the exhaust manifold. It is further assumed that the transport lag from the exhaust port of any cylinder to the sensor location is less than two engine cycles. With these assumptions, and with reference to the timing diagram of Figure 74.7, a model for the exhaust-mixing dynamics may be expressed as relating the A/F at the sensor over one 720-crank-angle-degree period beginning at (8k) as a linear combination of the A/Fs admitted to the exhaust manifold by cylinder 5 during the exhaust strokes occurring at times (8k+7), (8k−1), and (8k−9); by cylinder 6 at (8k+6), (8k−2), and (8k−10); by cylinder 7 at (8k+4), (8k−4), and (8k−12); and by cylinder 8 at (8k+1), (8k−7), and (8k−15). This relationship is given by
[η(8k), η(8k+1), ..., η(8k+7)]^T
    = A(k, 0) [E5(8k+7), E6(8k+6), E7(8k+4), E8(8k+1)]^T
    + A(k, 1) [E5(8k−1), E6(8k−2), E7(8k−4), E8(8k−7)]^T
    + A(k, 2) [E5(8k−9), E6(8k−10), E7(8k−12), E8(8k−15)]^T,    (74.11)

where A(k, j) is the 8 × 4 matrix whose (i, n)-th entry is the coefficient a_n(i, j), n = 5, ..., 8,
where η is the actual A/F at the production sensor location, E_n is the exhaust gas A/F from cylinder n (n = 5, 6, 7, 8), and a_n is the time-dependent fraction of the exhaust gas from cylinder n contributing to the A/F at the sensor. It follows from the key assumption that only 32 of the 96 coefficients in Equation 74.11 can be nonzero. Specifically, every triplet {a_n(k, 0), a_n(k, 1), a_n(k, 2)} has, at most, one nonzero element. This is exploited in the model parameter identification procedure. Determining Model Parameters The pure delay z^{-d} is determined by the type of injection timing used (open or closed valve) and does not vary with engine speed or load. A typical value for closed-valve injection timing is d = 8. The time constant of the EGO sensor is normally provided by the manufacturer; if not, it can be estimated by installing it directly in the exhaust runner of one of the cylinders and controlling the fuel pulse width to cause a switch from rich to lean and then lean to rich. A typical average value of these two times is τ = 70 ms. The first step towards identifying the parameters in the exhaust-mixing model is to determine which one of the parameters {a_n(k, 0), a_n(k, 1), a_n(k, 2)} is the possibly nonzero element; this can be uniquely determined on the basis of the transport delay between the opening of the exhaust valve of each cylinder and the time of arrival of the corresponding exhaust gas pulse at the EGO sensor. The measurement of this delay is accomplished by installing the fast, switching-type EGO sensor in the exhaust manifold in the production location and carefully balancing the A/F of each cylinder to the stoichiometric value. Then apply a step change in A/F to each cylinder and observe the time delay. The results of a typical test are given in [6]. The transport delays change as a function of engine speed and load and, thus, should be determined at several operating points. In addition, they may not be a multiple of 90 degrees.
A practical method to account for these issues through a slightly more sophisticated sampling schedule is outlined in [3]. At steady state, Equation 74.11 reduces to

η̄ = A_mix Ē,    (74.12)

where Ē is the vector of constant cylinder A/Fs and the (i, n)-th entry of A_mix is the summed coefficient a_n(i, 0) + a_n(i, 1) + a_n(i, 2).
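The structure of the mixing relation in Equation 74.11 can be sketched numerically. The coefficient matrices below, including their one-nonzero-entry-per-triplet sparsity, are randomly generated stand-ins, not identified values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficient matrices A[j] (8 x 4), j = 0, 1, 2: A[j][i, n] is the
# fraction of cylinder (5+n)'s exhaust, admitted j cycles ago, appearing at the
# sensor at event 8k+i.  Per the key assumption, each triplet
# {A[0][i, n], A[1][i, n], A[2][i, n]} has at most one nonzero entry.
A = np.zeros((3, 8, 4))
lag = rng.integers(0, 3, size=(8, 4))         # which cycle lag is active
for i in range(8):
    for n in range(4):
        A[lag[i, n], i, n] = rng.uniform(0.1, 1.0)
A /= A.sum(axis=(0, 2), keepdims=True)        # make each row of the summed matrix 1

def eta(A, E_cur, E_m1, E_m2):
    """Sensor A/F over one engine cycle (8 events) as a linear combination of
    the cylinder A/Fs from the current and two previous cycles (Eq. 74.11)."""
    return A[0] @ E_cur + A[1] @ E_m1 + A[2] @ E_m2
```

With every cylinder held at stoichiometry (14.64), the unit row sums imply the sensor sees 14.64 at all eight events, which is the physical sanity check behind the conditions on A_mix discussed below.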
Figure 74.7 Timing diagram for eight-cylinder engine.
This leads to the second part of the parameter identification procedure, in which the values of the summed coefficients of Equation 74.12 may be identified from "steady-state" experiments performed with a linear EGO sensor installed in the production location. Then, knowing which of the coefficients is nonzero, Equation 74.11 may be evaluated. Install a linear EGO sensor in the production location. Then the measured A/F response, y(k), to the sensor input, η(k), is modeled by

y(k + 1) = α y(k) + (1 − α) η(k),    (74.13)
Next, carefully balance each of the cylinders to the stoichiometric A/F; then offset each cylinder, successively, 1 A/F rich and then 1 A/F lean to assess the effect on the A/F at the sensor location. At each condition, let the system reach steady state, then record Y and E over three to ten engine cycles, averaging the components of each vector in order to minimize the impact of noise and cycle-to-cycle variability in the combustion process. This provides input A/F vectors Ē = [Ē_1 ... Ē_8] and output A/F vectors Ȳ = [Ȳ_1 ... Ȳ_8], where the overbar represents the averaged value. The least squares solution to Equation 74.15 is then given by

A_mix = Q_α^{-1} Ȳ Ē^T (Ē Ē^T)^{-1}.    (74.16)
where α = e^{−T/τ_L}, τ_L is the time constant of the linear EGO sensor, and T is the sampling interval, that is, the amount of time per 90 degrees of crank-angle advance. It follows [6] that the combined steady-state model of exhaust-mixing and linear EGO sensor dynamics is

Ȳ = Q_α A_mix Ē,    (74.15)

where Q_α is a matrix determined by the sensor parameter α.
The identified coefficients of A_mix should satisfy two conditions: (1) the entries in the matrix lie in the interval [0, 1]; (2) the sum of the entries in any row of the matrix is unity. These conditions correspond to no chemical processes occurring in the exhaust system (which could modify the A/F) between the exhaust valve and the EGO sensor. Inherent nonlinearities in the "linear" EGO sensor or errors in the identification of its time constant often lead to violations of these conditions. In this case, the following fix is suggested. For each row of the matrix, identify the largest negative entry and subtract it from each entry so that all are non-negative; then scale the row so that its entries sum to one. Assembling and Validating the State-Space Model A state-space model will be used for control law design. The combined dynamics of the A/F system from the fuel scheduler to the EGO sensor is shown in Figure 74.8. The time-varying coefficients in Figure 74.8 arise from constructing a state-space representation of Equation 74.11 and, thus, are directly related to the a_n(k, j). In particular, they are periodic, with period equal to one engine cycle. Figure 74.9 provides an example of these coefficients for the model identified in [6]. Assigning state variables as indicated, the state-space model can be expressed as
x(k + 1) = A(k) x(k) + B(k) u(k),  y(k) = C(k) x(k),    (74.17)

where the matrices are periodic in k with period 8. This is an 8-periodic SISO system.
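The row fix-up suggested above for an identified mixing matrix is a generic post-processing step; a sketch (this is an illustration of the described procedure, not code from [6]):

```python
import numpy as np

def fix_mixing_matrix(A_mix):
    """Post-process an identified mixing matrix so every entry is non-negative
    and each row sums to one, per the fix suggested in the text."""
    A = np.array(A_mix, dtype=float)
    for row in A:
        m = row.min()
        if m < 0.0:
            row -= m          # subtract the largest negative entry from the row
        row /= row.sum()      # rescale the row so its entries sum to one
    return A
```

Rows that already satisfy both conditions pass through unchanged; rows with small negative entries (from sensor nonlinearity or time-constant error) are shifted and renormalized.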
Figure 74.8 Periodically time-varying model of engine showing state variable assignments.

Figure 74.9 Time-dependent coefficients for Figure 74.8.
For control design, it is convenient to transform this system, via lifting [1], [5], to a linear, time-invariant multiple-input multiple-output (MIMO) system as follows. Let Z(k) = x(8k), Y(k) = [y(8k), ..., y(8k + 7)]^T, U(k) = [u(8k), ..., u(8k + 7)]^T. Then

Z(k + 1) = A Z(k) + B U(k),  Y(k) = C Z(k) + D U(k),    (74.18)

where A, B, C, and D are constant matrices computed from the 8-periodic model. Normally, D is identically zero because the time delay separating the input from the sensor is greater than one engine cycle. Since only cylinders 5 through 8 are to be controlled, the B and D matrices may be reduced by eliminating the columns that correspond to the control variables for cylinders 1 through 4. This results in a system model with four inputs and eight outputs. Additional data should be taken to validate the identified model of the A/F system. An example of the experimental and modeled response to a unit step input in A/F is shown in Figure 74.10.
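The lifting construction is mechanical. A sketch for a general p-periodic system with no within-event feedthrough (the function name and interfaces are illustrative, not from the cited references):

```python
import numpy as np

def lift(A_seq, B_seq, C_seq):
    """Lift the p-periodic system x(k+1) = A_k x(k) + B_k u(k), y(k) = C_k x(k)
    to an LTI MIMO system Z(k+1) = A Z + B U, Y = C Z + D U over one cycle."""
    p = len(A_seq)
    n = A_seq[0].shape[0]
    m = B_seq[0].shape[1]
    q = C_seq[0].shape[0]

    def trans(j, i):
        """State transition matrix from event i to event j (i <= j)."""
        M = np.eye(n)
        for t in range(i, j):
            M = A_seq[t] @ M
        return M

    A_L = trans(p, 0)
    B_L = np.hstack([trans(p, i + 1) @ B_seq[i] for i in range(p)])
    C_L = np.vstack([C_seq[j] @ trans(j, 0) for j in range(p)])
    D_L = np.zeros((p * q, p * m))
    for j in range(p):
        for i in range(j):   # u at event i affects y at later events in the cycle
            D_L[j*q:(j+1)*q, i*m:(i+1)*m] = C_seq[j] @ trans(j, i + 1) @ B_seq[i]
    return A_L, B_L, C_L, D_L
```

For the engine model, p = 8, and the columns of the lifted B and D matrices corresponding to cylinders 1 through 4 would then be deleted.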
Control Algorithm for ICAFC The first step is to check the feasibility of independently controlling the A/F in the four cylinders. This will be possible if and only if¹ the model of Equation 74.18, with all of the injector gains set to unity, has
¹Since the model is asymptotically stable, it is automatically stabilizable and detectable.
stoichiometry must be selected. One way to do this is to choose four components of Y on the basis of achieving the best numerically conditioned dc gain matrix when the other four output components are deleted. Denote the resulting reduced output by Ỹ(k). Then integrators can be added as
where ΔY_s is the error between the measured value of the reduced output and the stoichiometric setpoint. In either case, it is now very easy to design a stabilizing controller by a host of techniques presented in this handbook. For implementation purposes, the order of the resulting controller can normally be significantly lowered through the use of model reduction methods. Other issues dealing with implementation are discussed in [3], such as how to incorporate the switching aspect of the sensor into the final controller and how to properly schedule the computed control signals. Specific examples of such controllers eliminating A/F maldistribution are given in [3] and [6].
74.1.3 Idle Speed Control
Figure 74.10 Comparison of actual and modeled step response for cylinder number 6.
"full rank2 at dc" (no transmission zeros at 1). To evaluate this, compute the dc gain of the system
then compute the singular value decomposition (SVD) of G_dc. For the regulation problem to be feasible, the ratio of the largest to the fourth largest singular values should be no larger than 4 or 5. If the ratio is too large, then a redesign of the hardware is necessary before proceeding to the next step [6]. In order to achieve individual set-point control on all cylinders, the system model needs to be augmented with four integrators. This can be done on the input side by

U(k) = U(k − 1) + V(k),

where V(k) is the new control variable; or on the output side. To do the latter, the four components of Y that are to be regulated to
²Physically, this corresponds to being able to use constant injector inputs to arbitrarily adjust the A/F in the individual cylinders.
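The dc-gain feasibility test is easy to mechanize. A sketch, assuming an asymptotically stable lifted model so that I − A is invertible (the function name and the 4-or-5 threshold interpretation follow the text; the test matrices are arbitrary):

```python
import numpy as np

def dc_feasibility(A, B, C, D):
    """Compute the dc gain G_dc = C (I - A)^{-1} B + D of the lifted model and
    the ratio of the largest to the fourth-largest singular value; the text
    suggests the regulation problem is feasible when this ratio is below 4-5."""
    n = A.shape[0]
    G_dc = C @ np.linalg.solve(np.eye(n) - A, B) + D
    s = np.linalg.svd(G_dc, compute_uv=False)
    return G_dc, s[0] / s[3]
```

A well-conditioned gain matrix (ratio near 1) indicates the four cylinder A/Fs can be adjusted independently with reasonable injector authority.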
Engine idle is one of the most frequently encountered operating conditions for city driving. The quality of ISC affects almost every aspect of vehicle performance, such as fuel economy, emissions, drivability, etc. The ISC problem has been extensively studied, and a comprehensive overview of the subject can be found in [4]. The primary objective for ISC is to maintain the engine speed at a desired setpoint in the presence of various load disturbances. The key factors to be considered in its design include: Engine speed setpoint. To maximize fuel economy, the reference engine speed is scheduled at the minimum that yields acceptable combustion quality; accessory drive requirements; and noise, vibration, and harshness (NVH) properties. As the automotive industry strives to reduce fuel consumption by lowering the idle speed, the problems associated with idle quality (such as allowable speed droop and recovery transient, combustion quality and engine vibration, etc.) tend to be magnified and thus put more stringent requirements on the performance of the control system. Accessory load disturbances. Typical loads in today's automobile include air conditioning, power steering, power windows, neutral-to-drive shift, alternator loads, etc. Their characteristics and range of operation determine the complexity of the control design and achievable performance. Control authority and actuator limitations. The control variables for ISC are air flow (regulated by the throttle or a bypass valve) and spark timing. Other variables, such as A/F, also affect engine operation,
but A/F is not considered as a control variable for ISC because it is the primary handle on emissions. The air bypass valve (or throttle) and spark timing are subject to constraints imposed by the hardware itself as well as other engine control design considerations. For example, in order to give spark enough control authority to respond to the load disturbances, it is necessary to retard it from MBT to provide appreciable torque reserve. On the other hand, there is a fuel penalty associated with the retarded spark, which, in theory, can be compensated by the lower idle speed allowed by the increased control authority of spark. The optimal trade-off, however, differs from engine to engine and needs to be evaluated by taking into consideration combustion quality and the ignition hardware constraints (the physical time required for arming the coil and firing the next spark imposes a limitation on the allowable spark advance increase between two consecutive events). Available measurement. Typically, only engine speed is used for ISC feedback. MAP, or inferred MAP, is also used in some designs. Accessory load sensors (such as the air conditioning switch, neutral-to-drive shift switch, power steering pressure sensor, etc.) are installed in many vehicles to provide information on load disturbances for feedforward control. Variations in engine characteristics over the entire operating range. The ISC design has to consider different operational and environmental conditions such as temperature, altitude, etc. To meet the performance objectives for a large fleet of vehicles throughout their entire engine life, the control system has to be robust enough to incorporate changes in the plant dynamics due to aging and unit-to-unit variability. The selection of desired engine setpoint and spark retard is a sophisticated design trade-off process and is beyond the scope of this chapter.
The control problem addressed here is the speed tracking problem, which can be formally stated as: For a given desired engine speed setpoint, design a controller that, based on the measured engine speed, generates commands for the air bypass valve and spark timing to minimize engine speed variations from the setpoint in the presence of load disturbances. A schematic control system diagram is shown in Figure 74.11.
Engine Models for ISC An engine model that encompasses the most important characteristics and dynamics of engine idle operation is given in Figure 74.12. It uses the model structure developed in [7] and consists of the actuator characteristics, manifold filling dynamics, engine pumping characteristics, intake-to-power stroke delay, torque characteristics, and engine rotational dynamics (EGR is not considered at idle). The assumption of sonic flow through the throttle, generally satisfied at idle, has led to a much simplified model where the air flow across the throttle is only a function of the throttle position.

Figure 74.11 Sensor-actuator configuration for ISC.

Figure 74.12 Nonlinear engine model.

The differential equations describing the overall dynamics are given by

MAF = f_a(u),
dP_m/dt = K_m (MAF − ṁ),
ṁ = Cyl(N, P),    (74.23)
J_e dN/dt = T_q − T_L,
T_q = f_T(ṁ(t − σ), N(t), r(t − σ), S(t)),

where
u = duty cycle for the air bypass valve,
r = A/F,
S = spark timing in terms of crank-angle degrees before TDC,
T_L = load torque.
J_e and K_m in Equation 74.23 are two engine-dependent constants, where J_e represents the engine rotational inertia, and K_m is a function of the gas constant, air temperature, manifold volume, etc. Both J_e and K_m can be determined from engine design specifications and given nominal operating conditions. The time delay σ in the engine model equals approximately 180 degrees of crank-angle advance and, thus, is a speed-dependent parameter. This is one reason that models for ISC often use crank angle instead of time as the independent variable. Additionally, most engine control activities are event driven and synchronized with crank position; the use of dN/dθ and dP/dθ instead of dN/dt and dP/dt, respectively, tends to have a linearizing effect on the pumping and torque generation blocks. Performing a standard linearization procedure results in the linear model shown in Figure 74.13, with inputs Δu, ΔS, ΔT_L (the change of the bypass valve duty cycle, spark, and load torque from their nominal values, respectively) and output ΔN (the deviation of the idle speed from the setpoint). The time delay in the continuous-time feedback loop usually complicates the control design and analysis tasks. In a discrete-time representation,
Figure 74.13 Linearized model for typical eight-cylinder engine.

Figure 74.14 Spark-torque relation for different engine speeds (with A/F = 14.64).
however, the time delay in Figure 74.13 corresponds to a rational transfer function z^{-n}, where n is an integer that depends on the sampling scheme and the number of cylinders. It is generally more convenient to accomplish the controller design using a discrete-time model.
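To make the discrete-time structure concrete, a minimal event-based simulation of such a linearized model is sketched below. Every pole, gain, and the delay length are hypothetical placeholders, not values from an actual calibration:

```python
# Illustrative discrete-time (event-based) linearized idle model: a manifold
# filling lag, an n-event intake-to-power delay on the air path, and
# first-order rotational dynamics.
a_man, b_man = 0.9, 0.1      # manifold filling pole and input gain (hypothetical)
a_rot = 0.95                 # rotational-dynamics pole (hypothetical)
k_air, k_spark = 2.0, 0.5    # torque gains for delayed air charge and spark
n_delay = 4                  # intake-to-power stroke delay, in engine events

def simulate(du, ds, dTL):
    """Speed deviation dN driven by bypass-valve duty cycle du, spark ds, and
    load torque dTL (equal-length sequences of deviations from nominal)."""
    m = 0.0                   # manifold state (deviation of air flow)
    hist = [0.0] * n_delay    # pipeline modeling the z^{-n} delay on air
    dN = 0.0
    out = []
    for u_k, s_k, T_k in zip(du, ds, dTL):
        torque = k_air * hist[0] + k_spark * s_k - T_k
        dN = a_rot * dN + (1 - a_rot) * torque
        hist = hist[1:] + [m]
        m = a_man * m + b_man * u_k
        out.append(dN)
    return out
```

The sketch exhibits the features discussed in the text: spark acts on speed within one event, while air acts through the manifold lag plus the n-event delay.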
Determining Model Parameters In the engine model of Equation 74.23, the nonlinear algebraic functions f_a, f_T, and Cyl describe characteristics of the air bypass valve, torque generation, and engine pumping blocks. These functions can be obtained by regressing engine dynamometer test data, using least squares or other curve-fitting algorithms. The torque generation function is developed on the engine dynamometer by individually sweeping ignition timing, A/F, and mass flow rate (regulated by throttle or air bypass valve position) over their expected values across the idle operating speed range. For a typical eight-cylinder engine, the torque regression is a polynomial in these swept variables.
hot-wire anemometer or volume flow by measuring the pressure drop across a calibrated laminar flow element. The dynamic elements in the linearized idle speed model can be obtained by linearizing the model in Section 74.1.2, or they can be estimated by evaluating (small) step response data from a dynamometer. In particular, the intake manifold time constant can be validated by constant-speed, sonic-flow throttle step tests, using a sufficiently high-bandwidth sensor to measure manifold pressure.
ISC Controller Design The ISC problem lends itself to the application of various control design techniques. Many different design methodologies, ranging from classical (such as PID) to modern (such as LQG, H∞, adaptive, etc.) and nonconventional (such as neural networks and fuzzy logic) designs, have been discussed and implemented [4]. The mathematical models described previously and commercially available software tools can be used to design different control strategies, depending on the implementor's preference and experience. A general ISC system with both feedforward and feedback components is shown in the block diagram of Figure 74.15.
The steady-state speed-torque relation to spark advance is illustrated in Figure 74.14. For choked (i.e., sonic) flow, the bypass valve's static relationship is developed simply by exercising the actuator over its operating envelope and measuring either mass air flow using a
Figure 74.15 General ISC system with feedforward and feedback elements.
Feedforward Control Design Feedforward control is considered an effective mechanism to reject load disturbances, especially for small engines. When a disturbance is measured (most disturbance sensors used in vehicles are of the on-off type), control signals can be generated in an attempt to counteract its effect. A typical ISC strategy has feedforward only for the air bypass valve control, and the feedforward is designed based on static engine mapping data. For example, if an air conditioning switch sensor is installed, an extra amount of air will be scheduled to prevent engine speed droop when the air conditioning compressor is engaged. The amount of feedforward control can be determined as follows. At steady state, since dP_m/dt = 0, the available engine torque to balance the load torque is related to the mass air flow and engine speed through

f_T(MAF, N, r, S) = T_L.
By estimating the load torque presented to the engine by the measured disturbance, one can calculate, for fixed A/F and spark, the amount of air that is needed to maintain the engine speed at the fixed setpoint. The feedforward control can be applied either as a multiplier or an adder to the control signal. Feedforward control introduces extra cost due to the added sensor and software complexity; thus, it should be used only when necessary. Most importantly, it should not be used to replace the role of feedback in rejecting disturbances, since it does not address the problems of operating condition variations, miscalibration, etc. Feedback Design Feedback design for ISC can be pursued in many different ways. Two philosophically different approaches are used in developing the control strategy. One is the SISO approach, which treats the air and spark control as separate entities and designs one loop at a time. When the SISO approach is used, the natural separation of time scale in the air and spark dynamics (spark has a fast response compared to air flow, which has a time lag due to the manifold dynamics and intake-to-power delay) suggests that the spark control be closed first as an inner loop. Then the air control, as an outer loop, is designed by including the spark feedback as part of the plant dynamics. Another approach is to treat the ISC as a multiple-input single-output (MISO) or, when the manifold pressure is to be controlled, a MIMO problem. Many control strategies, such as LQ-optimal control and H∞, have been developed within the MIMO framework. This approach generally leads to a coordinated air and spark control strategy and improved performance. Despite the rich literature on ISC featuring different control design methodologies, PID control in combination with static feedforward design is still viewed as the control structure of choice in the automotive industry.
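The steady-state feedforward air computation described above amounts to a one-dimensional root search on the torque map at fixed A/F and spark. A sketch, where the torque map `torque` is a made-up monotone stand-in for a regressed f_T:

```python
def torque(maf, N):
    # Hypothetical steady-state torque map f_T at fixed A/F and spark;
    # monotonically increasing in mass air flow, mildly decreasing in speed.
    return 400.0 * maf - 0.002 * N

def feedforward_air(T_load, N, lo=0.0, hi=1.0):
    """Bisect for the mass air flow (lbm/s) that balances an estimated load
    torque at the idle setpoint, holding A/F and spark fixed."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if torque(mid, N) < T_load:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In practice the result would be converted to a bypass-valve duty cycle through the inverse of the valve's static characteristic f_a and applied as the feedforward term when the load sensor switches on.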
In many cases, controllers designed using advanced theory, such as H∞ and LQR, are ultimately implemented in the PID format to reduce complexity and to append engineering content to design parameters. A typical production ISC feedback strategy has a PID for the air bypass valve (or throttle) control and a simple proportional feedback for the spark control. This control configuration is dictated by the following requirements: (1) at steady state, the spark should return to its nominal value independent of the load disturbances; (2) zero steady-state error has to be attained for step disturbances.
Calibration of ISC

Control system development in the automotive industry has traditionally been an empirical process with heavy reliance on manual tuning. As engine control systems have become more complex because of increased functionality, the old-fashioned trial-and-error approach has proved inadequate to achieve optimum performance for interactively connected systems. The trends in today's automotive control system development are in favor of more model-based design and systematic calibration. Tools introduced for performing systematic in-vehicle calibration include dynamic optimization packages, which are used to search for optimal parameters based on a large amount of vehicle data, and controller fine-tuning techniques. Given the reality that most ISC strategies implemented in vehicles are of PID type, we discuss two PID tuning approaches that have proved effective in ISC calibration. The first method is based on the sensitivity functions of the engine speed with respect to the controller parameters. Let K be a generic controller parameter (possibly vector valued), and suppose that we want to minimize a performance cost function J(ΔN) [a commonly used function is J = Σ(ΔN)²] by adjusting K. Viewing ΔN as a function of K and noting that ∂J/∂K = (∂ΔN/∂K)ᵀ ∂J/∂(ΔN), we have
According to Newton's method, the ΔK that minimizes J is given by

    ΔK = −[(∂ΔN/∂K)ᵀ (∂ΔN/∂K)]⁻¹ (∂ΔN/∂K)ᵀ ΔN.    (74.24)
By measuring the sensitivity function ∂ΔN/∂K, we can use a simple gradient method or Equation 74.24 to iteratively minimize the cost function J. The controller gains for the air and spark loops can be adjusted simultaneously. The advantage of the method is that the sensitivity functions are easy to generate. For the ISC calibration, the sensitivity functions of N with respect to the PID controller parameters can be obtained by measuring the signal at the sensitivity points, as illustrated in Figure 74.16. It should be pointed out that this off-line tuning principle can be used to develop an on-line adaptive PID control scheme (referred to as the M.I.T. rule in the adaptive control literature). The sensitivity function method can also be used to investigate the robustness of the ISC system with respect to key plant parameters by evaluating ∂ΔN/∂Kp, where Kp is the plant parameter vector. The second method is the well-known Ziegler-Nichols PID tuning method. It gives a set of heuristic rules for selecting the optimal PID gains. For the ISC applications, modifications have
to be introduced to accommodate the time delay and other constraints. Generally, the Ziegler-Nichols sensitivity method is used to calibrate the PID air feedback loop after the proportional gain for the spark is fixed.

THE CONTROL HANDBOOK

Figure 74.16 Sensitivity points for calibrating the PID idle speed controller (sensitivity points for the proportional spark-loop control and for the PID air-loop control).
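In matrix form, the Newton step of Equation 74.24 is an ordinary least-squares solve on logged data: stack the measured sensitivity signals as columns of a matrix S and solve for the gain correction. The sketch below is illustrative only — the signal values are synthetic, and only the update formula itself comes from the text.

```python
import numpy as np

def newton_gain_step(dN, sens):
    """Newton/Gauss-Newton step of Equation 74.24.

    dN   : (T,) array of speed-error samples Delta N(t)
    sens : (T, p) array; column j holds the measured sensitivity
           d(Delta N)/d(K_j) at each sample.
    Returns Delta K = -(S^T S)^{-1} S^T Delta N.
    """
    S = np.asarray(sens, dtype=float)
    dN = np.asarray(dN, dtype=float)
    # lstsq solves S x = dN in the least-squares sense, i.e. x = (S^T S)^{-1} S^T dN
    x, *_ = np.linalg.lstsq(S, dN, rcond=None)
    return -x

# Illustrative use: two gains (air-loop and spark-loop), synthetic sensitivities.
t = np.linspace(0.0, 10.0, 200)
sens = np.column_stack([np.sin(t), np.cos(t)])
dN = 0.5 * np.sin(t) - 0.2 * np.cos(t)   # error exactly explained by the two sensitivities
dK = newton_gain_step(dN, sens)          # ≈ [-0.5, +0.2]
```

Applying K + ΔK drives the modeled speed error toward zero; in practice the step is repeated as new sensitivity data are logged.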
74.1.4 Acknowledgments

This work was supported in part by the National Science Foundation under contract NSF ECS-92-13551. The authors also acknowledge their many colleagues at Ford Motor Company and the University of Michigan who contributed to the work described in this chapter, with special thanks to Dr. Paul Moraal of Ford.

References

[1] Buescher, K.L., Representation, Analysis, and Design of Multirate Discrete-Time Control Systems, Master's thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, 1988.
[2] Grizzle, J.W., Cook, J.A., and Milam, W.P., Improved cylinder air charge estimation for transient air fuel ratio control, in Proc. 1994 Am. Control Conf., Baltimore, MD, June 1994, 1568-1573.
[3] Grizzle, J.W., Dobbins, K.L., and Cook, J.A., Individual cylinder air fuel ratio control with a single EGO sensor, IEEE Trans. Vehicular Technol., 40(1), 280-286, February 1991.
[4] Hrovat, D. and Powers, W.F., Modeling and Control of Automotive Power Trains, in Control and Dynamic Systems, Vol. 37, Academic Press, New York, 1990, 33-64.
[5] Khargonekar, P.P., Poolla, K., and Tannenbaum, A., Robust control of linear time-invariant plants using periodic compensation, IEEE Trans. Autom. Control, 30(11), 1088-1096, 1985.
[6] Moraal, P.E., Cook, J.A., and Grizzle, J.W., Single sensor individual cylinder air-fuel ratio control of an eight cylinder engine with exhaust gas mixing, in Proc. 1993 Am. Control Conf., San Francisco, CA, June 1993, 1761-1767.
[7] Powell, B.K. and Cook, J.A., Nonlinear low frequency phenomenological engine modeling and analysis, in Proc. 1987 Am. Control Conf., Minneapolis, MN, June 1987, 332-340.
[8] Powers, W.F., Customers and controls, IEEE Control Syst. Mag., 13(1), February 1993, 10-14.
[9] Shulman, M.A. and Hamburg, D.R., Non-ideal properties of ZrO2 and TiO2 exhaust gas oxygen sensors, SAE Tech. Paper Series, No. 800018, 1980.
[10] Taylor, C.F., The Internal Combustion Engine in Theory and Practice, Vol. 1: Thermodynamics, Fluid Flow, Performance, MIT Press, Cambridge, MA, 1980, 187.

74.2 Adaptive Automotive Speed Control

M. K. Liubakka, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
D. S. Rhode, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
J. R. Winkelman, Advanced Vehicle Technology, Ford Motor Company, Dearborn, MI
P. V. Kokotović, ECE Department, University of California, Santa Barbara, CA

©1993 IEEE. Reprinted, with permission, from IEEE Transactions on Automatic Control, Volume 38, Number 7, Pages 1011-1020, July 1993.

74.2.1 Introduction

One of the main goals for an automobile speed control (cruise control) system is to provide acceptable performance over a wide range of vehicle lines and operating conditions. Ideally, this is to be achieved with one control module, without recalibration for
different vehicle lines. For commonly used proportional feedback controllers, no single controller gain is adequate for all vehicles and all operating conditions. Such simple controllers no longer have the level of performance expected by customers. The complexity of speed control algorithms has increased through the years to meet the more stringent performance requirements. The earliest systems simply held the throttle in a fixed position [1]. In the late 1950s speed control systems with feedback appeared [2]. These used proportional (P) feedback of the speed error, with the gain typically chosen so that 6 to 10 mph of error would pull full throttle. The next enhancement was proportional control with an integral preset or bias input (PI) [3]. This helped to minimize steady-state error as well as speed droop when the system was initialized. Only with the recent availability of inexpensive microprocessors have more sophisticated control strategies been implemented. Proportional-integral-derivative (PID) controllers, optimal LQ regulators, Kalman filters, fuzzy logic, and adaptive algorithms have all been tried [4]-[10]. Still, it is hard to beat the performance of a well-tuned PI controller for speed control. The problem is how to keep the PI controller well tuned, since both the system and operating conditions vary greatly. The optimal speed control gains depend on:

- Vehicle parameters (engine, transmission, weight, etc.)
- Vehicle speed
- Torque disturbances (road slope, wind, etc.)

Gain scheduling over vehicle speed is not a viable option because the vehicle parameters are not constant and torque disturbances are not measurable. Much testing and calibration work has been done to tune PI gains for a controller that works across more than one car line, but as new vehicles are added, retuning is often necessary. For example, with a PI speed control, low-power cars generally need higher gains than high-power cars.
This suggests a need for adaptation to vehicle parameters. For an individual car, the best performance on flat roads is achieved with low integral gain, while rolling hill terrain requires high integral gain. This suggests a need for adaptation to disturbances. Our goal was to build an adaptive controller that outperforms its fixed-gain competitors, yet retains their simplicity and robustness. This goal has been achieved with a slow-adaptation design using a sensitivity-based gradient algorithm. This algorithm, driven by the vehicle response to unmeasured load torque disturbances, adjusts the proportional and integral gains, Kp and Ki, respectively, to minimize a quadratic cost functional. Through simulations and experiments a single cost functional was found that, when minimized, resulted in satisfactory speed control performance for each vehicle and all operating conditions. Adaptive minimization of this cost functional improved the performance of every tested vehicle over varying road terrain (flat, rolling hills, steep grades, etc.). This is not possible with a fixed-gain controller. Our optimization-type adaptive design has several advantages. The slow adaptation of only two adjustable parameters is simple
and makes use of knowledge already acquired about the vehicle. The main requirement for slow adaptation is the existence of a fixed-gain controller that provides the desired performance when properly tuned. Since the PI control meets this requirement and is well understood, the design and implementation of the adaptive control with good robustness properties become fairly easy tasks. With only two adjustable parameters, all but perfectly flat roads provide sufficient excitation for parameter convergence and local robustness. These properties are strengthened by the sensitivity filter design and speed-dependent initialization. The main idea of the adaptive algorithm employed here comes from a sensitivity approach proposed in the 1960s [11] but soon abandoned because of its instabilities in fast adaptation. Under ideal model-matching conditions, such instabilities do not occur in the more complex schemes developed in the 1970s and 1980s. However, ideal model-matching requires twice as many adjustable parameters as the dynamic order of the plant. If the design is based on a reduced-order model, the resulting unmodeled dynamics may cause instability, and robust redesigns are required. This difficulty motivated our renewed interest in a sensitivity-based approach in which both the controller structure and the adjustable parameters are free to be chosen independently of the plant order. Such an approach would be suitable for adaptive tuning of simple controllers to higher-order plants if a verifiable condition for its stability could be found. For this purpose we employ the "pseudogradient condition," recently derived by Rhode [12], [13] using the averaging results of [14]-[16]. A brief outline of this derivation is given in the appendix.
From the known bounds on vehicle parameters and torque disturbances, we evaluate, in the frequency domain, a "phase-uncertainty envelope." Then we design a sensitivity filter to guarantee that the pseudogradient condition is satisfied at all points encompassed by the envelope.
74.2.2 Design Objectives
The automobile speed control problem is simpler than many other automotive control problems: engine efficiency and emissions, active suspension, and four-wheel steering, to name only a few. It is, therefore, required that the solution to the speed control problem be simple. However, this simple solution must also satisfy a set of challenging performance and functional requirements. A typical list of these is as follows:

Performance requirements
- Speed tracking ability for low-frequency commands.
- Torque disturbance attenuation for low frequencies, with zero steady-state error for large grades (within the capabilities of the vehicle power train).
- Smooth and minimal throttle movement.
- Robustness of the above properties over a wide range of operating conditions.
Functional requirements

- Universality: the same control module must meet the performance requirements for different vehicle lines without recalibration.
- Simplicity: design concepts and diagrams should be understandable to automotive engineers with a basic control background.

The dynamics that are relevant for this design problem are organized in the form of a generic vehicle model in Figure 74.17.

Figure 74.17 Vehicle model for speed control (blocks: engine with locked-up torque converter or clutch, transmission, and vehicle dynamics; inputs: throttle and road disturbance). Different vehicle lines are represented by different structures and parameters of individual blocks.

To represent vehicles of different lines (Escort, Mustang, Taurus, etc.), individual blocks will contain different parameters and, possibly, slightly different structures (e.g., manual or automatic transmission). Although the first-order vehicle dynamics are dominant, there are other blocks with significant higher-frequency dynamics. Nonlinearities such as dead zone, saturation, multiple gear ratios, and backlash are also present. In conjunction with large variations of static gain (e.g., low- or high-power engine), these nonlinearities may cause limit cycles, which should be suppressed, especially if noticeable to the driver. There are two inputs: the speed set-point y_set and the road load disturbance torque T_dis. While the response to set-point changes should be as specified, the most important performance requirement is the accommodation of the torque disturbance. The steady-state error caused by a constant torque disturbance (e.g., a constant-slope road) must be zero. For other types of roads (e.g., rolling hills) a low-frequency specification of disturbance accommodation is defined. This illustrative review of design objectives, which is by no means complete, suffices to motivate the speed control design presented in Sections 74.2.3 and 74.2.4. Typical results with test vehicles presented in Section 74.3 show how the above requirements have been satisfied.

74.2.3 The Design Concept
The choice of a design concept for mass production differs substantially from an academic study of competing theories, in this case, numerous ways to design an adaptive scheme. With physical constraints, necessary safety nets, and diagnostics, the implementation of an analytically conceived algorithm may appear similar to an "expert," "fuzzy," or "intelligent" system. Innovative terminologies respond to personal tastes and market pressures, but the origins of most successful control designs are often traced to some fundamental concepts. The most enduring among these are PI control and gradient-type algorithms. Recent theoretical results on conditions for stability of such algorithms reduce the ad hoc fixes required to assure reliable performance. They are an excellent starting point for many practical adaptive designs and can be expanded by additional nonlinear compensators and least-squares modifications of the algorithm. For our design, a PI controller is suggested by the zero steady-state error requirement, as well as by earlier speed control designs. In the final design, a simple nonlinear compensator was added, but it is not discussed in this text. The decision to adaptively tune the PI controller gains Kp and Ki was reached after it was confirmed that a controller with gain scheduling based on speed cannot satisfy the performance requirements for all vehicle lines under all road load conditions. Adaptive control is chosen to eliminate the need for costly recalibration and to satisfy the universal functionality requirement. The remaining choice was that of a parameter adaptation algorithm. Based on the data about the vehicle lines and the fact that the torque disturbance is not available for measurement, the choice was made of an optimization-based algorithm. A reference-model approach was not followed because no single model can specify the desired performance for the wide range of dynamics and disturbances.
On the other hand, through simulation studies and experience with earlier designs, a single quadratic cost functional was constructed whose minimization led to acceptable performance for each vehicle and each operating condition. For a given vehicle subjected to a given torque disturbance, the minimization of the cost functional generates an optimal pair of the PI controller gains Kp and Ki. In this sense, the choice of a single cost functional represents an implicit map from the set of vehicles and operating conditions to an admissible region in the parameter plane (Kp, Ki). This region was chosen to be a rectangle with preassigned bounds. The task of parameter adaptation was to minimize the selected quadratic cost functional for each unknown vehicle and each unmeasured disturbance. A possible candidate was an indirect adaptive scheme with an estimator of the unknown vehicle and disturbance model parameters and an on-line LQ optimization algorithm. In this particular system, the frequency content in the disturbance was significantly faster than the plant dynamics. This resulted in difficulties in estimating the disturbance. After some experimentation, this scheme was abandoned in favor of a simpler sensitivity-based scheme, which more directly led to adaptive minimization of the cost functional and made better use of the knowledge acquired during its construction.
Figure 74.18 The system and its copy generate sensitivity functions for optimization of the PI controller parameters Kp and Ki.
The sensitivity-based approach to parameter optimization exploits the remarkable sensitivity property of linear systems: the sensitivity function (i.e., partial derivative) of any signal in the system with respect to any constant system parameter can be obtained from a particular cascade connection of the system and its copy. For a linearized version of the vehicle model in Figure 74.17, the sensitivities of the vehicle speed error e = y − y_set with respect to the PI controller parameters Kp and Ki are obtained as in Figure 74.18, where G0(s) and G1(s) represent the vehicle and power train dynamics, respectively; C(s, K) = −Kp − Ki/s; and the control variable u is the throttle position. This result can be derived by differentiation of the closed-loop expression for e(s, K) with respect to Kp and Ki (Equations 74.25 and 74.26), where

    ∂C/∂K = (−1, −1/s).

Expressions analogous to Equation 74.26 can be obtained for the control sensitivities ∂u/∂Kp and ∂u/∂Ki. Our cost functional also uses a high-pass filter F(s) to penalize higher frequencies in u; that is, ū(s, K) = F(s)u(s, K). The sensitivities of ū are obtained simply as ∂ū/∂K = F(s) ∂u/∂K. When the sensitivity functions are available, a continuous gradient algorithm for the PI controller parameters is

    dK/dt = −ε (β1 ū ∂ū/∂K + β2 e ∂e/∂K),    (74.27)

where the adaptation speed determined by ε must be kept sufficiently small so that the averaging assumption (Kp and Ki are constant) is approximately satisfied. With ε small, the method of averaging [14]-[16] is applicable to Equation 74.27 and proves that, as t → ∞, the parameters Kp and Ki converge to an ε neighborhood of the values that minimize the quadratic cost functional

    J = ∫₀^∞ (β1 ū² + β2 e²) dt.    (74.28)

With a suitable choice of the weighting coefficients β1 and β2 (to be discussed later), our cost functional is Equation 74.28. Thus, Equation 74.27 is a convergent algorithm that can be used to minimize this functional when the system is known, so that its copy can be employed to generate the sensitivities needed in Equation 74.27. In fact, our computational procedure for finding a cost functional good for all vehicle lines and operating conditions made use of this algorithm. Unfortunately, when the vehicle parameters are unknown, the exact-gradient algorithm of Equation 74.27 cannot be used because a copy of the system is not available. In other words, an algorithm employing exact sensitivities is not suitable for adaptive control. A practical escape from this difficulty is to generate approximations of the sensitivity functions,

    φ1 ≈ ∂ū/∂Kp,  φ2 ≈ ∂ū/∂Ki,  φ3 ≈ ∂e/∂Kp,  φ4 ≈ ∂e/∂Ki,    (74.29)

and to employ them in a "pseudogradient" algorithm with respect to K = [Kp, Ki], namely,

    dKp/dt = −ε (β1 ū φ1 + β2 e φ3),  dKi/dt = −ε (β1 ū φ2 + β2 e φ4).    (74.30)

A filter used to generate φ1, φ2, φ3, and φ4 is called a pseudosensitivity filter. The fundamental problem in the design of the pseudosensitivity filter is to guarantee not only that the algorithm of Equation 74.30 converges, but also that the values to which it converges are close to those that minimize the chosen cost functional.
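As a rough numerical illustration of the pseudogradient update of Equation 74.30, the sketch below adapts PI gains on a first-order stand-in for the vehicle dynamics, with first-order pseudosensitivity filters and β1 = 0 (speed-error penalty only); because the text notes that this extreme tunes toward high gain, the gains are projected onto a bounded rectangle. All plant values, filter time constants, gain limits, and the disturbance profile are invented for illustration, not taken from the chapter.

```python
import numpy as np

def simulate(T=200.0, dt=0.01, eps=0.01):
    tau, tau_f = 2.0, 2.0          # plant and pseudosensitivity-filter time constants (invented)
    Kp, Ki = 0.2, 0.05             # deliberately poor initial PI gains
    y = zi = phi_p = phi_i = 0.0
    y_set = 1.0
    errs = []
    for k in range(int(T / dt)):
        d = 0.5 if k * dt > 100.0 else 0.0   # road-load style step disturbance
        e = y_set - y
        zi += e * dt                          # integral of the speed error
        u = Kp * e + Ki * zi                  # PI control (throttle stand-in)
        y += dt * (-y + u + d) / tau          # first-order "vehicle" dynamics
        # first-order pseudosensitivity filters: phi_p ~ de/dKp, phi_i ~ de/dKi
        # (sign convention folded into the update below)
        phi_p += dt * (e - phi_p) / tau_f
        phi_i += dt * (zi - phi_i) / tau_f
        # pseudogradient update, Equation 74.30 with beta1 = 0, plus projection
        Kp = min(max(Kp + eps * e * phi_p * dt, 0.05), 5.0)
        Ki = min(max(Ki + eps * e * phi_i * dt, 0.01), 2.0)
        errs.append(e)
    return np.array(errs), Kp, Ki

errs, Kp, Ki = simulate()
n = len(errs)
early = np.mean(np.abs(errs[: n // 10]))     # first 20 s (start-up transient, poor gains)
late = np.mean(np.abs(errs[-n // 10:]))      # last 20 s (after slow adaptation)
```

With ε small, the gains drift slowly toward values that reduce the average squared error, so the late-window error is much smaller than the early-window error.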
74.2.4 Adaptive Controller Implementation

The adaptive speed control algorithm presented here is fairly simple and easy to implement, but care must be taken when choosing its free parameters and designing the pseudosensitivity filters. This section discusses the procedure used to achieve a robust system and to provide the desired speed control performance.
Pseudosensitivity Filter

While testing the adaptive algorithm, it became obvious that the gains Kp and Ki and the vehicle parameters vary greatly across operating conditions and vehicles. This makes it impossible to implement the exact sensitivity filters for the gradient algorithm of Equation 74.27. Our approach is to generate a "pseudogradient" approximation of ∂J/∂K satisfying the stability and convergence conditions summarized in the appendix. In the appendix, the two main requirements for stability and convergence are a persistently exciting (PE) input condition and
a "pseudogradient condition," which, in our case, is a phase condition on the nominal sensitivity filters. Since we are using a reduced-order controller with only two adjustable parameters, the PE condition is easily met by the changing road loads. Road disturbances have an approximate frequency spectrum centered about zero that drops off with the square of frequency. This meets the PE condition for adapting the two gains Kp and Ki. To satisfy the pseudogradient condition, the phase of the pseudosensitivity filter must be within ±90° of the phase of the actual sensitivity at the dominant frequencies. To help guarantee this for a wide range of vehicles and operating conditions, we varied the system parameters in the detailed vehicle model to generate an envelope of possible exact sensitivities. Then the pseudosensitivity filter was chosen near the center of this envelope. An important element of the phase condition is that it is a condition summed over frequencies; that is, the phase condition is most important in the range of frequencies where there are dominant dynamics. If the pseudosensitivity filters do not meet the phase conditions at frequencies where there is little dominant spectral content, the algorithm may still be convergent, provided the phase conditions are strongly met in the region of dominant dynamics. Thus, the algorithm possesses a robustness property. Figure 74.19 shows the gain and phase plots for the pseudosensitivity filter ∂y/∂Kp. The other three sensitivities are left out for brevity. Figure 74.20 shows the ±90° phase boundary (solid lines) along with exact sensitivity phase angles (dashed lines) as vehicle inertia, engine power, and the speed control gains are varied over their full range. From this plot it is seen that the chosen pseudosensitivity filter meets the pseudogradient condition with some safety margin to accommodate unmodeled dynamics.
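The ±90° check described above can be mechanized: sweep the plant parameters over their assumed range, compute the exact sensitivity phase at each frequency, and verify that a candidate filter stays within 90° of every member of the envelope. The toy version below uses first-order lags as stand-ins for the exact sensitivities; the parameter ranges and filter choice are invented, not the handbook's vehicle data.

```python
import numpy as np

w = np.logspace(-2, 1, 200)                    # frequency grid, rad/sec

def phase_deg(tau, w):
    """Phase (degrees) of a first-order lag 1/(tau*s + 1) at frequencies w."""
    return np.degrees(np.angle(1.0 / (1j * w * tau + 1.0)))

taus = np.linspace(1.0, 8.0, 15)               # envelope of possible "vehicle" lags
tau_filter = np.sqrt(taus.min() * taus.max())  # candidate filter near the envelope center

# worst-case phase mismatch between the candidate filter and any envelope member
margins = [np.max(np.abs(phase_deg(tau_filter, w) - phase_deg(tau, w))) for tau in taus]
worst = max(margins)
ok = worst < 90.0        # pseudogradient phase condition holds with margin
```

Placing the filter at the geometric center of the lag range keeps the worst-case mismatch well inside the ±90° boundary, mirroring the safety margin visible in Figure 74.20.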
Figure 74.19 Gain and phase plot of the pseudosensitivity ∂y/∂Kp.
Choice of ε, β1, and β2

For implementation, the first parameters that must be chosen are the adaptation gain ε and the weightings β1 and β2 in the cost functional. The adaptation gain ε determines the speed of adaptation and should be chosen based on the slowest dynamics of the system. For speed control, the dominant dynamics result from the vehicle inertia and have a time constant on the order of 30 to 50 seconds. To avoid interaction between the adaptive
controller states and those of the plant, the adaptation should be approximately an order of magnitude slower than the plant. As shown in [14]-[16], this allows one to use the frozen-parameter system and averaging to analyze the stability of the adaptive system. The adaptation law takes up to several minutes to converge, depending on initial conditions and the road load disturbances. The two extreme choices of the weights are (1) β1 = 0, β2 = k, and (2) β1 = k, β2 = 0, where k > 0. For the first extreme, β1 = 0, we are penalizing only speed error, and the adaptation will tune to an unacceptably high-gain controller. High gain will cause too much throttle movement, resulting in nonsmooth behavior as felt by the passengers, and the system will be less robust from a stability point of view. For the second case, β2 = 0, the adaptation will try to keep a fixed throttle angle and will allow large speed errors. Obviously, some middle values for the weightings are desired. An increase of the ratio β1/β2 reduces unnecessary throttle movement, while to improve tracking and transient speed errors we need to decrease this ratio.

Figure 74.20 Envelope of possible exact sensitivity phase angles (frequency axis in rad/sec). The solid curves mark the ±90° boundaries, and the dashed curves denote the limits of plant variation.
Figure 74.21 Gain trajectories on flat ground at 30 mph.
The choice of the weightings was based on experience with tuning standard PI speed controllers. Much subjective testing has been performed on vehicles to obtain the best fixed gains for the PI controller when the vehicle parameters are known. With this information, simulations were run on a detailed vehicle model with a small ε, and the βs were varied until the adaptation converged to approximately the gains obtained from subjective testing. The cost functional weights β1 and β2 are a different parameterization of the controller tuning problem. For development engineers, who may not be control system engineers,
Figure 74.22 Gain trajectories on low-frequency hills at 30 mph.
β1 and β2 represent a pair of tunable parameters that relate directly to customer needs. This allows for a broader range of engineering inputs into the tuning process. As examples of adaptive controller performance, simulation results in Figures 74.21 and 74.22 show the trajectories of the gains for a vehicle on two different road terrains. Much can be learned about the behavior of the adaptive algorithm from these types of simulations. In general, Kp varies proportionally with vehicle speed and Ki varies with road load. The integral gain Ki tends toward low values for small disturbances or for disturbances too fast for the vehicle to respond to. This can be seen in Figure 74.21. Conversely, Ki tends toward high values for large or slowly varying road disturbances, as can be seen in Figure 74.22.
Modifications for Improved Robustness

Additional steps have been taken to ensure robust performance in the automotive environment. First, the adaptation is turned off if the vehicle is operating in regions where the modeling assumptions are violated. These include operation at closed throttle or near wide-open throttle, during start-up transients, and when the set speed is changing. When the adaptation is turned off, the gains are frozen, but the sensitivities are still computed. Care has been taken to avoid parameter drift due to noise and modeling imperfections. Two common ways to reduce drift are projection and a dead band on the error. Projection limits the range over which the gains may adjust
measured. To minimize these unwanted gain changes, the rate at which the gains can adapt is limited. If the adaptation adjusts more quickly than a predetermined rate, the adaptation gain ε is lowered, limiting the rate of change of the gains Kp and Ki. The final parameters to choose are the initial guesses for Kp and Ki. Since the adaptation is fairly slow, a poor choice of initial gains can cause poor start-up performance. For quick convergence, the initial controller gains are scheduled with vehicle speed at the points A, B, and C. Figure 74.23 shows these initial gains along with the range of possible tuning. The proportional relationship between vehicle speed and the optimal controller gains can be seen.
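The safeguards described above — freezing adaptation outside the valid operating region, rate limiting the gain updates, projecting the gains onto a preassigned rectangle, and speed-scheduling the initial gains — can be sketched as follows. The numeric limits and the linear gain schedule are invented placeholders for illustration, not the production calibration.

```python
import numpy as np

# Preassigned admissible rectangle and per-step rate limit (illustrative values)
KP_MIN, KP_MAX = 0.05, 5.0
KI_MIN, KI_MAX = 0.01, 2.0
MAX_RATE = 0.01            # maximum |gain change| per update step

def initial_gains(speed_mph):
    """Speed-scheduled initial guesses (toy linear schedule)."""
    kp = np.interp(speed_mph, [20.0, 70.0], [0.3, 1.2])
    ki = np.interp(speed_mph, [20.0, 70.0], [0.05, 0.3])
    return kp, ki

def safeguarded_update(kp, ki, dkp, dki, adaptation_enabled):
    """One pseudogradient step with freeze, rate limiting, and projection."""
    if not adaptation_enabled:     # e.g. closed/wide-open throttle, set-speed change
        return kp, ki              # gains frozen; sensitivities keep running elsewhere
    dkp = np.clip(dkp, -MAX_RATE, MAX_RATE)    # rate limiting
    dki = np.clip(dki, -MAX_RATE, MAX_RATE)
    kp = np.clip(kp + dkp, KP_MIN, KP_MAX)     # projection onto the rectangle
    ki = np.clip(ki + dki, KI_MIN, KI_MAX)
    return kp, ki

kp, ki = initial_gains(45.0)                          # -> (0.75, 0.175)
kp, ki = safeguarded_update(kp, ki, 0.5, -0.5, True)  # large steps get clipped to ±0.01
```

Keeping the projection and rate limit outside the gradient computation keeps the core update law untouched, which makes the safeguards easy to validate independently.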
74.3 Performance in Test Vehicles

The adaptive algorithm discussed in this chapter has been tried in a number of vehicles with excellent results. The adaptive control has worked well in every car tested so far and makes significant performance improvements in vehicles that have poor performance with the fixed-gain PI control. For vehicles where speed control already performs well, improvements from the adaptive algorithm are still significant at low speeds or at small throttle angles. Many vehicles with conventional speed control systems exhibit a limit cycle or surge condition at low speeds or on down slopes [7]. This is due in part to nonlinearities such as the idle stop, which limits throttle movement. In such vehicles, the adaptive controller reduces the magnitude and lengthens the period of the limit cycle, thus improving performance. The first set of data, Figures 74.24 to 74.26, is for a vehicle that had a low-speed surge. Here the limit cycle is very noticeable in the data as well as while driving the car. The high frequency (0.2 Hz) of the limit cycle is what makes it noticeable to the driver. The adaptive controller greatly improves performance by decreasing the amplitude and frequency to the point where the driver cannot feel the limit cycle. Looking at the control gains during this test, it is seen that the gains initially decrease to reduce the limit cycle,
Figure 74.26 Controller gains.

Figure 74.29 Test car C, 40 mph.
then the proportional gain tunes up and the integral gain tunes down. The integral gain decreases to reduce limit cycle behavior since the road disturbance is too fast for the vehicle to track. From the tests run in several vehicles it seems that, compared with other controllers, the adaptive algorithm is providing the best control possible as vehicle parameters change from car to car and as the road profile changes.

Figure 74.27 Test car A, 40 mph.

Figure 74.28 Test car B, 40 mph.

74.3.1 Conclusions

This chapter has presented an adaptive algorithm to adjust the gains of a vehicle speed control system. By continuously adjusting the PI control gains, speed control performance can be optimized for each vehicle and operating condition. This helps to design a single speed control module without additional calibration or sacrifices in performance for certain car lines. It also allows improved performance for changing road conditions not possible with a fixed-gain control or other types of adaptive control. The results of initial vehicle testing confirm the performance improvements and robustness of the adaptive controller. Vehicle speed control is not the only automotive application of adaptive control at Ford Motor Company. The adaptive control technique presented in this chapter has also been applied to solve other problems of electronically controlled automotive systems.

Appendix

The pseudogradient adaptive approach used in this application relies upon the properties of slow adaptation. Such gradient tuning algorithms can be described by

    dK/dt = −ε Ψ(t, K) e(t, K),  K ∈ ℝ^m,    (74.31)

where K is a vector of tunable parameters, e(t, K) is an error signal, and Ψ(t, K) is the regressor, which is an approximation of the sensitivity of the output with respect to the parameters, ∂y/∂K. In this derivation, e(t, K) is defined as the difference between the output y(t, K) and a desired response ym(t), so that the pseudogradient update law of Equation 74.31 aims to minimize the average of e(t, K)². However, the same procedure is readily modified to minimize a weighted sum of e(t, K)² and u(t, K)², as in this speed control application. Since slow adaptation is used only for performance improvement, and not for stabilization, we constrain the vector K of adjustable controller parameters, k1, ..., km, to remain in a set 𝒦 such that, with constant K (ε = 0), the resulting linear system is stable. Our main tool in this analysis is an integral manifold, the so-called slow manifold [14], that separates the fast linear plant and controller states from the slow parameter dynamics. To assure the existence of this manifold, we assume that Ψ(t, K) and e(t, K) are differentiable with respect to K and that the input to the system, r(t), is a uniformly bounded, almost periodic function of time. Then, applying the averaging theorem of Bogolubov, the stability of the adaptive algorithm is determined from the response of the linear system with constant parameters and the average update law. The following analysis is restricted to initial conditions near this slow manifold.
THE CONTROL HANDBOOK
Figure 74.30 Vehicle with high-frequency road disturbance: top, with adaptive control; bottom, with fixed-gain control.
Figure 74.31 Vehicle with high-frequency road disturbance: top, with adaptive control; bottom, with fixed-gain control.
74.3. PERFORMANCE IN TEST VEHICLES
Figure 74.32 Gains with high-frequency disturbances.
Figure 74.33 Approximate freeway profile.
In the pseudogradient approach used here, the regressor Ψ(t, K) is an approximation of the sensitivities of the error with respect to the parameters. We assume that the adjustable controller parameters K feed a common summing junction, as shown in Figure 74.34. In this case, if the system in Figure 74.34 were known, these sensitivities would be obtained as in Figure 74.35, where HΣ(s, K) is the scalar transfer function from the summing junction input to the system output, as shown in Figure 74.36. However, this transfer function is unknown, and we approximate it by the sensitivity filter HΨ(s). The pseudogradient stability condition to be derived here specifies a bound on the allowable mismatch between HΣ(s, K*) and HΨ(s), where K* is such that
Figure 74.35 Sensitivity system.
Figure 74.36 HΣ(s, K): from summing junction input to system output.
As shown in [14] and [15], for slow adaptation this term is small and does not affect the stability of Equation 74.37 in a neighborhood of the equilibrium K*. The stability of this equilibrium, in the case when e*(t) = 0, is established as follows.
LEMMA 74.1 [15] Consider the average update system (Equation 74.37) with A(t) = 0, e(t, K*) = 0, and K = K*. If
Figure 74.34 Adjustable linear system.
With reduced-order controllers, it is unrealistic to assume that the error e(t, K) can be eliminated entirely for any value of controller parameters and plant variations. The remaining nonzero error is called the tuned error, e*(t) = e(t, K*). The error e(t, K) consists of e*(t) and a term caused by the parameter error K̃ = K − K*. Using a mixed (t, s) notation, with signals as functions of time and transfer functions denoted as Laplace transforms, the error e(t, K) and the regressor Ψ(t, K) are both obtained from the measured signals W(t, K) as follows:
Ψ(t, K) = HΨ(s)[W(t, K)].
where we have again used HΣ(s, K*) in the definition of the signal. The average swapping term A in Equation 74.37 is
To interpret the condition of Equation 74.40, we represent the regressor as
so that the matrix in Equation 74.40 can be visualized as in Figure 74.38. Representing the almost periodic signal V(t, K) as
(74.36)
The error expression illustrated in Figure 74.37 contains the exact sensitivity filter transfer function HΣ(s, K*). The stability properties of this slowly adapting system are determined by examination of the average parameter update equation
V(t, K) ≡ HΣ(s, K*)[W(t, K)].
then there exists an ε* > 0 such that, for all ε ∈ (0, ε*], the equilibrium K̃ = 0 is uniformly asymptotically stable.
(74.38)
a sufficient condition for Equation 74.40 to hold is
If the signal V(t) possesses an autocovariance RV(τ), it is PE if and only if RV(0) > 0 [17]. It follows that V(t, K) is PE if and only if Σi Re[vi vi*ᵀ] > 0. Clearly, if the sensitivity filter is exact, HΨ(jω) = HΣ(jω, K*), and V(t, K) is PE, then the sufficient stability condition in Equation 74.43 holds. When HΨ(s) cannot be made exact, this stability condition is still satisfied when the pseudosensitivity filter is chosen such that Re[HΨ(jω)HΣ*(jω, K*)] > 0 for dominant frequencies, that
[7] Uriuhara, M., Hattori, T., and Morida, S., Develop-
ment of Automatic Cruising Using Fuzzy Control System, J. SAE Jpn., 42(2), 224-229, 1988.
[8] Abate, M. and Dosio, N., Use of Fuzzy Logic for Engine Idle Speed Control, SAE Tech. Paper #900594, 1990.
[9] Tsujii, T., Takeuchi, H., Oda, K., and Ohba, M., Ap-
Figure 74.37 Error model.
Figure 74.38 Feedback matrix.
is, the frequencies where vi is large. This condition serves as a guide for designing the pseudosensitivity filter HΨ(s). As HΣ(s, K*) is unknown beforehand, we use the a priori information about the closed-loop system to design a filter HΨ(s) such that, in the dominant frequency range, the following pseudogradient condition is satisfied for all plant and controller parameters of interest:
When Equation 74.44 is satisfied, the equilibrium K* of the average update law of Equation 74.37 with A = 0, e*(t) = 0 is uniformly asymptotically stable. By Bogolubov's theorem and slow manifold analysis [14], this implies the local stability property of the actual adaptive system, provided e*(t) ≠ 0 is sufficiently small. Since, in this approach, both the controller structure and the number of adjustable parameters are free to be chosen independently of plant order, there is no guarantee that e*(t) will be sufficiently small. Although conservative bounds for e*(t) may be calculated [12], in practice e*(t) is typically small, since the design objective of the controller and pseudogradient adaptive law is to minimize the average of e(t, K)².
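The persistent-excitation test through the autocovariance at lag zero can be checked numerically. The two-channel signals below are hypothetical examples of our own, one spectrally rich and one with linearly dependent channels:

```python
import numpy as np

# Sketch of the PE test via the autocovariance at lag zero: an almost
# periodic signal V(t) is PE iff R_V(0) > 0 (positive definite).
def autocov0(V):
    # time average of V(t) V(t)^T over the samples (zero-mean signals)
    return (V @ V.T) / V.shape[1]

t = np.linspace(0.0, 200.0, 20001)
rich = np.vstack([np.sin(t), np.cos(2.0 * t)])   # two distinct frequencies
poor = np.vstack([np.sin(t), 2.0 * np.sin(t)])   # linearly dependent rows

eig_rich = np.linalg.eigvalsh(autocov0(rich))    # bounded away from zero
eig_poor = np.linalg.eigvalsh(autocov0(poor))    # smallest eigenvalue ~ 0
```

The rich signal yields a positive definite R_V(0), hence PE; the dependent one yields a singular R_V(0), so adaptation along the deficient direction is not guaranteed to converge.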
References
[1] Ball, J.T., Approaches and Trends in Automatic Speed Controls, SAE Tech. Paper #670195, 1967.
[2] Follmer, W.C., Electronic Speed Control, SAE Tech. Paper #740022, 1974.
[3] Sobolak, S.J., Simulation of the Ford Vehicle Speed Control System, SAE Tech. Paper #820777, 1982.
[4] Nakamura, K., Ochiai, T., and Tanigawa, K., Application of microprocessor to cruise control system, Proc. IEEE Workshop Automot. Appl. Microprocessors, 37-44, 1982.
[5] Chaudhure, B., Schwabel, R.J., and Voelkle, L.H., Speed Control Integrated into the Powertrain Computer, SAE Tech. Paper #860480, 1986.
[6] Tabe, T., Takeuchi, H., Tsujii, M., and Ohba, M., Vehicle speed control system using modern control theory, Proc. 1986 Int. Conf. Industrial Electron., Control Instrum., 1, 365-370, 1986.
plication of self-tuning to automotive cruise control, Proc. Am. Control Conf., 1843-1848, 1990.
[10] Hong, G. and Collings, N., Application of Self-Tuning Control, SAE Tech. Paper #900593, 1990.
[11] Kokotovic, P.V., Method of sensitivity points in the investigation and optimization of linear control systems, Automation Remote Control, 25, 1670-1676, 1964.
[12] Rhode, D.S., Sensitivity Methods and Slow Adaptation, Ph.D. thesis, University of Illinois at Urbana-Champaign, 1990.
[13] Rhode, D.S. and Kokotovic, P.V., Parameter convergence conditions independent of plant order, Proc. Am. Control Conf., 981-986, 1990.
[14] Riedle, B.D. and Kokotovic, P.V., Integral manifolds of slow adaptation, IEEE Trans. Autom. Control, 31, 316-323, 1986.
[15] Kokotovic, P.V., Riedle, B.D., and Praly, L., On a stability criterion for continuous slow adaptation, Syst. Control Lett., 6, 7-14, 1985.
[16] Anderson, B.D.O., Bitmead, R.R., Johnson, C.R., Jr., Kokotovic, P.V., Kosut, R.L., Mareels, I., Praly, L., and Riedle, B.D., Stability of Adaptive Systems: Passivity and Averaging Analysis, MIT Press, Cambridge, MA, 1986.
[17] Boyd, S. and Sastry, S.S., Necessary and sufficient conditions for parameter convergence in adaptive control, Automatica, 22(6), 629-639, 1986.
Aerospace Controls
75.1 Flight Control of Piloted Aircraft .................................... 1287
Introduction * Flight Mechanics * Nonlinear Dynamics * Actuators * Flight Control Requirements * Dynamic Analysis * Conventional Flight Control * Time-Scale Separation * Actuator Saturation Mitigation in Tracking Control * Nonlinear Inner Loop Design * Flight Control of Piloted Aircraft
References .................................................................... 1302
Further Reading .............................................................. 1303
75.2 Spacecraft Attitude Control ........................................... 1303
Introduction * Modeling * Spacecraft Attitude Sensors and Control Actuators * Spacecraft Rotational Kinematics * Spacecraft Rotational Dynamics * Linearized Spacecraft Rotational Equations * Control Specifications and Objectives * Spacecraft Control Problem: Linear Control Law Based on Linearized Spacecraft Equations * Spacecraft Control Problem: Bang-Bang Control Law Based on Linearized Spacecraft Equations * Spacecraft Control Problem: Nonlinear Control Law Based on Nonlinear Spacecraft Equations * Spacecraft Control Problem: Attitude Control in Circular Orbit * Other Spacecraft Control Problems and Control Methodologies * Defining Terms
M. Pachter Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH
C. H. Houpis Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH
Vincent T. Coppola Department of Aerospace Engineering, The University of Michigan, Ann Arbor, MI
N. Harris McClamroch Department of Aerospace Engineering, The University of Michigan, Ann Arbor, MI
S. M. Joshi and A. G. Kelkar
References .................................................................... 1315
Further Reading .............................................................. 1315
75.3 Control of Flexible Space Structures ................................. 1316
Introduction * Single-Body Flexible Spacecraft * Multibody Flexible Space Systems * Summary
References .................................................................... 1326
75.4 Line-of-Sight Pointing and Stabilization Control System ............. 1326
Introduction * Overall System Performance Objectives * Physical System Description and Modeling * Controller Design * Performance Achieved * Concluding Remarks * Defining Terms
NASA Langley Research Center
David Haessig GEC-Marconi Systems Corporation, Wayne, NJ
References .................................................................... 1338
75.1 Flight Control of Piloted Aircraft
M. Pachter, Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH
C. H. Houpis, Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH
75.1.1 Introduction
Modern Flight Control Systems (FCS) consist of (1) aerodynamic control surfaces and/or the engines' nozzles, (2) actuators, (3) sensors, (4) a sampler and ZOH device, and (5) compensators. The first four components of an FCS are hardware elements, whereas the controller (the digital implementation of the compensator) is an algorithm executed in real time in the on-board 0-8493-8570-9/96/$0.00+$.50 @ 1996 by CRC Press, Inc.
digital computer. In this chapter the design of the compensation/controller element/algorithm of the FCS, for a given aircraft, after the actuators, sensors, and samplers have been chosen, is addressed. The way in which control theory is applied to the FCS's controller design is the main focus of this article. An advanced and comprehensive perspective on flight control is presented. The emphasis is on maneuvering flight control. Thus, attention is given to the process of setting up the flight control problem from its inception. Flight Mechanics is used to obtain a rigorous formulation of the nonlinear dynamic model of the controlled "plant"; the linearization of the latter yields Linear Time-Invariant (LTI) models routinely used for controller design. Also, it is important to remember that the pilot will be closing an additional outer feedback loop. This transforms the FCS design problem from one of meeting flying quality specifications into one of meeting handling quality specifications.
The essence of flight control is the design of an FCS for maneuvering flight. Hence, we chose not to dwell on outer-loop control associated with autopilot design and, instead, the focus is on the challenging problems of maneuvering flight control, the design of an inner-loop FCS, and pilot-in-the-loop issues. By its very nature, maneuvering flight entails large state variable excursions, which forces us to address nonlinearity and cross-coupling. To turn, pilots will bank their aircraft and pull g's. It is thus realized that the critical control problem in maneuvering flight is the stability axis, or velocity vector, roll maneuver. During a velocity vector roll both the lateral/directional and the pitch channels are controlled simultaneously. Hence, this maneuver epitomizes maneuvering flight control, for it brings into the foreground the pitch and lateral/directional channels' cross-coupling, nonlinearity, time-scale separation, tracking control design, actuator saturation concerns, and pilot-in-the-loop issues, all addressed in this chapter. Moreover, when additional simplifying assumptions apply, the velocity vector roll maneuver is general enough to serve as the starting point for deriving the classical LTI aircraft model, where the longitudinal and lateral/directional flight control channels are decoupled. Evidently, the design of an FCS for velocity vector rolls is a vehicle for exploring the important aspects of maneuvering flight control. Hence, this article's leitmotif is the high Angle Of Attack (AOA) velocity vector roll maneuver. Since this chapter is configured around maneuvering flight, and because pilots employ a high AOA and bank in order to turn, the kinematics of maneuvering flight and high AOA velocity vector rolls are now illustrated.
Thus, should the aircraft roll about its x-body axis, say, as a result of adverse yaw, then at the point of attainment of a bank angle of 90°, the AOA will have been totally converted into sideslip angle, as illustrated in Figure 75.1e. In this case, the aircraft's nose won't be pointed in the right direction. During a velocity vector roll the aircraft rotates about an axis aligned with its velocity vector, as illustrated in Figures 75.1a-75.1d. The operational significance of this maneuver is obvious, for it allows the pilot to slew quickly and point the aircraft's nose using a fast roll maneuver, without pulling g's, i.e., without increasing the normal acceleration, and turning. This maneuver is also a critical element of the close air combat S maneuver, where one would like to roll the loaded airframe, rather than first unload, roll, and pull g's. Thus, in this chapter, maneuvering flight control of modern fighter aircraft, akin to an F-16 derivative, is considered.
75.1.2 Flight Mechanics
Proper application of the existing control theoretic methods is contingent on a thorough understanding of the "plant." Hence, a careful derivation of the aircraft ("plant") model required in FCS design is given. To account properly for the nonlinearities affecting the control system in maneuvering flight, the plant model must be rigorously derived from the fundamental nine state equations of motion [1] and [2]. The Euler equations for a rigid body yield the following equations of motion. The Force
Figure 75.1 Initial and final flight configurations for high AOA maneuvers; V is always the aircraft velocity vector. a) level flight: ψ0 = θ0 = φ0 = 0; ψf = θf = 0, φf = φ. b) climbing flight: ψ0 = 0, θ0 = γ, φ0 = 0; ψf = 0, θf = γ, φf = φ. c) planar S maneuver: ψ0 = θ0 = 0, φ0 = −φ0; ψf = θf = 0, φf = φ0. d) climbing S maneuver: ψ0 = 0, θ0 = γ, φ0 = −φ0; ψf = 0, θf = γ, φf = φ0.
75.1. FLIGHT CONTROL OF PILOTED AIRCRAFT
Figure 75.1 (Cont.) Initial and final flight configurations for high AOA maneuvers; V is always the aircraft velocity vector. e) roll about x-body axis: side view at φ = 0; top view at φ = 90°.
Equations are

U̇ = VR − WQ + (1/m) ΣFx,  U(0) = U0,
where I denotes the aircraft's inertia tensor and the definition D = Ix Iz − Ixz² is used. Also note that the modified "moment equations" above are obtained by transforming the classical Euler equations for the dynamics of a rigid body into state-space form. The Kinematic Equations are concerned with the propagation of the Euler angles, which describe the attitude of the aircraft in inertial space. The Euler angles specify the orientation of the aircraft's body-axes triad, and rotations are measured with reference to a right-handed inertial frame whose z-axis points toward the center of the earth. The body axes are initially aligned with the inertial frame, and the orientation of the body-axes triad is determined by three consecutive rotations of ψ, θ, and φ radians about the respective z, y, and x body axes. When the three consecutive rotations of the body-axes frame are performed in the order specified above, ψ, θ, and φ are referred to as (3,2,1) Euler angles, the convention adhered to in this chapter. The Euler angles, which are needed to resolve the force of gravity into the body axes, are determined by the aircraft's angular rates P, Q, and R, according to the following equations:
(75.1)
for 0 ≤ t ≤ tf,

V̇ = WP − UR + (1/m) ΣFy,  V(0) = V0,  (75.2)

and

Ẇ = UQ − VP + (1/m) ΣFz,  W(0) = 0.  (75.3)
where m is the mass of the aircraft and U, V, and W are the components of the aircraft's velocity vector, resolved in the respective x, y, and z body axes; similarly, P, Q, and R are the components of the aircraft's rotational speed, resolved in the x, y, and z body axes (see Figure 75.2). The origin of the body axes is collocated with the aircraft's CG. The initial condition in Equation 75.3 is consistent with the use of stability axes. If body axes were used instead, then the initial condition would be W(0) = W0. The Moment Equations are
The contribution of the force of gravity, the contributions of the aerodynamic and propulsive forces and moments acting on the aircraft, and the control input contributions are contained in the force (F) and moment (M) summations in Equations 75.1-75.6. The aerodynamic forces and moments are produced by the aircraft's relative motion with respect to the air flow, and are proportional to the air density ρ and the square of the airspeed V̄. In addition, the aerodynamic forces and moments are determined by the orientation angles with respect to the relative wind, viz., the AOA α and the sideslip angle β. The aerodynamic angles are depicted in Figure 75.2, where the aerodynamic forces and moments are also shown. The standard [4] nondimensional aerodynamic force and moment coefficients, designated by the letter C, are introduced. The wing's area S and the dynamic pressure q̄ = ½ρ(U² + V² + W²) = ½ρV̄² are used to nondimensionalize the aerodynamic forces; the additional parameters c̄, the wing's mean aerodynamic chord, or b, the wing's span, are used to nondimensionalize the aerodynamic moments. Moreover, two subscript levels are used. The first subscript of C designates the pertinent aerodynamic force or moment component, and the second subscript
Ṗ = ⋯,  P(0) = P0.  (75.4)
inner-loop flight control work, and Equation 75.1 is not used. The pertinent equations are 75.2-75.13. The thrust control setting is also not included in the Tz thrust component. Furthermore, the heading angle ψ does not play a role in Equations 75.2-75.8 and Equations 75.10-75.13. Hence, one need not consider Equation 75.9, and thus the pertinent velocity vector roll dynamics are described by the seven-DOF system of Equations 75.2-75.8 and Equations 75.10-75.13.
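The Euler-angle kinematic Equations 75.7-75.9 referenced above are the standard (3,2,1) relations; a quick numerical spot-check, with illustrative numbers of our own, verifies the constant-attitude steady-turn formulas that the trim analysis below relies on:

```python
import math

# The (3,2,1) Euler-angle kinematic equations: body rates P, Q, R
# propagate the Euler angles psi, theta, phi.
def euler_rates(P, Q, R, theta, phi):
    phidot = P + math.tan(theta) * (Q * math.sin(phi) + R * math.cos(phi))
    thetadot = Q * math.cos(phi) - R * math.sin(phi)
    psidot = (Q * math.sin(phi) + R * math.cos(phi)) / math.cos(theta)
    return psidot, thetadot, phidot

# Steady turn: the angular-velocity vector is vertical with magnitude w,
# so the body rates are the components of w*z_inertial resolved in body axes.
w, theta0, phi0 = 0.2, math.radians(5.0), math.radians(60.0)
P0 = -w * math.sin(theta0)
Q0 = w * math.cos(theta0) * math.sin(phi0)
R0 = w * math.cos(theta0) * math.cos(phi0)
psidot, thetadot, phidot = euler_rates(P0, Q0, R0, theta0, phi0)
# psidot equals w, while thetadot and phidot vanish: constant attitude,
# constant heading rate, and w^2 = P0^2 + Q0^2 + R0^2.
```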
75.1.3 Nonlinear Dynamics
The seven states are P, Q, R, α, β, θ, and φ. The aero-
dynamic angles are defined as follows: α = tan⁻¹(W/U), β = tan⁻¹(V/U).
Figure 75.2 Diagram depicting axes definitions, components of the velocity vector of the aircraft, angular rates of the aircraft, and aerodynamic forces and moments.
pertains to a specific component of the state vector. For example, the first stability derivative in Equation 75.10 is Clp, and it yields the aircraft's roll rate contribution to the nondimensional rolling moment coefficient Cl. Thus, the aerodynamic derivatives are "influence coefficients". Using three generalized control inputs, aileron, elevator, and rudder deflections, δa, δe, and δr, respectively, these force and moment summation equations are
In Equations 75.2-75.6 and Equations 75.10-75.13, consolidated stability and control derivatives (which combine like terms in the equations of motion) are used, and the following nonlinear dynamics are obtained [2].
It is assumed that stability axes are used. Furthermore, the particular stability axes used are chosen at time t = 0. Also, in Equations 75.15 and 75.17, α̇ = (Ẇ/Ū)/(1 + (W/Ū)²). The Kinematic Equations 75.7 and 75.8 are also included in the nonlinear dynamical system.
In Equation 75.13, Tz is the z-axis component of the thrust (usually < 0) and α0L (< 0) is the zero-lift Angle Of Attack (AOA) referenced from the x-stability axis. Hence, α0L is determined by the choice of stability axes. The same is true for the Cm0 stability derivative in Equation 75.11. The Cm0 stability derivative used here is C′m0 − Cmα α0L, where C′m0 pertains to AOA measurements referenced to the aircraft's zero-lift plane. Both Cm0 and α0L are defined with reference to a nominal elevator setting (δe = 0). The x-axis velocity component is assumed constant throughout the short time horizon of interest in inner-loop flight control work. Hence, the thrust (control) setting is not included in
Trim Analysis
The LHS of the seven differential Equations 75.2-75.6 (or Equations 75.14-75.18) and Equations 75.7 and 75.8 is set equal to zero to compute the so-called "trim" values of the aircraft states and controls. Trim conditions are now considered where the aircraft's angular rates P0, Q0, and R0, its sideslip velocity component V0, and the respective pitch and bank Euler angles θ0 and φ0 are constant. The pertinent seven states are V, W, P, Q, R, θ, and φ, and the (generalized) controls are
δa, δe, and δr. In the sequel, the trim controls and states are denoted with the subscript 0, with the exception of the trim value of U, which is barred. For fixed control trim settings, an initial trim condition (or nominal trajectory) is established. Also, when some of the trim states are specified, the remaining states and control trim settings can be obtained, provided that the total number of specified variables (controls and/or states) is three. Using generalized stability and control derivatives, the following algebraic trim equations are obtained from Equations 75.14, 75.16, and 75.18:
⋯ + CYβ β0 = −CYδr δr0,  (75.19)

Clpq P0Q0 + Clqr Q0R0 + Clp P0 + Clr R0 + Clβ β0 = −Clδa δa0 − Clδr δr0,  (75.20)

and

Cnpq P0Q0 + Cnqr Q0R0 + Cnp P0 + Cnr R0 + Cnβ β0 = −Cnδa δa0 − Cnδr δr0.  (75.21)

Obviously, δe0 must be feasible, i.e., −δe max ≤ δe0 ≤ δe max. An analysis of Equations 75.7 and 75.8 is now undertaken. Recall that the trim trajectory is flown at a constant pitch angle. If both Q0 and R0 are not 0, then the bank angle φ0 must be constant. The trim pitch and bank angles are obtained from the Euler angle Equations 75.7 and 75.8. Thus, if R0 ≠ 0, then φ0 = tan⁻¹(Q0/R0), and θ0 = −tan⁻¹((P0/R0) cos φ0). Hence, if P0 = 0, then θ0 = 0 and this trim condition entails a level turn, illustrated in Figure 75.1c. In the general case where P0 ≠ 0, a steady spiral climb or a steady corkscrew descent, as illustrated in Figure 75.1d, is being considered. Moreover, for the above flight maneuvers, Equation 75.9 yields the elegant heading rate result
In Equations 75.19-75.21 the unknowns are P0, Q0, R0, β0, δa0, and δr0. In addition, and as will be shown in the sequel, Equations 75.7 and 75.8 yield θ0 = θ0(P0, Q0, R0) and φ0 = φ0(P0, Q0, R0). Hence, we have obtained three equations in six unknowns. This then requires the specification of three trim variables, whereupon the remaining three trim variables are obtained from the above three trim equations. For example, if the equilibrium angular rates P0, Q0, and R0 are specified, then the required control trim settings δa0 and δr0, and the trim sideslip angle β0, can be calculated. The calculation then entails the solution of a set of three linear equations in three unknowns. The system matrix M that needs to be inverted, obtained from Equations 75.19-75.21, is
Obviously, not every pair of prespecified pitch and bank trim angles, θ0 and φ0, respectively, is feasible. Indeed, the solution of the equations above must satisfy −δa max ≤ δa0 ≤ δa max and −δr max ≤ δr0 ≤ δr max. In the very special case where the bank angle φ0 = 0, a symmetrical flight condition ensues with β0 = δa0 = δr0 = 0. A steady climb in the pitch plane is then considered, as illustrated in Figure 75.1b. Finally, α0L and δe0 are then determined from Equations 75.22 and 75.23, where P0 = Q0 = R0 = 0, i.e., ⋯cos θ0 + CTz − Czα α0L + Czδe δe0 = 0, and Cm0 − Cmα α0L + Cmδe δe0 = 0. The trim conditions identified above represent interesting flight phases. For example, and as shown in Figure 75.1a, one can initiate a velocity vector roll in trimmed and level flight, where P0 = Q0 = R0 = θ0 = φ0 = β0 = δa0 = δr0 = 0, and one can subsequently arrest the velocity vector roll at a new trim flight condition which entails a level turn, shown in Figure 75.1c, and where Q0 and R0 are prespecified, P0 = θ0 = 0, and φ0 = tan⁻¹(Q0/R0). The new bank angle satisfies the equation sin φ0 = Q0/√(Q0² + R0²).
Obviously, the M matrix must be nonsingular for a trim solution with constant angular rates to be feasible, i.e., the condition det M ≠ 0 must hold. From the remaining Equations 75.15 and 75.17, the two equations which determine the zero-lift angle included between the velocity vector and the aircraft's zero-lift plane, α0L, and the trim elevator setting, δe0, are obtained:
⋯ − Czα α0L + (1 + Czq) Q0 + Czδe δe0 = 0,  (75.22)

and

Cm0 − Cmα α0L + Cmδe δe0 + Cmq Q0 + Cmpr P0R0 + Cmr² R0² − Cmp² P0² = 0.  (75.23)
Once the angular rates and the sideslip angle β0 have been established, the δe0 and α0L unknowns are determined from the linear Equations 75.22 and 75.23.
ψ̇0 = √(P0² + Q0² + R0²).

The required trim β0, δa0, and
δr0 is determined by solving the linear trim equations. Since the trim bank angle changes, this will require a change in the trim elevator setting and in α0L. The latter directly translates into a change in AOA. To account for the difference in the required AOA for trim before and after the maneuver is accomplished, a change in AOA equal to the difference between the new and old values of α0L must be commanded. To establish the new trim condition, a new trim sideslip angle β = β0 must also be commanded. Similarly, a level and symmetric S maneuver will be initiated in a level and trimmed turn, say, to the right, and the end trim condition will be a level and trimmed turn to the left. In this case, the new trim variables are Q0 ← Q0, −R0 ← R0, P0 = θ0 = 0, π − φ0 ← φ0. Evidently, R0 changes; hence a new trim sideslip angle must be established. A similar analysis can be conducted for climbing turns, corkscrew descents, and nonplanar S maneuvers, where θ0 ≠ 0.
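The trim computation described above reduces to one 3 × 3 linear solve. The sketch below uses made-up stability and control derivative values, and omits the cross-rate (P0Q0, Q0R0) terms of Equations 75.20 and 75.21, purely to illustrate specifying P0, Q0, R0 and solving for δa0, δr0, and β0, with the nonsingularity check on M:

```python
import numpy as np

# Hypothetical numerical sketch of the trim computation: with P0, Q0, R0
# specified, Equations 75.19-75.21 become three linear equations in
# (delta_a0, delta_r0, beta0).  All derivative values are placeholders.
C = {'Y_dr': 0.2,  'Y_b': -1.0,
     'l_da': 0.15, 'l_dr': 0.02, 'l_b': -0.10, 'l_p': -0.40, 'l_r': 0.10,
     'n_da': -0.01, 'n_dr': -0.10, 'n_b': 0.12, 'n_p': -0.02, 'n_r': -0.20}

P0, Q0, R0 = 0.0, 0.05, 0.10          # specified trim angular rates

# M [delta_a0, delta_r0, beta0]^T = rhs
M = np.array([[0.0,       C['Y_dr'], C['Y_b']],
              [C['l_da'], C['l_dr'], C['l_b']],
              [C['n_da'], C['n_dr'], C['n_b']]])
rhs = -np.array([0.0,                           # side-force rate terms dropped
                 C['l_p'] * P0 + C['l_r'] * R0,
                 C['n_p'] * P0 + C['n_r'] * R0])
assert abs(np.linalg.det(M)) > 1e-9             # M nonsingular: trim feasible
da0, dr0, beta0 = np.linalg.solve(M, rhs)

# trim attitude and heading rate for R0 != 0 (cf. the relations above)
phi0 = np.arctan2(Q0, R0)
psidot0 = np.sqrt(P0**2 + Q0**2 + R0**2)
```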
75.1.4 Actuators
The control effectors' actuator dynamics play an important role in flight control. Typical fighter aircraft actuator Transfer Functions
(TFs) are given below. Elevator dynamics:
Aileron dynamics:
The aileron's saturation limits are similar to the elevator's. Rudder dynamics:
The Bode plot of the elevator actuator TF is shown in Figure 75.3. Note the phase lag at high frequency. High-frequency operation is brought about by a high loop gain, which, in turn, is specified for robustness purposes. Thus, robust (flight) control mandates the use of high-order actuator models, which, in turn, complicates the control design problem. In addition, the following nonlinearities play an important role in actuator modeling and significantly impact the control system's design and performance. These nonlinearities are of saturation type and entail 1) a maximum elevator deflection of, e.g., ±22° and 2) a maximum elevator deflection rate of ±60°/sec. The latter significantly impacts the aircraft's maneuverability.
δr(s)/δrc(s) = 5148/(s² + 99.36s + 5148)
The maximum rudder deflection is ±30° and its maximum deflection rate is ±60°/sec. Multiaxes Thrust Vectoring (MATV) actuator dynamics: The MATV actuator entails a thrust-vectoring nozzle. Its deflection in pitch and yaw is governed by the following transfer function:
Note the reduced bandwidth of the MATV actuator. In addition, the dynamic range of the MATV actuator is ±20° and its maximum deflection rate is limited to ±45°/sec.
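The rudder actuator TF above, combined with the quoted deflection and rate saturations, can be sketched as a rate- and position-limited second-order integration. The state-space realization and the 40° test command below are our own illustration:

```python
# Sketch of the rudder actuator 5148/(s^2 + 99.36 s + 5148) augmented
# with the deflection (+-30 deg) and rate (+-60 deg/s) saturations.
def step_response(cmd=40.0, T=1.0, dt=1e-4,
                  wn2=5148.0, c=99.36, dmax=30.0, rmax=60.0):
    d, r = 0.0, 0.0                      # deflection [deg], rate [deg/s]
    for _ in range(int(T / dt)):
        acc = wn2 * (cmd - d) - c * r    # linear second-order dynamics
        r = max(-rmax, min(rmax, r + dt * acc))   # rate limit
        d = max(-dmax, min(dmax, d + dt * r))     # deflection limit
    return d
```

A 40° command saturates at the 30° deflection limit, while a small command settles near its linear steady-state value; the rate limit is what slows large-amplitude maneuvers, as noted in the text.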
Weighting Matrix
The three generalized control inputs δa, δe, and δr are responsible for producing the required control moments about the aircraft's x, y, and z axes. Modern combat aircraft are, however, equipped with redundant control effectors. The latter are of aerodynamic type, and, in addition, Multiaxes Thrust Vectoring (MATV) is employed. Hence, a weighting matrix W for control authority apportionment is used to transform the three generalized command inputs into the six command signals δar, δer, δep, δTVp, δry, and δTVy that control the six physical actuators. In industrial flight control circles the weighting matrix is also referred to as the "Mixer." In our flight control problem, W is a 6 × 3 matrix. It is noteworthy that a weighting matrix is required even in the case of aircraft equipped with basic control effectors only, in which case the generalized controls are also the actual physical controls and the weighting matrix is 3 × 3. The flight control concept of an aileron-rudder interconnect will mitigate some of the adverse yaw effects encountered at high AOAs. Thus, a roll command would not only command an aileron deflection, but also generate a rudder command to remove adverse yaw produced by the aileron deflection. The appropriate weighting matrix coefficient is based on the ratio of the applicable control derivatives, because the latter are indicative of the yaw moment control power of the respective effector. Hence, the weighting matrix
" ["I'
[ 1] 0
wi =
o * o
and Figure 75.3
Elevator actuator transfer function.
6r
is oftentimes employed,
+ w l [ " ]6r.
I
When the number of physical effectors exceeds the number of generalized controls, the control authority in each axis needs to be adequately apportioned among the physical effectors, as shown in Figure 75.4. Note that the six physical control effectors are partitioned into three distinct groups which correspond to the three control axes of the aircraft. There are two physical effectors in each group. The physical effectors δep and δTVp are in the δe group, the physical effectors δar and δer are in the δa group, and the physical effectors δry and δTVy are in the δr group. Now, the underlying control concept is to have the control effectors in a group saturate simultaneously. Hence, scale factors are used in the weighting matrix to account for dissimilar maximum control deflections within the control effectors in each group. For example, a maximal aileron command should command the roll ailerons to their maximum deflection of 20 degrees, but the roll elevators to their preassigned maximum roll control authority of 5 degrees only. Thus, scaling is achieved by multiplying the generalized control command by the ratio of the respective maximum actuator limits in its group. Hence,
Finally, the weighting matrices W1 and W2 are combined to form the weighting matrix W = W2W1.
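The two-stage Mixer composition can be sketched numerically. Everything below is an illustrative assumption except the 20-degree/5-degree roll-group ratio quoted above: the ARI gain w and the thrust-vectoring scale factors are made-up placeholders, not data from the text.

```python
import numpy as np

# Hypothetical sketch of the two-stage "Mixer": W1 implements the
# aileron-rudder interconnect (ARI), W2 the saturate-simultaneously
# scale factors, and W = W2 @ W1 is the 6x3 mixer.

w = 0.25  # assumed ARI gain, notionally a ratio of yaw control derivatives

# Generalized commands [da, de, dr] -> [da, de, dr + w*da]
W1 = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [w,   0.0, 1.0],
])

# Scale factors so both effectors in a group hit their limits together.
W2 = np.array([
    [1.0,      0.0, 0.0],   # d_ar  (roll ailerons, 20 deg max)
    [5.0/20.0, 0.0, 0.0],   # d_er  (roll elevators, 5 deg authority)
    [0.0,      1.0, 0.0],   # d_ep  (pitch elevators)
    [0.0,      0.5, 0.0],   # d_TVp (pitch thrust vectoring, assumed ratio)
    [0.0,      0.0, 1.0],   # d_ry  (rudder)
    [0.0,      0.0, 0.4],   # d_TVy (yaw thrust vectoring, assumed ratio)
])

W = W2 @ W1                        # 6x3 mixer
cmd = np.array([1.0, 0.0, 0.0])    # pure roll command
print(W @ cmd)
```

A pure roll command thus drives the ailerons to full scale, the roll elevators to a quarter of it, and, through W1, adds a proportional rudder (and yaw-TV) command.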
Figure 75.4 Mixer.

A more detailed analysis also considers the relative magnitudes of the control moments and control forces generated by the two effectors in each control group, which requires knowledge of the control derivatives. The control theoretic problem of optimal control authority apportionment among the various control effectors is currently a research topic in flight control.

75.1.5 Flight Control Requirements

The Flight Control System constitutes the pilot/aircraft interface and its primary mission is to enable the pilot/aircraft to accomplish a prespecified task. This entails the following:
1. Accommodate high α (velocity vector roll) maneuvers.
2. Meet handling/flying qualities specifications over the entire flight envelope and for all aircraft configurations, including tight tracking and accurate pointing for Fire Control.
3. Above and beyond item 1, obtain optimal performance. Pilot "gets what he wants."
4. Minimize the number of placards, the number of limitations and restrictions the pilot must adhere to. Pilot can fly his aircraft "in abandon."

Mil-Std 1797A [3] defines the flying quality specifications, Level 1 being the best. The specifications (airworthiness criteria) are given as a mixture of time and frequency domain specifications. The longitudinal channel uses time-domain specifications based on the pitch rate q response to a step input command calculated from the two-degrees-of-freedom model given by the fast, Short Period (SP), approximation. The time-domain specifications are based on two straight lines drawn on the q response shown in Figure 75.5. To meet Level 1 flying quality specifications, the equivalent time delay (t1) must be less than 0.12 sec, the transient peak ratio less than 0.30, and the effective rise time (Δt = t2 − t1) between 9/VT and 500/VT, where VT is the true airspeed (ft/sec).
Figure 75.5
Step Elevator Response.
For the lateral/directional channel, the flying quality specifications apply to simultaneously matching the Bode plots of the final system with those of the equivalent fourth-order transfer functions given by
THE CONTROL HANDBOOK
β(s)/δrud(s) = (A3 s³ + A2 s² + A1 s + A0) e^(−τep s) / [(s + 1/Ts)(s + 1/TR)(s² + 2ζd ωd s + ωd²)].
To meet Level 1 flying qualities, the Roll Mode time constant (TR) must be less than 1.0 sec, the Dutch Roll Mode damping (ζd) greater than 0.40, the Dutch Roll Mode natural frequency (ωd) greater than 1 rad/sec, the roll rate time delay (τep) less than 0.1 sec, and the time to double of the Spiral Mode (Ts ln 2) greater than 12 sec. Disturbance rejection specifications in the lateral/directional channel are based on control surface usage and are variable throughout the flight envelope. In addition, the open-loop phase margin angle γ of both the longitudinal and lateral/directional designs must be greater than 30°, and the open-loop gain margin a must be greater than 6 dB. To determine phase and gain margins, the open-loop transmissions from stick inputs to the required outputs in each loop are examined. Finally, the phase margin frequency (cutoff frequency ωc) should be less than 30 rad/sec to prevent deleterious interaction with the bending modes of the aircraft. Concerning cross-coupling disturbance bounds: One requirement is that a sustained 10° sideslip will use less than 75% of the available roll axis control power (aileron). There is also a complicated specification for the amount of sideslip β allowed for a particular roll angle. If one simplifies the requirements in a conservative fashion over the majority of the flight envelope, a roll command of 1°/sec should result in less than 0.022° of β, but, at low speeds, β is allowed to increase to 0.067°. The maximal β allowed in any roll maneuver is 6°. Evidently, a "robust" set of specifications is not available because the flying qualities depend on flight condition throughout the flight envelope. The lack of uniform performance bounds over the entire envelope illustrates a shortcoming of the current robust control paradigms. Moreover, the specifications above must be met in the face of the following.
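The Level 1 bounds quoted above lend themselves to a simple checklist. The helper below is a hypothetical sketch: the function name, argument names, and sample values are ours; only the numeric bounds come from the text.

```python
# Hypothetical checker for the Level 1 lateral/directional bounds quoted
# above. Returns a dict of pass/fail flags, one per requirement.

def level1_lat_dir(T_R, zeta_d, omega_d, tau_ep, T2_spiral,
                   phase_margin_deg, gain_margin_db, omega_c):
    return {
        "roll mode":             T_R < 1.0,              # time constant, sec
        "dutch roll damping":    zeta_d > 0.40,
        "dutch roll frequency":  omega_d > 1.0,          # rad/sec
        "roll rate delay":       tau_ep < 0.1,           # sec
        "spiral time to double": T2_spiral > 12.0,       # sec
        "phase margin":          phase_margin_deg > 30.0,
        "gain margin":           gain_margin_db > 6.0,
        "cutoff frequency":      omega_c < 30.0,         # rad/sec, bending modes
    }

# Illustrative parameter set that satisfies every bound:
result = level1_lat_dir(T_R=0.6, zeta_d=0.5, omega_d=2.0, tau_ep=0.05,
                        T2_spiral=20.0, phase_margin_deg=45.0,
                        gain_margin_db=8.0, omega_c=12.0)
print(all(result.values()))
```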
A. Constraints
Practical limitations on achievable performance (including robustness) are imposed by (1) actuator rate limits (e.g., ca. 60°/sec), (2) actuator saturation (e.g., ca. 22°), (3) sensor noise/sensor quality, (4) low frequency elastic modes, and (5) highest possible sampling rate for digital implementation.
B. Environment
The FCS needs to accommodate the following: (1) "plant" variability/uncertainty in the extended flight envelope, and aircraft configurations possibilities. Thus, in the real world, structured uncertainty needs to be addressed. Also, accommodation of inflight aerodynamic control surfaces failures/battle damage is required, (2) atmospheric disturbances/turbulence suppression, (3) unstable aircraft (for reduced trim drag and improved cruise performance); guard against adverse interaction with actuator saturation, for the latter can cause loss of control, i.e., departure, (4) control authority allocation for trim, maneuverability, additional channel control, stabilization. Example: In the current F-16, the apportionment of control authority is 10, 7, 4, and 2 degrees of elevator deflection, respectively, (5) optimal control redundancy management, and (6) pilot-in-the-loop.
75.1.6 Dynamic Analysis
The linearization of the state equations of motion about trimmed flight conditions is the first step performed in classical flight control system design. The simple trim condition shown in Figure 75.1a, which entails symmetric and level, or climbing, flight, and where P0 = Q0 = R0 = Φ0 = 0 and β0 = 0, is considered. Linearization is based on the small perturbations hypothesis, which predicates that only small excursions in the state variables about a given trim condition occur [1], [2], [5]. For conventional aircraft with an (x, z) plane of symmetry, and for small roll rates, an additional benefit accrues: Linearization brings about the decoupling of the aircraft's pitch and lateral/directional dynamics, thus significantly simplifying the flight control problem. Unfortunately, the velocity vector roll is a large amplitude maneuver consisting of a high roll rate and large excursions in Euler angles. Thus, the velocity vector roll maneuver could entail a high (stability axis) roll rate P. According to the flying quality specifications in Section 75.1.5, a maximal roll rate is desired, hence P is not necessarily small. In addition, during a velocity vector roll maneuver, large excursions in the bank angle Φ occur. Hence, for maneuvering flight, the standard linearization approach must be modified. All of the system states, other than the roll rate and the Euler angle Φ, can still be represented as small perturbation quantities (denoted by lower case variables), and the squares and products of these state variables can be neglected as is typically done in the small perturbations based linearization process. All terms containing the roll rate P and bank angle Φ are not linearized, however, because these variables do not represent small perturbations. Furthermore, the elevator deflection δe is now measured with reference to the trim elevator setting of δe0.
Therefore, U0 does not feature in Equation 75.15, nor does Czu feature in Equation 75.17. Furthermore, the gravity term in Equation 75.15 is modified to account for the trim condition. Moreover, the FCS strives to maintain q and θ small, and our choice of stability axes yields the velocity vector roll's end conditions θf = θ0. Hence, it is assumed that, throughout the velocity vector roll, the pitch angle perturbation θ ≈ 0. In addition, the stability axes choice lets us approximate the AOA as α = tan⁻¹(w/U) ≈ w/U and, at this particular trim condition (β0 = 0), the sideslip angle β = tan⁻¹(v/U) ≈ v/U. Thus, the "slow" dynamics in Equations 75.8, 75.15 and 75.14 are reduced to
Φ̇ = P + (q sin Φ + r cos Φ) tan θ0,   (75.28)

α̇ = −Pβ + (g/U)(cos θ0 cos Φ − θ sin θ0 cos Φ − cos θ0) + Czα α + Czq q + Czα̇ α̇ + Czδe δe, and   (75.29)
third order [2], [6]:
where θ now denotes the perturbation in the pitch angle. Equations 75.16, 75.17 and 75.18 yield the fast dynamics
Ṗ = Clp P + ClPq Pq + Clr r + Clβ β + Clδar δar + Clδer δer + Clδry δry + ClδTVy δTVy,   (75.31)

q̇ = CmPr Pr − CmP² P² + Cmα α + Cmq q + Cmα̇ α̇ + Cmδep δep + CmδTVp δTVp,   (75.32)

ṙ = CnPq Pq + Cnp P + Cnr r + Cnβ β + Cnδar δar + Cnδer δer + Cnδry δry + CnδTVy δTVy,   (75.33)
and
Hence, the dynamics of maneuvering flight and of the velocity vector roll maneuver entail a seven DOF model which is specified by Equations 75.28 - 75.34. The three generalized control effectors' inputs δa, δe, and δr appearing in the six DOF equations represent the six physical control effector inputs, i.e., the pitch elevator deflection δep, the deflection in pitch of the thrust vectoring engine nozzle δTVp, the aileron deflection δar, the roll elevator deflection δer (we here refer to the differential tails), the rudder deflection δry, and the deflection in yaw of the thrust vectoring engine nozzle δTVy. When the initial trim condition is straight and level flight, θ0 = 0. Then a further simplification of the Equations of motion 75.28 - 75.30 is possible:
+ Czδep δep + CzδTVp δTVp, and   (75.36)
The third-order dynamics depend on the pitch angle parameter θ0. At elevated pitch angles the SP/Phugoid time-scale separation's validity is questionable. Furthermore, applying the small perturbations hypothesis to the additional variables P and Φ in Equations 75.28, 75.30, 75.31, and 75.33, allows us to neglect the terms which contain perturbation products. This decouples the lateral/directional channel from the pitch channel. Thus, the conventional lateral/directional "plant" is
In conclusion, the horizontal flight assumption θ0 = 0 reduces the complexity of both the pitch channel and the velocity vector roll control problems. The stability and control derivatives in Equations 75.38, 75.39, and Equations 75.40 - 75.43 depend on flight condition. Therefore, in most aircraft implementations, gain scheduling is used in the controller. Hence, low order controllers are used. A simple first-order controller whose gain and time constant are scheduled on the dynamic pressure q̄ will do the job. In modern approaches to flight control, robust controllers that do not need gain scheduling are sought. A full envelope controller for the F-16 VISTA that meets the flying quality specifications of Section 75.1.5 has been synthesized using the QFT robust control design method [7]. The pitch channel's inner and outer loop compensators, G1 and G2, and the prefilter F, are
Equations 75.35 - 75.37 are used in conjunction with the "fast" dynamics in Equations 75.31 - 75.34 to complete the description of the aircraft dynamics for an initial straight and level trim.
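The gain-scheduling idea mentioned above, a first-order controller whose gain and time constant are scheduled on dynamic pressure, can be sketched as follows. The breakpoint table and the linear interpolation are illustrative assumptions, not data from the text.

```python
import bisect

# Assumed schedule: (q_bar, gain K, time constant tau [sec]).
# A first-order controller K/(tau*s + 1) is interpolated on q_bar.
SCHEDULE = [(100.0, 4.0, 0.50),
            (300.0, 2.0, 0.30),
            (600.0, 1.0, 0.15)]

def scheduled_controller(q_bar):
    """Linearly interpolate (K, tau) between breakpoints, clamping at the ends."""
    qs = [row[0] for row in SCHEDULE]
    if q_bar <= qs[0]:
        return SCHEDULE[0][1:]
    if q_bar >= qs[-1]:
        return SCHEDULE[-1][1:]
    i = bisect.bisect_right(qs, q_bar)
    (q0, K0, t0), (q1, K1, t1) = SCHEDULE[i - 1], SCHEDULE[i]
    f = (q_bar - q0) / (q1 - q0)
    return K0 + f * (K1 - K0), t0 + f * (t1 - t0)

print(scheduled_controller(200.0))   # halfway between the first two breakpoints
```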
75.1.7 Conventional Flight Control
It is remarkable that the very same assumption of trimmed level flight renders in the pitch channel the classical second-order Short Period (SP) approximation. Thus, P = Φ = 0 yields the celebrated linear SP pitch dynamics

α̇ = Czα α + (1 + Czq) q + Czα̇ α̇ + Czδe δe,   (75.38)

q̇ = Cmα α + Cmq q + Cmα̇ α̇ + Cmδe δe.   (75.39)
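A minimal simulation makes the SP step response of Figure 75.5 concrete. The model below is a dimensional short-period sketch (alpha_dot = Z_a*alpha + q, q_dot = M_a*alpha + M_q*q + M_d*de); the derivative values are illustrative of a stable short period, not taken from the text.

```python
# Forward-Euler simulation of a short-period approximation under a
# small step elevator input. All numerical values are illustrative.

Z_a, M_a, M_q, M_d = -1.2, -4.0, -1.5, -8.0   # assumed dimensional derivatives
dt, T = 0.001, 5.0
alpha = q = 0.0
de = -0.01                 # step elevator input (rad)

t = 0.0
while t < T:
    alpha_dot = Z_a * alpha + q
    q_dot = M_a * alpha + M_q * q + M_d * de
    alpha += dt * alpha_dot
    q += dt * q_dot
    t += dt

print(round(q, 4))         # q has settled near its steady-state value
```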
In the more general case of trimmed and symmetric climbing flight in the vertical plane, the longitudinal SP dynamics are of
75.1.8 Time-Scale Separation
Time-scale separation plays an important role in flight control. The fast states P, q, and r which represent the angular rates are dominant during maneuvering flight where the velocity vector roll is initiated and arrested. The perturbations α in AOA and in the sideslip angle are maintained at near zero in the short time periods of the transition regimes. Therefore, the six DOF
dynamical model obtained when θ0 = 0 can be decoupled into "slow" dynamics associated with the α, β and Φ variables and the "fast" dynamics of the P, q, and r angular rates. Two primary regimes of the roll maneuver are identified, viz., the transition and free stream regions shown in Figure 75.6. The dynamic transition regions represent the initiation and arrest of the velocity vector roll, which occur in a relatively short amount of time (on the order of one second). In these transition regions the fast states of the aircraft, viz., the angular rates, are dominant. During these initial and/or terminal phases of the velocity vector roll maneuver a desired roll rate needs to be established, viz., the controller's modus operandi entails tracking. In the free stream regime, the established roll rate is maintained through the desired bank angle, while the perturbations in AOA and sideslip angle are regulated to zero. Thus, in the "free stream" regime, the controller's function entails regulation.
Figure 75.6 Time-scale separation.
The design of the FCS for maneuvering flight and for large amplitude velocity vector roll maneuvers at high AOAs hinges on this discernible time-scale separation, and is now outlined. The time-scale separation leads to the two-loop control structure shown in Figure 75.7. The inner loop of the FCS consists of a three-axis rate-commanded control system which controls the fast states of the system. This control function is essential in the transition regions of the velocity vector roll maneuver. The bandwidth of the inner loop is relatively high. The outer loop serves to regulate to zero the relatively slower aerodynamic angles, viz., the AOA and sideslip angle perturbations which accrue during the velocity vector roll, in particular, during the transition regions. The outer loop is therefore applicable to both the transition and free stream regimes of the velocity vector roll.
Time-Domain Approach
Maneuvering flight control entails tracking control, as opposed to operation on autopilot, where regulation is required. Tracking control in particular in the presence of actuator saturation requires a time-domain approach. Therefore, the objective is the derivation of a receding horizonlmodel predictive tracking control law, in the case where both fast and slow state variables feature in the plant model. An optimal control sequence for the
whole planning horizon is obtained and its first element is applied to the plant, following which the control sequence is reevaluated over a one time step displaced horizon. Thus, feedback action is obtained. A procedure for synthesizing an optimal control sequence for solving a fixed horizon optimal control problem, which exploits the time-scale separation inherent in the dynamical model, is employed. The time-scale separation decomposes the receding horizon control problem into two nested, but lower dimensional, receding horizon control optimization problems. This yields an efficient control signal generation algorithm. The outer loop optimal control problem's horizon is longer than the inner loop optimal control problem's horizon, and the discretization time step in the inner loop is smaller than in the outer loop. Thus, to apply model predictive tracking control to dynamical systems with time-scale separation, the classical nested feedback control loop structure of the frequency domain is transcribed into the time domain. The plant model for the inner loop's "fast" control system entails reduced order dynamics and hence the "fast" state x1 consists of the angular rates P, Q, and R. In the inner loop the control signal is [δa, δe, δr]T. The reference variables are the desired angular rates, namely Pc, Qc, and Rc. The latter are supplied by the slower outer loop. The on-line optimization performed in the inner loop yields the optimal control signals δa*, δe*, and δr*. The slow variables are passed into the inner loop from the outer loop. Thus, in the inner loop the slow variables x2 = (α, β, θ, Φ) are fixed, and if linearization is employed, their perturbations are set to 0. In the slower, outer loop, only the "slow" states α, β, and Φ are employed. In the outer loop the control variables are the "fast" angular rates P, Q, and R.
Their optimal values are determined over the outer loop optimization horizon, and they are subsequently used as the reference signal in the fast inner loop. The outer loop's reference variables are commanded by the pilot. For example, at low q̄ the pilot's commands are Qc, Pc, βc.
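The nesting described above can be sketched structurally. Everything below is a toy stand-in: the "optimizers" are simple proportional plans and the dynamics are bare integrators. Only the loop structure mirrors the text: differing time steps and horizons, slow variables frozen inside the fast loop, and application of only the first planned element at each level.

```python
# Structural sketch of the nested two-time-scale receding horizon scheme.

DT_OUTER, N_OUTER = 0.1, 5     # slow loop: coarser step, longer horizon
DT_INNER, N_INNER = 0.02, 5    # fast loop: finer step, shorter horizon

def outer_optimize(x_slow, pilot_cmd):
    # Placeholder plan: commanded rates proportional to slow-state error.
    return [tuple(2.0 * (c - s) for s, c in zip(x_slow, pilot_cmd))] * N_OUTER

def inner_optimize(x_fast, rate_cmd):
    # Placeholder plan: deflections proportional to rate error
    # (the slow variables are held frozen inside this loop).
    return [tuple(0.5 * (rc - xf) for xf, rc in zip(x_fast, rate_cmd))] * N_INNER

def integrate(state, rates, dt):
    return tuple(s + dt * r for s, r in zip(state, rates))

x_slow = (0.0, 0.0, 0.0)       # alpha, beta, phi (illustrative)
x_fast = (0.0, 0.0, 0.0)       # P, Q, R

pilot_cmd = (0.0, 0.0, 1.0)    # outer-loop reference, e.g., a bank-angle demand
for _ in range(20):                                  # outer receding horizon
    rate_cmd = outer_optimize(x_slow, pilot_cmd)[0]  # first element only
    for _ in range(round(DT_OUTER / DT_INNER)):      # inner loop runs faster
        u = inner_optimize(x_fast, rate_cmd)[0]      # first element only
        x_fast = integrate(x_fast, u, DT_INNER)      # toy fast dynamics
    x_slow = integrate(x_slow, x_fast, DT_OUTER)     # toy slow kinematics

print(round(x_slow[2], 2))     # phi has moved toward the commanded value
```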
75.1.9 Actuator Saturation Mitigation in Tracking Control
The classical regulator paradigm applies to outer loop flight control, where autopilot design problems are addressed. More challenging maneuvering, or inner loop, flight control entails the on-line solution of tracking control problems. The design of an advanced pitch channel FCS which accommodates actuator saturation is presented. Linear control law synthesis methods, e.g., LQR, H∞, and QFT, tend to ignore "hard" control constraints, viz., actuator saturation. Unfortunately, good tracking requires the use of high gain in the linear design, and consequently actuator saturation effects are pronounced in tracking control. Hence, in this section, the basic problems of tracking control and mitigating actuator saturation effects are synergetically addressed. The emphasis is on real-time synthesis of the control law, so that implementation in a feedback flight control system is possible. Thus, a hybrid optimization approach is used for mitigating actuator saturation effects in a high gain situation. Specifically, model following control is considered. The Linear Quadratic Regulator (LQR) and
Figure 75.7 Flight control system.
Linear Programming (LP) optimization paradigms are combined [8] so that a tracking control law that does not violate the actuator constraints is synthesized in real time. Furthermore, the multiscale/multigrid approach presented in Section 75.1.8 is applied. Attention is confined to the pitch channel only and the conventionally linearized plant Equations 75.38 and 75.39 are used.
Saturation Problem
Actuator saturation reduces the benefits of feedback and degrades the tracking performance. Actuator saturation mitigation is of paramount importance in the case where feedback control is employed to stabilize open-loop unstable aircraft, e.g., the F-16: Prolonged saturation of the input signal to the plant is tantamount to opening the feedback control loop. Hence, in the event where the controller continuously outputs infeasible control signals which will saturate the plant¹, instability and departure will follow. At the same time, conservative approaches, which rely on small signal levels to avoid saturation, invariably yield inferior (sluggish) control performance. Hence, the following realization is crucial: The cause of saturation-precipitated instability is the continuous output by the linear controller of infeasible control signals. Conversely, if the controller sends only feasible control signals to the actuator, including controls on the boundary of the control constraint set, instability won't occur. In this case, in the event where the fed back measurement tells the controller that a reduction in control effort is called for, the latter can be instantaneously accomplished. Thus, a "windup" situation, typical in linear controllers, does not occur and no "anti-windup" measures are required. Therefore, the feedback loop has not been opened, and the onset of instability is precluded. Obviously, a nonlinear control strategy is needed to generate feasible control signals. To command feasible control signals u, one must compromise on the tracking performance which is achievable with unconstrained controls.
¹"Plant" here means the original plant and the actuator.
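The windup argument above can be illustrated with a toy first-order plant: a linear controller that keeps integrating while its infeasible output is clipped "winds up" and recovers slowly after the demand drops, while one that only issues feasible commands (here, via conditional integration) recovers promptly. All gains, limits, and the plant itself are arbitrary placeholders.

```python
# Toy windup illustration: PI-like controller, saturated input, first-order
# plant x_dot = -x + u_sat. An unreachable setpoint (r = 3 with u_max = 1)
# is held for 4 s, then dropped to 0.5; we measure the final tracking error.

def simulate(anti_windup, T=8.0, dt=0.01, u_max=1.0):
    x, integ, t = 0.0, 0.0, 0.0
    while t < T:
        r = 3.0 if t <= 4.0 else 0.5
        e = r - x
        u = 2.0 * e + integ                      # proportional + integral
        u_sat = max(-u_max, min(u_max, u))       # only feasible commands reach the plant
        if (not anti_windup) or u == u_sat:      # conditional integration:
            integ += 0.5 * e * dt                # freeze integrator when clipped
        x += dt * (-x + u_sat)                   # first-order plant
        t += dt
    return abs(x - 0.5)                          # final tracking error

print(simulate(anti_windup=True) < simulate(anti_windup=False))
```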
Tracking Control
The model following state feedback control system is illustrated schematically in Figure 75.8. The pitch channel, modeled by Equations 75.38 and 75.39, is considered. The controlled variable is pitch rate q. The model M outputs the reference signal r, which is the pilot's desired pitch rate qc. P represents the aircraft's dynamics, A the actuator dynamics, and G the controller. Ki and Ko are adaptive gains in the feedback and feedforward loops, respectively.
Figure 75.8
Tracking control.
A first-order actuator model with a time constant of 50 milliseconds is used. The Short Period approximation of the longitudinal dynamics of an F-16 aircraft derivative is employed. The latter is statically unstable, viz., Mα > 0. Integral action is sought, and hence the plant is augmented to include an integrator in the forward path. The integrator's state is z, viz., ż = r′ − q. Hence, the fed back state vector consists of the pitch rate q, the AOA α, the elevator deflection δe, and the integrator state z. The command signal u = δec at the actuator's input consists of two parts, a fed back signal which is a linear function of the state and a signal which is a linear function of the pilot's input to the model and hence of the reference signal r. Therefore, actuator saturation will be precipitated by either a high amplitude excursion of the state, caused by a disturbance applied to the plant, or by a high amplitude reference signal when the flight control system is being driven hard by the pilot. Now, it is most desirable to track
the reference model output driven directly by the pilot. However, because saturation requires the generation by the controller of feasible control signals, tracking performance must be relaxed. Hence, saturation mitigation requires that r′1 ≠ r1; therefore, reduce the feedforward gain Ko, or, alternatively, reduce the loop gain Ki.
Reference Signal Extrapolation
The optimal control solution of the tracking control problem (as opposed to the regulator problem) requires advance knowledge of the exogenous reference signal. Therefore, the reference signal r must be extrapolated over the optimization time horizon. However, long range extrapolation is dangerous. Hence, receding horizon optimal control provides a natural framework for tracking control. Even though past reference signals are employed for extrapolation, some of which (r′) are divorced from the pilot's reference signal input (r), a good way must be found to incorporate the pilot's extrapolated (future) reference signal demand into the extrapolated reference signal being supplied to the LQR algorithm. Hence, the extrapolation of the pilot's signal and its incorporation into the reference signal provided to the LQR algorithm provides good lead and helps enhance tracking performance. At the same time, the use of the past reference signal output by the controller (and not necessarily by the pilot-driven model) will not require abrupt changes in the current elevator position and this will ameliorate the elevator rate saturation problem. Since the receding horizon tracking controller requires an estimate of r′ for the future, the latter is generated as a parabolic interpolation of r′0, the yet to be determined adjusted reference signal r′1, and r̂10. r̂10 is generated by parabolically interpolating the pilot issued commands r−1, r0, and r1, and extrapolating ten time steps in the future, yielding r̂10. The ten point r′ vector (which is the vector actually applied to the system, not the r vector) is parameterized as a function of the yet to be determined r′1. Therefore, the objective of the controller's LP algorithm is to find the reference signal at time now, r′1, that will not cause a violation of the actuator limits over the ten-point prediction horizon.
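The parabolic extrapolation step can be sketched directly: fit a parabola through the three most recent pilot commands and evaluate it ten steps ahead. The sample commands below are illustrative.

```python
import numpy as np

# Quadratic (parabolic) extrapolation of the pilot's commands
# r_{-1}, r_0, r_1 (spaced dt apart) over the next ten time steps.

def extrapolate(r_m1, r_0, r_1, steps=10, dt=0.01):
    t = np.array([-dt, 0.0, dt])
    coeffs = np.polyfit(t, [r_m1, r_0, r_1], 2)   # exact fit through 3 points
    future = np.arange(2, steps + 2) * dt          # times of r_2 ... r_{steps+1}
    return np.polyval(coeffs, future)

# A ramping command: the extrapolation simply continues the ramp.
ref = extrapolate(0.00, 0.01, 0.02)
print(np.round(ref, 3))
```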
Optimization
The discrete-time dynamical system x_{k+1} = A x_k + b u_k, x_0 = x⁰, k = 0, 1, ..., N−1, is considered over a receding ten time steps planning horizon, i.e., N = 10. The state vector x ∈ R⁴ and the control variable is the commanded elevator deflection δec ∈ R¹. In flight control, sampling rates of 100 Hz are conceivable. The quadratic cost criterion weighs the tracking error r′ − q, the control effort (commanded elevator deflection), and control rate effort, over the ten time steps planning horizon which is 0.1 sec. At the time instant k = 0, the future control sequence u0, u1, ..., u_{N−1} is planned based on the initial state information x0 and the reference signal sequence r1, r2, ..., rN given ahead of time. Now, the actual reference signal received at time k = 0 is r1; the sequence r2, ..., rN is generated by polynomially (quadratically) extrapolating the reference signal forward in time, based on the knowledge of past reference signals. The extrapolated reference signals r2, ..., rN are linear in r1. An open-loop optimization approach is employed. Although the complete optimal control sequence u0, u1, ..., u_{N−1} is calculated, only u0 is exercised at time "now," thus obtaining feedback action. An LQR tracking control law is first synthesized without due consideration of saturation. The objective here is to obtain good "small signal" tracking performance. If saturation does not occur, the LQR control law will be exercised. Saturation is addressed by using the explicit formula that relates the applied reference signal sequence r′1, ..., r′N and the LQR algorithm-generated optimal control sequence u0, ..., u_{N−1}. Thus, the dependence of the optimal control sequence on the initial state and on the reference signal r′1 is transparent:
where the vectors a, b and c ∈ R^N. The vector p ∈ R³ represents the parameters (weights) of the LQR algorithm, viz., pT collects the tracking error, control effort, and control rate weights. We are mainly concerned with actuator rate saturation and hence the important parameter is the control rate weight. Next, the LP problem is formulated to enforce the nonsaturation of the command signal to the actuator and of the command rate signal. The following 2N (= 20) inequalities must hold:
−defmax ≤ u0 ≤ defmax,    −defrmax ≤ (u0 − u−1)/ΔT ≤ defrmax,
−defmax ≤ u1 ≤ defmax,    −defrmax ≤ (u1 − u0)/ΔT ≤ defrmax,
...,
−defmax ≤ uN−1 ≤ defmax,    −defrmax ≤ (uN−1 − uN−2)/ΔT ≤ defrmax.
The control u−1 is available from the previous window. The defmax (22°) and defrmax (60°/sec) bounds are the maximal elevator deflection and elevator deflection rate, respectively. The sampling rate is 100 Hz and hence ΔT = 0.01 sec. The 2N inequalities above yield a set of 2N constraints in the currently applied reference signal r′1. Thus,
gi ≤ r′1 ≤ fi.

Next, f = mini [fi], g = maxi [gi] is defined, and the case where g ≤ f is considered. The set of 2N inequalities is consistent and a feasible solution r′1 exists. Then, given the pilot-injected reference signal r1, a feasible solution is obtained, which satisfies the 2N inequalities above, and which is as close as possible to r1. In other words,
min |r1 − r′1|.

Finally, at time k = 0, set r1 := r′1. Thus, the action of the LP algorithm is equivalent to lowering the feedforward gain Ko. In the case where the set of inequalities is inconsistent (f < g) and a solution does not exist, the loop gain Ki needs to be reduced. This is accomplished by increasing (say doubling) the LQR algorithm's R penalty parameter, and repeating the optimization process.
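Because each planned command is affine in the single unknown r′1, the 2N inequalities reduce to an interval intersection. The sketch below uses made-up affine sensitivities u_k = alpha_k + beta_k·r1p in place of the explicit LQR formula; the 0.384 rad (22°) and 1.047 rad/sec (60°/sec) limits approximate the bounds quoted above.

```python
# Interval-intersection sketch of the 2N-constraint feasibility problem
# in the single unknown r1'. alpha/beta are illustrative stand-ins for
# the LQR sensitivities of the planned commands to r1'.

def feasible_r1(r1, alpha, beta, u_prev, defmax, defrmax, dT):
    constraints = []
    prev_a, prev_b = u_prev, 0.0
    for a, b in zip(alpha, beta):
        constraints.append((a, b, defmax))                       # |u_k| <= defmax
        constraints.append((a - prev_a, b - prev_b, defrmax * dT))  # rate limit
        prev_a, prev_b = a, b
    lo, hi = float("-inf"), float("inf")
    for A, B, lim in constraints:        # each says |A + B*r| <= lim
        if abs(B) < 1e-12:
            if abs(A) > lim:
                return None              # inconsistent: must reduce loop gain
            continue
        l, h = (-lim - A) / B, (lim - A) / B
        if B < 0:
            l, h = h, l
        lo, hi = max(lo, l), min(hi, h)
    if lo > hi:
        return None                      # inconsistent: must reduce loop gain
    return min(max(r1, lo), hi)          # feasible value closest to r1

r1p = feasible_r1(r1=0.2, alpha=[0.0, 0.005, 0.01], beta=[1.0, 1.0, 1.0],
                  u_prev=0.0, defmax=0.384, defrmax=1.047, dT=0.01)
print(r1p)   # the pilot's demand, clipped to the feasible interval
```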
TABLE 75.3 Reference Model.
Implementation
The implementation of the advanced tracking control law developed in the preceding sections is discussed. The aircraft model represents the conventionally linearized second-order pitch plane dynamics (short period) of an F-16 aircraft derivative at the M = 0.7 and h = 10,000 ft flight condition. The bare aircraft model represented by Equations 75.38 and 75.39 is given in state-space form in Equation 75.47, where Zα, Zq, Zδ, Mα, Mq, Mδ are the partial derivatives of normal force (Z) and pitching moment (M) with respect to AOA, pitch rate and stabilator deflection.
The system consists of a reference model, a receding horizon LQ controller which also includes a LP algorithm, and the plant model. Integral action is included in the flight control system and is mechanized by augmenting the dynamics, viz., z_{k+1} = z_k + (q − r′). Thus, although the bare aircraft model has two states, with the actuator and integrator included, there are four states: α, q, δe, and z. A parameter estimation module is also included, rendering the flight control system adaptive and reconfigurable. The simulation is performed with both the receding horizon LQ/LP controller and the System Identification module in the loop. The pilot commands are 1.0 sec duration pitch rate pulses of 0.2 rad/sec magnitude, with alternating polarities, at times 0.0, 3.0, and 6 seconds, respectively. At 5.0 seconds, a loss of one horizontal stabilator is simulated. The failure causes a change in the trim condition of the aircraft, which biases the pitch acceleration (q̇) by −0.21 rad/sec². Figure 75.9 is a comparison plot of the performance of two controllers designed to prevent actuator saturation. REF is the desired pitch rate produced by a model of the desired pilot input to pitch rate response. The GAIN curve represents an inner loop gain limiter (0–1) denoted as Ki in Figure 75.10. The Ki value is computed from driving a rate and position limited actuator model and a linear actuator model with the command vector from the receding horizon controller. The ratio of the limited and unlimited value at each time step is computed; the smallest ratio is Ki (see Figure 75.10). The PROG curve is from an LP augmented solution that determined the largest magnitude of the input that can be applied without saturating the actuator. This is equivalent to outer-loop gain Ko attenuation. As can be seen, both PROG and GAIN track very well. At 5.0 seconds the failure occurs with a slightly larger perturbation for PROG than for GAIN.
PROG has a slightly greater overshoot after the failure, primarily due to the reduced equivalent actuator rate.
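The GAIN limiter computation can be sketched as follows: drive both a limited and an unlimited first-order actuator model with the planned command vector, and take the smallest ratio of limited to unlimited response as the inner-loop gain. The 50 ms actuator time constant follows the text; the limits and command vectors are illustrative.

```python
# Sketch of the inner-loop gain limiter: the smallest ratio of the
# limited to the unlimited actuator response over the planned commands.

def gain_limit(commands, u_max=0.384, rate_max=1.047, dt=0.01, tau=0.05):
    ratio_min = 1.0
    u_lin = u_lim = 0.0
    for c in commands:
        u_lin += dt / tau * (c - u_lin)                     # linear actuator
        du = dt / tau * (c - u_lim)
        du = max(-rate_max * dt, min(rate_max * dt, du))    # rate limit
        u_lim = max(-u_max, min(u_max, u_lim + du))         # position limit
        if abs(u_lin) > 1e-9:
            ratio_min = min(ratio_min, abs(u_lim) / abs(u_lin))
    return ratio_min

print(gain_limit([0.05] * 10))        # small commands: no limiting, gain 1.0
print(gain_limit([2.0] * 10) < 1.0)   # large commands: gain attenuated
```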
The stability and control derivatives are defined in Table 75.1. The dynamics entail the algebraic manipulation of the equations
TABLE 75.1
Nominal Model.
of motion to eliminate the α̇ derivatives. Failure or damage to an aerodynamic control surface results in a loss of control authority, which, in turn, exacerbates the actuator saturation problem. A failure is modeled as a 50% loss of horizontal tail area which occurs 5 sec into the "flight." The failure affects all of the short period stability and control derivatives. The short period model's parameters, after failure, are shown in Table 75.2. The reference model has the same structure as the
plant model. A first-order actuator with bandwidth of 20 rad/sec is included. The parameters of M are shown in Table 75.3. Only the pitch rate output of the model is tracked. The actuator output is rate limited to ±1 rad/sec and deflection limited to ±0.37 rad. The block diagram for the complete system is shown in Figure 75.8.

TABLE 75.2 Failure Model.

75.1.10 Nonlinear Inner Loop Design
During high AOA maneuvers the pitch and lateral/directional channels are coupled and nonlinear effects are important. Robust control can be brought to bear on nonlinear control problems. This section shows how QFT and the concept of structured plant parameter uncertainty accommodate the nonlinearities in the "fast" inner loop. The plant is specified by Equations 75.31 to 75.33 with α = α̇ = β = 0. Based on available data, a maximum bound of approximately 30°/sec can be placed on the magnitude of P for a 30 degree AOA velocity vector roll. Structured uncertainty is modeled with a set of LTI plants which describe the uncertain plant over the range of uncertainty.

Figure 75.9 Tracking performance.

Therefore, introducing a roll parameter Pp which varies over the range of 0 to 30°/sec in place of the state variable P in the nonlinear terms allows the roll rate to be treated as structured uncertainty in the plant. Note that the roll rate parameter replaces one of the state variables in the P² term, viz., P² ≈ PpP. With the introduction of the roll rate parameter, a set of LTI plants with three states which model the nonlinear three DOF plant are obtained. The generalized control vector is also three dimensional. The state and input vectors are
    x = [P  q  r]^T,    u = [δar  δep  δec  δTVp  δTVy  δr]^T,

where δar, δep, δec, δTVp, δTVy, and δr are the differential (roll) aileron, differential (pitch) elevator, collective pitch elevator, pitch thrust vectoring, yaw thrust vectoring, and rudder deflections, respectively.
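The frozen-parameter construction just described, replacing the state P by the gridded parameter Pp, can be sketched as follows; the state-matrix structure and every numerical derivative below are hypothetical placeholders for illustration, not the handbook's model:

```python
import numpy as np

# Hypothetical inner-loop model with state x = [p, q, r]. Freezing the roll
# rate at a parameter value Pp turns each nonlinear product of p with a state
# into a linear term, so every grid value of Pp yields one LTI member of the
# structured-uncertainty plant set.
def plant_A(Pp, Lp=-2.0, Mq=-1.5, Nr=-0.8, c=0.3):
    """3x3 state matrix for a frozen roll-rate parameter Pp (rad/sec)."""
    return np.array([[Lp,     0.0,     0.0],
                     [0.0,    Mq,     -c * Pp],   # Pp enters where p multiplied a state
                     [0.0,    c * Pp,  Nr]])

grid_deg = [0.0, 8.0, 16.0, 24.0]   # the four Pp values used in the text, deg/sec
plants = [plant_A(np.deg2rad(Pp)) for Pp in grid_deg]
assert len(plants) == 4 and plants[0].shape == (3, 3)
```

A linear robust design method such as QFT is then applied to the whole set of plants at once, which is how the roll-rate nonlinearity is absorbed as structured uncertainty.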
The entries of the parameter-dependent plant and input matrices are the dimensional stability and control derivatives (Cl, Cm, and Cn terms with respect to the states and the individual surface deflections).
The low q corner of the flight envelope is considered, and four values of the roll rate parameter (Pp = 0, 8, 16, 24) deg/sec are used to represent the variation in roll rate during the roll onset or roll arrest phases [9], [2]. Thus, structured uncertainty is used to account for the nonlinearities introduced by the velocity vector roll maneuver. Finally, the QFT linear robust control design method is used to accommodate the structured uncertainty and design for the flying quality specifications of Section 75.1.5, thus obtaining a linear controller for a nonlinear plant. The weighting matrix, with entries 0.2433, 0.1396, -0.03913, -0.11813, 0, 0, 0, 0.1745, 0.5236,
is used, and the respective roll, pitch, and yaw channel controllers in the G1 block, in pole-zero format, are
Figure 75.10
Actuator antiwindup scheme.
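The rate and deflection limits quoted earlier (±1 rad/sec and ±0.37 rad, with a 20 rad/sec first-order actuator) can be sketched in discrete time; the Euler integration step and the ordering of the two saturations are our own assumptions for illustration, not the handbook's antiwindup scheme:

```python
import numpy as np

def actuator_step(delta, cmd, dt=0.001, bw=20.0, rate_lim=1.0, pos_lim=0.37):
    """One Euler step of a first-order actuator with rate and position limits."""
    rate = np.clip(bw * (cmd - delta), -rate_lim, rate_lim)      # rate limiting
    return float(np.clip(delta + rate * dt, -pos_lim, pos_lim))  # deflection limiting

# A large step command drives the actuator into both limits.
delta, dt = 0.0, 0.001
for _ in range(1000):            # simulate 1 second
    delta = actuator_step(delta, cmd=1.0, dt=dt)
assert abs(delta) <= 0.37        # deflection never exceeds the limit
```

With the limits active, the actuator output ramps at the rate limit and then parks on the deflection limit, which is exactly the regime where antiwindup logic matters.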
The plant dynamics and input matrices are linear but parameter dependent.
75.1. FLIGHT CONTROL OF PILOTED AIRCRAFT
75.1.11
The pilot is closing an additional outer loop about the flight control system, as illustrated in Figure 75.6. This brings into the picture the human operator's weakness, namely, the pilot introduces a transport delay into the augmented flight control system. This problem is anticipated and is factored into the FCS design using the following methods.
Neal-Smith Criterion    This is a longitudinal FCS design criterion which includes control system dynamics and is used in preliminary design. The target tracking mission is considered and the aircraft's handling qualities are predicted. The expected pilot workload is quantified in terms of the required generation of lead or lag by the pilot. These numbers have been correlated with qualitative Pilot Ratings (PR) of the aircraft, according to the Cooper-Harper chart [10], [3], and [4]. Hence, this design method uses a synthetic pilot model in the outer control loop. The augmented "plant" model includes the airframe and FCS dynamics. Thus, the plant model represents the transfer function from stick force Fs to pitch rate q. The augmented longitudinal FCS with the pilot in the loop is illustrated in Figure 75.11.
Figure 75.11    "Paper" pilot.
The "pilot" is required to solve the following parameter optimization problem: Choose a high gain Kp so that the closed-loop system's bandwidth is ≥ 3.5 rad/sec, and at the same time try to set the lead and lag parameters, τp1 and τp2, so that the peak closed-loop response is minimized and the droop at low frequencies is reduced. The closed-loop bandwidth is specified by the frequency where the closed-loop system's phase angle becomes -90°. Hence, the pilot chooses the gain and lead and lag parameters Kp, τp1, and τp2 so that the closed-loop system's frequency response is as illustrated in Figure 75.12. Next, the angle ∠pc
Figure 75.12    Closed-loop frequency response.

Figure 75.13    Pilot rating prediction chart (regions range from sluggish initial response to strong PIO tendencies).
Pilot-Induced Oscillations    Pilot-Induced Oscillations (PIOs) are a major concern in FCS design. PIOs are brought about by the following factors: (1) the RHP zero of the pilot model, (2) an airframe with high ωsp (as is the case at low altitude and high q flight) and low damping ζsp, (3) excessive lag in the FCS (e.g., space shuttle in the landing flare flight condition), (4) high-order compensators in digital FCSs (an effective transport delay effect is created), (5) FCS nonlinearities (interaction of nonlinearity and transport delay will cause oscillations), (6) a stick force gradient that is too low (e.g., F-4), and/or "leading stick" (e.g., F-4, T-38 aircraft), and (7) high-stress missions, where the pilot applies "high gain." The Ralph-Smith criterion [11] has been developed for use during the FCS design cycle, to predict the possible onset of PIOs. Specifically, it is recognized that the pilot's transport delay "eats up" phase. Hence, the PIO tendency is predicted from the open-loop transfer function, and a sufficiently large Phase Margin (PM) is required for PIO prevention. The Ralph-Smith criterion requires that the resonant frequency ωR be determined (see Figure 75.14). The following condition must be satisfied: PM ≥ 14.3 ωR, in which case the aircraft/FCS is not PIO prone.
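A sketch of the PM ≥ 14.3·ωR check might look as follows; here we take ωR as the closed-loop resonant (peak-magnitude) frequency and use a hypothetical open-loop transfer function, both of which are our assumptions for illustration rather than the criterion's exact definitions:

```python
import numpy as np

def ralph_smith_check(num, den, w=np.logspace(-1, 2, 20000)):
    """Test PM >= 14.3*wR for an open loop num/den in the Laplace variable s."""
    s = 1j * w
    L = np.polyval(num, s) / np.polyval(den, s)
    T = L / (1 + L)                              # closed-loop frequency response
    wR = w[np.argmax(np.abs(T))]                 # assumed: resonant = peak |T| frequency
    idx = np.argmin(np.abs(np.abs(L) - 1.0))     # gain-crossover frequency |L| = 1
    pm_deg = 180.0 + np.degrees(np.angle(L[idx]))
    return pm_deg, wR, pm_deg >= 14.3 * wR

# Hypothetical open loop L(s) = 10 / (s (s + 2)); not a plant from the handbook.
pm, wR, ok = ralph_smith_check([10.0], [1.0, 2.0, 0.0])
```

For this example the phase margin (about 35°) falls short of 14.3·ωR (about 40°), so the check flags a PIO tendency.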
is read off the chart. This angle is a measure of the pilot compensation required. If ∠pc > 0, the model predicts that the pilot will need to apply lead compensation to obtain adequate handling qualities. If, however, ∠pc < 0, the model predicts that lag compensation must be applied by the pilot. Thus, the aircraft's handling qualities in pitch are predicted from the chart shown in Figure 75.13, where PR corresponds to the predicted Cooper-Harper pilot rating.
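The Neal-Smith procedure sketched above (pick a high gain for bandwidth, then evaluate the pilot's lead/lag compensation at that bandwidth) might look as follows; the q/Fs plant, the delay value, and the lead/lag settings are all hypothetical stand-ins, not handbook data:

```python
import numpy as np

def closed_loop(w, Kp, tp1, tp2, tau=0.3):
    """Closed-loop frequency response with a Neal-Smith style 'paper pilot'.

    The plant q/Fs is an assumed short-period-like model, 4 / (s^2 + 2s + 4).
    """
    s = 1j * w
    plant = 4.0 / (s**2 + 2.0 * s + 4.0)
    pilot = Kp * np.exp(-tau * s) * (tp1 * s + 1) / (tp2 * s + 1)
    L = pilot * plant
    return L / (1 + L)

w = np.linspace(0.01, 20.0, 40000)
T = closed_loop(w, Kp=2.0, tp1=0.5, tp2=0.1)
phase = np.degrees(np.unwrap(np.angle(T)))
bw = w[np.argmin(np.abs(phase + 90.0))]          # bandwidth: phase crosses -90 deg
# pilot compensation angle contributed by the lead/lag network at the bandwidth
pc_deg = np.degrees(np.angle((0.5j * bw + 1) / (0.1j * bw + 1)))
```

A positive `pc_deg` corresponds to pilot lead, which the rating chart then maps to a predicted Cooper-Harper rating.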
Figure 75.14    Ralph-Smith PIO criterion (open-loop transfer function).

Pilot Role    It is of utmost importance to realize that (fighter) pilots play a crucial role when putting their aircraft through aggressive maneuvers. When the pilot/aircraft man/machine system is investigated, the following are most pertinent:
1. In Handling Qualities work, the problem of which
variable the pilot actually controls is addressed. This is dictated by the prevailing flight condition. In the pitch channel and at low dynamic pressures q, the controlled variables are α or q; at high dynamic pressures, the controlled variables are α̇, nz, or C* = nz + Uq. In the lateral channel, the pilot commands roll rate P. In the directional channel and at low q, the pilot controls the sideslip angle β, and, at high q, the pilot controls ny.
2. The classical analyses of "pilot in the loop" effects are usually confined to single-channel flight control. The discussion centers on the deleterious effects of the e^(-τs) transport delay introduced into the flight control system by the pilot. The transport-delay-caused phase lag can destabilize the augmented FCS system, which includes the pilot in its outer loop. Hence, a linear stability analysis of the augmented flight control system is performed, as outlined in Section 75.1.11. Thus, a very narrow and a very specialized approach is pursued. Unfortunately, this somewhat superficial investigation of control with a "pilot in the loop" only amplifies the obvious and unavoidable drawbacks of manual control of flight vehicles. It is, however, our firm belief that the above rash conclusion is seriously flawed. In simple, "one-dimensional," and highly structured scenarios it might indeed be hard to justify the insertion of a human operator into the control loop, for it is precisely in such environments that automatic machines outperform the pilot. However, in unstructured environments, (automatic) machines have a hard time beating the control prowess of humans. Human operators excel at high-level tasks, where a degree of perception is required. This important facet of "pilot in the loop" operation is further amplified in items 3 and 4 in the sequel.
3. A subtle aspect of the pilot's work during maneuvering flight entails the following. The pilot compensates for deficiencies in the existing control law.
These are caused by discrepancies between the simplified plant model used in control law design and the actual nonlinear dynamics of the aircraft. Obviously, some of this "modeling error" is being accommodated by the "benefits of feedback," and the balance, we believe, is being relegated to "pilot workload." However, the FCS designer's job is to strive to reduce the pilot's workload as much as possible.
4. The analysis of different trim conditions in Section 75.1.3 indicates that, in order to perform operationally meaningful maneuvers, not only the roll rate P, but also Q, R, α, and β need to be controlled. Thus, control laws for the three control axes are simultaneously synthesized in the outer control loop which includes the pilot. Furthermore, since perfect decoupling is not realistically achievable, it is the pilot who synergetically works the FCS's three control channels to perform the required maneuvers. In other words, the pilot performs the nontrivial task of multivariable control. Although the inclusion of a pilot in the flight control loop comes not without a price (a transport time lag is being introduced), the benefits far outweigh this inherent drawback of a human operator, for the pilot brings to the table the intelligent faculty of on-line multivariable control synthesis. Indeed, high AOA and maneuvering flight entails a degree of on-line perception and pattern recognition. The latter is colloquially referred to by pilots as "seat of the pants" flying. Hence stick-and-rudder prowess is not a thing of the past, and the pilot plays a vital role in maneuvering flight.
References
[1] Blakelock, J.H., Automatic Control of Aircraft and Missiles, John Wiley & Sons, New York, 1991.
[2] Pachter, M., Modern Flight Control, AFIT Lecture Notes, 1995; obtainable from the author at AFIT/ENG, 2950 P Street, Wright-Patterson AFB, OH 45433-7765.
[3] MIL-STD-1797A: Flying Qualities of Piloted Aircraft, US Air Force, Feb. 1991.
[4] Roskam, J., Airplane Flight Dynamics, Part 1, Roskam Aviation and Engineering Corp., Lawrence, Kansas, 1979.
[5] Etkin, B., Dynamics of Flight: Stability and Control, John Wiley & Sons, New York, 1982.
[6] Stevens, B.L. and Lewis, F.L., Aircraft Control and Simulation, John Wiley & Sons, New York, 1992.
[7] Reynolds, O.R., Pachter, M., and Houpis, C.H., Full Envelope Flight Control System Design Using QFT, Proceedings of the American Control Conference, pp. 350-354, June 1994, Baltimore, MD; to appear in the AIAA Journal of Guidance, Control and Dynamics.
[8] Chandler, P.R., Mears, M., and Pachter, M., A Hybrid LQR/LP Approach for Addressing Actuator Saturation in Feedback Control, Proceedings of the Conference on Decision and Control, pp. 3860-3867, 1994, Orlando, FL.
[9] Boyum, K.E., Pachter, M., and Houpis, C.H., High Angle of Attack Velocity Vector Rolls, Proceedings of the 13th IFAC Symposium on Automatic Control in Aerospace, pp. 51-57, 1994, Palo Alto, CA, and Control Engineering Practice, 3(8), 1087-1093, 1995.
[10] Neal, T.P. and Smith, R.E., An In-Flight Investigation to Develop Control System Design Criteria for Fighter Airplanes, AFFDL-TR-70-74, Vols. 1 and 2, Air Force Flight Dynamics Laboratory, Wright-Patterson AFB, 1970.
[11] Chalk, C.R., Neal, T.P., and Harris, T.M., Background Information and User Guide for MIL-F-8785B, Military Specifications and Flying Qualities of Piloted Airplanes, AFFDL-TR-69-72, Air Force Flight Dynamics Laboratory, Wright-Patterson AFB, August 1969.
Further Reading
a) Monographs
1. C. D. Perkins and R. E. Hage, "Airplane Performance, Stability and Control," Wiley, New York, 1949.
2. "Dynamics of the Airframe," Northrop Corporation, 1952.
3. W. R. Kolk, "Modern Flight Dynamics," Prentice Hall, 1961.
4. B. Etkin, "Dynamics of Atmospheric Flight," Wiley, 1972.
5. D. McRuer, I. Ashkenas, and D. Graham, "Aircraft Dynamics and Automatic Control," Princeton University Press, Princeton, NJ, 1973.
6. J. Roskam, "Airplane Flight Dynamics, Part 2," Roskam Aviation, 1979.
7. A. W. Babister, "Aircraft Dynamic Stability and Response," Pergamon Press, 1980.
8. R. C. Nelson, "Flight Stability and Automatic Control," McGraw-Hill, 1989, Second Edition.
9. D. McLean, "Automatic Flight Control Systems," Prentice Hall, 1990.
10. E. H. J. Pallett and S. Coyle, "Automatic Flight Control," Blackwell, 1993, Fourth Edition.
11. A. E. Bryson, "Control of Spacecraft and Aircraft," Princeton University Press, Princeton, NJ, 1994.
12. "Special Issue: Aircraft Flight Control," International Journal of Control, Vol. 59, No. 1, January 1994.
b) The reader is encouraged to consult the bibliography listed in References [8], [9], and [10] in the text.
75.2 Spacecraft Attitude Control

Vincent Coppola, Department of Aerospace Engineering, The University of Michigan, Ann Arbor, MI
N. Harris McClamroch, Department of Aerospace Engineering, The University of Michigan, Ann Arbor, MI

75.2.1 Introduction

The purpose of this chapter is to provide an introductory account of spacecraft attitude control and related control problems. Attention is given to spacecraft kinematics and dynamics, the control objectives, and the sensor and control actuation characteristics. These factors are combined to develop specific feedback control laws for achieving the control objectives. In particular, we emphasize the interplay between the spacecraft kinematics and dynamics and the control law design, since we believe that the particular attributes of the spacecraft attitude control problem should be exploited in any control law design. We do not consider specific spacecraft designs or implementations of control laws using specific sensor and actuation hardware. Several different rotational control problems are considered. These include the problem of transferring the spacecraft to a desired, possibly time-variable, angular velocity and maintaining satisfaction of this condition without concern for the orientation. A special case is the problem of transferring the spacecraft to rest. A related but different problem is to bring the spacecraft to a desired, possibly time-variable, orientation and to maintain satisfaction of this condition. A special case is the problem of bringing the spacecraft to a constant orientation. We also consider the spin stabilization problem, where the spacecraft is desired to have a constant spin rate about a specified axis of rotation. All of these control problems are considered, and we demonstrate that there is a common framework for their study. Our approach is to provide a careful development of the spacecraft dynamics and kinematics equations and to indicate the assumptions under which they are valid.
We then provide several general control laws, stated in terms of certain gain constants. No attempt has been made to provide algorithms for selecting these gains, but standard optimal control and robust control approaches can usually be used. Only general principles are mentioned that indicate how these control laws are obtained; the key is that they result in closed-loop systems that can be studied using elementary methods to guarantee asymptotic stability or some related asymptotic property.
75.2.2 Modeling

The motion of a spacecraft consists of its orbital motion, governed by translational equations, and its attitude motion, governed by rotational equations. An inertial frame is chosen to be at the center of mass of the orbited body and nonrotating with respect to some reference (e.g., the polar axis of the earth or the fixed stars). Body axes x, y, z are chosen as a reference frame fixed
in the spacecraft body with origin at its center of mass. For spacecraft in circular orbits, a local horizontal-vertical reference frame is used to measure radial pointing. The local vertical is defined to be radial from the center of mass of the orbited body with the local horizontal aligned with the spacecraft's velocity. The spacecraft is modeled as a rigid body. The rotational inertia matrix is assumed to be constant with respect to the x, y, z axes. This rigidity assumption is rather strong considering that many real spacecraft show some degree of flexibility and/or internal motion, for example, caused by fuel slosh. These nonrigidity effects may sometimes be modeled as disturbances of the rigid spacecraft. We assume that gravitational attraction is the dominant force experienced by the spacecraft. Since the environmental forces (e.g., solar radiation, magnetic forces) are very weak, the spacecraft's orbital motion is well modeled as an ideal two-body, purely Keplerian orbit (i.e., a circle or ellipse), at least for short time periods. Thus, the translational motion is assumed decoupled from the rotational motion. The rotational motion of the spacecraft responds to control moments and moments arising from environmental effects such as gravity gradients. In contrast to the translational equations, the dominant moment for rotational motion is not environmental, but is usually the control moment. In such cases, all environmental influences are considered as disturbances.
75.2.3 Spacecraft Attitude Sensors and Control Actuators

There are many different attitude sensors and actuators used in controlling spacecraft. Sensors provide indirect measurements of orientation or rate; models of the sensor (and sometimes of the spacecraft itself) can be used to compute orientation and rate from available measurements. Actuators are used to control the moments applied to influence the spacecraft rotational motion in some desired way. Certain sensors and actuators applicable for one spacecraft mission may be inappropriate for another, depending on the spacecraft characteristics and the mission requirements. A discussion of these issues can be found in [5]. Several types of sensors require knowledge of the spacecraft orbital motion. These include sun sensors, horizon sensors, and star sensors, which measure orientation angles. The measurement may be taken using optical telescopes, infrared radiation, or radar. Knowledge of star (and possibly planet or moon) positions may also be required to determine inertial orientation. Magnetometers provide a measurement of orientation based upon the magnetic field. The limited ability to provide highly accurate models of the magnetosphere limits the accuracy of these devices. Gyroscopes mounted within the spacecraft can provide measurements of both angular rate and orientation. Rate gyros measure angular rate; integrating gyros provide a measure of angular orientation. The gyroscope rotor spins about its symmetry axis at a constant rate in an inertially fixed direction. The rotor is supported by one or two gimbals attached to the spacecraft. The gimbals move as the spacecraft rotates about the rotor. Angular position and rate are measured by the movement of the gimbals. The
model of the gyroscope often ignores the inertia of the gimbals and the friction in the bearings; these effects cause the direction of the gyro to drift over time. An inertial measurement unit (IMU) consists of three mutually perpendicular gyroscopes mounted to a platform, either in gimbals or strapped down. Using feedback of the gyro signals, motors apply moments to the gyros to make the angular velocity of the platform zero. The platform then becomes an inertial reference. The Global Positioning System (GPS) allows for very precise measurements of the orbital position of earth satellites. Two GPS receivers, located sufficiently far apart on the satellite, can be used to derive attitude information based on the phase shift of the signals from the GPS satellites. The orientation and angular rates are computed based upon a model of the rotational dynamics of the satellite.
Gas-jet thrusters are commonly employed as actuators for spacecraft attitude control. At least 12 thrust chambers are needed to provide three-axis rotational control. Each axis is controlled by two pairs of thrusters: one pair provides clockwise moment; the other pair, counterclockwise. A thruster pair consists of two thrusters that operate simultaneously at the same thrust level but in opposite directions. They create no net force on the spacecraft but do create a moment about an axis perpendicular to the plane containing the thrust directions. Thrusters may operate continuously or in full-on, full-off modes. Although the spacecraft loses mass during thruster firings, it is often negligible compared to the mass of the spacecraft and is ignored in the equations of motion. Another important class of actuators used for attitude control are reaction wheel devices. Typically, balanced reaction wheels are mounted on the spacecraft so that their rotational axes are rigidly attached to the spacecraft. As a reaction wheel is spun up by an electric motor rigidly attached to the spacecraft, there is a reaction moment on the spacecraft. These three reaction moments provide the control moments on the spacecraft. In some cases, the effects of the electric motor dynamics are significant; these dynamics are ignored in this chapter.
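The zero-net-force, pure-couple property of a thruster pair can be checked directly; the positions, thrust levels, and directions below are hypothetical, chosen only to illustrate the geometry:

```python
import numpy as np

# Two thrusters at positions r1, r2 (meters from the center of mass), firing
# with equal thrust magnitude (newtons) in opposite directions.
r1, r2 = np.array([1.0, 0.5, 0.0]), np.array([-1.0, -0.5, 0.0])
f1, f2 = np.array([0.0, 0.0, 10.0]), np.array([0.0, 0.0, -10.0])

net_force = f1 + f2                                # cancels: a pure couple
moment = np.cross(r1, f1) + np.cross(r2, f2)       # net moment on the spacecraft

assert np.allclose(net_force, 0.0)
assert moment[2] == 0.0   # moment is perpendicular to the (z) thrust direction
```

The resulting moment lies perpendicular to the plane containing the thrust directions, which is the property the text uses to assign thruster pairs to rotation axes.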
75.2.4 Spacecraft Rotational Kinematics

The orientation or attitude of a rigid spacecraft can be expressed by a 3 × 3 rotation matrix R [3], [4]. Since a body-fixed x, y, z frame is rigidly attached to the spacecraft, the orientation of the spacecraft is the orientation of the body frame expressed with respect to a reference frame X, Y, Z. The columns of the rotation matrix are the components of the three standard basis vectors of the X, Y, Z frame expressed in terms of the three body-fixed standard basis vectors of the x, y, z frame. It can be shown that rotation matrices are necessarily orthogonal matrices; that is, they have the property that

    R R^T = I   and   R^T R = I,
(75.48)
where I is the 3 × 3 identity matrix, and det(R) = 1. Upon differentiating with respect to time, we obtain

    (dR/dt) R^T + R (dR/dt)^T = 0.    (75.49)
Consequently,

    (dR/dt) R^T = -[(dR/dt) R^T]^T    (75.50)

is a skew-symmetric matrix. It can be shown that this skew-symmetric matrix can be expressed in terms of the components of the angular velocity vector in the body-fixed coordinate frame,

    (dR/dt) R^T = S(ω),    (75.51)

where

             [  0    ωz  -ωy ]
    S(ω)  =  [ -ωz    0   ωx ]    (75.52)
             [  ωy  -ωx    0 ]

Consequently, the spacecraft attitude is described by the kinematics equation

    dR/dt = S(ω) R.    (75.53)

This linear matrix differential equation describes the spacecraft attitude time dependence. If the angular velocity vector is a given vector function of time, and if an initial attitude is specified by a rotation matrix, then the matrix differential Equation 75.53 can be integrated using the specified initial data to obtain the subsequent spacecraft attitude. The solution of this matrix differential equation must necessarily be an orthogonal matrix at all instants of time. It should be noted that this matrix differential equation can also be written as nine scalar linear differential equations, but if these scalar equations are integrated, say, numerically, care must be taken to guarantee that the resulting solution satisfies the orthogonality property. In other words, the nine scalar entries in a rotation matrix are not independent. The description of the spacecraft attitude in terms of a rotation matrix is conceptually natural and elegant. However, use of rotation matrices directly presents difficulties in computations and in physical interpretation. Consequently, other descriptions of attitude have been developed. These descriptions can be seen as specific parameterizations of rotation matrices using fewer than nine parameters. The most common parameterization involves the use of three Euler angle parameters. Although there are various definitions of Euler angles, we introduce the most common definition (the 3-2-1 definition) that is widely used in spacecraft analyses. The three Euler angles are denoted by Ψ, θ, and φ, and are referred to as the yaw angle, the pitch angle, and the roll angle, respectively. A general spacecraft orientation defined by a rotation matrix can be achieved by a sequence of three elementary rotations, beginning with the body-fixed coordinate frame coincident with the reference coordinate frame, defined as follows:

1. A rotation of the spacecraft about the body-fixed z axis by a yaw angle Ψ
2. A rotation of the spacecraft about the body-fixed y axis by a pitch angle θ
3. A rotation of the spacecraft about the body-fixed x axis by a roll angle φ

It can be shown that the rotation matrix can be expressed in terms of the three Euler angle parameters Ψ, θ, φ according to the relationship [4]

        [           cos θ cos Ψ                      cos θ sin Ψ              -sin θ     ]
    R = [ sin φ sin θ cos Ψ - cos φ sin Ψ   sin φ sin θ sin Ψ + cos φ cos Ψ   sin φ cos θ ]    (75.54)
        [ cos φ sin θ cos Ψ + sin φ sin Ψ   cos φ sin θ sin Ψ - sin φ cos Ψ   cos φ cos θ ]

The components of a vector, expressed in terms of the x, y, z frame, are the product of the rotation matrix and the components of that same vector, expressed in terms of the X, Y, Z frame. One of the deficiencies in the use of the Euler angles is that they do not provide a global parameterization of rotation matrices. In particular, the Euler angles are restricted to the range

    -π < Ψ ≤ π,   -π/2 < θ < π/2,   -π < φ ≤ π    (75.55)

in order to avoid singularities in the above representation. This limitation is serious in some attitude control problems and motivates the use of other attitude representations. Other attitude representations that are often used include axis-angle variables and quaternion or Euler parameters. The latter representations are globally defined but involve the use of four parameters rather than three. Attitude control problems can be formulated and solved using these alternative attitude parameterizations. In this chapter, we make use only of the Euler angle approach; however, we are careful to point out the difficulties that can arise by using Euler angles. The spacecraft kinematic equations relate the components of the angular velocity vector to the rates of change of the Euler angles. It can also be shown [4] that the angular velocity components in the body-fixed coordinate frame can be expressed in terms of the rates of change of the Euler angles as

    ωx = dφ/dt - (dΨ/dt) sin θ
    ωy = (dθ/dt) cos φ + (dΨ/dt) cos θ sin φ    (75.56)
    ωz = -(dθ/dt) sin φ + (dΨ/dt) cos θ cos φ

Conversely, the rates of change of the Euler angles can be expressed in terms of the components of the angular velocity vector as

    dφ/dt = ωx + (ωy sin φ + ωz cos φ) tan θ
    dθ/dt = ωy cos φ - ωz sin φ    (75.57)
    dΨ/dt = (ωy sin φ + ωz cos φ) / cos θ

These equations are referred to as the spacecraft kinematics equations; they are used subsequently in the development of attitude control laws.
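The orthogonality caveat for numerically integrating the matrix kinematics equation can be illustrated with a sketch that re-orthonormalizes after each step; the S(ω) sign convention follows the equations above, and the SVD projection and RK4 scheme are our choices for illustration:

```python
import numpy as np

def S(w):
    """Skew-symmetric matrix such that dR/dt = S(w) R (body components of omega)."""
    wx, wy, wz = w
    return np.array([[0.0,  wz, -wy],
                     [-wz, 0.0,  wx],
                     [ wy, -wx, 0.0]])

def integrate_R(R, w_of_t, t0, t1, n=2000):
    """Fixed-step RK4 on the matrix ODE, projecting back onto orthogonal matrices."""
    dt = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = S(w_of_t(t)) @ R
        k2 = S(w_of_t(t + dt / 2)) @ (R + dt / 2 * k1)
        k3 = S(w_of_t(t + dt / 2)) @ (R + dt / 2 * k2)
        k4 = S(w_of_t(t + dt)) @ (R + dt * k3)
        R = R + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        U, _, Vt = np.linalg.svd(R)   # nearest orthogonal matrix
        R = U @ Vt
        t += dt
    return R

# Constant roll rate of 0.1 rad/sec for 1 second from the identity attitude.
R = integrate_R(np.eye(3), lambda t: np.array([0.1, 0.0, 0.0]), 0.0, 1.0)
assert np.allclose(R @ R.T, np.eye(3), atol=1e-9)   # orthogonality preserved
```

For this constant-rate case the result matches the Euler-angle rotation matrix with φ = 0.1 rad and θ = Ψ = 0, which is a useful consistency check between the matrix and Euler-angle descriptions.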
75.2.5 Spacecraft Rotational Dynamics

We consider the rotational dynamics of a rigid spacecraft. The body-fixed coordinates are chosen to be coincident with the principal axes of the spacecraft. We first consider the case where three pairs of thrusters are employed for attitude control. We next consider the case where three reaction wheels are employed for attitude control; under appropriate assumptions, the controlled spacecraft dynamics for the two cases are identical. The uncontrolled spacecraft dynamics are briefly studied. We first present a model for the rotational dynamics of a rigid spacecraft controlled by thrusters as shown in Figure 75.15.
Figure 75.15    Spacecraft with gas-jet thrusters.

Figure 75.16    Spacecraft with three reaction wheels.

These dynamic equations are most naturally expressed in body-fixed coordinates. We further assume that the body-fixed coordinate frame is selected to be coincident with the spacecraft principal axes and that there are three pairs of thrusters that produce attitude control moments about these three principal axes. The resulting spacecraft dynamics are given by

    Ixx dωx/dt = (Iyy - Izz) ωy ωz + ux + Tx
    Iyy dωy/dt = (Izz - Ixx) ωz ωx + uy + Ty    (75.58)
    Izz dωz/dt = (Ixx - Iyy) ωx ωy + uz + Tz

where (ωx, ωy, ωz) are the components of the spacecraft angular velocity vector, expressed in body-fixed coordinates; (ux, uy, uz) are the components of the control moments due to the thrusters about the spacecraft principal axes; and (Tx, Ty, Tz) are the components of the external disturbance moment vector, expressed in body-fixed coordinates. The terms Ixx, Iyy, Izz are the constant moments of inertia of the spacecraft with respect to its principal axes. We now present a model for the rotational dynamics of a rigid spacecraft controlled by reaction wheels as shown in Figure 75.16. As before, we express the rotational dynamics in body-fixed coordinates. We further assume that the body-fixed coordinate frame is selected to be coincident with the spacecraft principal axes and that there are three balanced reaction wheels that produce attitude control moments; the rotation axes of the reaction wheels are the spacecraft principal axes. The resulting spacecraft dynamics are given by [2]
and the dynamics of the reaction wheels are given by
where (ωx, ωy, ωz) are the components of the spacecraft angular velocity vector, expressed in body-fixed coordinates; (vx, vy, vz) are the relative angular velocities of the reaction wheels with respect to their respective axes of rotation (the spacecraft principal axes); (ux, uy, uz) are the control moments developed by the electric motors rigidly mounted on the spacecraft with shafts aligned with the spacecraft principal axes connected to the respective reaction wheels; and (Tx, Ty, Tz) are the components of the external disturbance moment vector, expressed in body-fixed coordinates. Letting (Īxx, Īyy, Īzz) denote the principal moments of inertia of the spacecraft, Ixx = Īxx - Jx, Iyy = Īyy - Jy, Izz = Īzz - Jz, where (Jx, Jy, Jz) are the (polar) moments of inertia of the reaction wheels. Our interest is in attitude control of the spacecraft, so that Equation 75.60 for the reaction wheels does not play a central role. Note that if the terms in Equation 75.59 that explicitly
couple the spacecraft dynamics and the reaction wheel dynamics are assumed to be small so that they can be ignored, then Equation 75.59 reduces formally to Equation 75.58. Our subsequent control development makes use of Equation 75.58, or equivalently, the simplified form of Equation 75.59, so that the development applies to spacecraft attitude control using either thrusters or reaction wheels. In the latter case, it should be kept in mind that the coupling terms in Equation 75.59 are ignored as small. If these terms in Equation 75.59 are in fact not small, suitable modifications can be made to our subsequent development to incorporate the effect of these additional terms. One conceptually simple approach is to introduce a priori feedback loops that eliminate these coupling terms. Disturbance moments have been included in the above equations, but it is usual to ignore external disturbances in the control design. Consequently, we subsequently make use of the controlled spacecraft dynamics given by

    Ixx dωx/dt = (Iyy - Izz) ωy ωz + ux
    Iyy dωy/dt = (Izz - Ixx) ωz ωx + uy    (75.61)
    Izz dωz/dt = (Ixx - Iyy) ωx ωy + uz
where it is assumed that the disturbance moment vector (Tx, Ty, Tz) = (0, 0, 0). If there are no control or disturbance moments on the spacecraft, then the rotational dynamics are described by the Euler equations

    Ixx dωx/dt = (Iyy - Izz) ωy ωz
    Iyy dωy/dt = (Izz - Ixx) ωz ωx    (75.62)
    Izz dωz/dt = (Ixx - Iyy) ωx ωy
These equations govern the uncontrolled angular velocity of the spacecraft. Both the angular momentum magnitude H and the rotational kinetic energy T, given by

    H² = (Ixx ωx)² + (Iyy ωy)² + (Izz ωz)²,    T = (1/2)(Ixx ωx² + Iyy ωy² + Izz ωz²),
are constants of the motion. These constants describe two ellipsoids in the (ωx, ωy, ωz) space with the actual motion constrained to lie on their intersection. Since the intersection occurs as closed one-dimensional curves (or points), the angular velocity components (ωx, ωy, ωz) are periodic time functions. Point intersections occur when two of the three components are zero: the body is said to be in a simple spin about the third body axis. Moreover, the spin axis is aligned with the (inertially fixed) angular momentum vector in the simple spin case. Further analysis shows that simple spins about either the minor or major axis (i.e., the axes with the least and greatest moment of inertia, respectively) in Equation 75.62 are Lyapunov stable, while a simple spin about the intermediate axis is unstable. However, the minor axis spin occurs when the kinetic energy is maximum for a given magnitude of the angular momentum; thus, if dissipation effects are considered, spin about the minor axis becomes unstable. This observation leads to the major-axis rule:
An energy-dissipating rigid body eventually arrives at a simple spin about its major axis. Overlooking this rule had devastating consequences for the U.S.' first orbital satellite, Explorer I. The orientation of the body with respect to inertial space is described in terms of the Euler angle responses Ψ, θ, φ [3]. This is best understood when the inertial directions are chosen so that the angular momentum vector lies along the inertial Z axis. Then the x axis is said to be precessing about the angular momentum vector at the rate dΨ/dt (which is always positive) and nutating at the rate dθ/dt (which is periodic and single signed). The angle (π/2 - θ) is called the coning or nutation angle. The motion of the x axis in inertial space is periodic in time; however, in general, the orientation of the body is not periodic because the spin angle φ is not commensurate with the periodic motion of the x axis. Although simple spins about major/minor axes are Lyapunov stable to perturbations in the angular velocities, neither is stable with respect to perturbations in orientation since the angular momentum vector cannot resist angular rate perturbations about itself [4]. However, it can resist perturbations orthogonal to itself. Hence, both the major and minor axes are said to be linearly directionally stable (when dissipation is ignored) since small orientation perturbations of the simple spin do not cause large deviations in the direction of the spin axis. That is, the direction of the spin axis is linearly stable.
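The two conserved quantities invoked above can be checked numerically for the torque-free Euler equations; the inertia values and initial rates below are hypothetical, chosen only so that the three principal moments are distinct:

```python
import numpy as np

I = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments, Ixx < Iyy < Izz

def euler_rates(w):
    """Torque-free Euler equations in principal axes."""
    wx, wy, wz = w
    return np.array([(I[1] - I[2]) * wy * wz / I[0],
                     (I[2] - I[0]) * wz * wx / I[1],
                     (I[0] - I[1]) * wx * wy / I[2]])

def rk4(w, dt, n):
    """Fixed-step RK4 integration of the angular velocity."""
    for _ in range(n):
        k1 = euler_rates(w)
        k2 = euler_rates(w + dt / 2 * k1)
        k3 = euler_rates(w + dt / 2 * k2)
        k4 = euler_rates(w + dt * k3)
        w = w + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return w

w0 = np.array([0.3, 1.0, 0.2])      # mostly about the intermediate axis
w1 = rk4(w0, 1e-3, 20000)           # integrate for 20 seconds

H2 = lambda w: np.sum((I * w) ** 2)        # |H|^2, the momentum ellipsoid
T = lambda w: 0.5 * np.sum(I * w ** 2)     # kinetic energy ellipsoid
assert abs(H2(w1) - H2(w0)) < 1e-6 * H2(w0)
assert abs(T(w1) - T(w0)) < 1e-6 * T(w0)
```

The trajectory wanders far from the initial intermediate-axis spin while staying on the intersection of the two ellipsoids, which is the geometric picture behind the stability statements above.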
75.2.6 Linearized Spacecraft Rotational Equations
The spacecraft kinematics and dynamics are described by nonlinear differential Equations 75.57 and 75.61. In this section, we develop linear differential equations that serve as good approximations for the spacecraft kinematics and dynamics in many instances. We first consider a linearization of the spacecraft equations near a rest solution. Suppose that there are no external moments applied to the spacecraft; we note that the spacecraft can remain at rest in any fixed orientation. This corresponds to a constant attitude and zero angular velocity. If the reference coordinate frame defines this constant attitude, then it is easily seen that
satisfy the spacecraft kinematics and dynamics given by Equations 75.53 and 75.61. Equivalently, this reference attitude corresponds to the Euler angles being identically zero
and it is easily seen that the kinematics Equation 75.56 is trivially satisfied. Now if we assume that applied control moments do not perturb the spacecraft too far from its rest solution, then we can assume that the components of the spacecraft angular velocity vector are small; thus, we ignore the product terms to obtain the linearized approximation to the spacecraft dynamics
THE CONTROL HANDBOOK
Similarly, by assuming that the Euler angles and their rates of change are small, we make the standard small angle assumptions to obtain the linearized approximation to the spacecraft kinematics equations
dφ/dt = ωx,  dθ/dt = ωy,  dΨ/dt = ωz.  (75.65)
Thus, the linearized dynamics and kinematics equations can be combined in a decoupled form, Equation 75.66, that approximates the rolling, pitching, and yawing motions.
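In the combined decoupled form, each axis reduces to a double integrator: moment in, angle out (e.g., Ixx d²φ/dt² = ux for the roll axis). The sketch below (all numbers are illustrative, not from the text) checks this against the closed-form parabolic response to a constant moment.

```python
# Near rest, each axis of the combined linearized model is a double integrator:
# Ixx*phi'' = ux (similarly for theta and psi). A constant moment then gives
# phi(t) = 0.5*(ux/Ixx)*t^2; we verify with a simple fixed-step integration.
IXX = 2.0            # illustrative moment of inertia (kg*m^2)
ux, dt = 0.1, 1e-4   # constant control moment (N*m), integration step (s)
phi, phidot = 0.0, 0.0
for _ in range(10000):          # integrate 1 second
    phidot += (ux / IXX) * dt
    phi += phidot * dt
print(phi)                      # close to 0.5*(0.1/2.0)*1.0**2 = 0.025
```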
We next present a linearized model for the spacecraft equations near a constant spin solution. Suppose that there is no control moment applied to the spacecraft. Another solution for the spacecraft motion, in addition to the rest solution, corresponds to a simple spin at a constant rate of rotation about an inertially fixed axis. To be specific, we assume that the spacecraft has a constant spin rate about the inertially fixed Z axis; the corresponding solution of the spacecraft kinematics Equation 75.57 and the spacecraft dynamics Equation 75.61 is given by (Ψ, θ, φ) = (Ωt, 0, 0), (ωx, ωy, ωz) = (0, 0, Ω). Now if we assume that the applied control moments do not perturb the spacecraft motion too far from this constant spin solution, then we can assume that the angular velocity components ωx, ωy, and ωz − Ω are small. Thus, we can ignore the products of small terms to obtain the linearized approximation to the spacecraft dynamics
Similarly, by assuming that the Euler angles θ and φ (but not Ψ) are small, we obtain the linearized approximation to the spacecraft kinematics
which makes clear that the yawing motion (corresponding to the motion about the spin axis) is decoupled from the pitching and rolling motion. There is little loss of generality in assuming that the nominal spin axis is the inertially fixed Z axis; this is simply a matter of defining the inertial reference frame in a suitable way. This choice has been made since it leads to a simple formulation in terms of the Euler angles.
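The stability claims of Section 75.2.5 can be read directly off such a linearization. A small sketch, assuming the standard form of the transverse linearized dynamics about a z-axis spin (the inertia values below are illustrative): the sign of the squared eigenvalue decides between oscillation (imaginary roots) and divergence (a positive real root).

```python
def spin_eigensquared(Ixx, Iyy, Izz, Omega):
    """lambda^2 for the transverse linearized dynamics about a z-axis spin
    at rate Omega, assuming the standard torque-free form:
      Ixx*dwx/dt = (Iyy - Izz)*Omega*wy,  Iyy*dwy/dt = (Izz - Ixx)*Omega*wx."""
    return Omega**2 * (Iyy - Izz) * (Izz - Ixx) / (Ixx * Iyy)

# Major-axis spin (Izz largest): lambda^2 < 0, purely imaginary roots -> oscillatory.
major = spin_eigensquared(1.0, 2.0, 3.0, 0.5)
# Intermediate-axis spin (Izz between Ixx and Iyy): lambda^2 > 0 -> unstable.
inter = spin_eigensquared(1.0, 3.0, 2.0, 0.5)
print(major, inter)
```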
75.2.7 Control Specifications and Objectives
There are a number of possible rotational control objectives that can be formulated. In this section, several common problems are described. In all cases, our interest, for practical reasons, is to use feedback control. We use the control actuators to generate control moments that cause a desired spacecraft rotational motion; operation of these control actuators depends on feedback of the spacecraft rotational motion variables, which are obtained from the sensors. Most commonly, these sensors provide instantaneous measurements of the angular velocity vector (with respect to the body-fixed coordinate frame) and the orientation, or equivalently, the Euler angles. The use of feedback control is natural for spacecraft rotational control applications; the same control law can be used to achieve and maintain satisfaction of the control objective in spite of certain types of model uncertainties and external disturbances. The desirable features of feedback are best understood in classical control theory, but the benefits of feedback can also be obtained for the fundamentally nonlinear rotational control problems that we consider here. Conversely, feedback, if applied unwisely, can do great harm. Consequently, we give considerable attention to a careful description of the feedback control laws and to justification for those control laws via specification of formal closed-loop stability properties. This motivates our use of mathematical models and careful analysis and design, thereby avoiding the potential difficulties associated with ad hoc or trial-and-error based control design. We first consider spacecraft control problems where the control objectives are specified in terms of the spacecraft angular velocity.
A general angular velocity control objective is that the spacecraft angular velocity vector exactly track a specified angular velocity vector in the sense that the angular velocity error is brought to zero and maintained at zero. If (ωxd, ωyd, ωzd) denotes the desired, possibly time-variable, angular velocity vector, this control objective is described by the asymptotic condition that (ωx, ωy, ωz) → (ωxd, ωyd, ωzd) as t → ∞.
One special case of this general control objective corresponds to bringing the spacecraft angular velocity vector to zero and maintaining it at zero. In this case, the desired angular velocity vector is zero, so the control objective is described by the asymptotic condition that (ωx, ωy, ωz) → (0, 0, 0) as t → ∞. Another example of this general control objective corresponds to specifying that the spacecraft spin about a specified axis fixed to the spacecraft with a given angular velocity. For example, if (ωxd, ωyd, ωzd) = (0, 0, Ω), the control objective is that (ωx, ωy, ωz) → (0, 0, Ω) as t → ∞, so that the spacecraft asymptotically spins about its body-fixed z axis with a constant angular velocity Ω. In all of the above cases, the spacecraft orientation is of no concern. Consequently, control laws that achieve these control objectives make use solely of the spacecraft dynamics.
We next consider spacecraft control problems where the control objectives are specified in terms of the spacecraft attitude. A more general class of control objectives involves the spacecraft attitude as well as the spacecraft angular velocity. In particular, suppose that it is desired that the spacecraft orientation exactly track a desired orientation in the sense that the attitude error is brought to zero and maintained at zero. The specified orientation may be time variable. If the desired orientation is described in terms of a rotation matrix function of time Rd, then the control objective is described by the asymptotic condition that
R → Rd as t → ∞,
or equivalently, if we assume that Rd is parameterized by the desired Euler angle functions (Ψd, θd, φd), that
(Ψ, θ, φ) → (Ψd, θd, φd) as t → ∞.
Since the specified matrix Rd is a rotation matrix, it follows that there is a corresponding angular velocity vector ωd = (ωxd, ωyd, ωzd) that satisfies dRd/dt = S(ωd)Rd. Consequently, it follows that the spacecraft angular velocity vector must also satisfy the asymptotic condition
(ωx, ωy, ωz) → (ωxd, ωyd, ωzd) as t → ∞.
Control laws that achieve the above control objectives should make use of both the spacecraft kinematics and the spacecraft dynamics. Special cases of such control objectives are given by the requirements that the spacecraft always point at the earth's center, at a fixed star, or at another spacecraft.
An important special case of the general attitude control objective is to bring the spacecraft attitude to a specified constant orientation and to maintain it at that orientation. Let Rd denote the desired constant spacecraft orientation; then the control objective is described by the asymptotic condition that
R → Rd as t → ∞.
Letting Rd be parameterized by the desired constant Euler angles (Ψd, θd, φd), the control objective can be described by the asymptotic condition that
(Ψ, θ, φ) → (Ψd, θd, φd) as t → ∞.
If the reference coordinate frame is selected to define the desired spacecraft orientation, then Rd = I, the 3 by 3 identity matrix, (Ψd, θd, φd) = (0, 0, 0), and the control objective is
(Ψ, θ, φ) → (0, 0, 0) as t → ∞.
Since the desired orientation is constant, it follows that the angular velocity vector must be brought to zero and maintained at zero; that is, (ωx, ωy, ωz) → (0, 0, 0) as t → ∞.
Another control objective corresponds to specifying that the spacecraft spin about a specified axis that is inertially fixed, with a given angular velocity Ω. If the reference coordinate frame is chosen so that it is an inertial frame and its positive Z axis is the spin axis, then it follows that the control objective can be described by the asymptotic condition that
(θ, φ) → (0, 0) as t → ∞,  (ωx, ωy, ωz) → (0, 0, Ω) as t → ∞,
which guarantees that the spacecraft tends to a constant angular velocity about its body-fixed z axis and that its z axis is aligned with the inertial Z axis. Control laws that achieve the above control objectives should make use of both the spacecraft kinematics and the spacecraft dynamics.
An essential part of any control design process is performance specifications. When the spacecraft control inputs are adjusted automatically according to a specified feedback control law, the resulting system is a closed-loop system. In terms of control law design, performance specifications are naturally imposed on the closed-loop system. Since the spacecraft kinematics and dynamics are necessarily nonlinear in their most general form, closed-loop specifications must be given in somewhat nonstandard form. There are no uniformly accepted technical performance specifications for this class of control problems, but it is generally accepted that the closed loop should exhibit rapid transient response, good steady-state accuracy, good robustness to parameter uncertainties, a large domain of attraction, and the ability to reject certain classes of external disturbances. These conceptual control objectives can be quantified, at least if the rotational motions of the closed loop are sufficiently small so that the dynamics and kinematics are adequately described by linear models. In that case, good transient response depends on the eigenvalues or characteristic roots. Desired closed-loop properties can be specified even if there are uncertainties in the models and external disturbances of certain classes. Control design to achieve performance specifications, such as steady-state accuracy, robustness, and disturbance rejection, has been extensively treated in the theoretical literature, for both linear and nonlinear control systems, and is not explicitly studied here.
We note that if there are persistent disturbances, then there may be nonzero steady-state errors for the control laws that we subsequently propose; those control laws can easily be modified to include integral error terms to improve the steady-state accuracy. Examples of relatively simple specifications of the closed loop are illustrated in the subsequent sections that deal with design of feedback control laws for several spacecraft rotational control problems.
75.2.8 Spacecraft Control Problem: Linear Control Law Based on Linearized Spacecraft Equations
In this section, we assume that the spacecraft dynamics and kinematics can be described in terms of the linearized Equations 75.64 and 75.65. We assume the control moments can be adjusted to any specified level; hence, we consider the use of linear control laws. We first consider control of the spacecraft angular velocity. It is assumed that the spacecraft dynamics are described by the linearized Equation 75.64; a control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system satisfies the asymptotic condition and has desired closed-loop properties. Since the spacecraft dynamics are linear and uncoupled first-order equations, a standard linear control approach can be used to design a linear feedback control of the form
where the control gains cx, cy, cz are chosen as positive constants. Based on the linearized equations, introduce the error variables, so that the closed-loop equations are decoupled first-order error equations. Hence, the angular velocity errors in roll rate, pitch rate, and yaw rate are brought to zero asymptotically, at least for sufficiently small perturbations. The values of the control gains can be chosen to provide specified closed-loop time constants. Consequently, the control law is guaranteed to achieve exact tracking asymptotically, with speed of response that is determined by the values of the control gains. A simple special case of the above control law corresponds to the choice that (ωxd, ωyd, ωzd) = (0, 0, 0); i.e., the spacecraft is to be brought to rest. In simplified form, the control law is guaranteed to bring the spacecraft asymptotically to rest.
It is important to note that the preceding analysis holds only for sufficiently small perturbations from rest. For large perturbations in the angular velocity of the spacecraft, the above control laws may not have the desired properties. An analysis of the closed-loop system, using the nonlinear dynamics Equation 75.61, is required to determine the domain of perturbations for which closed-loop stabilization is achieved. Such an analysis is beyond the scope of the present chapter, but we note that the stability domain depends on both the desired control objective and the control gains that are selected.
We next consider control of the spacecraft angular attitude. It is assumed that the spacecraft dynamics and kinematics are described by the linearized Equation 75.66 and a control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system satisfies the attitude control conditions, where the kinematic conditions of Equation 75.56 are assumed to be satisfied, and the closed-loop system has desired closed-loop properties. Since the equations are linear and uncoupled second-order equations, a standard linear control approach can be used to obtain linear feedback control laws. Recall that the angular velocity components are the rates of change of the Euler angles according to the linearized relations of Equation 75.65, so that the control law can equivalently be expressed in terms of the Euler angles and their rates of change. Based on the linearized equations, introduce the error variables, so that the closed-loop equations are decoupled second-order error equations.
Hence, the spacecraft attitude errors in roll angle, pitch angle, and yaw angle are brought to zero asymptotically, at least for sufficiently small perturbations. The values of the control gains can be chosen to provide specified closed-loop natural frequencies and damping ratios. Consequently, the above control laws are guaranteed to bring the spacecraft to the desired attitude asymptotically, with speed of response that is determined by the values of the control gains. A simple special case of the above control law corresponds to the choice that (Ψd, θd, φd) is constant and (ωxd, ωyd, ωzd) = (0, 0, 0). The simplified control law is of the form
and it is guaranteed to bring the spacecraft asymptotically to the desired constant attitude and to maintain it in the desired attitude. The resulting control law is referred to as an attitude stabilization control law. Further details are available in [1]. Again, it is important to note that the preceding analysis holds only for sufficiently small perturbations in the spacecraft angular velocity and attitude. For large perturbations in the angular velocity and attitude of the spacecraft, the control law may not have the desired properties. An analysis of the closed-loop system, using the nonlinear dynamics Equation 75.61 and the nonlinear kinematic Equation 75.57, is required to determine the domain of perturbations for which closed-loop stabilization is achieved. Such an analysis is beyond the scope of the present chapter, but we note that the stability domain depends on the desired attitude and angular velocity and the control gains that are selected. We now consider spin stabilization of the spacecraft about a specified inertial axis. A control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system satisfies the asymptotic control objective (θ, φ) → (0, 0) as t → ∞, (ωx, ωy, ωz) → (0, 0, Ω) as t → ∞,
corresponding to an asymptotically constant spin rate Ω about the body-fixed z axis, which is aligned with the inertial Z axis, and the closed loop has desired properties. It is assumed that the spacecraft dynamics and kinematics are described by the linearized Equations 75.67 and 75.68. Since the spacecraft equations are linear and uncoupled, a standard linear control approach can be used. Consider the linear feedback control law of the form
Recall that the angular velocity components are related to the rates of change of the Euler angles according to the linearized relations of Equation 75.68, so that the control law can be expressed either in terms of the rates of change of the Euler angles (as in Equation 75.78) or in terms of the components of the angular velocity vector. Based on the linearized equations, introduce the variables
so that the closed-loop equations are
Hence, the spacecraft attitude errors in pitch angle and roll angle are brought to zero asymptotically and the yaw rate is brought to the value Ω asymptotically, at least for sufficiently small perturbations. The values of the control gains can be chosen to provide specified closed-loop transient responses. Consequently, the above control law is guaranteed to bring the spacecraft to the
desired spin rate about the specified axis of rotation, asymptotically, with speed of response that is determined by the values of the control gains. The control law of Equation 75.78 has the desirable property that it requires feedback of only the pitch and roll angles, which characterize, in this case, the errors in the instantaneous spin axis; feedback of the yaw angle is not required. Again, it is important to note that the preceding analysis holds only for sufficiently small perturbations in the spacecraft angular velocity and attitude. For large perturbations in the angular velocity and attitude of the spacecraft, the above control law may not have the desired properties. An analysis of the closed-loop system, using the nonlinear dynamics Equation 75.61 and the nonlinear kinematic Equation 75.57, is required to determine the domain of perturbations for which closed-loop stabilization is achieved. Such an analysis is beyond the scope of the present chapter, but we note that the stability domain depends on the desired spin rate and the control gains that are selected.
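The linear laws of this section reduce, axis by axis, to proportional-derivative loops whose gains set the closed-loop natural frequency and damping ratio. A sketch of one such axis (all numbers are illustrative; the gain formulas are the standard second-order design relations, not values from the text):

```python
# PD attitude law on one linearized axis:
#   Ixx*phi'' = ux,  ux = -cx*phi' - kx*phi
# gives phi'' + 2*zeta*wn*phi' + wn^2*phi = 0 with kx = Ixx*wn^2, cx = 2*zeta*wn*Ixx.
IXX = 2.0                           # illustrative moment of inertia
wn, zeta = 2.0, 0.7                 # chosen natural frequency and damping ratio
kx = IXX * wn**2                    # attitude (stiffness) gain
cx = IXX * 2.0 * zeta * wn          # rate (damping) gain

phi, phidot, dt = 0.2, 0.0, 1e-3    # small initial attitude error (rad)
for _ in range(10000):              # simulate 10 s
    ux = -cx*phidot - kx*phi
    phidot += (ux / IXX) * dt
    phi += phidot * dt
print(abs(phi), abs(phidot))        # both decay essentially to zero
```

The settling speed scales with zeta*wn, which is how "speed of response determined by the control gains" appears concretely.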
75.2.9 Spacecraft Control Problem: Bang-Bang Control Law Based on Linearized Spacecraft Equations
As developed previously, it is assumed that the spacecraft kinematics and dynamics are described by linear equations obtained by linearizing about the rest solution. The linearized spacecraft dynamics are described by Equation 75.64, and the linearized spacecraft kinematics are described by Equation 75.65. Since certain types of thrusters are most easily operated in an on-off mode, the control moments produced by each pair of thrusters may be limited to fixed values (of either sign) or to zero. We now impose these control constraints on the development of the control law by requiring that each of the control moment components can take only the values {−U, 0, +U} at each instant of time. We first present a simple control law that stabilizes the spacecraft to rest, ignoring the spacecraft attitude. The simplest control law, satisfying the imposed constraints, that stabilizes the spacecraft to rest is given by
where the signum function is the discontinuous function defined by
It is easily shown that the resulting closed-loop system has the property that (ωx, ωy, ωz) → (0, 0, 0) as t → ∞,
at least for sufficiently small perturbations in the spacecraft angular velocity vector.
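The on-off rate damping can be exercised numerically. The sketch below (illustrative values throughout) damps a single-axis rate with a torque restricted to {−U, 0, +U}; a small dead zone is included, anticipating the modified law discussed next, so the rate settles inside a neighborhood of zero instead of chattering about it.

```python
# Bang-bang rate damping with a dead zone (sketch). dzone() returns 0 for
# |x| < eps, so small measured rates command no torque.
def dzone(x, eps):
    if x > eps:
        return 1.0
    if x < -eps:
        return -1.0
    return 0.0

IXX, U, eps, dt = 2.0, 0.5, 0.01, 1e-3
wx = 0.8                        # initial roll rate (rad/s)
for _ in range(20000):          # simulate 20 s
    ux = -U * dzone(wx, eps)    # torque limited to {-U, 0, +U}
    wx += (ux / IXX) * dt
print(abs(wx))                  # rate has settled inside the dead zone
```

With eps = 0 the rate would instead chatter about zero at the switching frequency; the dead zone trades a small residual rate for insensitivity to measurement noise, as described in the text.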
There are necessarily errors in measuring the angular velocity vector, and it can be shown that the above control law can be improved by using the modified feedback control law
where the dead-zone function is the discontinuous function defined by
and ε is a positive constant, the dead-zone width. The resulting closed-loop system has the property that (ωx, ωy, ωz) → S as t → ∞, where S is an open set containing (0, 0, 0); the (maximum) diameter of S has the property that it goes to zero as ε goes to zero. Thus, the dead-zone parameter ε can be selected appropriately, so that the angular velocity vector is maintained small while the closed loop is not excessively sensitive to measurement errors in the angular velocity vector.
We now present a simple control law that stabilizes the spacecraft to a fixed attitude, which is given by the Euler angles being all zero. The simplest control law, satisfying the imposed constraints, is of the form
where τx, τy, τz are positive constants and the linear arguments in the control laws of Equation 75.85 are the switching functions. Consequently, the closed-loop system is decoupled into independent closed loops for the roll angle, the pitch angle, and the yaw angle. By analyzing each of these closed-loop systems, it can be shown that the attitude and angular velocity errors are brought to zero, at least for sufficiently small perturbations in the spacecraft angular velocity vector and attitude. It should be noted that a part of the solution involves a chattering solution where the switching functions are identically zero over a finite time interval.
There are necessarily errors in measuring the angular velocity vector and the attitude, and it can be shown that the above control law can be improved by using the modified feedback control law
where ε is a positive constant, the dead-zone width. The resulting closed-loop system has the property that (Ψ, θ, φ) → S as t → ∞, where S is an open set containing (0, 0, 0); in fact, the spacecraft attitude is asymptotically periodic with a maximum amplitude that tends to zero as ε tends to zero. Thus, the dead-zone parameter ε can be selected appropriately, so that the angular velocity vector and the attitude errors are maintained small while the closed loop is not excessively sensitive to measurement errors in the angular velocity vector or the attitude errors. Further details are available in [1].
75.2.10 Spacecraft Control Problem: Nonlinear Control Law Based on Nonlinear Spacecraft Equations
In this section, we present nonlinear control laws that guarantee that the closed-loop equations are exactly linear; this approach can be viewed as using feedback both to cancel out the nonlinear terms and to add in linear terms that result in good closed-loop linear characteristics. This approach is often referred to as feedback linearization or dynamic inversion. We first consider control of the spacecraft angular velocity. It is assumed that the spacecraft dynamics are described by the nonlinear Equation 75.61 and a control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system is linear and satisfies the asymptotic condition
Control laws that accomplish these objectives are given in the form
where the control gains cx, cy, cz are chosen as positive constants. If the variables
are introduced, the closed-loop equations are
Hence, the angular velocity errors in roll rate, pitch rate, and yaw rate are brought to zero asymptotically. The values of the control gains can be chosen to provide specified closed-loop time constants. Consequently, the above control laws are guaranteed to achieve exact tracking asymptotically, with speed of response that is determined by the values of the control gains.
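The exact-linearization property can be illustrated directly. The sketch below (illustrative inertias and gains, a regulation-to-rest case) applies a control that cancels the gyroscopic terms of Euler's equations and inserts linear error dynamics; the decay is exponential even from a large initial rate, consistent with the global claim discussed below.

```python
# Feedback linearization of the angular velocity loop (sketch). The control
# cancels the gyroscopic coupling in Euler's equations, leaving exactly
# dw_i/dt = -c_i * w_i on each axis (target rate zero here).
IXX, IYY, IZZ = 3.0, 2.0, 1.0
cx, cy, cz = 1.0, 1.0, 1.0

def control(w):
    wx, wy, wz = w
    ux = (IZZ - IYY)*wy*wz - IXX*cx*wx
    uy = (IXX - IZZ)*wz*wx - IYY*cy*wy
    uz = (IYY - IXX)*wx*wy - IZZ*cz*wz
    return ux, uy, uz

def step(w, dt):
    wx, wy, wz = w
    ux, uy, uz = control(w)
    # Euler's equations with control: I*dw/dt = gyroscopic terms + u
    return (wx + dt*((IYY - IZZ)*wy*wz + ux)/IXX,
            wy + dt*((IZZ - IXX)*wz*wx + uy)/IYY,
            wz + dt*((IXX - IYY)*wx*wy + uz)/IZZ)

w = (2.0, -1.5, 1.0)     # a LARGE initial rate: the closed loop is still linear
dt = 1e-3
for _ in range(10000):   # 10 s; each component decays like exp(-c*t)
    w = step(w, dt)
print(max(abs(x) for x in w))
```

Because the cancellation is exact, the convergence does not depend on the perturbation being small, which is the key contrast with the linearization-based designs.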
A simple special case of the above control law corresponds to the choice that (ωxd, ωyd, ωzd) = (0, 0, 0). Thus, the simplified control law is given by
and it is guaranteed to bring the spacecraft asymptotically to rest. The preceding analysis holds globally, that is, for all possible perturbations in the angular velocity vector, since the closed-loop system is exactly linear and asymptotically stable. This is in sharp contrast with the development that was based on the linearized equations. Thus, the family of control laws given here provides excellent closed-loop properties. The price of such good performance is the relative complexity of the control laws and the associated difficulty in practical implementation. We now consider spin stabilization of the spacecraft about a specified inertial axis. It is assumed that the spacecraft dynamics are described by the nonlinear Equation 75.61, and the kinematics are described by the nonlinear Equation 75.57. A control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system is linear and satisfies the control objective
This corresponds to control of the spacecraft so that it asymptotically spins about its body-fixed z axis at a spin rate Ω, and this spin axis is aligned with the inertially fixed Z axis. In order to obtain closed-loop equations that are exactly linear, differentiate each of the first two equations in Equation 75.61, substitute for the time derivatives of the angular velocities from Equation 75.61; then select the feedback control law to cancel out all nonlinear terms in the resulting equations and add in desired linear terms. After considerable algebra, this feedback linearization approach leads to the following nonlinear control law

ux = −Ixx fx − cx (dφ/dt) − kx φ,
uy = −[Iyy fy + cy (dθ/dt) + ky θ] sec φ,  (75.91)
uz = (Iyy − Ixx) ωx ωy − cz (ωz − Ω),

where fx and fy denote nonlinear functions of the angular velocities and the Euler angles, involving products such as (Izz − Ixx) ωy ωz and the time derivatives of sin φ tan θ and cos φ tan θ.
Note that the control law can be expressed either in terms of the rates of change of the Euler angles or in terms of the components of the angular velocity vector; the above expression involves a mixture of both. In addition, the control law can be seen not to depend on the yaw angle. If we introduce the variables
it can be shown (after substantial algebra) that the closed-loop equations are the linear decoupled equations
Hence, the spacecraft attitude errors in pitch angle and roll angle are brought to zero asymptotically and the yaw rate is brought to the value Ω asymptotically. The values of the control gains can be chosen to provide specified closed-loop responses. Consequently, the above control laws are guaranteed to bring the spacecraft to the desired spin rate about the specified axis of rotation with speed of response that is determined by the values of the control gains. The preceding analysis holds nearly globally, that is, for all possible perturbations in the angular velocity vector and for all possible perturbations in the Euler angles, excepting the singular values, since the closed-loop system of Equation 75.94 is exactly linear. This is in sharp contrast with the previous development that was based on linearized equations. It can easily be seen that the control law of Equation 75.78, obtained using the linearized approximation, also results from a linearization of the nonlinear control law of Equation 75.91 obtained in this section. Thus, the control law given by Equation 75.91 provides excellent closed-loop properties. The price of such good performance is the relative complexity of the control law of Equation 75.91 and the associated difficulty in its practical implementation.
75.2.11 Spacecraft Control Problem: Attitude Control in Circular Orbit
An important case of interest is that of a spacecraft in a circular orbit. In such a case, it is natural to describe the orientation of the spacecraft not with respect to an inertially fixed coordinate frame, but rather with respect to a locally horizontal-vertical coordinate frame as reference, defined so that the X axis of this frame is tangent to the circular orbit in the direction of the orbital motion, the Z axis of this frame is directed radially at the center of attraction, and the Y axis completes a right-hand orthogonal frame. Let the constant orbital angular velocity of the locally horizontal coordinate frame be

Ω = √(g/R),  (75.95)
where g is the local acceleration of gravity and R is the orbital radius; the direction of the orbital angular velocity vector is along the negative Y axis.
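The orbital rate Ω = √(g/R) is easy to evaluate. A quick sketch (the orbital radius and gravitational parameter below are illustrative values, not from the text): with g = μ/R², the expression is equivalent to the familiar Ω = √(μ/R³).

```python
import math

# Orbital angular velocity from Equation 75.95, Omega = sqrt(g/R), where g is
# the local gravitational acceleration at orbital radius R. Illustrative LEO case.
mu = 3.986004418e14          # Earth's gravitational parameter, m^3/s^2
R = 7.0e6                    # orbital radius, m (about 630 km altitude)

g = mu / R**2                # local gravity at radius R
Omega = math.sqrt(g / R)     # orbital angular velocity, rad/s
period_min = 2*math.pi / Omega / 60.0
print(Omega, period_min)     # roughly 1.08e-3 rad/s and a ~97-minute orbit
```

This Ω sets the scale of the gravity-gradient stiffness terms (proportional to Ω²) in the linearized dynamics below, which is why those moments are weak but persistent.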
The nonlinear dynamics for a spacecraft in circular orbit can be described in terms of the body-fixed angular velocity components. However, moment terms arise from the gravitational forces on the spacecraft when the spacecraft is modeled as a finite body rather than a point mass. These gravity gradient moments can be expressed in terms of the orbital angular velocity and the Euler angles, which describe the orientation of the body-fixed frame with respect to the locally horizontal-vertical coordinate frame. The complete nonlinear dynamics equations are not presented here due to their complexity. The linearized expressions for the spacecraft dynamics about the constant angular velocity solution (0, −Ω, 0) are obtained by introducing the perturbation variables
the resulting linearized dynamics equations are
The external moments on the spacecraft are given by
where (ux, uy, uz) denotes the control moments, and the other terms describe the linearized gravity gradient moments on the spacecraft. Thus, the linearized spacecraft dynamics are given by
Ixx (dδx/dt) = Ω (Izz − Iyy) δz + 3Ω² (Izz − Iyy) φ + ux,
Iyy (dδy/dt) = 3Ω² (Ixx − Izz) θ + uy,  (75.99)
Izz (dδz/dt) = Ω (Iyy − Ixx) δx + uz.
The linearized kinematics equations contain extra terms that arise since the locally horizontal-vertical coordinate frame has a constant angular velocity of −Ω about the Y axis. Thus, the linearized dynamics and kinematics equations can be combined in a form that makes clear that the pitching motion is decoupled from the rolling and yawing motions, but the rolling and yawing motions are coupled.
A control law (ux, uy, uz) is desired in feedback form, so that the resulting closed-loop system satisfies the asymptotic attitude conditions
(Ψ, θ, φ) → (0, 0, 0) as t → ∞,  (δx, δy, δz) → (0, 0, 0) as t → ∞,
which guarantees that (ωx, ωy, ωz) → (0, −Ω, 0) as t → ∞; that is, the spacecraft has an asymptotically constant spin rate consistent with the orbital angular velocity, as desired. Since the preceding spacecraft equations are linear, a standard linear control approach can be used to obtain linear feedback control laws for the pitching motion and for the rolling and yawing motions of the form

uy = −cy (dθ/dt) − [3Ω² (Ixx − Izz) + ky] θ,  (75.103)
uz = −cz (dΨ/dt) − [−Ω² (Iyy − Ixx) + kz] Ψ + Ω (Ixx + Izz − Iyy) (dφ/dt).

The control law can be expressed either in terms of the rates of change of the Euler angles or in terms of the components of the angular velocity vector. If we introduce the perturbation variables, the resulting closed-loop system consists of a pitch loop and a coupled roll-yaw loop. If the gains are chosen so that the pitching motion is asymptotically stable and the rolling and yawing motion is asymptotically stable, then the spacecraft attitude errors are automatically brought to zero asymptotically, at least for sufficiently small perturbations of the orientation from the locally horizontal-vertical reference. The values of the control gains can be chosen to provide specified closed-loop response properties. Consequently, the above control laws are guaranteed to bring the spacecraft to the desired attitude with speed of response that is determined by the values of the control gains.
We again note that the preceding analysis holds only for sufficiently small attitude perturbations from the locally horizontal-vertical reference. For large perturbations, the above control laws may not have the desired properties. An analysis of the closed-loop system, using the nonlinear dynamics and kinematics equations, is required to determine the domain of perturbations for which closed-loop stabilization is achieved. Such an analysis is beyond the scope of the present chapter.
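The decoupled pitch loop can be exercised on its own. The sketch below assumes the standard gravity-gradient pitch equation Iyy θ'' = 3Ω²(Ixx − Izz)θ + uy (a common convention; the chapter's numbered equations govern the exact signs) together with a PD-plus-cancellation law of the form of Equation 75.103; all numerical values are illustrative.

```python
# Pitch-axis gravity-gradient sketch (assumed standard form; values illustrative).
# The law cancels the gravity-gradient stiffness and adds PD terms, giving
# theta'' + (cy/Iyy)*theta' + (ky/Iyy)*theta = 0.
IXX, IYY, IZZ = 3.0, 2.0, 1.0
Omega = 1.1e-3                          # orbital rate, rad/s
wn, zeta = 2.0, 0.7                     # chosen closed-loop frequency and damping
cy = 2.0 * zeta * wn * IYY              # rate gain
ky = wn**2 * IYY                        # attitude gain

theta, thetadot, dt = 0.1, 0.0, 1e-3    # small initial pitch error (rad)
for _ in range(10000):                  # simulate 10 s
    uy = -cy*thetadot - (3*Omega**2*(IXX - IZZ) + ky)*theta
    thetadot += (3*Omega**2*(IXX - IZZ)*theta + uy) / IYY * dt
    theta += thetadot * dt
print(abs(theta))                       # pitch error decays toward zero
```

Because Ω² is tiny, the gravity-gradient stiffness is weak compared with any reasonable ky; its cancellation in the law mainly guards steady-state accuracy rather than transient shape.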
75.2. SPACECRAFT ATTITUDE CONTROL
75.2.12 Other Spacecraft Control Problems and Control Methodologies  Our treatment of spacecraft attitude control has been limited, both by the class of rotational control problems considered and by the assumptions that have been made. In this section, we briefly indicate other classes of problems for which results are available in the published literature. Throughout our development, specific control laws have been developed using orientation representations expressed in terms of Euler angles. As we have indicated, the Euler angles are not global representations for orientation. Other orientation representations, including quaternions and Euler axis-angle variables, have been studied, and various control laws have been developed using these representations. Several of our control approaches have been based on use of the linearized spacecraft kinematics and dynamics equations. We have also suggested a nonlinear control approach, feedback linearization, to develop several classes of feedback control laws for the nonlinear kinematics and dynamics equations. Other types of control approaches have been studied for spacecraft reorientation, including optimal control and pulse-width-modulated control schemes. We should also mention other classes of spacecraft attitude control problems. Our approach has assumed use of pairs of gas-jet thrusters or reaction wheels modeled in the simplest way. Other assumptions lead to somewhat different models and, hence, somewhat different control problems. In particular, we mention that control problems in the case where there are only two, rather than three, control moments have recently been studied. A key assumption throughout our development is that the spacecraft is a rigid body. There are important spacecraft designs where this assumption is not satisfied, and the resulting attitude control problems are somewhat different from what has been considered here; usually these control problems are even more challenging.
Examples of such control problems occur when nutation dampers or control moment gyros are used for attitude control. Dual-spin spacecraft and multibody spacecraft are examples where there is relative motion between spacecraft components that must be taken into account in the design of attitude control systems. Numerous modern spacecraft, due to weight constraints, consist of flexible components. Attitude control of flexible spacecraft is a very important and widely studied subject.
75.2.13 Defining Terms  Attitude: The orientation of the spacecraft with respect to some reference frame.  Body axes: A reference frame fixed in the spacecraft body, and rotating with it.  Euler angles: A sequence of angle rotations that are used to parametrize a rotation matrix.  Pitch: The second angle in the 3-2-1 Euler angle sequence. For small rotation angles, the pitch angle is the rotation angle about the spacecraft y axis.
Roll: The third angle in the 3-2-1 Euler angle sequence. For small rotation angles, the roll angle is the rotation angle about the spacecraft x axis.  Rotation matrix: A matrix of direction cosines relating unit vectors of two different coordinate frames.  Simple spin: A spacecraft spinning about a body axis whose direction remains inertially fixed.  Yaw: The first rotation angle in the 3-2-1 Euler angle sequence. For small rotation angles, the yaw angle is the rotation angle about the spacecraft z axis.
References
[1] Bryson, A.E., Control of Spacecraft and Aircraft, Princeton University Press, Princeton, NJ, 1994.
[2] Crouch, P.E., Spacecraft attitude control and stabilization: applications of geometric control theory to rigid body models, IEEE Trans. Autom. Control, 29(4), 321-331, 1984.
[3] Greenwood, D.T., Principles of Dynamics, 2nd ed., Prentice Hall, Englewood Cliffs, NJ, 1988.
[4] Hughes, P.C., Spacecraft Attitude Dynamics, Wiley, New York, 1986.
[5] Wertz, J.R., Ed., Spacecraft Attitude Determination and Control, Kluwer, Dordrecht, Netherlands, 1978.
Further Reading  The most complete reference on the control of spacecraft is Spacecraft Attitude Determination and Control, a handbook edited by J.R. Wertz [5]. It has been reprinted often and is available from Kluwer Academic Publishers. Two introductory textbooks on spacecraft dynamics are Wiesel, W.E. 1989, Spaceflight Dynamics, McGraw-Hill, New York, and Thomson, W.T. 1986, Introduction to Space Dynamics, Dover Publications (paperback), New York.
A more comprehensive treatment can be found in Spacecraft Attitude Dynamics by P.C. Hughes [4]. A discussion of the orbital motion of spacecraft can be found in Bate, Mueller, and White, 1971, Fundamentals of Astrodynamics, Dover Publications, New York, and Danby, J.M.A. 1988, Fundamentals of Celestial Mechanics, 2nd ed., Willmann-Bell Inc., Richmond, VA.
THE CONTROL HANDBOOK
75.3 Control of Flexible Space Structures
S. M. Joshi and A. G. Kelkar, NASA Langley Research Center
A number of near-term space missions as well as future mission concepts will require flexible space structures (FSS) in low Earth and geostationary orbits. Examples of near-term missions include multipayload space platforms, such as Earth observing systems, and space-based manipulators for on-orbit assembly and satellite servicing. Examples of future space mission concepts include mobile satellite communication systems, solar power satellites, and large optical reflectors, which would require large antennas, platforms, and solar arrays. The dimensions of such structures would range from 50 meters (m) to several kilometers (km). Because of their relatively light weight and, in some cases, expansive sizes, such structures tend to have low-frequency, lightly damped structural (elastic) modes. The natural frequencies of the elastic modes are generally closely spaced, and some natural frequencies may be lower than the controller bandwidth. In addition, the elastic mode characteristics are not known accurately. For these reasons, control systems design for flexible space structures is a challenging problem. Depending on their missions, flexible spacecraft can be roughly categorized as single-body spacecraft and multibody spacecraft. Two of the most important control problems for single-body FSS are (1) fine-pointing of FSS in space with the required precision in attitude (represented by three Euler angles) and shape, and (2) large-angle maneuvering ("slewing") of the FSS to orient to a different target. The performance requirements for both of these problems are usually very high. For example, for a certain mobile communication system concept, a 122-meter diameter space antenna will have to be pointed with an accuracy of 0.03 degree root mean square (RMS). The requirements for other missions vary, but some are expected to be even more stringent, on the order of 0.01 arc-second.
In some applications, it would be necessary to maneuver the FSS quickly through large angles to acquire a new target on the Earth in minimum time and with minimum fuel expenditure, while keeping the elastic motion and accompanying stresses within acceptable limits. Once the target is acquired, the FSS must point to it with the required precision. For multibody spacecraft with articulated appendages, the main control problems are (1) fine-pointing of some of the appendages to their respective targets, (2) rotating some of the appendages to follow prescribed periodic scanning profiles, and (3) changing the orientation of some of the appendages through large angles. For example, a multipayload platform would have the first two requirements, while a multilink manipulator would
have the third requirement to reach a new end-effector position. The important feature that distinguishes FSS from conventional older-generation spacecraft is their highly prominent structural flexibility, which results in special dynamic characteristics. Detailed literature surveys on dynamics and control of FSS may be found in [11], [20]. The organization of this chapter is as follows. The problem of fine-pointing control of single-body spacecraft is considered in Section 75.3.2. This problem not only represents an important class of missions, but also permits analysis in the linear, time-invariant (LTI) setting. The basic linearized mathematical model of single-body FSS is presented, and the problems encountered in FSS control systems design are discussed. Two types of controller design methods, model-based controllers and passivity-based controllers, are presented. The highly desirable robustness characteristics of passivity-based controllers are summarized. Section 75.3.3 addresses another important class of missions, namely, multibody FSS. A generic nonlinear mathematical model of a multibody flexible space system is presented and passivity-based robust controllers are discussed.
75.3.2 Single-Body Flexible Spacecraft  Linearized Mathematical Model  Simple structures, such as uniform beams or plates, can be effectively modeled by infinite-dimensional systems (see [12]). In some cases, approximate infinite-dimensional models have been proposed for more complex structures such as trusses [2]. However, most realistic FSS are highly complex and not amenable to infinite-dimensional modeling. The standard practice is to use finite-dimensional mathematical models generated by using the finite-element method [19]. The basic approach of this method is dividing a continuous system into a number of elements using fictitious dividing lines, and applying the Lagrangian formulation to determine the forces at the points of intersection as functions of the applied forces. Suppose there are r force actuators and p torque actuators distributed throughout the structure. The ith force actuator produces the 3 x 1 force vector fi = (fxi, fyi, fzi)^T along the X, Y, and Z axes of a body-fixed coordinate system centered at the nominal center of mass (c.m.). Similarly, the ith torque actuator produces the torque vector Ti = (Txi, Tyi, Tzi)^T. Then the linearized equations of motion can be written as follows [12]: rigid-body translation:
    M z̈ = Σi fi,    (75.106)

rigid-body rotation:

    J α̈ = Σi Ti + Σi Ri × fi,    (75.107)

elastic motion:

    q̈ + D q̇ + Λ q = Σi Ψi fi + Σi Φi Ti,    (75.108)

(This article is based on work performed for the U.S. Government. The responsibility for the contents rests with the authors.)
where M is the mass, z is the 3 x 1 position of the c.m., Ri is the location of fi on the FSS, J is the 3 x 3 moment-of-inertia matrix, α is the attitude vector consisting of the three Euler rotation angles (φ, θ, ψ), and q = (q1, q2, ..., qnq)^T is the nq x 1 modal amplitude vector for the nq elastic modes. ("×" in Equation 75.107 denotes the vector cross-product.) In general, the number of modes (nq) necessary to characterize an FSS adequately is quite large, on the order of 100-1000. Ψi and Φi are the nq x 3 translational and rotational mode-shape matrices at the ith actuator location. The rows of Ψi, Φi represent the X, Y, Z components of the translational and rotational mode shapes at the location of actuator i.
where ωk is the natural frequency of the kth elastic mode, Λ = diag(ω1², ω2², ..., ωnq²), and D is an nq x nq matrix representing the inherent damping in the elastic modes: D = diag(2ρ1ω1, 2ρ2ω2, ..., 2ρnq ωnq). The inherent damping ratios (ρk's) are typically on the order of 0.001-0.01. The finite-element method cannot model inherent damping; this proportional damping term is customarily added after an undamped finite-element model is obtained. The translational and rotational positions z_p and y_p at a location with coordinate vector R are given by
    z_p = z - R × α + Ψ_R q,    (75.111)

and

    y_p = α + Φ_R q,    (75.112)
where the ith columns of the 3 x nq matrices Ψ_R and Φ_R represent the ith translational and rotational mode shapes at that location. Figure 75.17 shows the mode-shape plots for a finite-element model of a completely free, 100 ft. x 100 ft. x 0.1 in. aluminum plate. The plots were obtained by computing the elastic displacements at many locations on the plate, resulting from nonzero values of individual modal amplitudes qi. Figure 75.18 shows the mode-shape plots for the 122-m diameter hoop/column antenna concept [22], which consists of a deployable central mast attached to a deployable hoop by cables held in tension.
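Equations 75.111 and 75.112 amount to a rigid-body term plus a modal superposition and can be evaluated directly. In the sketch below, the mode-shape matrices and modal amplitudes are random placeholders standing in for finite-element output; only the structure of the computation is meant to match the text.

```python
import numpy as np

rng = np.random.default_rng(0)
nq = 4                                    # number of retained elastic modes (illustrative)

z     = np.array([0.10, 0.00, -0.05])     # c.m. translation (m), assumed values
alpha = np.array([0.01, -0.02, 0.005])    # Euler attitude vector (rad), assumed
q     = rng.normal(scale=1e-3, size=nq)   # modal amplitudes, assumed

R       = np.array([5.0, 2.0, 0.0])       # location of interest relative to c.m. (m)
Psi_loc = rng.normal(size=(3, nq))        # translational mode shapes at R (placeholder)
Phi_loc = rng.normal(size=(3, nq))        # rotational mode shapes at R (placeholder)

# Rigid translation, rigid-rotation lever arm, and elastic deflection
z_p = z - np.cross(R, alpha) + Psi_loc @ q
# Rigid attitude plus elastic rotation
y_p = alpha + Phi_loc @ q
```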
Figure 75.17  Mode-shape plots for a completely free plate (e.g., mode 5 at 0.0226 Hz and a mode at 0.0397 Hz).
where x = (α^T, α̇^T, q1, q̇1, ..., qnq, q̇nq)^T is the n-dimensional state vector (n = 2n̄) and u is the m x 1 control vector consisting of applied torques.
(Ok and Ik denote the k x k null and identity matrices, respectively.)
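A minimal sketch of assembling ẋ = Ax + Bu in this state ordering from modal data (torque actuators only) might look as follows; the function name and all arguments are placeholders for quantities that would come from a finite-element model.

```python
import numpy as np

def fss_state_space(J, omegas, rhos, Phi, p):
    """Sketch: assemble xdot = A x + B u from modal data (torque actuators only).

    J      : 3 x 3 inertia matrix
    omegas : (nq,) elastic natural frequencies (rad/sec)
    rhos   : (nq,) inherent damping ratios
    Phi    : (m, nq) rotational mode shapes at the p actuators (m = 3p)
    State ordering matches the text: x = (alpha, alphadot, q1, q1dot, ...).
    """
    nq = len(omegas)
    m = 3 * p
    n = 6 + 2 * nq
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    # Rigid rotational part: alpha'' = J^{-1} * (sum of applied torques)
    A[0:3, 3:6] = np.eye(3)
    Z = np.tile(np.eye(3), (p, 1))            # m x 3 torque-stacking matrix
    B[3:6, :] = np.linalg.solve(J, Z.T)
    # Elastic modes: qk'' = -2 rho_k w_k qk' - w_k^2 qk + (mode shape) u
    for k in range(nq):
        i = 6 + 2 * k
        A[i, i + 1] = 1.0
        A[i + 1, i] = -omegas[k] ** 2
        A[i + 1, i + 1] = -2.0 * rhos[k] * omegas[k]
        B[i + 1, :] = Phi[:, k]               # kth mode's shape at the actuators
    return A, B
```

For a realistic FSS, nq would be in the hundreds; the block structure, not the size, is the point here.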
Controllability and Observability  Precision attitude control is usually accomplished by torque actuators. Therefore, in the following material, only rotational equations of motion are considered; no force actuators are used. With fi = 0, and denoting q̄ = [α^T, q1, q2, ..., qnq]^T, Equations 75.107 and 75.108 can be written as
    Ā q̄'' + B̄ q̄' + Λ̄ q̄ = Γ^T u,    (75.113)

where

    Ā = diag(J, I_nq),  B̄ = diag(0_3, D),  Λ̄ = diag(0_3, Λ),    (75.114)

Γ = [Z, Φ] (with Z the m x 3 matrix (I_3, I_3, ..., I_3)^T), Φ^T = [Φ1, Φ2, ..., Φp], n̄ = nq + 3, and m = 3p; φk^T represents the kth row of Φ^T. If a three-axis attitude sensor is placed at a location where the rotational mode shapes are given by the rows of the nq x 3 matrix Υ^T, the sensed attitude (ignoring noise) would be y_s = α + Υ q. The system can be written in the state-space form as

    ẋ = A x + B u.    (75.116)
The conditions for controllability are given below.  Controllability Conditions  The system given by Equation 75.116 is controllable if, and only if (iff), the following conditions are satisfied: 1. Rows of Φ^T corresponding to each distinct (in frequency) elastic mode have at least one nonzero entry.
Figure 75.18  Typical mode shapes of the hoop/column antenna: Mode 2, 1.35 rad/sec (first bending, X-Z plane); Mode 3, 1.70 rad/sec (first bending, Y-Z plane); Mode 4, 3.18 rad/sec (surface torsion); Mode 5, 4.53 rad/sec (second bending, Y-Z plane); Mode 8, 6.84 rad/sec (second bending, X-Z plane).
2. If there are ν elastic modes with the same natural frequency ω̄, the corresponding rows of Φ^T form a linearly independent set.
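The two conditions can be checked numerically from the mode-shape matrix. A sketch (rows of Φ^T indexed by mode, columns by actuator channel; the tolerance is an assumption):

```python
import numpy as np

def modes_controllable(PhiT, omegas, tol=1e-10):
    """Sketch of the two controllability tests above.

    PhiT   : (nq, m) matrix whose kth row holds the rotational mode shapes
             of mode k at the actuator locations
    omegas : (nq,) natural frequencies
    """
    # Condition 1: each mode's row must have at least one nonzero entry
    if np.any(np.max(np.abs(PhiT), axis=1) <= tol):
        return False
    # Condition 2: rows of modes sharing a frequency must be independent
    for w in np.unique(np.round(omegas, 8)):
        rows = PhiT[np.isclose(omegas, w)]
        if np.linalg.matrix_rank(rows, tol=tol) < rows.shape[0]:
            return False
    return True

# Two modes at the same frequency with dependent rows: not controllable
print(modes_controllable(np.array([[1.0, 0.0], [2.0, 0.0]]), np.array([1.0, 1.0])))  # False
```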
A proof of this result can be found in [12]. Condition (1) would be satisfied iff the rotational mode shape (X, Y, or Z component) for each mode is nonzero at the location of at least one actuator. Condition (2) needs to be tested only when there is more than one elastic mode with the same natural frequency, which can typically occur in the case of symmetric structures. Similar necessary and sufficient conditions can be obtained analogously for observability. It should be noted that the rigid-body modes are not observable using attitude-rate sensors alone without attitude sensors. However, a three-axis attitude sensor can be sufficient for observability even if no rate sensors are used.  Problems in Controller Design for FSS  Precision attitude control requires controlling the rigid rotational modes and suppressing the elastic vibration. The objectives of the controller are
1. Fast transient response: quickly damp out the pointing errors resulting from step disturbances, such as thermal distortion on entering or leaving Earth's shadow, or from nonzero initial conditions resulting from the completion of a large-angle attitude maneuver.
2. Disturbance rejection: maintain the attitude as close as possible to the desired attitude in the presence of noise and disturbances.
The first objective translates into the closed-loop bandwidth requirement, and the second translates into minimizing the RMS pointing error. In addition, the elastic motion must be very small, i.e., the RMS shape distortions must be below prescribed limits. For applications such as large communications antennas, the typical bandwidth requirement is 0.1 rad/sec, with at most a 4-sec time constant for all of the elastic modes (closed loop). Typical allowable RMS errors are 0.03 degree pointing error and 6-mm surface distortion. The problems encountered in designing an attitude controller are
the attitude controller must be a "robust" one; that is, it must at least maintain stability, and perhaps performance, despite modeling errors, uncertainties, nonlinearities, and component failures. The next two sections present linear controller design methods for the attitude control problem.
1. An adequate model of an FSS is of high order because it contains a large number of elastic modes; however, a practically implementable controller has to be of sufficiently low order.
2. The inherent energy dissipation (damping) is very small.
3. The elastic frequencies are low and closely spaced.
4. The parameters (frequencies, damping ratios, and mode shapes) are not known accurately.
Model-Based Controllers Consider the nth order state space model of an FSS in the presence of process noise and measurement noise,
The simplest controller design approach would be truncating the model beyond a certain number of modes and designing a reduced-order controller. This approach is routinely used for controlling relatively rigid conventional spacecraft, wherein only the rigid modes are retained in the design model. Second-order filters are included in the loop to attenuate the contribution of the elastic modes. This approach is not generally effective for FSS because the elastic modes are much more prominent. Figure 75.19 shows the effect of using a truncated design model. When constructing a control loop around the "controlled" modes, an unintentional feedback loop is also constructed around the truncated modes, which can make the closed-loop system unstable. The inadvertent excitation of the truncated modes by the input and the unwanted contribution of the truncated modes to the sensed output were aptly termed by Balas [4] "control spillover" and "observation spillover," respectively. The spillover terms may cause performance degradation and even instability, leading to catastrophic failure.
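The spillover mechanism can be illustrated with a small numerical sketch: an observer-based controller designed for one retained mode also, unintentionally, drives and senses a truncated mode. Every matrix below is an invented two-mode toy, not data from the text.

```python
import numpy as np

# Toy plant: one "controlled" mode kept in the design model, one truncated mode
Ac = np.array([[0.0, 1.0], [-1.0, -0.002]])     # retained mode (1 rad/sec)
At = np.array([[0.0, 1.0], [-25.0, -0.05]])     # truncated mode (5 rad/sec)
Bc, Bt = np.array([[0.0], [1.0]]), np.array([[0.0], [0.8]])
Cc, Ct = np.array([[1.0, 0.0]]), np.array([[0.6, 0.0]])

G = np.array([[4.0, 3.0]])      # regulator gain, designed for the retained mode only
K = np.array([[6.0], [8.0]])    # observer gain, designed for the retained mode only

# Closed loop with states [x_c, x_t, xhat]: y = Cc x_c + Ct x_t brings the
# truncated mode into the observer (observation spillover, K Ct), and
# u = -G xhat drives the truncated mode through Bt (control spillover, Bt G).
Acl = np.block([
    [Ac,               np.zeros((2, 2)), -Bc @ G],
    [np.zeros((2, 2)), At,               -Bt @ G],
    [K @ Cc,           K @ Ct,           Ac - Bc @ G - K @ Cc],
])
# With Bt = 0 and Ct = 0 the spillover blocks vanish and separation holds;
# with them present, performance can degrade or the loop can even destabilize.
print(np.max(np.linalg.eigvals(Acl).real))
```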
Figure 75.19  Control and observation spillover.
In addition to the truncation problem, the designer also lacks accurate knowledge of the parameters. Finite-element models give reasonably accurate estimates of the frequencies and mode shapes only for the first few modes and can provide no estimates of inherent damping ratios. Premission ground testing for parameter estimation is not generally possible because many FSS are not designed to withstand the gravitational force (while deployed), and because the test facilities required, such as a vacuum chamber, would be excessively large. Another consideration in controller design is that the actuators and sensors have nonlinearities and finite response times. In view of these problems,
where v(t) and w(t) are, respectively, the n x 1 and l x 1 process-noise and sensor-noise vectors; v and w are assumed to be mutually uncorrelated, zero-mean, Gaussian white-noise processes with covariance intensity matrices V and W. A linear quadratic Gaussian (LQG) controller can be designed to minimize
    J = lim_{tf→∞} (1/tf) E ∫₀^{tf} [x^T(t) Q x(t) + u^T(t) R u(t)] dt,    (75.123)
where E denotes the expectation operator and Q = Q^T ≥ 0, R = R^T > 0 are the state and control weighting matrices. The resulting nth-order controller consists of a Kalman-Bucy filter (KBF) in tandem with a linear quadratic regulator (LQR), with the form
where x̂ is the controller state vector, and G_r and K_f are the LQR and KBF gain matrices. Any controller using an observer and state-estimate feedback has the same mathematical structure. The order of the controller is n, the same as that of the plant. An adequate model of an FSS typically consists of several hundred elastic modes. However, to be practically implementable, the controller must be of sufficiently low order. A reduced-order controller design can be obtained in two ways, either by using a reduced-order "design model" of the plant or by obtaining a reduced-order approximation to a high-order controller. The former method is used more widely than the latter because high-order controller design relies on the knowledge of the high-frequency mode parameters, which is usually inaccurate. A number of model-order reduction methods have been developed during the past few years. The most important of these include the singular perturbation method, the balanced truncation method, and the optimal Hankel norm method (see [7] for a detailed description). In the singular perturbation method, higher-frequency modes are approximated by their quasi-static representation. The balanced truncation method uses a similarity transformation that makes the controllability and observability Gramians equal and diagonal. A reduced-order model is then obtained by retaining the most controllable and observable state variables. The optimal Hankel norm approximation method aims to minimize the Hankel norm of the approximation error and can yield a smaller error than the balanced truncation method. A disadvantage of the balanced truncation and the Hankel norm methods is that the resulting (transformed) state variables are mutually coupled and do not correspond to individual
modes, resulting in the loss of physical insight. A disadvantage of the singular perturbation and Hankel norm methods is that they can yield non-strictly proper reduced-order models. An alternate method of overcoming these difficulties is to rank the elastic modes according to their contributions to the overall transfer function, in the sense of H2, H∞, or L1 norms [9]. The highest-ranked modes are then retained in the design model. This method retains the physical significance of the modes and also yields a strictly proper model. Note that the rigid-body modes must always be included in the design model, no matter which order-reduction method is used. A model-based controller can then be designed based on the reduced-order design model.  LQG Controller  An LQG controller designed for the reduced-order design model is guaranteed to stabilize the nominal design model. However, it may not stabilize the full-order plant because of the control and observation spillovers. Some time-domain methods for designing spillover-tolerant, reduced-order LQG controllers are discussed in [12]. These methods basically attempt to reduce the norms of the spillover terms ||B_t G|| and ||K C_t||, where B_t and C_t denote the input and observation matrices corresponding to the truncated modes. Lyapunov-based sufficient conditions for stability are derived in terms of upper bounds on the spillover norms and are used as guidelines in spillover reduction. The controllers obtained by these methods are generally quite conservative and also require knowledge of the truncated-mode parameters to ensure stability. Another approach to LQG controller design is the application of multivariable frequency-domain methods, wherein the truncated modes are represented as an additive uncertainty term ΔP(s) that appears in parallel with the design model (i.e., nominal plant) transfer function P(s), as shown in Figure 75.20.
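The spillover-norm bookkeeping for a reduced-order LQG design can be sketched with SciPy's Riccati solver. The design model (Ac, Bc, Cc) and the truncated-mode matrices (Bt, Ct) are assumed inputs here, and the weighting choices are illustrative, not a recipe from the text.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def reduced_lqg_with_spillover(Ac, Bc, Cc, Bt, Ct, Q, R, V, W):
    """Sketch: LQG gains from a reduced-order design model (Ac, Bc, Cc),
    plus the spillover norms ||Bt G|| and ||K Ct|| discussed above.
    Bt, Ct: input/observation matrices of the truncated modes (assumed
    known, at least crudely)."""
    # LQR gain: G = R^{-1} Bc^T P, P from the control Riccati equation
    P = solve_continuous_are(Ac, Bc, Q, R)
    G = np.linalg.solve(R, Bc.T @ P)
    # KBF gain: K = S Cc^T W^{-1}, S from the filter Riccati equation
    S = solve_continuous_are(Ac.T, Cc.T, V, W)
    K = S @ Cc.T @ np.linalg.inv(W)
    ctrl_spill = np.linalg.norm(Bt @ G, 2)   # control spillover norm
    obs_spill = np.linalg.norm(K @ Ct, 2)    # observation spillover norm
    return G, K, ctrl_spill, obs_spill
```

Iterating on Q, R, V, W while watching the two spillover norms is one crude way to follow the guidelines cited above.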
the stability test of Equation 75.125 for the 122-m hoop/column antenna, where the design model consists of the three rigid rotational modes and the first three elastic modes.
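A grid-based evaluation of the additive-uncertainty test of Equation 75.125 might look as follows. P, C, and the uncertainty envelope are caller-supplied assumptions, and a finite frequency grid is of course only a spot check, not a proof.

```python
import numpy as np

def additive_uncertainty_margin(P, C, dP_bound, omegas):
    """Sketch of Equation 75.125: at each frequency, compare an upper
    bound on sigma_max[dP(jw)] against
    1 / sigma_max{ C(jw) [I + P(jw) C(jw)]^{-1} }.

    P, C     : callables returning square frequency-response matrices at s = jw
    dP_bound : callable, scalar uncertainty envelope at w
    """
    ok = True
    for w in omegas:
        Pj, Cj = P(1j * w), C(1j * w)
        T = Cj @ np.linalg.inv(np.eye(Pj.shape[0]) + Pj @ Cj)
        allowed = 1.0 / np.linalg.norm(T, 2)      # 2-norm = largest singular value
        if dP_bound(w) >= allowed:
            ok = False
    return ok
```

The envelope "lying under" the allowed curve, as in Figure 75.21, corresponds to this function returning True.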
Figure 75.21  Stability test for additive uncertainty (uncertainty envelope vs. frequency, rad/sec).
A measure of the nominal closed-loop performance is given by the bandwidth of the closed-loop transfer function, G_cl = PC(I + PC)^{-1}, shown in Figure 75.22 for the hoop/column antenna.
Figure 75.22  Closed-loop transfer function (singular values vs. frequency, rad/sec).
Figure 75.20  Additive uncertainty formulation of truncated dynamics.
A sufficient condition for stability is [7]:

    σ̄[ΔP(jω)] < 1 / σ̄{C(jω)[I + P(jω)C(jω)]^{-1}},  0 ≤ ω < ∞,    (75.125)
where C(s) is the controller transfer function and σ̄[·] denotes the largest singular value. An upper bound on σ̄[ΔP(jω)] can be obtained from (crude knowledge of) the truncated-mode parameters to generate an "uncertainty envelope". Figure 75.21 shows
The results in Figures 75.21 and 75.22 were obtained by iteratively designing the KBF and the LQR to yield the desired closed-loop bandwidth while still satisfying Equation 75.125. The iterative method, described in [12], is loosely based on the LQG/Loop Transfer Recovery (LTR) method [23]. The resulting design is robust to any truncated-mode dynamics which lie under the uncertainty envelope. However, the controller may not provide robustness to parametric uncertainties in the design model. A small uncertainty in the natural frequencies of the design model (i.e., the "controlled modes") can cause closed-loop instability because the very small open-loop damping ratios cause very sharp peaks in the frequency response, so that a small error in the natural frequency produces a large error peak in the frequency response [12].  H∞- and μ-Synthesis Methods  The H∞ controller design method [7] represents a systematic
method for obtaining the desired performance as well as robustness to truncated-mode dynamics. A typical design objective is minimizing the H∞ norm of the frequency-weighted transfer function from the disturbance inputs (e.g., sensor and actuator noise) to the controlled variables, while ensuring stability in the presence of truncated modes. An example of the application of the H∞ method to FSS control is given in [17]. The problem also can be formulated to include parametric uncertainties represented as unstructured uncertainty. However, the resulting controller design is usually very conservative and provides inadequate performance. The structured singular value method [6], [21], also known as the "μ-synthesis" method, can overcome the conservatism of the H∞ method. In this method, the parametric uncertainties are individually "extracted" from the system block diagram and arranged as a diagonal block that forms a feedback connection with the nominal closed-loop system. The controller design problem is formulated as one of H∞-norm minimization subject to a constraint on the structured singular value of an appropriate transfer function. The μ-synthesis problem can also be formulated to provide robust performance, i.e., the performance specifications must be satisfied despite model uncertainties. An application of the μ-synthesis method to FSS control is presented in [3]. The next section presents a class of controllers that can circumvent the problems due to spillovers and parametric uncertainties.
plane, and the poles on the imaginary axis are simple and have nonnegative definite residues. It can be shown that it is sufficient to check for positive semidefiniteness of T(s) only on the imaginary axis (s = jω, 0 ≤ ω < ∞), i.e., the condition becomes T(jω) + T*(jω) ≥ 0, where * denotes the complex conjugate transpose. Suppose (A, B, C, D) is an nth-order minimal realization of T(s). From [1], a necessary and sufficient condition for T(s) to be positive real is that an n x n symmetric positive-definite matrix P and matrices W and L exist so that

    A^T P + P A = -L L^T,
    P B = C^T - L W,
    W^T W = D + D^T.
This result is also generally known in the literature as the Kalman-Yakubovich lemma. Positive realness of G'(s) gives rise to a large class of robustly stabilizing controllers, called dissipative controllers. Such controllers can be divided into static dissipative and dynamic dissipative controllers.  Static Dissipative Controllers  Consider the proportional-plus-derivative control law,

    u = -G_p y_p - G_r y_r,    (75.129)
where G_p and G_r are symmetric positive-definite matrices. The closed-loop equation then becomes

    Ā q̄'' + (B̄ + Γ^T G_r Γ) q̄' + (Λ̄ + Γ^T G_p Γ) q̄ = 0.    (75.130)
Passivity-Based Controllers  Consider the case where an attitude sensor and a rate sensor are collocated with each of the p torque actuators. Then the m x 1 (m = 3p) sensed attitude and rate vectors y_p and y_r are given by y_p = Γ q̄ and y_r = Γ q̄'.
The transfer function from U(s) to Y_p(s) is G(s) = G'(s)/s, where G'(s) is expressed in terms of the m x 3 matrix Z = (I_3, I_3, ..., I_3)^T and φi^T, the ith row of Φ^T; the entries of Φ^T represent the rotational mode shapes for the ith mode. An important consequence of collocation of actuators and sensors is that the operator from u to y_r is passive [5], or, equivalently, G'(s) is positive-real, as defined below.
DEFINITION 75.1  A rational matrix-valued function T(s) of the complex variable s is said to be positive-real if all of its elements are analytic in Re[s] > 0, and T(s) + T^T(s*) ≥ 0 in Re[s] > 0, where * denotes the complex conjugate.
Scalar positive-real (PR) functions have a relative degree (i.e., the difference between the degrees of the denominator and numerator polynomials) of -1, 0, or 1 [24]. PR matrices have no transmission zeros or poles in the open right half of the complex
It can be shown that (B̄ + Γ^T G_r Γ) and (Λ̄ + Γ^T G_p Γ) are positive-definite matrices, and that this control law stabilizes the plant G(s); i.e., the closed-loop system of Equation 75.130 is asymptotically stable (see [12]). The closed-loop stability is not affected by the number of truncated modes or by knowledge of the parameter values; that is, the stability is robust. The only requirements are that the actuators and sensors be collocated and that the feedback gains be positive definite. Furthermore, if G_p, G_r are diagonal, then the robust stability holds even when the actuators and sensors have certain types of nonlinear gains.  Stability in the Presence of Actuator and Sensor Nonlinearities  Suppose that G_p, G_r are diagonal, and that 1. the actuator nonlinearities, ψ_a(ν), are monotonically nondecreasing and belong to the (0, ∞) sector, i.e., ψ_a(0) = 0 and ν ψ_a(ν) > 0 for ν ≠ 0; 2. the attitude and rate sensor nonlinearities, ψ_s(ν), belong to the (0, ∞) sector. Then the closed-loop system with the static dissipative control law is globally asymptotically stable. A proof of this result can be obtained by slightly modifying the results in [12]. Examples of permissible nonlinearities are shown in Figure 75.23. It can be seen that actuator and sensor saturation are permissible nonlinearities which will not destroy the robust stability property.
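The robustness claim for the static dissipative law can be spot-checked numerically: for synthetic modal data and any symmetric positive-definite gains, the closed-loop matrix of Equation 75.130 should have all eigenvalues in the open left half-plane. Every number below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
nq, p = 5, 2                       # elastic modes, collocated actuator/sensor triads
m = 3 * p

J = np.diag([1200.0, 900.0, 700.0])                 # inertia matrix (synthetic)
w = rng.uniform(0.05, 3.0, nq)                      # elastic frequencies (rad/sec)
Lam = np.diag(w ** 2)
D = np.diag(2 * 0.005 * w)                          # light inherent damping
Gamma = np.hstack([np.tile(np.eye(3), (p, 1)),      # m x (3 + nq) influence matrix
                   rng.normal(size=(m, nq))])

n = 3 + nq
Abar = np.block([[J, np.zeros((3, nq))], [np.zeros((nq, 3)), np.eye(nq)]])
Bbar = np.block([[np.zeros((3, 3)), np.zeros((3, nq))], [np.zeros((nq, 3)), D]])
Lbar = np.block([[np.zeros((3, 3)), np.zeros((3, nq))], [np.zeros((nq, 3)), Lam]])

Gp = 2.0 * np.eye(m)               # any symmetric positive-definite gains
Gr = 3.0 * np.eye(m)

# Closed loop of Equation 75.130 in first-order form
Dcl = Bbar + Gamma.T @ Gr @ Gamma
Kcl = Lbar + Gamma.T @ Gp @ Gamma
Acl = np.block([[np.zeros((n, n)), np.eye(n)],
                [-np.linalg.solve(Abar, Kcl), -np.linalg.solve(Abar, Dcl)]])
print(np.max(np.linalg.eigvals(Acl).real))   # negative: closed loop is stable
```

Changing the random mode shapes or the (positive-definite) gains does not change the sign of the result, which is the point of the robustness claim.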
Figure 75.23  Permissible actuator and sensor nonlinearities.
Some methods for designing static dissipative controllers are discussed in [12]. In particular, the static dissipative control law minimizes a certain quadratic performance function (see [12]).
Stability with Dynamic Dissipative Controller  Suppose that the controller transfer function K(s) has no transmission zeros at the origin and that C(s) = K(s)/s is MSPR. Then K(s) stabilizes the plant G(s). This condition can also be stated in terms of [A_k, B_k, C_k, D_k], a minimal realization of K(s) [14]. The stability property depends only on the positive realness of G'(s), which is a consequence of actuator-sensor collocation. Therefore, just as in the case of the static dissipative controller, the stability property holds regardless of truncated modes or knowledge of the parameters. Controllers which satisfy the above property are said to belong to the class called "dynamic dissipative" controllers. The condition that K(s)/s be MSPR is generally difficult to test. However, if K(s) is restricted to be diagonal, i.e., K(s) = diag[K_1(s), ..., K_m(s)], the condition is easier to check. For example, for the diagonal case, let
    K_i(s) = k_i (s² + β_1i s + β_0i) / (s² + α_1i s + α_0i).    (75.131)
It is straightforward to show that K(s)/s is MSPR if, and only if, k_i, α_0i, α_1i, β_0i, β_1i are positive (for i = 1, ..., m) and certain additional coefficient inequalities (Equations 75.132 and 75.133) hold. This performance function can be used as a basis for controller design. Another approach for selecting gains is to minimize the norms of the differences between the actual and desired values of the closed-loop coefficient matrices, (B̄ + Γ^T G_r Γ) and (Λ̄ + Γ^T G_p Γ) (see [12]). The performance of static dissipative controllers is inherently limited because of their restrictive structure. Furthermore, direct output feedback allows the addition of unfiltered sensor noise to the input. These difficulties can be overcome by using dynamic dissipative controllers, which also offer more design freedom and potentially superior performance.  Dynamic Dissipative Controllers  The positive realness of G'(s) also permits robust stabilization of G(s) by a class of dynamic compensators. The following definition is needed to define this class of controllers.
DEFINITION 75.2 A rational matrix-valued function T(s) of the complex variable s is said to be marginally strictly positive real (MSPR) if T(s) is PR, and T(jω) + T*(jω) > 0 for ω ∈ (−∞, ∞).
This definition of MSPR matrices is a weaker version of the definitions of "strictly positive-real (SPR)" matrices which have appeared in the literature [18]. The main difference is that MSPR matrices can have poles on the imaginary axis. It has been shown in [13] that the negative feedback connection of a PR system and an MSPR system is stable, i.e., the composite system consisting of minimal realizations of the two systems is asymptotically stable (or, equivalently, one system "stabilizes" the other). Consider an m × m controller transfer function matrix K(s) which uses the position sensor output yp(t) to generate the input u(t). The following sufficient condition for stability is proved in [14].
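The MSPR property of Definition 75.2 can be probed numerically for a scalar candidate by sweeping frequency and evaluating the Hermitian part of T(jω). The sketch below uses T(s) = K(s)/s with K(s) of the diagonal form in Equation 75.131; all coefficient values are illustrative, not from the text. It also shows that coefficient positivity alone is not sufficient, which is why additional inequalities (such as Equations 75.132 and 75.133) are needed.

```python
import numpy as np

def hermitian_part_min(k, a0, a1, b0, b1, wmax=50.0, n=5000):
    """Minimum of T(jw) + T*(jw) over a frequency sweep, for
    T(s) = K(s)/s with K(s) = k (s^2 + b1 s + b0)/(s^2 + a1 s + a0)."""
    w = np.linspace(1e-3, wmax, n)
    s = 1j * w
    T = k * (s**2 + b1*s + b0) / ((s**2 + a1*s + a0) * s)
    return np.min(2.0 * np.real(T))

# All coefficients positive AND b1*a0 > a1*b0, a1 > b1: Hermitian part > 0.
assert hermitian_part_min(1.0, 2.0, 3.0, 1.0, 2.0) > 0
# All coefficients still positive, but a1 < b1: the property is lost.
assert hermitian_part_min(1.0, 2.0, 3.0, 1.0, 4.0) < 0
```

A full MSPR test would also verify positive realness of T(s) (pole locations and residues); the sweep above only checks the frequency-domain inequality of the definition.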
For higher-order Ki's, the conditions on the polynomial coefficients are harder to obtain. One systematic procedure for obtaining such conditions for higher-order controllers is the application of Sturm's theorem [24]. Symbolic manipulation codes can then be used to derive explicit inequalities similar to Equations 75.132 and 75.133. Using such inequalities as constraints, the controller design problem can be posed as a constrained optimization problem which minimizes a given performance function. An example design of a dynamic dissipative controller for the hoop/column antenna concept is presented in [14], wherein diagonal K(s) is assumed. For the case of fully populated K(s), however, there are no straightforward methods, and it remains an area of future research. The preceding stability result for dynamic dissipative controllers can be used to show that the robust stability property of the static dissipative controller is maintained even when the actuators have a finite bandwidth: 1. For the static dissipative controller (Equation 75.129),
suppose that Gp and Gr are diagonal with positive entries (denoted by subscript i), and that actuators represented by the transfer function GAi(s) = ai/(s + ai) are present in the ith control channel. Then the closed-loop system is asymptotically stable if Gri > Gpi/ai (for i = 1, ..., m). 2. Suppose that the static dissipative controller also includes the feedback of the acceleration ya, that is,
75.3. CONTROL OF FLEXIBLE SPACE STRUCTURES
where Gp, Gr, and Ga are diagonal with positive entries. Suppose that the actuator dynamics for the ith input channel are given by GAi(s) = ki/(s² + pi s + vi), with ki, pi, vi positive. Then the closed-loop system is asymptotically stable if a corresponding condition on the gains and actuator parameters (analogous to the first-order case) is satisfied.
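The bandwidth condition of item 1, Gri > Gpi/ai, can be illustrated on a deliberately reduced model: a single rigid axis θ'' = u driven through a first-order actuator lag a/(s + a) with static dissipative feedback v = −Gp y − Gr ẏ. Routh's criterion for the resulting cubic characteristic polynomial s³ + a s² + a Gr s + a Gp gives exactly Gr > Gp/a. The sketch below (a reduced illustrative model, not the flexible-structure dynamics, with made-up numbers) checks both sides of the threshold by eigenvalue computation.

```python
import numpy as np

def closed_loop_stable(gp, gr, a):
    """Closed-loop state matrix for states (theta, theta', u):
    theta'' = u (actuator output), u' = a*(-gp*theta - gr*theta' - u)."""
    A = np.array([[0.0,   1.0,   0.0],
                  [0.0,   0.0,   1.0],
                  [-a*gp, -a*gr, -a ]])
    return np.max(np.linalg.eigvals(A).real) < 0

gp, a = 4.0, 2.0                      # threshold: gr > gp/a = 2
assert closed_loop_stable(gp, gr=3.0, a=a)       # above threshold: stable
assert not closed_loop_stable(gp, gr=1.0, a=a)   # below threshold: unstable
```

The eigenvalue test reproduces the Routh condition: with gr = 3, the polynomial s³ + 2s² + 6s + 8 satisfies 2·6 > 8 (stable); with gr = 1, 2·2 < 8 (unstable).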
central body has a large mass and moments of inertia as compared to any other appendage bodies. As a result, the motion of the central body is small and can be assumed to be in the linear range. For this special case, the robust stability results are given for linear static as well as dynamic dissipative compensators. The effects of realistic nonlinearities in the actuators and sensors are also considered.
Because of the requirement that K(s)/s be MSPR, the controller K(s) is not strictly proper. From a practical viewpoint, it is sometimes desirable to have a strictly proper controller because it attenuates sensor noise as well as high-frequency disturbances. Furthermore, the most common types of controllers, including the LQG as well as observer/pole-placement controllers, are strictly proper (they have a first-order rolloff). It is possible to realize K(s) as a strictly proper controller wherein both yp and yr are utilized for feedback. Let [Ak, Bk, Ck, Dk] be a minimal realization of K(s) where Ck is of full rank. Strictly Proper Dissipative Controller The plant with yp and yr as outputs is stabilized by the controller given by
Figure 75.24 Multibody flexible system.
where the nk × m (nk ≥ m) matrix L is a solution of
Equation 75.136 represents m² equations in m·nk unknowns. If m < nk (i.e., the compensator order is greater than the number of plant inputs), there are many possible solutions for L. The solution which minimizes the Frobenius norm of L is
If m = nk, Equation 75.136 gives the unique solution L = Ck⁻¹Dk. The next section addresses another important class of systems, namely, multibody flexible systems, which are described by nonlinear mathematical models.
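Reading Equation 75.136 — as an assumption, since the displayed equation is not legible here — in the form Ck L = Dk (which is consistent with m² equations in m·nk unknowns), the minimum-Frobenius-norm solution is the pseudoinverse solution L = Ck⁺Dk, reducing to L = Ck⁻¹Dk when m = nk. A numerical sketch with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(1)
m, nk = 2, 4                          # compensator order nk > plant inputs m
Ck = rng.standard_normal((m, nk))     # full-rank m x nk
Dk = rng.standard_normal((m, m))

# Minimum-Frobenius-norm solution of Ck L = Dk:
L = np.linalg.pinv(Ck) @ Dk           # = Ck^T (Ck Ck^T)^-1 Dk
assert np.allclose(Ck @ L, Dk)        # L satisfies the constraint

# Any other solution differs by a null-space term and has a larger norm.
_, _, Vt = np.linalg.svd(Ck)
N = Vt[m:].T                          # basis of null(Ck), shape nk x (nk - m)
L2 = L + N @ rng.standard_normal((nk - m, m))
assert np.allclose(Ck @ L2, Dk)
assert np.linalg.norm(L, 'fro') < np.linalg.norm(L2, 'fro')
```

The pseudoinverse solution lies in the row space of Ck, orthogonal (in the Frobenius inner product) to every null-space perturbation, which is why its norm is minimal.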
75.3.3 Multibody Flexible Space Systems This section considers the problem of controlling a class of nonlinear multibody flexible space systems consisting of a flexible central body with a number of articulated appendages. A complete nonlinear rotational dynamic model of a multibody flexible spacecraft is considered. It is assumed that the model configuration consists of a branched geometry, i.e., it has a central flexible body to which various flexible appendage bodies are attached (Figure 75.24). Each branch by itself can be a serial chain of structures. The actuators and sensors are assumed to be collocated. The global asymptotic stability of such systems is established using a nonlinear feedback control law. In many applications, the
Mathematical Model Equations of Motion Consider a spacecraft (shown in Figure 75.24) consisting of a central flexible body and a chain of (k - 3) flexible links. (Although a single chain is considered here, all the results are also valid when several chains are present.) Using the Lagrangian formulation, the following equations of motion can be obtained (see [15] for details):
where ṗ = (ωᵀ, θ̇ᵀ, q̇ᵀ)ᵀ; ω is the 3 × 1 inertial angular velocity vector (in body-fixed coordinates) for the central body; θ = (θ1, θ2, ..., θ(k−3))ᵀ, where θi denotes the joint angle for the ith joint expressed in body-fixed coordinates; q is the (n − k) vector of flexible degrees of freedom (modal amplitudes); p = (γᵀ, θᵀ, qᵀ)ᵀ, where γ̇ = ω; M(p) = Mᵀ(p) > 0 is the configuration-dependent mass-inertia matrix, and K is the symmetric positive-definite stiffness matrix related to the flexible degrees of freedom. C(p, ṗ) corresponds to Coriolis and centrifugal forces; D is the symmetric, positive-semidefinite damping matrix; B = [Ik×k 0k×(n−k)] is the control influence matrix, and u is the k-vector of applied torques. The first three components of u represent the attitude control torques (about the X-, Y-, and Z-axes) applied to the central body, and the remaining components are the torques applied at the (k − 3) joints.
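The skew-symmetry property of such Lagrangian models (stated below as Property S for Equation 75.138) can be spot-checked numerically on any system of this form. The sketch below uses a standard two-link planar arm with hypothetical parameters — a stand-in for the multibody model, not the spacecraft itself — and verifies that (½Ṁ − C) is skew-symmetric at randomly sampled states.

```python
import numpy as np

# Illustrative two-link planar arm (hypothetical parameters).
I1, I2, m2, l1, lc2 = 0.5, 0.3, 1.0, 1.0, 0.4
b = m2 * l1 * lc2
d = I2 + m2 * lc2**2
a = I1 + d + m2 * l1**2

def M(q):                      # mass-inertia matrix M(q) = M(q)^T > 0
    c2 = np.cos(q[1])
    return np.array([[a + 2*b*c2, d + b*c2],
                     [d + b*c2,   d       ]])

def Mdot(q, qd):               # analytic time derivative of M along (q, qd)
    s2 = np.sin(q[1])
    return -b * s2 * qd[1] * np.array([[2.0, 1.0],
                                       [1.0, 0.0]])

def C(q, qd):                  # Coriolis/centrifugal matrix (Christoffel form)
    s2 = np.sin(q[1])
    return b * s2 * np.array([[-qd[1], -(qd[0] + qd[1])],
                              [ qd[0],  0.0            ]])

rng = np.random.default_rng(0)
for _ in range(100):
    q, qd = rng.standard_normal(2), rng.standard_normal(2)
    N = 0.5 * Mdot(q, qd) - C(q, qd)
    assert np.allclose(N + N.T, 0.0, atol=1e-12)   # skew-symmetry holds
```

This property is what makes the passivity argument (and hence the dissipative stability results below) go through.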
THE CONTROL HANDBOOK
It can also be shown [10] that the quaternion obeys the following kinematic differential equations:
ᾱ̇ = ½ (ω × ᾱ + α4 ω),    (75.143)
α̇4 = −½ ωᵀᾱ.    (75.144)
where K and D are symmetric positive definite. The angular measurements for the central body are Euler angles (not the vector γ), whereas the remaining angular measurements between bodies are relative angles. In deriving the equations of motion, it is assumed that the elastic displacements are small enough to be in the linear range and that the total displacement can be obtained by the principle of superposition of rigid and elastic motions. One important inherent property (which will be called "Property S") of such systems, crucial to the stability results, is given next. Property S: For the system represented by Equation 75.138, the matrix (½Ṁ − C) is skew-symmetric. The justification of this property can be found in [15]. The central body attitude (Euler angle) vector g is given by E(g)ġ = ω, where E(g) is a 3 × 3 transformation matrix [8]. The sensor outputs consist of the three central body Euler angles, the (k − 3) joint angles, and the angular rates; i.e., the sensors are collocated with the torque actuators. The sensor outputs are then given by

yp = B p̄  and  yr = B ṗ    (75.140)
where p̄ = (gᵀ, θᵀ, qᵀ)ᵀ, in which g is the Euler angle vector for the central body; yp = (gᵀ, θᵀ)ᵀ and yr = (ωᵀ, θ̇ᵀ)ᵀ are the measured angular position and rate vectors, respectively. It is assumed that the body rate measurements ω are available from rate gyros. Using Property S, it can be proved that the operator from u to yr is passive [14]. Quaternion as a Measure of Attitude The orientation of a free-floating body can be minimally represented by a three-dimensional orientation vector. However, this representation is not unique. One minimal representation commonly used to represent the attitude is Euler angles. The 3 × 1 Euler angle vector g is given by E(g)ġ = ω, where E(g) is a 3 × 3 transformation matrix. E(g) becomes singular for certain values of g. However, the limitations imposed on the allowable orientations by this singularity are purely mathematical, without physical significance. The problem of singularity in three-parameter representations of attitude has been studied in detail in the literature. An effective way of overcoming the singularity problem is to use the quaternion formulation (see [10]). The unit quaternion α is defined as follows.
ᾱ = ε̂ sin(φ/2),  α4 = cos(φ/2),

where ε̂ = (ε̂1, ε̂2, ε̂3)ᵀ is the unit vector along the eigen-axis of rotation and φ is the magnitude of rotation. The quaternion is also subject to the norm constraint

ᾱᵀᾱ + α4² = 1.    (75.142)
The quaternion representation can be effectively used for the central body attitude. The quaternion can be computed [10] using the Euler angle measurements (Equation 75.140). The open-loop system, given by Equations 75.138, 75.143, and 75.144, has multiple equilibrium solutions ᾱss, α4ss, where the subscript 'ss' denotes the steady-state value (the steady-state value of q is zero). Defining β = (α4 − 1) and denoting ṗ = v, Equations 75.138, 75.143, and 75.144 can be rewritten as
In Equation 75.145 the matrices M and C are functions of p and (p, ṗ), respectively. It should be noted that the first three elements of p, associated with the orientation of the central body, can be fully described by the unit quaternion. The system represented by Equations 75.145-75.148 can be expressed in state-space form as follows: ẋ = f(x, u) (75.149) where x = (ᾱᵀ, β, θᵀ, qᵀ, vᵀ)ᵀ. Note that the dimension of x is (2n + 1), which is one more than the dimension of the system in Equation 75.138. However, one constraint (Equation 75.142) is now present. It can be verified from Equations 75.143 and 75.144 that the constraint (Equation 75.142) is satisfied for all t > 0 if it is satisfied at t = 0.
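The claim that the kinematics (Equations 75.143 and 75.144) preserve the constraint (Equation 75.142) can be checked numerically: integrating the quaternion equations from a unit quaternion leaves the norm at 1 up to integration error. A sketch with an arbitrary constant body rate (the rate values are illustrative):

```python
import numpy as np

def qdot(alpha, omega):
    """Equations 75.143-75.144: alpha = (a_vec(3), a4)."""
    a, a4 = alpha[:3], alpha[3]
    return np.concatenate([0.5 * (np.cross(omega, a) + a4 * omega),
                           [-0.5 * np.dot(omega, a)]])

def rk4_step(alpha, omega, dt):
    k1 = qdot(alpha, omega)
    k2 = qdot(alpha + 0.5*dt*k1, omega)
    k3 = qdot(alpha + 0.5*dt*k2, omega)
    k4 = qdot(alpha + dt*k3, omega)
    return alpha + dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0

alpha = np.array([0.0, 0.0, 0.0, 1.0])   # identity attitude, unit norm
omega = np.array([0.3, -0.2, 0.5])       # constant body rate (rad/s)
dt = 1e-3
for _ in range(2000):                    # 2 s of rotation
    alpha = rk4_step(alpha, omega, dt)
assert abs(np.dot(alpha, alpha) - 1.0) < 1e-9   # constraint 75.142 preserved
```

The exact flow preserves the norm identically; the small tolerance only absorbs Runge-Kutta truncation error.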
A Nonlinear Feedback Control Law Consider the dissipative control law u, given by
where p̃ = (ᾱᵀ, θᵀ)ᵀ. Matrices Gp and Gr are symmetric positive-definite (k × k) matrices, and Gp is given by
Note that Equations 75.150 and 75.151 represent a nonlinear control law. If Gp and Gr satisfy certain conditions, this control law renders the time rate of change of the system's energy negative along all trajectories; i.e., it is a 'dissipative' control law. The closed-loop equilibrium solution can be obtained by equating all the derivatives to zero in Equations 75.138, 75.147, and 75.148. After some algebraic manipulations, there appear to be two equilibrium points in the state space. However, it can be shown [14] that they refer to a single equilibrium point.
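The energy-dissipation mechanism behind such control laws can be illustrated on a scalar analogue (all parameters below are made up): a spring-mass-damper under u = −gp q − gr q̇ has closed-loop storage V = ½mq̇² + ½(k + gp)q² with V̇ = −(d + gr)q̇² ≤ 0, so V never increases and the state decays.

```python
import numpy as np

m, d, k = 1.0, 0.05, 4.0     # illustrative plant: m q'' + d q' + k q = u
gp, gr = 2.0, 1.5            # static dissipative gains

def f(x):
    q, v = x
    u = -gp*q - gr*v
    return np.array([v, (u - d*v - k*q) / m])

def energy(x):
    q, v = x                  # kinetic + potential incl. the "control spring" gp
    return 0.5*m*v**2 + 0.5*(k + gp)*q**2

x = np.array([1.0, 0.0])
dt, V = 1e-3, [energy(x)]
for _ in range(20000):        # 20 s, RK4 integration
    k1 = f(x); k2 = f(x + 0.5*dt*k1); k3 = f(x + 0.5*dt*k2); k4 = f(x + dt*k3)
    x = x + dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0
    V.append(energy(x))
V = np.array(V)
assert np.all(np.diff(V) <= 1e-12)    # energy never increases
assert V[-1] < 1e-3 * V[0]            # and the state decays
```

The multibody result below plays the same game with the full nonlinear energy function, using Property S to cancel the Coriolis terms in V̇.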
If the objective of the control law is to transfer the state of the system from one orientation (equilibrium) position to another, then, without loss of generality, the target orientation can be defined as zero, and the initial orientation, given by [ᾱ(0), α4(0), θ(0)], can always be defined so that |θi(0)| ≤ π, 0 ≤ α4(0) ≤ 1 (corresponding to |φ| ≤ π), and [ᾱ(0), α4(0)] satisfy Equation 75.142. The following stability result is proved in [14]. Stability Result: Suppose that Gp2 ((k − 3) × (k − 3)) and Gr (k × k) are symmetric and positive definite, and Gp1 = ρI3, where ρ > 0. Then the closed-loop system given by Equations 75.149-75.151 is globally asymptotically stable (g.a.s.). This result states that the control law in Equation 75.150 stabilizes the nonlinear system despite unmodeled elastic mode dynamics and parametric errors; that is, a multibody spacecraft can be brought from any initial state to the desired final equilibrium state. The result also applies to a particular case, namely, single-body FSS; that is, this control law can bring a rotating spacecraft to rest or perform robustly stable, large-angle, rest-to-rest maneuvers. A generalization of the control law in Equations 75.150-75.151 to the case of a fully populated Gp1 matrix is given in [16]. The next section considers a special case of multibody systems. Systems in Attitude-Hold Configuration Consider a special case where the central body attitude motion is small. This can occur in many realistic situations. For example, in the case of a space-station-based or shuttle-based manipulator, the moments of inertia of the base (central body) are much larger than those of any manipulator link or payload. In such cases the rotational motion of the base can be assumed to be in the linear region, although the payloads (or links) attached to it can undergo large rotational and translational motions and nonlinear dynamic loading due to Coriolis and centripetal accelerations.
For this case, the attitude of the central body is simply γ, the integral of the inertial angular velocity ω, and the use of quaternions is not necessary. The equations of motion (Equation 75.138) can now be expressed in state-space form as
where x = (pᵀ, ṗᵀ)ᵀ and p = (γᵀ, θᵀ, qᵀ)ᵀ. Note that M and C are functions of p and (p, ṗ), and hence the system is nonlinear. Stability with Dissipative Controllers The static dissipative control law u is given by
where Gp and Gr are constant symmetric positive-definite (k × k) matrices, and yp = Bp and yr = Bṗ. (75.154)
where yp and yr are the measured angular position and rate vectors. The following result is proved in [14]. Stability Result: Suppose that Gp and Gr are symmetric and positive definite. Then the closed-loop system given by
Equations 75.152, 75.153, and 75.154 is globally asymptotically stable. The significance of the two stability results presented in this section is that any nonlinear multibody system belonging to these classes can be robustly stabilized with the dissipative control laws given. In the case of manipulators, this means that one can accomplish any terminal angular position (of the links) from any initial position with guaranteed asymptotic stability. Furthermore, for the static dissipative case, the stability result also holds when the actuators and sensors have nonlinearities. In particular, the stability result of the Section on page 1321 for the linear, single-body FSS also extends to the case of nonlinear flexible multibody systems in attitude-hold configuration; that is, the closed-loop system with the static dissipative controller is globally asymptotically stable in the presence of monotonically nondecreasing (0, ∞)-sector actuator nonlinearities and (0, ∞)-sector sensor nonlinearities. This result is proved in [14]. For the more general case where the central body motion is not in the linear range, the robust stability in the presence of actuator/sensor nonlinearities cannot be easily extended because the stabilizing control law (Equation 75.150) is nonlinear. The robust stability with dynamic dissipative controllers (Section 75.3.2) can also be extended to the multibody case (in attitude-hold configuration). As stated previously, the advantages of using dynamic dissipative controllers include higher performance, more design freedom, and better noise attenuation. Consider the system given by Equation 75.152 with the sensor outputs given by Equation 75.154. As in the linear case, consider a k × k controller K(s) which uses the angular position vector yp(t) to produce the input u(t). The closed-loop system consisting of the nonlinear plant (Equation 75.152) and the controller K(s) is shown in Figure 75.25.
K(s) is said to stabilize the nonlinear plant if the closed-loop system is globally asymptotically stable (with K(s) represented by its minimal realization). The conditions under which K(s) stabilizes the nonlinear plant are the same as in the linear, single-body case discussed in the Section on page 1322; that is, the closed-loop system in Figure 75.25 is g.a.s. if K(s) has no transmission zeros at s = 0 and C(s) = K(s)/s is MSPR. A proof of this result is given in [14]. The proof does not make any assumptions regarding the model order or the knowledge of the parameter values. Hence, the stability is robust to modeling errors and parametric uncertainties. As shown in Section 75.3.2, this controller can also be realized as a strictly proper controller that utilizes the feedback of both yp and yr.
Figure 75.25 Stabilization with dynamic dissipative controller.
75.3.4 Summary This chapter has provided an overview of various control issues for flexible space structures (FSS), which are classified into single-body FSS and multibody FSS. The first problem considered is fine attitude pointing and vibration suppression of single-body FSS, which can be formulated in the linear, time-invariant setting. For this problem, two types of controllers, model-based and passivity-based ("dissipative"), are discussed. The robust stability properties of dissipative controllers are highlighted and design techniques are discussed. For the case of multibody FSS, a generic nonlinear mathematical model is presented and is shown to have a certain passivity property. Nonlinear and linear dissipative control laws which provide robust global asymptotic stability are presented.
References
[1] Anderson, B. D. O., A System Theory Criterion for Positive-Real Matrices, SIAM J. Control, 5, 1967.
[2] Balakrishnan, A. V., Combined Structures-Controls Optimization of Lattice Trusses, Computer Methods Appl. Mech. Eng., 94, 1992.
[3] Balas, G. J. and Doyle, J. C., Control of Lightly Damped Flexible Modes in the Controller Crossover Region, J. Guid. Control Dyn., 17(2), 370-377, 1994.
[4] Balas, M. J., Trends in Large Space Structures Control Theory: Fondest Hopes, Wildest Dreams, IEEE Trans. Automat. Control, AC-27(3), 1982.
[5] Desoer, C. A. and Vidyasagar, M., Feedback Systems: Input-Output Properties, Academic, New York, 1975.
[6] Doyle, J. C., Analysis of Feedback Systems With Structured Uncertainty, IEEE Proc., 129D(6), 1982.
[7] Green, M. and Limebeer, D. J. N., Linear Robust Control, Prentice Hall, Englewood Cliffs, NJ, 1995.
[8] Greenwood, D. T., Principles of Dynamics, Prentice Hall, Englewood Cliffs, NJ, 1988.
[9] Gupta, S., State Space Characterization and Robust Stabilization of Dissipative Systems, D.Sc. Thesis, George Washington University, 1994.
[10] Haug, E. G., Computer-Aided Kinematics and Dynamics of Mechanical Systems, Allyn and Bacon Series in Engineering, 1989.
[11] Hyland, D. C., Junkins, J. L., and Longman, R. W., Active Control Technology for Large Space Structures, J. Guid. Control Dyn., 16(5), 1993.
[12] Joshi, S. M., Control of Large Flexible Space Structures, Lecture Notes in Control and Information Sciences, Vol. 131, Springer, Berlin, 1989.
[13] Joshi, S. M. and Gupta, S., Robust Stabilization of Marginally Stable Positive-Real Systems, NASA TM-109136, 1994.
[14] Joshi, S. M., Kelkar, A. G., and Maghami, P. G., A Class of Stabilizing Controllers for Flexible Multibody Systems, NASA TP-3494, 1995.
[15] Kelkar, A. G., Mathematical Modeling of a Class of Multibody Flexible Space Structures, NASA TM-109166, 1994.
[16] Kelkar, A. G. and Joshi, S. M., Global Stabilization of Multibody Spacecraft Using Quaternion-Based Nonlinear Control Law, Proc. Am. Control Conf., Seattle, Washington, 1995.
[17] Lim, K. B., Maghami, P. G., and Joshi, S. M., Comparison of Controller Designs for an Experimental Flexible Structure, IEEE Control Syst. Mag., 3, 1992.
[18] Lozano-Leal, R. and Joshi, S. M., Strictly Positive Real Functions Revisited, IEEE Trans. Automat. Control, 35(11), 1243-1245, 1990.
[19] Meirovitch, L., Methods of Analytical Dynamics, McGraw-Hill, New York, 1970.
[20] Nurre, G. S., Ryan, R. S., Scofield, H. N., and Sims, J. L., Dynamics and Control of Large Space Structures, J. Guid. Control Dyn., 7(5), 1984.
[21] Packard, A. K., Doyle, J. C., and Balas, G. J., Linear Multivariable Robust Control With a μ-Perspective, ASME J. Dyn. Meas. Control, 115(2(B)), 426-438, 1993.
[22] Russell, R. A., Campbell, T. G., and Freeland, R. E., A Technology Development Program for Large Space Antenna, NASA TM-81902, 1980.
[23] Stein, G. and Athans, M., The LQG/LTR Procedure for Multivariable Feedback Control Design, IEEE Trans. Automat. Control, 32(2), 105-114, 1987.
[24] Van Valkenberg, M. E., Introduction to Modern Network Synthesis, John Wiley & Sons, New York, 1965.
75.4 Line-of-Sight Pointing and Stabilization Control System
David Haessig, GEC-Marconi Systems Corporation, Wayne, NJ
75.4.1 Introduction To gain some insight into the functions of line-of-sight pointing and stabilization, consider the human visual system. It is quite apparent that we can control the direction of our eyesight, but few are aware that we have also been equipped with a stabilization capability. The vestibulo-ocular reflex is a physiological mechanism which acts to fix the direction of our eyes inertially when the head is bobbling about for whatever reason [1]. (Try shaking your head rapidly while viewing this text and note that you can fix your eyesight on a particular word.) Its importance becomes obvious when you imagine life without it. Catching a football while running would be next to impossible. Reading would be difficult while traveling in a car over a rough road, and driving dangerous. Our vision would appear jittery and blurry. This line-of-sight pointing and stabilization mechanism clearly improves and expands our capabilities. Similarly, many man-made systems have been improved by adding pointing and stabilization functionality. In the night vision systems used extensively during the Persian Gulf War, image clarity and resolution, and their precision strike capability, were enhanced by the pointing and stabilization devices they contained. The Falcon Eye System [11] was developed in the late 1980s to build upon the strengths of the night vision systems developed earlier and used extensively during Desert Storm. The Falcon Eye differs from previous night vision systems developed for fighter aircraft in that it is head-steerable. It provides a high degree of night situational awareness by allowing the pilot to look in any direction, including directly above the aircraft. This complicates the control system design problem because not only must the system isolate the line of sight from image-blurring angular vibration, it must also simultaneously track pilot head motion. Nevertheless, this system has been successfully designed, built, extensively flight tested on a General Dynamics F-16 Fighting Falcon, and shown to markedly improve a pilot's ability to find and attack fixed or mobile targets at night. The Falcon Eye System is pictured in Figures 75.26 and 75.27. It consists of (1) a helmet-mounted display with dual optical combiners suspended directly in front of the pilot's eyes to permit a merging of Flir (forward-looking infrared) imagery with external light, (2) a head angular position sensor consisting of a magnetic sensor attached to the helmet and coupled to another attached to the underside of the canopy, and (3) a Flir sensor providing three-axis control of the line-of-sight orientation and including a two-axis fine stabilization assembly. The infrared image captured by this sensor is relayed to the pilot's helmet-mounted display, providing him with a realistic Flir image of the outside world. The helmet orientation serves as a commanded reference that the Flir line of sight must follow.
75.4.2 Overall System Performance Objectives The system's primary design goal was to provide a night vision capability that works and feels as much like natural daytime vision as possible. This means the pilot must be able to turn his head to look in any direction. Also, the Flir scene cannot degrade due to the high-frequency angular vibration present in the airframe where the equipment is mounted. The Falcon Eye Flir attempts to achieve these effects by serving as a buffer between the vehicle and the outside world, which must (1) track the pilot's head with sufficient accuracy so that lag or registration errors between the Flir scene and the outside-world scene are imperceptible and (2) isolate the Flir sensor from vehicle angular vibration, which can cause the image to appear blurry (in a single image frame) and jittery (jumping around from frame to frame). The pilot can select either a 1:1 or a 5.6:1 level of scene magnification. Switching to the magnified view results in a reduction in the size of the scene from 22 x 30 degrees to the very narrow 4 x 4.5 degrees, hence the name narrow field of view. When in narrow field-of-view mode, the scene resolution is much finer, 250 μrads, and therefore the stabilization requirements are tighter. This chapter focuses on the more difficult task of designing a
controller for the stabilization assembly in narrow field-of-view mode.
Control System Structure
The Falcon Eye's field of regard is equal to that of the pilot's. Consequently, the system must precisely control the orientation of the line of sight over a very large range of angles. It is difficult to combine fine stabilization and large field-of-regard capabilities in a single mechanical control effector. Therefore, these tasks are separated and given to two different parts of the overall system (see Figure 75.28). A coarse gimbal set accomplishes the task of achieving a large field of regard. A fine stabilization assembly attached to the inner gimbal acts as a vernier which helps to track commands and isolates the line of sight from angular vibrations, both those generated by the vehicle and those generated within the coarse gimbal set. The coarse positioning gimbal system tracks pilot head position commands but will fall short because of its limited dynamic response and because of the effects of torque disturbances within the gimbal drive systems. These torque disturbances include stiction, motor torque ripple and cogging, bearing torque variations, and many others. The stabilization system assists in tracking the commanded head position by (1) isolating the line of sight from vehicle vibration and (2) reducing the residual tracking error left by the gimbal servos. The reason for sending the command rates rather than the command angles to the stabilization system will be covered in the description of the stabilization system controller.
Control System Performance Requirements
Ideally, the Flir sensor's line of sight should perfectly track the pilot's line of sight. The control system must therefore limit the difference between these two pointing directions. (Recognize that the pilot's line of sight is defined here by the orientation of his head, not that of his eyes. A two-dimensional scene appears in the display and he is free to direct his eyes at any particular point in that scene, much like natural vision.) Two quantities precisely define the pilot's line of sight in inertial space: the vehicle attitude relative to inertial space θv, and the pilot head attitude relative to the vehicle φp (θ is used for inertial variables and φ for relative variables). This is depicted in one dimension in Figure 75.29, i.e., θp = φp + θv. The Flir, although connected to the aircraft, is physically separated from the pilot by a nonrigid structure which can exhibit vibrational motion relative to the pilot's location. This relative angular motion will be referred to as φv. The Flir's line of sight θf equals the vehicle attitude θv, plus the vehicle vibration φv, plus the Flir line of sight relative to the vehicle φf, i.e., θf = φf + φv + θv. The difference between the two is the error,
As Equation 75.155 indicates, for this error to equal zero, the Flir must track pilot head motion and move antiphase to vehicle vibration. Head orientation is therefore accurately described as a commanded reference input, and vehicle vibration as a disturbance.
Figure 75.26 The Falcon Eye Flir, consisting of a helmet-mounted display, a head position sensor, a Flir optical system, and the line-of-sight position control system partially visible just forward of the canopy. (By courtesy of Texas Instruments, Defense Systems & Electronics Group.)
Figure 75.27 The Falcon Eye gimbals follow pilot head motion. An image stabilization assembly attached inside the ball helps it follow the pilot's head and isolates the line of sight from angular vibration.
Figure 75.28 Partitioning of the control responsibility: the gimbals track commands; the stabilization system helps to track commands and stabilizes the image.
Figure 75.29 Tracking of the pilot's line of sight involves isolation from vehicle vibration φv, tracking of vehicle maneuvering θv, and tracking of pilot head motion relative to the vehicle φp.
The ideal response to various inputs to the system is shown in Figure 75.30. The output is the Flir line-of-sight angle relative to inertial space. The inputs consist of the vehicle angle relative to inertial space at the Flir (θb = θv + φv), the pilot head commanded angle relative to the vehicle, and a third input not mentioned thus far, the tracking error due to torque disturbances generated within the gimbals. The goal is to point the Flir sensor's line of sight continuously in the same direction as the pilot's. Therefore, pilot head commands are ideally tracked perfectly, as the upper plot indicates. However, because these commands are relative to the vehicle, the vehicle maneuver motion θv, which is generally low frequency, must also be tracked, as the lower plot depicts. At higher frequencies, vehicle angular vibration φv injects an input that affects the Flir sensor but not the pilot. So, as the lower plot indicates, the ideal response to vehicle vibration is complete attenuation. Between these frequency regimes there is a transition region. In this application, maneuver tracking must occur for frequencies up to about 0.3 Hz. The transition occurs between 0.3 and about 5 Hz, and good isolation performance is needed in the 3 to 30 Hz range. Beyond that, because there is little vibration energy at the higher frequencies, the transfer function from vehicle motion to line-of-sight motion can increase without significant penalty.
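The follow-low-frequencies, reject-high-frequencies split described above can be illustrated with a single-pole complementary pair T(s) = ωc/(s + ωc) and S(s) = s/(s + ωc). Placing the crossover at 1 Hz — an assumed value inside the 0.3 to 5 Hz transition band, not a Falcon Eye design parameter — gives near-unity response at 0.3 Hz and roughly 20 dB of attenuation at 10 Hz.

```python
import numpy as np

wc = 2 * np.pi * 1.0          # assumed 1 Hz crossover in the transition band

def T(f):                      # "followed" motion: tracking response
    s = 1j * 2 * np.pi * f
    return wc / (s + wc)

def S(f):                      # "rejected" motion: isolation response
    s = 1j * 2 * np.pi * f
    return s / (s + wc)

f = np.array([0.3, 1.0, 10.0, 30.0])
assert np.allclose(T(f) + S(f), 1.0)   # complementary by construction
assert abs(T(0.3)) > 0.9               # maneuvers (0.3 Hz) tracked closely
assert abs(T(10.0)) < 0.12             # 10 Hz base motion largely not followed
```

A real design would shape higher-order filters to sharpen the transition, but the complementarity constraint T + S = 1 — you cannot reject what you track — is exactly the trade the text describes.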
The center curve of Figure 75.30 is the ideal system response to torque disturbances generated within the gimbals. Ideally, internally generated tracking errors should be completely attenuated by the stabilization assembly. Thus the magnitude of this transfer function is zero, as shown. This is implied by the perfect command-tracking ideal response depicted in the upper plot. However, it is shown separately to distinguish and clearly identify these two characteristics. The ideal performance requirements discussed thus far serve primarily to guide the specification of realistic, quantifiable performance requirements defining allowable responses to specific
Figure 75.30 The response of an ideal line-of-sight control system to the various inputs.
input environments. For the Falcon Eye these fall into four areas: (1) stabilization error, the high-frequency line-of-sight motion caused by vehicle vibration or other disturbances; (2) tracking smoothness, the tracking error due to internally generated torque disturbances; (3) command tracking, the error in tracking specifically defined head motion commands; and (4) maneuver tracking, the tracking error that occurs in response to specific maneuvers. In this chapter our evaluation of performance will cover the first two in detail. Stabilization Requirement Vehicle angular vibration produces line-of-sight motion at higher frequencies (above 5 Hz) that can degrade the resolution of the image. Resolution is not impaired by vibration, but is limited by the size of the pixel associated with the infrared (IR) detector, if the line-of-sight rms motion due to vibration is less than 1/4 the size of the pixel, which in the Falcon Eye is 250 μrads in narrow field of view. The image stabilization assembly must allow no more than 62 μrads rms of line-of-sight motion when subjected to vehicle angular vibration. Tracking Smoothness Requirement Testing has indicated that a pilot can see, and is bothered by, a jump or jitter in the image at intermediate frequencies (2 to 10 Hz) when it approaches or exceeds the size of one pixel, i.e., 250 μrads. Therefore, a smoothness-of-tracking requirement is applied which limits the peak-to-peak tracking error to less than 250 μrads in the 2 to 10 Hz range when following constant-rate head motion. Command and Maneuver Tracking Requirements In the next section we will see that the stabilization system must provide command and maneuver tracking bandwidths of greater than 30 and 0.3 Hz, respectively.
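The 62 μrad stabilization budget follows directly from the quarter-pixel rule stated above; a two-line check:

```python
# Quarter-pixel rule: rms line-of-sight motion must stay below 1/4 of the
# 250-microradian narrow-field-of-view pixel, reproducing the ~62 urad budget.
pixel_urad = 250.0
budget_urad = pixel_urad / 4.0
assert budget_urad == 62.5

# The same pixel size bounds peak-to-peak jitter for tracking smoothness.
smoothness_pp_urad = pixel_urad
assert smoothness_pp_urad == 250.0
```

(The text quotes the rms budget as 62 μrads, i.e., the 62.5 μrad quarter-pixel value rounded down.)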
75.4.3 Physical System Description and Modeling

The stabilization assembly, pictured in Figure 75.31, is rigidly attached to the inner gimbal of the coarse positioning system (see Figure 75.27). It produces small changes in the line-of-sight
THE CONTROL HANDBOOK

direction by changing the angular position of the mirror relative to the base structure. The mirror is attached to the base by a flexure hinge, a solid but flexible structure which permits angular motion only about two axes and applies a small restoring spring torque. Rotation is limited to ±5 mrads (±1/3°) by hardstops under the mirror. Between the mirror and the base there is a two-axis torquer which applies the control signals. A two-axis angular position sensor, referred to as the pickoff sensor, measures the angular position of the mirror relative to the base. A two-axis angular rate sensor attached rigidly in the base senses angular rate, including the vibration disturbance that the stabilization assembly must counter. The rate sensor is oriented to detect any rotation about axes perpendicular to the line of sight, because only those alter the line-of-sight direction. While operating within the hardstops, each axis of the mirror is governed by the dynamic equation,
where θm and θb are the mirror and base inertial angular positions, respectively, and τm is the control torque. Other parameters are defined in Table 75.4. Unmodeled nonlinear dynamics such as centripetal, Coriolis, and friction torques are small and neglected. The variable that we are interested in controlling is the line-of-sight angle θlos. This variable cannot be sensed directly but is related to other variables that are sensed. Consider the situation where the mirror is fixed to the base so that φm = θm − θb = 0. Then the line of sight follows the base exactly: θlos = θb. If the mirror rotates relative to the base and the base remains stationary, the line of sight moves h times farther due to optical effects (h is 2 for a typical mirror). Thus
Perfect inertial stabilization would be achieved by driving the mirror position φm to −θb/h. However, not everything in θb is to be rejected: only vehicle vibration, not vehicle maneuvers.
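The line-of-sight kinematics just described can be sketched directly. The relation θlos = θb + h·φm and the perfect-stabilization condition φm = −θb/h are from the text; the numerical test values are arbitrary.

```python
# Line-of-sight kinematics from the text: theta_los = theta_b + h * phi_m,
# where phi_m = theta_m - theta_b is the mirror angle relative to the base
# and h is the optical scale factor (2 for a typical mirror; 1.386 in the
# Falcon Eye narrow field of view, per Table 75.4).
h = 1.386

def line_of_sight(theta_b, phi_m, h=h):
    """Line-of-sight angle given base angle and relative mirror angle (rad)."""
    return theta_b + h * phi_m

# Mirror fixed to the base (phi_m = 0): line of sight follows the base exactly.
theta_b = 0.010  # 10 mrad of base motion (arbitrary test value)
print(line_of_sight(theta_b, 0.0))          # equals theta_b

# Perfect stabilization: driving phi_m to -theta_b/h holds the LOS still.
print(line_of_sight(theta_b, -theta_b / h))  # essentially zero
```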
Exogenous Input Environments  Mathematical descriptions of the input command and disturbance environments (exogenous inputs) are used in evaluating and designing the control system compensators. Models of the disturbance inputs serve as a basis for the structure of the design model and become embedded within the controller. The resulting designs were evaluated, analytically and through simulation, using these models to define or generate the inputs.

Vehicle Vibration  Vehicle angular vibration at the location where the Flir equipment is mounted was expected to be equal to or less than that defined by the solid line in the spectrum of Figure 75.32. It has an rms angular acceleration and angular position of 4.4 rad/sec² and 560 µrads, respectively. The bulk of its energy is concentrated around 10 Hz, and it extends from 5 to 200 Hz. Angular motion contributed beyond 200 Hz is negligible in magnitude. Angular motion below 5 Hz falls in the jitter range and does not have a blurring effect.
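A spectrum with these properties (energy concentrated near 10 Hz) can be sketched with a resonant shaping filter. The second-order resonance below is only a stand-in for illustration; the actual fifth-order filter of Equation 75.158 has coefficients that are only partly legible in the source, and the damping ratio chosen here is hypothetical.

```python
import numpy as np

# Stand-in shaping filter for the vibration environment described above:
# a second-order resonance whose magnitude peaks near 10 Hz, where the
# text says the bulk of the vibration energy is concentrated.
wn = 2 * np.pi * 10.0   # resonance near 10 Hz (rad/sec)
zeta = 0.2              # hypothetical damping ratio, for illustration only

def shaping_filter_mag(w):
    """|H(jw)| of wn^2 / (s^2 + 2*zeta*wn*s + wn^2)."""
    s = 1j * w
    return np.abs(wn**2 / (s**2 + 2 * zeta * wn * s + wn**2))

w = 2 * np.pi * np.logspace(-1, 3, 2000)   # 0.1 Hz to 1 kHz
mag = shaping_filter_mag(w)
f_peak = w[np.argmax(mag)] / (2 * np.pi)
print(f"spectrum peaks near {f_peak:.1f} Hz")
```

Driving such a filter with unit-variance white noise yields a stationary process with this spectrum, which is exactly the construction the text introduces next.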
A stationary process having the spectrum of Figure 75.32 can be generated by passing unit-variance white noise through a shaping filter having the transfer function,
where Kv = 2.22 ×, α1 = 94 rad/sec, α2 = 377 rad/sec, α3 = 56.5 rad/sec, α4 = 188 rad/sec, α5 = 942 rad/sec, ω̄ = 62.8 rad/sec, and ζ = 3.5. Twice integrating yields angular position,
The image stabilization requirement of 62 µrads rms must be achieved when subjected to this input (see Figure 75.36).

Vehicle Maneuvering  A precise mathematical definition of all input environments is not always available to the control system designer, or if one is available, it may not accurately represent the input environment that will exist when the equipment is in use. It may be necessary, as was the case here, to estimate the environment. The F-16 is capable of very high dynamic maneuvering. However, a pilot would not be doing those types of maneuvers when using this equipment in the 5.6:1 magnified, narrow field-of-view mode. (It is described to be like flying through a straw.) Under these conditions, maneuvers would be held to a minimum. Therefore, an assumption was made that the bulk of maneuvering motion energy, when in narrow field of view, would be concentrated at low frequencies, between DC and 0.3 Hz. This consideration will impact the design of the LQG (linear quadratic Gaussian) controller. By selecting the filter weighting matrices, the poles of the Kalman filter associated with the maneuver state will be positioned at locations having a magnitude of 0.3 Hz.

Pilot Head Motion  Like maneuver motion, a good mathematical description of pilot head motion (when flying through the straw) was not available. One would think that the narrow field-of-view conditions would result in very little head motion; however, the pilots will glance down at their instruments, producing periods of moderately high rates of motion that must be tracked. Our assumption was that this head motion frequency content would extend out to about 3 or 4 Hz. Tight tracking of those inputs requires bandwidths of 30 Hz or greater (well beyond the capabilities of the coarse positioning servo).

Self-Induced Torque Disturbances  There are many sources of internally generated error present in the typical coarse line-of-sight pointing system.
These include motor cogging (also called slot-lock or detent), torque ripple, stiction of seals and bearings, bearing retainer friction, bearing disturbance torques, etc. Friction or stiction in the gimbals causes a hang-off error whenever the gimbals change direction or whenever motion is initiated after a period of rest. Friction estimation and compensation techniques, described in [2], [8], as well as in this handbook in Section 77.1 (Friction Compensation), are particularly important in systems that do not include a fine stabilization assembly. Tracking error due to cogging and torque ripple is of particular concern. Both are periodic functions of motor shaft angle
75.4. LINE-OF-SIGHT POINTING AND STABILIZATION CONTROL SYSTEM
Figure 75.31 The two-axis stabilization assembly, consisting of a mirror, a two-axis angular rate Multisensor (trademark of GEC Systems), mirror angular position sensors, and torquers. Changes in line-of-sight direction are effected by tilting the mirror.
Figure 75.33 The stabilization system helps attenuate tracking error caused by internally generated torque disturbances.
Figure 75.32 When subjected to this angular vibration environment (solid), having an amplitude of 560 µrads rms, the image stabilization assembly cannot permit more than 62 µrads rms of line-of-sight motion. An approximation (dashed) is used for controller design.
which produce a noticeable jumping of the image when the pilot is steadily turning his head. The coarse positioning system will have a certain response to these disturbances, as defined by its Torque Disturbance Sensitivity function, which is typically shaped like the curve in Figure 75.33. In the Falcon Eye, torque disturbances produce a worst-case gimbal tracking error of about 3 mrad peak to peak, far in excess of the 250 µrad limit imposed by the IR detector resolution. The stabilization system therefore must attenuate those tracking errors by more than a factor of 12 for frequencies up to 10 Hz.
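The required attenuation factor follows directly from the numbers quoted above; a quick check (pure arithmetic on values from the text):

```python
import math

# Worst-case gimbal tracking error from torque disturbances vs. the pixel limit.
worst_case_pp_urad = 3000.0   # about 3 mrad peak to peak, in µrad
limit_pp_urad = 250.0         # one-pixel limit from the IR detector resolution

required_factor = worst_case_pp_urad / limit_pp_urad
attenuation_db = 20 * math.log10(required_factor)

print(required_factor)   # 12.0
print(attenuation_db)    # about 21.6 dB (quoted later in the text as 22 dB)
```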
Stabilization Control System Structure and Modeling The architecture of the overall control system is shown in some detail in Figure 75.34. It represents a single axis of the two-axis system (coordinate transformations not shown). A discussion of the coordinate systems relating gimbals to mirror stabilization assembly is not provided because it will not help to understand this control system design effort. All that one needs to recognize is that coordinate frames used in designing the stabilization system controllers are attached locally to the base of the stabilization assembly, and that, in the actual implementation, transformations from the gimbal to the stabilization assembly coordinate frames, and back, were employed.
Linear Truth Model  A linear model of one axis of the stabilization control system is given in Figure 75.35. It defines the stabilization system's response to the three inputs discussed in conjunction with Figure 75.30 and will be used in evaluating system performance with the compensators to be derived.
Figure 75.34 Structure of the controller system.
75.4.4 Controller Design

Design Procedure  Our objective was to develop a compensator that achieves a performance approaching the ideal performance described with Figure 75.30. This was accomplished using the tools of Linear Quadratic Gaussian (LQG) optimal control theory. The design procedure, briefly, is as follows. It begins with the development of a design model capturing the essential features of the plant and exogenous inputs. This model serves as a basis for the design of a steady-state LQ regulator and Kalman filter, which combine to form a linear time-invariant dynamic compensator. The designer adjusts the regulator and filter weighting matrices (Q, R, V, and W) until a design is produced which achieves satisfactory system performance. This is described in greater detail below.

1. Design Model Definition  The first step in the design process entails defining a linear state-space model of the process to be controlled. It becomes embedded within the controller; therefore the simplest model that adequately describes the plant and exogenous disturbance inputs should be used. In this application an adequate description is one which will enable the controller, given the measured inputs, to distinguish the various exogenous inputs from one another, and thereby to effect the proper control action. The design model will be represented in the usual state-space form:
ẋ = Ax + Bu + v,
y = Cx + w,    (75.160)

where x, y, and u are the appropriately dimensioned state, measurement, and control vectors, respectively, all functions of time. The vectors v and w are independent, white Gaussian noise signals, having autocovariance matrices

E{vv′} = Vδ(t)  and  E{ww′} = Wδ(τ).    (75.161)

Transfer functions from those three inputs to the line-of-sight angle θlos are referred to as Tg(s), Tc(s), and Tv(s), respectively. The system parameters of this model appear in Table 75.4 (units N-m refer to Newton-meters). Although implemented digitally, the compensator is shown here in continuous-time form. It was possible to design and analyze the compensator in the continuous-time domain because the fastest time constants of the closed-loop system were significantly less (e.g., 4 times) than the sample frequency (2700 Hz). Nevertheless, delays due to processing and the sample-and-hold (S/H) of data, and other performance-limiting effects, were included in the model. These include a 2 kHz, three-pole, low-pass Butterworth antialias filter A(s), and a 1 kHz, three-pole, low-pass Butterworth reconstruction filter R(s) on the control output from the D/A converter.
A, B, and C are constant coefficient system matrices.

2. Linear Quadratic Regulator Design
At this stage of the design process the state and control weighting matrices, Q and R, of the quadratic performance index (Equation 75.162) are selected, and the control matrix G of the linear control law

u = −Gx    (75.163)

is derived in accordance with G = R⁻¹B′M, where M is the solution to the Control Algebraic Riccati Equation (see Chapter 36),

0 = MA + A′M − MBR⁻¹B′M + Q.    (75.164)
The coefficients of Q, the state weighting, are defined in accordance with the error signal to be minimized. This error will be some linear combination of state variables. The control weighting R is adjusted to achieve some other criterion. It usually suffices to weight each control independently; then all of the off-diagonal coefficients in R are zero. The diagonal elements are typically chosen to achieve a bandwidth criterion, as was the case in this application, described below.

3. Kalman Filter Design  The state vector x of the linear control law (Equation 75.163) is typically not completely known. Some states will not be measured directly, and those measured may be corrupted by noise. The steady-state Kalman filter
is enlisted to provide an estimate x̂ of the actual state x. The designer specifies the coefficients in the weighting matrices, V and W, which, along with the system matrices A and C and the Filter Algebraic Riccati Equation (also in Chapter 36),
define the filter gain K = PC′W⁻¹. It is important to recognize that we are not designing a Kalman filter for accurately estimating the system state, but we are using
Figure 75.35 Linear model used in evaluating stabilization system performance.

TABLE 75.4 Stabilization System Actuator and Error Source Parameters

Symbol   Value        Parameter name                               Units
Jm       3.5 × 10⁻    mirror inertia                               N-m-s²
Dm       4 × 10⁻      flexure damping                              N-m-s
Km       .096         flexure stiffness                            N-m/rad
Tg       1.75         gyro signal latency                          msec
Te       85           tracking error processing delay              µsec
Tφ       40           command delay                                µsec
Tp       90           pickoff measurement processing delay         µsec
Tm       240          stab. compensator processing & S/H delay     µsec
h        1.386        optical scale factor                         —
the Kalman filter as a synthesis tool for designing a compensator. The matrices V and W are not viewed as descriptions of actual noise present in the system, but as design parameters, i.e., knobs, that the designer adjusts to shape the dynamic response of the filter and to achieve design goals. To simplify the task of finding an acceptable design, one would like to work with the smallest set of parameters that provides the needed design freedom. To this end we assume that the state and measurement noise vectors, v and w, are not only independent of each other, but that the individual elements of v and w are independent. Then only the diagonal elements of V and W are nonzero, and these are employed as adjustable design parameters. (There are cases when the off-diagonal elements must be adjusted to achieve certain design characteristics. An example is given in [4].) The filter and regulator combine to form a dynamic compensator having the matrix transfer function,
where
4. Evaluation of Performance
At this stage we verify that the design goals have been satisfied. In this application, this involved evaluating the performance predicted by the linear model of the system (Figure 75.35) with the newly derived controller. If the requirements are met, a more thorough evaluation can be performed by simulating the
operation of this controller in a detailed truth model of the system. This is a model that includes all of the important dynamic and nonlinear effects: the coordinate transformations, Coriolis and centripetal couplings, the gimbals, friction, backlash, vehicle and pilot head attitude data, etc. If satisfactory performance is achieved there, the design is complete. If not, a judgment is made as to what can be changed to improve performance, and the design process is repeated. This entails (1) going back to Steps 2 and 3, modifying one or more weighting matrices, Q, R, V, and W, and generating a new compensator based on the same design model, or (2) going back to Step 1, altering the design model to incorporate some effect now understood to be important, and repeating the design. This continues until satisfactory performance with an adequate controller is achieved. In this application the stabilization performance was computed in accordance with the model of Figure 75.36, where Vθ(jω) is the shaping filter (see Equation 75.158) defining worst-case angular motion, and Tv(jω) is the stabilization system's response to vehicle angular motion inputs. The resulting line-of-sight motion due to angular vibration in the frequency range over which the stabilization requirement applies (5 to 200 Hz) is given by

σ²los = (1/π) ∫ from 2π·5 to 2π·200 of |Tv(jω)Vθ(jω)|² σ²w dω,    (75.169)

where σ²w = 1. The response of the complete system to internal torque disturbances was generated in accordance with the model in Figure 75.37, where Tg(jω) is the stabilization system's response to
Figure 75.36 Model used to compute line-of-sight motion due to angular vehicle vibration.
gimbal angular input (whose ideal value is depicted by the center curve of Figure 75.30). Because the gimbal servos are not described, this check will entail verifying that the stabilization system provides adequate isolation (i.e., >12 in the 2 to 10 Hz range) from gimbal angular position errors due to torque disturbances.

Figure 75.37 Model used to compute line-of-sight motion due to torque disturbances.
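Band-limited rms integrals of the kind in Equation 75.169 are straightforward to evaluate numerically. The sketch below uses a first-order low-pass filter as a stand-in for the product Tv·Vθ (which is not reproduced in the text), checking the numerics against the closed-form white-noise variance a/2 of H(s) = a/(s + a).

```python
import numpy as np

# Numerical evaluation of sigma^2 = (1/pi) * integral of |H(jw)|^2 dw, the
# form used in Equation 75.169 (sigma_w^2 = 1, as in the text). The filter
# H(s) = a/(s+a) is a stand-in; its full-band variance is a/2 exactly.
a = 2 * np.pi * 10.0                              # stand-in corner frequency
w = np.linspace(0.0, 2 * np.pi * 5000.0, 500_000)  # integrate well past the corner
H2 = a**2 / (w**2 + a**2)                          # |H(jw)|^2

# Trapezoidal quadrature of the one-sided integral.
sigma2 = np.sum((H2[1:] + H2[:-1]) / 2 * np.diff(w)) / np.pi

print(sigma2)   # numerical variance
print(a / 2)    # closed-form variance, for comparison
```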
Design Model Development  The process of developing a design model commonly requires several iterations. One reason for this is that it is highly unlikely, except in simple cases, that the designer will anticipate in the first attempt all of the important characteristics that should be in the design model. Secondly, it pays to begin with a simple model which may or may not be adequate, then to increase complexity until satisfactory performance is achieved, rather than beginning immediately with what may be an unnecessarily complicated, high-order model. The design models developed for this application all include the stabilization mirror dynamics (Equation 75.156) and some form of base motion model. (Recall that base motion is defined as motion about a local coordinate system attached to the base to which the mirror is mounted.) Since the controller must isolate the line of sight from vehicle vibration, the base motion model must include a model of that disturbance. The actual vibration model, defined by the shaping filter of Equation 75.158, is fifth-order and probably more complicated than need be. A third-order approximation was developed for the purpose of the controller design. This model, when excited by white noise, yields a signal whose second derivative has the spectrum shown as a dashed line in Figure 75.32. It has an angular position output given by the transfer function,

V′θ(s) = Kv / [(s + ω̄0)(s² + 2ζ1ω̄1s + ω̄1²)],    (75.170)

where ω̄0 = 126 rad/sec, ω̄1 = 62.8 rad/sec, and ζ1 = 0.5. This model is part of the controller design model block diagram shown in Figure 75.38, depicting the design model in its final form after several stages of development. Those stages are described below.

The first design model developed for this application involved a base motion model consisting of vehicle vibration only. The resulting controller did a fine job of isolating the line of sight from vibration. However, it knew nothing of vehicle maneuvers or of gimbal motion, and when tested in simulation it interpreted those large motions as vibration to be rejected, driving the mirror into its stops. To remedy this, models of vehicle maneuvering and commanded gimbal motion were added to the design model. Both are modeled as double integrators, because white noise through a double integrator produces a signal having the characteristics expected of maneuver and head motion: infinite DC gain, needed to represent constant turning rates, and a roll-off with increasing frequency. An important distinction between the command and maneuver signals is that the command is a known input while the maneuver signal is not. The command rate is therefore included as a measured output in the design model. A controller developed with this design model worked, in that it did not drive the mirror to the stops. However, stabilization performance was not quite adequate (85 µrads rms, relative to the required 62). Performance was being limited by the pure delay present in the gyro signal measurement. Improved stabilization performance was achieved by adding a model of the delay to the design model. A first-order Padé approximation to the delay incorporated in the design model increased the order of the compensator by one, but also improved stabilization performance sufficiently (to 55 µrads rms), as described below. Although stabilization performance was now adequate, the tracking smoothness performance was not. Improved ability to isolate the line of sight from gimbal motion due to internally generated torque disturbances was achieved by adding the gimbal servo tracking error, eg, as an input to the stabilization system compensator. Two additional integrator states were added to allow including the tracking error as an input to the compensator, and therefore as an output of the design model. The resulting design model in its final form is shown in Figure 75.38.

Figure 75.38 Controller design model.
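The first-order Padé approximation of the gyro delay mentioned above can be sketched directly: e^(−sT) ≈ (1 − sT/2)/(1 + sT/2), which adds one state and matches the delay's phase well below 1/T. The 1.75 ms latency is the gyro figure from Table 75.4.

```python
import numpy as np

# First-order Padé approximation of a pure delay e^{-sT}:
#   e^{-sT} ~ (1 - sT/2) / (1 + sT/2)
# It is allpass (unit magnitude) and phase-accurate well below 1/T (~90 Hz here).
T = 1.75e-3   # gyro signal latency, 1.75 msec (Table 75.4)

def delay_true(w):
    return np.exp(-1j * w * T)

def delay_pade1(w):
    s = 1j * w
    return (1 - s * T / 2) / (1 + s * T / 2)

for f in (10.0, 100.0):
    w = 2 * np.pi * f
    err = np.angle(delay_pade1(w)) - np.angle(delay_true(w))
    print(f"{f:>5.0f} Hz: phase error {np.degrees(err):.3f} deg")
```

The phase error is negligible at 10 Hz and grows only as the frequency approaches 1/T, which is why a single extra state was enough to recover the lost stabilization performance.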
Two of the integrators in the design model are drawn with dashed lines. The regulator gains on those states turn out to be zero. Consequently, they have no effect on the output of the controller and can be eliminated from the design model without effect. The model and the resulting compensator are therefore tenth order. You will note that the command angle φc is no longer present as a state in the design model. And even if it were, it could not be used as a reference input to the stabilization system because the gain on that state is zero. The command rate is therefore used as the reference input defining pilot head motion. The ten-element state vector and four-element measurement vector are ordered as follows:
x = [mirror inertial rate, mirror angular position, gimbal tracking error rate, gimbal tracking error, command rate, vehicle vibration filter state, vehicle vibration angular rate, vehicle vibration angular position, vehicle maneuver angular rate, gyro delay model variable]′

y = [mirror pickoff angle, gyro rate, gimbal tracking error, command rate]′
The system parameters contained in the design model are given in Table 75.5. The units of these parameters were changed from seconds to deciseconds (ds), adjusting the scaling of the system matrices to eliminate numerical problems which were precluding the solution of the Control and Filter Algebraic Riccati Equations. The controller gain and filter gain matrices were derived for this scaled system, and then scaled back to units of seconds prior to their usage. The corresponding system matrices are
The small negative entries (ε = −1 × 10⁻⁵) on the diagonal of A and the small spring term (−.01) at A34 were also added to eliminate the numerical problems precluding the convergence of the Riccati equation solver.
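The unit-rescaling trick is simple to demonstrate: rewriting the dynamics in deciseconds multiplies A and B by 0.1, which compresses the numerical range of the matrix entries the Riccati solver must handle. The matrix below is a toy fast mode for illustration, not the actual design model.

```python
import numpy as np

# Time-unit rescaling: with tau = t / 0.1 s, dx/dtau = (0.1 A) x + (0.1 B) u,
# so converting from seconds to deciseconds scales A (and B) by 0.1.
A_sec = np.array([[0.0, 1.0],
                  [-4.0e4, -40.0]])   # toy fast mode, entries in 1/sec units
scale = 0.1                           # sec -> ds
A_ds = scale * A_sec                  # same dynamics, expressed in 1/ds units

# Eigenvalues (mode frequencies) scale with the time unit and nothing else.
e_ds = np.sort_complex(np.linalg.eigvals(A_ds))
e_sec = np.sort_complex(scale * np.linalg.eigvals(A_sec))
print(e_ds)
print(e_sec)
```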
LQ Regulator Design  A quadratic performance integral was formed which contains both the error e that the regulator must drive to zero and the control action τm:
Because there is only a single control variable, the mirror torque τm, the control weighting ρ is scalar. An expression for the error e was derived from the tracking error expression given by Equation 75.155. The Flir line of sight relative to the vehicle, φf, is composed of the gimbal relative angle φg and the line-of-sight angle relative to the gimbals due to mirror rotation, Δφm:
To include gimbal tracking error in this expression, we express the gimbal angle in terms of the command and gimbal tracking error:
We also assume that the pilot's head attitude relative to the vehicle, φp, of Equation 75.155 equals the command angle φc (i.e., latency and noise in the head tracker data are ignored). Combining this with Equations 75.155, 75.172, and 75.173 yields the error equation,
The state weighting matrix Q is formed to penalize this error expression. To compute Q easily, we express e in terms of the state vector x: e = Lx. With optical scale factor h = 1.386 (which applies in narrow field-of-view mode) this vector L becomes

TABLE 75.5 Design Model Scaled System Parameters

Parameter                        Value        Units
Mirror inertia Jm                3.54 × 10⁻   N-m-ds²
Mirror hinge damping Dm          4.0 × 10⁻    N-m-ds
Mirror hinge stiffness Km        .096         N-m/rad
Gyro delay parameter τgd         1/2(.175)    ds
Shaping filter parameter ω̄0      1.26         rad/ds
Shaping filter parameter ω̄1      .628         rad/ds
Shaping filter parameter ζ1      0.5          —
With this, the state weighting is computed as Q = L′L (since e′e = x′L′Lx). It was necessary to add small positive terms (ε = 1 × 10⁻) on the diagonal to make Q slightly more positive definite because, without them, the Riccati equation solver halted, mistaking Q for a negative-definite matrix. Only a single parameter, the control weighting ρ of the matrix R = [ρ], must be selected to define completely all of the matrices entering into the Control Algebraic Riccati Equation. As ρ is decreased, the eigenvalues of the controlled plant, eig(sI − A + BG), associated with the mirror dynamics move to the left and approach asymptotes at 45° above and 45° below the negative real axis. The other eigenvalues, those associated with the exogenous inputs, do not move because those modes are uncontrollable. The coefficient ρ was adjusted to place the mirror eigenvalues at a location having a magnitude of about 100 Hz. This enables the regulator to respond adequately to exogenous inputs having lesser frequency content, and yet does not move those eigenvalues into a frequency regime where unmodeled phase lag in the system begins to have a destabilizing effect. With R = [0.125], the controller matrix G (for the scaled system) was computed via solution of the Control Algebraic Riccati Equation.

Filter Design  There are a total of nine nonzero coefficients in V and W, all on the diagonal, which the designer must specify. To reduce the number of design parameters to a manageable number, the coefficients in W were set to fixed values proportional to the actual noise variances of the associated measurements. The tracking error and command rate signals are not actual measurements but are digital signals fed forward from the gimbal servos, so their variances were set to values consistent with their quantization levels. The elements of W were, therefore, set to values proportional to the pickoff, gyro rate, tracking error, and command rate noise variances, respectively, which yielded

W = diag[1  5  .1  16.7].    (75.177)

The five nonzero diagonal coefficients of V were selected by performing a manual search over the five-dimensional parameter space spanned by the plant noise variances v1 through v5. At selected points within that space, control system compensators were computed and the performance of the stabilization system with that compensator was evaluated. This included examining how closely its performance approached the ideal (Figure 75.30), and the calculations necessary to determine if the stabilization and tracking smoothness requirements were met. This manual search did not proceed blindly, however. It was possible to predict how adjustments made to specific plant noise levels should alter closed-loop performance. For example, what would you expect the compensator to do if you tell it there is more noise driving the vibration portion of the design model by increasing v4? It should increase the degree of isolation against vehicle vibration. And in fact it does, as shown in Figure 75.39: as v4 is increased, the transfer function from vehicle angular motion to line-of-sight motion deepens. This manual search settled upon the final V, to which corresponds the Kalman gain matrix K. This matrix and the G matrix given above were unscaled, from units of deciseconds to seconds, before use.
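The regulator gain G and filter gain K combine into a dynamic compensator with state matrix A − BG − KC, input matrix K, and output matrix G. The sketch below assembles that structure on a toy second-order plant; the gains are chosen by hand for illustration rather than taken from Riccati solutions.

```python
import numpy as np

# LQG compensator structure: x_hat' = (A - BG - KC) x_hat + K y,  u = -G x_hat,
# i.e., the transfer function C(s) = G (sI - A + BG + KC)^{-1} K.
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # toy double-integrator plant
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
G = np.array([[1.0, np.sqrt(3.0)]])      # stabilizing regulator gain (hand-picked)
K = np.array([[np.sqrt(3.0)], [1.0]])    # stabilizing filter gain (hand-picked)

A_cmp = A - B @ G - K @ C                # compensator dynamics

def compensator_tf(w):
    """C(jw) = G (jw I - A + BG + KC)^{-1} K, evaluated at one frequency."""
    return (G @ np.linalg.inv(1j * w * np.eye(2) - A_cmp) @ K)[0, 0]

# Separation principle: the closed-loop eigenvalues are those of A - BG
# together with those of A - KC, so the loop below must be stable.
A_cl = np.block([[A, -B @ G],
                 [K @ C, A_cmp]])
print(np.linalg.eigvals(A_cl))
print(abs(compensator_tf(2 * np.pi * 1.0)))   # compensator gain at 1 Hz
```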
75.4.5 Performance Achieved

Here we examine how closely the stabilization system's response approaches the ideal response defined in Figure 75.30. Transfer functions Tv(s), Tc(s), and Tg(s), from vehicle angle, command angle, and gimbal angle, respectively, to the line-of-sight angle, were
derived in accordance with the linear model of Figure 75.35. The magnitudes of these transfer functions are plotted in Figure 75.40. Clearly these plots approach the ideal response curves given in Figure 75.30, primarily in the low to middle frequency regimes. In the upper plot, command tracking is unity (the ideal) to a frequency of 10 Hz, and a command tracking bandwidth of nearly 100 Hz is provided, which exceeds the Command Tracking Requirement defined earlier. The controller contributes a measure of peaking in the response to reduce phase lag at lower frequencies where tight command tracking is needed.

Figure 75.39 As the noise variance v4 driving the vibration part of the plant is increased, the level of line-of-sight (LOS) stabilization against vibration improves, as shown here with v4 = 10⁴ (dotted), 10⁵ (solid), and 10⁶ (dashed). All other elements in V, W, Q, and R were unchanged.
As shown in the middle plot, the stabilization system achieves exceptionally good isolation from internally generated gimbal tracking errors. Finally, in the lower plot, which concerns the system's response to vehicle angular motion, there is adequate maneuver tracking with a bandwidth of just over 0.3 Hz, and there is a transition from maneuver tracking at low frequencies to isolation from high-frequency vibration.
Stabilization Performance  The adequacy of the stabilization performance was verified by numerically evaluating Equation 75.169 with Tv(jω) given by the lower plot of Figure 75.40. This results in σlos = 55 µrads rms, which meets the 62 µrads requirement.
Smoothness of Tracking Performance  The center plot of Figure 75.40 clearly indicates that this requirement (greater than a factor of 12, or 22 dB, of attenuation in the 2 to 10 Hz range) is met by a large margin.
Figure 75.40 Response of the line-of-sight (LOS) stabilization system to the various inputs (compare to Figure 75.30).
75.4.6 Concluding Remarks

This chapter describes the design of a controller for the line-of-sight pointing and stabilization system contained within the Falcon Eye, a head-steered night vision system developed for the F-16 Falcon. The chapter begins by discussing the purpose and performance objectives of the overall night vision system. This is followed by a description of the electro-optical subsystem used to point and stabilize the line of sight, i.e., the plant, and of the disturbance inputs. The chapter then discusses the performance requirements applying specifically to this subsystem. Finally, the design, using modern control techniques, of the controller that met those requirements is presented. Although only one was described, seven distinct controllers were developed for this system. Four were developed for the two-axis stabilization assembly (one per axis, with different controllers applied in wide and narrow field-of-view modes) and one for each of the three gimbal servos: azimuth, elevation, and derotation. All were extensively and successfully flight tested, as was the overall Falcon Eye night vision system. In fact, test pilots developed enough confidence in the system to use it at night when taking off, landing, and during low-level flight (~200 ft above ground) with the Falcon Eye as their only visual source.
75.4.7 Defining Terms Flir: An infrared imaging system typically directed to look forward in the direction of travel. Exogenous inputs: Inputs arising from states that are unaffected by the control.
Control of Robots and Manipulators

Mark W. Spong, The Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Joris De Schutter, Katholieke Universiteit Leuven, Department of Mechanical Engineering, Leuven, Belgium
Herman Bruyninckx, Katholieke Universiteit Leuven, Department of Mechanical Engineering, Leuven, Belgium
John Ting-Yung Wen, Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute
76.1 Motion Control of Robot Manipulators .... 1339
    Introduction • Dynamics • PD Control • Feedback Linearization • Robust and Adaptive Control • Time-Optimal Control • Repetitive and Learning Control
    References .... 1351
76.2 Force Control of Robot Manipulators .... 1351
    Introduction • System Equations • Hybrid Force/Position Control • Adaptive Control • Impedance Control
    References .... 1358
76.3 Control of Nonholonomic Systems .... 1359
    Introduction • Test of Nonholonomy • Nonholonomic Path Planning Problem • Stabilization
    References .... 1368
76.1 Motion Control of Robot Manipulators

Mark W. Spong, The Coordinated Science Laboratory, University of Illinois at Urbana-Champaign

76.1.1 Introduction

The design of intelligent, autonomous machines to perform tasks that are dull, repetitive, hazardous, or that require skill, strength, or dexterity beyond the capability of humans is the ultimate goal of robotics research. Examples of such tasks include manufacturing, excavation, construction, undersea, space, and planetary exploration, toxic waste cleanup, and robotic-assisted surgery. The field of robotics is highly interdisciplinary, requiring the integration of control theory with computer science, mechanics, and electronics. The term robot has been applied to a wide variety of mechanical devices, from children's toys to guided missiles. An important class of robots are the manipulator arms, such as the PUMA robot shown in Figure 76.1. These manipulators are used primarily in materials handling, welding, assembly, spray painting, grinding, deburring, and other manufacturing applications. This chapter discusses the motion control of such manipulators. Robot manipulators are basically multi-degree-of-freedom positioning devices. The robot, as the "plant to be controlled," is a multi-input/multi-output, highly coupled, nonlinear mechatronic system. The main challenges in the motion control problem are the complexity of the dynamics and uncertainties, both

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
Figure 76.1
PUMA robot manipulator.
parametric and dynamic. Parametric uncertainties arise from imprecise knowledge of kinematic parameters and inertia parameters, while dynamic uncertainties arise from joint and link flexibility, actuator dynamics, friction, sensor noise, and unknown environment dynamics. There are a number of excellent survey articles on the control of robot manipulators from an elementary viewpoint [2], [4], [6]. These articles present the basic ideas of independent joint control using linearized models and classical transfer function analysis. That material is not repeated here. Instead, we survey more recent and more advanced material that summarizes the work of many researchers from about 1985 to the present. Many of the ideas presented here are found in the papers reprinted in
THE CONTROL HANDBOOK
[10], which is recommended to the reader who wishes additional details. The present chapter is accessible to anyone having an elementary knowledge of Lagrangian dynamics and the state-space theory of dynamical systems, including the basics of Lyapunov stability theory. The textbooks by Asada and Slotine [1] and Spong and Vidyasagar [11] may be consulted for the background necessary to follow the material presented here.
Configuration Space and Task Space
We consider a robot manipulator with n links interconnected by joints into an open kinematic chain as shown in Figure 76.2. For simplicity of exposition we assume that all joints are rotational, or revolute. Most of the discussion in this chapter remains valid for robots with sliding or prismatic joints. The joint variables, q1, ..., qn, are the relative angles between the links; for example, qi is the angle between link i and link i − 1. A vector q = (q1, ..., qn)ᵀ, with each qi ∈ [0, 2π), is called a configuration. The set of all possible configurations is called configuration space or joint space, which we denote as C. The configuration space is an n-dimensional torus, Tⁿ = S¹ × ... × S¹, where S¹ is the unit circle. The task space, or end-effector space, is the space of all positions and orientations (called poses) of the end-effector (usually a gripper or tool). If a coordinate frame, called the base frame or world frame, is established at the base of the robot and a second frame, called the end-effector frame or task frame, is attached to the end-effector, then the end-effector position is given by a vector x ∈ R³ specifying the coordinates of the origin of the task frame in the base frame, and the end-effector orientation is given by a 3 × 3 matrix R whose columns are the direction cosines of the coordinate axes of the task frame in the base frame. This orientation matrix, R, is orthogonal, i.e., RᵀR = I, and satisfies det(R) = +1. The set of all such 3 × 3 orientation matrices forms the special orthogonal group, SO(3). The task space is then isomorphic to the special Euclidean group, SE(3) = R³ × SO(3). Elements of SE(3) are called rigid motions [11].
Figure 76.2  Configuration space and task space variables (base frame and task frame).
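The two defining properties of SO(3) are easy to check numerically. The following minimal sketch (illustrative, not from the text; the angles are arbitrary) verifies RᵀR = I and det(R) = +1 for a product of rotations, and shows that a reflection, although orthogonal, fails the determinant test:

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def is_rotation(R, tol=1e-9):
    """Check the two defining properties of SO(3): R^T R = I and det(R) = +1."""
    return (np.allclose(R.T @ R, np.eye(3), atol=tol)
            and np.isclose(np.linalg.det(R), 1.0, atol=tol))

R = rot_z(0.7) @ rot_z(-0.2)      # products of rotations stay in SO(3)
print(is_rotation(R))             # True
Refl = np.diag([1.0, 1.0, -1.0])  # a reflection: orthogonal but det = -1
print(is_rotation(Refl))          # False
```

This also illustrates why a rotation matrix is a redundant (nine-number) description of a three-degree-of-freedom orientation, which motivates the minimal representations discussed next.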
Kinematics Kinematics refers to the geometric relationship between the motion of the robot in joint space and the motion of the
end-effector in task space without consideration of the forces that produce the motion. The forward kinematics problem is to determine the mapping

f0 : C → SE(3),   q ↦ (x(q), R(q)),   (76.1)

from configuration space to task space, which gives the end-effector pose in terms of the joint configuration. The inverse kinematics problem is to determine the inverse of this mapping, i.e., the joint configuration as a function of the end-effector pose. The forward kinematics map is many-to-one, so that several joint space configurations may give rise to the same end-effector pose. This means that the forward kinematics always has a unique pose for each configuration, while the inverse kinematics has multiple solutions, in general. The kinematics problem is compounded by the difficulty of parameterizing the rotation group, SO(3). It is well known that there does not exist a minimal set of coordinates to "cover" SO(3), i.e., a single set of three variables to represent all orientations in SO(3) uniquely. The most common representations used are Euler angles and quaternions. Representational singularities, which are points at which the representation fails to be unique, give rise to a number of computational difficulties in motion planning and control. Given a minimal representation for SO(3), for example, a set of Euler angles φ, θ, ψ, the forward kinematics mapping may also be defined by a function

f1 : C → R⁶,   f1(q) = (x(q), o(q)),

where x(q) ∈ R³ gives the Cartesian position of the end-effector and o(q) = [φ(q), θ(q), ψ(q)]ᵀ represents the orientation of the end-effector. The nonuniqueness of the inverse kinematics in this case includes multiplicities due to the particular representation of SO(3) in addition to multiplicities intrinsic to the geometric structure of the manipulator. The velocity kinematics is the relationship between the joint velocities and the end-effector velocities. If the mapping f0 from Equation 76.1 is used to represent the forward kinematics, then the velocity kinematics is given by

ξ = J0(q) q̇,
where J0(q) is a 6 × n matrix, called the manipulator Jacobian, and ξ = (vᵀ, ωᵀ)ᵀ represents the linear and angular velocity of the end-effector. The vector v ∈ R³ is just (d/dt) x(q), where x(q) is the end-effector position vector from Equation 76.1. It is a little more difficult to see how the angular velocity vector ω is computed, since the end-effector orientation in Equation 76.1 is specified by a matrix R ∈ SO(3). If ω = (ωx, ωy, ωz)ᵀ is a vector in R³, we may define a skew-symmetric matrix, S(ω), according to

S(ω) = [  0   −ωz   ωy
          ωz   0   −ωx
         −ωy   ωx   0  ].
The set of all skew-symmetric matrices is denoted by so(3). Now, if R(t) belongs to SO(3) for all t, it can be shown that

Ṙ(t) = S(ω(t)) R(t)
for a unique vector ω(t) [11]. The vector ω(t) thus defined is the angular velocity of the end-effector frame relative to the base frame. Singularities in the Jacobian J0(q), i.e., configurations where the Jacobian loses rank, are important configurations to identify for any manipulator. At singular configurations the manipulator loses one or more degrees of freedom. In many applications, it is important to place robots in work cells and to plan their motion in such a way that singular configurations are avoided. If the mapping f1 is used to represent the forward kinematics, then the velocity kinematics is written as

Ẋ = J1(q) q̇,
where J1(q) = ∂f1/∂q is the 6 × n Jacobian of the function f1. Singularities in J1 include the representational singularities in addition to the manipulator singularities present in J0. In the sequel, we use J to denote either the matrix J0 or J1.
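The operator S(ω) and the relation Ṙ = S(ω)R can be checked numerically. The sketch below (illustrative values; the constant-axis rotation is built with the Rodrigues formula, an assumption not stated in the text) verifies S(ω)p = ω × p, the skew symmetry of S(ω), and the derivative relation by central finite differences:

```python
import numpy as np

def S(w):
    """Skew-symmetric matrix of w, so that S(w) @ p equals np.cross(w, p)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def R_axis(w, t):
    """Rotation exp(S(w) t) about the fixed axis w, via the Rodrigues formula."""
    th = np.linalg.norm(w) * t
    k = w / np.linalg.norm(w)
    K = S(k)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

w = np.array([0.3, -1.2, 0.5])
p = np.array([1.0, 2.0, 3.0])
print(np.allclose(S(w) @ p, np.cross(w, p)))   # S(w) p = w x p  -> True
print(np.allclose(S(w).T, -S(w)))              # skew symmetry   -> True

# Numerical check of R_dot = S(w) R at t = 0.4 (constant w):
t, h = 0.4, 1e-5
R_dot = (R_axis(w, t + h) - R_axis(w, t - h)) / (2.0 * h)
print(np.allclose(R_dot, S(w) @ R_axis(w, t), atol=1e-6))  # True
```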
Trajectories

Because of the complexity of both the kinematics and the dynamics of the manipulator and of the task to be carried out, the motion control problem is generally decomposed into three stages: motion planning, trajectory generation, and trajectory tracking. In the motion planning stage, desired paths are generated in the task space, SE(3), without timing information, i.e., without specifying velocity or acceleration along the paths. Of primary concern is the generation of collision-free paths in the workspace. In the trajectory generation stage, the desired position, velocity, and acceleration of the manipulator along the path, as a function of time or as a function of arc length along the path, are computed. The trajectory planner may parameterize the end-effector path directly in task space, either as a curve in SE(3) or as a curve in R⁶ using a particular minimal representation for SO(3), or it may compute a trajectory for the individual joints of the manipulator as a curve in the configuration space C. In order to compute a joint space trajectory, the given end-effector path must be transformed into a joint space path via the inverse kinematics mapping. Because of the difficulty of computing this mapping on-line, the usual approach is to compute a discrete set of joint vectors along the end-effector path and to perform an interpolation in joint space among these points in order to complete the joint space trajectory. Common approaches to trajectory interpolation include polynomial spline interpolation, using trapezoidal velocity trajectories or cubic polynomial trajectories, as well as trajectories generated by reference models. The computed reference trajectory is then presented to the controller, whose function is to cause the robot to track the given trajectory as closely as possible. A typical architecture for the robot control problem is illustrated in the block diagram of Figure 76.3.

This chapter is concerned mainly with the design of the tracking controller, assuming that the path and trajectory have been precomputed. For background on the motion planning and trajectory generation problems the reader is referred to [10].

Figure 76.3  Block diagram of the robot control problem.

76.1.2 Dynamics

The dynamics of n-link manipulators are conveniently described by Lagrangian dynamics. In the Lagrangian approach, the joint variables, q = (q1, ..., qn)ᵀ, serve as a suitable set of generalized coordinates. The kinetic energy of the manipulator is given by a symmetric, positive definite quadratic form,

K = ½ q̇ᵀ D(q) q̇,
where D(q) is the inertia matrix of the robot. Let P : C → R be a continuously differentiable function (called the potential energy). For a rigid robot, the potential energy is due to gravity alone, while for a flexible robot the potential energy also contains elastic potential energy. We define the function L = K − P, which is called the Lagrangian. The dynamics of the manipulator are then described by Lagrange's equations [11],

d/dt (∂L/∂q̇_i) − ∂L/∂q_i = τ_i,   i = 1, ..., n,
where τ1, ..., τn represent input generalized forces. In local coordinates, Lagrange's equations can be written as

Σ_j d_kj(q) q̈_j + Σ_{i,j} c_ijk(q) q̇_i q̇_j + φ_k(q) = τ_k,   k = 1, ..., n,   (76.9)

where

c_ijk = ½ ( ∂d_kj/∂q_i + ∂d_ki/∂q_j − ∂d_ij/∂q_k )   (76.10)

are known as Christoffel symbols of the first kind, and

φ_k = ∂P/∂q_k.   (76.11)
In matrix form we can write Lagrange's Equation 76.9 as

D(q) q̈ + C(q, q̇) q̇ + g(q) = τ.   (76.12)
Properties of the Lagrangian Dynamics The Lagrangian dynamics given in Equation 76.12 possess a number of important properties that facilitate analysis and control system design. Among these are
1. The inertia matrix D(q) is symmetric and positive definite, and there exist scalars μ1(q) and μ2(q) such that

μ1(q) I ≤ D(q) ≤ μ2(q) I.

Moreover, if all joints are revolute, then μ1 and μ2 are constants.
2. The matrix W(q, q̇) = Ḋ(q) − 2C(q, q̇) is skew symmetric. This property is easily shown by direct calculation. The kj-th component of Ḋ(q) is given by the chain rule as

ḋ_kj = Σ_i (∂d_kj/∂q_i) q̇_i,

and the kj-th component of the matrix C(q, q̇) is given as

c_kj = Σ_i ½ ( ∂d_kj/∂q_i + ∂d_ki/∂q_j − ∂d_ij/∂q_k ) q̇_i.

Therefore the kj component of the matrix W(q, q̇) is given by

w_kj = ḋ_kj − 2 c_kj = Σ_i ( ∂d_ij/∂q_k − ∂d_ki/∂q_j ) q̇_i.

Skew symmetry of W(q, q̇) now follows by symmetry of the inertia matrix D(q). Strongly related to the skew symmetry property is the so-called passivity property.

3. The mapping τ → q̇ is passive; i.e., there exists β ≥ 0 such that

∫₀ᵀ q̇ᵀ(u) τ(u) du ≥ −β   for all T > 0.

To show this property, let H be the total energy of the system,

H = ½ q̇ᵀ D(q) q̇ + P(q).

Then the change in energy, Ḣ, satisfies

Ḣ = ½ q̇ᵀ Ḋ(q) q̇ + q̇ᵀ [ D(q) q̈ + g(q) ],   (76.18)

since g(q) is the gradient of P. Substituting Equation 76.12 into Equation 76.18 yields

Ḣ = q̇ᵀ τ + ½ q̇ᵀ [ Ḋ(q) − 2C(q, q̇) ] q̇ = q̇ᵀ τ   (76.19)

by skew symmetry of Ḋ − 2C. Integrating both sides of Equation 76.19 with respect to time gives

∫₀ᵀ q̇ᵀ(u) τ(u) du = H(T) − H(0) ≥ −H(0),

since the total energy H(T) is non-negative, and passivity follows with β = H(0).

4. Rigid robot manipulators are fully actuated; i.e., there is an independent control input for each degree of freedom. By contrast, robots possessing joint or link flexibility are no longer fully actuated and the control problems are more difficult, in general.

5. The equations of motion given in Equation 76.12 are linear in the inertia parameters. In other words, there is a constant vector Θ ∈ Rˡ and a function Y(q, q̇, q̈) ∈ Rⁿˣˡ such that

Y(q, q̇, q̈) Θ = D(q) q̈ + C(q, q̇) q̇ + g(q).

The function Y(q, q̇, q̈) is called the regressor. The parameter vector Θ is comprised of link masses, moments of inertia, and the like, in various combinations. The dimension of the parameter space is not unique, and the search for the parameterization that minimizes the dimension of the parameter space is an important problem. Historically, the appearance of the passivity and linear parameterization properties in the early 1980s marked watershed events in robotics research. Using these properties, researchers have been able to prove elegant global convergence and stability results for robust and adaptive control. We detail some of these results in the following.

Additional Dynamics So far, we have discussed only rigid body dynamics. Other important contributions to the dynamic description of manipulators include the dynamics of the actuators, joint and link flexibility, friction, noise, and disturbances. In addition, whenever the manipulator is in contact with the environment, the complete dynamic description includes the dynamics of the environment and the coupling forces between the environment and the manipulator. Modeling all of these effects produces an enormously complicated model. The key in robot control system design is to model the most dominant dynamic effects for the particular manipulator under consideration and to design the controller so that it is insensitive or robust to the neglected dynamics. Friction and joint elasticity are dominant in geared manipulators such as those equipped with harmonic drives. Actuator dynamics are important in many manipulators, while noise is present in potentiometers and tachometers used as joint position and velocity sensors. For very long or very lightweight robots, particularly in space robots, the link flexibility becomes an important consideration.

Actuator Inertia and Friction The simplest modification to the rigid robot model given in Equation 76.12 is the
inclusion of the actuator inertia and joint friction. The actuator inertia is specified by an n × n diagonal matrix

I = diag( r1² I1, ..., rn² In ),

where I_i and r_i are the actuator inertia and gear ratio, respectively, of the i-th joint. The friction is specified by a vector, f(q, q̇), and may contain only viscous friction, Bq̇, or it may include more complex models that include static friction. Defining M(q) = D(q) + I, we may modify the dynamics to include these additional terms as

M(q) q̈ + C(q, q̇) q̇ + f(q, q̇) + g(q) = τ.   (76.23)
As can be seen, the inclusion of the actuator inertias and friction does not change the order of the equations. For simplicity of notation we ignore the friction terms f(q, q̇) in the subsequent development.

Environment Forces Whenever the robot is in contact with the environment, additional forces are produced by the robot/environment interaction. Let F_e denote the force acting on the robot due to contact with the environment. It is easy to show, using the principle of virtual work [11], that a joint torque τ_e = Jᵀ(q) F_e results, where J(q) is the manipulator Jacobian. Thus, Equation 76.23 may be further modified as

M(q) q̈ + C(q, q̇) q̇ + g(q) = τ + Jᵀ(q) F_e   (76.24)

to incorporate the external forces due to environment interaction. The problem of force control is not considered further in this chapter, but is treated in detail in Section 76.2.

Actuator Dynamics If the joints are actuated with permanent magnet dc motors, we may write the actuator dynamics as

L di/dt + R i = V − K_b q̇,   (76.25)
where i, V are vectors representing the armature currents and voltages, and L, R, K_b are matrices representing, respectively, the armature inductances, armature resistances, and back emf constants. Since the joint torque τ and the armature current i are related by τ = K_m i, where K_m is the torque constant of the motor, we may write the complete system of Equations 76.23 to 76.25 as

M(q) q̈ + C(q, q̇) q̇ + g(q) = K_m i,   (76.26)

L di/dt + R i = V − K_b q̇.   (76.27)
The inclusion of these actuator dynamics increases the dimension of the state space from 2n to 3n. Other types of actuators, such as induction motors or hydraulic actuators, may introduce more complicated dynamics and increase the system order further.

Joint Elasticity Joint elasticity, due to elasticity in the motor shaft and gears, is an important effect to model in many robots. If the joints are elastic, then the number of degrees of freedom is twice the number for the rigid robot, since the joint angles and motor shaft angles are no longer simply related by the gear ratio, but are now independent generalized coordinates. If we represent the joint angles by q1 and the motor shaft angles by q2 and model the joint elasticity by linear springs at the joints, then we may write the dynamics as

D(q1) q̈1 + C(q1, q̇1) q̇1 + g(q1) + K(q1 − q2) = 0,   (76.28)

I q̈2 − K(q1 − q2) = τ,   (76.29)
where I is the actuator inertia matrix and K is a diagonal matrix of joint stiffness constants. The model given in Equations 76.28 and 76.29 is derived under the assumptions that the inertia of the rotor is symmetric about its axis of rotation and that the inertia of the rotor axes, other than the motor shaft, may be neglected. For models of flexible joint robots that include these additional effects, the reader is referred to the article by DeLuca in [10, p. 98]. It is easy to show that the flexible joint model given in Equations 76.28 and 76.29 defines a passive mapping from motor torque τ to motor velocity q̇2, but does not define a passive mapping from τ to the link velocity q̇1. This is related to the classical problem of collocation of sensors and actuators and has important consequences in the control system design.

Singular Perturbation Models In practice the armature inductances in L in Equation 76.25 are quite small, whereas the joint stiffness constants in K in Equations 76.28 and 76.29 are quite large relative to the inertia parameters in the rigid model given in Equation 76.23. This means that the rigid robot model (Equation 76.23) may be viewed as a singular perturbation [3] of both the system with actuator dynamics (Equations 76.23 to 76.25) and of the flexible joint system (Equations 76.28 and 76.29). In the case of actuator dynamics, suppose that all entries of the diagonal matrix R⁻¹L are equal to ε ≪ 1 for simplicity and write Equations 76.26 and 76.27 as

M(q) q̈ + C(q, q̇) q̇ + g(q) = K_m i,   (76.30)

ε di/dt = −i + R⁻¹ ( V − K_b q̇ ).   (76.31)
In the case of joint elasticity, define z = K(q2 − q1) and suppose that all the joint stiffness constants are equal to 1/ε for simplicity. It is easy to show that Equations 76.28 and 76.29 may be written as

D(q1) q̈1 + C(q1, q̇1) q̇1 + g(q1) = z,   (76.32)

ε z̈ = I⁻¹ (τ − z) − D(q1)⁻¹ [ z − C(q1, q̇1) q̇1 − g(q1) ].   (76.33)
In both cases above, the rigid model given by Equation 76.23 is recovered in the limit as ε → 0. Singular perturbation techniques are thus of great value in designing and analyzing control laws for manipulators.
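The two-time-scale structure of Equations 76.30 and 76.31 can be seen in simulation. The following one-degree-of-freedom sketch (all parameter values are made up for illustration) integrates the motor current with a small ε and checks that, after the fast transient, the current has collapsed onto the quasi-steady-state value i = R⁻¹(V − K_b q̇) assumed by the reduced rigid model:

```python
import numpy as np

# Illustrative 1-DOF sketch (hypothetical numbers): a motor-driven inertia
#   m qdd = Km i,     eps di/dt = -i + (V - Kb qd)/R      (cf. Eq. 76.31)
m, Km, Kb, R_arm, eps = 1.0, 1.0, 0.1, 2.0, 0.005
V = 1.0                       # constant applied voltage
q, qd, i = 0.0, 0.0, 0.0
dt, T = 1e-5, 0.05            # step size must resolve the fast time scale eps
for _ in range(int(T / dt)):
    qdd = Km * i / m
    didt = (-i + (V - Kb * qd) / R_arm) / eps
    q += dt * qd
    qd += dt * qdd
    i += dt * didt

# After a few multiples of eps, the current sits on the quasi-steady-state
# manifold i = (V - Kb qd)/R of the reduced (rigid) model.
i_qss = (V - Kb * qd) / R_arm
print(abs(i - i_qss))   # small
```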
76.1.3 PD Control

In light of the complexity of the dynamics of n-link robots, it is quite remarkable to discover that very simple control laws can be
used in a number of cases. For example, a simple independent joint proportional-derivative (PD) control can achieve global asymptotic stability for set-point tracking in the absence of gravity. This fact is a fundamental consequence of the passivity property discussed earlier. To see this, consider the dynamic equations in the absence of friction,

M(q) q̈ + C(q, q̇) q̇ = u,   (76.34)

where u = τ − g(q). Let

u = K_p (q^d − q) − K_d q̇   (76.35)
be an independent joint PD control, where q^d represents a constant reference set-point, and K_p and K_d are positive, diagonal matrices of proportional and derivative gains, respectively. Consider the Lyapunov function candidate

V = ½ q̇ᵀ M(q) q̇ + ½ (q^d − q)ᵀ K_p (q^d − q).   (76.36)
Then a simple calculation using the skew symmetry property shows that

V̇ = −q̇ᵀ K_d q̇ ≤ 0.   (76.37)
LaSalle's invariance principle [5] can now be used to show that the equilibrium state q = q^d, q̇ = 0 is globally asymptotically stable. Thus, if gravity is absent or is compensated as

τ = K_p (q^d − q) − K_d q̇ + g(q),   (76.38)
then a simple independent joint PD control achieves global asymptotic stability. Unfortunately, these results do not extend to the case of time-varying reference trajectories since they rely on LaSalle's Theorem. Other researchers have investigated proportional-integral-derivative (PID) control and adaptive gravity compensation in order to overcome the requirement that the gravity parameters in Equation 76.38 be known exactly. It has also been shown that asymptotic tracking may be achieved using only the reference set-point q^d in the gravity compensation, i.e.,

τ = K_p (q^d − q) − K_d q̇ + g(q^d),   (76.39)
provided the PD gains are chosen suitably, which simplifies the on-line computational requirements. These latter results require the addition of cross terms to the Lyapunov function (Equation 76.36) in order to show asymptotic stability. In practice, however, the input given by Equation 76.35 or Equation 76.38 may result in input saturation because of the large gains that are usually required and the large initial tracking error. One of the primary reasons for trajectory generation, such as trapezoidal velocity profiles, in the first place is to help achieve proper scaling of the amplifier inputs. Nevertheless, these simple PD results provide theoretical justification for the widespread use of such controllers in commercial industrial manipulators. In fact, it is not too difficult to show the same result for a flexible joint robot in the absence of gravity, provided the PD control is implemented using the motor variables. This is because the flexible joint robot dynamics defines a passive mapping from motor torque τ to the motor velocity q̇2.
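The set-point result is easy to reproduce numerically. The sketch below (hypothetical inertia constants a1, a2, a3, chosen so that M(q) stays positive definite; gravity omitted; gains arbitrary) applies the PD law of Equation 76.35 to a two-link arm with the standard Christoffel-symbol Coriolis matrix, and also spot-checks the skew symmetry of Ṁ − 2C that the Lyapunov argument relies on:

```python
import numpy as np

# Two-link planar arm (illustrative constants; M(q) > 0 for all q).
a1, a2, a3 = 2.0, 0.3, 0.5

def M(q):
    c2 = np.cos(q[1])
    return np.array([[a1 + 2*a2*c2, a3 + a2*c2],
                     [a3 + a2*c2,   a3]])

def C(q, qd):
    s2 = np.sin(q[1])
    return np.array([[-a2*s2*qd[1], -a2*s2*(qd[0] + qd[1])],
                     [ a2*s2*qd[0],  0.0]])

Kp = np.diag([20.0, 20.0])
Kd = np.diag([8.0, 8.0])
q_des = np.array([1.0, -0.5])       # constant set-point

q, qd, dt = np.zeros(2), np.zeros(2), 1e-3
for _ in range(10000):              # 10 s of simulated time
    u = Kp @ (q_des - q) - Kd @ qd  # Eq. 76.35 (gravity absent)
    qdd = np.linalg.solve(M(q), u - C(q, qd) @ qd)
    qd = qd + dt * qdd
    q = q + dt * qd

print(np.linalg.norm(q - q_des))    # small: the arm settles at the set-point

# Spot-check Property 2: Mdot - 2C is skew symmetric (Mdot by finite difference).
qd_t = np.array([0.7, -0.4])
h = 1e-6
Mdot = (M(q + h*qd_t) - M(q - h*qd_t)) / (2*h)  # directional derivative = dM/dt
W = Mdot - 2*C(q, qd_t)
print(np.allclose(W, -W.T, atol=1e-5))          # True
```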
76.1.4 Feedback Linearization

The notion of feedback linearization of nonlinear systems is a relatively recent idea in control theory, whose practical realization has been made possible by the rapid development of microprocessor technology. The basic idea of feedback linearization control is to transform a given nonlinear system into a linear system by use of a nonlinear coordinate transformation and nonlinear feedback. Feedback linearization is a useful paradigm because it allows the extensive body of knowledge from linear systems to be brought to bear to design controllers for nonlinear systems. The roots of feedback linearization in robotics predate the general theoretical development by nearly a decade, going back to the early notion of feedforward computed torque [8]. In the robotics context, feedback linearization is known as inverse dynamics. The idea is to exactly compensate all of the coupling nonlinearities in the Lagrangian dynamics in a first stage so that a second-stage compensator may be designed based on a linear and decoupled plant. Any number of techniques may be used in the second stage. The feedback linearization may be accomplished with respect to the joint space coordinates or with respect to the task space coordinates. Feedback linearization may also be used as a basis for force control, such as hybrid control and impedance control.
Joint Space Inverse Dynamics We first present the main ideas in joint space where they are easiest to understand. The control architecture we use is important as a basis for later developments. Thus, given the plant model

M(q) q̈ + C(q, q̇) q̇ + g(q) = τ,   (76.40)

we compute the nonlinear feedback control law

τ = M(q) a_q + C(q, q̇) q̇ + g(q),   (76.41)

where a_q ∈ Rⁿ is, as yet, undetermined. Since the inertia matrix M(q) is invertible for all q, the closed-loop system reduces to the decoupled double integrator

q̈ = a_q.   (76.42)

Given a joint space trajectory, q^d(t), an obvious choice for the outer loop term a_q is as a PD plus feedforward acceleration control

a_q = q̈^d + K_p (q^d − q) + K_d (q̇^d − q̇).   (76.43)

Substituting Equation 76.43 into Equation 76.42 and defining

q̃ = q − q^d,   (76.44)

we have the linear and decoupled closed-loop system

q̃̈ + K_d q̃̇ + K_p q̃ = 0.   (76.45)
We can implement the joint space inverse dynamics in a so-called inner loop/outer loop architecture as shown in Figure 76.4.
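A minimal numerical sketch of the scheme (same hypothetical two-link model as above, gravity omitted; the reference trajectory and gains are arbitrary choices, not from the text) shows the tracking error of Equation 76.45 decaying when the inner loop cancellation is exact:

```python
import numpy as np

# Illustrative two-link model (same lumped constants as before, gravity omitted).
a1, a2, a3 = 2.0, 0.3, 0.5
M = lambda q: np.array([[a1 + 2*a2*np.cos(q[1]), a3 + a2*np.cos(q[1])],
                        [a3 + a2*np.cos(q[1]),   a3]])
def C(q, qd):
    s2 = np.sin(q[1])
    return np.array([[-a2*s2*qd[1], -a2*s2*(qd[0] + qd[1])],
                     [ a2*s2*qd[0],  0.0]])

Kp, Kd = 25.0, 10.0                     # critically damped error dynamics (Eq. 76.45)
q_des   = lambda t: np.array([np.sin(t),     np.cos(2*t)])
dq_des  = lambda t: np.array([np.cos(t),  -2*np.sin(2*t)])
ddq_des = lambda t: np.array([-np.sin(t), -4*np.cos(2*t)])

q = q_des(0.0) + np.array([0.1, -0.1])  # start with a tracking error
qd = dq_des(0.0)
dt, t = 2e-4, 0.0
for _ in range(int(5.0 / dt)):
    # Outer loop (Eq. 76.43), then inner loop (Eq. 76.41):
    a_q = ddq_des(t) + Kp*(q_des(t) - q) + Kd*(dq_des(t) - qd)
    tau = M(q) @ a_q + C(q, qd) @ qd
    # Plant (Eq. 76.40): with exact model knowledge, qdd == a_q.
    qdd = np.linalg.solve(M(q), tau - C(q, qd) @ qd)
    qd = qd + dt * qdd
    q = q + dt * qd
    t += dt

print(np.linalg.norm(q - q_des(t)))     # tracking error has decayed
```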
Figure 76.4  Inner loop/outer loop architecture (trajectory planner, outer loop controller, inner loop controller, and linearized system).
The computation of the nonlinear terms in Equation 76.41 is performed in the inner loop, perhaps with a dedicated microprocessor to obtain high computation speed. The computation of the additional term a_q is performed in the outer loop. This separation of the inner loop and outer loop terms is important for several reasons. The structure of the inner loop control is fixed by Lagrange's equations. What control engineers traditionally think of as control system design is contained primarily in the outer loop. The outer loop control given in Equation 76.43 is merely the simplest choice of outer loop control and achieves asymptotic tracking of joint space trajectories in the ideal case of perfect knowledge of the model given by Equation 76.40. However, one has complete freedom to modify the outer loop control to achieve various other goals without the need to modify the dedicated inner loop control. For example, additional compensation terms may be included in the outer loop to enhance the robustness to parametric uncertainty, unmodeled dynamics, and external disturbances. The outer loop control may also be modified to achieve other goals, such as tracking of task space trajectories instead of joint space trajectories, regulating both motion and force, etc. The inner loop/outer loop architecture thus unifies many robot control strategies from the literature.

Task Space Inverse Dynamics As a first illustration of the importance of the inner loop/outer loop paradigm, we show that tracking in task space can be achieved by modifying our choice of outer loop control a_q in Equation 76.42 while leaving the inner loop control unchanged. Let X ∈ R⁶ represent the end-effector pose using any minimal representation of SO(3). Since X is a function of the joint variables q ∈ C, we have

Ẋ = J(q) q̇,   (76.46)

where J = J1 is the Jacobian defined in Section 76.1.1. Given the double integrator (Equation 76.42) in joint space, we see that if a_q is chosen as

a_q = J⁻¹ ( a_X − J̇ q̇ ),   (76.47)

then we have a double integrator model in task space coordinates,

Ẍ = a_X.   (76.48)
Given a task space trajectory X^d(t), we may choose a_X as

a_X = Ẍ^d + K_p (X^d − X) + K_d (Ẋ^d − Ẋ),   (76.50)

so that the Cartesian space tracking error, X̃ = X − X^d, satisfies

X̃̈ + K_d X̃̇ + K_p X̃ = 0.

Therefore, a modification of the outer loop control achieves a linear and decoupled system directly in the task space coordinates, without the need to compute a joint space trajectory and without the need to modify the nonlinear inner loop control. Note that we have used a minimal representation for the orientation of the end-effector in order to specify a trajectory X ∈ R⁶. In general, if the end-effector coordinates are given in SE(3), then the Jacobian J in the preceding formulation is the Jacobian J0 defined in Section 76.1.1. In this case

J0(q) q̇ = (vᵀ, ωᵀ)ᵀ,

and the outer loop control

a_q = J0(q)⁻¹ [ (a_vᵀ, a_ωᵀ)ᵀ − J̇0(q) q̇ ]

applied to Equation 76.42 results in the system

v̇ = a_v,   ω̇ = a_ω.
Although, in this case, the dynamics have not been linearized to a double integrator, the outer loop terms a_v and a_ω may still be used to achieve global tracking of end-effector trajectories in SE(3). In both cases, we see that nonsingularity of the Jacobian is necessary to implement the outer loop control. The inverse dynamics control approach has been proposed in a number of different guises, such as resolved acceleration control and operational space control. These seemingly distinct approaches have all been shown to be equivalent and may be incorporated into the general framework shown previously (see the article by Kreutz in [10, p. 77]). It turns out that both the robot model including actuator dynamics given in Equations 76.26 and 76.27 and the model including joint flexibility given in Equations 76.28 and 76.29 are also feedback linearizable using a nonlinear coordinate transformation and nonlinear feedback. For the system with actuator dynamics given in Equations 76.26 and 76.27, if one chooses as state variables the link positions, velocities, and accelerations, a nonlinear feedback control exists to reduce the system to a decoupled set of third-order integrators. The coordinates in which the flexible joint system given in Equations 76.28 and 76.29 may be exactly linearized are the link positions, velocities, accelerations, and jerks, and a decoupled set of fourth-order integrators is achievable by suitable nonlinear feedback. See the article by Spong in [10, p. 105] for details.

Composite Control Although global feedback linearization in the case of actuator dynamics or joint flexibility is possible in theory, in practice it
THE CONTROL HANDBOOK

is difficult to achieve, mainly because the coordinate transformation is a function of the system parameters and, hence, sensitive to uncertainty. Also, the large differences in magnitude among the parameters, e.g., between the joint stiffness and the link inertia, may make the computation of the control ill-conditioned and the performance of the system poor. Using the singular perturbation models derived earlier, so-called composite control laws may produce better designs. We illustrate this idea using the flexible joint model given by Equations 76.32 and 76.33. The corresponding result for the system with actuator dynamics is similar. Given the system of Equations 76.32 and 76.33, we define two related systems, the quasi-steady-state system and the boundary layer system, as follows. The quasi-steady-state system is the reduced-order system calculated by setting ε = 0 in Equation 76.33 and solving for z̄ (Equation 76.57), where the overbar indicates quantities computed at ε = 0, and eliminating z̄ in Equation 76.32. It can easily be shown that the quasi-steady-state system thus derived is equal to
M(q̄₁)q̈̄₁ + C(q̄₁, q̇̄₁)q̇̄₁ + g(q̄₁) = τ̄.   (76.58)
The quasi-steady-state system is thus identical to the rigid robot model in terms of q̄₁. The boundary layer system represents the dynamics of η = z − z̄, computed in the fast time scale, σ = t/ε. This system can be shown to be
where τf = τ − τ̄. A composite control for the flexible joint system (Equations 76.32 and 76.33) is a control law of the form
Roughly speaking, the response of the system with joint flexibility using the composite control given by Equation 76.60 is nearly the same as the response of the rigid robot model using only τ̄. An important consequence is that the inverse dynamics control for the rigid robot may be used for the flexible joint robot, provided a correction term is superimposed on it to stabilize the boundary layer system. The additional control term τf represents the damping of the joint oscillations. Other "rigid" controllers may also be used, such as the robust and adaptive controllers detailed in the next section.
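A composite control law of this form can be sketched for a single flexible joint; the inertia, gravity term, and all gain values below are illustrative assumptions, not quantities from the text.

```python
# Sketch of a composite control law (Equation 76.60-style): tau = tau_bar + tau_f,
# illustrated for a single flexible joint.  Model values and gains are assumed.

def rigid_inverse_dynamics(q, qd, q_ref, qd_ref, qdd_ref, m=1.0, g_term=0.0,
                           kp=25.0, kd=10.0):
    """Slow control tau_bar: inverse dynamics for the quasi-steady-state
    (rigid) model m*qdd + g = tau."""
    a = qdd_ref + kd * (qd_ref - qd) + kp * (q_ref - q)   # outer loop term
    return m * a + g_term                                  # inner loop

def boundary_layer_damping(eta_dot, kf=2.0):
    """Fast control tau_f: damps the elastic variable eta = z - z_bar."""
    return -kf * eta_dot

def composite_control(q, qd, eta_dot, q_ref, qd_ref, qdd_ref):
    # tau = tau_bar (rigid design) + tau_f (boundary layer stabilization)
    return (rigid_inverse_dynamics(q, qd, q_ref, qd_ref, qdd_ref)
            + boundary_layer_damping(eta_dot))
```

Note how the two designs are decoupled: `rigid_inverse_dynamics` could be replaced by any controller that tracks for the rigid model, without touching the boundary layer term.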
In the composite control law of Equation 76.60, τ̄ is designed for the quasi-steady-state system and τf is designed for the boundary layer system. The significant features of this approach are that
1. The control design is based on reduced-order systems, which is generally easier than designing a controller for the full-order system.
2. The quasi-steady-state system is just the rigid robot model, so existing controllers designed for rigid robots may be used without modification.
If τf is designed to render the boundary layer system asymptotically stable in the fast time scale and τ̄ is any control that achieves asymptotic tracking for the rigid system, it follows from Tichonov's Theorem [3] that the response of the full system remains close to the rigid-model response uniformly on a time interval [0, 1/ε]. Stronger results showing asymptotic tracking may also be achieved, depending on the particular control chosen for the quasi-steady-state system.

76.1.5 Robust and Adaptive Control
The feedback linearization approach of the previous section exploits important structural properties of robot dynamics. However, the practical implementation of such controllers requires consideration of various sources of uncertainty such as modeling errors, computation errors, external disturbances, unknown loads, and noise. Robust and adaptive control are concerned with the problem of maintaining precise tracking under uncertainty. We distinguish robust from adaptive control in the sense that an adaptive algorithm typically incorporates some sort of online parameter estimation scheme while a robust, nonadaptive scheme does not.

Robust Feedback Linearization
A number of techniques from linear and nonlinear control theory have been applied to the problem of robust feedback linearization for manipulators. Chief among these are sliding modes, Lyapunov's second method, and the method of stable factorizations. Given the dynamic equations (Equation 76.63), the control input is chosen as in Equation 76.64, where (·̂) represents the computed or nominal value of (·) and indicates that the theoretically exact feedback linearization cannot be achieved in practice due to the uncertainties in the system. The error or mismatch (·̃) = (·̂) − (·) is a measure of one's knowledge of the system dynamics. The outer loop control term δa may be used to compensate for the resulting perturbation terms. Substituting Equations 76.64 and 76.65 into Equation 76.63, we obtain, after some algebra,

ë + Kd ė + Kp e = δa + η(q, q̇, δa, t),   (76.66)

where

η = M̂⁻¹{M̃(q̈d + Kd ė + Kp e + δa) + C̃ q̇ + g̃}.   (76.67)
76.1. MOTION CONTROL OF ROBOT MANIPULATORS

In state-space we may write the system given by Equation 76.66 as

ẋ = Ax + B{δa + η},   (76.68)

where
The approach is now to search for a time-varying scalar bound, ρ(x, t) > 0, on the uncertainty η, i.e.,

‖η‖ ≤ ρ(x, t),   (76.70)
and to design the additional input term δa to guarantee asymptotic stability or, at least, ultimate boundedness of the state trajectory x(t) in Equation 76.68. The bound ρ is difficult to compute, both because of the complexity of the perturbation terms in η and because the uncertainty η is itself a function of δa. The sliding mode theory of variable structure systems has been extensively applied to the design of δa in Equation 76.68. The simplest such sliding mode control results from choosing the components δaᵢ of δa according to

δaᵢ = ρᵢ(x, t) sgn(sᵢ),   i = 1, …, n,   (76.71)

where ρᵢ is a bound on the i-th component of η, sᵢ = ėᵢ + λᵢeᵢ represents a sliding surface in the state-space, and sgn(·) is the signum function
An alternative but similar approach is the so-called theory of guaranteed stability of uncertain systems, based on Lyapunov's second method. Since Kp and Kd are chosen in Equation 76.68 so that A is a Hurwitz matrix, for any Q > 0 there exists a unique symmetric P > 0 satisfying the Lyapunov equation

AᵀP + PA = −Q.
Using the matrix P, the outer loop term δa may be chosen as
The method of stable factorizations has also been applied to the robust feedback linearization problem. In this approach a linear, dynamic compensator C(s) is used to generate δa to stabilize the perturbed system. Since A is a Hurwitz matrix, the Youla parameterization may be used to generate the entire class, Ω, of stabilizing compensators for the unperturbed system, i.e., Equation 76.68 with η = 0. Given bounds on the uncertainty, the Small Gain Theorem is used to generate a sufficient condition for stability of the perturbed system, and the design problem is to determine a particular compensator, C(s), from the class of stabilizing compensators Ω that satisfies this sufficient condition. The interesting feature of this problem is that the perturbation terms appearing in Equation 76.68 are finite in the L∞ norm, but not necessarily in the L2 norm sense. This means that standard H∞ design methods fail for this problem. For this reason, the robust manipulator control problem was influential in the development of the L1 optimal control field. Further details on these various outer loop designs may be found in [10].
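The sliding-mode outer loop of Equation 76.71 can be sketched directly; the error values, surface slopes, and uncertainty bounds below are illustrative assumptions.

```python
import numpy as np

# Sketch of the sliding-mode outer loop of Equation 76.71:
# delta_a_i = rho_i(x, t) * sgn(s_i),  with s_i = e_dot_i + lambda_i * e_i.
# The per-joint uncertainty bounds rho_i are assumed to be given.

def sliding_mode_outer_loop(e, e_dot, lam, rho):
    s = e_dot + lam * e          # sliding surface per joint
    return rho * np.sign(s)      # discontinuous robust term

e = np.array([0.1, -0.2])        # position errors (assumed values)
e_dot = np.array([0.0, 0.05])    # velocity errors
lam = np.array([1.0, 1.0])       # surface slopes lambda_i
rho = np.array([3.0, 3.0])       # uncertainty bounds rho_i
print(sliding_mode_outer_loop(e, e_dot, lam, rho))  # -> [ 3. -3.]
```

The discontinuity at s = 0 is exactly what causes the chattering discussed in the text; smoothed variants replace `np.sign` with a saturation inside a boundary layer.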
Adaptive Feedback Linearization Once the linear parameterization property for manipulators became widely known in the mid-1980s, the first globally convergent adaptive control results began to appear. These first results were based on the inverse dynamics or feedback linearization approach discussed earlier. Consider the plant (Equation 76.63) and control (Equation 76.64) as previously, but now suppose that the parameters appearing in Equation 76.64 are not fixed as in the robust control approach, but are time-varying estimates of the true parameters. Substituting Equation 76.64 into Equation 76.63 and setting

a = q̈d + Kd(q̇d − q̇) + Kp(qd − q),   (76.75)
it can be shown, after some algebra, that
where Y is the regressor function, θ̃ = θ̂ − θ, and θ̂ is the estimate of the parameter vector θ. In state-space we write the system given by Equation 76.76 as
where
The Lyapunov function V = xᵀPx can be used to show that V̇ is negative definite along solution trajectories of the system given by Equation 76.68. In both the sliding mode approach and the guaranteed stability approach, problems arise in showing the existence of solutions to the closed-loop differential equations because the control signal δa is discontinuous in the state x. In practice, a chattering control signal results due to nonzero switching delays. There have been many refinements and extensions to the above approaches to robust feedback linearization, mainly to simplify the computation of the uncertainty bounds and to smooth the chattering in the control signal [10].
with Kp and Kd chosen so that A is a Hurwitz matrix. Suppose that an output function y = Cx is defined for Equation 76.77 in such a way that the transfer function C(sI − A)⁻¹B is strictly positive real (SPR). It can be shown using the passivity theorem that, for Q > 0, there exists a symmetric, positive definite matrix P satisfying
If the parameter update law is chosen as
Passivity-Based Robust Control In the robust approach, the term θ̂ in Equation 76.85 is chosen as
where Γ = Γᵀ > 0, then global convergence to zero of the tracking error with all internal signals remaining bounded can be shown using the Lyapunov function
Furthermore, the estimated parameters converge to the true parameters, provided the reference trajectory satisfies the condition of persistency of excitation,
where θ₀ is a fixed nominal parameter vector and u is an additional control term. The system given in Equation 76.86 then becomes
where θ̃ = θ₀ − θ is a constant vector and represents the parametric uncertainty in the system. If the uncertainty can be bounded by finding a non-negative constant ρ ≥ 0 such that
αI ≤ ∫ Yᵀ(qd, q̇d, q̈d) Y(qd, q̇d, q̈d) dt ≤ βI   (76.83)

for all t₀, where α, β, and T are positive constants (the integral being taken over [t₀, t₀ + T]). In order to implement this adaptive feedback linearization scheme, however, one notes that the acceleration q̈ is needed in the parameter update law and that M̂ must be guaranteed to be invertible, possibly by the use of projection in the parameter space. Later work was devoted to overcoming these two drawbacks of this scheme, by using so-called indirect approaches based on a (filtered) prediction error.
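A generic gradient estimator for a linear-in-parameters model can be sketched as follows. This is a simplified stand-in for the update law discussed above (the exact law in the text also involves the matrices P and B); the toy regressor, gains, and parameter values are assumptions.

```python
import numpy as np

# Sketch of a gradient parameter update, theta_hat_dot = -Gamma^{-1} Y^T e,
# discretized by Euler integration, for a model linear in the parameters.

def gradient_update(theta_hat, Y, err, Gamma_inv, dt):
    return theta_hat - dt * Gamma_inv @ Y.T @ err

# Estimate theta in y = Y @ theta; the constant regressor here is
# persistently exciting in the trivial sense that it stays full rank.
theta_true = np.array([2.0, -1.0])
theta_hat = np.zeros(2)
Gamma_inv = np.eye(2)
Y = np.eye(2)
for _ in range(2000):
    err = Y @ (theta_hat - theta_true)    # prediction error
    theta_hat = gradient_update(theta_hat, Y, err, Gamma_inv, dt=0.01)
print(np.round(theta_hat, 3))  # converges toward [ 2. -1.]
```

With a regressor that loses rank, the error still converges but the parameter estimates need not, which is exactly why the persistency-of-excitation condition (Equation 76.83) is required for parameter convergence.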
then the additional term u can be designed to guarantee stable tracking according to the expression
The Lyapunov function
Passivity-Based Approaches By exploiting the passivity of the rigid robot dynamics it is possible to derive more elegant robust and adaptive control algorithms for manipulators, which are, at the same time, simpler to design. In the passivity-based approach we modify the inner loop control as
τ = M̂(q)a + Ĉ(q, q̇)v + ĝ(q) − Kr,   (76.84)

where v, a, and r are given as

v = q̇d − Λq̃,
a = v̇ = q̈d − Λq̇̃,
r = q̇̃ + Λq̃,   with q̃ = q − qd,
with K, Λ diagonal matrices of positive gains. In terms of the linear parameterization of the robot dynamics, the control given in Equation 76.84 becomes
and the combination of Equation 76.84 with Equation 76.63 yields
Note that, unlike the inverse dynamics control given in Equation 76.41, the modified inner loop control of Equation 76.84 does not achieve a linear, decoupled system, even in the known parameter case, θ̂ = θ. However, the advantage achieved is that the regressor Y in Equation 76.86 does not contain the acceleration q̈, nor is the inverse of the estimated inertia matrix required.
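A sketch of this modified inner loop for a single joint makes the last point concrete: neither a measured acceleration q̈ nor the inverse of the estimated inertia appears anywhere. The scalar model estimates and gains are illustrative assumptions.

```python
# Sketch of the passivity-based inner loop (Equation 76.84-style), scalar case:
# tau = M_hat*a + C_hat*v + g_hat - K*r, with
#   v = qd_dot - Lam*q_tilde,  a = v_dot = qd_ddot - Lam*q_tilde_dot,
#   r = q_tilde_dot + Lam*q_tilde,  q_tilde = q - qd.

def passivity_based_control(q, q_dot, qd, qd_dot, qd_ddot,
                            M_hat, C_hat, g_hat, K=10.0, Lam=5.0):
    q_tilde = q - qd
    q_tilde_dot = q_dot - qd_dot
    v = qd_dot - Lam * q_tilde
    a = qd_ddot - Lam * q_tilde_dot
    r = q_tilde_dot + Lam * q_tilde
    # Only estimates M_hat, C_hat, g_hat appear -- never their inverses,
    # and only q and q_dot are measured.
    return M_hat * a + C_hat * v + g_hat - K * r
```

On the desired trajectory (q = qd, q̇ = q̇d) the law reduces to the estimated feedforward M̂q̈d + Ĉq̇d + ĝ, as expected.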
can be used to show asymptotic stability of the tracking error. Note that θ̃ is constant and so is not a state vector as in adaptive control. Comparing this approach with the approach in Section 76.1.5, we see that finding a constant bound ρ for the constant vector θ̃ is much simpler than finding a time-varying bound for η in Equation 76.67. The bound ρ in this case depends only on the inertia parameters of the manipulator, while ρ(x, t) in Equation 76.70 depends on the manipulator state vector and the reference trajectory and, in addition, requires some assumptions on the estimated inertia matrix M̂(q). Various refinements of this approach are possible. By replacing the discontinuous control law with the continuous control
not only is the problem of existence of solutions due to discontinuities in the control eliminated, but also the tracking errors can be shown to be globally exponentially stable. It is also possible to introduce an estimation algorithm to estimate the uncertainty bound ρ so that no a priori information of the uncertainty is needed. Passivity-Based Adaptive Control In the adaptive approach, the vector θ̂ in Equation 76.86 is now taken to be a time-varying estimate of the true parameter vector θ. Instead of adding an additional control term, as in the robust approach, we introduce a parameter update law for θ̂. Combining the control law given by Equation 76.84 with Equation 76.63 yields
The parameter estimate θ̂ may be computed using standard methods, such as gradient or least squares. For example, using the gradient update law together with the Lyapunov function results in global convergence of the tracking errors to zero and boundedness of the parameter estimates. A number of important refinements to this basic result are possible. By using the reference trajectory instead of the measured joint variables in both the control and update laws, i.e., with

θ̂̇ = −Γ⁻¹ Yᵀ(qd, q̇d, q̈d) r,   (76.97)

it is possible to show exponential stability of the tracking error in the known parameter case, asymptotic stability in the adaptive case, and convergence of the parameter estimation error to zero under persistence of excitation of the reference trajectory.

76.1.6 Time-Optimal Control
For many applications, such as palletizing, there is a direct correlation between the speed of the robot manipulator and cycle time. For these applications, making the robot work faster translates directly into an increase in productivity. Since the input torques to the robot are limited by the capability of the actuators as

τᵢmin ≤ τᵢ ≤ τᵢmax,   i = 1, …, n,   (76.98)

it is natural to consider the problem of time-optimal control. In many applications, such as seam tracking, the geometric path of the end-effector is constrained in the task space. In such cases, it is useful to produce time-optimal trajectories, i.e., time-optimal parameterizations of the geometric path that can be presented to the feedback controller. For this reason, most of the research into the time-optimal control of manipulators has gone into the problem of generating a minimum time trajectory along a given path in task space. Several algorithms are now available to compute such time-optimal trajectories.

Formulation of the Time-Optimal Control Problem
Consider an end-effector path p(s) ∈ R³ parameterized by arc length s along the path. If the manipulator is constrained to follow this path, then

p(s(t)) = f(q(t)),   (76.99)

where f(q) is the forward kinematics map. We may also use the inverse kinematics map to write

q(t) = Q(s(t)).   (76.100)

Either the end-effector path (Equation 76.99) or the joint space path (Equation 76.100) may be used in the subsequent development. We illustrate the formulation with the joint space path (Equation 76.100) and refer the reader to [10] for further details. The final time tf may be specified as

tf = ∫₀^{tf} dt = ∫₀^{sf} (1/ṡ) ds,

which suggests that, in order to minimize the final time, the velocity ṡ along the path should be maximized. It turns out that the optimal solution in this case is bang-bang; i.e., the acceleration s̈ is either maximum or minimum along the path. Since the trajectory is parameterized by the scalar s, phase plane techniques may be used to calculate the maximum acceleration s̈, as we shall see. From Equation 76.100 we may compute

q̇ = Qs ṡ,   q̈ = Qs s̈ + Qss ṡ²,

where Qs is the Jacobian of the mapping given by Equation 76.100. Therefore, once the optimal solution s(t) is computed, the above expressions can be used to determine the optimal joint space trajectory. Substituting the expressions for q̇ and q̈ into the manipulator dynamics (Equation 76.63) leads to a set of second-order equations in the scalar arc length parameter s,

a(s)s̈ + b(s)ṡ² + c(s) = τ.   (76.104)

Given the bounds on the joint actuator torques from Equation 76.98, bounds on s̈ can be determined by substituting Equation 76.104 into Equation 76.98:

τᵢmin ≤ aᵢ(s)s̈ + bᵢ(s)ṡ² + cᵢ(s) ≤ τᵢmax,   i = 1, …, n,   (76.105)

which can be written as a set of n constraints on s̈ as

αᵢ(s, ṡ) ≤ s̈ ≤ βᵢ(s, ṡ),   (76.106)

with

αᵢ = (τᵢᵅ − bᵢṡ² − cᵢ)/aᵢ,   βᵢ = (τᵢᵝ − bᵢṡ² − cᵢ)/aᵢ,

where

τᵢᵅ = τᵢmin and τᵢᵝ = τᵢmax   if aᵢ > 0,
τᵢᵅ = τᵢmax and τᵢᵝ = τᵢmin   if aᵢ < 0.

Thus, the bounds on the scalar s̈ are determined as

α(s, ṡ) ≤ s̈ ≤ β(s, ṡ),   (76.107)

where

α(s, ṡ) = maxᵢ{αᵢ(s, ṡ)};   β(s, ṡ) = minᵢ{βᵢ(s, ṡ)}.   (76.108)

Since the solution is known to be bang-bang, the optimal control is determined by finding the times, or positions, at which s̈ switches between s̈ = β(s, ṡ) and s̈ = α(s, ṡ). These switching times may be found by constructing switching curves in the phase plane s–ṡ corresponding to α(s, ṡ) = β(s, ṡ). Various methods for constructing this switching curve are found in the references contained in [10, part 6]. A typical minimum-time solution is shown in Figure 76.5.
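At each phase-plane point (s, ṡ), the bounds of Equations 76.105 through 76.108 reduce to a few lines of arithmetic; the path coefficients and torque limits below are illustrative assumptions.

```python
import numpy as np

# Sketch of the per-joint acceleration bounds of Equations 76.105-76.108:
# tau_i_min <= a_i(s)*sdd + b_i(s)*sd**2 + c_i(s) <= tau_i_max
# gives alpha_i <= sdd <= beta_i per joint, combined by max/min.

def sdd_bounds(a, b, c, s_dot, tau_min, tau_max):
    lo = np.where(a > 0, tau_min, tau_max)   # torque limit tau_i^alpha
    hi = np.where(a > 0, tau_max, tau_min)   # torque limit tau_i^beta
    alpha_i = (lo - b * s_dot**2 - c) / a
    beta_i = (hi - b * s_dot**2 - c) / a
    return alpha_i.max(), beta_i.min()       # Equation 76.108

# Assumed coefficients for a 2-joint arm at one point on the path.
a = np.array([1.0, -2.0])
b = np.array([0.5, 0.0])
c = np.array([0.0, 1.0])
alpha, beta = sdd_bounds(a, b, c, s_dot=1.0,
                         tau_min=np.array([-5.0, -5.0]),
                         tau_max=np.array([5.0, 5.0]))
```

Note the sign handling: when aᵢ < 0, dividing by aᵢ flips the inequality, so the minimum torque produces the upper bound βᵢ, exactly as in the case distinction above.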
Figure 76.5   Minimum-time trajectory in the s–ṡ plane (switch point indicated).

76.1.7 Repetitive and Learning Control
Since many robotic applications, such as pick-and-place operations, painting, and circuit board assembly, involve repetitive motions, it is natural to consider using data gathered in previous cycles to try to improve the performance of the manipulator in subsequent cycles. This is the basic idea of repetitive control or learning control. Consider the rigid robot model given by Equation 76.23 and suppose one is given a desired output trajectory on a finite time interval, yd(t), 0 ≤ t ≤ T, which may represent a joint space trajectory or a task space trajectory. The reference trajectory yd(t) is used in repeated trials of the manipulator, assuming either that the trajectory is periodic, yd(T) = yd(0) (repetitive control), or that the robot is reinitialized to lie on the desired trajectory at the beginning of each trial (learning control). Hereafter we use the term learning control to mean either repetitive or learning control. Let τk(t) be the input torque during the k-th cycle, which produces an output yk(t), 0 ≤ t ≤ T. The input/output pair [τk(t), yk(t)] may be stored and utilized in the (k+1)-st cycle. The initial control input τ₀(t) can be any control input that produces a stable output, such as a PD control. The learning control problem is to determine a recursive control law

τk+1 = F(τk, Δyk),   (76.109)

where Δyk(t) = yk(t) − yd(t), such that ‖Δyk‖ → 0 as k → ∞ in some suitably defined function space norm, ‖·‖. Such learning control algorithms are attractive because accurate models of the dynamics need not be known a priori. Several approaches have been used to generate a suitable learning law F and to prove convergence of the output error. A P-type learning law is one of the form

τk+1(t) = τk(t) − ΦΔyk(t),   (76.110)

so called because the correction term to the input torque at each iteration is proportional to the error Δyk. A D-type learning law is one of the form

τk+1(t) = τk(t) − Γ(d/dt)Δyk(t).   (76.111)

A more general PID-type learning algorithm takes the form

τk+1(t) = τk(t) − Γ(d/dt)Δyk(t) − ΦΔyk(t) − ∫₀ᵗ ΨΔyk(σ)dσ.   (76.112)
Convergence of yk(t) to yd(t) has been proved under various assumptions on the system. The earliest results considered the robot dynamics linearized around the desired trajectory and proved convergence for the linear time-varying system that results. Later results proved convergence based on the complete Lagrangian model. Passivity has been shown to play a fundamental role in the convergence and robustness of learning algorithms. Given a joint space trajectory qd(t), let τd(t) be defined by the inverse dynamics, i.e.,
τd(t) = M(qd(t))q̈d(t) + C(qd(t), q̇d(t))q̇d(t) + g(qd(t)).   (76.113)

The function τd(t) need not be computed; it is sufficient to know that it exists. Consider the P-type learning control law given by Equation 76.110 and subtract τd(t) from both sides to obtain

Δτk+1 = Δτk − ΦΔyk,   (76.114)

where Δτk = τk − τd. It follows that

Δτk+1ᵀΦ⁻¹Δτk+1 = ΔτkᵀΦ⁻¹Δτk + ΔykᵀΦΔyk − 2ΔykᵀΔτk.   (76.115)
Multiplying both sides by e⁻ᵠᵗ (with exponent −λt) and integrating over [0, T], it can be shown that
provided there exist positive constants λ and β such that
for all k. Equation 76.117 defines a passivity relationship of the exponentially weighted error dynamics. It follows from Equation 76.116 that Δyk → 0 in the L2 norm. See [10, part 5] and the references therein for a complete discussion of this and other results in learning control.
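A minimal sketch of the P-type law (Equation 76.110) on a toy plant shows the iteration-domain contraction; the static plant and the gain are assumptions, standing in for the robot over one trial.

```python
import numpy as np

# Sketch of P-type iterative learning control:
# tau_{k+1}(t) = tau_k(t) - Phi * Delta_y_k(t), with Delta_y_k = y_k - y_d,
# applied to a toy static plant y = G(tau) so each trial is easy to check.

def plant(tau):
    # Toy plant standing in for the robot's input/output map over one trial.
    return 0.5 * tau

def p_type_ilc(yd, trials=50, phi=1.0):
    tau = np.zeros_like(yd)          # initial input (e.g., from PD control)
    for _ in range(trials):
        dy = plant(tau) - yd         # Delta_y_k = y_k - y_d
        tau = tau - phi * dy         # P-type update, Equation 76.110 form
    return tau

yd = np.array([1.0, 2.0, 3.0])       # desired output samples (assumed)
tau = p_type_ilc(yd)
print(np.round(plant(tau), 3))       # output tracks yd: [1. 2. 3.]
```

Here the trial-to-trial error contracts by a fixed factor, so the learned input converges without the plant model ever being used in the update, which is the attraction of learning control noted above.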
References
[1] Asada, H. and Slotine, J-J.E., Robot Analysis and Control, John Wiley & Sons, Inc., New York, 1986.
[2] Dorf, R.C., Ed., International Encyclopedia of Robotics: Applications and Automation, John Wiley & Sons, Inc., 1988.
[3] Kokotović, P.V., Khalil, H.K., and O'Reilly, J., Singular Perturbation Methods in Control: Analysis and Design, Academic Press, Inc., London, 1986.
[4] Luh, J.Y.S., Conventional controller design for industrial robots: a tutorial, IEEE Trans. Syst., Man, Cybern., 13(3), 298-316, May/June 1983.
[5] Khalil, H., Nonlinear Systems, Macmillan Press, New York, 1992.
[6] Nof, S.Y., Ed., Handbook of Industrial Robotics, John Wiley & Sons, Inc., New York, 1985.
[7] Ortega, R. and Spong, M.W., Adaptive control of rigid robots: a tutorial, in Proc. IEEE Conf. Decision Control, 1575-1584, Austin, TX, 1988.
[8] Paul, R.C., Modeling, Trajectory Calculation, and Servoing of a Computer Controlled Arm, Stanford A.I. Lab, A.I. Memo 177, Stanford, CA, November 1972.
[9] Spong, M.W., On the robust control of robot manipulators, IEEE Trans. Autom. Control, 37, 1782-1786, November 1992.
[10] Spong, M.W., Lewis, F., and Abdallah, C., Robot Control: Dynamics, Motion Planning, and Analysis, IEEE Press, 1992.
[11] Spong, M.W. and Vidyasagar, M., Robot Dynamics and Control, John Wiley & Sons, Inc., New York, 1989.
76.2 Force Control of Robot Manipulators
Joris De Schutter, Katholieke Universiteit Leuven, Department of Mechanical Engineering, Leuven, Belgium
Herman Bruyninckx, Katholieke Universiteit Leuven, Department of Mechanical Engineering, Leuven, Belgium
76.2.1 Introduction
Robots of the first generations were conceived as "open-loop" positioning devices; they operate with little or no feedback at all from the process in which they participate. For industrial assembly environments, this implies that all parts or subassemblies have to be prepositioned with a high accuracy, which requires expensive and rather inflexible peripheral equipment. Providing robots with sensing capabilities can reduce these accuracy requirements considerably. In particular, for industrial assembly, force feedback is extremely useful. But also for other tasks, in which a tool held by the robot has to make controlled contact with a work piece, as in deburring, polishing, or cleaning, it is
not a good idea to rely fully on the positioning accuracy of the robot, and force feedback or force control becomes mandatory. Force feedback is classified into two categories. In passive force feedback, the trajectory of the robot end-effector is modified by the interaction forces due to the inherent compliance of the robot; the compliance may be due to the structural compliance of links, joints, and end-effector or to the compliance of the position servo. In passive force feedback there is no actual force measurement, and the preprogrammed trajectory of the end-effector is never changed at execution time. On the other hand, in active force feedback, the interaction forces are measured, fed back to the controller, and used to modify, or even generate on-line, the desired trajectory of the robot end-effector. Up till now force feedback applications in industrial environments have been mainly passive, for obvious reasons. Passive force feedback requires neither a force sensor nor a modified programming and control system and is therefore simple and cheap. In addition, it operates very fast. The Remote Center Compliance (RCC), developed by Draper Lab [1] and now widely available in various forms, is a well-known example of passive force feedback. It consists of a compliant end-effector that is designed and optimized for peg-into-hole assembly operations. However, compared to active force feedback, passive force feedback has several disadvantages. It lacks flexibility, since for every robot task a special-purpose compliant end-effector has to be designed and mounted. Also, it can deal only with small errors of position and orientation. Finally, since no forces are measured, it can neither detect nor cope with error conditions involving excessive contact forces, and it cannot guarantee that high contact forces will never occur.
Clearly, although active force feedback has an answer to all of these issues, it is usually slower, more expensive, and more sophisticated than purely passive force feedback. Apart from a force sensor, it also requires an adapted programming and control system. In addition, it has been shown [2] that, in order to obtain a reasonable task execution speed and disturbance rejection capability, active force feedback has to be used in combination with passive force feedback. In this chapter the design of active force controllers is discussed. More detailed reviews are given in [7] and [10]. In order to apply active force control, the following components are needed: force measurement, task specification, and control. Force measurement. For a general force-controlled task, six force components are required to provide complete contact force information: three translational force components and three torques. Very often, a force sensor is mounted at the robot wrist, but other possibilities exist. The force signals may be obtained using strain measurements, which results in a stiff sensor, or using deformation measurements (e.g., optically), which results in a compliant sensor. The latter approach has an advantage if additional passive compliance is desired. Task specification. For robot tasks involving constrained motion, the user has to specify more than just the desired motion, as in the case of motion in free space. In addition, he has to specify how the robot has to interact with the external constraints. Basically
there are two approaches, which are explained in this chapter. In the hybrid force/position control approach, the user specifies both desired motion and, explicitly, desired contact forces in two mutually independent subspaces. In the second approach, called impedance control, the user specifies how the robot has to comply with the external constraints; i.e., he specifies the dynamic relationship between contact forces and executed motion. In the hybrid approach, there is a clear separation between task specification and control; in the impedance approach, task specification and control are closely linked. Control. The aim of the force control system is to make the actual contact forces, as measured by the sensor, equal to the desired contact forces, given by the task specification. This is called low-level or setpoint control, which is the main topic of this chapter. However, by interpreting the velocities actually executed by the robot, as well as the measured contact forces, much can be learned about the actual geometry of the constraints. This is the key for more high-level or adaptive control. The chapter is organized as follows. Section 76.2.2 derives the system equations for control and task specification purposes. It also contains examples of hybrid force/position task specification. Section 76.2.3 describes methods for hybrid control of end-effector motion and contact force. Section 76.2.4 shows how to adapt the hybrid controller to the actual contact geometry. Section 76.2.5 briefly describes the impedance approach.
76.2.2 System Equations Chapter 76.1 derived the general dynamic equations for a robot arm constrained by a contact with the environment:
Feᵀ = (fᵀ mᵀ) is a six-vector representing Cartesian forces (f) and torques (m) occurring in the contact between robot and environment. q = [q₁ … qn]ᵀ are the n joint angles of the manipulator. M(q) is the inertia matrix of the manipulator, expressed in joint space form. C(q, q̇)q̇ and g(q) are the velocity- and gravity-dependent terms, respectively. τ is the vector of joint torques. J is the manipulator's Jacobian matrix that transforms joint velocities q̇ to the Cartesian linear and angular end-effector velocities represented by the six-vector Vᵀ = (vᵀ ωᵀ):
The Cartesian contact force Fe is determined both by Equation 76.118 and by the dynamics of the environment. The dynamic model of the environment contains two aspects: (1) the kinematics of the contact geometry, i.e., in what directions is the manipulator's motion constrained by the environment; and (2) the relationship between force applied to the environment and the deformation of the constraint surface. We consider two cases: (1) the robot and the environment are perfectly rigid; and (2) the environment behaves as a mass-spring-damper system. The first model is most appropriate for:
Motion specification purposes: The user has to specify the desired motion of the robot, as well as the desired contact forces. This is easier within a Cartesian, rigid, and purely geometric constraint model.
Theoretical purposes: A perfectly rigid interaction between robot and environment is an interesting ideal limit case, to which all other control approaches can be compared [9].
The soft environment model corresponds better to most of the real situations. Usually, the model of the robot-environment interaction is simplified by assuming that all compliance in the system, including that of the manipulator, its servo, the force sensor, and the tool, is localized in the environment.
Rigid Environment Two cases are considered: (1) the constraints are formulated in joint or configuration space; and (2) the constraints are formulated in Cartesian space. Configuration Space Formulation Assume that the kinematic constraints imposed by the environment are expressed in configuration space by nf algebraic equations ψⱼ(q) = 0, j = 1, …, nf.
This assumes the constraints are rigid, bilateral, and holonomic. Assume also that these nf constraints are mutually independent; then they take away nf motion degrees of freedom from the manipulator. The constrained robot dynamics are now derived from the unconstrained robot dynamics by the classical technique of incorporating the constraints into the Lagrangian function (see Chapter 76.1). For the unconstrained system, the Lagrangian is the difference between the system's kinetic energy, K, and its potential energy, P. Each of the constraint equations ψⱼ, j = 1, …, nf, should be identically satisfied. Lagrange's approach to satisfying both the dynamical equations of the system and the constraint requirements was to define the extended Lagrangian
λ = [λ₁ … λnf] is a vector of Lagrange multipliers. The solution to the Lagrangian equations
then results in the constrained dynamics, in which Jψ(q) is the nf × n Jacobian matrix (i.e., the matrix of partial derivatives with respect to the joint angles q) of the constraint function ψ(q). Jψ is of full rank, nf, because the constraints are assumed to be independent. (If not, the constraints represent a so-called hyperstatic situation.) Jψᵀ(q)λ represents the ideal contact forces, i.e., without contact friction.
Cartesian Space Formulation The kinematic constraints in Cartesian space are easily derived from the configuration space results, as long as the manipulator's Jacobian matrix J is square (i.e., n = 6) and nonsingular. In that case, Equation 76.121 is equivalent to [Jᵀ denotes J transpose and J⁻ᵀ = (Jᵀ)⁻¹]
Task Specification The task specification module specifies force and velocity setpoints, Fᵈ and Vᵈ, respectively. In order to be consistent with the constraints, these setpoints must lie in the force- and velocity-controlled directions, respectively. Hence, the instantaneous task description corresponds to specifying the vectors φᵈ and χᵈ: Fᵈ = Sf φᵈ and Vᵈ
Denote J_φ(q) J^{-1} by J_φx(q). This is an nf x 6 matrix. Then
Comparison with Equation 76.118 shows that
This means that the ideal Cartesian reaction forces F_e belong to an nf-dimensional vector space, spanned by the columns of the full-rank matrix J_φx^T. Let S_f denote a basis of this vector space, i.e., a set of nf independent contact forces that can be generated by the constraints:
The coordinate vector φ contains dimensionless scalars; all physical dimensions of Cartesian force are contained in the basis S_f. The time derivative of Equation 76.120 yields
Hence, using Equation 76.119 yields
Combining Equations 76.123 and 76.126 gives the kinematic reciprocity relationship between the ideal Cartesian reaction forces F_e (spanning the so-called force-controlled subspace) and the Cartesian manipulator velocities V that obey the constraints (spanning the velocity-controlled subspace):
This means that the velocity-controlled subspace is the n_x-dimensional (n_x = 6 - nf) reciprocal complement of the force-controlled subspace. It can be given a basis S_x, such that
Again, χ is a vector of physically dimensionless scalars; the columns of S_x have the physical dimensions of a Cartesian velocity. From Equation 76.127 it follows that
= S_x χ^d.   (76.130)
These equations are invariant with respect to the choice of reference frame and with respect to a change in the physical units. However, the great majority of tasks have a set of orthogonal reference frames in which the task specification becomes very easy and intuitive. Such a frame is called a task frame or compliance frame [6]. Figures 76.6 and 76.7 show two examples.

Inserting a round peg in a round hole. The goal of this task is to push the peg into the hole while avoiding wedging and jamming. The peg behaves as a cylindrical joint; hence, it has two degrees of motion freedom (n_x = 2), while the force-controlled subspace is of rank four (nf = 4). Hence, the task can be achieved by the following four force setpoints and two velocity setpoints in the task frame depicted in Figure 76.6:

1. A nonzero velocity in the Z direction
2. Zero forces in the X and Y directions
3. Zero torques about the X and Y directions
4. An arbitrary angular velocity about the Z direction
The task continues until a "large" reaction force in the Z direction is measured. This indicates that the peg has hit the bottom of the hole.

Sliding a block over a planar surface. The goal of this task is to slide the block over the surface without generating too large reaction forces and without breaking the contact. There are three velocity-controlled directions and three force-controlled directions (n_x = nf = 3). Hence, the task can be achieved by the following setpoints in the task frame depicted in Figure 76.7:

1. A nonzero force in the Z direction
2. A nonzero velocity in the X direction
3. A zero velocity in the Y direction
4. A zero angular velocity about the Z direction
5. Zero torques about the X and Y directions
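The sliding-block specification can be written down concretely. The sketch below assumes a particular twist/wrench coordinate ordering, and the helper `basis` is ours, not from the text; it simply checks that the resulting bases S_x and S_f satisfy the reciprocity condition and generate the setpoints of Equation 76.130.

```python
import numpy as np

# Assumed coordinate order (illustrative convention, not fixed by the text):
# twists  (vx, vy, vz, wx, wy, wz), wrenches (Fx, Fy, Fz, Tx, Ty, Tz).
def basis(*idx):
    """Hypothetical helper: unit-column basis selecting the given axes."""
    S = np.zeros((6, len(idx)))
    for col, i in enumerate(idx):
        S[i, col] = 1.0
    return S

# Sliding a block on a plane: n_x = n_f = 3.
Sx = basis(0, 1, 5)   # velocity-controlled: vx, vy, wz
Sf = basis(2, 3, 4)   # force-controlled:    Fz, Tx, Ty

# Kinematic reciprocity (Equation 76.127): Fe^T V = 0 for every
# Fe in span(Sf) and V in span(Sx), i.e., Sf^T Sx = 0.
print(np.allclose(Sf.T @ Sx, 0))          # -> True

# Setpoints (Equation 76.130): V^d = Sx chi^d, F^d = Sf phi^d.
chi_d = np.array([0.05, 0.0, 0.0])        # nonzero vx only
phi_d = np.array([10.0, 0.0, 0.0])        # nonzero Fz only
Vd, Fd = Sx @ chi_d, Sf @ phi_d
print(Vd)   # [0.05 0.   0.   0.   0.   0.  ]
print(Fd)   # [ 0.  0. 10.  0.  0.  0.]
```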
Soft Environment For the fully constrained case, i.e., all end-effector degrees of freedom are constrained by the environment, the dynamic equation of the robot-environment interaction is given by
ΔV = V - V_e is the Cartesian deformation velocity of the soft environment, with V_e the velocity of the environment. ΔA = d(ΔV)/dt is the deformation acceleration, and ΔX = ∫ΔV dt is the deformation. Note that velocity is taken here as the basic motion input, since the difference X - X_e of two position
THE CONTROL HANDBOOK
pseudo-inverses, defined as

S_x^† = (S_x^T K_e S_x)^{-1} S_x^T K_e,
S_f^† = (S_f^T K_e^{-1} S_f)^{-1} S_f^T K_e^{-1}.   (76.133)
With Equation 76.129 this yields

S_x^T K_e^{-1} S_f = 0,   S_f^T K_e S_x = 0.   (76.134)
The projection matrices decompose every Cartesian force F_e into a force of constraint, P_f F_e, which is fully taken up by the constraint, and a force of motion, (I_6 - P_f) F_e, which generates motion along the velocity-controlled directions. (I_6 is the 6 x 6 identity matrix.) Similarly, the Cartesian velocity V consists of a velocity of freedom part, P_x V, in the velocity-controlled directions and a velocity of constraint part, (I_6 - P_x) V, which deforms the environment. Within the velocity- and force-controlled subspaces, the measured velocity V and the measured force F_e correspond, respectively, to the following coordinate vectors with respect to the bases S_x and S_f:
Figure 76.6 Peg-in-hole.
Note that in the limit case of a rigid (and frictionless) environment, V and F_e do lie in the ideal velocity- and force-controlled subspaces. As a result, the projections of V and F_e onto the velocity- and force-controlled subspace, respectively, coincide with V and F_e. Hence, Equation 76.135 always gives the same results, whatever weighting matrices are chosen in Equation 76.133. Using the projection matrices P_f and P_x, the ideal contact force of Equation 76.132 is rewritten for the partially constrained case as

F_e = P_f K_e (I_6 - P_x) ΔX.   (76.136)
Figure 76.7 Sliding a block.
With Equation 76.134 this reduces to

F_e = P_f K_e ΔX.   (76.137)

six-vectors is not well defined. Here X represents the position and orientation of the robot end-effector, and X_e represents the position and orientation of the environment. M_e is a positive definite inertia matrix; C_e and K_e are positive semidefinite damping and stiffness matrices. If there is sufficient passive compliance in the system, Equation 76.131 is approximated by

F_e = K_e ΔX.   (76.132)

This case of soft environment is considered subsequently. For partially constrained motion, the contact kinematics influence the dynamics of the robot-environment interaction. Moreover, in the case of a soft environment the measured velocity V does not completely belong to the ideal velocity subspace, defined for a rigid environment, because the environment can deform. Similarly, the measured force F_e does not completely belong to the ideal force subspace, also due to friction along the contact surface. Hence, for control purposes the measured quantities V and F_e have to be projected onto the corresponding modeled subspaces. Algebraically, these projections are performed by the projection matrices P_x = S_x S_x^† and P_f = S_f S_f^† [3], where S_x^† and S_f^† are the weighted pseudo-inverses of Equation 76.133.
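A minimal numeric sketch of the weighted pseudo-inverses and the resulting projection matrices (Equation 76.133 and the surrounding discussion), using the sliding-block bases and an illustrative diagonal contact stiffness K_e:

```python
import numpy as np

# Bases for the sliding-block task (illustrative coordinate order).
Sx = np.zeros((6, 3)); Sx[0, 0] = Sx[1, 1] = Sx[5, 2] = 1.0
Sf = np.zeros((6, 3)); Sf[2, 0] = Sf[3, 1] = Sf[4, 2] = 1.0
Ke = np.diag([1e4, 1e4, 5e4, 2e3, 2e3, 2e3])   # made-up contact stiffness

# Weighted pseudo-inverses (Equation 76.133).
Ke_inv = np.linalg.inv(Ke)
Sx_pi = np.linalg.inv(Sx.T @ Ke @ Sx) @ Sx.T @ Ke
Sf_pi = np.linalg.inv(Sf.T @ Ke_inv @ Sf) @ Sf.T @ Ke_inv

# Projection matrices Px = Sx Sx^+ and Pf = Sf Sf^+ are idempotent.
Px, Pf = Sx @ Sx_pi, Sf @ Sf_pi
print(np.allclose(Px @ Px, Px), np.allclose(Pf @ Pf, Pf))   # True True

# A measured wrench splits into a force of constraint Pf Fe and a
# force of motion (I6 - Pf) Fe, which add back up to Fe.
Fe = np.array([1.0, -2.0, 30.0, 0.5, -0.4, 0.2])
print(np.allclose(Pf @ Fe + (np.eye(6) - Pf) @ Fe, Fe))     # True
```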
76.2.3 Hybrid Force/Position Control

The aim of hybrid control is to split simultaneous control of both end-effector motion and contact force into two separate and decoupled subproblems [5]. Three different hybrid control approaches are presented here. In the first two, acceleration-resolved approaches, the control signals generated in the force and velocity subspaces are transformed to the joint space signals necessary to drive the robot joints, in terms of accelerations. Both the case of a rigid and of a soft environment are considered. In the third, velocity-resolved approach, this transformation is performed in terms of velocities. Similarly as described in Chapter 76.1, all three approaches use a combination of an inner and an outer controller: the inner controller compensates the dynamics of the robot arm and may be model based; the outer controller is purely error driven. We consider only the case of a constant contact geometry, so that S_x and S_f are constant.
Acceleration-Resolved Control: Case of Rigid Environment The system equations are given by Equation 76.118, where F_e is an ideal constraint force as in Equation 76.124. The inner loop controller is given by
Similarly, premultiply Equation 76.142 with S_f^T K_e. This eliminates the terms containing A and a_x, because of Equation 76.134, and results in
S_f^T K_e J M^{-1} J^T F_e = S_f^T K_e J M^{-1} J^T F^d.   (76.147)
Since S_f^T K_e J M^{-1} J^T S_f is an nf x nf nonsingular matrix, and with Equations 76.124 and 76.130, this reduces to φ̃ = 0, i.e., φ = φ^d. Here F^d is the desired force, specified as in Equation 76.130. a_q is a desired joint space acceleration, which is related to a_x, the desired Cartesian space acceleration resulting from the outer loop controller:

a_q = J^{-1} (a_x - J̇ q̇).   (76.140)

The closed-loop dynamics of the system with its inner loop controller are derived as follows. Substitute Equation 76.140 in Equation 76.139, then substitute the result in Equation 76.118. Using the derivative of Equation 76.119,
where A is the Cartesian acceleration of the end-effector, this results in
with φ̃ = φ^d - φ. This proves the complete decoupling between velocity- and force-controlled subspaces. In the velocity-controlled subspace, the dynamics are assigned by choosing appropriate matrices K_dx and K_px. The force control is very sensitive to disturbance forces, since it contains no feedback. Suppose a disturbance force F_dist acts on the end-effector, e.g., due to modeling errors in the inner loop controller. This changes Equation 76.149 to

φ̃ = φ_dist,   (76.150)
where φ_dist = S_f^† F_dist. This effect is compensated for by
In the case of a rigid environment, the Cartesian acceleration A is given by the derivative of Equation 76.128:
The outer loop controller generates an acceleration in the motion-controlled subspace only:
where χ̃ = χ^d - χ; χ^d specifies the desired end-effector velocity as in Equation 76.130; χ represents the coordinates of the measured velocity and is given by Equation 76.135; ∫χ̃ dt represents the time integral of χ̃; K_dx and K_px are control gain matrices with dimensions n_x x n_x and physical units 1/s and 1/s^2, respectively. Substituting the outer loop controller of Equation 76.144 into Equation 76.142 results in the closed-loop equation. This closed-loop equation is split up into two independent parts corresponding to the force- and velocity-controlled subspaces. First, premultiply Equation 76.142 with S_x^T K_e^{-1} J^{-T} M J^{-1}. This eliminates the terms containing F_e and F^d, because of Equation 76.134, and results in
Since S_x^T K_e^{-1} J^{-T} M J^{-1} S_x is an n_x x n_x nonsingular matrix, and with Equations 76.143 and 76.144, this reduces to
Choosing diagonal gain matrices K_dx and K_px decouples the velocity control.
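In one diagonal channel, the decoupled velocity-error dynamics implied by Equations 76.144 and 76.146 form a second-order loop, dχ̃/dt + K_dx χ̃ + K_px ∫χ̃ dt = 0. A scalar Euler simulation with illustrative gains confirms that the error decays:

```python
# Decoupled scalar velocity-error dynamics (one diagonal channel):
#   d(e)/dt = -Kdx * e - Kpx * integral(e)
# Gains and time step are illustrative, not from the text.
Kdx, Kpx = 20.0, 100.0          # units 1/s and 1/s^2; poles at s = -10 (double)
e, e_int, dt = 1.0, 0.0, 1e-3   # normalized initial velocity error

for _ in range(5000):           # 5 s of forward-Euler integration
    de = -Kdx * e - Kpx * e_int
    e_int += e * dt
    e += de * dt

print(abs(e) < 1e-6)            # -> True: the velocity error decays to zero
```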
modifying F^d in Equation 76.139 to include feedback, e.g., F^d + K_pf (φ^d - φ), with K_pf a dimensionless nf x nf (diagonal) force control gain matrix. Using this feedback control law, Equation 76.149 results in

(1 + K_pf) φ̃ = φ_dist.   (76.151)
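The benefit of the proportional force feedback term can be checked with one line of arithmetic: without feedback the force error equals the disturbance, while with a gain K_pf it is attenuated by the factor 1 + K_pf (scalar reading of Equations 76.150 and 76.151; numbers are illustrative):

```python
# Effect of proportional force feedback on a constant disturbance.
phi_dist = 5.0            # disturbance force coordinate (illustrative)
Kpf = 24.0                # dimensionless force control gain

err_open = phi_dist                  # no feedback: error = disturbance
err_closed = phi_dist / (1.0 + Kpf)  # with feedback F^d + Kpf*(phi^d - phi)

print(err_open, err_closed)   # 5.0 0.2
```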
Acceleration-Resolved Control: Case of Soft Environment The system equations are given by Equation 76.118 for the robot and by Equation 76.137 for the robot-environment interaction. The inner loop controller is taken as
where F_e is the measured contact force, which is supposed to correspond to the real contact force. a_q is again given by Equation 76.140. The closed-loop dynamics of the system with its inner loop controller result in
The outer loop controller contains two terms:

a_x = a_xx + a_xf.   (76.154)
a_xx corresponds to an acceleration of freedom and is given by Equation 76.144. On the other hand, a_xf corresponds to an acceleration of constraint and corresponds to
K_df and K_pf are (diagonal) control gain matrices with dimensions nf x nf and physical units 1/s and 1/s^2, respectively.
Substituting Equation 76.154 in Equation 76.153 leads to the closed-loop dynamic equation. The closed-loop equation is split up into two independent parts corresponding to the force- and velocity-controlled subspaces. First, premultiply Equation 76.153 with S_x^†. This eliminates the acceleration of constraint, a_xf, because of Equation 76.134, and results in
With χ̈ = S_x^† A, and with Equation 76.144, this reduces to Equation 76.146. Similarly, premultiply Equation 76.153 with S_f^T K_e. In the case of a stationary environment, i.e., X_e constant, the second derivative of Equation 76.137 is given by
With Equation 76.157, Equation 76.135, and Equation 76.134, this leads to
Hence, there is complete decoupling between velocity- and force-controlled subspaces. Suppose a disturbance force F_dist acts on the end-effector. The influence on the force loop dynamics is derived as
Instead, the state of practice consists of using a set of independent, i.e., completely decoupled, velocity controllers for each robot joint as the inner controller. Usually such a velocity controller is an analog proportional-integral (PI) type controller that controls the voltage of the joint actuator based on feedback of the joint velocity. This velocity is either obtained from a tachometer or derived by differentiating the position measurement. Such an analog velocity loop can be made very high bandwidth. Because of their high bandwidth, the independent joint controllers are able to decouple the robot dynamics in Equation 76.118 and to suppress the effect of the environment forces to a large extent, especially if the contact is sufficiently compliant. Other disturbance forces are suppressed in the same way by the inner velocity controller before they affect the dynamics of the outer loop. This property has made this approach, in combination with a compliant end-effector, very popular for practical applications. This practical approach is considered below. The closed-loop dynamics of the system with its high bandwidth inner velocity controller are approximated by
or, using Equation 76.119, in the Cartesian space:
v_x is the control signal generated by the outer loop controller. The outer loop controller contains two terms:
Hence, as in the case of a rigid environment, disturbance forces directly affect the force loop; their effect is proportional to the contact stiffness. As a result, accurate force control is much easier in the case of soft contact. This is achieved by adding extra compliance in the robot end-effector.
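The stiffness dependence is easy to quantify: in the compliant model F_e = K_e ΔX (Equation 76.132), the same position-level disturbance produces a force disturbance proportional to K_e. Illustrative numbers:

```python
# Why soft contact helps: a position-level disturbance dX maps to a
# contact-force disturbance Ke * dX (Fe = Ke * dX, Equation 76.132),
# so the force error scales with the contact stiffness.
dX = 1e-3                          # 1 mm position disturbance (illustrative)
F_stiff = 1e6 * dX                 # stiff contact,     Ke = 1e6 N/m
F_soft = 1e4 * dX                  # compliant contact, Ke = 1e4 N/m
print(F_stiff, F_soft)             # 1000.0 10.0
```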
Remarks.
1. Usually, φ̈^d = χ̇^d = 0.
2. Usually, the measured force signal is rather noisy. Therefore, feedback of φ̇ = S_f^† Ḟ_e in the outer loop controller of Equation 76.155 is often replaced by S_f^† K_e J q̇, where the joint velocities q̇ are measured using tachometers. In the case of a stationary environment, both signals are equivalent, and hence they result in the same closed-loop dynamics.
3. If in the inner controller of Equation 76.152 the contact force is compensated with the desired force F^d instead of the measured force F_e, the force loop dynamics become Equation 76.159, with F_dist = F_e - F^d. Hence, the dynamics of the different force coordinates are coupled.
Velocity-Resolved Control The model-based inner loop controllers of Equations 76.139 and 76.152 are too advanced for implementation in current industrial robot controllers. The main problems are their computational complexity and the nonavailability of accurate inertial parameters of each robot link. These parameters are necessary to calculate M(q), C(q, q̇), and g(q).
v_xx corresponds to a velocity of freedom and is given by

v_xx = S_x (χ^d + K_px ∫χ̃ dt).   (76.163)

On the other hand, v_xf corresponds to a velocity of constraint and corresponds to

v_xf = K_e^{-1} S_f (φ̇^d + K_pf φ̃).   (76.164)

Both K_px and K_pf have units 1/s.
Substituting Equation 76.162 in Equation 76.161 leads to the closed-loop dynamic equation. The closed-loop equation is split up into two independent parts corresponding to the force- and velocity-controlled subspaces. First, premultiply Equation 76.161 with S_x^†. This eliminates the velocity of constraint, v_xf, because of Equation 76.134, and results in
Similarly, premultiply Equation 76.161 with S_f^T K_e. In the case of a stationary environment, i.e., X_e constant, the derivative of Equation 76.137 is given by
This leads to
dφ̃/dt + K_pf φ̃ = 0.   (76.167)
Hence, there is complete decoupling between velocity- and force-controlled subspaces.
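The first-order force-error dynamics of Equation 76.167 decay exponentially with rate K_pf. A scalar Euler integration with illustrative values reproduces the closed-form solution:

```python
import numpy as np

# First-order force-error dynamics of the velocity-resolved scheme
# (Equation 76.167): d(phi_err)/dt + Kpf * phi_err = 0.
Kpf, dt = 10.0, 1e-3          # illustrative gain [1/s] and time step
e = 1.0                       # normalized initial force error
for _ in range(2000):         # 2 s of forward-Euler integration
    e += -Kpf * e * dt

# Compare against the exact solution e(t) = e(0) * exp(-Kpf * t):
print(abs(e - np.exp(-Kpf * 2.0)) < 1e-3)   # -> True
```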
76.2.4 Adaptive Control

The hybrid control approach just presented explicitly relies on a decomposition of the Cartesian space into force- and velocity-controlled directions. The control laws implicitly assume that accurate models of both subspaces are available all the time. On the other hand, most practical implementations turn out to be rather robust against modeling errors. For example, for the two tasks discussed in Section 76.2.2, i.e., peg-in-hole and sliding a block, the initial relative position between the manipulated object and the environment may contain errors. As a matter of fact, to cope reliably with these situations is exactly why force control is used! The robustness of the force controller increases if it can continuously adapt its model of the force- and velocity-controlled subspaces. In this chapter, we consider only geometric parameters (i.e., the S_x and S_f subspaces), not the dynamical parameters of the manipulator and/or the environment. The previous sections relied on the assumption that Ṡ_x = 0 and Ṡ_f = 0. If this assumption is not valid, the controller must follow (or track) the constraint's time variance by: (1) using feedforward (motion) information from the constraint equations (if known and available); or (2) estimating the changes in S_x and S_f from the motion and/or force measurements. The adaptation involves two steps: (1) identify the errors between the current constraint model and the currently measured contact situation; and (2) feed back these identified errors to the constraint model. Figures 76.8 and 76.9 illustrate two examples of error identification.
contacting edge of the object; and (2) an uncertainty Δα in the frame's orientation about the same edge (Figure 76.9). Identification equations for these uncertainties are

Δα = arctan(V_yt / V_xt)   (velocity based),   (76.168)
Δα = - arctan(F_xt / F_yt)   (force based).   (76.169)

The results from the force- and/or velocity-based identification should be fed back to the model. Moreover, in the contour-following case, the identified orientation error can be converted into an error χ̃, such that the setpoint control laws of Equation 76.144 or Equation 76.163 make the robot track changes in the contact normal.
Figure 76.8 Estimation of orientation error.
Figure 76.9 Estimation of position error.
Two-Dimensional Contour Following The orientation of the contact normal changes if the environment is not planar. Hence, an error Δα appears. This error angle can be estimated with either the velocity or the force measurements only (Figure 76.8):

1. Velocity based: The XtYt frame is the modeled task frame, while XoYo indicates the real task frame. Hence, the executed velocity V, which is tangential to the real contour, does not completely lie along the Xt axis, but has a small component V_yt along the Yt axis. The orientation error Δα is approximated by the arc tangent of the ratio V_yt/V_xt.
2. Force based: The measured (ideal) reaction force F does not lie completely along the modeled normal direction (Yt), but has a component F_xt along Xt. The orientation error Δα is approximated by the arc tangent of the ratio F_xt/F_yt.

The velocity-based approach is disturbed by mechanical compliance in the system; the force-based approach is disturbed by friction.
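Both identification rules can be exercised on synthetic measurements. The sketch below builds the ideal tangential velocity and normal force of a frictionless contact whose true frame is rotated by Δα, expresses them in the modeled frame, and recovers Δα with Equations 76.168 and 76.169 (the frame construction is illustrative):

```python
import numpy as np

# Contour following: estimate the orientation error between the modeled
# task frame (Xt, Yt) and the real one (Xo, Yo).
alpha_true = np.deg2rad(5.0)              # unknown orientation error

# Executed velocity is tangent to the real contour (along Xo); the ideal
# reaction force is along the real normal (Yo). In the modeled frame:
Vxt, Vyt = np.cos(alpha_true), np.sin(alpha_true)
Fxt, Fyt = -np.sin(alpha_true), np.cos(alpha_true)

da_vel = np.arctan2(Vyt, Vxt)             # velocity based (Eq. 76.168)
da_frc = -np.arctan2(Fxt, Fyt)            # force based    (Eq. 76.169)

print(np.isclose(da_vel, alpha_true), np.isclose(da_frc, alpha_true))
# -> True True: both rules recover the 5 degree error exactly here
```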
Sliding an Edge Over an Edge This task has two geometric uncertainty parameters when the robot moves the object over the environment: (1) an uncertainty Δx of the position of the task frame's origin along the
76.2.5 Impedance Control

The previous sections focused on a model-based approach toward force control: the controller relies on an explicit geometric model of the force- and velocity-controlled directions. However, an alternative approach exists, called impedance control. It differs from the hybrid approach both in task specification and in control.
Task Specification Hybrid control specifies desired motion and force trajectories; impedance control [4] specifies (1) a desired motion trajectory and (2) a desired dynamic relationship between the deviations from this desired trajectory, induced by the contact with the environment, and the forces exerted by the environment:

F = -[M_c Ṽ̇ + C_c Ṽ + K_c ∫Ṽ dt].   (76.170)

In damping control (or rather, inverse damping control), a modified velocity V^m is commanded:
Ṽ is the Cartesian error velocity, i.e., the difference between the prescribed velocity V^d and the measured velocity V; Ṽ̇ is the Cartesian error acceleration; M_c, C_c, and K_c are user-defined inertia, damping, and stiffness matrices, respectively. Compared to hybrid force/position control, the apparent advantage of impedance control is that no explicit knowledge of the constraint kinematics is required. However, in order to obtain a satisfactory dynamic behavior, the inertia, damping, and stiffness matrices have to be tuned for a particular task. Hence, they embody implicit knowledge of the task geometry, and hence task specification and control are intimately linked.
Besides introducing damping and stiffness, the most general case of admittance control changes the apparent inertia of the robot manipulator. In this case a desired acceleration A^m is solved from Equation 76.170:
X^m, V^m, and A^m are applied to the motion controller (see Chapter 76.1), in which the constraint forces may be compensated for by the measured forces as an extra term. For example, in the general admittance case, the control torques are
Control For control purposes the dynamic relationship of Equation 76.170 can be interpreted in two ways. It is the model of an impedance; i.e., the robot reacts to the "deformations" of its planned position and velocity trajectories by generating forces. Special cases are stiffness control [8], where M_c = C_c = 0, and damping control, where M_c = K_c = 0. However, Equation 76.170 can also be interpreted the other way, as an admittance; i.e., the robot reacts to the constraint forces by deviating from its planned trajectory.

Impedance Control In essence, a stiffness or damping controller is a proportional-derivative (PD) position controller, with position and velocity feedback gains adjusted in order to obtain the desired impedance. Consider a PD joint position controller (see Chapter 76.1):
where a_q is given by Equation 76.140, in which a_x = A^m; and A^m is given by Equation 76.179.
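The impedance/admittance duality is simply a matter of reading the same relationship in two directions. For the pure-stiffness case (M_c = C_c = 0 in Equation 76.170), a sketch with an illustrative K_c:

```python
import numpy as np

# Stiffness control and its admittance (compliance) dual: the same
# user-chosen Kc is read in two directions. Numbers are illustrative.
Kc = np.diag([1e3, 1e3, 5e3])       # stiffness matrix [N/m]
dX = np.array([2e-3, 0.0, -1e-3])   # deviation from the planned position

# Impedance view: a trajectory deviation generates a reaction force.
F = Kc @ dX

# Admittance view: a measured force commands a position modification.
dX_back = np.linalg.solve(Kc, F)
print(np.allclose(dX_back, dX))     # -> True: the two views are inverses
```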
This compares to Equation 76.170, with M_c = 0, by writing

Hence, in order to obtain the desired stiffness or damping behavior, the PD gain matrices have to be chosen as

Note that the gain matrices are position dependent due to the position dependence of the Jacobian J.

Admittance Control In this case, the measured constraint force F_e is used to modify the robot trajectory, given by X^d(t) and V^d(t). In stiffness control (or rather, inverse stiffness, or compliance control), a modified position X^m is commanded:

References

[1] DeFazio, T. L., Seltzer, D. S., and Whitney, D. E., The instrumented remote center compliance, Industrial Robot, 11(4), 238-242, 1984.
[2] De Schutter, J. and Van Brussel, H., Compliant robot motion, Int. J. Robotics Res., 7(4), 3-33, 1988.
[3] Doty, K. L., Melchiorri, C., and Bonivento, C., A theory of generalized inverses applied to robotics, Int. J. Robotics Res., 12(1), 1-19, 1993.
[4] Hogan, N., Impedance control: an approach to manipulation, Parts I-III, Trans. ASME, J. Dynamic Syst., Meas., Control, 107, 1-24, 1985.
[5] Khatib, O., A unified approach for motion and force control of robot manipulators: the operational space formulation, IEEE J. Robotics Autom., 3(1), 43-53, 1987.
[6] Mason, M. T., Compliance and force control for computer controlled manipulators, IEEE Trans. Syst., Man, Cybern., 11(6), 418-432, 1981.
[7] Patarinski, S. and Botev, R., Robot force control, a review, Mechatronics, 3(4), 377-398, 1993.
[8] Salisbury, J. K., Active stiffness control of a manipulator in Cartesian coordinates, 19th IEEE Conf. Decision Control, 95-100, 1980.
[9] Wang, D. and McClamroch, N. H., Position and force control for constrained manipulator motion: Lyapunov's direct method, IEEE Trans. Robotics Autom., 9(3), 308-313, 1993.
[10] Whitney, D. E., Historic perspective and state of the art in robot force control, Int. J. Robotics Res., 6(1), 3-14, 1987.
76.3. CONTROL OF NONHOLONOMIC SYSTEMS
76.3 Control of Nonholonomic Systems
John Ting-Yung Wen, Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute

76.3.1 Introduction

When the generalized velocity of a mechanical system satisfies an equality condition that cannot be written as an equivalent condition on the generalized position, the system is called a nonholonomic system. Nonholonomic conditions may arise from constraints, such as pure rolling of a wheel, or from physical conservation laws, such as the conservation of angular momentum of a free-floating body. Nonholonomic systems pose a particular challenge from the control point of view, as anyone who has tried to parallel park a car in a tight space can attest. The basic problem involves finding a path that connects an initial configuration to the final configuration and satisfies all the holonomic and nonholonomic conditions for the system. Both open-loop and closed-loop solutions are of interest: the open-loop solution is useful for off-line path generation, and the closed-loop solution is needed for real-time control. Nonholonomic systems typically arise in the following classes of systems:

1. No-slip constraint. Consider a single wheel rolling on a flat plane (see Figure 76.10). The no-slippage (or pure rolling) contact condition means that the linear velocity at the contact point is zero. Let ω̄ and v̄, respectively, denote the angular and linear velocity of the body frame attached to the center of the wheel. Then the no-slippage condition at the contact point can be written as

v̄ + ω̄ x r̄ = 0,   (76.181)

where r̄ is the vector from the wheel center to the contact point. We will see later that part of this constraint is nonintegrable (i.e., not reducible to a position constraint) and, therefore, nonholonomic.
This constraint is nonintegrable. The dynamic equations of wheeled vehicles and finger grasping are of similar forms. There are two sets of equations of motion, one for the unconstrained vehicle or finger, the other for the ground (stationary) or the payload. These two sets of equations are coupled by the constraint force/torque that keeps the vehicle on the ground with no wheel slippage, or the fingers on the object with no rotation about the local normal axis. These equations can be summarized in the following form:
M(q) q̈ + C(q, q̇) q̇ + g(q) = u - J^T f,   (a)
M_c a_c + b_c + k_c = A^T f,   (b)
Figure 76.10 A wheel with no-slip contact.
In modeling the grasping of an object by a robot hand, the so-called soft finger contact model is sometimes used. In this model, the finger is not allowed to rotate about the local contact normal but is free to rotate about the local x and y axes. This velocity
H f = 0,   (c)
v_+ = J q̇,   v_- = A v_c,   v_- = v_+ + H^T w,   (d)
a_+ = J q̈ + J̇ q̇,   a_- = A a_c + Ȧ v_c,   a_- = a_+ + H^T ẇ + Ḣ^T w.   (e)
(76.182)
Equations 76.182a and 76.182b are the equations of motion of the fingers and the payload, respectively; f is the constraint force related to the vehicle or fingers via the Jacobian transpose J^T; a_c denotes the payload acceleration; b_c and k_c are the Coriolis and gravity forces on the payload; H is a full row rank matrix whose null space specifies the directions where motion at the contact is allowed (these are also the directions with no constraint forces); v_+ and v_- are the velocities at the two sides of the contacts. Similarly, a_+ and a_- denote accelerations, and w parameterizes the admissible velocity across the contact. The velocity constraint is specified in Equation 76.182d; premultiplying Equation 76.182d by the annihilator of H^T, denoted by H̄, gives

H̄ (J q̇ - A v_c) = 0.   (76.183)
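An annihilator H̄ of H^T (i.e., H̄ H^T = 0) can be computed numerically from a basis of the null space of H. The sketch below uses the SVD and an illustrative soft-finger-style H:

```python
import numpy as np

def annihilator(H):
    """Rows spanning the null space of H, so that result @ H.T == 0.
    A generic numerical construction via the SVD (illustrative helper)."""
    U, s, Vt = np.linalg.svd(H)
    r = int(np.sum(s > 1e-12))      # numerical rank of H
    return Vt[r:]                   # remaining right singular vectors

# Soft-finger-style constraint matrix of the form H = [0, I] (here 1 x 6):
H = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])
H_bar = annihilator(H)

print(H_bar.shape)                      # (5, 6)
print(np.allclose(H_bar @ H.T, 0.0))    # -> True: H_bar annihilates H^T
```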
In the single wheel case in Figure 76.10,

H = [0   I].   (76.184)
The velocity constraint Equation 76.183 is then the same as Equation 76.181.

2. Conservation of angular momentum. In a Lagrangian system, if a subset of the generalized coordinates, q_c, does not appear in the mass matrix M(q), they are called the cyclic coordinates. In this case, the Lagrangian equation associated with q_c is

d/dt (∂L/∂q̇_c) = 0.   (76.185)
After integration, we obtain the conservation of generalized momentum condition associated with the cyclic coordinates. As an example, consider a free-floating multibody system with no external torque (such as a robot attached to a floating platform in space, or an astronaut unassisted by the jet pack, as shown in Figure 76.11).
Second-Order condition: Consider a robot with some of the joints unactuated. The general dynamic equation can be written as
By premultiplying by B̄ = [0  I], which annihilates the input vector, we obtain a condition involving the acceleration,
It can be shown that this equation is integrable to a velocity condition, h(q, q̇, t) = 0, if, and only if, the following conditions hold [1]:
Figure 76.11 Examples of free floating multibody systems.
(b) the mass matrix M(q) does not depend on the unactuated coordinates, q_u = B̄ q.
The equation of motion for such systems is

[ M(q)  M_c(q) ; M_c^T(q)  M_b(q) ] [ q̈ ; ω̇ ] + [ c_1(q, q̇, ω) ; c_2(q, q̇, ω) ] = [ u ; 0 ],
where ω is the angular velocity of the multibody system about the center of mass. This is a special case of the situation described above, with L = (1/2) q̇^T M(q) q̇ + q̇^T M_c(q) ω + (1/2) ω^T M_b(q) ω. Identifying q_c with ω, Equation 76.185 becomes (for zero initial angular momentum)

M_c^T(q) q̇ + M_b(q) ω = 0,   (76.186)
which is a nonintegrable condition.

3. Underactuated mechanical system. An underactuated mechanical system is one that does not have all of its degrees of freedom independently actuated. The nonintegrable condition can arise in terms of velocity, as we have seen above, or in terms of acceleration which cannot be integrated to a velocity condition. The latter case is called the
second-order nonholonomic condition [1]. First-Order condition: Consider a rigid spacecraft with fewer than three independent torques.
where B is a full column rank matrix with rank less than three. Let B̄ be the annihilator of B, i.e., B̄B = 0. Then premultiplying Equation 76.188 by B̄ gives (B̄ J ω)̇ = 0. Assuming the initial velocity is zero, we then arrive at a nonintegrable velocity constraint,

B̄ J ω = 0.   (76.189)
(a) the gravitational torque for the unactuated variables, g_u(q) = B̄ g(q), is a constant, and
This implies that any earthbound robot with nonplanar, articulated, underactuated degrees of freedom would satisfy a nonintegrable second-order constraint, because g_u(q) would not be constant. The control problem associated with a nonholonomic system can be posed based on the kinematics alone (with an ideal dynamic controller assumed) or on the full dynamical model. In the kinematic case, nonholonomic conditions are linear in the velocity, v:

Ω(q) v = 0.   (76.191)

Assuming that the rank of Ω(q) is constant over q, then Equation 76.191 can be equivalently stated as
where the columns of f(q) form a basis of the null space of Ω(q). Equation 76.192 can be regarded as a control problem with u as the control variable and the configuration variable, q, as the state, if v = q̇. If v is nonintegrable (as is the case for the angular velocity), there is an additional kinematic equation q̇ = h(q)v (such as the attitude kinematic equation); the control problem then becomes q̇ = h(q)f(q)u. Note that in either case, the right-hand side of the differential equation does not contain a term dependent only on q. Such systems are called driftless systems. Solving the control problem associated with the kinematic Equation 76.192 produces a feasible path. To actually follow the path, a real-time controller is needed to produce the required force or torque. This procedure of decomposing path planning and path following is common in industrial robot motion control. Alternatively, one can also consider the control of the full dynamical system directly. In other words, consider Equation 76.182 for the rolling constraint case, or Equations 76.186, 76.188, or 76.189 for the underactuated case, with u as the control input. In the rolling constraint case, the contact force also needs
to be controlled, similar to a robot performing a contact task. Otherwise, slippage or even loss of contact may result (e.g., witness occasional truck rollovers on highway exit ramps). The dynamical equations also differ from the kinematic problem Equation 76.192 in a fundamental way: a control-independent term, called the drift term, is present in the dynamics. In contrast to driftless systems, there is no known general global controllability condition for such systems. However, the presence of the drift term sometimes simplifies the problem by rendering the linearized system locally controllable. This chapter focuses mainly on the kinematic control problem. In addition to the many research papers already published on this subject, excellent summaries of the current state of research can be found in [2, 3]. In the remainder of this chapter, we address the following aspects of the kinematic control of a nonholonomic system:

1. Determination of nonholonomy. Given a set of constraints, how does one classify them as holonomic or nonholonomic?
2. Controllability. Given a nonholonomic system, does a path exist that connects an initial configuration to the desired final configuration?
3. Path planning. Given a controllable nonholonomic system, how does one construct a path that connects an initial configuration to the desired final configuration?
4. Stabilizability. Given a nonholonomic system, can one construct a stabilizing feedback controller and, if it is possible, how does one do so?
5. Output stabilizability. Given a nonholonomic system, can one construct a feedback controller that drives a specified output to the desired target while maintaining the boundedness of all the states and, if it is possible, how does one do so?

We shall use a simple example to illustrate various concepts and results throughout this section. Consider a unicycle with a fat wheel, i.e., one that cannot fall (see Figure 76.12). For this system,
Figure 76.12 Unicycle model and coordinate definition.
there are four constraints:
ẑ_B · ẑ = 0, and v̄ − ω̄ × ℓẑ = 0. (76.193)
The first equation specifies the no-tilt constraint and the second equation is the no-slip constraint.
76.3.2 Test of Nonholonomy
As motivated in the previous section, consider a set of constraints in the following form:
where q ∈ R^n is the configuration variable, q̇ is the velocity, and Ω(q) ∈ R^{ℓ×n} specifies the constraint directions. The complete integrability of the velocity condition in Equation 76.194 means that Ω(q) is the Jacobian of some function, h(q) ∈ R^ℓ, i.e.,
In this case, Equation 76.194 can be written as an equivalent holonomic condition, h(q) = c, where c is some constant vector. Equation 76.194 may be only partially integrable, which means that some of the rows of Ω(q), say, Ω_{k+1}, …, Ω_ℓ, satisfy
for some scalar functions h_i(q). Substituting Equation 76.196 in Equation 76.194, we have ℓ − k integrable constraints
which can be equivalently written as h_i(q) = c_i for some constants c_i. If ℓ − k is the maximum number of such h_i(q) functions, the remaining k constraints are then nonholonomic. To determine if the constraint Equation 76.194 is integrable, we can apply the Frobenius theorem. We first need some definitions:
DEFINITION 76.1
1. A vector field is a smooth mapping from the configuration space to the tangent space.
2. A distribution is the subspace generated by a collection of vector fields. The dimension of a distribution is the dimension of any basis of the distribution.
3. The Lie bracket between two vector fields, f and g, is defined as [f, g] = (∂g/∂q) f − (∂f/∂q) g.
4. An involutive distribution is a distribution that is closed with respect to the Lie bracket, that is, if f, g belong to a distribution Δ, then [f, g] also belongs to Δ.
5. A distribution, Δ, with constant dimension m, consisting of vector fields in R^n is integrable if n − m functions, h_1, …, h_{n−m}, exist so that the Lie derivative of h_i along each vector field f ∈ Δ is zero, that is,
THE CONTROL HANDBOOK
6. The involutive closure of a distribution Δ is the smallest involutive distribution that contains Δ.
The Frobenius theorem can be simply stated:
THEOREM 76.1 A distribution is integrable if, and only if, it is involutive.
To apply the Frobenius theorem to Equation 76.194, first observe that q̇ must be within the null space of Ω(q), denoted by Δ. Suppose the constraints are independent throughout the configuration space; then the dimension of Δ is n − ℓ. Let a basis of Δ be g_1(q), …, g_{n−ℓ}(q):
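The involutivity test can also be carried out numerically. The sketch below is illustrative (not from the handbook; all names are assumptions): it approximates the Lie bracket of the two unicycle null-space fields by central differences and recovers [g_1, g_2] = (sin θ, −cos θ, 0)ᵀ, which is independent of g_1 and g_2, so the distribution is not involutive and the no-slip constraint is nonholonomic.

```python
import math

def lie_bracket(f, g, q, eps=1e-5):
    """Numerically evaluate [f, g](q) = Dg(q) f(q) - Df(q) g(q)
    using central differences for the directional derivatives."""
    n = len(q)
    def jac_times(h, v):
        # directional derivative of vector field h at q along direction v
        qp = [q[i] + eps * v[i] for i in range(n)]
        qm = [q[i] - eps * v[i] for i in range(n)]
        hp, hm = h(qp), h(qm)
        return [(hp[i] - hm[i]) / (2 * eps) for i in range(n)]
    return [a - b for a, b in zip(jac_times(g, f(q)), jac_times(f, g(q)))]

# Null-space basis of the unicycle constraint, state (x, y, theta):
g1 = lambda q: [math.cos(q[2]), math.sin(q[2]), 0.0]   # drive
g2 = lambda q: [0.0, 0.0, 1.0]                         # steer

q = [0.0, 0.0, 0.7]
b = lie_bracket(g1, g2, q)
# b is approximately (sin(0.7), -cos(0.7), 0): a new direction, so the
# distribution spanned by g1, g2 is not involutive.
```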
From the Frobenius theorem, the annihilator of Δ̄ is integrable. Indeed, the corresponding holonomic constraints are what one could have obtained by inspection:
z = constant,
Let Δ̄ be the involutive closure of Δ. Suppose the dimension of Δ̄ is constant, n − ℓ + k. Since Δ̄ is involutive by definition, from the Frobenius theorem, Δ̄ is integrable. This means that functions h_i, i = 1, …, ℓ − k, exist so that ∇h_i annihilates Δ̄:
(76.200) which is of constant dimension four. The annihilator of Δ̄ is
for all f ∈ Δ̄. Since q̇ ∈ Δ, it follows that Equation 76.196 is satisfied for all q. Hence, among the ℓ constraints given by Equation 76.194, ℓ − k are holonomic (obtained from the annihilator of Δ̄) and k are nonholonomic. Geometrically, this means that the flows of the system lie on an n − ℓ + k dimensional manifold given by h_i = constant, i = 1, …, ℓ − k. To illustrate the above discussion, consider the unicycle example presented at the end of Section 76.3.1. First write the constraints Equation 76.193 in the same form as Equation 76.191:
roll angle = 0.
Eliminating the holonomic constraints results in a common form of the kinematic equation for a unicycle,
where u_1 = ω_y and u_2 = ω_z. Because the exact wheel rotational angle is frequently inconsequential, the φ equation is often omitted. In that case, the kinematic Equation 76.201 becomes
We shall refer to the system described by either Equation 76.201 or Equation 76.202 as the unicycle problem.
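As a quick numerical companion (illustrative code, not from the handbook), the unicycle kinematics of Equation 76.202 can be integrated with a simple Euler scheme; the rollout helper below is an assumption of this sketch:

```python
import math

def unicycle_step(state, u1, u2, dt):
    """One Euler step of Equation 76.202:
    x' = u1 cos(theta), y' = u1 sin(theta), theta' = u2."""
    x, y, th = state
    return (x + dt * u1 * math.cos(th),
            y + dt * u1 * math.sin(th),
            th + dt * u2)

def rollout(state, u, dt=1e-3, T=1.0):
    """Integrate the unicycle under the input function u(t) -> (u1, u2)."""
    n = round(T / dt)
    for k in range(n):
        u1, u2 = u(k * dt)
        state = unicycle_step(state, u1, u2, dt)
    return state

# Pure forward drive from theta = 0 moves one unit along x.
x, y, th = rollout((0.0, 0.0, 0.0), lambda t: (1.0, 0.0))
```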
76.3.3 Nonholonomic Path Planning Problem
Represent the top portion of each vector field in the body coordinates, ŷ_B = [0, 1, 0]ᵀ, ẑ = [0, 0, 1]ᵀ, and the bottom portion in the world coordinates, x̄_B = [ℓc_θ, ℓs_θ, 0]ᵀ, where c_θ = cos θ, s_θ = sin θ, and θ is the steering angle. We have
The nonholonomic path planning problem, also called nonholonomic motion planning, involves finding a path connecting specified configurations that satisfies the nonholonomic condition as in Equation 76.194. As discussed in Section 76.3.1, this problem can be written as an equivalent nonlinear control problem: Given the system
and initial and desired final configurations, q(0) = q_0 and q_f, find u = {u(t) : t ∈ [0, 1]} so that the solution of Equation 76.203 satisfies q(1) = q_f.
The involutive closure of Δ can be computed by taking repeated Lie brackets:
The terminal time has been normalized to 1. In Equation 76.203, f(q) is a full rank matrix whose columns span the null space of Ω(q) in Equation 76.194, and u parameterizes the degree of freedom in the velocity space. By construction, f(q) is necessarily a tall matrix, that is, an n × m matrix with n > m.
Controllability
For u = 0, every q in R^n is an equilibrium. The linearized system about any equilibrium q* is
Because f(q) is tall, this linear time-invariant system is not controllable (the controllability matrix, [f(q*) : 0 : ⋯ : 0], has maximum rank m). This is intuitively plausible; as the nonholonomic condition restricts the flows in the tangent space, the system can locally only move in directions compatible with the nonholonomic condition, contradicting the controllability requirement. However, the system may still be controllable globally. For a driftless system, such as a nonholonomic system described by Equation 76.203, the controllability can be ascertained through the following sufficient condition (sometimes called Chow's theorem):
THEOREM 76.2 The system given by Equation 76.203 is controllable if the involutive closure of the columns of f(q) is of constant rank n for all q.
The involutive closure of a set of vector fields is in general called the Lie algebra generated by these vector fields. In the context of control systems where the vector fields are the columns of the input matrix f(q), the Lie algebra is called the control Lie algebra. For systems with drift terms, the above full rank condition is only sufficient for local accessibility. For a linear time-invariant system, this condition simply reduces to the usual controllability rank condition. This theorem is nonconstructive, however. The path planning problem basically deals with finding a specific control input to steer the system from a given initial condition to a given final condition, once the controllability rank condition is satisfied. Since the involutive closure of the null space of the constraints is just the control Lie algebra of the corresponding nonholonomic control system, the control system (with the holonomic constraints removed) is globally controllable as long as the constraints remain independent for all configurations. For the unicycle problem given by Equation 76.201, the control Lie algebra is in Equation 76.200 (with the z and φ coordinates removed).
Because the dimension of Δ̄ and the state-space dimension are both equal to four, the system is globally controllable. An alternate way to view Equation 76.203 is to regard it as a nonlinear mapping of the input function u to the final state q(1):
By definition, global controllability means that F(q_0, ·) is an onto mapping for every q_0. For a given u, ∇_u F(q_0, u) corresponds to the system linearized about a trajectory q = {q(t) : t ∈ [0, 1]} which is generated by u:
where A(t) = [∂f/∂q_1(q(t))u(t) : ⋯ : ∂f/∂q_n(q(t))u(t)], and B(t) = f(q(t)). Since δq(0) = 0, the solution to this equation is

δq(1) = ∫_0^1 Φ(1, s) B(s) δu(s) ds

where Φ is the state transition matrix of the linearized system. It follows that
Controllability of the system in Equation 76.207 implies that for any final state δq(1) ∈ R^n, a control δu exists which drives the linear system from δq(0) = 0 to δq(1). This is equivalent to the operator ∇_u F being onto (equivalently, the null space of the adjoint operator, [∇_u F]*, being zero). In the case that u = 0, ∇_u F reduces to the linear time-invariant system Equation 76.204. In this case, ∇_u F cannot be of full rank because the linearized system is not controllable.
Path Planning Algorithms
Steering with Cyclic Input
In Equation 76.203, because f(q) is full rank for all q, there is a coordinate transformation so that f(q) becomes
Then, F(q_0, u) = φ_u(1; q_0). In general, an analytic expression for F is impossible to obtain.
[ I_m ]
[ f̂(q) ]

In other words, the
inputs are simply the velocities of m configuration variables. For example, in the unicycle problem described by Equation 76.201, u_1 and u_2 are equal to φ̇ and θ̇. The subspace corresponding to these variables is called the base space (also called the shape space). A cyclic motion in the base space returns the base variables to their starting point, but the configuration variables would have a net change (called the geometric phase) as shown in Figure 76.13. In the unicycle case, cyclic motions in θ and φ result in the following net changes in the x and y coordinates:
Given q_0 and u, denote the solution of Equation 76.203 by q(t) = φ_u(t; q_0). (76.208)
x(T) − x(0) = ∫_0^T cos θ φ̇ dt = ∮ cos θ dφ, (76.209)
y(T) − y(0) = ∫_0^T sin θ φ̇ dt = ∮ sin θ dφ. (76.210)
By Green's theorem, they can be written as surface integrals

x(T) − x(0) = ∬_S −sin θ dθ dφ, and y(T) − y(0) = ∬_S cos θ dθ dφ, (76.211)

where S is the surface enclosed by the closed contour in the (φ, θ) space.
A general strategy for path planning would then consist of two steps: first drive the base variables to the desired final location, then appropriately choose a closed contour in the base space to achieve the desired change in the configuration variables without affecting the base variables. This idea has served as the basis of many path planning algorithms.
Δy = ∫_0^1 a_1 cos(4πt) sin((a_2/2π) sin(2πt)) dt.
Using the Fourier series expansion for even functions,

cos((a_2/2π) sin(2πt)) = Σ_{k=0}^∞ α_k cos(2πkt),
sin((a_2/2π) sin(2πt)) = Σ_{k=0}^∞ β_k cos(2πkt).
After the integration, we obtain
Figure 76.13 Geometric phase.
To illustrate this procedure for path planning for the unicycle example, assume that the base variables, φ and θ, have reached their target values. We choose them to be sinusoids with integral frequencies, so that at t = 1 they return to their initial values:

u_1 = a_1 cos(4πt), and u_2 = a_2 cos(2πt). (76.212)
Because α_2 and β_2 depend on a_2, given the desired motion in x and y, Equation 76.216 results in a one-dimensional line search for a_2:

Δx/α_2(a_2) − Δy/β_2(a_2) = 0. (76.217)

Once a_2 is found (there may be multiple solutions), a_1 can be found from Equation 76.216. The above procedure of using sinusoidal inputs for path planning can be generalized to systems in the following canonical form (written for systems with two inputs), called the chain form:
By direct integration,

φ = (a_1/4π) sin(4πt), and θ = (a_2/2π) sin(2πt). (76.213)
For several values of a_1 and a_2, the closed contours in the φ–θ plane given by Equation 76.213 are as shown in Figure 76.14. The net changes in x and y over the period [0, 1] are given by the surface integrals Equation 76.211 over the area enclosed by the contours. To achieve the desired values for x and y, the two equations can be numerically solved for a_1 and a_2.
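The geometric phase relation underlying this solution can be checked numerically. The sketch below is illustrative (all names are assumptions; midpoint-rule quadrature): it simulates the unicycle under the cyclic inputs and compares the net x motion with the direct quadrature of Equation 76.214, while θ returns to its initial value after the cycle.

```python
import math

def simulate(a1, a2, n=100000):
    """Euler-integrate x' = u1 cos(theta), y' = u1 sin(theta), theta' = u2
    with the cyclic inputs u1 = a1 cos(4 pi t), u2 = a2 cos(2 pi t)."""
    dt = 1.0 / n
    x = y = th = 0.0
    for k in range(n):
        t = (k + 0.5) * dt                 # midpoint of the step
        u1 = a1 * math.cos(4 * math.pi * t)
        u2 = a2 * math.cos(2 * math.pi * t)
        x += dt * u1 * math.cos(th)
        y += dt * u1 * math.sin(th)
        th += dt * u2
    return x, y, th

def dx_quadrature(a1, a2, n=100000):
    """Right side of Equation 76.214 by the midpoint rule."""
    dt = 1.0 / n
    s = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        s += dt * a1 * math.cos(4 * math.pi * t) * \
             math.cos(a2 / (2 * math.pi) * math.sin(2 * math.pi * t))
    return s

x, y, th = simulate(2.0, 3.0)
# theta returns to 0 after the cycle, and the simulated net x motion
# agrees with the quadrature of Equation 76.214.
```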
q̇_1 = u_1,
q̇_2 = u_2,
q̇_3 = q_2 u_1,
⋮
q̇_n = q_{n−1} u_1. (76.218)

For example, the unicycle problem Equation 76.202 can be converted to the chain form by defining

q_1 = θ,
q_2 = c_θ x + s_θ y,
q_3 = s_θ x − c_θ y.

Then

q̇_1 = u_1,
q̇_2 = −q_3 u_1 + u_2,
q̇_3 = q_2 u_1.

By defining the right hand side of the q̇_2 equation as the new u_2, the system is now in the chain form. For a general chain system, consider the sinusoidal inputs,
Figure 76.14 Closed contour in base space.
It follows that, for i < k + 2, q_i(t) consists of sinusoids with period 1; therefore, q_i(1) = q_i(0). The net change in q_{k+2} can be computed as
This procedure can also be performed directly in the time domain. For the chosen sinusoidal inputs, the changes in x and y are

Δx = ∫_0^1 a_1 cos(4πt) cos((a_2/2π) sin(2πt)) dt, (76.214)
The parameters a and b can then be chosen so that q_{k+2} is driven to the desired value in [0, 1] without affecting all of the states preceding it. A steering algorithm will then consist of the following steps:
1. Drive q_1 and q_2 to the desired values.
2. For each q_{k+2}, k = 1, …, n − 2, drive q_{k+2} to its desired value by using the sinusoidal input Equation 76.219 with a and b determined from Equation 76.220.
Many systems can be converted to the chain form, e.g., the kinematic car, space robot, etc. In [4], a general procedure is provided to transform a given system to the chain form. There are also some systems that cannot be transformed to the chain form, e.g., a ball rolling on a flat plate.
Optimal Control
Another approach to nonholonomic path planning is optimal control. Consider the following two-input, three-state chain system (we have shown that the unicycle can be converted to this form):
The requirement on the zero final state can be used to solve for the constants in the control:

u_1(0) = −q_1(0),
u_2(0) = −q_2(0),
a = q_1(0) + √(q_1²(0) + 2π q_1(0) q_2(0) − 4π q_3(0)). (76.228)
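As a numerical check on these constants (the square-root expression, as reconstructed here, includes a −4π q_3(0) term; the code and its names are an illustrative sketch, not from the handbook), the routine below applies Equation 76.228 and integrates the three-state chain system; the state lands at the origin at t = 1 up to discretization error.

```python
import math

def optimal_steer(q0, n=100000):
    """Steer q1' = u1, q2' = u2, q3' = q2*u1 to the origin in unit time with
    u1 = -a cos(ct) + u1(0), u2 = a sin(ct) + u2(0), c = +/- 2 pi,
    using the constants of Equation 76.228."""
    q1, q2, q3 = q0
    u10, u20 = -q1, -q2
    c = 2 * math.pi
    disc = q1 * q1 + 2 * math.pi * q1 * q2 - 4 * math.pi * q3
    if disc < 0:
        # flip the frequency sign to render the square root real
        c = -2 * math.pi
        disc = q1 * q1 - 2 * math.pi * q1 * q2 + 4 * math.pi * q3
    a = q1 + math.sqrt(disc)
    dt = 1.0 / n
    for k in range(n):
        t = (k + 0.5) * dt
        u1 = -a * math.cos(c * t) + u10
        u2 = a * math.sin(c * t) + u20
        q3 += dt * q2 * u1          # uses q2 from the step start (Euler)
        q1 += dt * u1
        q2 += dt * u2
    return q1, q2, q3

qf = optimal_steer((1.0, 1.0, -0.5))
# all three states end near zero at t = 1
```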
If the expression within the square root is negative, then the constant c should be chosen as −2π to render it positive. The optimization approach described above can be generalized to certain higher order systems, but, in general, the optimal control for nonholonomic systems is more complicated. One can also try finding an optimal solution numerically; this would, in general, entail solving a two-point boundary value problem. A nonlinear programming approach based on the following Ritz approximation of the input function space has also been proposed:

u(t) = Σ_{k=1}^N a_k φ_k(t), (76.229)
The inputs u_i are to be chosen to drive q(t) from q_0 to q(1) = 0 while minimizing the input energy:
where the φ_k's are chosen to be independent orthonormal functions (such as the Fourier basis) and the a_k's are constant vectors parameterizing the input function. The minimum input energy criterion can then be combined with a final state penalty term, resulting in the following optimization criterion:
The Hamiltonian associated with this optimal control problem is

H(q, u, λ) = ½‖u‖² + λᵀ f(q)u, (76.222)

where λ is the co-state vector. From the Maximum Principle, the optimal control can be found by minimizing H with respect to u:

u_1 = −(λ_1 + λ_3 q_2), u_2 = −λ_2. (76.223)
The co-state satisfies
Differentiating the optimal control in Equation 76.223,
where c is a constant (c = −λ_3). This implies that u_1 and u_2 are sinusoids:

u_1(t) = −a cos ct + u_1(0),
u_2(t) = a sin ct + u_2(0).
The optimal a_k's can be solved numerically by using nonlinear programming. The penalty weighting γ can be iteratively increased to enforce the final state constraint. The problem is not necessarily convex. Consequently, as in any nonlinear programming problem, only local convergence can be asserted. In the next section, we will describe a similar approach without the control penalty term in J. As a result of this modification, a stronger convergence condition can be established.
Path Space Iterative Approach
As shown in the beginning of Section 76.3.3, the differential equation governing the nonholonomic motion Equation 76.203 can be written as a nonlinear operator relating an input function, u, to a path, q. By writing the final state error as
Substituting in the equation of motion Equation 76.221 and choosing c = 2π,
the path planning problem can be regarded as a nonlinear least-squares problem. Global controllability means that, for any q_0, there is at least one solution u. Many numerical algorithms exist for the solution of this problem. In general, the solution involves lifting a path connecting the initial y to the desired y = 0 to the input u space (see Figure 76.15). Let u(0) be the first guess of the input function and y(0) be the corresponding final state error as given by Equation 76.231. The goal is to modify u iteratively so that y converges to 0 asymptotically. To this end, choose a path in
the time interval to [0, 1] yields a nonsingular control which does not change y. The algorithm is therefore guaranteed to converge to any arbitrary neighborhood of the desired final configuration.
Figure 76.15 Path planning by lifting a path in output error space.
the error space connecting y(0) to 0, call it y_d(τ), where τ is the iteration variable. The derivative of y(τ) is
If ∇_u F(q_0, u) is full rank, then we can choose the following update rule for u(τ) to force y to follow y_d:
Figure 76.16 Generic loop.
This algorithm has been extended to include inequality constraints such as joint limits, collision avoidance, etc. [6], by using an exterior penalty function approach. Consider state inequality constraints given by c(q) ≤ 0
where α > 0 and [∇_u F(q_0, u)]† denotes the Moore-Penrose pseudo-inverse of ∇_u F(q_0, u). This is essentially the continuous version of Newton's method. Equation 76.233 is an initial value problem in u with a chosen u(0). With u discretized by a finite dimensional approximation (e.g., using the Fourier basis as in Equation 76.229), it can be solved numerically by an ordinary differential equation solver. As discussed in Section 76.3.3, the gradient of F, ∇_u F(q_0, u), can be computed from the system Equation 76.203 linearized about the path corresponding to u. A sufficient condition for the convergence of the iterative algorithm Equation 76.233 is that ∇_u F(q_0, u(τ)) is full rank for all τ, or equivalently, that the time-varying linearized system Equation 76.207, generated by linearizing Equation 76.203 about u(τ), is controllable. For controllable systems without drift, it has been shown in [5] that this full rank condition is true generically (i.e., for almost all u in the C^∞ topology). In the cases where ∇_u F(q_0, u) loses rank (possibly causing the algorithm to get stuck), a generic loop (see Figure 76.16) can be appended to the singular control, causing the composite control to be nonsingular and thus allowing the algorithm to continue its progress toward a solution. A generic loop can be described as follows: For some small time interval [0, T/2], generate a nonsingular control v_s (which can be randomly chosen, due to the genericity). Then let ũ be the control on [0, T] consisting of v_s on [0, T/2] and −v_s on [T/2, T]. Because nonholonomic systems have no drift term, it follows that the system makes a "loop" starting at q(1) = F(q_0, u) and ending once again at the same point. Appending ũ to u and renormalizing
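A discrete-time caricature of this path-space iteration can be sketched as follows (illustrative code, not from the handbook: a small Fourier parameterization of u, a finite-difference Jacobian, and a damped minimum-norm Newton step stand in for the continuous flow of Equation 76.233). It steers the unicycle to a pure sideways displacement.

```python
import math

def rollout(q, p, n=1000):
    """Unicycle final state under truncated-Fourier inputs
    u1 = p[0] + p[1] cos(2 pi t) + p[2] sin(2 pi t),
    u2 = p[3] + p[4] cos(2 pi t) + p[5] sin(2 pi t)."""
    dt = 1.0 / n
    x, y, th = q
    for k in range(n):
        t = (k + 0.5) * dt
        c, s = math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)
        u1 = p[0] + p[1] * c + p[2] * s
        u2 = p[3] + p[4] * c + p[5] * s
        x += dt * u1 * math.cos(th)
        y += dt * u1 * math.sin(th)
        th += dt * u2
    return [x, y, th]

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination."""
    M = [A[i][:] + [b[i]] for i in range(3)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(3):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * c for a, c in zip(M[r], M[i])]
    return [M[i][3] / M[i][i] for i in range(3)]

def plan(q0, qf, iters=60, fd=1e-4, step=0.5):
    """Damped minimum-norm Newton iteration on the final-state error."""
    p = [0.3, 0.1, 0.1, 0.3, 0.1, 0.1]          # generic (nonsingular) start
    for _ in range(iters):
        base = rollout(q0, p)
        err = [base[i] - qf[i] for i in range(3)]
        J = [[0.0] * 6 for _ in range(3)]       # finite-difference Jacobian
        for j in range(6):
            pp = p[:]
            pp[j] += fd
            fj = rollout(q0, pp)
            for i in range(3):
                J[i][j] = (fj[i] - base[i]) / fd
        JJt = [[sum(J[i][k] * J[r][k] for k in range(6)) for r in range(3)]
               for i in range(3)]
        lam = solve3(JJt, err)                  # minimum-norm correction
        for j in range(6):
            p[j] -= step * sum(J[i][j] * lam[i] for i in range(3))
    return p

p = plan([0.0, 0.0, 0.0], [0.0, 0.5, 0.0])      # a pure sideways move
```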
(76.234)
where q is the complete path in the configuration space, c(·) is a vector, and the inequality is interpreted in the componentwise sense. The state trajectory, q, can be related to the input function u through a nonlinear operator (which is typically not possible to find analytically),
The inequality constraint Equation 76.234 can then be expressed in terms of u: c(F(q_0, u)) ≤ 0.
(76.236)
Inequality constraints in optimization problems are typically handled through penalty functions. There are two types, interior and exterior penalty functions. An interior penalty function sets up barriers at the boundary of the inequality constraints. As the height of the barrier increases, the corresponding path becomes closer to being feasible. If the optimization problem, in our case the feasible path problem, can be solved for each finite barrier, then convergence to the optimal solution is assured as the barrier height tends to infinity. In the exterior penalty function approach, the ith inequality constraint is converted to an equality constraint by using an exterior penalty function,
where γ_i > 0, c_i is the ith constraint, t̄_j denotes the jth discretized time point where the constraint is checked, and g is a continuous scalar function with the property that g is equal to zero when c_i is less than or equal to zero, and is greater than zero and monotonic when c_i is greater than zero. The same iterative approach presented for the equality-only case can now be applied
to the composite constraint vector:
For a certain class of convex polyhedral constraints, the generic full rank condition for the augmented problem still holds. This approach has been successfully applied to many complex examples, such as cars with multiple trailers, subject to a variety of collision avoidance and joint limits constraints [6].
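As a small illustration of one common choice of exterior penalty (not necessarily the function used in [6]; the quadratic form and weight below are assumptions of this sketch), g is zero on the feasible side and grows monotonically with the violation:

```python
def exterior_penalty(c_i, gamma_i=10.0):
    """Quadratic exterior penalty for one inequality constraint c_i <= 0:
    zero when satisfied, positive and monotonically increasing when violated."""
    return gamma_i * max(0.0, c_i) ** 2

# Evaluated at discretized time points and stacked into an equality constraint:
violations = [exterior_penalty(c) for c in [-0.5, 0.0, 0.3, 1.0]]
```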
76.3.4 Stabilization
State Stabilization
Stabilizability means the existence of a feedback controller that will render the closed-loop system asymptotically stable about an equilibrium point. For linear systems, controllability implies stabilizability. It would be of great value if this were true for special classes of nonlinear systems such as the nonholonomic systems considered in this article (where controllability can be checked through a rank condition on the control Lie algebra). It was shown in [7] that this assertion is not true in general. For a general nonlinear system q̇ = f_0(q, u), with equilibrium at q_0, f_0(q_0, 0) = 0 and f_0(·, ·) continuous in a neighborhood of (q_0, 0), a necessary condition for the existence of a continuous time-invariant control law that renders (q_0, 0) asymptotically stable is that f_0 maps any neighborhood of (q_0, 0) to a neighborhood of 0. For a nonholonomic system described by Equation 76.203, f_0(q, u) = f(q)u. Then the range of {f_0(q, u) : (q, u) in a neighborhood of (q_0, 0)} is equal to the span of the columns of f(q), which is of dimension m (the number of inputs). Because a neighborhood about the zero state is n dimensional, the necessary condition above is not satisfied unless m ≥ n. There are two approaches to deal with the lack of a continuous time-invariant stabilizing feedback. The first is to relax the continuity requirement to allow piecewise smooth control laws; the second is to relax the time invariance requirement and allow a time-varying feedback. In either approach, an obvious starting point is to begin with an initial feasible path obtained by using any one of the open-loop methods discussed in Section 76.3.3 and then to apply a feedback to stabilize the system around the path.
Given an initial feasible path, if the nonlinear kinematic model linearized about the path is controllable, almost always true as mentioned in Section 76.3.3, a time-varying stabilizing controller can be constructed by using standard techniques. The resulting system will then be locally asymptotically stable. Consider the unicycle problem Equation 76.202 as an example. Suppose an open-loop trajectory, {x*(t), y*(t), θ*(t), t ∈ [0, 1]}, and the corresponding input, {u_1*(t), u_2*(t), t ∈ [0, 1]}, are already generated by using any of the methods discussed in Section 76.3.3. The system equation can be linearized about this path:
δẋ = −u_1* sin θ* δθ + cos θ* δu_1,
δẏ = u_1* cos θ* δθ + sin θ* δu_1,
δθ̇ = δu_2. (76.239)
This is a linear time-varying system. It can be easily verified that as long as u_1* is not identically zero, the system is controllable. One can then construct a time-varying stabilizing feedback to keep the system on the planned open-loop path. Stabilizing control laws can also be directly constructed without first finding a feasible open-loop path. In [8], it was shown that all nonholonomic systems can be feedback stabilized with a smooth periodic controller. For specific classes of systems, such as mobile robots in [9], or, more generally, the so-called power systems as in [10], explicit constructive procedures for such controllers have been demonstrated. We will again use the unicycle example to illustrate the basic idea of constructing a time-varying stabilizing feedback by using a time-dependent coordinate transformation so that the equation of motion contains a time-varying drift term. Define a new variable z by

z = θ + k(t, x, y) (76.240)
where k is a function that will be specified later. Differentiating
Consider a quadratic Lyapunov function candidate V = ½(x² + y² + z²). The derivative along the solution trajectory is
By choosing

u_1 = −a_1 (x cos θ + y sin θ) and ż = −a_2 z,

which means

u_2 = −∂k/∂t − (∂k/∂x cos θ + ∂k/∂y sin θ) u_1 − a_2 z,
we obtain a negative semidefinite V̇ = −a_1 (x cos θ + y sin θ)² − a_2 z². This implies that, as t → ∞, z → 0 and x cos θ + y sin θ → 0. Substituting in the definition of z, we get θ(t) → −k(t, x(t), y(t)). From the other asymptotic condition, θ(t) also converges to −tan⁻¹(x/y). As ẋ and ẏ asymptotically vanish, x(t) and y(t), and therefore θ(t), tend to constants. Equating the two asymptotic expressions for θ(t), we conclude that k(t, x(t), y(t)) converges to a constant. By suitably choosing k(t, x, y), e.g., k(t, x, y) = (x² + y²) sin(t), the only condition under which k(t, x(t), y(t)) can converge to a constant is that x² + y² converges to zero, which in turn implies that q(t) → 0. In contrast to the indirect approach (i.e., using a linear time-varying control law to stabilize a system about a planned open-loop path), this control law is globally stabilizing.
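The time-varying feedback can be checked in simulation. The sketch below is illustrative (names and gains are assumptions): it implements the choices u_1 = −a_1(x cos θ + y sin θ) and a u_2 that cancels dk/dt while adding −a_2 z, for k(t, x, y) = (x² + y²) sin(t), and monitors the Lyapunov function V = ½(x² + y² + z²), which is nonincreasing along trajectories.

```python
import math

def simulate_tv_feedback(x, y, th, a1=1.0, a2=1.0, dt=1e-3, T=30.0):
    """Euler simulation of the unicycle under the time-varying feedback
    built from z = theta + k(t, x, y), k(t, x, y) = (x^2 + y^2) sin(t)."""
    V0 = 0.5 * (x * x + y * y + th * th)     # k(0, x, y) = 0, so z(0) = theta
    n = round(T / dt)
    for i in range(n):
        t = i * dt
        k = (x * x + y * y) * math.sin(t)
        z = th + k
        u1 = -a1 * (x * math.cos(th) + y * math.sin(th))
        kt = (x * x + y * y) * math.cos(t)   # partial derivatives of k
        kx = 2 * x * math.sin(t)
        ky = 2 * y * math.sin(t)
        u2 = -kt - (kx * math.cos(th) + ky * math.sin(th)) * u1 - a2 * z
        x += dt * u1 * math.cos(th)
        y += dt * u1 * math.sin(th)
        th += dt * u2
    k = (x * x + y * y) * math.sin(T)
    V = 0.5 * (x * x + y * y + (th + k) ** 2)
    return x, y, th, V0, V

x, y, th, V0, V = simulate_tv_feedback(1.0, 1.0, 0.5)
# V decreases along the closed-loop trajectory
```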
Output Stabilization In certain cases, it may only be necessary to control the state to a certain manifold rather than to a particular configuration.
For example, in the case of a robot manipulator on a free-floating mobile base, it may only be necessary to control the tip of the manipulator so that it can perform useful tasks. In this case, a smooth output stabilizing controller can frequently be found. Suppose the output of interest is
As discussed in Section 76.3.1, the nonholonomic nature of the problem follows from the conservation of the angular momentum Equation 76.187:
Eliminating ω, and p < n. At a particular configuration, q,
Define K(q) = ∇_q g(q) f(q). If K(q) is onto, i.e., p ≤ m and K(q) is full rank, then the system is locally output controllable (there is a u that can move y arbitrarily within a small enough ball) though it is not locally state controllable. The output stabilization problem involves finding a feedback controller u (possibly dependent on the full state) to drive y to a set point, y_d. Provided that K(q) is of full row rank, an output stabilizing controller can be easily found:
Therefore, y is governed by
Under the full row rank assumption on K(q), K(q)Kᵀ(q) is positive definite, which implies that y converges to y_d asymptotically. In general, either y converges to y_d, or q converges to a singular configuration of K(q) (where K(q) loses row rank) and (y − y_d) converges to the null space of Kᵀ(q). We will again use the unicycle problem as an illustration. Suppose the output of interest is (x, θ) and the goal is to drive (x, θ) to (x_d, θ_d), where θ_d is not a multiple of π/2. By choosing the control law,
The closed-loop system for the output is
The closed-loop system contains a singularity at θ = π/2, but, if θ_d ≠ π/2, this singularity will not be attractive. The output stabilization of (x, θ) can be concatenated with other output stabilizing controllers, with other choices of outputs, to obtain full state stabilization. For example, once x is driven to zero, θ can be independently driven to zero (with u_1 = 0), and, finally, y can be driven to zero without affecting x and θ. These stages can be combined together as a piecewise smooth state stabilizing feedback controller. Consider a space robot on a platform as another example. Suppose the output of interest is the end effector coordinate, y. The singular configurations in this case are called the dynamic singularities. The output velocity is related to the joint velocity and center of mass angular velocity by the kinematic Jacobians,
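For the unicycle output y = (x, θ) one has K(q) = [[cos θ, 0], [0, 1]], and one such choice is the gradient-type law u = Kᵀ(q)(y_d − y), which drives the output to its target away from θ = π/2 while y stays bounded. A minimal simulation sketch (illustrative names and gains, Euler integration):

```python
import math

def output_stabilize(x, y, th, xd, thd, dt=1e-3, T=40.0):
    """Drive the unicycle output (x, theta) to (xd, thd) with
    u = K^T(q) (yd - y), where K(q) = [[cos th, 0], [0, 1]].
    Valid away from the singularity th = pi/2."""
    n = round(T / dt)
    for _ in range(n):
        ex, eth = xd - x, thd - th
        u1 = math.cos(th) * ex          # first component of K^T e
        u2 = eth                        # second component of K^T e
        x += dt * u1 * math.cos(th)
        y += dt * u1 * math.sin(th)
        th += dt * u2
    return x, y, th

x, y, th = output_stabilize(1.0, 0.5, 1.2, xd=0.0, thd=0.5)
# (x, theta) converge to (0.0, 0.5); y settles to a constant value
```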
The effective Jacobian, K(q) = J(q) − J_ω M_ω^{−1} M_{ωq}(q), sometimes called the dynamic Jacobian, now contains inertia parameters (hence the modifier "dynamic"), in contrast to a terrestrial robot Jacobian, which depends only on the kinematic parameters. If the dimension of q is at least as large as the dimension of y, the output can be effectively controlled provided that the dynamic Jacobian does not lose rank (i.e., q is away from the dynamic singularities).
References
[1] Oriolo, G. and Nakamura, Y., Control of mechanical systems with second-order nonholonomic constraints: Underactuated manipulators, Proc. 30th IEEE Conference on Decision and Control, 2398-2403, Brighton, England, 1991.
[2] Li, Z. and Canny, J.F., Eds., Nonholonomic Motion Planning, Kluwer Academic, Boston, MA, 1993.
[3] Sastry, S.S., Murray, R.M., and Li, Z., A Mathematical Introduction to Robotic Manipulation, CRC Press, Boca Raton, FL, 1993.
[4] Murray, R.M. and Sastry, S.S., Nonholonomic motion planning: steering using sinusoids, IEEE Trans. Automat. Control, 38, 700-716, 1993.
[5] Lin, Y. and Sontag, E.D., Universal formula for stabilization with bounded controls, Syst. Control Lett., 16(6), 393-397, 1991.
[6] Divelbiss, A. and Wen, J.T., Nonholonomic motion planning with inequality constraints, Proc. IEEE Int. Conf. Robotics Automat., San Diego, CA, 1994.
[7] Brockett, R.W., Asymptotic stability and feedback stabilization, in Differential Geometric Control Theory, Brockett, R.W., Millman, R.S., and Sussmann, H.J., Eds., Birkhauser, 1983, vol. 27, 181-208.
[8] Coron, J.-M., Global asymptotic stabilization for controllable systems without drift, Math. Control, Signals, Syst., 5(3), 1992.
[9] Samson, C. and Ait-Abderrahim, K., Feedback control of a nonholonomic wheeled cart in Cartesian space, in Proc. IEEE Robotics Automat. Conf., Sacramento, CA, 1991.
[10] Teel, A., Murray, R., and Walsh, G., Nonholonomic control systems: From steering to stabilization with sinusoids, Proc. 31st IEEE Conf. Dec. Control, Tucson, AZ, 1992.
77
Miscellaneous Mechanical Control Systems

Brian Armstrong, Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI
Carlos Canudas de Wit, Laboratoire d'Automatique de Grenoble, ENSIEG, Grenoble, France
Jacob Tal, Galil Motion Control, Inc.
Thomas R. Kurfess, The George W. Woodruff School of Mechanical Engineering, The Georgia Institute of Technology, Atlanta, GA
Hodge Jenkins, The George W. Woodruff School of Mechanical Engineering, The Georgia Institute of Technology, Atlanta, GA
Maarten Steinbuch, Philips Research Laboratories, Eindhoven, The Netherlands
Gerrit Schootstra, Philips Research Laboratories, Eindhoven, The Netherlands
Okko H. Bosgra, Mechanical Engineering Systems and Control Group, Delft University of Technology, Delft, The Netherlands

77.1 Friction Modeling and Compensation .............................. 1369
    Introduction ' Friction Modeling ' Simulation ' Off-Line Friction Parameter Identification ' Friction Compensation ' Conclusion
    References .................................................................. 1382
77.2 Motion Control Systems ............................................ 1382
    Introduction ' System Elements and Operation ' Stability Analysis Example ' Motion Profiling ' Tools for Motion Coordination ' Design Example - Glue Dispensing
    References .................................................................... 1386
77.3 Ultra-High Precision Control ...................................... 1386
    Introduction ' System Description ' System Identification and Modeling ' Control Design Example: Diamond turning machine ' Conclusions ' Defining Terms
    References ..................................................................... 1404
77.4 Robust Control of a Compact Disc Mechanism ...................... 1405
    Introduction ' Compact Disc Mechanism Modeling ' Performance Specification ' μ-synthesis for the CD Player ' Implementation Results ' Conclusions
    References .................................................................... 1411
77.1 Friction Modeling and Compensation
Brian Armstrong, Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI
Carlos Canudas de Wit, Laboratoire d'Automatique de Grenoble, ENSIEG, Grenoble, France
77.1.1 Introduction
The successful implementation of friction compensation brings together aspects of servo control theory, tribology (the science of friction), machine design, and lubrication engineering. The design of friction compensation is thus intrinsically an interdisciplinary challenge. Surveys of contributions from the diverse fields that are important for friction modeling, motion analysis, and compensation are presented in [2], [10].
0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
The challenge to good control posed by friction is often thought of as being stick-slip, which is an alternation between sliding and sticking due to static friction. Stick-slip is most common when integral control is used and can prevent a machine from ever reaching its intended goal position. However, other forms of frictional disturbance can be of equal or greater importance. The tracking error introduced by friction into multi-axis motion is an example. This error, called quadrature glitch, is illustrated in Figure 77.1. The two-axis machine fails to accurately track the desired circular contour because as one axis goes through zero velocity, it is arrested for a moment by static friction while the other axis continues to move.

Figure 77.1 An example of quadrature glitch in 2-axis motion: XY position trace showing quadrature glitch. The desired trajectory is shown with 2% radial reduction to highlight the tracking error.

Even when frictional disturbances are eliminated, friction in mechanical servos may still impact cost and performance. As outlined in Section 77.1.5, lubrication and hardware modification are two commonly used approaches to improving closed-loop performance. These approaches are grouped under problem avoidance, and a survey of engineers in industry [2] suggests that they are the most commonly used techniques for eliminating frictional disturbances. But these techniques have a cost that may be hidden if more effective servo compensation of friction is not considered. For example, special materials, called friction materials (see Section 77.1.5), are often used on machine tool slideways to eliminate stick-slip. These materials have a high coulomb friction but a relatively lower static friction, and thus the machine slideway is not prone to stick-slip. But the higher coulomb friction of the slideway introduces increased power and energy requirements, which increases the initial cost for a larger drive and energy costs throughout the lifetime of the machine. More effective friction compensation by feedback control could provide higher performance at lower cost.

77.1.2 Friction Modeling

The simplest friction model has the instantaneous friction force, Ff(t), expressed as a function of instantaneous sliding velocity, v(t). Such a model may include coulomb, viscous, and/or static friction terms, which are described below. Typical examples are shown in Figure 77.2. The components of this simple model are given in Equations 77.1 to 77.3.

Glossary of Terms and Effects

Coulomb friction: A force of constant magnitude, acting in the direction opposite to motion. When v(t) ≠ 0:

    Ff(t) = -Fc sgn(v(t))    (77.1)

Viscous friction: A force proportional to velocity. When v(t) ≠ 0:

    Ff(t) = -Fv v(t)    (77.2)

Static friction: Is not truly a force of friction, as it is neither dissipative nor a consequence of sliding. Static friction is a force of constraint [12]. When v(t) = 0:

    Ff(t) = -u(t),                   |u(t)| ≤ (Fc + Fs)
    Ff(t) = -(Fc + Fs) sgn(u(t)),    otherwise    (77.3)

where u(t) is the externally applied force.

Figure 77.2 Overly simplistic, but nonetheless common, friction models: (a) coulomb + viscous friction model; (b) static + coulomb + viscous friction model.

For many servo applications, the simplest friction model, instantaneous friction as a function (any function) of instantaneous sliding velocity, is inadequate to accurately predict the interaction of control parameters and frictional disturbances to motion. In addition to coulomb, viscous, and static friction, dynamic friction must be considered. The state of the art in tribology does not yet provide a friction model derived from first principles. Four friction phenomena, however, are consistently observed in lubricated machines [10]:
1. Stribeck friction or the Stribeck curve is the negatively sloped and nonlinear friction-velocity characteristic occurring at low velocities for most lubricated and some dry contacts. The Stribeck curve is illustrated in Figure 77.3. The negatively sloped portion of the curve is an important contributor to stick-slip. Because of dynamic friction effects, instantaneous friction is not simply a function of instantaneous velocity. When velocity is steady, however, a steady level of friction is observed. This is friction as a function of steady-state velocity and gives the Stribeck curve.

2. Rising static friction. The force required for breakaway (the transition from not-sliding to sliding) varies with the time spent at zero velocity (dwell time) and with the force application rate. The physics underlying rising static friction are not well understood, but experimental data and empirical models are available from the tribology literature [2], [10]. Rising static friction interacts significantly with stick-slip and is represented as a function of dwell time in the seven-parameter model below, and as a function of force application rate in the integrated dynamic friction model below.
Figure 77.3 Friction as a function of steady-state velocity, the Stribeck curve (horizontal axis: sliding velocity; Regime I, no sliding, elastic deformation, is marked). The four regimes of motion are described in section 5.1.a (from [2], courtesy of the publisher).
3. Frictional memory is the lag observed between changes in velocity or normal load and the corresponding change in friction force. It is most distinctly observed when there is partial fluid lubrication (partial fluid lubrication is a common condition in many machine elements, such as transmissions and rolling bearings; see Section 77.1.5). As a result of frictional memory, instantaneous friction is a function of the history of sliding velocity and load as well as of instantaneous velocity and load. An example of measured friction plotted against velocity is seen in Figure 77.4. The solid line indicates the friction as a function of steady-state sliding velocity and shows a single value for friction as a function of velocity. The solid line may be compared with Figure 77.3. The line marked ooo shows friction during intermittent sliding (stick-slip). The friction force is a multi-valued function of velocity: during acceleration the friction force is higher, while during deceleration the friction force is lower. The hysteresis seen in Figure 77.4 indicates the presence of frictional memory.

4. Presliding displacement is the displacement of rolling or sliding contacts prior to true sliding. It arises due to elastic and/or plastic deformation of the contacting asperities and is seen in Figure 77.5. Because metal components make contact at small, isolated points called asperities, the tangential compliance of a friction contact (e.g., gear teeth in a transmission) may be substantially greater than the compliance of the bulk material. Presliding displacement in machine elements has been observed to be on the order of 1 to 5 micrometers. Because a small displacement in a bearing or between gear teeth may be amplified by a mechanism (for example, by a robot arm [1]), presliding displacement may give rise to significant output motions.
Figure 77.4 Measured friction versus sliding velocity: solid line, quasi-steady (equilibrium) sliding; ooo, intermittent sliding. (From Ibrahim, R.A. and Soom, A., Eds., Friction-Induced Vibration, Chatter, Squeal, and Chaos, Proc. ASME Winter Annu. Meet., DE-Vol. 49, 139, 1992. With permission.)
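The hysteresis loop of Figure 77.4 can be illustrated schematically by evaluating a single-valued steady-sliding curve on a time-lagged velocity. This is only a sketch: the pure-lag form, the Stribeck-shaped curve, and every numerical value below are assumptions made for illustration, not a model from the text.

```python
import math

# Schematic illustration of frictional memory as a pure time lag applied
# to the velocity seen by a single-valued friction-velocity curve.
# All parameter values are invented for the example.
Fc, Fs, vs = 1.0, 0.5, 0.1     # coulomb level, Stribeck excess, knee velocity
tau_L, dt = 0.02, 0.001        # frictional memory lag and sample time [s]

def g(v):
    """Single-valued steady-sliding friction magnitude (Stribeck shape)."""
    return Fc + Fs * math.exp(-(v / vs) ** 2)

delay = round(tau_L / dt)
# Oscillating (always positive) velocity, two full 1 s cycles:
v_trace = [0.1 + 0.05 * math.sin(2 * math.pi * k * dt) for k in range(2001)]

def lagged_friction(k):
    # friction responds to the velocity tau_L seconds ago
    return g(v_trace[max(0, k - delay)])

# The velocity passes through 0.1 while accelerating (k = 1000) and while
# decelerating (k = 500); the lag yields two different friction values at
# the same instantaneous velocity, i.e., the loop seen in Figure 77.4.
print(lagged_friction(1000) != lagged_friction(500))   # True
```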
Figure 77.5 Force (solid line) and velocity (dashed line) versus displacement during the transition from static to sliding friction (breakaway). The spring-like behavior of the friction contact is seen as the linear force-displacement curve over the first 15 μm of motion. (Adapted from [12].)

Friction models capturing these phenomena are described in the model subsections below. In broad terms, these four dynamic friction phenomena can be linked to behaviors observed in servo mechanisms [2]. These connections are presented in Table 77.1.
Additional Terms

Temporal friction phenomena: This term connotes both rising static friction and frictional memory.

Stick-slip: In the literature, stick-slip is used to refer to a broad range of frictional phenomena. Here, it is used to refer to a stable limit cycle arising during motion.

Hunting: With feedback control, a type of frictional limit cycle is possible that is not possible with passive systems: a limit cycle that arises while the net motion of the system is zero. This is called hunting and is most often associated with integral control.

Standstill (or lost motion): Even where there is no stable limit cycle, the frictional disturbance to a servo system may be important. Standstill or lost motion refers to the frictional disturbance that arises when a system goes through zero velocity; it is seen in Figure 77.1. An ideal system would show continuous acceleration, but the system with friction may be arrested at zero velocity for a period of time. In machine tools, this is the frictional disturbance of greatest economic importance.

TABLE 77.1 Observable Consequences of Dynamic Friction Phenomena

Dynamic Friction Phenomenon | Predicted/Observed Behavior
Stribeck friction | Needed to correctly predict initial conditions and system parameters leading to stick-slip
Rising static friction | Needed to correctly predict the interaction of velocity with the presence and amplitude of stick-slip
Frictional memory | Needed to correctly predict the interaction of stiffness (mechanical or feedback) with the presence of stick-slip
Presliding displacement | Needed to correctly predict motion during stick (e.g., during velocity reversal or before breakaway)
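The simple model of Equations 77.1 to 77.3 can be combined into a single velocity- and force-dependent friction law. The following minimal sketch is not from the text; the function name and the parameter values are invented for the example.

```python
import math

# Minimal sketch combining Equations 77.1 to 77.3 into one friction law.
# The function name and parameter values (Fc, Fs, Fv) are illustrative.
def friction_force(v, u, Fc=1.0, Fs=0.3, Fv=0.05):
    """Friction force for sliding velocity v and externally applied force u."""
    if v != 0.0:
        # Sliding: coulomb (Equation 77.1) plus viscous (Equation 77.2)
        return -Fc * math.copysign(1.0, v) - Fv * v
    # Sticking: friction is a force of constraint opposing the applied
    # force, up to the breakaway level Fc + Fs (Equation 77.3)
    if abs(u) <= Fc + Fs:
        return -u
    return -(Fc + Fs) * math.copysign(1.0, u)

print(friction_force(0.0, 0.5))   # -0.5: at rest, friction cancels the force
print(friction_force(1.0, 0.0))   # sliding: coulomb + viscous oppose motion
```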
The Seven-Parameter Friction Model The seven-parameter friction model [2] captures the four detailed friction phenomena, as well as coulomb, viscous, and static friction. The model has the advantage that the friction phenomena are explicitly represented with physically motivated model parameters. The model is presented in Equations 77.4 to 77.6, and the parameters are described in Table 77.2. The liability of the seven-parameter model is that it is not, in fact, a single integrated model, but a combination of three models: Equation 77.4 reflects the tangential force of constraint during stick; Equation 77.5 reflects frictional force during sliding; and Equation 77.6 represents a sampled process that reflects rising static friction.
Not-sliding (presliding displacement):

    Ff(t) = -Kt x(t)    (77.4)

where x(t) is the presliding displacement.

Sliding (coulomb + viscous + Stribeck curve friction with frictional memory):

    Ff(t) = -[ Fc + Fv |v(t)| + Fs(γ, t2) · 1/(1 + (v(t - τL)/vs)²) ] sgn(v(t))    (77.5)

Rising static friction (friction level at breakaway):

    Fs(γ, t2) = Fs,a + (Fs,∞ - Fs,a) · t2/(t2 + γ)    (77.6)

where:

    Ff(·) = instantaneous friction force
    Fc = coulomb friction force*
    Fv = viscous friction force*
    Fs = magnitude of the Stribeck friction [frictional force at breakaway is given by Ff(t = t_breakaway) = Fc + Fs]
    Fs,a = magnitude of the Stribeck friction at the end of the previous sliding period
    Fs,∞ = magnitude of the Stribeck friction after a long time at rest (or with a slow application of force)*
    Kt = tangential stiffness of the static contact*
    vs = characteristic velocity of the Stribeck friction*
    τL = time constant of frictional memory*
    γ = temporal parameter of the rising static friction*
    t2 = dwell time, time at zero velocity

(*) Marks friction model parameters; other variables are state variables.

Integrated Dynamic Friction Model

Canudas et al. [6] have proposed a friction model that is conceptually based on elasticity in the contact. They introduce the variable z(t) to represent the average state of deformation in the contact. The friction is given by

    Ff(t) = σ0 z(t) + σ1 dz/dt + Fv v(t)    (77.7)

where Ff(t) is the instantaneous friction and v(t) is the contact sliding velocity. The state variable z(t) is updated according to:

    dz/dt = v(t) - (σ0 |v(t)| / g(v(t))) z(t)    (77.8)

In steady sliding, ż(t) = 0, giving

    z(v)|ss = (g(v(t))/σ0) sgn[v(t)]    (77.9)

which then gives a steady-state friction:

    Ff|ss = g(v(t)) sgn(v(t)) + Fv v(t)    (77.10)

A parameterization that has been proposed to describe the nonlinear low-velocity friction is

    g(v(t)) = Fc + Fs e^(-(v(t)/vs)²)    (77.11)

where Fc is the coulomb friction; Fs is the magnitude of the Stribeck friction (the excess of static friction over coulomb friction); and vs is the characteristic velocity of the Stribeck friction, approximately the velocity of the knee in Figure 77.3. With this description of g(v(t)), the model is characterized by six parameters: σ0, σ1, Fv, Fc, Fs, and vs. For steady sliding [v̇(t) = 0], the friction force (Figure 77.3) is given by Equation 77.10 with this choice of g(v(t)).
When velocity is not constant, the dynamics of the model give rise to frictional memory, rising static friction, and presliding displacement. The model has a number of desirable properties:

1. It captures presliding displacement, frictional memory, rising static friction, and the Stribeck curve in a single model without discontinuities.
2. The steady-state (friction-velocity) curve is captured by the function g(v(t)), which is chosen by the designer (e.g., Equation 77.11).
3. In simulation, the model is able to reproduce the data of a number of experimental investigations [6].

A difficulty associated with dynamic friction models lies with identifying the friction model parameters for a specific system. The static parameters involved in the function g(v(t)) and the viscous friction parameter Fv may be identified from experiments at constant velocity (see Section 77.1.4). The dynamic parameters σ0 and σ1 are more challenging to identify. These parameters are interrelated in their description of the physical phenomena, and their identification is made challenging by the fact that the state z(t) is not physically measurable. A procedure for identifying these parameters is outlined in Section 77.1.4.
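As a sketch of how Equations 77.7 to 77.11 fit together, the model can be integrated with a forward-Euler step. The static parameters and the Case 1 dynamic parameters of Table 77.3 are used; the imposed velocity, step size, and run length are assumptions made only for this example.

```python
import math

# Forward-Euler sketch of the integrated dynamic friction model,
# Equations 77.7 to 77.11, with illustrative run conditions.
Fc, Fs = 0.292, 0.043          # coulomb and Stribeck magnitudes [N·m]
Fv, vs = 0.0113, 0.95          # viscous coefficient, Stribeck velocity
sigma0, sigma1 = 2.0, 0.1      # Case 1 dynamic parameters (Table 77.3)

def g(v):
    """Steady-sliding friction magnitude, Equation 77.11."""
    return Fc + Fs * math.exp(-(v / vs) ** 2)

def step(z, v, dt):
    """Advance the internal deformation state z by Equation 77.8."""
    zdot = v - sigma0 * abs(v) / g(v) * z
    return z + zdot * dt, zdot

z, v, dt = 0.0, 0.5, 1e-4      # start unstressed, slide at 0.5 rad/s
for _ in range(200_000):        # 20 s of simulated steady sliding
    z, zdot = step(z, v, dt)

# Equation 77.7: instantaneous friction from the contact state
Ff = sigma0 * z + sigma1 * zdot + Fv * v

# In steady sliding z converges to g(v)/sigma0 (Equation 77.9), so the
# friction approaches the steady-state value g(v) + Fv*v (Equation 77.10).
print(abs(Ff - (g(v) + Fv * v)) < 1e-6)   # True
```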
Magnitudes of Friction Parameters

The magnitudes of the friction model parameters naturally depend upon the mechanism and lubrication, but typical values may be offered, as seen in Table 77.2 (see [1], [2], [3], [12]). The friction force magnitudes, Fc, Fs, and Fs,∞, are expressed as a function of the normal force Fn, i.e., as coefficients of friction. Δx is the deflection before breakaway resulting from contact compliance. In servo machines it is often impossible to know the magnitude of the normal force in sliding contacts. Examples of mechanisms with difficult-to-identify normal forces are motor brushes, gear teeth, and roller bearings, where the normal force is dependent on spring stiffness and wear, gear spacing, and bearing preload, respectively. For this reason, friction is often described for control design in terms of parameters with units of force, rather than as coefficients of friction.

In addition to the models described above, state variable friction models have been used to describe friction at very low velocities (μm/s). The state variable models are particularly suited to
capture nonlinear low-velocity friction and frictional memory. Velocities of micrometers per second can be important in some control applications, such as wafer stepping or the machining of nonspherical optics (see [2], [7]).

Friction that depends upon position has been observed in machines with gear-type transmissions. In general, mechanisms in which the normal force in sliding contacts varies during motion show position-dependent friction. Selected components of the Fourier transform and table lookup have been used to model the position-dependent friction. At least one study of friction compensation in an industrial robot has shown that it is important to model the position-dependent friction in order to accurately identify the detailed friction model parameters [1].

In practical machines, there are often many rubbing surfaces that contribute to the total friction: drive elements, seals, rotating electrical contacts, bearings, etc. In some mechanisms, a single interface may be the dominant contributor, as transmission elements often are. In other cases, where there are several elements contributing at a comparable level, it may be impossible to identify their individual contributions without disassembling the machine. In these cases, it is often aggregate friction that is modeled.

The control designer faces a considerable challenge with respect to friction modeling. On the one hand, parameters of a dynamic friction model are at best difficult to identify while, on the other hand, recent theoretical investigations show that the dynamics of friction play an important role in determining frictional disturbances to motion and appropriate compensation [1], [5], [6], [7], [10], [12], [14]. Often, it is necessary to use a simplified friction model. The most common model used for control incorporates only coulomb friction, Equation 77.1. For machine tools, the Karnopp model, Equations 77.15 to 77.17, has been employed to represent static friction (e.g., [4]).
And presliding displacement, Equation 77.4, has been modeled for precision control of pointing systems (e.g., [14]). Reference [1] provides an example showing how a detailed friction model can be used to achieve high-performance control, but also makes clear the technical challenges, including special sensing, associated with identifying the parameters of a detailed friction model. To date, there has been no systematic exploration of the trade-offs between model complexity and control performance.
77.1.3 Simulation

Simulation is the most widely used tool for predicting the behavior of friction compensation. The simulation of systems with friction is made challenging by the rapid change of friction as velocity goes through zero. In the simplest friction model, friction is discontinuous at zero velocity (Equation 77.1). The discontinuity poses a difficulty for the numerical integrators used in simulation. The difficulty can be addressed by "softening" the discontinuity, for example by replacing the discontinuous sgn(v(t)) with a smooth function of velocity.
TABLE 77.2 Approximate Ranges of Detailed Friction Model Parameters for Metal-on-Metal Contacts Typical of Machines

Parameter | Range | Depends Principally Upon
Fc | 0.001 - 0.1 × Fn | Lubricant viscosity, contact geometry, and loading
Fv | 0 - very large | Lubricant viscosity, contact geometry, and loading
Fs | 0 - 0.1 × Fn | Boundary lubrication
Kt | (Fs + Fc)/Δx, Δx = 1 - 50 (μm) | Material properties and surface finish
vs | 0.00001 - 0.1 (m/s) | Boundary lubrication, lubricant viscosity, material properties and surface finish, contact geometry, and loading
τL | 1 - 50 (ms) | Lubricant viscosity, contact geometry, and loading
γ | 0 - 200 (s) | Boundary lubrication
Such a softened model produces a smooth curve through zero velocity. Models that are modified in this way, however, exhibit creep: small applied forces result in low but steady velocities. Some frictional contacts exhibit creep, but metal-on-metal contacts often exhibit a minimum applied force below which there is no motion.

The discontinuous friction model is a nonphysical simplification in the sense that a mechanical contact with distributed mass and compliance cannot exhibit an instantaneous change in force. Friction may be a discontinuous function of steady-state velocity (as in Figure 77.3 and Equation 77.11), but a system going through zero velocity is a transient event. The integrated dynamic friction model (Section 77.1.2) uses an internal state (Equation 77.7) to represent the compliance of the contact and thereby avoids both the discontinuity in the applied forces and creep. The model has been used to reproduce, in simulation, dynamic friction phenomena that have been experimentally observed.

Another model that has been widely applied to simulation is the Karnopp friction model [11]. The Karnopp model solves the problems of discontinuity and creep by introducing a pseudo velocity and a finite neighborhood of zero velocity over which static friction is taken to apply. The pseudo velocity is integrated in the standard way:

    dp(t)/dt = (1/M) [Fa(t) + Ff(t)]    (77.15)

where M is the mass, p(t) is the pseudo velocity (identified with momentum in [11]), Fa(t) is the applied force, and Ff(t) is the modeled friction. A small neighborhood of zero velocity is defined by Dv, as shown in Figure 77.6. Outside this neighborhood, friction is a function of velocity; coulomb friction is specified in Equation 77.16, but any friction-velocity curve could be used:

    Ff(t) = -Fc sgn(p(t)),    |p(t)| ≥ Dv    (77.16)

Inside the neighborhood of zero velocity, friction is equal to and opposite the applied force up to the breakaway friction, and velocity is set to zero:

    v(t) = 0,  Ff(t) = -min(|Fa(t)|, Fc + Fs) sgn(Fa(t)),    |p(t)| < Dv    (77.17)

The integration step must be short enough that at least one value of velocity falls in the range |v(t)| < Dv. At this point velocity is set to zero, according to Equation 77.17. If the applied force is less than the breakaway friction, Fc + Fs, then ṗ(t) = 0, and the system remains at rest. When |Fa| > (Fc + Fs), ṗ(t) ≠ 0 and, perhaps after some time, the condition of Equation 77.17 allows motion to begin. The Karnopp model represents static friction as applying over a region of low velocities rather than at the mathematical concept of zero velocity. The model thus eliminates the need to search for the exact point where velocity crosses zero.

Figure 77.6 A friction-velocity representation of the Karnopp friction model (adapted from [11]).

While practical for simulation, and even feedback control (e.g., [4]), the Karnopp friction model is a simplification that neglects frictional memory, presliding displacement, and rising static friction. The simulation predictions are thus limited to gross motions; to accurately predict detailed motion, dynamic friction must be considered.

For any approach to simulating systems with friction, an integrator with variable time step size is important. The variable time step allows the integrator to take very short time steps near zero velocity, where friction is changing rapidly, and longer time steps elsewhere, where friction is more steady. Variable step size integration is standard in many simulation packages.
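The Karnopp model of Equations 77.15 to 77.17 can be sketched as a short simulation. All numerical values here (mass, friction levels, band width Dv, ramp rate) are illustrative assumptions, not values from the text.

```python
import math

# Sketch of the Karnopp model, Equations 77.15 to 77.17, for a unit mass
# driven by a slowly ramped force. Parameter values are illustrative only.
M, Fc, Fs, Dv = 1.0, 1.0, 0.3, 1e-3
dt = 1e-4

def simulate(t_end, ramp_rate):
    p, x, t, t_break = 0.0, 0.0, 0.0, None   # pseudo velocity, position
    while t < t_end:
        Fa = ramp_rate * t                   # applied force ramp
        if abs(p) < Dv:
            # Zero-velocity band: friction opposes the applied force up
            # to the breakaway level Fc + Fs (Equation 77.17)
            if abs(Fa) <= Fc + Fs:
                Ff, p = -Fa, 0.0
            else:
                Ff = -(Fc + Fs) * math.copysign(1.0, Fa)
        else:
            # Outside the band: coulomb friction (Equation 77.16)
            Ff = -Fc * math.copysign(1.0, p)
        p += dt * (Fa + Ff) / M              # Equation 77.15
        x += dt * p                          # p is ~0 while stuck
        if t_break is None and abs(p) >= Dv:
            t_break = t                      # first escape from the band
        t += dt
    return x, t_break

x, t_break = simulate(t_end=2.0, ramp_rate=1.0)
# Motion begins only after the ramp passes the breakaway force Fc + Fs = 1.3:
print(t_break is not None and t_break > 1.3)   # True
```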
77.1.4 Off-Line Friction Parameter Identification

At this time, it is not possible to accurately predict the static and dynamic friction model parameters based on the specifications of a mechanism. Consequently, friction compensation methods that require a model require a method to identify the model parameters. The problem is one of nonlinear parameter identification, which has a large literature, including Chapter 58. While roughly determining the coulomb and viscous friction parameters may entail only some straightforward constant force or constant velocity motions (see below), determining the dynamic friction parameters often entails acceleration sensing or specialized approaches to parameter identification.

Position-Dependent Friction

Mechanisms that are spatially homogeneous, such as direct and belt-driven mechanisms, should not show a substantial position-dependent friction. But mechanisms with spatial inhomogeneities, such as gear drives, can show a large change in friction from one point to another, as seen in Figure 77.7.

Figure 77.7 Position-dependent friction observed in joint 2 of a PUMA robot arm. The signal with a spatial period of approximately 0.7 radians corresponds to one rotation of the intermediate gear. One cycle of the predominant high-frequency signal corresponds to one rotation of the motor pinion gear. The 3 N-m peak-to-peak position-dependent friction may be compared to the average coulomb friction level of 12 N-m. (Adapted from [1].)

The position-dependent friction can be measured by observing the breakaway friction, the level of force required to initiate motion, throughout the range of motion of the mechanism. To measure breakaway friction:

1. The mechanism is allowed to come to rest at the point where friction will be measured.
2. Force (torque) is applied according to a specified curve (generally a ramp).
3. The onset of motion is detected.

The applied force corresponding to the onset of motion is the breakaway friction. To achieve repeatable results, it is necessary to consider these factors:

1. In many machines, friction is observed to be higher after a period of inactivity and to decrease with the first few motions. Bringing lubricants into their steady-state condition accounts for this transient. To repeatably measure any friction parameter, the machine must be "warmed up" by typical motions prior to measurement.
2. The highest spatial frequencies present in the position-dependent friction may be quite high. Spatial sampling must be sufficient to avoid aliasing.
3. Force must be applied consistently. Because the breakaway force depends upon the dwell time and rate of force application (see Section 77.1.2), these variables must be controlled.
4. The method used to detect breakaway must be well selected. The onset of motion is not a simple matter, corresponding to the fact that motion occurs before true sliding begins (presliding displacement, see Figure 77.5). Detecting the first shaft encoder pulse after the beginning of force application, for example, is a very poor detection of sliding because the pulse can arise with presliding displacement at any point on the force application curve. Tests based on velocity or on total motion from the initial position have been used.

Gear drives can exhibit significant friction variation with motions corresponding to a tenth of the highest gear pitch. For example, in a mechanism with 15:1 reduction and 18 teeth on the motor pinion gear, 15 × 18 × 10 = 2700 cycles of position-dependent friction occur per revolution of the mechanism. Capturing these variations during motion poses a significant sensing bandwidth challenge. Breakaway friction is a suitable tool for identifying position dependency in the sliding friction because it can be measured at many points, corresponding to a high spatial sampling.
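The breakaway measurement procedure above can be sketched as a force-ramp experiment repeated over position. Everything in this sketch is synthetic: the position-dependent friction profile, the ramp rate, and the detection rule are invented for illustration.

```python
import math

# Sketch of breakaway-friction measurement: at each test position the
# applied force is ramped and the force at the detected onset of motion
# is recorded. All values and the friction profile are hypothetical.
def breakaway_level(theta):
    """Hypothetical breakaway friction vs. position: mean + gear ripple."""
    return 1.3 + 0.15 * math.sin(18.0 * theta)   # e.g., an 18-tooth pinion

def measure_breakaway(theta, ramp_rate=1.0, dt=1e-3):
    """Ramp the applied force; return the force when motion is detected."""
    t = 0.0
    while True:
        Fa = ramp_rate * t
        if Fa > breakaway_level(theta):          # onset-of-motion test
            return Fa
        t += dt

# Sample densely in position (1 degree steps) to avoid aliasing the ripple:
positions = [k * 2.0 * math.pi / 360 for k in range(360)]
levels = [measure_breakaway(th) for th in positions]
print(max(levels) - min(levels))   # peak-to-peak ripple, close to 0.3
```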
Friction as a Function of Steady-State Velocity

The steady-state friction-velocity curve (Figure 77.3) can be observed with a series of constant-velocity motions. The motions can be carried out with either open- or closed-loop control. If there is significant position-dependent friction, feedforward compensation of the position-dependent friction will improve the accuracy of the steady-sliding friction model. An example is shown in Figure 77.8, where the data show the nonlinear low-velocity friction. No measurements were taken in the region of the steep friction-velocity curve because stick-slip arises in the tested mechanism, even with acceleration feedback, and constant-velocity sliding at low velocities was not possible. The beginning of the negative velocity friction curve observed during constant-velocity sliding and the value of breakaway friction measured using the breakaway experiment (described above) would be sufficient to approximately identify the parameters of the Stribeck curve.
Figure 77.8 Friction as a function of velocity in joint 1 of a PUMA robot. The data points indicate the average level of friction observed in constant-velocity motions using a stiff velocity feedback. The solid curve is given by the seven-parameter friction model, Equation 77.5. (Adapted from [1].)
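Fitting the steady-sliding curve to constant-velocity measurements can be sketched as follows. For a fixed vs, the curve F(v) = Fc + Fv·v + Fs·exp(-(v/vs)²) is linear in (Fc, Fv, Fs), so linear least squares on a grid of vs values suffices. The "data" here are synthetic, generated from assumed parameter values; nothing below is a measured result from the text.

```python
import math

# Sketch: fit the steady-sliding model to synthetic constant-velocity data
# by linear least squares over a grid of candidate vs values.
TRUE = dict(Fc=0.292, Fv=0.0113, Fs=0.043, vs=0.95)   # assumed, for the demo

def model(v, Fc, Fv, Fs, vs):
    return Fc + Fv * v + Fs * math.exp(-(v / vs) ** 2)

vels = [0.1 * k for k in range(1, 40)]           # constant-velocity tests
data = [model(v, **TRUE) for v in vels]          # noise-free "measurements"

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[piv] = m[piv], m[i]
        for r in range(3):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [a - f * c for a, c in zip(m[r], m[i])]
    return [m[i][3] / m[i][i] for i in range(3)]

best = None
for vs in [0.05 * k for k in range(1, 60)]:      # grid over vs
    X = [[1.0, v, math.exp(-(v / vs) ** 2)] for v in vels]
    # normal equations X^T X theta = X^T y for theta = (Fc, Fv, Fs)
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    b = [sum(x[i] * y for x, y in zip(X, data)) for i in range(3)]
    Fc, Fv, Fs = solve3(A, b)
    sse = sum((model(v, Fc, Fv, Fs, vs) - y) ** 2 for v, y in zip(vels, data))
    if best is None or sse < best[0]:
        best = (sse, Fc, Fv, Fs, vs)

_, Fc, Fv, Fs, vs = best
print(round(vs, 2), round(Fc, 3))   # recovers vs near 0.95, Fc near 0.292
```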
Dynamic Friction Model Parameters

The dynamic friction phenomena (Stribeck friction, frictional lag, rising static friction, and presliding displacement) are difficult to observe directly. They operate over regions of very low velocity, in which mechanism motion may be unstable, or over short time intervals or small distances. Direct observation of the phenomena and measurement of the parameters is possible with acceleration sensing and careful elimination of sources of measurement error [1], [12]. In spite of the fact that sensitive sensing is required to directly observe the dynamic friction phenomena, their impact on motion may be substantial, motivating both their correct modeling during the design of compensation and providing a basis for identification from observable system motions. Figure 77.9 shows the simulated friction, as well as position and velocity curves, for a motor servo and two different sets of friction parameters. The integrated dynamic friction model (Section 77.1.2) was used for the simulation, with the parameters specified in Table 77.3.
TABLE 77.3 Parameters Used for the Simulations of Figure 77.9

Fc = 0.292 (N·m)    Fs = 0.043 (N·m)
Fv = 0.0113 (N·m·s/rad)    vs = 0.95 (rad/s)
Case 1: σ0 = 2.0, σ1 = 0.1
Case 2: σ0 = 40.0, σ1 = 2.0

Figure 77.9 Friction and motion dependence on σ0 and σ1.

The two friction models are seen to give distinct motions, producing differences that can be used to identify the friction
model parameters. The parameter identification is a nonlinear optimization problem; standard engineering software packages provide suitable routines. The low-velocity portion of the friction-velocity curve, as well as the frictional memory, presliding displacement, and rising static friction parameters, were identified in an industrial robot using open-loop motions and variable compliance. See [1] for the details of the experiment and identification.

In spite of progress in the area of modeling and demonstrations of parameter identification, there is not, at this time, a generally applicable or thoroughly tested method for identifying the parameters of a dynamic friction model. None of the methods reported to date has been evaluated in more than one application. Friction modeling thus remains, to a large extent, an application-specific exercise in which the designer identifies friction phenomena important for achieving specific design goals. Tribology and the models presented in Section 77.1.2 offer a guide to friction phenomena that will be present in a lubricated mechanism. Because of the challenges, sophisticated and successful friction compensation has been achieved in many cases without a dynamic friction model, but at the cost of entirely empirical development and tuning. One can say only that more research is needed in this area.
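A crude version of identifying σ0 and σ1 by response matching can be sketched as follows: simulate the friction record for candidate parameter pairs and keep the pair that best matches a recorded trace. The imposed velocity trajectory, the noise-free "measured" record, and the candidate grid are all assumptions for illustration; a practical identification would use measured motions and a proper nonlinear optimizer.

```python
import math

# Sketch of identifying sigma0, sigma1 by matching simulated friction
# records (Equations 77.7 and 77.8); all run conditions are illustrative.
Fc, Fs, Fv, vs = 0.292, 0.043, 0.0113, 0.95

def g(v):
    return Fc + Fs * math.exp(-(v / vs) ** 2)

def friction_trace(sigma0, sigma1, dt=1e-3, n=4000):
    """Friction record for an imposed sinusoidal velocity trajectory."""
    z, out = 0.0, []
    for k in range(n):
        v = 0.5 * math.sin(2 * math.pi * 0.5 * k * dt)
        zdot = v - sigma0 * abs(v) / g(v) * z     # Equation 77.8
        z += zdot * dt
        out.append(sigma0 * z + sigma1 * zdot + Fv * v)   # Equation 77.7
    return out

measured = friction_trace(40.0, 2.0)    # "data": Case 2 of Table 77.3

best = min(
    (sum((a - b) ** 2 for a, b in zip(friction_trace(s0, s1), measured)), s0, s1)
    for s0 in (2.0, 10.0, 40.0, 80.0)
    for s1 in (0.1, 0.5, 2.0, 5.0)
)
print(best[1], best[2])   # 40.0 2.0
```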
77.1.5 Friction Compensation

Friction compensation techniques are broken down into three categories: problem avoidance, nonmodel-based compensation techniques, and model-based compensation techniques. Problem avoidance refers to modifications to the system or its lubrication that reduce frictional disturbances. These changes often involve machine design or lubrication engineering and may not seem to be the domain of the control engineer. But system aspects that play a large role in the closed-loop performance, particularly the detailed chemistry of lubrication, may not have been adequately considered prior to the appearance of frictional disturbances to motion (imprecise control), and it may be up to the control engineer to suggest the use of friction modifiers in the lubricant. The
division of control-based compensation techniques into model-based and nonmodel-based reflects the challenge associated with developing an accurate friction model.

Problem Avoidance
Lubricant Modification

For reasons that relate to service life and performance, systematic lubrication is common in servo machines. The nature of sliding between lubricated metal contacts depends upon the sliding velocity and distance traveled. When a servo goes from standstill to rapid motion, the physics of friction transition from:

1. full solid-to-solid contact without sliding, motion by elastic deformation (presliding displacement); to
2. solid-to-solid contact with sliding (boundary lubrication); to
3. a mix of fluid lubrication and solid-to-solid contact (partial fluid lubrication); to
4. full fluid lubrication (oil or grease supports the entire load; elastohydrodynamic or hydrodynamic lubrication, depending on contact geometry).

The physics of these processes are quite different from one another. The wide variety of physical processes and the difficulty of ascertaining which are active at any moment explain, in part, why a complete description of friction has been so elusive: the model must reflect all of these phenomena. Typical ranges for the friction coefficients in these different regimes are illustrated in Figure 77.10.

When sliding takes place at low velocities, actual shearing of solid material plays an important role in determining the friction. If the surfaces in contact are extraordinarily clean, the shearing takes place in the bulk material (e.g., steel) and the coefficient of friction is extremely high. More commonly, the sliding surfaces are coated with a thin boundary layer (typical thickness, 0.1 μm) of oxides or lubricants, and the shearing takes place in this layer. Customarily, additives are present in machine lubricants, which bind to the surface and form the boundary layer, putting it under the control of the lubrication engineer. These additives constitute a small fraction of the total lubricant and are specific to the materials to which they bind.
Friction modifiers are boundary lubricants that are specifically formulated to affect the coefficient of friction [9]. The control engineer should be aware of the possibilities, because lubrication is normally specified to maximize machine life, and friction modification is not always a priority of the lubrication engineer.

Hardware Modification
The most common hardware modifications to reduce frictional disturbances relate to increasing stiffness or reducing mass. These modifications permit higher gains and help to increase the natural frequency of the mechanism in closed loop. The greater stiffness reduces the impact of friction directly, while a higher natural frequency interacts with frictional memory to reduce or eliminate stick-slip [7]. Other hardware modifications include special low-friction bearings and the use of "friction materials," such as Rulon, in machine tool slideways to eliminate stick-slip.
Nonmodel-Based Compensation Techniques
Modifications to Integral Control
Integral control can reduce steady-state errors, including those introduced by friction in constant-velocity applications. When the system trajectory encounters a velocity reversal, however, a simple integral control term can increase rather than reduce the frictional disturbance. A number of modifications to integral control are used to reduce the impact of friction. Their application depends not only on system characteristics, but also upon the desired motions and the aspects of possible frictional disturbances that are most critical.
Position-error dead band. Perhaps the most common modification, a dead band in the input to the integrator eliminates hunting, but introduces a threshold in the precision with which a servo can be positioned. The dead band also introduces a nonlinearity, which complicates analysis. It can be modified in various ways, such as scaling the dead band by a velocity term to reduce its effect during tracking. The integrator with dead band does not reduce dynamic disturbances, such as quadrature glitch.
Lag compensation. Moving the compensator pole off the origin by the use of lag compensation with high but finite dc gain accomplishes something of the same end as a position-error dead band, without introducing a nonlinearity.
Integrator resetting. When static friction is higher than coulomb friction (a sign of ineffective boundary lubrication), the reduced friction following breakaway is overcompensated by the integral control action, and overshoot may result. In applications such as machine tools, which have little tolerance for overshoot, the error integrator can be reset when motion is detected.
Multiplying the integrator term by the sign of velocity. When there is a velocity reversal and the friction force changes direction, integral control may compound rather than compensate for coulomb friction. This behavior enlarges (but does not create) quadrature glitch.
To compensate for this effect, integral control can be multiplied by the sign of velocity, a technique that depends upon coulomb friction dominating the integrated error signal. The sign of desired or reference model velocity is often used; if measured or estimated velocity is used, the modification introduces a high-gain nonlinearity into the servo loop.
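As a minimal sketch of the most common modification above (the gains, dead band width, and names are illustrative, not from the text), a position-error dead band on the integrator of a PI controller can be written as:

```python
def pi_step(error, integral, kp=1.0, ki=0.5, dead_band=0.01, dt=0.001):
    """One update of a PI controller whose integrator ignores errors
    inside a dead band, eliminating hunting at the cost of a
    positioning threshold equal to the dead band width."""
    if abs(error) > dead_band:
        integral += error * dt
    return kp * error + ki * integral, integral

# Inside the dead band the integrator stays frozen:
u_small, i_small = pi_step(0.005, 0.0)
# Outside the dead band it accumulates as usual:
u_large, i_large = pi_step(0.1, 0.0)
```

Scaling `dead_band` by a velocity term, as mentioned above, would reduce its effect during tracking while retaining the benefit at rest.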
High Servo Gains (Stiff Position and Velocity Control)
Linearizing the Stribeck friction curve (Figure 77.3) about a point in the negatively sloped region gives a "negative viscous friction" effective during sliding at velocities in the partial fluid lubrication regime. Typically, the negative viscous friction operates over a small range of low velocities and gives a much greater destabilizing influence than can be directly compensated by velocity
THE CONTROL HANDBOOK

Figure 77.10 Typical ranges for friction in machines, corresponding to the friction process (adapted from [3]). The coefficient of friction, plotted from roughly 0.001 to 1.0, is shown against the lubrication condition: unlubricated, boundary, partial fluid, elastohydrodynamic, and hydrodynamic lubrication.
feedback. The negative viscous friction, not static friction, is often the greatest contributor to stick-slip. As always, high servo gains reduce motion errors directly. In addition, stiff position control interacts with negative viscous friction and frictional memory to create a critical stiffness above which stick-slip is eliminated [7]. In a second-order system with proportional-derivative (PD) control and frictional memory modeled as a time lag, as in Equation 77.5, a critical stiffness above which stick-slip is extinguished is given by Equation 77.18, where t_L is the time lag of frictional memory in seconds and M is the mechanism mass [2]. Qualitatively, the elimination of stick-slip at high stiffness can be understood as the converse of the destabilizing effect of transport lag in a normally stable feedback loop: the destabilizing effect of negative viscous friction operates through the delay of frictional memory, and the system is stabilized when the delay is comparable to the natural frequency of the system. Achieving the required feedback stiffness may require a very stiff mechanical system. When increasing stiffness extinguishes stick-slip, Equation 77.18 can be used to estimate the magnitude of the frictional memory.
In some cases, variable structure control with very high damping at low velocities counters the influence of nonlinear low-velocity friction and permits smooth motion where stick-slip might otherwise be observed. Machine and lubricant characteristics determine the shape of the low-velocity friction-velocity curve (see Table 77.1), which determines the required velocity feedback for smooth motion.

Learning Control
Learning control, sometimes called repetitive control, generally takes the form of a table of corrections to be added to the control signal during the execution of a specific motion. The table of corrections is developed during repetitions of the specific motion to be compensated. This type of adaptive compensation is currently available on some machine tool controllers where the typical task is repetitive and precision is at a premium. The table of corrections includes inertial as well as friction forces. This type of adaptive control is grouped with nonmodel-based control because no explicit model of friction is present in the system. For further discussion and references, see [2].

Joint Torque Control
Reductions of 30:1 in apparent friction have been reported using joint torque control [2], a sensor-based technique that encloses the actuator-transmission subsystem in a feedback loop to make it behave more nearly as an ideal torque source. Disturbances due to undesirable actuator characteristics (friction, ripple, etc.) or transmission behaviors (friction, flexibility, inhomogeneities, etc.) can be significantly reduced by sensing and high-gain feedback. The basic structure is shown in Figure 77.11; an inner torque loop functions to make the applied torque, T_a, follow the command torque, T_c.
Figure 77.11 Block diagram of a joint torque control (JTC) system, with a high-bandwidth inner loop around the torque sensor and a lower-bandwidth outer loop.
The sensor and actuator are noncollocated, separated by the compliance of the transmission and perhaps that of the sensor itself. This gives rise to the standard challenges of noncollocated sensing, including additional and possibly lightly damped modes in the servo loop.

Dither
Dither is a high-frequency signal introduced into a system to improve performance by modifying nonlinearities. Dither can be introduced either into the control signal, as is often done with the valves of hydraulic servos, or by external mechanical vibrators, as is sometimes done on large pointing systems. An important distinction arises in whether the dither force acts along a line parallel to the line of motion of the friction interface or normal to it, as shown in Figure 77.12. The effect of parallel dither is to modify the influence of friction (by averaging the nonlinearity); the effect of vibrations normal to the contact is to modify the friction itself (by reducing the friction coefficient). Dither on the control input ordinarily generates vibrations parallel to the friction interface, while dither applied by an external vibrator may be oriented in any direction.
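The averaging effect of parallel dither can be illustrated numerically (an illustrative calculation, not from the text): superimposing a zero-mean dither A sin(ωt), with A larger than the slow bias velocity, on the argument of the sgn nonlinearity reduces the average coulomb friction force opposing the bias motion.

```python
import math

def effective_coulomb_fraction(v_bias, amplitude, n=200000):
    """Average of sgn(v_bias + A*sin(wt)) over one dither period; this
    is the fraction of the full coulomb level that opposes the slow
    bias motion (analytically (2/pi)*asin(v_bias/amplitude))."""
    total = 0.0
    for i in range(n):
        t = 2.0 * math.pi * (i + 0.5) / n
        total += math.copysign(1.0, v_bias + amplitude * math.sin(t))
    return total / n

ratio = effective_coulomb_fraction(0.1, 1.0)   # well below 1.0
```

For a bias velocity one tenth of the dither amplitude, the averaged nonlinearity opposes motion with only about 6% of the full coulomb level, which is the "averaging the nonlinearity" effect described above.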
Figure 77.12 Direction and influence of dither: parallel vibration modifies the influence of friction (e.g., control input dither), while normal vibration modifies the friction itself (e.g., an external vibrator).
Working with a simple coulomb plus viscous friction model, one would not expect friction to be reduced by normal vibrations so long as contact is not broken; but because of contact compliance (the origin of presliding displacement), more sliding occurs during periods of reduced loading and less during periods of increased loading, which reduces the average friction. Reduction of the coefficient of friction to one third its undithered value has been reported.
Inverse Describing Function Techniques
The describing function is an approximate, amplitude-dependent transfer function of a nonlinear element. The inverse describing function is a synthesis procedure that can be used when an amplitude-dependent transfer function is known and the corresponding time-domain function is sought. The inverse describing function has been used to synthesize nonlinear controllers that compensate for friction [13]. Nonlinear functions f_P(e), f_I(e), and f_D(y) are introduced into the P, I, and D paths of proportional-integral-derivative (PID) control, as shown in Figure 77.13. The synthesis procedure is outlined in Figure 77.14. The method is included with nonmodel-based compensation because, while a friction model is required for the synthesis, it does not appear explicitly in the resulting controller.
Figure 77.14 Outline of describing-function-based synthesis of nonlinear friction compensation: numerical simulation generates the plant describing function; compensator gains are chosen, for each plant input amplitude, so that the combined controller/plant transfer function most closely matches the target CG*(jω_k); and an inverse describing function step yields the nonlinear time-domain functions f_P(e), f_I(e), and f_D(y).
Model-Based Compensation Techniques
When friction can be predicted with sufficient accuracy, it can be compensated by feedforward application of an equal and opposite force. The basic block diagram for such a construction is shown in Figure 77.15. Examples of model-based compensation are provided in [1],[2],[4],[5],[6],[8],[14]. Compensation is addressed in the next section; questions of tuning the friction model parameters off-line and adaptively are discussed in Sections 77.1.4 and 77.1.5, respectively.
Figure 77.13 Nonlinear PI + tachometer structure: nonlinear functions f_P(e) and f_I(e) ahead of the standard compensation blocks G_0(s) and G_1(s), with the actual friction acting on the system under control.
The inverse describing function method offers the advantages that it has a systematic design procedure and it can be applied to systems with relatively complicated linear dynamics. Significant improvement in transient response is reported for a fourth-order system with flexible modes [13].
Figure 77.15 Model-based feedforward compensation for friction. (Courtesy of the publisher.)
Compensation
Three requirements for feedforward compensation are
1. An accurate friction model
2. Adequate control bandwidth
3. Stiff coupling between the actuator and friction element
The third item merits special consideration: significant friction often arises with the actuator itself, or perhaps with a transmission, in which case the friction may be stiffly coupled to the available control-based compensation. When the friction arises with components that are flexibly coupled to the actuator, it may not be possible to cancel friction forces with actuator commands. Some of the most successful applications of model-based friction compensation are in optical pointing and tracking systems, which can be made very stiff (e.g., [14]).
The important axes of distinction in model-based compensation are
1. The character and completeness of the friction model
2. Whether sensed, estimated, or desired velocity is used as input to the friction model
The most commonly encountered friction model used for compensation comprises only a coulomb friction term, with two friction levels, one for forward and the other for reverse motion. Industrially available machine tool controllers, for example, incorporate coulomb friction feedforward. To guard against instability introduced by overcompensation, the coulomb friction parameter is often tuned to a value less than the anticipated friction level. The use of sensed or estimated velocity by the friction model closes a second feedback loop in Figure 77.15. If desired or reference model velocity is used, the friction compensation becomes a feedforward term. Because the coulomb friction model is discontinuous at zero velocity, and any friction model will show a rapid change in friction when passing through zero velocity, the use of sensed or estimated velocity can lead to stability problems. A survey of engineers in industry [2] indicates that the use of desired velocity is most common.
Additional friction model terms that have been used in compensation include:
1. Static friction
2. Frictional memory and presliding displacement
3. Position dependence
Static friction. Static friction may be represented in a model like the Karnopp model (see Section 77.1.3 and Figure 77.6). The most substantial demonstration of static friction compensation has been presented by Brandenburg and his co-workers [4] and is discussed in Section 77.1.5.
Frictional memory and presliding displacement. Examining Figure 77.5, it is seen that providing the full coulomb friction compensation for motions near the zero crossing of velocity would overcompensate friction. To prevent overcompensation, one industrial servo drive controller with coulomb friction compensation effectively implements the inverse of the Karnopp friction model (Equation 77.19), where t_zc is the time since the last zero crossing of velocity, and a parameter determines the interval over which the coulomb friction compensation is suppressed. The coulomb friction compensation could also be controlled by accumulated displacement since the last zero crossing of velocity (note that friction is plotted as a function of displacement in Figure 77.5) or as a function of time divided by acceleration (applied torque). Friction compensation has also been demonstrated that incorporates a low-pass filter in the friction model and thus eliminates the discontinuity at zero velocity. One construction is given by Equation 77.20,
where τ is a time constant. Walrath [14] reports an application of friction compensation based on Equation 77.20 that results in a 5:1 improvement in RMS pointing accuracy of a pointing telescope. Because of the accuracy requirement and the bandwidth of disturbances arising when the telescope is mounted in a vehicle, the tracking telescope pointing provides a very rigorous test of friction compensation. In the system described, τ is a function of the acceleration during the zero crossing, given by
where a1 and a2 are empirically tuned parameters, and the acceleration is estimated from the applied motor torque.
Position-dependent friction. In mechanisms with spatial inhomogeneities, such as gear teeth, friction may show substantial, systematic variation with position. Some machine tool controllers include an "anti-backlash" compensation that is implemented using table lookup. Friction is known to influence the table identification and, thus, the compensation. Explicit identification and feedforward compensation of position-dependent friction can also be done [1],[2].

Adaptive Control, Introduction
Adaptive control has been proposed and demonstrated in many forms; when applied to friction compensation, adaptation offers the ability to track changes in friction. Adaptive friction compensation is indirect when an on-line identification mechanism explicitly estimates the parameters of the friction model. The advantage of indirect adaptive control is that the quality of the estimated parameters can be verified before they are used in the compensation loop. Typical examples of such verification are to test the sign of the parameters or their range of variation. By introducing an additional supervisory loop, the identified parameters can be restricted to a predetermined range. Direct adaptive compensation rests on a different philosophy: no explicit on-line friction modeling is involved; rather, controller gains are directly adapted to minimize tracking error. Because there is no explicit friction prediction, it is more difficult to supervise the adaptive process to assure that the control reflects a physically feasible friction model or that it will give suitable transient behavior or noise rejection.
Direct Adaptive Control
Model reference adaptive control (MRAC) is described elsewhere in this handbook. Perhaps the simplest implementation of model reference coulomb friction compensation is given by

u(t) = (PD control) + F̂_C(t) sgn[v(t)]
where v(t) is velocity; v_m(t) is the reference model velocity; and F̂_C(t) is a coulomb friction compensation parameter. The parameters C1 and C2 are chosen by the designer and can be tuned to achieve the desired dynamics of the adaptive process. A Lyapunov function will show that the (idealized) process will converge for C1, C2 > 0. Brandenburg and his co-workers [4] have carried out a thorough investigation of friction compensation in a two-mass system with backlash and flexibility (see also [2] for additional citations). They employ an MRAC structure to adapt the parameters of a coulomb friction-compensating disturbance observer. Without friction compensation, the system exhibits two stick-slip limit cycles. The MRAC design is based on a Lyapunov function; the result is a friction compensation made by applying a lag (PI) filter to e(t), the difference between model and true velocity. Combined with integrator dead band, their algorithm eliminates stick-slip and reduces lost motion during velocity reversal by a factor of 5.

Indirect Adaptive Control
When the friction model is linear in the unknown parameters, it can be put in the form
where θ is the parameter vector and φ(t) is the regressor vector. The regressor vector depends on the system state and can include nonlinear functions. For example, a model that includes the Stribeck curve is given by [5]:
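The equation itself did not survive extraction; as a hedged sketch, a linear-in-parameters friction model with coulomb, viscous, and Stribeck-shaped terms (the exact regressor and coefficients in [5] may differ) can be written as:

```python
import math

def regressor(v, v_s=0.01):
    """phi(v) for F = theta^T phi(v): coulomb, viscous, and a
    Stribeck-shaped exponential term (v_s is an assumed Stribeck
    velocity, an illustrative value)."""
    s = math.copysign(1.0, v) if v != 0.0 else 0.0
    return [s, v, s * math.exp(-(v / v_s) ** 2)]

theta = [0.3, 0.1, 0.2]   # assumed coulomb, viscous, Stribeck magnitudes
F = sum(t * p for t, p in zip(theta, regressor(0.005)))
```

The point of the linear-in-parameters form is that θ can be estimated with standard recursive algorithms even though φ(v) is nonlinear in the velocity.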
Assuming the simplest mechanism dynamics,

M dv(t)/dt = u(t) - θ^T φ(t)     (77.25)

where M is the mass and u(t) is the force or torque command, and sampling at time t = kT, the friction prediction error is given by the difference between the applied force and the force predicted by the friction model. To avoid the explicit measurement (or calculation) of the acceleration, the friction prediction error can be based on a filtered model. By applying a stable, low-pass filter, F(s), to each side of Equation 77.25, the filtered mechanism dynamics are given, and the filtered prediction error is obtained, where a bar indicates a low-pass filtered signal. An update equation for the parameter estimate is constructed, where λ_k is a rate gain and P_k is the inverse of the input correlation matrix for the recursive least squares (RLS) algorithm or the identity matrix for the least mean squares (LMS) algorithm. The control law is now implemented as

u(t) = (Standard control) + θ̂^T φ(t)     (77.29)

where θ̂(t) is used for compensation, rather than the filtered estimate. During the implementation of the estimation loop the following points are important:
1. The estimation should be frozen when operating conditions are not suitable for friction identification [e.g., v(t) = 0; this is the persistent excitation issue].
2. The sign of the estimated parameters should always be positive.
3. The compensation can be scaled down to avoid overcompensation.
4. The sign of the reference velocity can be used in place of the sign of the measured velocity in Equation 77.24 when v(t) is small.
An alternative approach has been presented by Friedland and Park [8], who justify an update law that does not depend on acceleration measurement or estimation. The friction compensation is given by

M dv(t)/dt = u(t) - F_f(v(t), F_C)     [System dynamics, single mass]
F_f(v(t), F_C) = F_C sgn[v(t)]     [Friction model]
u(t) = (Standard control) + F̂_C sgn[v(t)]     [Control law]
F̂_C = z(t) - k |v(t)|^μ     [Friction estimator]
dz(t)/dt = (kμ/M) |v(t)|^(μ-1) [u(t) - F_f(v(t), F̂_C)] sgn[v(t)]     [Friction estimator update law]

where M is system mass; u(t) is the control input; z(t) is given by the friction estimator update law; F̂_C is the estimated coulomb friction; and μ and k are tunable gains. Defining the model misadjustment

e(t) = F*_C - F̂_C     (77.30)

where F*_C is the true coulomb friction level, one finds that

de(t)/dt = -(kμ/M) |v(t)|^(μ-1) e(t)     (77.31)

making e = 0 the stable fixed point of the process. The algorithm significantly improves dynamic response [4]; experimental results are presented in subsequent papers (see [2] for citations).
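The convergence of the misadjustment toward zero can be checked in a short simulation. The sketch below uses the acceleration-free update law in the form reconstructed above, with assumed values for M, F_C, k, μ, and a constant input u large enough to keep v > 0 so that sgn[v(t)] stays well defined:

```python
import math

M, F_C = 1.0, 1.0        # assumed mass and true coulomb friction level
k, mu = 1.0, 2.0         # assumed estimator gains
u = 2.0                  # constant input keeping v > 0
v, z = 1.0, 0.0          # initial velocity and estimator state
dt = 1e-3

for _ in range(2000):    # 2 seconds of Euler integration
    s = math.copysign(1.0, v)
    F_hat = z - k * abs(v) ** mu                 # friction estimator
    # update law: uses only u and the current estimate, no acceleration
    z_dot = (k * mu / M) * abs(v) ** (mu - 1) * (u - F_hat * s) * s
    v_dot = (u - F_C * s) / M                    # true single-mass dynamics
    z += z_dot * dt
    v += v_dot * dt

F_hat = z - k * abs(v) ** mu                     # approaches F_C
```

Because the misadjustment decays at rate (kμ/M)|v|^(μ-1), convergence is fast while the mechanism is moving and stalls near v = 0, which is the persistent excitation issue noted above.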
77.1.6 Conclusion
Achieving machine performance free from frictional disturbances is an interdisciplinary challenge; issues of machine design, lubricant selection, and feedback control must all be considered to cost-effectively achieve smooth motion. Today, the frictional phenomena that should appear in a dynamic friction model are well understood and empirical models of dynamic friction are available, but tools for identifying friction in specific machines are lacking. Many friction compensation techniques have been reported in both the research literature and industrial applications. Because a detailed friction model is often difficult to obtain, many of these compensation techniques have been empirically developed and tuned. Where available, reported values of performance improvement have been presented here. Reports of performance are not always available or easily compared because, to date, there has not been a controlled comparison of the effectiveness of the many demonstrated techniques for friction compensation. For many applications, it is up to the designer to seek out application-specific literature to learn what has been done before.
References
[1] Armstrong-Hélouvry, B., Control of Machines with Friction, Kluwer Academic Press, Boston, MA, 1991.
[2] Armstrong-Hélouvry, B., Dupont, P., and Canudas de Wit, C., A survey of models, analysis tools and compensation methods for the control of machines with friction, Automatica, 30(7), 1083-1138, 1994.
[3] Bowden, F.P. and Tabor, D., Friction - An Introduction to Tribology, Anchor Press/Doubleday, 1973; reprinted by Krieger Publishing Co., Malabar, 1982.
[4] Brandenburg, G. and Schäfer, U., Influence and partial compensation of simultaneously acting backlash and coulomb friction in a position- and speed-controlled elastic two-mass system, in Proc. 2nd Int. Conf. Electrical Drives, Poiana Brasov, September 1988.
[5] Canudas de Wit, C., Noel, P., Aubin, A., and Brogliato, B., Adaptive friction compensation in robot manipulators: low velocities, Int. J. Robotics Res., 10(3), 189-99, 1991.
[6] Canudas de Wit, C., Olsson, H., Åström, K.J., and Lischinsky, P., A new model for control of systems with friction, IEEE Trans. Autom. Control, in press, 1994.
[7] Dupont, P.E., Avoiding stick-slip through PD control, IEEE Trans. Autom. Control, 39(5), 1094-97, 1994.
[8] Friedland, B. and Park, Y.-J., On adaptive friction compensation, IEEE Trans. Autom. Control, 37(10), 1609-12, 1992.
[9] Fuller, D.D., Theory and Practice of Lubrication for Engineers, John Wiley & Sons, Inc., New York, 1984.
[10] Ibrahim, R.A. and Rivin, E., Eds., Special issue on friction induced vibration, Appl. Mech. Rev., 47(7), 1994.
[11] Karnopp, D., Computer simulation of stick-slip friction in mechanical dynamic systems, ASME J. Dynamic Syst., Meas., Control, 107(1), 100-103, 1985.
[12] Polycarpou, A. and Soom, A., Transitions between sticking and slipping, in Friction-Induced Vibration, Chatter, Squeal, and Chaos, Proc. ASME Winter Ann. Meet., Ibrahim, R.A. and Soom, A., Eds., DE-Vol. 49, 139-48, ASME, New York, 1992.
[13] Taylor, J.H. and Lu, J., Robust nonlinear control system synthesis method for electro-mechanical pointing systems with flexible modes, in Proc. Am. Control Conf., AACC, San Francisco, 1993, 536-40.
[14] Walrath, C.D., Adaptive bearing friction compensation based on recent knowledge of dynamic friction, Automatica, 20(6), 717-27, 1984.

77.2 Motion Control Systems
Jacob Tal, Galil Motion Control, Inc.

77.2.1 Introduction
The motion control field has experienced significant developments in recent years. The most important one is the development of microprocessor-based digital motion controllers. Today, most motion controllers are digital, in contrast to 20 years ago when most controllers were analog. The second significant development in this field is modularization, the division of motion control functions into components with well-defined functions and interfaces. Today, it is possible to use motion control components as building blocks and to integrate them into a system. The following section describes the system components and operation, including very simple mathematical models of the system. These models are adequate for the major emphasis of this chapter, which discusses the features and capabilities of motion control systems. The discussion is illustrated with design examples.

77.2.2 System Elements and Operation
The elements of a typical motion control system are illustrated in the block diagram of Figure 77.16.
Figure 77.16 Elements of a motion control system: host computer, command, and motion controller.
The amplifier is the component that generates the current required to drive the motor. Amplifiers can be configured in different ways, thereby affecting their operation. The most common operating mode is the transconductance or current mode, where the output current, I, is proportional to the applied voltage, V. The ratio between the two signals, K_a, is known as the current gain:

I = K_a V     (77.32)

When current is applied to the motor, it generates a proportional torque, T_g:

T_g = K_t I     (77.33)

The constant K_t is called the torque constant. The effect of torque on motion is given by Newton's Second Law. Assuming negligible friction, the motor acceleration rate, α, is given by
α = T_g / J     (77.34)

where J is the total moment of inertia of the motor and the load. The motor position, θ, is the second integral of the acceleration. This relationship is expressed by the transfer function

θ(s)/T_g(s) = 1/(J s²)     (77.35)
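Using the example values that appear later in this section (K_a = 2 A/V, K_t = 0.1 Nm/A, J = 2 × 10^-4 kg·m²), the chain from amplifier voltage to acceleration can be checked numerically:

```python
Ka = 2.0       # current gain, A/V
Kt = 0.1       # torque constant, Nm/A
J = 2e-4       # total moment of inertia, kg*m^2

V = 1.0                    # applied voltage, V
I = Ka * V                 # motor current (Eq. 77.32)
Tg = Kt * I                # generated torque (Eq. 77.33)
alpha = Tg / J             # acceleration, rad/s^2
```

One volt at the amplifier input thus produces 1000 rad/s² of acceleration in this example, which sets the scale of the loop gain used in the stability analysis below.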
The position of the motor is monitored by a position sensor. Most position sensors represent the position as a digital number with finite resolution. If the position sensor expresses the position as N units of resolution (counts) per revolution, the equivalent gain of the sensor is

K_f = N/(2π) counts/rad     (77.36)
The component that closes the position loop is the motion controller. This is the "brain" of the system and performs all the computations required for closing the loop, as well as providing stability and trajectory generation. Denoting the desired motor position by R, the position feedback by C, and the position error by E, the basic equation of the closed-loop operation is

E = R - C     (77.37)
The controller includes a digital filter which operates on the position error, E, and produces the output voltage, V. In order to simplify the overall modeling and analysis, it is more effective to approximate the operation of the digital filter by an equivalent continuous one. In most cases the transfer function of the filter is PID, leading to the equation

V(s) = [P + Ds + I/s] E(s)     (77.38)
Figure 77.17 Mathematical model of a typical motion control system: filter, amplifier, motor, and sensor.
77.2.3 Stability Analysis Example
Consider a motion control system including a current source amplifier with a gain of K_a = 2 A per volt; the motor has a torque constant of K_t = 0.1 Nm/A and a total moment of inertia of J = 2 × 10^-4 kg·m². The position sensor is an absolute encoder with 12 bits of binary output, implying that the sensor output varies between zero and 2^12 - 1, or 4095. This means that the sensor gain is

K_f = 4096/(2π) ≈ 652 counts/rad
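The sensor gain computation (Equation 77.36 with N = 2^12 counts per revolution) can be verified directly:

```python
import math

N = 2 ** 12                  # 12-bit absolute encoder: 4096 counts/rev
Kf = N / (2 * math.pi)       # sensor gain, counts per radian (Eq. 77.36)
```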
The digital controller has a sampling period of 1 msec. It has a single input, E(K), and a single output X(K). The filter equations are as follows:
The signal X(K) is the filter output signal, which is applied to a digital-to-analog converter (DAC) with a 14-bit resolution and an output signal range between -10 V and 10 V. A mathematical model for the DAC is developed by noting that when X(K) varies over a range of ±8192 counts, the output varies over ±10 V. This implies that the gain of the DAC is

G = 10/8192 = 1.22 × 10^-3 volts/count
To model the equation of the filter, we note that this is a digital PID filter. This filter may be approximated by the continuous model of Equation 77.38 with the following equivalent terms:

P = AG
I = CG/T
D = BTG

where T is the sampling period. Given the digital filter parameters A, B, and C and the 1-msec sampling period, the resulting equivalent continuous PID parameters follow directly from these relations. The mathematical model of the complete system can now be expressed by the block diagram of Figure 77.17. This model can be used as a basis for analysis and design of the motion control system. The analysis procedure is illustrated below.
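The numeric filter gains did not survive this extraction, so the A, B, and C values below are assumed for illustration of the equivalence relations P = AG, I = CG/T, and D = BTG:

```python
A, B, C = 4.0, 50.0, 0.02    # assumed digital filter gains
G = 10.0 / 8192              # DAC gain, volts/count (14-bit, +/-10 V span)
T = 0.001                    # sampling period, s

P = A * G                    # equivalent proportional gain
I = C * G / T                # equivalent integral gain
D = B * T * G                # equivalent derivative gain
```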
Assuming the filter parameters shown above, we may proceed with the stability analysis. The open loop transfer function of the control system, L(s), is formed from the filter, DAC, amplifier, motor, and sensor gains and can be evaluated numerically for the given example. Start with the crossover frequency (the frequency at which the open-loop gain equals one) to determine the stability. Evaluating the phase shift at the crossover frequency, ω_c, the system is found to be stable with a phase margin of 60°.
The previous discussion focused on the hardware components and their operation. The system elements which form the closed-loop control system were described. In order to accomplish the required motion, it is necessary to add two more functions, motion profiling and coordination. Motion profiling generates the desired position function, which becomes the input of the position loop. Coordination is the process of synchronizing various events to verify that their timing is correct. These functions are described in the following sections.

77.2.4 Motion Profiling
Consider the system in Figure 77.17 and suppose that the motor must turn 90°. The simplest way to generate this motion is by setting the reference position, R, to the final value of 90°. This results in a step command of 90°. The resulting motion is known as the step response of the system. Such a method is not practical, as it provides little control over the motion parameters, such as velocity and acceleration.
An alternative approach is to generate a continuous time-dependent reference position trajectory, R(t). The basic assumption here is that the control loop forces the motor to follow the reference position. Therefore, generating the reference trajectory results in the required motion. The motion trajectory is generated outside the control loop and, therefore, has no effect on the dynamic response and the system stability.
To illustrate the process of motion profiling, consider the case where the motor must turn 1 radian and come to a stop in 0.1 seconds. Simple calculation shows that if the velocity is a symmetric triangle, the required acceleration rate equals 400 rad/sec². The required move is accomplished by the motion controller computing the position trajectory R(t) at the sample rate and applying the result as an input to the control loop.

77.2.5 Tools for Motion Coordination
Motion controllers are equipped with tools to facilitate the coordination between events. The tools may vary between specific controllers, but their functions remain essentially the same. Coordination tools include stored programs, control variables, input/output interfaces, and trip points.
The stored programs allow writing a set of instructions that can be executed upon command. These instructions perform a specific motion or a related function. Consider, as an example, the following stored program, written in the format of Galil controllers:

Instruction   Interpretation
#MOVE         Label
PR 5000       Relative distance of 5000 counts
SP 20000      Speed of 20,000 counts/sec
AC 100000     Acceleration rate of 100,000 counts/sec²
BGX           Start the motion of the X-axis
EN            End the program
This program may be stored and executed in the motion controller, thus allowing independent controller operation without host intervention. The capabilities of the motion controllers increase immensely with the use of symbolic variables. They allow the motion controller to perform mathematical functions and to use the results to modify the operation. Consider, for example, the following program, which causes the X motor to follow the position of the Y motor. This is achieved by determining the difference in the motor positions, E, and by driving the X motor at a velocity, VEL, which is proportional to E. Both E and VEL are symbolic variables.
77.2. MOTION CONTROL SYSTEMS

Instruction        Interpretation
#FOLLOW            Label
JG 0               Set X in jog mode
AC 50000           Acceleration rate
BGX                Start X motion
#LOOP              Label
E = _TPY - _TPX    Position difference
VEL = E * 20       Follower velocity
JG VEL             Update velocity
JP #LOOP           Repeat the process
EN                 End
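The #FOLLOW loop is a proportional position follower: on every pass, the X jog speed is set to 20 times the position gap between Y and X. A rough Python simulation of why this converges (the loop period and positions below are invented for illustration and are not part of the original program):

```python
# Hypothetical simulation (not controller code) of the #FOLLOW loop:
# each cycle sets the X jog velocity to gain * (TPY - TPX), so X closes
# exponentially on a stationary Y axis.

def simulate_follow(pos_y=10000.0, pos_x=0.0, gain=20.0, dt=0.001, cycles=500):
    """Return the X position after running the proportional follow loop."""
    for _ in range(cycles):
        error = pos_y - pos_x        # E = _TPY - _TPX
        vel = gain * error           # VEL = E * 20 (counts/s)
        pos_x += vel * dt            # X moves at the commanded jog speed
    return pos_x
```

With a 1 ms loop period the gap shrinks by 2% per pass, so after 500 passes X has settled to within a fraction of a count of Y.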
The input/output interface allows motion controllers to receive additional information through the input lines. It also allows the controllers to perform additional functions through the output lines. The input signals may be digital, representing the state of switches or digital commands from computers or programmable logic controllers (PLCs). The inputs may also be analog, representing continuous functions such as force, tension, temperature, etc. The output signals are often digital and are aimed at performing additional functions, such as turning relays on and off. The digital output signals may also be incorporated into a method of communication between controllers or communication with PLCs and computers. The following program illustrates the use of an input signal for tension control. An analog signal representing the tension is applied to analog input port #1. The controller reads the signal and compares it to the desired level to form the tension error TE. To reduce the error, the motor is driven at a speed VEL that is proportional to the tension error.

Instruction         Interpretation
#TEN                Label
JG 0                Zero initial velocity
AC 10000            Acceleration rate
BGX                 Start motion
#LOOP               Label
TENSION = @AN[1]    Read analog input and define as tension
TE = 6 - TENSION    Calculate tension error
VEL = 3 * TE        Calculate velocity
JG VEL              Adjust velocity
JP #LOOP            Repeat the process
EN                  End

Synchronizing various activities is best achieved with trip points. The trip point mechanism specifies when a certain instruction must be executed. This can be specified in terms of motor position, input signal, or a time delay. The ability to specify the timing of events assures synchronization. Consider, for example, the following program, where the motion of the X motor starts only after a start pulse is applied to input #1. After the X motor moves a relative distance of 5000 counts, the Y motor starts, and after the Y motor moves 3000 counts, an output signal is generated for 20 ms.

Instruction       Interpretation
#TRIP             Label
PR 10000,10000    Motion distances for X and Y motors
SP 20000,20000    Velocities for X and Y
AI 1              Wait for start signal on input 1
BGX               Start the motion of X
AD 5000           Wait until X moves 5000 counts
BGY               Start the Y motion
AD ,3000          Wait until Y moves 3000 counts
SB 1              Set output bit 1 high
WT 20             Wait 20 ms
CB 1              Clear output bit 1
EN                End

The following design example illustrates the use of these tools for programming motion.

77.2.6 Design Example - Glue Dispensing Consider a two-axis system designed for glue dispensing, whereby an XY table moves the glue head along the trajectory of Figure 77.18. To achieve a uniform application rate of glue per unit length, it is necessary to move the table at a constant vector speed and to turn the glue valve on only when the table has reached uniform speed.
Figure 77.18    Motion path for the glue-dispensing system.
The motion starts at point A and ends at point C after a complete cycle, to allow for acceleration and deceleration distances. The gluing starts and stops at point B. The motion program is illustrated below. It is expressed in the format of Galil controllers. The instructions for straight lines and circular arcs are VP and CR, respectively. The glue valve is activated with the digital output signal #1. This signal is activated and deactivated when the X motor moves through point B. The timing of this event is specified by the trip point FM 2000, which waits until the X motor moves forward through the point X = 2000.
THE CONTROL HANDBOOK

Instruction        Interpretation
#GLUE              Label
DP 0,0             Define starting position as (0,0)
VP 4000,0          Move to point C
CR 2000,270,180    Follow the arc CD
VP 0,4000          Move to point E
CR 2000,90,180     Follow the arc EA
VP 4000,0          Move to point C
VE                 End of motion
VS 20000           Vector speed
VA 100000          Vector acceleration
VD 100000          Vector deceleration
BGS                Start the motion
FM 2000            When X = 2000 (point B)
SB 1               Set output 1 (turn glue valve on)
WT 100             Wait 100 ms
FM 2000            When X = 2000 again
CB 1               Turn the glue valve off
EN                 End
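As a sanity check on the glue program, the commanded path consists of three 4000-count straight lines and two half circles of radius 2000 counts, traversed at a constant vector speed. A small sketch of the resulting cycle time, assuming a symmetric trapezoidal vector profile (VA = VD); the helper function is hypothetical, not part of the controller language:

```python
import math

# Hypothetical cycle-time estimate for the #GLUE path: three 4000-count
# lines plus two half circles of radius 2000, at vector speed VS with
# equal vector acceleration and deceleration VA = VD.

def glue_cycle_time(vs=20000.0, va=100000.0):
    """Approximate cycle time (s) for the glue path at constant vector speed."""
    length = 3 * 4000.0 + 2 * math.pi * 2000.0   # lines + two half circles
    return length / vs + vs / va                  # trapezoidal profile timing
```

The path length comes to roughly 24,566 counts, giving about 1.43 s per cycle at VS 20000.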
References
[1] DC Motors, Speed Controls, Servo Systems: An Engineering Handbook, Electro-Craft Corp.
[2] Tal, J., Step-by-Step Design of Motion Control Systems, Galil Motion Control, 1994.
77.3 Ultra-High Precision Control

Thomas R. Kurfess, The George W. Woodruff School of Mechanical Engineering, The Georgia Institute of Technology, Atlanta, GA
Hodge Jenkins, The George W. Woodruff School of Mechanical Engineering, The Georgia Institute of Technology, Atlanta, GA

77.3.1 Introduction Fierce international competition is placing an ever-increasing significance on precision and accuracy in ultra-high precision systems engineering. Control of dimensional accuracy, tolerance, surface finish and residual damage is necessary to achieve or inspect ultra-high precision components required for the latest technology. To achieve ultra-high precision, specialized control subsystems must be employed for modern machine tools and measurement machines. These systems are common in manufacturing today, and will become more widespread in response to continually increasing demands on manufacturing facilities. Today, ultra-high precision applications are found in the manufacture and inspection of items such as automobile bearings, specialized optics or mirrors, and platens for hard disk drives. The manufacture of x-ray optical systems for the next generation of microchip fabrication systems has also recently become internationally important. An excellent example of an ultra-high precision machine is the large optic diamond turning machine at the Lawrence Livermore National Laboratory [8]. This machine operates with a dimensional error of less than one-millionth of an inch (1 µin) and has turned optical parts for many systems, including the secondary mirror for the Keck telescope. Not only is the shape of the optics precise enough for use without further processing, but the surface finish is of high enough quality to eliminate polishing. In effect, diamond turning is capable of producing optical parts with extremely stringent optical and specular requirements. In ultra-high precision machining, accuracies of 0.1 µin and lower are targeted. To achieve these accuracies, many problems arise that are not commonly encountered when designing standard machine servocontrollers, such as low-velocity friction, axis coupling, and phase lag.
Applications of Ultra-High Precision Systems The small, but increasing, number of special applications requiring ultra-high precision control techniques include grinding, metrology, diamond turning machines, and the manufacture of semiconductors, optics and electronic mass media. Diamond turning and grinding of optical quality surfaces on metal and glass require ultra-high precision movement. Accuracies for diamond turning of mirrors can be on the order of 10 nm. Optical grinding applications also require force control to promote ductile grinding, reducing subsurface damage of lenses [2], [13]. Another application for ultra-high precision control is metrology. Precision machined parts are either measured point by point using a vectored touch, to include a sufficient amount of data to determine the part geometry, or they are measured in a continuous motion (scanning) along a part dimension using a constant force trajectory. In either case a force probe (or touch trigger) is used to make or maintain contact with the part surface. To have repeatable data, minimize the force-induced measurement errors, and extend the life of the scanning probe, the contact forces must be minimized and maintained as constant as possible. The primary applications discussed in this chapter are diamond turning machines (DTMs) and scanning coordinate measurement machines (CMMs). Both machines have many similarities, including basic structure as well as velocity and position control designs. When addressing issues related to both types of machines, the term machine is used. When discussing a cutting machine (e.g., a DTM), the term machine tool is used; the term measurement machine (e.g., a CMM) is used for a machine used exclusively to measure.
History Initial machine tool control began with simple numerical control (NC) in the 1950s and 1960s. Later (in the 1970s), computer numerical control (CNC) grew more prevalent as computer technology became less expensive and more powerful. These were either point-to-point controllers or tracking controllers, where the tool trajectory is broken up into a series of small straight-line movements. Algorithms for position feedback in most current machine tools are primarily proportional control with bandwidths of nominally 10 to 20 Hz. Increasing the accuracy and repeatability of the machine has been the driving force behind the development of NC and CNC technology. However, not until the mid 1960s did the first diamond turning machines begin to take shape. These machines were developed at the Lawrence Livermore National Laboratory (LLNL) defense complex. DTMs differ widely from other machine tools. Two of the major differences critical for the control engineer are the mechanical design of the machine and the active analog compensators used in the machine control loop. The basic design philosophy employed by the engineers at LLNL is known as the deterministic theory [4]. The premise of this theory is that machines should be designed well enough to eliminate the need for complicated controllers. Given such a system, the job of the control engineer is greatly simplified; however, to achieve the best possible performance, active analog compensators are still successfully employed on DTMs. In the 1980s, other advances in machine tool and measurement system control were further developed, such as parameter controllers used to servo force or feed rate. Such parameter controllers, which are fairly common today, operate at lower frequencies. For example, closed-loop force control system frequencies of less than 3 Hz are used in many applications [22]. Theoretical and experimental work has been and continues to be conducted using model reference adaptive control in machining (e.g., [11]). Both fixed gain and parameter adaptive controllers have been implemented for manipulating on-line feed rate to maintain a constant cutting force. Although such controllers have increased cutting efficiency, they typically reduce machine performance and may lead to instability. Although research in adaptive parameter controllers is fairly promising to date, these controllers have not been widely commercialized. Further details regarding machine tool control can be found in the literature [28].
Current Technology The current technology for designing machines with the highest precision is based on a fundamental understanding of classical control, system identification, and precision components for the controller, drive and feedback subsystems. Control can compensate for poor equipment design only to a limited degree. A well-designed machine fitted with high precision hardware will ease the controller design task. This, of course, relates back to the deterministic theory. Special hardware and actuators for ultra-high precision equipment, such as laser interferometers, piezoelectric actuators, and air and hydrostatic bearings, are critical in achieving ultra-high precision. These subsystems and others are being used more frequently as their costs decrease and availability increases. As stated above, the analog compensator design for precision systems is critical. For ultra-high precision surfaces, both position and velocity must be controlled at high bandwidths because air currents and floor vibrations can induce higher frequency noise. The specifications for the velocity compensator combine machine bandwidth and part requirements. Part geometry, in particular, surface gradients in conjunction with production rate
requirements, dictate the velocity specifications for measurement machines and machine tools. For example, a part with high surface gradients (e.g., sharp corners) requires fast responses to velocity changes for cornering. Of course, higher production rates require higher bandwidth machines as feed rates are increased. Maximum servo velocity for a machine tool is also a function of initial surface roughness and desired finish as well as system bandwidth. For measurement machines, only part geometry and machine bandwidth affect the system velocity specifications because no cutting occurs. One way to improve the performance of ultra-high precision machines is to use slow feed rates (on the order of 1 µin/sec), but such rates are usually unacceptable in high quantity production. Until recently, force control was not used in many ultra-high precision machining applications, because the cutting forces are generally negligible due to the low feed rates. Furthermore, force control departs from standard precision machining approaches because force is controlled by servocontrol of feed velocity or position. Force is controlled via position servocontrol using Hooke's spring law, or via damping relationships if velocity servocontrol is used. Unfortunately, force control in machining generates conflicting objectives between force trajectories and position trajectories (resulting in geometric errors) or velocity trajectories (resulting in undesired surface finish deviations). However, such conflicts can be avoided if the machine and its controllers are designed to decouple force from position and velocity. Such conflicts do not occur to as great an extent with measurement systems, where contact forces must be kept constant between the machine measurement probe and the part being inspected.
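Force control through position servoing reduces, via Hooke's law (F = k x), to commanding a flexure deflection. A trivial sketch of the idea; the function names and the stiffness value are hypothetical, chosen only for illustration:

```python
# Hypothetical sketch of force control via position servoing: with a
# calibrated flexure of stiffness k (Hooke's law, F = k * x), commanding
# a deflection x realizes a contact force, so the force loop becomes a
# position loop.

def deflection_for_force(force_target, stiffness):
    """Deflection command (m) that yields the desired contact force (N)."""
    return force_target / stiffness

def force_from_deflection(deflection, stiffness):
    """Contact force (N) implied by a measured flexure deflection (m)."""
    return stiffness * deflection
```

The velocity-servo route is analogous, with a damping coefficient taking the place of the stiffness.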
Typical error sources encountered in precision machine control are machine nonlinearities, control law algorithms, structural response (frequency and damping), measurement errors, modeling errors, human errors, and thermal errors. Sources of machine nonlinearities include backlash, friction, actuator saturation, varying inertial loads, and machine/probe compliance. Methods for addressing these nonlinearities include high proportional gain, integral control, feedforward control, adaptive control, and learning algorithms. Friction is a problem at the low feed rates needed to achieve high quality (optical) surface finishes. Low velocity friction (stiction) is a primary cause of nonlinearities affecting trajectory control of servo tables. At near zero velocities, stiction may cause axial motion to stop. In such a condition, the position tracking error increases and the control force builds. The axis then moves, leading to overshoot. Most industrial controllers use only proportional compensators that cannot effectively eliminate the nonlinearities of stiction [28]. Friction compensation in lead screws has been addressed by several authors concerned with coulomb friction and stiction [27]. Special techniques such as learning algorithms [25] and cross-coupling control [20] have been employed successfully in X-Y tables to reduce tracking errors caused by low velocity friction.
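The stick-slip behavior described above can be illustrated with a crude Karnopp-style friction model. All gains, masses, and friction levels below are invented for illustration; this is a caricature, not a model of any machine in the text:

```python
# Hypothetical stick-slip sketch: a proportionally controlled axis is held
# by stiction until the control force exceeds the breakaway level, after
# which the stored "error energy" carries it past the target (overshoot).

def simulate_stiction(kp=2.0, f_break=5.0, f_coulomb=3.0,
                      mass=1.0, dt=0.001, steps=4000, target=10.0):
    """Return the peak position reached while tracking a step to `target`."""
    pos, vel, peak = 0.0, 0.0, 0.0
    for _ in range(steps):
        f_ctrl = kp * (target - pos)              # proportional control force
        if vel == 0.0 and abs(f_ctrl) <= f_break:
            acc = 0.0                              # stuck: stiction holds axis
        else:
            if vel > 0.0:
                friction = f_coulomb               # Coulomb friction opposes motion
            elif vel < 0.0:
                friction = -f_coulomb
            else:                                  # breakaway: opposes the drive
                friction = f_coulomb if f_ctrl > 0 else -f_coulomb
            acc = (f_ctrl - friction) / mass
        vel += acc * dt
        pos += vel * dt
        peak = max(peak, pos)
    return peak
```

With these numbers the axis breaks away and overshoots the 10-count target by a wide margin, the behavior that integral, feedforward, or learning compensation is meant to tame.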
77.3.2 System Description Because controller and system hardware directly affect machine performance, both must be thoroughly discussed to develop a foundation for understanding precision system control. This section describes the precision hardware components (e.g., servo drives and sensors) and controllers, in terms of their function and performance.
Basic Machine Layout The control structure for a typical machine tool can be represented by a block diagram as shown in Figure 77.19. Typically, CNC controllers generate tool trajectories shown as the reference input in Figure 77.19. Given this reference trajectory and feedback from a position sensor, most CNC controllers (and other commercially available motion controllers) compute the position error for the system and provide an output signal (voltage) proportional to that error. A typical CNC output has a limited range of voltage (e.g., -10 to 10 V dc). The error signal is amplified to provide a power signal to the process actuators which, in turn, affect the process dynamics to change the output that is then measured by sensors and fed back to the CNC controller.
Figure 77.19    Control structure block diagram.
To improve system capability, various controllers may be placed between the position error signal and the power amplifier. This is discussed in detail in later sections. However, before beginning the controller design, it is valuable to examine the system components for adequacy. It may be desirable to replace existing motors, actuators, power amplifiers and feedback sensors, depending on performance specifications and cost constraints. Transmissions and Drive Trains Typically, machines are based on a Cartesian or polar reference frame, requiring conversion of the rotational displacements of actuators (i.e., servo motors) to linear displacements. Precision ball screws, friction drives or linear motors are used for this purpose. Nonlinearities, such as backlash, should be minimized or avoided in any drive system. In ball screws backlash can be greatly reduced, or eliminated in some cases, by preloading an axis. Preloading can be accomplished by suspending a weight from one side of the slide, ensuring that only one side of the screw is used to drive the slide. Backlash in gear reductions can be minimized with the use of harmonic drives. Greater precision requires friction (capstan) drives or linear motors, because position repeatability suffers with recirculating ball screws. Capstan drives have direct contact so that no backlash occurs. However, they cannot transmit large forces, so they are commonly used in CMMs and some DTMs where forces are low.
Rolling or antifriction elements are required to move heavy carriages along a guide way and to maintain linearity. Several configurations are used depending on the amount of precision required. Linear shafts and recirculating ball bearing pillow blocks are for less precise applications. For increased accuracy and isolation, noncontact elements are required at greater expense. Noncontact elements include air bearings, hydrostatic bearings and magnetic bearings. Air bearings are common on CMMs. Hydrostatic bearings are still a maturing technology but provide greater load carrying capability than air bearings [23]. Motors Several motor types are available for use in ultra-high precision control. Typical applications are powered by precision DC servomotors. Recent technology includes linear motors and microstepping motors. Stepper motors provide an accurate and repeatable step motion, yielding a moderate holding torque. However, standard stepper motors have a fundamental step angle limitation (usually 1.8° per step). Stepper motors also have a low resonance frequency, causing them to perform poorly at slow speeds and inducing vibrations on the order of 10 Hz. The step size and vibration limitations are somewhat reduced when using microstepping controllers. Finally, these motors can also dissipate large amounts of heat, warming their surroundings and generating thermal errors due to thermal expansion of the machine. Linear motors have the advantage of eliminating the lead screw elements of the actuator drive. They have no mechanical drive elements, but generally require rolling elements to suspend the "rotor" or slide. Heat is usually generated from a linear motor closer to the work piece and can be a significant source of thermal errors, depending on the application and machine configuration. Power Amplifiers Power sources (amplifiers) can be either constant gain amplifiers or pulse width modulated amplifiers (for DC servo motors).
Considerations for the amplifier selection are its linear range, maximum total harmonic distortion (THD) over the linear range, and maximum power output. All of these factors must be examined in designing ultra-high precision control systems. For high bandwidth systems, the amplifier dynamics may be critical. THD can be used to estimate nonlinear effects of the amplifier conservatively. The power limitations provide control limits to maintain controller linearity. (To protect the motors and amplifiers, circuit breakers or fuses should be placed appropriately. Transient currents (maximum current) and steady state current (stall current) must both be considered.)
Instrumentation: Sensors, Locations and Utilization Instrumentation of many types is available for position, velocity and force feedback to the controller. The sensor choice depends on the particular application. For example, in diamond turning, cutting forces are generally negligible, so that only tool speed and position sensors (e.g., tachometers and encoders) are needed for control. However, in scanning metrology, the probe tip force must be controlled to a very high degree to limit probe deflections. Here force sensors must be chosen carefully to achieve the desired force response.
Because the tolerances typically held in ultra-high precision machining are on the order of 100 nm and lower, special sensors are needed for measurement and feedback. Displacement sensors used for ultra-high precision feedback include optical encoders, resolvers, precision potentiometers, linear variable differential transformers (LVDTs), eddy current sensors, capacitance gages, glass scales, magnetic encoders, and laser interferometers. Even if stepper motors are used, feedback transducers should be applied to eliminate a potential slip error. At this small measurement scale, thermal effects and floor/external vibrations not typically encountered in lower precision applications appear more significant. Thus, the following points must be considered: 1. Allow the system to reach thermal equilibrium (attain a steady state temperature). Typically a 24-hour thermal drift test is required to assess the effect of temperature variations [1, Appendix C]. 2. Provide environmental regulation. Once the machine has reached thermal equilibrium, it must be kept at a constant temperature to ensure repeatability. The standard temperature as defined in [1] is 20°C (68°F). It is difficult, if not impossible, to keep the entire machine at a temperature of 20°C. Heat sources, such as motors and friction, will generate "hot spots" on the machine. Typically, the spatial thermal gradients are not detrimental to machine repeatability, but temporal thermal gradients are. Therefore, it is important to keep the machine temperature from varying with time. If laser interferometers are used, the temperature, relative humidity and pressure of the medium through which the laser beam passes (typically air) must be considered. If these quantities are not considered, the interferometer will experience a velocity of light (VOL) error, because the wavelength of light is modulated in a varying environment.
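A first-order sketch of VOL compensation: the displacement reading scales with the refractive index of the air path. The sensitivity coefficients below are rough rule-of-thumb values near standard conditions, assumed here purely for illustration (they are not taken from this text, and a real compensator would use a full empirical equation for the refractive index of air):

```python
# Hypothetical first-order VOL compensation sketch. Assumed approximate
# sensitivities of the air refractive index near standard conditions:
# about -0.93 ppm per +1 degC, +0.36 ppm per +1 mmHg, -0.01 ppm per +1 %RH.

def vol_error_ppm(d_temp_c=0.0, d_press_mmhg=0.0, d_rh_pct=0.0):
    """Approximate change in air refractive index, in parts per million."""
    return -0.93 * d_temp_c + 0.36 * d_press_mmhg - 0.01 * d_rh_pct

def corrected_length(raw_length_m, d_temp_c=0.0, d_press_mmhg=0.0, d_rh_pct=0.0):
    """Apply a first-order VOL correction to a raw interferometer reading."""
    scale = 1.0 + vol_error_ppm(d_temp_c, d_press_mmhg, d_rh_pct) * 1e-6
    return raw_length_m * scale
```

The point of the sketch is the magnitude: an uncompensated 1 °C drift corrupts a 1 m reading by nearly a micrometer, enormous at the 100 nm tolerances discussed here.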
It is not necessary to maintain these quantities at a fixed level (clearly, maintaining pressure is a difficult task), because interferometers can be equipped to compensate for VOL errors. In some critical applications, the laser beam is transmitted through a controlled atmosphere via separate beam ways. For the most precise machines developed, the beams are transmitted through a vacuum to eliminate all effects of temperature, humidity and pressure. This solution, however, is extremely difficult to realize. 3. The system must be isolated from floor vibrations. Granite bases, isolation damping pads, and air suspension are some of the techniques used to isolate a system from vibrations. Even the smallest disturbances, such as people conversing next to the machine, can adversely affect the surface finish. For best results in precision control, it is important to choose the appropriate feedback sensor(s) and mounting location. Several principles should be followed. Care must be taken to locate the
sensor as close as possible to the line of action or contact, to avoid Abbe offset errors [3]. Of course it is always desirable to locate sensors and drives in a location that is insensitive to alignment errors and away from harm in case a crash occurs. With the appropriate design and consideration, a machine's accuracy can be improved to the level of the sensor's resolution. It is critical to recognize that sensors have dynamics and, therefore, transfer functions. The sensor bandwidth must match or exceed the desired system frequency response. Some of these sensors will require their own compensation, such as the VOL error on laser interferometers caused by fluctuations in temperature, pressure, and humidity. Discrete sensors, such as encoders (rotary and linear), are often used in precision machine systems. These sensors measure position by counting tick marks precisely ruled on the encoder. These discrete sensors produce two encoded phase signals for digital position indication (A and B channel quadrature). This increases their resolution by a factor of four and provides directional information. Rotary encoders and resolvers, used on the motor rotational axis, are not typically acceptable in ultra-high precision control because they are not collocated with the actual parameter being measured. Furthermore, their repeatability is affected by backlash and drive shaft (screw) compliance. However, linear versions of encoders (either magnetic or glass scales) are commonly used with good results. A limitation here is the minimum resolution size, which is directly related to the spacing of the ticks ruled on the scale. Care must be taken to provide a clean environment, because dirt, smudges or other process byproducts can reduce the resolution of these sensors, limiting their utility. The requirement for cleanliness may also require that these devices be kept isolated from potentially harsh environments. Analog sensors are also widely used for feedback.
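The factor-of-four resolution gain from A and B channel quadrature comes from counting every edge of the two phase-shifted signals, while the phase ordering gives direction. A small decoder sketch; the representation of samples as (A, B) pairs is my own, for illustration:

```python
# Hypothetical A/B quadrature decoder sketch: 4 counts per encoder line,
# with direction recovered from the Gray-code transition order.

# Forward rotation cycles (A, B) through 00 -> 01 -> 11 -> 10 -> 00.
_FORWARD = {(0, 0): (0, 1), (0, 1): (1, 1), (1, 1): (1, 0), (1, 0): (0, 0)}

def decode_quadrature(samples):
    """Return the net count from a sequence of (A, B) logic-level samples."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            continue                      # no edge on either channel
        if _FORWARD.get(prev) == cur:
            count += 1                    # forward transition
        elif _FORWARD.get(cur) == prev:
            count -= 1                    # reverse transition
        # any other pairing would be an invalid double-edge transition
    return count
```

One full encoder line produces four counts, and reversing the sample order negates the count.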
Typical analog sensors used in precision control include LVDTs, capacitance gages and inductive (eddy current) gages. The least expensive analog displacement sensor is the LVDT, which uses an inductance effect on a moving ferrous core to produce a linear voltage signal proportional to the displacement. In most cases ultra-high precision air bearing LVDTs are necessary to overcome stiction of the core. The best resolution, for analog sensors, can be obtained with capacitance gages and eddy current gages, which are widely used in ultra-high precision measurements. Capacitance gages can measure displacements of 10^-12 m [14]. However, at this resolution the range is extremely limited. These sensors require a measurement from a metallic surface, so that locations are typically in-line with the metal part or offset near the equipment ways. Because of the extremely high resolution and low depth of field of these sensors, it is preferable to keep them collinear to the actuators. The interferometer is the most preferred position sensor in precision system design, because it utilizes the wavelength of light as its standard. Thus, it is independent of any physical artifact (such as gage blocks). In general, the laser interferometer is the best obtainable long-range measurement device, with measurements accurate to 0.25 nm, nominally. Because of size limitations, interferometers are typically located with
an offset, making measurements susceptible to offset errors, which can be minimized by good metrological practices. Similar to encoders, interferometers use A and B channel quadrature as their output format. As previously stated, care must be taken to include temperature, humidity and pressure compensation to achieve the best possible resolution for the interferometer. Typical sensors used in force measurement include LVDTs, strain gages, and piezoelectric transducers. LVDTs are used with highly calibrated springs or flexures, allowing forces to be estimated as a function of spring deflection. Bandwidth for this type of sensor is limited. Force measurements based on strain gages are also common. However, they too suffer from limited frequency response because they operate on the same principle as sensors employing flexures (Hooke's law). Since most ultra-high precision applications require a higher bandwidth than available from flexures and strain gage systems, piezoelectric force transducers are typically used in these applications. Piezoelectric-based dynamometers use a quartz crystal to generate a charge proportional to a force. The dynamometer may be located either behind the tool or under the work piece to measure cutting or probe forces. Although these sensors have larger bandwidth, they exhibit some hysteresis and charge leakage, which results in poor capability to measure lower frequency and DC forces due to drifting. However, they are generally stiffer than strain gage force sensors. Piezoelectric force sensors also tend to be large and reduce the effective tool stiffness. Recent work in force sensors for cutting machines has led to the development of sensors that employ a piezoelectric film. The film is a 28 µm layer of polyvinylidene fluoride placed on tool inserts close to the cutting edge [17].
Advantages of the film are that the force measurements are close to the cutting edge and there is essentially no loss of stiffness because of its small size. Newer magnetostrictive materials, such as Metglas or Terfenol, are currently being examined for use as force sensors. These materials emit an electromagnetic field in proportion to the force applied.
Care must be given to cabling sensors and determining the expected signal-to-noise ratio. All cables must be shielded to prevent stray magnetic fields from inducing current noise in sensor cables. A means must be provided for independent calibration of sensors (e.g., a laser interferometer can be used to independently check the capacitance gage calibration). Some sensors are also calibrated to a specific cable length impedance. Because many sensors are digitally based, anti-aliasing filters must be provided to ensure that signal components above the system Nyquist frequency do not cause aliasing problems. Details regarding the use of anti-aliasing filters can be found in Section 77.3.3. It is also prudent to remember that sensors themselves are dynamic systems, which vary in magnitude and phase with frequency. Specifications are typically provided by the manufacturer. However, it is strongly recommended that sensor system responses be verified. (System identification techniques used in system response verification are presented in Section 77.3.3.) This assists in designing and troubleshooting the controller.
Compensators: Position and Velocity Loops Analog controllers can be added to existing CNC-type controllers to improve the speed and precision of existing machines if appropriate attention is paid to understanding the system dynamics. The controller and system block diagram are depicted in Figure 77.20, highlighting the additional compensator for improved performance. The inner loop is the velocity loop, and
Figure 77.20 Machine with additional compensator.
the outer loop is the position loop. Both position and velocity must be well-characterized and controlled in ultra-high precision processes. Because tool position relates directly to final part geometry, the position compensator, Kp(s), is used to control dimensional accuracy. Similarly, since the tool velocity is directly related to surface finish, the velocity compensator, Kv(s), is used to control the surface finish. The plant (or process) is depicted as G(s), with velocity feedback provided by a tachometer and position feedback supplied by a laser interferometer or linear encoders. Details of the control design are provided in Section 77.3.4.
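The nested structure (velocity loop inside position loop) can be sketched numerically. The first-order motor model and the proportional compensator gains below are hypothetical placeholders chosen for illustration, not values from the text:

```python
import numpy as np

# Sketch of the Figure 77.20 loop structure with an assumed first-order
# motor, G(s) = K/(tau*s + 1), and simple proportional compensators.
K, tau = 10.0, 0.05          # assumed motor gain and time constant
Kv, Kp = 2.0, 50.0           # assumed velocity / position compensator gains

def closed_loop_position(s):
    G = K / (tau * s + 1)            # motor: voltage -> velocity
    Gv = Kv * G / (1 + Kv * G)       # inner (velocity) loop closed
    L = Kp * Gv / s                  # outer loop: integration to position
    return L / (1 + L)               # closed-loop position response

# sample the closed-loop Bode magnitude over a frequency grid
w = np.logspace(0, 4, 400)           # rad/s
H = closed_loop_position(1j * w)
# the integrator in the outer loop makes it type I: DC gain -> 1, so a
# constant position command is tracked with zero steady-state error
dc_mag = abs(closed_loop_position(1j * 1e-3))
```

The free integrator between velocity and position is what gives the outer loop its zero steady-state position error, mirroring the discussion of the position compensator in Section 77.3.4.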
Fast Tool Servos Recent advances have been made in the use of piezoelectric actuators for achieving greater bandwidth on precision equipment. Piezoelectric actuators are high bandwidth actuators (on the order of 1 kHz) that can produce high forces and are relatively stiff. However, the stroke of these actuators is generally limited to about 100 μm or less. Piezoelectric actuators are used with conventional actuators (lead screws, capstan drives, etc.) to yield small, high frequency displacements, while the conventional actuators provide the lower speed, longer range motions. The piezoelectric actuator is powered by a high voltage supply nominally generating 1000 volts. Actuators are run in a closed-loop mode, typically using a laser interferometer or a capacitance gage for displacement feedback. When used in a closed loop, the piezoelectric actuator is referred to as a fast tool servo (FTS). Several ultra-high precision applications using FTSs have been developed [6]. The FTS has been used to compensate for cutter runout [18] and for negating the effects of low velocity friction in a series lead screw actuator. FTSs have also been successfully implemented on diamond turning machines [13], [9].
77.3. ULTRA-HIGH PRECISION CONTROL
77.3.3 System Identification and Modeling Successful ultra-high precision control design and implementation depend on the accuracy of system models. Thus, the dynamics of each component must be well-characterized. System identification techniques provide tools for this purpose [19]. A system or component transfer function, G(s), is typically represented in the frequency domain, as shown in Figure 77.21. Here X(s) is defined as the input to the system and Y(s) is the system output.
Figure 77.21 System input and output with resulting transfer function.
Model transfer functions, G(s), are derived from time histories of the input signal, x(t), and the output signal, y(t), using techniques discussed in the following sections. The order of the model should be as small as possible, preferably related to physical characteristics of the machine (e.g., drive shaft compliance or carriage inertia). Several software packages provide routines that may be used to examine a range of model orders and determine the best fit for a given data set (e.g., MATLAB). However, this numerical approach should not be blindly used as a substitute for thoroughly understanding the process or physical system.
Multiple Axes - Cross-Coupling Some machines have only a single axis or possess multiple axes that are not strongly coupled. For such machines single-axis system identification can be employed and is straightforward. The input to output transfer functions are easily found as discussed below. However, in many machining cases, such as grinding, several axes are involved in the machine control. If the dynamics of these axes are coupled, then cross-coupling effects must be considered. For example, one or more axes may be used to control force while another axis may be used to control velocity. Recent work addresses, in part, the decoupling of force and velocity loops in a grinding system by gain selection in each axis [12]. Several approaches for improved precision control with a cross-coupled system have been developed over the past 10 years. One recent technique with superior performance is a cross-coupling controller, where the error in one axis is coupled to the position control of the other axes to minimize contour errors [20].
Experimental Approaches In the development of system models for the controllers, amplifiers, and mechanical systems, several experimental approaches may be used to identify and verify component response. In this section the three most common approaches and the typical hardware configurations necessary to implement them are presented. Stepped Sinusoidal Response Stepped sinusoidal input is the preferred method for determining transfer functions of all system components because it permits detailed system analysis at all frequencies. It is, however, the slowest of the three techniques discussed here. To use this method, sinusoids of various frequencies are fed to the system, one frequency at a time. The response of the system is measured at each frequency, recording the magnitude and phase variations between the input and output signals. The recorded data are then used to construct accurate Bode plots for the system that represent the plant transfer function, G(s). Special purpose digital signal processing (DSP) equipment is available to sweep automatically through a range of frequencies and determine transfer functions in this manner. A coherence function of the data may be easily generated to assess the validity of the determined transfer function at various frequencies. The magnitude of the coherence function varies from zero to unity, representing low and high confidence in results, respectively. Step Input Response For first-order or second-order models, a step input response can be used to determine the system dynamics by applying system identification techniques on the input and output data. The sharpness of the actual step input will limit the frequency range of the identifiable response. Sharper steps have broader frequency spectra and, therefore, may be used to generate models valid over broader frequency ranges. Fairly sharp steps in voltage inputs can be easily generated, yielding excellent results for systems with electrical inputs, such as motors. A fast Fourier transform (FFT) is used to determine the system model's magnitude and phase from the step input response.
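One point of the stepped-sine procedure can be sketched numerically: drive the plant at a single frequency, then recover gain and phase by correlating the steady-state output with quadrature references. The first-order plant below is an assumed stand-in for the real hardware:

```python
import numpy as np

# One frequency point of a stepped-sine measurement (sketch).
tau = 0.01                      # assumed plant time constant, G(s) = 1/(tau*s + 1)
w0 = 50.0                       # test frequency, rad/s
T = 2 * np.pi / w0              # one period
t = np.arange(16000) * (T / 1000.0)   # 16 whole periods, 1000 samples each

# steady-state output of the plant (analytic here, measured in practice)
gain_true = 1.0 / np.sqrt(1.0 + (w0 * tau) ** 2)
phase_true = -np.arctan(w0 * tau)
y = gain_true * np.sin(w0 * t + phase_true)

# correlate against sin/cos at the drive frequency over whole periods
I = 2.0 * np.mean(y * np.sin(w0 * t))    # in-phase component
Q = 2.0 * np.mean(y * np.cos(w0 * t))    # quadrature component
gain_est = np.hypot(I, Q)                # one magnitude point of the Bode plot
phase_est = np.arctan2(Q, I)             # matching phase point
```

Repeating this at many frequencies builds up the Bode plot; averaging over whole periods is what makes the correlation insensitive to uncorrelated noise.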
A frequency domain model (transfer function) or a state-space model can be derived using a least-squares parameter estimation technique. One such method is known as ARX, AutoRegressive with eXternal input [19]. The ARX method minimizes the square of the error, e(t), in the following set of equations
y(t) = G(q)x(t) + H(q)e(t)

and

[Ĝ_N, Ĥ_N] = min Σ_{t=1}^{N} e²(t),  (77.40)
where G(q) represents a discretized version of G(s), H(q) is a discretized weighting of the residual error of the model, N is the number of data points taken, and Ĝ_N and Ĥ_N are the best estimates (using the least-squares relationship given in Equation 77.40) of G(q) and H(q), respectively. When using ARX procedures, several precautions must be taken. First, because ARX models tend to favor higher frequencies [26], the sampled system data should be filtered prior to identification, removing extraneous high frequency content. (The high frequency sources are typically from structural responses,
disturbances, or noise, which cannot be controlled.) Also, simulation of the identified system should be compared with the actual data to verify model accuracy, because the proper order model may not have been chosen. Some iteration on the model order may be required for an improved fit. As always, the lowest order model that acceptably incorporates the important system dynamics should be used. White Noise Response Many digital signal processing boards use white noise input to determine the frequency response of a system. This method will generate Bode plots and will yield higher order models than the step input approach. However, this technique may not be as accurate as the other techniques presented because the input signal may not be truly "white," that is, the generated input signal may not exhibit a uniform broadband spectrum. Frequency content may be lacking at certain frequencies in the input signal. Therefore, the model may be inaccurate at those frequencies. White noise input and system output are also run through FFT algorithms to obtain Bode plots of the transfer function. Limitations on the valid frequency range of the transfer function are based on the sampling time. Windowing techniques (e.g., Hamming, Hanning, etc.) are used to decrease signal leakage in the Fourier analysis and may increase the accuracy of the attainable FFT results.
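The least-squares fit behind the ARX method of Equation 77.40 can be sketched in a few lines. The first-order discrete plant, noise level, and data length below are hypothetical choices for illustration only:

```python
import numpy as np

# Minimal ARX sketch: choose the model parameters that minimize the
# sum of squared equation errors e(t), as in Equation 77.40.
rng = np.random.default_rng(0)
a_true, b_true = 0.9, 0.5            # assumed plant: y(t) = a*y(t-1) + b*x(t-1) + e(t)
N = 2000
x = rng.standard_normal(N)           # persistently exciting input
y = np.zeros(N)
for t in range(1, N):
    y[t] = a_true * y[t - 1] + b_true * x[t - 1] + 0.01 * rng.standard_normal()

# regressor matrix: each row is [y(t-1), x(t-1)]; solve in the
# least-squares sense for the parameter vector [a_hat, b_hat]
Phi = np.column_stack([y[:-1], x[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
```

This is the same minimization written in Equation 77.40, specialized to one output lag and one input lag; production tools (e.g., MATLAB's `arx`) add model-order selection and prefiltering on top of this core step.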
Figure 77.22 Location of input and output signals for system identification. (A DSP-based generator injects a swept sine wave; the laser interferometer closes the position loop.)

TABLE 77.4 Locations of Input and Output Signals for System Identification and Response. (For each plant/system to identify, Kp(s), Kv(s), G(s), the closed-loop velocity loop, the tachometer-to-position integrator via the laser interferometer, and the open- and closed-loop position loops, the table gives the corresponding input and output signal locations, labeled A through J in Figure 77.22.)
Experimental Set-Up Only a few specialized hardware items are necessary to obtain transfer functions. The required equipment includes a signal generator (preferably computer based), a signal amplifier, sensors, and signal processing capability. For the stepped-sine or white noise methods, a DSP instrument or DSP-PC board is useful for signal conditioning and transfer function analysis. The sensors required for these tests are the same as the feedback sensors listed in Section 77.3.2. A system, shown in Figure 77.22, depicts the signal injection point from a DSP-based instrument. The locations to collect input and output data for determining the component transfer functions are listed in Table 77.4. The system is connected to the CNC controller with a zero reference input (a regulator). The CNC controller is used to interpret the position from digital A and B quadrature (A quad B) signals out of the position encoder (seen in Figure 77.22 as the laser interferometer). Therefore, the position measurements (point J in Figure 77.22) are obtained through the digital to analog (D/A) output of the CNC controller. This analog position value is the integral of the tachometer signal. It should be noted that the tachometer and interferometer dynamics are significantly faster than those of the other system components and may, therefore, be ignored.
Signal Conditioning Because many sensor systems go through an analog to digital (A/D) conversion, anti-aliasing filters are needed to prevent frequencies above the Nyquist frequency from being falsely interpreted as lower frequencies (aliasing). Although elliptic and Chebyshev filters have steeper drop-off rates, a low-pass Butterworth filter is recommended because of its flat response in the pass-band. To increase the attenuation roll-off, cascaded, higher order Butterworth filters are recommended. Because these filters have dynamics that can affect system performance, their dynamic responses must be experimentally characterized. For example, a Butterworth filter is maximally flat. However, when it is realized, it may slightly amplify or attenuate a signal and generate a slight error in a position or velocity command signal. Such slight errors may not be tolerable in ultra-high precision machines. These types of deviations, if not well understood, can reduce the performance of the machine.
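The pass-band flatness and order-dependent roll-off that motivate the Butterworth choice can be read directly from its magnitude formula; the 1 kHz cutoff below is an arbitrary assumed value:

```python
import numpy as np

# Butterworth magnitude response: |H(jw)| = 1 / sqrt(1 + (w/wc)^(2n)).
# Maximally flat in the pass-band; raising the order n (cascading
# sections) steepens the roll-off without disturbing the pass-band.
def butterworth_mag(w, wc, n):
    return 1.0 / np.sqrt(1.0 + (w / wc) ** (2 * n))

wc = 2 * np.pi * 1000.0               # assumed 1 kHz cutoff, rad/s
w = np.array([0.1 * wc, wc, 10.0 * wc])
mag4 = butterworth_mag(w, wc, 4)      # 4th order: ~1 below wc, -3 dB at wc
mag8 = butterworth_mag(w, wc, 8)      # 8th order: same pass-band, faster roll-off
```

At a decade below cutoff both orders are flat to within fractions of a ppm, while a decade above cutoff the 8th-order response is four decades further down, which is the trade the text describes when recommending cascaded sections.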
77.3.4 Control Design Good mechanical design and modeling are two important components of a precision system design. However, to achieve the highest level of performance, analog compensators must be employed to increase the bandwidth of the machine and to improve its robustness. Typically, controllers are used to improve closed-loop system performance of both the velocity and position loops, where the velocity control loop is nested inside the position loop shown in Figure 77.20. This section presents various analog controller types used in precision machines. Design methods for these controllers are presented, along with their implications for the actual precision servo system.
Discussion of Controllers The controllers employed for ultra-high precision applications are typically embodied as standard analog configurations. This approach eliminates all issues related to digital control, such as sampling and quantization. The following discussion presents control elements used in both the velocity and position loops and outlines the basic design procedures for these elements. Velocity Loop Compensator Velocity control of a precision machine is critical in maintaining consistent and known surface finish during machining operations. Controlling velocity is also critical in maintaining consistent force trajectories while measuring or machining. The dynamic velocity loop compensator, Kv(s), is a combination of a lead controller and a PI controller. The lead compensation provides additional phase margin for the system, because the integrator causes a 90° phase lag. Large amounts of phase and gain margin are usually specified for ultra-high precision systems to ensure their robustness to various outside disturbances. The integrator is used to eliminate steady-state velocity error. This is important because machines are usually servocontrolled at constant feed velocities. Lead Compensator Design Procedure A lead compensator used to increase the phase margin for the velocity loop is depicted in Figure 77.23. Procedures for designing the controller to achieve the desired input to output voltage relationships are summarized in Equations 77.41 and 77.42 and Table 77.5. Such a compensator is designed in the following section for a diamond turning machine, where actual specifications for phase and gain margin are provided.
The procedure is started by selecting the maximum phase lead angle, φmax, and the frequency where it is to occur, ω0 (the geometric mean of the two corner frequencies). Once these two quantities have been selected, the variable α and subsequently ω1, ω2, τ1 and τ2 are determined. Finally, the values for C, R1, R2 and R3 are determined. Although the procedure is relatively straightforward, values for the three resistors and the capacitor must be chosen so that they are commercially available. For example, a resistance value of 10 kΩ, which is commercially available, is preferred to a value of 10.3456 kΩ, which is not. PI Compensator Design Figure 77.24 represents a PI compensator design for the velocity loop. The design procedure is summarized in Table 77.6 and Equations 77.43 and 77.44. As previously stated, the PI compensator yields zero steady-state error to a velocity trajectory that is constant, in other words, a constant feed rate.
Figure 77.23 Lead compensator diagram.

Figure 77.24 PI compensator diagram.
TABLE 77.5 Lead Compensator Design
Procedure: 1. Determine ω0 and φmax. 2. Calculate α from φmax. 3. Calculate ω1 and ω2 from α and ω0. 4. Calculate τ1 and τ2 from ω1 and ω2. 5. Choose C. 6. Calculate R1, R2 and R3 from τ1, τ2 and C.

TABLE 77.6 PI Compensator Design
Procedure: 1. Determine break frequency, ωb = 1/τ2. 2. Choose high frequency gain. 3. Calculate τ1 from τ2. 4. Choose C. 5. Calculate R1 and R2.
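The first steps of the lead-compensator procedure can be sketched numerically. The standard lead-network relation sin(φmax) = (1 − α)/(1 + α) is assumed here, and since the op-amp topology of Figure 77.23 is not reproduced in this text, the resistor sizing below uses the generic R = τ/C relation rather than that specific circuit; the 30° and 650 rad/s targets are the example values used later in the chapter:

```python
import math

# Lead design sketch: pick phi_max and w0, derive alpha, the corner
# frequencies, the time constants, and candidate component values.
phi_max = math.radians(30.0)          # desired maximum phase lead
w0 = 650.0                            # rad/s, frequency of maximum lead

# standard lead relation (assumed): sin(phi_max) = (1 - alpha)/(1 + alpha)
alpha = (1 - math.sin(phi_max)) / (1 + math.sin(phi_max))
w1 = w0 * math.sqrt(alpha)            # zero corner frequency
w2 = w0 / math.sqrt(alpha)            # pole corner frequency
tau1, tau2 = 1.0 / w1, 1.0 / w2       # time constants

C = 0.047e-6                          # choose a standard capacitor value
R1 = tau1 / C                         # generic R = tau/C sizing (assumed
R2 = tau2 / C                         # mapping, not the Figure 77.23 circuit)
```

Evaluating the lead network's phase, arctan(ω/ω1) − arctan(ω/ω2), at ω0 returns the requested 30°, confirming that ω0 is indeed the geometric mean of the two corners. The final step, rounding R1 and R2 to stocked values, then follows the commercial-availability guidance above.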
Band Reject Filters for Structural Resonances Finally, it is worth mentioning that machine tools and measurement machines have a variety of structural modes or resonant frequencies. These are often detected during the system identification phase of the compensator design. To eliminate these modes, band reject filters are employed. These analog filters can be tuned to attenuate a specific range of frequencies and can be extremely beneficial in improving the system performance. Input shaping is another available technique for suppressing resonance frequencies, often used in more flexible systems. The
reader is referred to [21] for more detail. Position Loop Compensator The position loop is a means of achieving the dimensional accuracy desired. For the position loop, the compensator, Kp(s), is a series combination of a lead controller and a proportional control element. The lead compensator is designed in the same manner as for the velocity loop. Because a natural integrator occurs in the position loop from the carriage velocity to the position sensor (see Figure 77.20), no additional integral control is necessary to achieve zero steady-state error to a constant command signal.
Advantages and Disadvantages There are some advantages to this type of analog compensator scheme. Analog compensators can be used with most CNC-type controllers to improve system bandwidth and precision, because most CNC controllers do not perform any dynamic compensation. Rather, they simply put out a voltage proportional to the position error of the machine tool. Another advantage of analog compensators is that their resolution is limited only by the tolerable signal-to-noise ratio, whereas digital systems are limited by quantization effects as well as noise. There are, of course, some limitations to using analog compensators, the most significant of which is the inability to provide adaptive variable gains based on parameter estimation techniques. However, ultra-high precision plants tend to have less variability by design, eliminating the need for adaptive compensators. The other major limitation of analog compensators is that they are less flexible in design and implementation than digital systems.

Performance Criteria System performance is typically measured in terms of bandwidth, steady-state error, rise time, overshoot, and damping. Bandwidth is important because a machine's bandwidth relates directly to its apparent stiffness. Steady-state error for both velocity and position loops must be eliminated to ensure appropriate surface finishes and part geometries, respectively. Specifications for rise time assure that higher spatial frequency characteristics of the part geometry are not eliminated (i.e., the machine must be able to corner reasonably well). Damping and control of overshoot are particularly important in designing the position controller to avoid removing too much material from a part during machining. Although the design approach presented here is nonadaptive, robust design techniques can still be applied. Robustness of design is necessary to assure stability and minimal performance variation of the closed-loop system in the presence of some uncertainty in the plant and environment. Robustness in stability can be characterized in terms of gain and phase margins. Design techniques for obtaining compensators robust in performance include the use of root sensitivity and gain plots [16], and other parametric plots.

Root Sensitivity Gain plots are an alternate graphical representation of the Evans root locus plot [15]. They explicitly graph the eigenvalue magnitude vs. gain in a magnitude gain plot and the eigenvalue angle vs. gain in an angle gain plot. The magnitude gain plot employs a log-log scale whereas the angle gain plot uses a semilog scale (with logarithms base 10). Although gain is the variable of interest, any parameter may be used in the geometric analysis. The root sensitivity of any system can be computed using the slopes of the gain plots [16]. In classical control theory, the root sensitivity, Sp, is defined as the relative change in a system root or eigenvalue, λi (i = 1, ..., n), with respect to a system parameter, p. Most often, the parameter analyzed is the forward loop gain, k. The root sensitivity with respect to gain is given by

S_k = (dλ_i / λ_i) / (dk / k).  (77.45)
Equation 77.45 is often introduced in determining the break points of the Evans root locus plot for single-input, single-output systems. At the break points, S_k becomes infinite as at least two of the n system eigenvalues undergo a transition from the real domain to the complex domain or vice versa. This transition causes an abrupt change in the relationship between the eigenvalue angle ∠λ and the gain k, yielding an infinite eigenvalue derivative, dλ/dk. The root sensitivity function, S_k, is a measure of the effect of parameter variations on the eigenvalues. An expression for the complex root sensitivity function is developed by employing a polar representation of the eigenvalues in the complex plane. Three assumptions are imposed: (i) the systems analyzed are lumped parameter, linear time-invariant (LTI) systems; (ii) there are no eigenvalues at the origin of the s-plane, i.e., λ_i ≠ 0 (although the eigenvalues may be arbitrarily close to the origin); and (iii) the forward scalar gain, k, is real and positive, i.e., k ∈ ℝ, k > 0. Based on these assumptions, we draw the following observations: the real component of the sensitivity function is

Re(S_k) = d ln|λ| / d ln(k),  (77.47)

and the imaginary component of the sensitivity function is given by

Im(S_k) = d∠λ / d ln(k).  (77.48)

These observations may be proven as follows. Equation 77.45 may be rewritten in terms of the derivatives of natural logarithms as

S_k = d ln(λ) / d ln(k).  (77.49)

The natural logarithm of the complex value λ is equal to the logarithm of the magnitude of λ plus the angle of λ multiplied by j = √−1. Thus, Equation 77.49 becomes

S_k = d[ln|λ| + j∠λ] / d ln(k).  (77.50)

Because j is a constant, Equation 77.50 may be rewritten as

S_k = d ln|λ| / d ln(k) + j d∠λ / d ln(k).  (77.51)
We next make the observation that the slope of the magnitude gain plot is the real component of S_k. The magnitude gain plot slope, M_m, is

M_m = d log|λ| / d log(k),  (77.52)

which may be rewritten as

M_m = d ln|λ| / d ln(k),  (77.53)

corresponding to Equation 77.47. Furthermore, the slope of the angle gain plot is the product of the imaginary component of S_k and the constant [log(e)]⁻¹. The angle gain plot slope, M_a, is

M_a = d∠λ / d log(k),  (77.54)

which may be rewritten as

M_a = d∠λ / d[log(e) ln(k)] = [1/log(e)] d∠λ / d ln(k),  (77.55)

and hence M_a is proportionally related to Equation 77.48 by [log(e)]⁻¹. The complex root sensitivity function is now expressed with distinct real and imaginary components employing the polar form of the eigenvalues. It follows from assumption (iii) that ln(k) is real. (In general, most parameters studied are real and this proof is sufficient. If, however, the parameter analyzed is complex, as explored in [16], it is straightforward to extend the above analysis.) The slopes of the gain plots provide a direct measure of the real and imaginary components of the root sensitivity and are available by inspection. The use of the gain plots with other traditional graphical techniques offers the control system designer important information for selecting appropriate system parameters.

Filtering and Optimal Estimation In the presence of noisy sensor signals, special filtering techniques can be applied. Kalman filtering techniques can be used to generate a filter for optimal estimation of the machine state based on sensor data. This approach also provides an opportunity to use multiple sensor information to reduce the effects of sensor noise. However, Kalman filters are generally not necessary because much of the sensor noise can be eliminated by good precision system design and thorough control of the machine's environment.

Hardware Considerations Selection of electronic components for the compensator is limited to commercially available resistors and capacitors. Although 1% tolerance is available on most components, the final design must be thoroughly tested to verify the response of the compensator. Off-the-shelf components may be used. However, their transfer functions must also be determined. For example, if a Butterworth filter is used, it cannot be assumed that its transfer function is perfectly flat. The filter's transfer function must be mapped. Any deviation from a flat frequency response can generate positional or velocity errors that are unacceptable because of stringent requirements.

Software and Computing Considerations Although this chapter is primarily concerned with analog control, there are some computing issues that are critical and must be addressed to assure peak performance of the closed-loop system. The tool (probe) trajectory in conjunction with the update rate must be carefully considered when designing a precision machine. If they are not an integral part of the control design, optimal system performance cannot be achieved. In many cases the CNC controller must have a fast update time corresponding to the trajectory and displacement profile desired. There are many CNC systems (including both stand-alone and PC-based controllers) on the market today, with servo update rates of 2000 Hz and higher, to perform these tasks. The controller must also be able to interface with the various sensors in the feedback design. Most of these controllers provide their own language or use g-codes, and can be controlled via PC-AT, PCMCIA, and VME bus structures. If care is not taken to interface the CNC controller to the analog portion of the machine, the resulting system will not perform to specifications.
77.3.5 Example: Diamond Turning Machine The most precise machine tools developed to date are the diamond turning machines (DTM) at the Lawrence Livermore National Laboratory (LLNL). These machine tools are single point turning machines, or lathes, that are capable of turning optical finishes in a wide variety of nonferrous materials. This section presents a basic control design for such a machine. To avoid confusion, the machine vibration modes are not included in the analyses, and the experimental results have been smoothed via filtering. The approach discussed here is the approach taken when designing analog compensators for diamond turning machines. Furthermore, the compensators developed here are in use in actual machine tools. Figure 77.25 depicts a typical T-based diamond turning machine similar to the systems employed at LLNL. The machine consists of two axes of motion, X and Z, as well as a spindle on which the part is mounted. The machine is called a T-based machine because the cross axis (X) moves across a single column that supports it. Thus the carriage and the single column frame form a "T." In this particular T-based configuration, the entire spindle (with part) is servocontrolled in the Z direction, which defines the longitudinal dimensions of the part; and the tool is servocontrolled in the X direction, defining the diametrical part dimensions. The DTM is equipped with a CNC controller which plans the tool trajectory and determines any position error for the tool. The output of the CNC controller is a signal proportional to the tool position error, that is, the output of the summing junction for the outer (position) loop in Figure 77.20 (point J, Figure 77.22). The objective of this example is to demonstrate the design of the
Figure 77.25 A T-based diamond turning machine.
velocity and position controllers, Kv(s) and Kp(s), respectively, for the Z-axis. Usually, both Kv(s) and Kp(s) are simple gains. However, the performance required for this DTM requires active controllers in both the position and velocity loops. Typically, the velocity compensator is designed first and, subsequently, the position compensator is designed. The velocity loop consists of the velocity compensator, a power amplifier (with unity gain that amplifies the power of the compensator signal), a unity gain tachometer mounted on the servomotor, and a summing junction. The position loop consists of the CNC controller, a position compensator, the dynamics from the closed-loop velocity system, a lead screw (with a 1 mm/rev pitch), and a (unity gain) laser interferometer providing position feedback. The lead screw acts as a natural integrator converting the rotational velocity of the servomotor into slide position. The laser interferometer provides a digital (A quad B) signal back to the CNC controller which, in turn, compares the actual location of the Z carriage to the desired location. This error is then converted to an analog signal by the CNC controller's digital to analog board and fed into the position compensator. For all practical purposes, the CNC controller may be considered a summing junction for the position loop. This assumption becomes invalid at frequencies that approach the controller's sample rate of 2 ms. To determine the transfer function of the servo system, sinusoidal signals of various frequencies are injected into the servoamplifier, and the resulting tachometer outputs are recorded and compared to the inputs. Bode plots are then generated from the empirical data, and a transfer function is estimated from these plots. Figure 77.26 shows a typical Bode plot for the Z-axis of the DTM.
From Figure 77.26, the transfer function of the servo drive, Gp(s), from the servo amplifier input voltage, V(s) (not including the velocity compensator), to the Z-axis motor velocity, ω(s), is estimated. The high gain of the system is due to the fact that the machine moves at relatively low velocities that must generate large feedback voltage signals.

Figure 77.26 Bode plots for a diamond turning machine Gp(s).

This section presents typical specifications (and the reasons behind the specifications) for both the velocity and position loops for a DTM. The specifications for the velocity loop are

1. Zero steady-state error to a constant input in velocity, V(s). This ensures that the machine tool can hold constant feedrates without any steady-state error.
2. Maximum of 2 volts input into the servoamplifier. This is a limitation of the amplifier. Based on the current required to drive the motor, no more than 2 volts may be used.
3. The sensitivity to the forward loop gain should be less than 1.0. This reduces deviations in closed-loop system performance if the forward loop gain varies.
4. Damping ratio ζ > 0.8. A small amount of overshoot (2%) in velocity is acceptable. (Note that ζ > 0.707 results in a closed-loop system that should not have any amplification. Thus the closed-loop Bode plot should be relatively flat below the bandwidth frequency.)
5. Maximize the closed-loop bandwidth of the system. Target velocity loop bandwidth, 15 Hz ≈ 95 rad/s.
6. Minimum gain margin 40 dB.
7. Minimum phase margin 90°.

The specifications for the position loop are

1. Zero steady-state positioning error.
2. The sensitivity to the forward loop gain should be less than 1.5. This reduces deviations in closed-loop system performance if the forward loop gain varies.
3. Damping ratio ζ > 0.9. A small amount of overshoot (1%) in position is acceptable because the typical motion values are on the order of 0.1 μin.
4. Steady-state position error to a step should be zero.
5. Minimum gain margin 60 dB.
6. Minimum phase margin 75°.
I .8
5
-
0
-
-5-
Vtlocitv Loo0 Led Co
,
-
>
.
-10101
10%
Compensator Design: Velocity Loop
.Velocrtv . . .Loop ....,PI Comnensator , . . . . . , ,.
.,r-v
-
7g ,
I
2 -170 -lea 101
.......
Figure 77.28
102
'fd
;-:E" '"":
. .10',Figure 77.29
102
o
104
13.3K
- -
-
20 10' Frequency (radlsec)
Fnqnency ( d s c c ) 10'
1.33K 4pl
,v
100
>
Velocity loop lead compensator frequency response. 10K
n
-
10.1
10'
-150-
The design procedure ofSection77.3.4isusedto determine the velocity loop PI compensator. The final design ofthe velocity loop PI compensator is of the same form as the PI compensator of Figure 77.24, with acapacitance of4pF andboth resistancevalues at 1.33kQ. These connponents are determined from the relationships of Equations 57.57 and 77.58. The measured frequency response of the PI compensator is shown in Figure 77.27. Note
in Figure 77.27 that the integrator is visible, with a low-frequency slope of -20 dB/decade and a DC phase of -90°. The lead compensator's final design is depicted in Figure 77.23, with R1 = 38.3 kΩ, R2 = 19.6 kΩ, R3 = 13.3 kΩ, and C = 0.0475 μF. The frequency response of the lead compensator is shown in Figure 77.28. Equations 77.59 and 77.60 are the governing relationships. A high-impedance summing junction is placed ahead of the lead compensator. The lead compensator is designed to yield approximately 30° of lead at a frequency of 650 rad/s. As shown in Equation 77.60, this goal was accomplished within an acceptable error band. An inverting amplifier is used in the compensator design, resulting in a phase shift of -180° in the Bode plot of Figure 77.28. This is corrected in the complete velocity compensator by the fact that the PI and lead compensators both invert their input signals, thus canceling the effects of the inversion when they are cascaded. Figure 77.29 represents the final complete velocity compensator. All of the components shown for this controller are commercially available, a critical point in designing a compensator. If the components are not commercially available, special (and expensive) components would have to be fabricated. Thus, it is critical to generate designs that can be realized with commercially available resistors and capacitors.

The component relationships of Equations 77.57 and 77.58 reduce to

    G_PI(s) = (1 + 0.00532 s) / (0.00532 s),   ω = 1/(RC) = 188 rad/s    (77.57)

    R2/R1 = 1.33 kΩ / 1.33 kΩ = 1    (77.58)

Figure 77.27  Velocity loop PI compensator response.
Figure 77.29  Velocity loop complete compensator.
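The 30° at 650 rad/s objective can be reproduced with the textbook lead-compensator sizing formulas. The sketch below uses the generic form (1 + aTs)/(1 + Ts) rather than the specific op-amp circuit of Figure 77.23, so the mapping to the resistor and capacitor values is not shown:

```python
import math

def lead_design(phi_max_deg: float, w_max: float):
    """Size a lead compensator (1 + a*T*s)/(1 + T*s) for a given
    maximum phase lead phi_max at frequency w_max (standard formulas)."""
    phi = math.radians(phi_max_deg)
    a = (1 + math.sin(phi)) / (1 - math.sin(phi))   # pole/zero spread
    T = 1.0 / (w_max * math.sqrt(a))                # pole time constant
    return a, T

a, T = lead_design(30.0, 650.0)
zero = 1.0 / (a * T)   # rad/s
pole = 1.0 / T         # rad/s
print(a, zero, pole)
```

For 30° of lead the spread a is exactly 3, and the zero and pole straddle 650 rad/s geometrically, which is the usual placement for maximum lead at the design frequency.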
The open-loop Bode plots for the velocity loop are given in Figure 77.30. From the open-loop Bode plots, it is clear that all of the gain and phase margin specifications have been met for the system. Also, the low-frequency response of the system indicates a type I system; thus there will be no steady-state error to a constant velocity input. Now the forward loop gain is determined using the root locus and parametric plot techniques. Figure 77.31 presents the root locus of the system as the loop gain is increased. The lead compensator's final design can be assessed by using the Bode plots in conjunction with the root locus plot. The pole and zero furthest in the left-hand plane are from the lead compensator. The second zero and the pole at the origin are from the PI compensator. Clearly from the plot, there is a range of gains that results in an underdamped system and may yield unacceptable damping for the velocity loop. To investigate further the appropriate choice of the forward loop gain, the root locus is shown as a set of gain plots, plotting pole magnitude and angle as functions of loop gain. Figure 77.32 depicts the velocity loop gain plots, showing various important aspects of the loop gain effects on the closed-loop system dynamics. In particular, low- and high-gain asymptotic behaviors are evident, as well as the gain at the break point of the root locus. Furthermore, because the natural frequency and damping can be related to the eigenvalue (or pole) magnitude and angle, the effects of loop gain variations on the system response speed and damping may be easily visualized. Clearly, from the gain plots, gains of slightly more than unity result in an overdamped closed-loop system. This places a lower limit on the loop gain. Figure 77.33 plots the magnitude of the root sensitivity as a function of gain. The only unacceptable loop gains are those in the vicinity of the break points. Thus, the design must not use a gain in the region of 1 < Kv < 2.

Figure 77.30  Velocity open-loop Bode plots.
Figure 77.31  Velocity closed-loop root locus.
Figure 77.32  Velocity loop compensator gain plots.
Figure 77.33  Velocity loop compensator root sensitivity.

THE CONTROL HANDBOOK

TABLE 77.7  Velocity Loop and Position Loop Controller Specifications

    Specification                  Velocity loop         Position loop
    Min. closed-loop bandwidth     150 Hz (~950 rad/s)   15 Hz (~95 rad/s)
    Steady-state error to a step   0                     0
    Min. damping ratio, ζ          0.8                   0.9
    Max. |Sk|                      1.0                   1.5
    Min. phase margin, φm          90°                   75°
    Min. gain margin, gm           40 dB                 60 dB
Figure 77.34  Velocity loop compensator phase margin vs. gain.
Figure 77.35  Velocity loop compensator closed-loop response, Kv = 2.
Figure 77.36  Lead compensator response.
Figure 77.37  Position open-loop frequency response.
Another parametric plot that can be generated is the phase margin as a function of loop gain (see Figure 77.34). Based on this plot, to achieve the appropriate phase margin the gain must be greater than 0.11. The maximum phase margin is at a gain of approximately 3. Also, at high gains, the phase margin approaches 90°, as expected, because the system has one more pole than zero. The gain margin of the system is infinite because the maximum phase shift is -90°. For this particular design, a value of 2 was chosen for the loop gain. It should be noted that the upper limit on the gain is power related, because higher gains require larger amounts of power. Thus, the target gain achieves all specifications with a minimal value. Figure 77.35 is the closed-loop Bode plot for the velocity loop. Although gain plots and parametric plots are not necessary for the actual design of the compensators (in particular, the loop gain), they do offer insight into the various trade-offs available.
For example, it is clear from the gain plots that gains much higher than 5 will not significantly affect the closed-loop system response because both lower-frequency poles are approaching the system transmission zeros. Higher gains only push the high-frequency pole (which is dominated by the lower-frequency dynamics) further out into the left-hand plane. Thus higher gains, which are costly to implement, do not significantly enhance the closed-loop system dynamics beyond a certain limit and may excite higher-frequency modes.
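A phase-margin-versus-gain sweep like Figure 77.34 can be scripted directly. The sketch below uses a hypothetical loop, chosen (like the compensated loop above) so that the phase never reaches -180° (infinite gain margin) and the phase margin approaches 90° at high gain:

```python
import numpy as np

# Phase margin vs. loop gain for a hypothetical loop
# L(s) = k*(0.1 s + 1)/(s*(s + 1)); one more pole than zero, and the
# phase stays above -180 deg, so the gain margin is infinite.
w = np.logspace(-2, 6, 400_000)                       # rad/s grid
L = (0.1 * 1j * w + 1.0) / ((1j * w) * (1j * w + 1.0))  # response for k = 1

def phase_margin(k: float) -> float:
    mag = k * np.abs(L)
    idx = int(np.argmin(np.abs(mag - 1.0)))   # crossover: |k L(jw)| = 1
    return 180.0 + float(np.degrees(np.angle(k * L[idx])))

for k in (1.0, 100.0, 1e4):
    print(round(phase_margin(k), 1))
```

Sweeping `k` over a log grid and plotting `phase_margin(k)` gives the parametric plot; the grid-based crossover search is crude but adequate for a monotonically decreasing loop magnitude.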
Compensator Design: Position Loop

The position loop is designed similarly to the velocity loop, by determining values for the components of the lead compensator and the loop gain of the proportional controller. The final design of the lead compensator has the same form as Figure 77.23, with R1 = 42.2 kΩ, R2 = 42.2 kΩ, R3 = 21.5 kΩ, and C = 0.22 μF; Equations 77.50 and 77.51 show its design values. The objective for the compensator is a maximum lead of 20° at a frequency of 75 rad/s. Figure 77.36 shows the lead compensator frequency response, and Figure 77.37 plots the position open-loop response with unity gain on the proportional controller. Once again, notice that the low-frequency slope of -20 dB/decade on the magnitude plot and the -90° phase in the phase plot indicate the natural free integrator in the position loop.

Figure 77.38  (a) and (b) Position loop root locus.
To determine the appropriate position loop gain, Kp, for the system, a root locus plot in conjunction with gain plots and other parametric plots is used. Figure 77.38a is the root locus for the position loop. Figure 77.38b is a close-up view of the dominant poles and zeros located close to the origin. The pole at the origin is the natural integrator of the system; the other poles and zeros are from the lead compensator and the dynamics of the velocity loop. Clearly, from the root locus, there is a range of gains that yields an unacceptable damping ratio for this system. The gain plots of the position loop, shown in Figure 77.39, indicate that gains of greater than approximately 3000 are unacceptable. It is worthwhile noting that the gain plots show the two break-out points (at gain values of approximately 75 and 2600) and the break-in point at an approximate gain of 590. Due to power limitations, the gain of the position loop is limited to a maximum of 1000. Thus, the root locus shown in Figure 77.38b is more appropriate for the remainder of this design. Figure 77.40 shows the root sensitivity of the position loop. From the root sensitivity plot, it is clear that values of gain close to the break points on the root locus result in root sensitivities that do not meet specifications. Figure 77.40 also shows that the ranges 600 ≤ Kp ≤ 1000 and 2000 ≤ Kp ≤ 3000 are unacceptable. Gain and phase margin can also be plotted as functions of the forward loop gain (Figures 77.41 and 77.42). The slope of the gain margin plot is -20 dB/decade, as gain margin is directly related to the gain. Note that the maximum phase margin occurs at a gain of approximately 100. Based on these two plots, the gain must be lower than approximately 3000. From the plots above, loop gains of either 300 or 1000 may be employed for the position compensator. The closed-loop frequency response is plotted in Figures 77.43 and 77.44 for the gain values of 300 and 1000, respectively. Either one of these designs results in acceptable performance. Note that the higher value of gain yields a higher bandwidth system.

Summary Results

To summarize the results from the velocity and position compensator designs, Tables 77.8-77.10 can be generated. As can be seen from Tables 77.8-77.10, the performance objectives have been met for both the velocity and position loops. The only decision that remains is whether to use the lower or higher value for the position loop gain. For this particular case, it was decided to use the lower value, which requires a smaller and less expensive power supply.
77.3.6 Conclusions

Ultra-high precision systems are becoming more commonplace in the industrial environment as competition increases the demand for higher precision at greater speeds. The two most important factors involved in the successful implementation of an ultra-high precision system are a solid design of the open-loop system and a good model of the system and any compensation components. This chapter presented some basic concepts critical for the design and implementation of ultra-high precision control systems, including some instrumentation concerns, system identification issues, and compensator design and implementation. Clearly, this chapter is limited in its scope, and there are many other details that must be considered before a machine can approach the accuracies of high-performance coordinate measurement machines or diamond turning machines. Such machines have been developed over many years and continue to improve with time and technology. The interested reader is referred to the references at the end of this chapter for further details on designing and implementing ultra-high precision systems.
Figure 77.39  Position loop compensator gain plots.
Figure 77.40  Position loop compensator root sensitivity.
Figure 77.41  Position loop gain margin.
Figure 77.42  Position loop phase margin.
Figure 77.43  Position closed-loop frequency response, Kp = 300.
Figure 77.44  Position closed-loop frequency response, Kp = 1000.
TABLE 77.8  Results: Velocity Loop

    Specification                  Design target         Value achieved
    Min. closed-loop bandwidth     150 Hz (~950 rad/s)   637 Hz (~4000 rad/s)
    Steady-state error to a step   0                     0
    Min. damping ratio, ζ          0.8                   -
    Max. |Sk|                      1.0                   -
    Min. phase margin, φm          90°                   90°
    Min. gain margin, gm           40 dB                 40 dB

TABLE 77.9  Results: Position Loop, Kp = 300

    Specification                  Design target         Value achieved
    Min. closed-loop bandwidth     15 Hz (~95 rad/s)     32 Hz (~200 rad/s)
    Steady-state error to a step   0                     0
    Min. damping ratio, ζ          0.9                   0.93
    Max. |Sk|                      1.5                   0.4
    Min. phase margin, φm          75°                   96°
    Min. gain margin, gm           60 dB                 87 dB

TABLE 77.10  Results: Position Loop, Kp = 1000

    Specification                  Design target         Value achieved
    Min. closed-loop bandwidth     15 Hz (~95 rad/s)     160 Hz (~1000 rad/s)
    Steady-state error to a step   0                     0
    Min. damping ratio, ζ          0.9                   1.0
    Max. |Sk|                      1.5                   1.2
    Min. phase margin, φm          75°                   85°
    Min. gain margin, gm           60 dB                 76 dB
77.3.7 Defining Terms

A quad B: Two phased (90 degrees) signals from an encoder.
A/D: Analog-to-digital conversion.
Abbe error: Measurement error which occurs when a gage is not collinear to the object measured.
aliasing: Identification of a higher frequency as a lower one, when the higher frequency is above the Nyquist frequency.
ARX: AutoRegressive with eXternal input identification technique.
CMM: Coordinate measuring machine.
CNC: Computer numerical control.
cross-coupling: One axis affecting another.
D/A: Digital-to-analog conversion.
DTM: Diamond turning machine.
dynamometer: High-precision and high-bandwidth multiaxis force sensor.
laser interferometer: Measurement instrument based on light wavelength interference.
LVDT: Linear variable differential transformer.
metglass: Magnetostrictive material (such as Terfenol).
metrology: The study of measurement.
piezoelectric effect: Strain-induced voltage, or voltage-induced strain.
stiction: Low-velocity friction.
THD: Total harmonic distortion.
ultra-high precision: Dimensional accuracies of 0.1 μin (1 μin = 10^-6 in).
white noise: Random signal with a uniform probability distribution.
References
[1] Methods for Performing Evaluation of Computer Numerically Controlled Machining Centers, ANSI/ASME B5.54, 1992.
[2] Blake, P., Bifano, T., Dow, T., and Scattergood, R., Precision Machining of Ceramic Materials, Am. Ceram. Soc. Bull., 67(6), 1038-1044, 1988.
[3] Bryan, J. B., The Abbe Principle Revisited: An Updated Interpretation, Precision Eng., 1(3), 129-132, 1989.
[4] Bryan, J. B., The Power of Deterministic Thinking in Machine Tool Accuracy, UCRL-91531, 1984.
[5] Bryant, M.D. and Reeves, R.B., Precise Positioning Problems Using Piezoelectric Actuators with Force Transmission Through Mechanical Contact, Precision Eng., 6, 129-134, 1984.
[6] Cetinkunt, S. and Donmez, A., CMAC Learning Controller for Servo Control of High Precision Machine Tools, American Control Conference, 2, 1976-80, 1993.
[7] Daneshmend, L.K. and Pak, H.A., Model Reference Adaptive Control of Feed Force in Turning, J. Dyn. Syst. Meas. Control, 108, 215-222, 1986.
[8] Dorf, R.C. and Kusiak, A., Eds., Handbook of Design, Manufacturing and Automation, John Wiley & Sons, New York, 1994.
[9] Falter, P. J. and Dow, T. A., Design and Performance of a Small-Scale Diamond Turning Machine, Precision Eng., 9(4), 185-190, 1987.
[10] Fornaro, R. J. and Dow, T. A., High-Performance Machine Tool Controller, Conf. Rec. IEEE Ind. Appl. Soc. Annual Meeting, 35(6), 1429-1439, 1988.
[11] Fussell, B.K. and Srinivasan, K., Model Reference Adaptive Control of Force in End Milling Operations, Am. Control Conf., 2, 1189-94, 1988.
[12] Jenkins, H.E., Kurfess, T.R., and Dorf, R.C., Design of a Robust Controller for a Grinding System, IEEE Conf. Control Appl., 3, 1579-84, 1994.
[13] Jeong, S. and Ro, P.I., Cutting Force-Based Feedback Control Scheme for Surface Finish Improvement in Diamond Turning, Am. Control Conf., 2, 1981-1985, 1993.
[14] Jones, R. and Richardson, J., The Design and Application of Sensitive Capacitance Micrometers, J. Phys. E: Scientific Instruments, 6, 589, 1973.
[15] Kurfess, T. R. and Nagurka, M. L., Understanding the Root Locus Using Gain Plots, IEEE Control Syst. Mag., 11(5), 37-40, 1991.
[16] Kurfess, T.R. and Nagurka, M.L., A Geometric Representation of Root Sensitivity, J. Dyn. Syst. Meas. Control, 116(2), 305-9, 1994.
[17] Li, C. J. and Li, S. Y., A New Sensor for Real-Time Milling Tool Condition Monitoring, J. Dyn. Syst. Meas. Control, 115, 285-290, 1993.
[18] Liang, S.Y. and Perry, S.A., In-Process Compensation for Milling Cutter Runout via Chip Load Manipulation, J. Eng. Ind. Trans. ASME, 116(2), 153-160, 1994.
[19] Ljung, L., System Identification: Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[20] Lo, C. and Koren, Y., Evaluation of Machine Tool Controllers, Proc. Am. Control Conf., 1, 370-374, 1992.
[21] Meckl, P. H. and Kinceler, R., Robust Motion Control of Flexible Systems Using Feedforward Forcing Functions, IEEE Trans. Control Syst. Technol., 2(3), 245-254, 1994.
[22] Pien, P.-Y. and Tomizuka, M., Adaptive Force Control of Two Dimensional Milling, Proc. Am. Control Conf., 1, 399-403, 1992.
[23] Slocum, A., Scagnetti, P., and Kane, N., Ceramic Machine Tool with Self-Compensated, Water-Hydrostatic, Linear Bearings, ASPE Proc., 57-60, 1994.
[24] Stute, G., Adaptive Control, Technology of Machine Tools, Machine Tool Controls, Lawrence Livermore Laboratory, Livermore, CA, 1980, vol. 4.
[25] Tsao, T.C. and Tomizuka, M., Adaptive and Repetitive Digital Control Algorithms for Noncircular Machining, Proc. Am. Control Conf., 1, 115-120, 1988.
[26] Tung, E. and Tomizuka, M., Feedforward Tracking Controller Design Based on the Identification of Low Frequency Dynamics, J. Dyn. Syst. Meas. Control, 115, 348-356, 1993.
[27] Tung, E., Anwar, G., and Tomizuka, M., Low Velocity Friction Compensation and Feedforward Solution Based on Repetitive Control, J. Dyn. Syst. Meas. Control, 115, 279-284, 1993.
[28] Ulsoy, A.G. and Koren, Y., Control of Machine Processes, J. Dyn. Syst. Meas. Control, 115, 301-8, 1993.
77.4 Robust Control of a Compact Disc Mechanism

Maarten Steinbuch, Philips Research Laboratories, Eindhoven, The Netherlands
Gerrit Schootstra, Philips Research Laboratories, Eindhoven, The Netherlands
Okko H. Bosgra, Mechanical Engineering Systems and Control Group, Delft University of Technology, Delft, The Netherlands

77.4.1 Introduction

A compact disc (CD) player is an optical decoding device that reproduces high-quality audio from a digitally coded signal recorded as a spiral-shaped track on a reflective disc [2]. Apart from the audio application, other optical data systems (CD-ROM, optical data drive) and combined audio/video applications (CD-interactive, CD-video) have emerged. An important research area for these applications is the possibility of increasing the rotational frequency of the disc to obtain faster data readout and shorter access time. For higher rotational speeds, however, a higher servo bandwidth is required that approaches the resonance frequencies of bending and torsional modes of the CD mechanism. Moreover, the system behavior varies from player to player because of manufacturing tolerances of CD players in mass production, which explains the need for robustness of the controller. Further, an increasing percentage of all CD-based applications is for portable use. Thus, additionally, power consumption and shock sensitivity play a decisive role in the performance assessment of controller design for CD systems. In this chapter we concentrate on the possible improvements of both the track-following and focusing behavior of a CD player, using robust control design techniques.

77.4.2 Compact Disc Mechanism

A schematic view of a CD mechanism is shown in Figure 77.45. The mechanism is composed of a turntable dc motor for the rotation of the CD, and a balanced radial arm for track following. An optical element is mounted at the end of the radial arm. A diode located in this element generates a laser beam that passes through a series of optical lenses to give a spot on the information layer of the disc. An objective lens, suspended by two parallel leaf springs, can move in a vertical direction to give a focusing action.

Figure 77.45  Schematic view of a rotating-arm compact disc mechanism.
Both the radial and the vertical (focus) position of the laser spot, relative to the track of the disc, have to be controlled actively. To accomplish this, the controller uses position-error information provided by four photodiodes. As input to the system, the controller generates control currents to the radial and focus actuators, which both are permanent-magnet/coil systems. In Figure 77.46 a block diagram of the control loop is shown. The difference between the radial and vertical track position and the spot position is detected by the optical pickup; it generates a radial error signal (e_rad) and a focus error signal (e_foc) via the optical gain K_opt. A controller K(s) feeds the system with the currents I_rad and I_foc. The transfer function from control currents to spot position is indicated by H(s). Only the position-error signals after the optical gain are available for measurement. Neither the true spot position nor the track position is available as a signal.
Figure 77.46  Configuration of the control loop.
In current systems, K(s) is formed by two separate proportional-integral-derivative (PID) controllers [2], [6], thus creating two single-input single-output (SISO) control loops. This is possible because the dynamic interaction between both loops is relatively low, especially from radial current to focus error. In these applications the present radial loop has a bandwidth of 500 Hz, while the bandwidth for the focus loop is 800 Hz. For more demanding applications (as discussed in the introduction) it is necessary to investigate whether improvements of the servo behavior are possible.
77.4.3 Modeling

A measured frequency response of the CD mechanism G(s) = K_opt H(s) is given in Figure 77.47 (magnitude only). It has been determined by spectrum analysis techniques. At low frequencies, the rigid-body mode of the radial arm and the lens-spring system (focus) can be easily recognized as double integrators in the 1,1 and 2,2 elements of the frequency response, respectively. At higher frequencies the measurement shows parasitic dynamics, especially in the radial direction. Experimental modal analyses and finite element calculations have revealed that these phenomena are due to mechanical resonances of the radial arm, mounting plate, and disc (flexible bending and torsional modes). With frequency-domain-based system identification, each element of the frequency response has been fitted separately using an output-error model structure with a least-squares criterion [10]. Frequency-dependent weighting functions have been used to improve the accuracy of the fit around the anticipated bandwidth of 1 kHz. The 2,1 element appeared to be difficult to fit because of the nonproper behavior in the frequency range of interest. Combination of the fits of each element resulted in a 37th-order multivariable model. Using frequency-weighted balanced reduction [H], [3], this model was reduced to a 21st-order model without significant loss in accuracy. The frequency response of the model is also shown in Figure 77.47.
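The reduction step can be illustrated with plain (unweighted) balanced truncation; the chapter uses a frequency-weighted variant, so the sketch below shows only the core idea (gramians via Lyapunov equations, balancing transform, truncation of small Hankel singular values), on a hypothetical two-mode system:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Plain balanced truncation of a stable state-space model to order r
    (unweighted sketch of the model-reduction step described above)."""
    Wc = solve_continuous_lyapunov(A, -B @ B.T)      # controllability gramian
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)    # observability gramian
    Lc = cholesky(Wc, lower=True)
    U, s, Vt = svd(Lc.T @ Wo @ Lc)
    hsv = np.sqrt(s)                                 # Hankel singular values
    T = Lc @ U / np.sqrt(hsv)                        # balancing transformation
    Tinv = np.linalg.inv(T)
    Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
    return Ab[:r, :r], Bb[:r], Cb[:, :r], hsv

# hypothetical stable system with two well-separated modes
A = np.diag([-1.0, -100.0])
B = np.array([[1.0], [0.1]])
C = np.array([[1.0, 0.1]])
Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r=1)
print(hsv)   # the second Hankel singular value is much smaller
```

States with small Hankel singular values contribute little to the input-output behavior, which is why the 37th-order fit can drop to 21st order "without significant loss in accuracy".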
Uncertainty Modeling

The most important system variations we want to account for are:

1. Unstructured difference between model and measurement
2. Uncertain interaction
3. Uncertain actuator gain
4. Uncertainty in the frequencies of the parasitic resonances

The first uncertainty stems from the fact that our nominal model is only an approximation of the measured frequency response because of imperfect modeling. Further, a very-high-order nominal model is undesirable in robust control design, since the design technique yields controllers with the state dimension of the nominal model plus weighting functions. For that reason, our nominal model describes only the rigid-body dynamics, as well as the resonance modes that are most relevant in the controlled situation. Unmodeled high-frequency dynamics and the unstructured difference between model and measurement are modeled as a complex-valued additive perturbation Δa, bounded by a high-pass weighting function. The remaining uncertainty sources 2, 3, and 4 are all intended to reflect how manufacturing tolerances manifest themselves as variations in the frequency response from player to player. By so doing, we are able to appreciate the consequences of manufacturing tolerances on control design. The uncertain interaction, item 2, is modeled using an antidiagonal output multiplicative parametric perturbation:

    y = (I + Δo) C x,   with Δo antidiagonal, having entries wo1 δo1 and wo2 δo2

The scalar weights wo1 and wo2 are chosen equal to 0.1, meaning 10% uncertainty. Dual to the uncertain interaction, the uncertain actuator gains, item 3, are modeled as a diagonal input multiplicative parametric perturbation:

    ẋ = A x + B (I + Δi) u,   with Δi = diag{wi1 δi1, wi2 δi2}

The gain of each actuator is perturbed by 5%. With this value, nonlinear gain variations in the radial loop due to the rotating-arm principle are also accounted for, along with further gain variations caused by variations in track shape, depth, and slope of the pits on the disc and varying quality of the transparent substrate protecting the disc [6]. Finally, we consider variations in the undamped natural frequency of parasitic resonance modes, item 4. From earlier robust control designs [11] it is known that the resonances at 0.8, 1.7, and 4.3 kHz are especially important. The modeling of the variations of the three resonance frequencies is carried out with the parametric uncertainty modeling toolbox [9]. The outcome of the toolbox is a linear fractional transformation description of the perturbed system with a normalized, block-diagonal, parametric perturbation structure Δpar = diag{δ1 I2, δ2 I2, δ3 I2}. Each frequency perturbation involves a real-repeated perturbation block of multiplicity two. The repeatedness stems from the fact that the frequencies ω0 appear quadratically in the A-matrix. The lower and upper bounds are chosen 2.5% below and above the nominal value, respectively; see [8] for more details.
77.4.4 Performance Specification

A major disturbance source for the controlled system is track position irregularities. Based on standardization of CDs, in the radial direction the disc specifications allow a track deviation of 100 μm (eccentricity) and a track acceleration at scanning velocity of 0.4 m/s², while in the vertical direction these values are 1 mm and 10 m/s², respectively. Apart from track position irregularities, a second important disturbance source is external mechanical shocks. Measurements show that during portable use disturbance signals occur in the frequency range from 5 up to 150 Hz, with accelerations (of the chassis) up to 50 m/s². For the CD player to work properly, the maximum allowable position error is 0.1 μm in the radial direction and 1 μm in the focus direction. In the frequency domain, these performance specifications can be translated into requirements on the shape of the (output) sensitivity function S = (I + GK)^-1. Note that the track irregularities involve time-domain constraints on signals, which are hard to translate into frequency-domain specifications. To obtain the required track disturbance attenuation, the magnitude of the sensitivity at the rotational frequency should be less
Figure 77.47  Measured frequency response of the CD mechanism (solid) and of the identified 21st-order model (dashed).
than 10^-3 in both the radial and the focus direction. Further, for frequencies up to 150 Hz, the sensitivity should be as small as possible to suppress the impact of mechanical shocks and higher harmonics of the disc eccentricity. To still satisfy these requirements for increasing rotational speed, the bandwidth has to be increased. As stated before, a higher bandwidth has implications for robustness of the design against manufacturing tolerances. But also, a higher bandwidth means a higher power consumption (very critical in portable use), generation of audible noise by the actuators, and poor playability of discs with surface scratches. Therefore, under the required disturbance rejection, we are striving towards the lowest possible bandwidth. In addition to these conflicting bandwidth requirements, the peak magnitude of the sensitivity function should be less than 2 to create sufficient phase margin in both loops. This is important because the controller has to be discretized for implementation. Note that the Bode sensitivity integral plays an important role in the trade-off between the performance requirements.
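The waterbed trade-off imposed by the Bode sensitivity integral can be checked numerically: for a stable loop with pole excess of at least two and a stable closed loop, the integral of ln|S(jω)| over all frequencies equals zero. The sketch below uses a hypothetical loop (without an integrator, so the integrand is well behaved at ω = 0), not the CD model:

```python
import numpy as np

# Numerical check of the Bode sensitivity integral for a hypothetical
# stable loop L(s) = 5/(s + 1)^2 (pole excess 2, no unstable poles):
#   integral_0^inf ln|S(jw)| dw = 0,  with S = 1/(1 + L).
w = np.linspace(0.0, 2000.0, 400_001)
L = 5.0 / (1j * w + 1.0) ** 2
S = 1.0 / (1.0 + L)                      # closed loop is stable for this L
y = np.log(np.abs(S))
integral = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(w)))  # trapezoid rule
print(integral)   # close to zero: low-frequency dips are paid back elsewhere
```

The small residual comes from truncating the frequency grid; pushing the sensitivity down below 150 Hz necessarily pushes it above 1 somewhere else, which is exactly the peak-magnitude concern raised above.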
77.4.5 μ-Synthesis for the CD Player

To start with, the performance specifications on S can be combined with the transfer function KS associated with the complex-valued additive uncertainty Δa. The performance trade-offs are then realized with a low-pass weighting function W1 on S, while W2 on KS reflects the size of the additive uncertainty and can also be used to force high roll-off at the input of the actuators. In this context, the objective of achieving robust performance [1], [5] means that for all stable, normalized perturbations Δa the closed loop is stable and

    || W1 [I + (G + W2 Δa) K]^-1 ||∞ < 1

This objective is exactly equal to the requirement that

    μΔ[Fl(P, K)] < 1,   with   Fl(P, K) = [ W2 K S   W2 K S ]
                                          [ W1 S     W1 S   ]    (77.64)
and μΔ is computed with respect to the structured uncertainty block Δ = diag{Δa, Δp}, in which Δp represents a fictitious 2 × 2 performance block. Note that this problem is related to the more common H∞ problem of minimizing ||Fl(P, K)||∞.
However, with this H∞ design formulation it is possible to design only for nominal performance and robust stability. An alternative that we use is to specify performance on the transfer function SG instead of S, reducing the resulting controller order by 4 [11]. Then the standard plant of Equation 77.65 results.
This design formulation has the additional advantage that it does not suffer from pole-zero cancellation as in the mixed sensitivity design formulations such as Equation 77.64. Using the D-K iteration scheme [1], [5], μ-controllers are synthesized for several design problems. Starting with the robust performance problem of Equation 77.65, the standard plant is augmented step by step with the parametric perturbations 2, 3, and 4 listed earlier. In the final problem, with the complete uncertainty model, we thus arrive at a robust performance problem having nine blocks: Δ = diag{δ1 I2, δ2 I2, δ3 I2, δi1, δi2, δo1, δo2, Δa, Δp}. The most important conclusions with respect to the use of μ-synthesis for this problem are:

- Convergence of the D-K iteration scheme was fast (in most cases two steps). Although global convergence of the scheme cannot be guaranteed, in our case the
resulting μ-controller did not depend on the starting controller.
- The final μ-controllers did have high order [39 for design (Equation 77.65) up to 83 for the full problem] due to the dynamic D-scales associated with the perturbations.
- Although most of the perturbations are real valued by nature, the assumption in design that all perturbations are complex valued did not introduce much conservativeness with respect to robust performance; see Figure 77.48. Note that the peak value of μ over frequency is 1.75, meaning that robust performance has not been achieved. This is due to the severe performance weighting W1.
- The most difficult aspect of design appeared to be the shaping of weighting functions such that a proper trade-off is obtained between the conflicting performance and robustness specifications.

In Figure 77.49 the sensitivity transfer function is shown for the full-problem μ-controller. For comparison, the result is also shown for two decentralized PID controllers achieving 800-Hz bandwidth in both loops. Clearly, the μ-controller achieves better disturbance rejection up to 150 Hz, has lower interaction, and a lower sensitivity peak value. The controller transfer functions are given in Figure 77.50. Clearly, the μ-controller has more gain at low frequencies and actively acts upon the resonance frequencies.
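The D-scale machinery behind the D-K iteration can be illustrated on a single matrix: for a structure of two 1×1 complex uncertainty blocks, μ(M) is bounded above by the minimum over scalings D of the scaled maximum singular value. The matrix below is hypothetical:

```python
import numpy as np

# Sketch of the D-scaling upper bound used in D-K iteration: for a 2x2
# matrix M and two 1x1 complex uncertainty blocks,
#   mu(M) <= min over d > 0 of sigma_max(D M D^-1),  D = diag(d, 1).
M = np.array([[0.5, 2.0], [0.05, 0.4]])

def scaled_norm(d: float) -> float:
    D = np.diag([d, 1.0])
    Dinv = np.diag([1.0 / d, 1.0])
    return np.linalg.norm(D @ M @ Dinv, 2)   # maximum singular value

unscaled = np.linalg.norm(M, 2)
best = min(scaled_norm(d) for d in np.logspace(-2, 2, 2001))
print(unscaled, best)   # the scaling tightens the bound considerably
```

In a full D-K iteration these scalings are fitted by dynamic (frequency-dependent) transfer functions and absorbed into the standard plant, which is why the final controllers inherit the high order quoted above.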
77.4.6 Implementation Results

The digital implementation brings along a choice for the sampling frequency of the discretized controller. Based on experience with previous implementations of SISO radial controllers [11] and on the location of the fastest poles in the multiple-input multiple-output (MIMO) μ-controller (~8 kHz), it is the intention to discretize the controller at 40 kHz. However, in the DSP environment used, this sampling frequency means that the order of the controller that can be implemented is, at most, 8. A higher order will lead to an overload of the DSP. This indicates a need for a dramatic controller order reduction, since there is a large gap between the practically allowable controller order (8) and the controller order that has been found using the μ-synthesis methodology (83). It is, however, unlikely that the order of these controllers can be reduced to 8 without degrading the nominal and robust performance very much. To be able to implement more complex controllers there are two possibilities:

1. Designing for a sampling frequency below 40 kHz
2. Using more than one DSP system and keeping the sampling frequency at 40 kHz
In this research we chose the latter, since it is expected that otherwise the sampling frequency would have to be lowered to such an extent that the performance would degrade too much (too much phase lag around
the bandwidth, leading to an unacceptable peak value in the sensitivity function). Although the second option introduces some additional problems, it probably gives a better indication of how much performance increase is possible with respect to PID-like control, without letting this performance be influenced too much by the restrictions of implementation. To do this, model reduction of the controller is applied. The first step involves balancing and truncating the controller states. This reduces the number of states from 83 to 53, without loss of accuracy. The next step is to split the μ-controller into two 1-input 2-output parts: K83 = [K83,1 K83,2]. Using a frequency-weighted closed-loop model reduction technique, the order of each of these single-input multiple-output (SIMO) parts can be reduced while the loop around the other part is closed. Each of the reduced-order parts can be implemented at 40 kHz, in a separate DSP system, thus facilitating a maximum controller order of 16 (under the additional constraint that each part can be, at most, of 8th order). The resulting controller is denoted K8 = [K8,1 K8,2]. The actual implementation of the controllers in the DSP systems has been carried out with the dSPACE Cit-Pro software [7]. Most problems occurring when implementing controllers are not essential to control theory, and most certainly not for H∞ and μ theory, but involve problems such as scaling inputs and outputs to obtain the appropriate signal levels and to ensure that the resolution of the DSP is optimally used. These problems can be solved in a user-friendly manner using the dSPACE software. The two DSP systems used are a Texas Instruments TMS320C30 processor with floating-point arithmetic and a Texas Instruments TMS320C25 16-bit processor with fixed-point arithmetic. The analog-to-digital converters (ADC) are also 16 bit and have a conversion time of 5 μs. The maximum input voltage is ±10 V.
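The balance-and-truncate step described above can be sketched as follows. This is a minimal illustration of plain balanced truncation on an invented 2-state system; the reduction actually used in the text is a frequency-weighted closed-loop variant [3], and none of the matrices below come from the CD-player controller.

```python
import numpy as np
from scipy.linalg import cholesky, solve_continuous_lyapunov

def balanced_truncation(A, B, C, r):
    """Reduce a stable model (A, B, C) to order r by balancing the
    controllability and observability Gramians and discarding the
    states with the smallest Hankel singular values."""
    Wc = solve_continuous_lyapunov(A, -B @ B.T)    # A Wc + Wc A' = -B B'
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)  # A' Wo + Wo A = -C' C
    Lc = cholesky(Wc, lower=True)                  # Wc = Lc Lc'
    U, s2, _ = np.linalg.svd(Lc.T @ Wo @ Lc)
    sigma = np.sqrt(s2)                            # Hankel singular values
    T = Lc @ U @ np.diag(sigma ** -0.5)            # balancing transformation
    Tinv = np.diag(sigma ** 0.5) @ U.T @ np.linalg.inv(Lc)
    Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
    return Ab[:r, :r], Bb[:r, :], Cb[:, :r], sigma

# Toy 2-state system whose second mode is nearly unreachable/unobservable:
A = np.diag([-1.0, -100.0])
B = np.array([[1.0], [1e-4]])
C = np.array([[1.0, 1e-4]])
Ar, Br, Cr, sigma = balanced_truncation(A, B, C, 1)
dc_full = (-C @ np.linalg.inv(A) @ B).item()
dc_red = (-Cr @ np.linalg.inv(Ar) @ Br).item()
print(sigma, dc_full, dc_red)   # second Hankel value is tiny; DC gains agree
```

When the discarded Hankel singular values are small, as in the 83-to-53 step reported in the text, truncation costs essentially no accuracy.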
The digital-to-analog converters (DAC) are 12 bit, have a conversion time of 3 μs, and also operate within the range of ±10 V. The first column K8,1(s) = [K11 K21]^T has been implemented in the TMS320C30 processor at a sampling frequency of 40 kHz. The dSPACE software provides for:
- A ramp-invariant discretization
- A modal transformation
- Generation of C code that can be downloaded to the DSP
The second column K8,2(s) = [K12 K22]^T is implemented in the TMS320C25 processor. Since this processor has fixed-point arithmetic, scaling is more involved; see also [12]. With the dSPACE software, the following steps have been carried out:
- A ramp-invariant discretization
- A modal transformation
- l1 scaling for the nonintegrator states
- State scaling for the integrator states such that their contribution is, at most, 20%
- Generation of DSPL code that can be downloaded to the DSP
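The l1 scaling step can be sketched in its simplest open-loop form (the closed-loop scaling of [12] is more involved). For a stable discrete-time system, the worst-case amplitude of state i over inputs bounded by 1 equals the l1 norm of the impulse response from input to that state, so dividing each state by this norm keeps it inside the fixed-point range. The example matrices are invented for illustration:

```python
import numpy as np

# l1-norm state scaling: bound each state's worst-case amplitude to 1
# so that it fits the fixed-point range [-1, 1] without overflow.
def l1_scale(Ad, Bd, Cd, n_terms=5000):
    norms = np.zeros(Ad.shape[0])
    h = Bd.copy()                      # h holds Ad^k Bd
    for _ in range(n_terms):
        norms += np.abs(h).sum(axis=1) # accumulate l1 norms per state
        h = Ad @ h
    T = np.diag(norms)                 # scaled state z = inv(T) x
    return np.linalg.solve(T, Ad @ T), np.linalg.solve(T, Bd), Cd @ T

# Example second-order system (matrices invented for illustration):
Ad = np.array([[0.9, 0.2], [0.0, 0.5]])
Bd = np.array([[1.0], [0.3]])
Cd = np.array([[1.0, -1.0]])
As, Bs, Cs = l1_scale(Ad, Bd, Cd)
# The input-output behavior is unchanged; only internal coordinates are.
dc = (Cd @ np.linalg.inv(np.eye(2) - Ad) @ Bd).item()
dc_s = (Cs @ np.linalg.inv(np.eye(2) - As) @ Bs).item()
print(dc, dc_s)                        # identical DC gains
```

After the similarity transformation each scaled state has unit worst-case amplitude, while the controller's transfer function is untouched.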
77.4. ROBUST CONTROL OF A COMPACT DISC MECHANISM
Figure 77.48 Complex (-) and real (- -) μ-bounds for the μ-controller on the standard plant, including all perturbations.
Figure 77.49 Nominal performance in terms of the sensitivity function with μ-controller (-) and two PID controllers (- -).
For the TMS320C25, a discretization at a sampling frequency of 40 kHz appeared to be too high, since it resulted in a processor load of 115%. For that reason, the sampling frequency of this DSP has been lowered to 34 kHz, yielding a processor load of 98.6%. The DSP systems have been connected to the experimental setup by means of two analog summing junctions that have been designed especially for this purpose; see Figure 77.51. When the external 2 x 8 μ-controller of the complete design is connected to the experimental setup, we can measure the achieved performance in terms of the frequency response of the sensitivity function. The measurements have been carried out using a Hewlett Packard 3562 Dynamic Signal Analyzer. Because this analyzer can measure only SISO frequency responses, each of the four elements of the sensitivity function has been determined separately.
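The two load figures quoted above are mutually consistent under the rough assumption that processor load scales linearly with sampling frequency for a fixed control law (an assumption; any frequency-independent overhead is ignored here):

```python
# Rough consistency check: with a fixed execution time per sample,
# processor load is proportional to the sampling frequency.
load_40k = 115.0                        # % load at 40 kHz (overloaded)
predicted_34k = load_40k * 34.0 / 40.0  # expected load after slowing down
print(predicted_34k)                    # 97.75, close to the reported 98.6
```

The small remaining gap to the reported 98.6% is plausibly per-sample overhead that does not shrink with the sampling period.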
The measurements are started at the same position on the disc each time. This position is chosen approximately halfway along the disc, since the model is identified here and the radial gain is most constant in this region. In Figure 77.52 the measured and simulated frequency response of the sensitivity function is shown. The off-diagonal elements are not very reliable, since the coherence during these measurements was very low because of small gains and nonlinearities, leading to bad signal-to-noise ratios. The nominal performance has also been tested in terms of the possibility to increase the rotational frequency of the CD. It appeared possible to achieve an increase in speed of a factor 4 (with an open-loop bandwidth of 1 kHz). Concluding this section, the measurements show that considerable improvements have been obtained in the suppression of track irregularities, leading to the possibility of increasing the rotational frequency of the disc to a level that has not been achieved
Figure 77.50 Frequency response of the μ-controller (-) and two PID controllers (- -).
Figure 77.51 The connection of both DSP systems to the experimental setup.
Figure 77.52 Measured frequency response of the input sensitivity function for the 2 x 8 reduced-order controller of the complete design (-) and the simulated output sensitivity function for the 83rd-order controller (- - -).
before. Nevertheless, the implemented performance differs on a few points from the simulated performance. It seems useful to exploit this knowledge to arrive at even better results in a subsequent controller synthesis. Notice also the work in [4], where a controller for the radial loop has been designed using QFT, directly based on the measured frequency responses.
77.4.7 Conclusions
In this chapter μ-synthesis has been applied to a CD player. The design problem involves time-domain constraints on signals and robustness requirements for norm-bounded structured plant uncertainty. Several different uncertainty structures of increasing complexity are considered. A μ-controller has been implemented successfully in an experimental setup using two parallel DSPs connected to a CD player.
References
[1] Balas, G.J., Doyle, J.C., Glover, K., Packard, A.K., and Smith, R., μ-Analysis and Synthesis Toolbox, MUSYN Inc., Minneapolis, MN, 1991.
[2] Bouwhuis, G. et al., Principles of Optical Disc Systems, Adam Hilger Ltd., Bristol, UK, 1985.
[3] Ceton, C., Wortelboer, P., and Bosgra, O.H., Frequency weighted closed-loop balanced reduction, in Proc. 2nd Eur. Control Conf., Groningen, June 26 - July 1, 1993, 697-701.
[4] Chait, Y., Park, M.S., and Steinbuch, M., Design and implementation of a QFT controller for a compact disc player, J. Syst. Eng., 4, 107-117, 1994.
[5] Doyle, J.C., Advances in Multivariable Control, lecture notes of the ONR/Honeywell Workshop, Honeywell, MN, 1984.
[6] Draijer, W., Steinbuch, M., and Bosgra, O.H., Adaptive control of the radial servo system of a compact disc player, IFAC Automatica, 28, 455-462, 1992.
[7] dSPACE GmbH, DSP-CitPro software package, documentation of dSPACE GmbH, Paderborn, West Germany, 1989.
[8] Groos, P.J.M. van, Steinbuch, M., and Bosgra, O.H., Multivariable control of a compact disc player using μ-synthesis, in Proc. 2nd Eur. Control Conf., Groningen, June 26 - July 1, 1993, 981-985.
[9] Lambrechts, P., Terlouw, J.C., Bennani, S., and Steinbuch, M., Parametric uncertainty modeling using LFTs, in Proc. 1993 Am. Control Conf., June 1993, 267-272.
[10] Schrama, R., Approximate Identification and Control Design, Ph.D. thesis, Delft University of Technology, Delft, The Netherlands, 1992.
[11] Steinbuch, M., Schootstra, G., and Bosgra, O.H., Robust control of a compact disc player, IEEE 1992 Conf. Decision Control, Tucson, AZ, 2596-2600.
[12] Steinbuch, M., Schootstra, G., and Goh, H.T., Closed loop scaling in fixed-point digital control, IEEE Trans. Control Syst. Technol., 2(4), 312-317, 1994.
[13] Wortelboer, P., Frequency Weighted Balanced Reduction of Closed-loop Mechanical Servo-systems, Ph.D. thesis, Delft University of Technology, Delft, The Netherlands, 1994.
SECTION XVII Electrical and Electronic Control Systems
Power Electronic Controls
George C. Verghese, Massachusetts Institute of Technology
David G. Taylor, Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA
Thomas M. Jahns, GE Corporate R&D, Schenectady, NY
Rik W. De Doncker Silicon Power Corporation, Malvern, PA
78.1 Dynamic Modeling and Control in Power Electronics ............ 1413
    Introduction • Prototype Converters • Dynamic Modeling and Control • Extensions • Conclusion
    References ............ 1423
    Further Reading ............ 1424
78.2 Motion Control with Electric Motors by Input-Output Linearization ............ 1424
    Introduction • Background • DC Motors • AC Induction Motors • AC Synchronous Motors • Concluding Remarks
    References ............ 1437
78.3 Control of Electrical Generators ............ 1437
    Introduction • DC Generators • Synchronous Generators • Induction Generators • Concluding Remarks
    References ............ 1451
78.1 Dynamic Modeling and Control in Power Electronics
George C. Verghese, Massachusetts Institute of Technology
78.1.1 Introduction
This chapter¹ is written with the following purposes in mind:
- To describe the objectives, features, and constraints that characterize power electronic converters, emphasizing those aspects that are relevant to control.
- To outline the principles by which tractable dynamic models are obtained for power electronic converters; such models are required for application of the various control design approaches and techniques described throughout this handbook.
- To indicate how controls are typically designed and implemented in power electronics.
- To suggest to practitioners and researchers in both control and power electronics that power electronic
¹The author is grateful to the following people for helpful comments: Steven Leeb, Bernard Lesieutre, Piero Maranesi, David Perreault, Seth Sanders, Aleksandar Stankovic, and Joseph Thottuvelil. For invaluable assistance with the figures, thanks are due to Deron Jackson and Steven Leeb. 0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
systems constitute an interesting and important testbed for control. Much of this chapter is distilled from the more detailed development in [3]. We begin, in this section, with an introduction to the issues that shape power electronics. The following section then describes some prototypical power electronic converters. The final two sections present more detailed discussions of dynamic modeling and control in power electronics, focusing first on a particular converter as a case study, and moving on to some extensions. The role of averaged models and sampled-data models is highlighted.
What is Power Electronics? Power electronics is concerned with high-efficiency conversion of electric power, from the form available at the input or power source, to the form required at the output or load. Most commonly, one talks of AC/DC, DC/DC, DC/AC, and AC/AC converters, where "AC" here typically refers to nominally sinusoidal voltage waveforms, while "DC" refers to nominally constant voltage waveforms. Small deviations from nominal are tolerable. An AC/DC converter (which has an AC power source and a DC load) is also called a rectifier, and a DC/AC converter is called an inverter. Applications of power electronics can be as diverse as high-voltage DC (HVDC) bulk-power transmission systems involving rectifiers and inverters rated in megawatts, or motor drives rated at a few kilowatts, or 50-W power supplies for electronic equipment.
The Dictates of High Efficiency High efficiency reduces energy costs, but just as importantly it reduces the amount of dissipated heat that must be removed from the power converter. Efficiencies of higher than 99% can be obtained in large,
high-power systems, while small, low-power systems may have efficiencies closer to 80%. The goal of high efficiency dictates that the power processing components in the circuit be close to lossless. Switches, capacitors, inductors, and transformers are therefore the typical components in a power electronic converter. The switches are operated cyclically, and serve to vary the circuit interconnections (the "topological state" of the circuit) over the course of a cycle. The capacitors and inductors perform filtering actions, regulating power flows by temporarily storing or supplying energy. The transformers scale voltages and currents, and also provide electrical isolation between the source and load. Ideal switches, capacitors, inductors, and transformers do not dissipate power, and circuits comprising only such elements do not dissipate power either (provided that the switching operations do not result in impulsive currents or voltages, a constraint that is respected by power converters). In particular, an ideal switch has zero voltage across itself in its on (or closed, or conducting) state, zero current through itself in its off (or open, or blocking) state, and requires zero time to make a transition between these two states. Its power dissipation is therefore always zero. Of course, practical components depart from ideal behavior, resulting in some power dissipation.
Semiconductor Switches A switch in a power electronic converter is implemented via one or a combination of semiconductor devices.
The most common such devices are:
- diodes
- thyristors, which may be thought of as diodes with an additional gate terminal for control, and which are of various types (for example, silicon controlled rectifiers or SCRs, bidirectional thyristors called TRIACs that function as antiparallel SCRs, and gate turn-off thyristors or GTOs)
- bipolar junction transistors (BJTs, controlled via an appropriate drive at the base, and designed for operation in cutoff or saturation, rather than in their linear, active range), and insulated gate bipolar transistors (IGBTs, controlled via the gate)
- metal-oxide-semiconductor field effect transistors (MOSFETs, controlled via the gate)
The power loss associated with such switches comes from a nonzero voltage drop when they are closed, a nonzero leakage current when they are open, and a finite transition time from closed to open or vice versa, during which time both the voltage and current may be significant simultaneously. A higher switching frequency generally implies a more compact converter, since smaller capacitors, inductors, and transformers can be used to meet the specified circuit characteristics. However, the higher frequency also means higher switching losses associated with the increased frequency of switch transitions, as well as other losses and limitations associated with high-frequency operation of the various components. Switching frequencies above the audible range are desirable for many applications.
Controlling the Switches Each type of semiconductor switch is amenable to a characteristic mode of control. Diodes are at one extreme, as they cannot be controlled; they conduct or block as a function solely of the current through them or the voltage across them, so neither their turn-on nor turn-off can be directly commanded by a control action. For SCRs and TRIACs, the turn-off happens as for a diode, but the turn-on is by command, under appropriate circuit conditions. For BJTs, IGBTs, MOSFETs, and GTOs, both the turn-on and turn-off occur in response to control actions, provided circuit conditions are appropriate. The choice of switch implementation depends on the requirements of each particular converter. For instance, the same circuit topology that is used for rectification can often be used for inversion, after appropriate modification of the switches and/or of the way they are controlled. The only available control decisions are when to open and close the switches. It is by modulating the instants at which the switches are opened and closed that the dynamic behavior of a power electronic converter is regulated. Power electronic engineers have invented clever mechanisms for implementing the types of modulation schemes needed for feedback and feedforward control of power converters.
78.1.2 Prototype Converters
This section briefly describes the structure and operating principles of some basic power electronic converters. Many practical converters are directly derived from or closely related to one of the converters presented here. Also, many power electronic systems involve combinations of such basic converters. For instance, a high-quality power supply for electronic equipment might comprise a unity-power-factor, pulse-width-modulated (PWM) rectifier cascaded with a PWM DC/DC converter; a variable-frequency drive for an AC motor might involve a rectifier followed by a variable-frequency inverter.
High-Frequency PWM DC/DC Converters Given a DC voltage of value V (which can represent an input DC voltage, or an output DC voltage, or a DC difference between input and output voltages), we can easily use a controlled switch to "chop" the DC waveform into a pulse waveform that alternates between the values V and 0 at the switching frequency. This pulse waveform can then be lowpass-filtered with capacitors and/or inductors that are configured to respond to its average value, i.e., its DC component. By controlling the duty ratio of the switch, i.e., the fraction of time that the switch is closed in each cycle, we can control the fraction of time that the pulse waveform takes the value V, and thereby control the DC component of this waveform. The preceding description applies directly to the simple buck (or voltage step-down) converter illustrated schematically in Figure 78.1. The load is taken to be just a resistor R, for simplicity. If the switch is operated periodically with a constant duty ratio D, and assuming the switch and diode to be ideal, it is easy to see that the converter settles into a periodic steady state in which the
78.1. DYNAMIC MODELING AND CONTROL IN POWER ELECTRONICS
Figure 78.1 The basic principle of high-frequency PWM DC/DC conversion is illustrated here for the case of a buck converter with a resistive load. The switch operation converts the DC input voltage into a pulse waveform. The switching frequency is chosen much higher than the cutoff frequency of the lowpass LC filter, so the output voltage is essentially DC. If the switch is operated periodically with a constant duty ratio D, the converter settles into a periodic steady state in which the output voltage is DV plus a small switching-frequency ripple.

average output voltage is DV. If the switching frequency is high enough relative to the cutoff frequency of the lowpass filter, then the switching-frequency component at the output will be greatly attenuated. The output voltage will then be the DC value DV plus some small switching-frequency ripple superimposed on it. The buck converter is the simplest representative of a class of DC/DC converters based on lowpass filtering of a high-frequency pulse waveform. These converters are referred to as switching regulators or switched-mode converters (to distinguish them from linear voltage regulators, which are based on transistors operating in their linear active range, and which are therefore generally less efficient). Pulse-width modulation (PWM), in which the duty ratio is varied, forms the basis for the regulation of switched-mode converters, so they are also referred to as high-frequency PWM DC/DC converters. Switching frequencies in the range of 15 to 300 kHz are common.

The boost (or voltage step-up) converter in Figure 78.2 is a more complicated and interesting high-frequency PWM DC/DC converter, and we shall be examining it in considerably more detail. Here Vin denotes the voltage of a DC source, while the voltage across the load (again modeled for simplicity as being just a resistor) is essentially a DC voltage Vo, with some small switching-frequency ripple superimposed on it. The values of L and C are chosen such that the ripple in the output voltage is a suitably small percentage (typically < 5%) of the nominal load voltage. The left terminal of the inductor is held at a potential of Vin relative to ground, while its right terminal sees a pulse waveform that is switched between 0 (when the transistor is on, with the diode blocking) and the output voltage (when the transistor is off, with the diode conducting). In nominal operation, the transistor is switched periodically with duty ratio D, so the average potential of the inductor's right terminal is approximately (1 - D)Vo. A periodic steady state is attained only when the inductor has zero average voltage across itself, i.e., when

Vin = (1 - D)Vo    (78.1)

Otherwise, the inductor current at the end of a switching cycle would not equal that at the beginning of the cycle, which contradicts the assumption of a periodic steady state. Since 0 < D < 1, we see from Equation 78.1 that the output DC voltage is higher than the input DC voltage, which is why this converter is termed a "boost" converter.

Figure 78.2 The boost converter shown here is a more complex high-frequency PWM DC/DC converter. The average voltage across the inductor must be zero in the periodic steady state that results when the transistor is switched periodically with duty ratio D. Also, if the switching frequency is high enough, the output voltage is essentially a DC voltage Vo. It follows from these facts that Vo = Vin/(1 - D) in the steady state, so the boost converter steps up the DC input voltage to a higher DC output voltage.

Other High-Frequency PWM Converters Appropriate control of a high-frequency PWM DC/DC converter also enables conversion between waveforms that are not DC, but that are nevertheless slowly varying relative to the switching frequency. If, for example, the input is a slowly varying unidirectional voltage (such as the waveform obtained by rectifying a 60-Hz sinewave) while the converter is switched at a much higher rate, say 50 kHz, then we can still arrange for the output of the converter to be essentially DC. The high-frequency PWM rectifier in Figure 78.3 is built around this idea, and comprises a diode bridge followed by a boost converter. The bridge circuit rectifies the AC supply, producing (for the example of the 60-Hz supply mentioned above) the unidirectional voltage vin(t) = V |sin 120πt| of period Ts = 1/120 sec from the AC voltage V sin 120πt. The resulting voltage is applied to the boost converter. The switching of the transistor (at 50 kHz, for example) is governed by a fast, inner current-control loop which causes the inductor current to follow a reference that is a scaled replica of vin(t), namely the rectified sinusoid ρvin(t). The current drawn by the diode bridge from the AC supply is therefore sinusoidal and in phase with the supply voltage, leading to operation at essentially unity power factor. The scale factor ρ that relates the inductor current reference to vin(t) is dynamically adjusted by the slow, outer voltage-control loop, so as to produce a DC output voltage (corrupted by a slight 120-Hz ripple and even less 50-kHz ripple). Unity-power-factor PWM rectifiers of this sort are becoming more common as front ends in power supplies for computers and other electronic equipment, because of the desire to extract as much power as possible from the wall socket, while meeting increasingly stringent power quality standards for such equipment.

Figure 78.3 A unity-power-factor, high-frequency PWM rectifier, comprising a diode bridge followed by a boost converter. The fast, inner current-control loop of the boost converter governs the switching, and causes the inductor current to follow a reference ρvin(t) that is a scaled version of the rectified input voltage vin(t). The scale factor ρ is dynamically adjusted by the slow, outer voltage-control loop so as to obtain an essentially DC output voltage.
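The boost converter's steady-state relation Vo = Vin/(1 - D) can be checked by directly simulating the two switched topologies. The sketch below uses forward-Euler integration with invented component values, and idealizes the diode as conducting whenever the transistor is off (continuous conduction is assumed):

```python
# Forward-Euler simulation of an ideal boost converter in continuous
# conduction, checking that the output settles near Vin/(1 - D) = 24 V.
# Component values are illustrative, not taken from the text.
Vin, D = 12.0, 0.5                 # input voltage [V], duty ratio
L, C, R = 100e-6, 100e-6, 10.0     # inductor [H], capacitor [F], load [ohm]
fs = 50e3                          # switching frequency [Hz]
dt = (1.0 / fs) / 100              # 100 Euler steps per switching cycle

iL, vC = 0.0, 0.0
samples = []
for cycle in range(2500):          # 50 ms of simulated time
    for k in range(100):
        if k < 100 * D:            # transistor on: inductor charges from Vin
            diL = Vin / L
            dvC = -vC / (R * C)
        else:                      # transistor off: inductor feeds the output
            diL = (Vin - vC) / L
            dvC = iL / C - vC / (R * C)
        iL += diL * dt
        vC += dvC * dt
    if cycle >= 2490:              # sample the last 10 cycles
        samples.append(vC)
avg_vo = sum(samples) / len(samples)
print(avg_vo)                      # near Vin/(1 - D) = 24 V
```

The simulated output settles to roughly twice the input voltage, as Equation 78.1 predicts for D = 0.5, with only a small switching-frequency ripple.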
In a high-frequency PWM inverter, the situation is reversed. The heart of it is still a DC/DC converter, and the input to it is DC. However, the switching is controlled in such a way that the filtered output is a slowly varying rectified sinusoid at the desired frequency. This rectified sinusoid can then be "unfolded" into the desired sinusoidal AC waveform, through the action of additional controllable switches arranged in a bridge configuration. In fact, both the chopping and unfolding functions can be carried out by the bridge switches, and the resulting high-frequency PWM bridge inverter is the most common implementation, available in single-phase and three-phase versions. These inverters are often found in drives for AC servo-motors, such as the permanent-magnet synchronous motors (also called "brushless DC" motors) that are popular in robotic applications. The inductive windings of the motor perform all or part of the electrical lowpass filtering in this case, while the motor inertia provides the additional mechanical filtering that practically removes the switching-frequency component from the mechanical motion.
Other Inverters Another common approach to constructing inverters again relies on a pulse waveform created by chopping a DC voltage of value V, but with the frequency of the pulse waveform now equal to that of the desired AC waveform, rather than much higher. Also, the pulse waveform is now generally caused (again through controllable switches configured in a bridge arrangement) to have a mean value of zero, taking values of V, 0, and -V, for instance. Lowpass filtering of this pulse waveform to keep only the fundamental and reject harmonics yields an essentially sinusoidal AC waveform at the switching frequency. The amplitude of the sinusoid can be controlled by varying the duty ratio of the switches that generate the pulse waveform; this may be thought of as low-frequency PWM. It is easy to arrange for the pulse waveform to have no even harmonics, and more elaborate design of the waveform can eliminate designated low-order (e.g., third, fifth, and seventh) harmonics, in order to improve the effectiveness of the lowpass filter. This sort of inverter might be found in variable-frequency drives for large AC motors, operating at power levels where the high-frequency PWM inverters described in the previous paragraph would not
be practical (because of limitations on switching frequency that become dominant at higher power levels). The lowpass filtering again involves using the inductive windings and inertia of the motor.
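The harmonic claims above can be made concrete. A zero-mean quasi-square wave that equals +V on (α, π - α) and -V on (π + α, 2π - α) has Fourier sine coefficients b_n = (4V/nπ) cos(nα) for odd n and zero for even n, so the classic choice α = 30° cancels the third harmonic. A numerical check, with an illustrative amplitude:

```python
import math

# Fourier coefficients of a zero-mean quasi-square inverter waveform:
# +V on (alpha, pi - alpha), -V on (pi + alpha, 2*pi - alpha), 0 elsewhere.
V = 100.0                  # illustrative amplitude [V]
alpha = math.pi / 6        # 30 degrees: cancels the 3rd harmonic

def b(n, N=20000):
    """Numerical Fourier sine coefficient b_n, midpoint rule."""
    acc = 0.0
    for k in range(N):
        th = 2 * math.pi * (k + 0.5) / N
        if alpha < th < math.pi - alpha:
            f = V
        elif math.pi + alpha < th < 2 * math.pi - alpha:
            f = -V
        else:
            f = 0.0
        acc += f * math.sin(n * th)
    return acc * 2 / N     # equals (1/pi) * integral over one period

for n in (1, 3, 5, 7):     # odd harmonics follow (4V/n/pi)*cos(n*alpha)
    print(n, round(b(n), 3), round(4 * V / (n * math.pi) * math.cos(n * alpha), 3))
print(2, round(b(2), 3))   # even harmonics vanish by half-wave symmetry
```

Extending the same idea, solving cos(n·α_i) conditions for several switching angles per half-cycle eliminates several designated low-order harmonics at once, which is the "more elaborate design" the text refers to.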
Resonant Converters There is an alternative approach to controlling the output amplitude of a DC/AC converter, such as that presented in the previous paragraph. Rather than varying the duty ratio of the pulse waveform, a resonant inverter uses frequency variations. In such an inverter, a resonant bandpass filter (rather than a lowpass filter) is used to extract the sinewave from the pulse waveform; the pulse waveform no longer needs to have zero mean. The amplitude of the sinewave is strongly dependent on how far the switching frequency is from resonance, so control of the switching frequency can be used to control the amplitude of the output sinewave. One difficulty, however, is that variations in the load will lead to modifications of the resonance characteristics of the filter, in turn causing significant variations in the control characteristics. If the sinusoidal waveform produced by a resonant inverter is rectified and lowpass filtered, what is obtained is a resonant DC/DC converter, as opposed to a PWM DC/DC converter. This form of DC/DC converter can have lower switching losses and generate less electromagnetic interference (EMI) than a typical high-frequency PWM DC/DC converter operating at the same switching frequency, but these advantages come at the cost of higher peak currents and voltages, and therefore higher component stresses.
Phase-Controlled Converters We have already mentioned a diode bridge in connection with Figure 78.3, and noted that it converts an AC waveform into a unidirectional or rectified waveform. Using controllable switches instead of the diodes allows us to partially rectify a sinusoidal AC waveform, with subsequent lowpass filtering to obtain an essentially DC waveform at a specified level. This is the basis for phase-controlled rectifiers, which are used as drives for DC motors or as battery-charging circuits. A typical configuration using thyristors is shown in Figure 78.4.
The associated load-voltage waveform vo(t) is drawn for the case where the load current io(t) stays strictly positive, as would happen with an inductive load. A thyristor can be turned on ("fired") via a gate signal whenever the voltage across the thyristor is positive; the thyristor turns off when the voltage across it reverses, or the forward current through it falls to zero. The control variable is the firing angle or delay angle, α, which is the (electrical) angle by which the firing of a thyristor is delayed, beyond the point at which it becomes forward biased. The role of α in shaping the output voltage is evident from the waveform in Figure 78.4. The average voltage at the output of the converter is (2V/π) cos α. If the admittance of the load has a lowpass characteristic, then the steady-state current through it will be essentially DC, at a level determined by the average output voltage of the converter. The load current in the circuit of Figure 78.4 can never become negative, because of the orientation of the thyristors. Nevertheless, the converter can operate as an inverter too. For example, if
the load current is essentially constant over the course of a cycle (as it would be with a heavily inductive load) and if the average output voltage of the converter is made negative by setting α > π/2, then the converter is actually functioning as an inverter. Such inversion might be used if the DC side of the converter is connected to a DC power source, such as a solar array or a DC generator, or a DC motor undergoing regenerative braking.

Figure 78.4 A phase-controlled converter with strictly positive load current, and the associated output voltage waveform, vo(t). The average value of vo(t) is (2V/π) cos α, where α is the firing angle or delay angle. Provided the load current does not vary significantly over a cycle, the converter functions as a rectifier when α < π/2, and as an inverter when α > π/2.

AC/AC Converters For AC/AC conversion between waveforms of the same frequency, we can use switches to window out sections of the source waveform, thereby reducing the fundamental component of the waveform in a controlled way; TRIACs are well suited to carrying out this operation. Subsequent filtering can be used to extract the fundamental of the windowed waveform. More intricate use of switches (in a cycloconverter) permits the construction of an approximately sinusoidal waveform at some specified frequency by "splicing" together appropriate segments of a set of three-phase (or multiphase) sinusoidal waveforms at a higher frequency; again, subsequent filtering improves the quality of the output sinusoid. While cycloconverters effect a direct AC/AC conversion, it is also common to construct an AC/AC converter as a cascade of a rectifier and an inverter (generally operating at different frequencies), forming a DC-link converter.
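The (2V/π) cos α expression for the phase-controlled converter can be verified numerically: with continuous load current the output follows V sin θ for θ from α to α + π in each half-cycle, and averaging over that conduction interval gives (1/π) ∫ V sin θ dθ = (2V/π) cos α. A sketch with an illustrative amplitude:

```python
import math

# Average output voltage of a phase-controlled full-wave converter with
# continuous load current: the output follows V*sin(theta) from
# theta = alpha to theta = alpha + pi in each half-cycle.
def avg_output(V, alpha, N=100000):
    """Midpoint-rule mean of V*sin(theta) over one conduction interval."""
    acc = 0.0
    for k in range(N):
        th = alpha + math.pi * (k + 0.5) / N
        acc += V * math.sin(th)
    return acc / N

V = 100.0                                  # illustrative amplitude [V]
for alpha_deg in (0, 60, 90, 120):
    a = math.radians(alpha_deg)
    print(alpha_deg, round(avg_output(V, a), 3),
          round(2 * V / math.pi * math.cos(a), 3))
# alpha < 90 deg: positive average (rectifier operation)
# alpha > 90 deg: negative average (inverter operation)
```

The sign change of the average at α = π/2 is exactly the rectifier/inverter boundary described in the caption of Figure 78.4.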
Simplified Models through Averaging or Sampling The continuous-time models mentioned in the preceding paragraph capture essentially all the effects that are likely to be significant for control design. However, the models are generally still too detailed and awkward to work with. The first challenge, therefore, is to extract from such a detailed model a simplified approximate model, preferably time-invariant, that is well matched to the particular control design task for the converter being considered. There are systematic ways to obtain such simplifications, notably through:
- averaging, which blurs out the detailed switching artifacts; and
- sampled-data modeling, again to suppress the details internal to a switching cycle, focusing instead on cycle-to-cycle behavior.
Both methods can produce time-invariant but still nonlinear models. Several approaches to nonlinear control design can be explored at this level. Linearization of averaged or sampled-data models around a constant operating point yields linear, time-invariant (LTI) models that are immediately amenable to a much larger range of standard control design methods. In the remainder of this section, we illustrate the preceding comments through a more detailed examination of the boost converter that was introduced in the previous section.
A Case Study: Controlling the Boost Converter Consider the boost converter of Figure 78.2, redrawn in Figure 78.5 with some modifications. The figure includes a schematic illustration of a typical analog PWM control method that uses
78.1.3 Dynamic Modeling and Control
Detailed Models Elementary circuit analysis of a power converter typically produces detailed, continuous-time, nonlinear, time-varying models in state-space form. These models have rather low order, provided one makes reasonable approximations from the viewpoint of control design: neglecting dynamics that occur at much higher frequencies than the switching frequency (for instance, dynamics due to parasitics, or to snubber elements that are introduced around the switches to temper the switch transitions), and focusing instead on components that are central to the power processing function of the converter. Much of this modeling phase - including the recognition of which elements are important, see [6] - is automatable. Various computer tools are also available for detailed dynamic simulation of a power converter model, specified in either circuit form or state-space form, see [9].
Figure 78.5 Controlling the boost converter. The operation of the switch is controlled by the latch. The switch is moved down every T seconds by a set pulse from the clock to the latch (q(t) = 1). The clock simultaneously initiates a ramp input of slope F/T to one terminal of the comparator. The modulating signal m(t) is applied to the other terminal of the comparator. At the instant t_k in the kth cycle when the ramp crosses the level m(t_k), the comparator output goes high, the latch resets (q(t) = 0), and the switch is moved up. The resulting duty ratio d_k in the kth cycle is m(t_k)/F.
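The ramp-comparator mechanism described in the caption can be mimicked in a few lines. In the sketch below, T and F are illustrative values and duty_ratio is a hypothetical helper; it reproduces d_k = m(t_k)/F for a slowly varying modulating signal.

```python
# Sketch of the ramp-comparator PWM of Figure 78.5 (illustrative T and F;
# duty_ratio is a hypothetical helper, not from the text).
T, F = 20e-6, 1.0
steps = 1000

def duty_ratio(m, k):
    """Duty ratio produced in cycle k for a modulating signal m(t)."""
    t0 = k * T
    for i in range(steps):
        t = t0 + i * (T / steps)
        ramp = F * i / steps          # ramps 0 -> F over the cycle
        if ramp >= m(t):              # comparator trips, latch resets
            return i / steps
    return 1.0                        # m(t) stayed above the ramp all cycle

print(duty_ratio(lambda t: 0.35, 0))  # close to m/F = 0.35
```

Only the value of m at the crossing instant matters, which is the aliasing point made in the text below.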
THE CONTROL HANDBOOK
output feedback. This control configuration is routinely and widely used, in a single-chip implementation; its operation will be explained shortly. We have allowed the input voltage v_in(t) in the figure to be time varying, to allow for a source that is nominally DC at the value V_in, but that has some time-varying deviation or ripple around this value. Although a more realistic model of the converter for control design would also, for instance, include the equivalent series resistance - or ESR - of the output capacitor, such refinements can be ignored for our purposes here; they can easily be incorporated once the simpler case is understood. The rest of our development will therefore be for the model in Figure 78.5.
In typical operation of the boost converter under what may be called constant-frequency PWM control, the transistor in Figure 78.2 is turned on every T seconds, and turned off d_kT seconds later in the kth cycle, 0 ≤ d_k ≤ 1, so d_k represents the duty ratio in the kth cycle. If we maintain a positive inductor current, i_L(t) > 0, then when the transistor is on, the diode is off, and vice versa. This is referred to as the continuous conduction mode. In the discontinuous conduction mode, on the other hand, the inductor current drops all the way to zero some time after the transistor is turned off, and then remains at zero, with the transistor and diode both off, until the transistor is turned on again. Limiting our attention here to the case of continuous conduction, the action of the transistor/diode pair in Figure 78.2 can be represented in idealized form via the double-throw switch in Figure 78.5. We will mark the position of the switch in Figure 78.5 using a switching function q(t). When q(t) = 1, the switch is down; when q(t) = 0, the switch is up. The switching function q(t) may be thought of as (proportional to) the signal that has to be applied to the base drive of the transistor in Figure 78.2 to turn it on and off as desired.
Under the constant-frequency PWM switching discipline described above, q(t) jumps to 1 at the start of each cycle, every T seconds, and falls to 0 an interval d_kT later in its kth cycle. The average value of q(t) over the kth cycle is therefore d_k; if the duty ratio is constant at the value d_k = D, then q(t) is periodic, with average value D. In Figure 78.5, q(t) corresponds to the signal at the output of the latch. This signal is set to "1" every T seconds when the clock output goes high, and is reset to "0" later in the cycle when the comparator output goes high. The two input signals of the comparator are cleverly arranged so as to reset the latch at a time determined by the desired duty ratio. Specifically, the input to the "+" terminal of the comparator is a sawtooth waveform of period T that starts from 0 at the beginning of every cycle, and ramps up linearly to F by the end of the cycle. At some instant t_k in the kth cycle, this ramp crosses the level of the modulating signal m(t) at the "-" terminal of the comparator, and the output of the comparator switches from low to high, thereby resetting the latch. The duty ratio thus ends up being d_k = m(t_k)/F in the corresponding switching cycle. By varying m(t) from cycle to cycle, the duty ratio can be varied. Note that the samples m(t_k) of m(t) are what determine the duty ratios. We would therefore obtain the same sequence of duty ratios even if we added to m(t) any signal that stayed negative in the first part of each cycle and crossed up through 0 in the kth cycle
at the instant t_k. This fact corresponds to the familiar aliasing effect associated with sampling. Our standing assumption will be that m(t) is not allowed to change significantly within a single cycle, i.e., that m(t) is restricted to vary considerably more slowly than half the switching frequency. As a result, m(t) ≈ m(t_k) in the kth cycle, so m(t)/F at any time yields the prevailing duty ratio (provided also that 0 ≤ m(t) ≤ F, of course - outside this range, the duty ratio is 0 or 1). The modulating signal m(t) is generated by a feedback scheme. For the particular case of output feedback shown in Figure 78.5, the output voltage of the converter is compared with a reference voltage, and the difference is applied to a compensator, which produces m(t). The goal of dynamic modeling, stated in the context of this example, is primarily to provide a basis for rational design of the compensator, by describing how the converter responds to variations in the modulating signal m(t), or equivalently, to variations in the duty ratio m(t)/F. (Note that the ramp level F can also be varied in order to modulate the duty ratio, and this mechanism is often exploited to implement certain feedforward schemes that compensate for variations in the input voltage V_in.)
Switched State-Space Model for the Boost Converter Choosing the inductor current and capacitor voltage as natural state variables, picking the resistor voltage as the output, and using the notation in Figure 78.5, it is easy to see that the following state-space model describes the idealized boost converter in that figure:
di_L(t)/dt = (1/L) [ (q(t) − 1) v_C(t) + v_in(t) ]
dv_C(t)/dt = (1/C) [ (1 − q(t)) i_L(t) − v_C(t)/R ]    (78.2)
Denoting the state vector by x(t) = [i_L(t)  v_C(t)]′ (where the prime indicates the transpose), we can rewrite the above equations as

dx(t)/dt = [ q(t) A_1 + (1 − q(t)) A_0 ] x(t) + b v_in(t)    (78.3)
where the definitions of the various matrices and vectors are obvious from Equation 78.2. We refer to this model as the switched or instantaneous model, to distinguish it from the averaged and sampled-data models developed in later subsections. If our compensator were to directly determine q(t) itself, rather than determining the modulating signal m(t) in Figure 78.5, then the above bilinear and time-invariant model would be the one of interest. It is indeed possible to develop control schemes directly in the setting of the switched model, Equation 78.3. In [1], for instance, a switching curve in the two-dimensional state space is used to determine when to switch q(t) between its two possible values, so as to recover from a transient with a minimum number of switch transitions, eventually arriving at a periodic steady state. Drawbacks include the need for full state measurement and accurate knowledge of system parameters.
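For concreteness, the switched model can be simulated directly under constant-frequency PWM. The sketch below uses assumed component values (L, C, R, V_in and a 50 kHz switching frequency are illustrative, not from the text) and forward-Euler integration with a step well inside each cycle; the state settles near the ideal boost equilibrium v_C = V_in/(1 − D).

```python
import numpy as np

# Illustrative (assumed) parameters
L_, C_, R_, Vin = 500e-6, 100e-6, 10.0, 12.0
T, D = 20e-6, 0.5                # switching period and constant duty ratio
steps_per_cycle = 200
dt = T / steps_per_cycle

def f(x, q):
    """Right-hand side of the switched boost model (Equation 78.2)."""
    iL, vC = x
    diL = ((q - 1.0) * vC + Vin) / L_
    dvC = ((1.0 - q) * iL - vC / R_) / C_
    return np.array([diL, dvC])

x = np.zeros(2)
for step in range(steps_per_cycle * 1000):      # 1000 switching cycles
    q = 1.0 if (step % steps_per_cycle) < D * steps_per_cycle else 0.0
    x = x + dt * f(x, q)                        # forward Euler

print("cycle-boundary state [iL, vC] =", x)     # vC near Vin/(1-D) = 24 V
```

Note the residual switching ripple in both states; the averaged models of the next subsection blur exactly this ripple away.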
Various sliding mode schemes have also been proposed on the basis of switched models such as Equation 78.3; see for instance [13], [10], [7], and references in these papers. Sliding mode designs again specify a surface across which q(t) switches, but now the (sliding) motion occurs on the surface itself, and is analyzed under the assumption of infinite-frequency switching. The requisite models are thus averaged models in effect, of the type developed in the next subsection. Any practical implementation of a sliding control must limit the switching frequency to an acceptable level, and this is often done via hysteretic control, where the switch is moved one way when the feedback signal exceeds a particular threshold, and is moved back when the signal drops below another (slightly lower) threshold. Constant-frequency implementations similar to the one in Figure 78.5 may also be used to get reasonable approximations to sliding mode behavior.
As far as the design of the compensator in Figure 78.5 is concerned, we require a model describing the converter's response to the modulating signal m(t) or the duty ratio m(t)/F, rather than the response to the switching function q(t). Augmenting the model in Equation 78.3 to represent the relation between q(t) and m(t) would introduce time-varying behavior and additional nonlinearity, leading to a model that is hard to work with. The models considered in the remaining sections are developed in response to this difficulty.
Nonlinear Averaged Model for the Boost Converter To design the analog control scheme in Figure 78.5, we seek a tractable model that relates the modulating signal m(t) or the duty ratio m(t)/F to the output voltage. In fact, since the ripple in the instantaneous output voltage is made small by design, and since the details of this small output ripple are not of interest anyway, what we really seek is a continuous-time dynamic model that relates m(t) or m(t)/F to the local average of the output voltage (where this average is computed over the switching period). Also recall that m(t)/F, the duty ratio, is the local average value of q(t) in the corresponding switching cycle. These facts suggest that we should look for a dynamic model that relates the local average of the switching function q(t) to that of the output voltage v_o(t). Specifically, let us define the local average of q(t) to be the lagged running average
d(t) = (1/T) ∫_{t−T}^{t} q(σ) dσ    (78.4)

and call it the continuous duty ratio d(t). Note that d(kT) = d_k, the actual duty ratio in the kth cycle (defined as extending from kT − T to kT). If q(t) is periodic with period T, then d(t) = D, the steady-state duty ratio. Our objective is to relate d(t) in Equation 78.4 to the local average of the output voltage, defined similarly by

v̄_o(t) = (1/T) ∫_{t−T}^{t} v_o(σ) dσ    (78.5)
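The lagged running average of Equation 78.4 is easy to compute numerically. In the sketch below (illustrative values), a trailing moving average whose window is exactly one period turns a 0/1 PWM waveform into the constant duty ratio D, as claimed.

```python
import numpy as np

T = 20e-6                    # switching period (illustrative)
win = 200                    # samples per period
D = 0.4
idx = np.arange(50 * win)    # 50 switching cycles
q = ((idx % win) < int(round(D * win))).astype(float)

# d(t) = (1/T) * integral over [t-T, t] of q, as a trailing moving average
d = np.convolve(q, np.ones(win) / win, mode="full")[:len(q)]

# After the first full window, d(t) sits at the duty ratio D
print(d[win:].min(), d[win:].max())
```

Any trailing window of exactly one period contains the same number of "on" samples, which is why the average is constant once the window is full.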
A natural approach to obtaining a model relating these averages is to take the local average of the state-space description in Equation 78.2. The local average of the derivative of a signal equals the derivative of its local average, because of the LTI nature of the
local averaging operator we have defined. The result of averaging the model in Equation 78.2 is therefore the following set of equations:

dī_L(t)/dt = (1/L) [ \overline{q v_C}(t) − v̄_C(t) + v̄_in(t) ]
dv̄_C(t)/dt = (1/C) [ ī_L(t) − \overline{q i_L}(t) − v̄_C(t)/R ]    (78.6)
where the overbars again denote local averages. The terms that prevent the above description from being a state-space model are \overline{q v_C}(t) and \overline{q i_L}(t); the average of a product is generally not the product of the averages. Under reasonable assumptions, however, we can write

\overline{q v_C}(t) ≈ d(t) v̄_C(t),    \overline{q i_L}(t) ≈ d(t) ī_L(t)    (78.7)
One set of assumptions leading to the above simplification requires v_C(·) and i_L(·) over the averaging interval [t − T, t] to not deviate significantly from v̄_C(t) and ī_L(t), respectively. This condition is reasonable for a high-frequency switching converter operating with low ripple in the state variables. There are alternative assumptions that lead to the same approximations. For instance, if i_L(·) is essentially piecewise linear and has a slowly varying average, then the approximation in the second equation of Equation 78.7 is reasonable even if the ripple in i_L(·) is large; this situation is often encountered. With the approximations in Equation 78.7, the description in Equation 78.6 becomes

dī_L(t)/dt = (1/L) [ (d(t) − 1) v̄_C(t) + v̄_in(t) ]
dv̄_C(t)/dt = (1/C) [ (1 − d(t)) ī_L(t) − v̄_C(t)/R ]    (78.8)
What has happened, in effect, is that all the variables in the switched state-space model, Equation 78.2, have been replaced by their average values. In terms of the matrix notation in Equation 78.3, and with x̄(t) defined as the local average of x(t), we have

dx̄(t)/dt = [ d(t) A_1 + (1 − d(t)) A_0 ] x̄(t) + b v̄_in(t)    (78.9)
This is a nonlinear but time-invariant continuous-time state-space model, often referred to as the state-space averaged model [8]. The model is driven by the continuous-time control input d(t) - with the constraint 0 ≤ d(t) ≤ 1 - and by the exogenous input v̄_in(t). Note that, under our assumption of a slowly varying m(t), we can take d(t) ≈ m(t)/F; with this substitution, Equation 78.9 becomes an averaged model whose control input is the modulating signal m(t), as desired. The averaged model in Equation 78.9 leads to much more efficient simulations of converter behavior than those obtained
using the switched model in Equation 78.3, provided only local averages of variables are of interest. This averaged model also forms a convenient starting point for various nonlinear control design approaches, see for instance [12], [14], [4], and references in these papers. The implementation of such nonlinear schemes would involve an arrangement similar to that in Figure 78.5, although the modulating signal m(t) would be produced by some nonlinear controller rather than the simple integrator shown in the figure. A natural circuit representation of the averaged model in Equation 78.8 is shown in Figure 78.6. This averaged circuit can actually be derived directly from the instantaneous circuit models in
Denote the (small) deviations of the various averaged variables from their constant equilibrium values by

ĩ_L(t) = ī_L(t) − I_L,   ṽ_C(t) = v̄_C(t) − V_C,   d̃(t) = d(t) − D,   ṽ_in(t) = v̄_in(t) − V_in
Substituting in Equation 78.8 and neglecting terms that are of second order in the deviations, we obtain the following linearized averaged model:

L dĩ_L(t)/dt = (D − 1) ṽ_C(t) + V_C d̃(t) + ṽ_in(t)
C dṽ_C(t)/dt = (1 − D) ĩ_L(t) − I_L d̃(t) − ṽ_C(t)/R    (78.12)
Figure 78.6 Averaged circuit model for the boost converter in Figure 78.2 and Figure 78.5, obtained by replacing instantaneous circuit variables by their local averages, and making the approximations \overline{q v_C}(t) ≈ d(t) v̄_C(t) and \overline{q i_L}(t) ≈ d(t) ī_L(t). The transistor of the original circuit in Figure 78.2 is thereby replaced by a controlled voltage source of value [1 − d(t)] v̄_C(t), while the diode of the original circuit is replaced by a controlled current source of value [1 − d(t)] ī_L(t).
Figure 78.2 or Figure 78.5, using "circuit averaging" arguments [3]. All the instantaneous circuit variables are replaced by their averaged values. The approximations in Equation 78.7 then cause the transistor of the original circuit in Figure 78.2 to be replaced by a controlled voltage source of value [1 − d(t)] v̄_C(t), while the diode of the original circuit is replaced by a controlled current source of value [1 − d(t)] ī_L(t). This averaged circuit representation can be used as an input to circuit simulation programs, and again leads to more efficient simulations than those obtained using the original switched model.
Linearized Models for the Boost Converter The most common basis for control design in this class of converters is a linearization of the corresponding state-space averaged model. This linearized model approximately governs small perturbations of the averaged quantities from their values in some nominal (typically steady-state) operating condition. We illustrate the linearization process for the case of our boost converter operating in the vicinity of its periodic steady state. Denote the constant nominal equilibrium values of the averaged state variables by I_L and V_C. These values can be computed from Equation 78.9 by setting the state derivative to zero, and replacing all other variables by their constant nominal values. The equilibrium state vector is thereby seen to be
X = −[ (1 − D) A_0 + D A_1 ]^{−1} b V_in = [ V_in/(R(1 − D)^2)   V_in/(1 − D) ]′    (78.10)
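As a numerical check of the equilibrium expression, the sketch below uses illustrative parameter values; A_1 is the matrix for the switch down (q = 1) and A_0 for the switch up (q = 0), read off from the switched boost model.

```python
import numpy as np

# Illustrative values, not from the text
L_, C_, R_, Vin, D = 500e-6, 100e-6, 10.0, 12.0, 0.5

A1 = np.array([[0.0, 0.0], [0.0, -1.0 / (R_ * C_)]])             # q = 1
A0 = np.array([[0.0, -1.0 / L_], [1.0 / C_, -1.0 / (R_ * C_)]])  # q = 0
b = np.array([1.0 / L_, 0.0])

X = -np.linalg.inv((1 - D) * A0 + D * A1) @ (b * Vin)
print("X =", X)   # matches [Vin/(R*(1-D)^2), Vin/(1-D)] = [4.8, 24.0]
```

The matrix computation and the closed-form boost relations agree, as they must.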
This is an LTI model, with control input d̃(t) and disturbance input ṽ_in(t). Rewriting this model in terms of the matrix notation in Equations 78.9 and 78.10, with x̃(t) = x̄(t) − X denoting the state deviation, we find

dx̃(t)/dt = [ D A_1 + (1 − D) A_0 ] x̃(t) + (A_1 − A_0) X d̃(t) + b ṽ_in(t)    (78.14)
Before analyzing the above LTI model further, it is worth remarking that we could have linearized the switched model in Equation 78.3 rather than the averaged model in Equation 78.9. The only subtlety in this case is that the perturbation q̃(t) = q(t) − Q(t) (where Q(t) denotes the nominal value of q(t) in the periodic steady state) is still a 0/1 function, so one has to reconsider what is meant by a small perturbation in q(t). If we consider a small perturbation q̃(t) to be one whose area is small in any cycle, then we still arrive at a linearized model of the form of Equation 78.14, except that each averaged variable is replaced by its instantaneous version, i.e., D is replaced by Q(t), d̃(t) by q̃(t), and so on. For linearization around a periodic steady state, the linearized switched model is (linear and) periodically varying.
Compensator Design for the Boost Converter The LTI model in Equation 78.12 that is obtained by linearizing the state-space averaged model around its constant steady state is the standard starting point for a host of control design methods. The transfer function from d̃(t) to ṽ_o(t) is straightforward to compute, and turns out to be

H(s) = [V_in/(1 − D)^2] · [1 − sL/(R(1 − D)^2)] / [ s^2 LC/(1 − D)^2 + sL/(R(1 − D)^2) + 1 ]    (78.15)
(where we have also used Equation 78.1 to simplify the expression in the numerator). The denominator of H(s) is the characteristic polynomial of the system. For typical parameter values, the roots of this polynomial, i.e., the poles of H(s), are oscillatory and lightly damped; note that they depend on D. Also observe that the system is non-minimum phase: the zero of H(s) is in the right half-plane (RHP). The RHP zero correlates with the physical fact that the initial response to a step increase in duty ratio is a decrease in the average output voltage (rather than the increase predicted by the steady-state expression), because the diode conducts for a smaller fraction of the time. However, the buildup in average inductor current that is caused by the increased duty ratio eventually causes the average output voltage to increase over its prior level. The lightly damped poles and the RHP zero of H(s) signify difficulties and limitations with closed-loop control, if we use only measurements of the output voltage. Nevertheless, pure integral control, which is what is shown in Figure 78.5, can generally lead to acceptable closed-loop performance and adequate low-frequency loop gain (needed to counteract the effects of parameter uncertainty and low-frequency disturbances). For closed-loop stability, the loop crossover frequency must be made smaller than the "corner" frequencies (in the Bode plot) associated with the poles and RHP zero of H(s). The situation for control can be significantly improved by incorporating measurements of the inductor current as well, as is done in current-mode control, which will be described shortly. The open-loop transfer function from ṽ_in(t) to ṽ_o(t) has the same poles as H(s), and no (finite) zeros; its low-frequency gain is 1/(1 − D), as expected from Equation 78.1. This transfer function is sometimes termed the open-loop audio-susceptibility of the converter.
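These pole/zero claims can be illustrated numerically for the control-to-output transfer function written in the standard normalized form; the parameter values below are assumed for illustration.

```python
import numpy as np

# Poles and zero of H(s) for illustrative (assumed) boost parameters
L_, C_, R_, D = 500e-6, 100e-6, 10.0, 0.5
Dp = 1.0 - D

num = [-L_ / (R_ * Dp**2), 1.0]                  # 1 - s L/(R Dp^2)
den = [L_ * C_ / Dp**2, L_ / (R_ * Dp**2), 1.0]  # s^2 LC/Dp^2 + s L/(R Dp^2) + 1

zero = np.roots(num)
poles = np.roots(den)
zeta = (den[1] / 2.0) / np.sqrt(den[0])          # damping ratio of the pole pair
print("zero:", zero)     # real and positive: non-minimum phase
print("poles:", poles)   # lightly damped complex pair
print("zeta:", zeta)
```

For these values the zero sits at s = R(1 − D)^2/L in the right half-plane, and the poles form a complex pair with damping ratio well below critical, matching the qualitative discussion above.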
The control design has also to deal with rejection of the effects of input voltage disturbances on the output load voltage, i.e., with shaping the closed-loop audio-susceptibility. In practice, a compensator designed on the basis of an LTI model would be augmented to take account of various large-signal contingencies. A soft-start scheme might be used to gradually increase the reference voltage V_ref of the controller, thereby reducing stresses in the circuit during start-up; integrator windup in the compensator could be prevented by placing back-to-back Zener diodes across the capacitor of the integrator, preventing large run-ups in integrator voltage during major transients; overcurrent protections would be introduced; and so on.
Current-Mode Control of the Boost Converter The attainable control performance may improve, of course, if additional measurements are taken. We have made some reference in the preceding sections to control approaches that use full state feedback. To design state feedback control for a boost converter, the natural place to start is again with the nonlinear time-invariant model (Equations 78.8 and 78.9) or the LTI model (Equations 78.12 and 78.14). In this subsection, we examine a representative and popular state feedback scheme for high-frequency PWM converters such as the boost converter, namely current-mode control [2]. Its name comes from the fact that a fast inner loop regulates the
inductor current to a reference value, while the slower outer loop adjusts the current reference to correct for deviations of the output voltage from its desired value. (Note that this is precisely what is done for the boost converter that forms the heart of the unity-power-factor PWM rectifier in Figure 78.3.) The current monitoring and limiting that are intrinsic to current-mode control are among its attractive features. In constant-frequency peak-current-mode control, the transistor is turned on every T seconds, as before, but is turned off when the inductor current (or equivalently, the transistor current) reaches a specified reference or peak level, denoted by i_p(t). The duty ratio, rather than being explicitly commanded via a modulating signal such as m(t) in Figure 78.5, is now implicitly determined by the inductor current's relation to i_p(t). Despite this modification, the averaged model in Equation 78.8 is still applicable in the case of the boost converter. (Instead of constant-frequency control, one could use hysteretic or other schemes to confine the inductor current to the vicinity of the reference current.) A tractable and reasonably accurate continuous-time model for the dynamics of the outer loop is obtained by assuming that the average inductor current is approximately equal to the reference current:

ī_L(t) ≈ i_p(t)    (78.16)

Making the substitution from Equation 78.16 in Equation 78.8 and using the two equations there to eliminate d(t), we are left with the following first-order model:

C dv̄_C(t)/dt = i_p(t) [ v̄_in(t) − L di_p(t)/dt ] / v̄_C(t) − v̄_C(t)/R    (78.17)
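A quick simulation sketch of this first-order outer-loop model (assumed parameter values, and a constant peak-current command so that di_p/dt = 0) shows v̄_C settling to the equilibrium sqrt(R I_p V_in) implied by setting the derivative to zero:

```python
# Forward-Euler sketch of the first-order outer-loop model (illustrative
# values; constant peak-current command, so dip/dt = 0).
C_, R_, Vin = 100e-6, 10.0, 12.0
Ip = 4.8                        # constant current reference
dt, steps = 1e-6, 5000          # 5 ms, i.e., ~10 time constants (RC/2 = 0.5 ms)

vC = 12.0                       # start away from equilibrium
for _ in range(steps):
    dvC = (Ip * Vin / vC - vC / R_) / C_
    vC += dt * dvC

print("final vC:", vC)          # settles near sqrt(R*Ip*Vin) = 24 V
```

The exponential approach with time constant RC/2 is exactly the behavior claimed in the text that follows.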
This equation is simple enough that one can use it to explore various nonlinear control possibilities for adjusting i_p(t) to control v̄_C(t) or v̄_o(t). The equation shows that, for constant i_p(t) (or periodic i_p(t), as in the nominal operation of the unity-power-factor PWM rectifier in Figure 78.3), v̄_C(t) approaches its constant (respectively, periodic) steady state exponentially, with time constant RC/2. Linearizing the preceding equation around the equilibrium corresponding to a constant i_p(t) = I_p, we get

C dṽ_C(t)/dt = −(2/R) ṽ_C(t) + (V_in/V_C) [ ĩ_p(t) − (L I_p/V_in) dĩ_p(t)/dt ] + (I_p/V_C) ṽ_in(t)    (78.18)
It is a simple matter to obtain from Equation 78.18 the transfer functions needed for small-signal control design with disturbance rejection. We are still left with the same RHP zero as before, see Equation 78.15, in the transfer function from ĩ_p to ṽ_C, but there is now only a single well-damped pole to deal with. Current-mode control may in fact be seen as perhaps the oldest, simplest, and most common representative of a sliding mode control scheme in power electronics (we made reference earlier to more recent and more elaborate sliding mode controls). The inductor current is made to slide along the time-varying surface i_L(t) = i_p(t). Equation 78.17 or 78.18 describes the system
dynamics in the sliding mode, and also provides the basis for controlling the sliding surface in a way that regulates the output voltage as desired.
Sampled-Data Models for the Boost Converter Sampled-data models are naturally matched to power electronic converters, first because of the cyclic way in which power converters are operated and controlled, and second because such models are well suited to the design of digital controllers, which are used increasingly in power electronics (particularly for machine drives). Like averaged models, sampled-data models allow us to focus on cycle-to-cycle behavior, ignoring details of the intracycle behavior. We illustrate how a sampled-data model may be obtained for our boost converter example. The state evolution of Equation 78.2 or 78.3 for each of the two possible values of q(t) can be described very easily using the standard matrix exponential expressions for LTI systems. The trajectories in each segment can then be pieced together by invoking the continuity of the state variables. Under the switching discipline of constant-frequency PWM, where q(t) = 1 for the initial fraction d_k of the kth switching cycle, and q(t) = 0 thereafter, and assuming the input voltage is constant at V_in, we find
x(kT + T) = e^{(1−d_k) A_0 T} [ e^{d_k A_1 T} x(kT) + Γ_1 V_in ] + Γ_0 V_in    (78.19)

where

Γ_1 = ∫_0^{d_k T} e^{A_1 σ} dσ · b,    Γ_0 = ∫_0^{(1−d_k) T} e^{A_0 σ} dσ · b    (78.20)
The nonlinear, time-invariant sampled-data model in Equation 78.19 can directly be made the basis for control design. One interesting approach to this task is suggested in [5]. Alternatively, a linearization around the equilibrium point yields a discrete-time LTI model that can be used as the starting point for established methods of control design. For a well-designed high-frequency PWM DC/DC converter in continuous conduction, the state trajectories in each switch configuration are close to linear, because the switching frequency is much higher than the filter cutoff frequency. What this implies is that the matrix exponentials in Equation 78.19 are well approximated by just the first two terms in their Taylor series expansions:

e^{d_k A_1 T} ≈ I + d_k A_1 T,    e^{(1−d_k) A_0 T} ≈ I + (1 − d_k) A_0 T    (78.21)
If we use these approximations in Equation 78.19 and neglect terms in T^2, the result is the following approximate sampled-data model:

x(kT + T) ≈ x(kT) + T [ d_k A_1 + (1 − d_k) A_0 ] x(kT) + T b V_in    (78.22)
This model is easily recognized as the forward-Euler approximation of the continuous-time model in Equation 78.9. Retaining the terms in T^2 leads to more refined, but still very simple, sampled-data models. The sampled-data models in Equations 78.19 and 78.22 were derived from Equation 78.2 or 78.3, and therefore used samples of the natural state variables, i_L(t) and v_C(t), as state variables. However, other choices are certainly possible, and may be more appropriate for a particular implementation. For instance, we could replace v_C(kT) by v̄_C(kT), i.e., the sampled local average of the capacitor voltage.
78.1.4 Extensions
The preceding development suggests how dynamic modeling and control in power electronics can be effected on the basis of either averaged or sampled-data models. Although a boost converter was used for illustration, the same general approaches apply to other converters, either directly or after appropriate extensions. To conclude this chapter, we outline some examples of such extensions.
Generalized Averaging It is often useful or necessary - for instance, in modeling the dynamic behavior of resonant converters - to study the local fundamental and local harmonics [11], in addition to local averages of the form shown in Equations 78.4 and 78.5. For a variable x(t), the local ℓω_s-component may be defined as the following lagged running average:

⟨x⟩_ℓ(t) = (1/T) ∫_{t−T}^{t} x(σ) e^{−jℓω_s σ} dσ    (78.23)
In this equation, ω_s is usually chosen as the switching frequency, i.e., 2π/T, and ℓ is an integer. The local averages in Equations 78.4 and 78.5 thus correspond to the choice ℓ = 0; the choice |ℓ| = 1 yields the local fundamental, while |ℓ| > 1 yields the local ℓth harmonic. A key property of the local ℓω_s-component is that

⟨dx/dt⟩_ℓ = (d/dt)⟨x⟩_ℓ + jℓω_s ⟨x⟩_ℓ    (78.24)
where we have omitted the time argument t to keep the notation simple. For ℓ = 0, we recover the result that was used in obtaining Equation 78.6 from Equation 78.2, namely that the local average of the derivative equals the derivative of the local average. More generally, we could evaluate the local ℓω_s-component of both sides of a switched state-space model such as Equation 78.3, for several values of ℓ. With suitable approximations, this leads to an augmented state-space model whose state vector comprises the local ℓω_s-components for all these values of ℓ. In the case of the boost converter, for instance, we could choose ℓ = +1 and ℓ = −1 in addition to the choice ℓ = 0 that was used to get Equation 78.9 from Equation 78.3. The approximation that was used in Equation 78.7, which (with the time argument t still suppressed for notational convenience) we can rewrite as

⟨qx⟩_0 ≈ ⟨q⟩_0 ⟨x⟩_0    (78.25)
can now be refined to

⟨qx⟩_0 ≈ ⟨q⟩_0 ⟨x⟩_0 + ⟨q⟩_1 ⟨x⟩_{−1} + ⟨q⟩_{−1} ⟨x⟩_1    (78.26)

The resulting state-space model will have a state vector comprising ⟨x⟩_0, ⟨x⟩_1, and ⟨x⟩_{−1}, and the solution x(t) will be approximated as
x(t) ≈ ⟨x⟩_0 + ⟨x⟩_1 e^{jω_s t} + ⟨x⟩_{−1} e^{−jω_s t}    (78.27)

Results in [11] show the significant improvements in accuracy that can be obtained this way, relative to the averaged model in Equation 78.9, while maintaining the basic simplicity and efficiency of averaged models relative to switched models. That paper also shows how generalized averaging may be applied to the analysis of resonant converters.
Generalized State-Space Models A sampled-data model for a power converter will almost invariably involve a state-space description of the form

x(kT + T) = f(x(kT), u_k, T_k)    (78.28)
The vector u_k here comprises a set of parameters that govern the state evolution in the kth cycle (e.g., parameters that describe control choices and source variations during the kth cycle), and T_k is a vector of switching times, comprising the times at which switches in the converter open or close. The switching-time vector T_k will satisfy a set of constraints of the form

c(x(kT), u_k, T_k) = 0    (78.29)
detailed evaluation of the switch stresses and the design of any snubbers requires circuit-level simulation and analysis over a short interval, comparable with the switching period T. The design of the inner current-control loop is conveniently done using a continuous-time averaged model, with averaging carried out over a window of length T; the model in Equation 78.8 is representative of this stage. The design of the outer voltage-control loop may be done using an averaged or sampled-data model computed over a window of length equal to the period of the rectified sinusoidal input voltage; the model in Equation 78.17 is representative of - or at least a precursor to - this stage. The lessons of each level have to be taken into account at all the other levels.
78.1.5 Conclusion This has been a brief introduction to some of the major features and issues of dynamic modeling and control in power electronics. The intent was to be comprehensible rather than comprehensive; much that is interesting and relevant has inevitably been left out. Also, the emphasis here is undoubtedly skewed toward the material with which the author is most familiar. The interested reader who probes further will discover power electronics to be a vigorous, rewarding, and important domain for applications and extensions of the full range of control theory and methodology.
References

[1] Burns, W.W. and Wilson, T.G., Analytic derivation
If T_k can be solved for in Equation 78.29, then the result can be substituted in Equation 78.28 to obtain a standard sampled-data model in state-space form. However, there are many cases in which the constraint Equation 78.29 is not explicitly solvable for T_k, so one is forced to take Equations 78.28 and 78.29 together as the sampled-data model. Such a pair, comprising a state evolution equation along with a side constraint, is what we refer to as a generalized state-space model. Note that the linearized version of Equation 78.29 will allow the switching-time perturbations to be calculated explicitly, provided the Jacobian matrix ∂c/∂T_k has full column rank. The result can be substituted in the linearized version of Equation 78.28 to obtain a linearized state-space model in standard - rather than generalized - form.

Hierarchical Modeling

A hierarchical approach to modeling is mandated by the range of time scales encountered in a typical power electronic system. A single "all-purpose" model that captured all time scales of interest, from the very fast transients associated with the switch transitions and parasitics, to very slow transients spanning hundreds or thousands of switching cycles, would not be much good for any particular purpose. What is needed instead is a collection of models, focused at different time scales, suited to the distinct types of analysis that are desired at each of these scales, and capable of being linked to each other in some fashion. Consider, for example, the high-power-factor PWM rectifier in Figure 78.3, which is examined further in [3] and [15]. The
and evaluation of a state trajectory control law for DC-DC converters, in IEEE Power Electronics Specialists Conf. Rec., 70-85, 1977.
[2] Hsu, S.P., Brown, A., Resnick, L., and Middlebrook, R.D., Modeling and analysis of switching DC-to-DC converters in constant-frequency current-programmed mode, in IEEE Power Electronics Specialists Conf. Rec., 284-301, 1979.
[3] Kassakian, J.G., Schlecht, M.F., and Verghese, G.C., Principles of Power Electronics, Addison-Wesley, Reading, MA, 1991.
[4] Kawasaki, N., Nomura, H., and Masuhiro, M., The new control law of bilinear DC-DC converters developed by direct application of Lyapunov, IEEE Trans. Power Electron., 10(3), 318-325, 1995.
[5] Khayatian, A. and Taylor, D.G., Multirate modeling and control design for switched-mode power converters, IEEE Trans. Auto. Control, 39(9), 1848-1852, 1994.
[6] Leeb, S.B., Verghese, G.C., and Kirtley, J.L., Recognition of dynamic patterns in DC-DC switching converters, IEEE Trans. Power Electron., 6(2), 296-302, 1991.
[7] Malesani, L., Rossetto, L., Spiazzi, G., and Tenti, P., Performance optimization of Ćuk converters by sliding-mode control, IEEE Trans. Power Electron., 10(3), 302-309, 1995.
THE CONTROL HANDBOOK
[8] Middlebrook, R.D. and Ćuk, S., A general unified approach to modeling switching converter power stages, in IEEE Power Electronics Specialists Conf. Rec., 18-34, 1976.
[9] Mohan, N., Robbins, W.P., Undeland, T.M., Nilssen, R., and Mo, O., Simulation of power electronic and motion control systems - an overview, Proc. IEEE, 82(8), 1287-1302, 1994.
[10] Sanders, S.R., Verghese, G.C., and Cameron, D.E., Nonlinear control of switching power converters, Control Theory Adv. Technol., 5(4), 601-627, 1989.
[11] Sanders, S.R., Noworolski, J.M., Liu, X.Z., and Verghese, G.C., Generalized averaging method for power conversion circuits, IEEE Trans. Power Electron., 6(2), 251-259, 1991.
[12] Sanders, S.R. and Verghese, G.C., Lyapunov-based control for switched power converters, IEEE Trans. Power Electron., 7(1), 17-24, 1992.
[13] Sira-Ramírez, H., Sliding motions in bilinear switched networks, IEEE Trans. Circuits Syst., 34(8), 919-933, 1987.
[14] Sira-Ramírez, H. and Prada-Rizzo, M.T., Nonlinear feedback regulator design for the Ćuk converter, IEEE Trans. Auto. Control, 37(8), 1173-1180, 1992.
[15] Thottuvelil, V.J., Chin, D., and Verghese, G.C., Hierarchical approaches to modeling high-power-factor AC-DC converters, IEEE Trans. Power Electron., 6(2), 179-187, 1991.
Further Reading

The list of references above suggests some of the journals and conferences relevant to the topic of this chapter. However, a large variety of other journals and conferences are devoted to, or occasionally contain, useful material. We limit ourselves here to providing just a few leads. A useful perspective on the state of the art in power electronics may be gleaned from the August 1994 special issue of the Proceedings of the IEEE, devoted to "Power Electronics and Motion Control." Several papers on dynamic modeling and control for power electronics are presented every year at the IEEE Power Electronics Specialists Conference (PESC) and the IEEE Applied Power Electronics Conference (APEC). Many of these papers, plus others, appear in expanded form in the IEEE Transactions on Power Electronics. The April 1991 special issue of these Transactions was devoted to "Modeling in Power Electronics". In addition, the IEEE Power Electronics Society holds a biennial workshop on "Computers in Power Electronics", and a special issue of the Transactions with this same theme is slated to appear in May 1997. The power electronics text of Kassakian et al. referenced above has four chapters devoted to dynamic modeling and control in power electronics. The first two of the following
books consider the dynamics and control of switching regulators in some detail, while the third is a broader text that provides a view of a range of applications:
[1] Kislovski, A.S., Redl, R., and Sokal, N.O., Dynamic Analysis of Switching-Mode DC/DC Converters, Van Nostrand Reinhold, New York, 1991.
[2] Mitchell, D.M., Switching Regulator Analysis, McGraw-Hill, New York, 1988.
[3] Mohan, N., Undeland, T.M., and Robbins, W.P., Power Electronics: Converters, Applications, and Design, Wiley, New York, 1995.
78.2 Motion Control with Electric Motors by Input-Output Linearization2
David G. Taylor, Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA

78.2.1 Introduction

Due to the increasing availability of improved power electronics and digital processors at reduced costs, there has been a trend to seek higher performance from electric machine systems through the design of more sophisticated control systems software. There exist significant challenges in the search for improved control system designs, however, since the dynamics of most electric machine systems exhibit significant nonlinearities, not all state variables are necessarily measured, and the parameters of the system can vary significantly from their nominal values. Electric machines are electromechanical energy converters, used for both motor drives and power generation. Nearly all electric power used throughout the world is generated by synchronous machines (operated as generators), and a large fraction of all this electric power is consumed by induction machines (operated as motors). The various types of electric machines in use differ with respect to construction materials and features, as well as the underlying principles of operation. The first DC machine was constructed by Faraday around 1820, the first practical version was made by Henry in 1829, and the first commercially successful version was introduced in 1837. The three-phase induction machine was invented by Tesla around 1887. Although improved materials and manufacturing methods continue to refine electric machines, the fundamental issues relating to electromechanical energy conversion have been established for well over a century. In such an apparently well-established field, it may come as a surprise that today there is more research and development
² This work was supported in part by the National Science Foundation under Grant ECS-9158037 and by the Air Force Office of Scientific Research under Grant F49620-93-1-0147.
activity than ever before. Included in a modern electric machine system is the electric machine itself, power electronic circuits, electrical and/or mechanical sensors, and digital processors equipped with various software algorithms. The recent developments in power semiconductors, digital electronics, and permanent-magnet materials have led to "enabling technology" for today's advanced electric machine systems. Perhaps more than any other factor, the increasing use of computers, for both the design of electric machines and for their real-time control, is enhancing the level of innovation in this field. This chapter provides an overview of primarily one recent development in control systems design for electric machines operated as motor drives. The chapter takes a broad perspective in the sense that a wide variety of different machine types is considered, hopefully from a unifying point of view. On the other hand, in order to limit the scope substantially, an effort was made to focus on one more recent nonlinear control method, specifically input-output linearization, as opposed to the classical methods which have less potential for achieving high dynamic performance. An unavoidable limitation of the presentation is a lack of depth and detail beyond the specific topic of input-output linearization; however, the intention was to highlight nonlinear control technology for electric machines to a broad audience, and to guide the interested reader to a few appropriate sources for further study.
78.2.2 Background

Input-Output Linearization

The most common control designs for electric machines today, for applications requiring high dynamic performance, are based on forms of exact linearization [8]. The design concept is reflected by a two-loop structure of the controller: in the first design step, nonlinear compensation is sought which explicitly cancels the nonlinearities present in the motor (without regard to any specific control objective), and this nonlinear compensation is implemented as an inner feedback loop; in the second design step, linear compensation is derived on the basis of the resulting linear dynamics of the precompensated motor to achieve some particular control objective, and this linear compensation is implemented as an outer feedback loop. The advantage of linear closed-loop dynamics is clearly that selection of controller parameters is simplified, and the achievable transient responses are very predictable. Not all nonlinear systems can be controlled in this fashion; the applicability of exact linearization is determined by the type and location of the model nonlinearities. Furthermore, exact linearization is not really a single methodology, but instead represents two distinct notions of linearizability, though in both cases the implementation requires full state feedback. In the first case, exact linearization of the input-output dynamics of a system is desired, with the output taken to be the controlled variables. This case, referred to as input-output linearization, is the more intuitive form of exact linearization, but can be applied in a straightforward way only to so-called minimum-phase systems (those systems with stable zero dynamics). A system can
be input-output linearized if it has a well-defined relative degree (see [8] for clarification). In the second case, exact linearization of the entire state-space dynamics of a system is desired, and no output needs to be declared. This case, referred to as input-state linearization, has the advantage of eliminating any potential difficulties with internal dynamics, but is less intuitively appealing and can be more difficult to apply in practice. Input-state linearization applies only to systems that are characterized by integrable, or nonsingular and involutive, distributions (see [8] for clarification). Standard models of most electric machines are exactly linearizable, in the sense(s) described above. Prior literature has disclosed many examples of exact linearization applied to electric machines, including experimental implementation, for various machine types and various types of models. For any machine type, the most significant distinction in the application of exact linearization relates to the order of the model used. When full-order models are used, the stator voltages are considered to be the control inputs. When a reduced-order model is used, the assignment of the control inputs depends on how the order reduction has been performed: if the winding inductance is neglected, then voltage will still be the input but the mechanical subsystem model will be altered; if a high-gain current loop is employed, then current will be the input in an unaltered mechanical subsystem model. In either case, exact linearization provides a systematic method for designing the nonlinearity compensation within the inner nonlinear loop, so that the outer linear loop is concerned only with the motion control part of the total system.
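The relative-degree notion, defined formally later in this section, can be probed numerically: differentiate the output along the drift field until the input direction first appears. The toy system, state values, and finite-difference machinery below are my own illustrative assumptions, not material from the chapter:

```python
# Toy numerical probe of relative degree: for output y = h(x), keep taking
# Lie derivatives along f until the input field g shows up; the count is r.
# Lie derivatives are approximated by central finite differences.

def grad(phi, x, eps=1e-6):
    out = []
    for i in range(len(x)):
        xp = list(x); xm = list(x)
        xp[i] += eps; xm[i] -= eps
        out.append((phi(xp) - phi(xm)) / (2 * eps))
    return out

def lie(phi, vec):
    """Return the scalar field x -> (dphi/dx)(x) . vec(x)."""
    return lambda x: sum(a * b for a, b in zip(grad(phi, x), vec(x)))

# Toy system: x1' = x2, x2' = -x1**3 + u, with output y = x1.
f = lambda x: [x[1], -x[0] ** 3]
g = lambda x: [0.0, 1.0]
h = lambda x: x[0]

x0 = [0.7, -0.2]
Lgh = lie(h, g)(x0)              # 0: input absent after one differentiation
LgLfh = lie(lie(h, f), g)(x0)    # nonzero (here 1): relative degree r = 2
```

Since L_g h = 0 but L_g L_f h ≠ 0, the toy output has relative degree 2, mirroring the "chain of integrators" picture developed below.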
Relation to Field Orientation

Prior to the development of a formal theory for exact linearization design, closely related nonlinear feedback control schemes had already been developed for the induction motor. The classical "field oriented control," introduced in [1] over 20 years ago, involves the transformation of electrical variables into a frame of reference which rotates with the rotor flux vector (the dq frame). This reference frame transformation, together with a nonlinear feedback, serves to reduce the complexity of the dynamic equations, provided that the rotor flux is not identically zero. Under this one restriction, the rotor flux amplitude dynamics are made linear and decoupled and, moreover, if the rotor flux amplitude is regulated to a constant value, the speed dynamics will also become linear and decoupled. Provided that the rotor flux amplitude may be kept constant, the field oriented control thus achieves an asymptotic linearization and decoupling, where the d-axis voltage controls rotor flux amplitude and the q-axis voltage controls speed. Although the field oriented approach to induction motor control is widely used today and has achieved considerable success, the formal use of exact linearization design can provide alternative nonlinear control systems of comparable complexity, but achieving true (as opposed to asymptotic) linearization and decoupling of flux and torque or speed. For example, using a reduced-order electrical model (under the assumption that the rotor speed is constant), an input-state linearization static state-
feedback design is reported in [6] that achieves complete decoupling of the rotor flux amplitude and torque responses. The full-order electromechanical model of an induction motor turns out not to be input-state linearizable [13]. However, as shown in [12] (see also [13]), input-output linearization methods do apply to the full-order electromechanical model. With rotor flux amplitude and speed chosen as outputs to be controlled, simple calculations show that the system has well-defined relative degree (2, 2), provided that the rotor flux is not identically zero. Hence, under the constraint of nonzero rotor flux, it is possible to derive a nonlinear static state-feedback that controls rotor flux amplitude and speed in a noninteracting fashion, with linear second-order transients for each controlled output (and with bounded first-order internal dynamics [13]). Although the full-order induction motor model is not input-state linearizable, the augmented system obtained by adding an integrator to one of the inputs does satisfy this property locally.
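The reference-frame change underlying field orientation can be illustrated with the standard amplitude-invariant Clarke and Park transformations. This sketch is generic textbook material, not code from the chapter; the function name and the choice of amplitude-invariant scaling are my own:

```python
import math

# Standard amplitude-invariant Clarke + Park transformation: a balanced
# three-phase set, viewed in a frame rotating at the same angle, becomes
# a constant (d, q) vector.

def abc_to_dq(fa, fb, fc, theta):
    # Clarke: three phases -> stationary (alpha, beta) frame.
    alpha = (2.0 / 3.0) * (fa - 0.5 * fb - 0.5 * fc)
    beta = (1.0 / math.sqrt(3.0)) * (fb - fc)
    # Park: rotate by theta into the (d, q) frame.
    d = math.cos(theta) * alpha + math.sin(theta) * beta
    q = -math.sin(theta) * alpha + math.cos(theta) * beta
    return d, q
```

For a unit-amplitude balanced set (fa = cos θ, fb = cos(θ − 2π/3), fc = cos(θ + 2π/3)) the result is (d, q) = (1, 0) for every θ, which is why regulation problems that are time-varying in machine variables become constant-setpoint problems in the dq frame.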
Performance Optimization

Although the difference between the classical control of [1] and the exact linearization control of [12, 13] may appear to be a minor one, the complete decoupling of speed and flux dynamics in the closed-loop system (during transients as well as in steady state) provides the opportunity to optimize performance. For example, as mentioned in [13], the flux reference will need to be reduced from nominal as the speed reference is increased above rated speed, in order to keep the required feed voltages within the inverter limits. Operation in this flux-weakening regime will excite the coupling between flux and speed in the classical field oriented control, causing undesired speed fluctuations (and perhaps instability). There are other motivations for considering time-varying flux references as well. For instance, in [9, 10] the flux is adjusted as a function of speed in order to maximize power efficiency (i.e., only the minimum stator input power needed to operate at the desired speed is actually sourced). In [9], the flux reference is computed on the basis of predetermined relationships derived off-line and the control is implemented in a reference frame rotating with the stator excitation. In [10], the flux reference is computed on-line using a minimum power search method and the control is implemented in a fixed stator frame of reference. Yet another possibility, presented in [2], would be to vary the flux reference as a function of speed in order to achieve optimum torque (maximum for acceleration and minimum for deceleration) given limits on allowable voltage and current. In each of these references, high-gain current loops are used so that the exact linearization is performed with respect to current inputs rather than voltage inputs. Clearly, exact linearization permits optimization goals (which require variable flux references) and high dynamic performance in position, speed, or torque control, to be achieved simultaneously.
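A common flux-weakening heuristic gives a feel for what a speed-dependent flux reference looks like. The sketch below is only an illustration of the idea, not the optimization schemes of [2], [9], or [10]; the function name, parameter values, and the 1/speed taper are my own assumptions:

```python
# Heuristic flux-weakening schedule: hold nominal flux up to rated speed,
# then taper as ~1/speed so the back-EMF (proportional to flux * speed)
# stays within the inverter voltage limit.

def flux_reference(speed, flux_nom=1.0, speed_rated=100.0):
    w = abs(speed)
    if w <= speed_rated:
        return flux_nom
    return flux_nom * speed_rated / w
```

Above rated speed the product flux × speed is held constant, which is the constant-power idealization; the cited references replace this fixed schedule with off-line optimization, on-line search, or torque-optimal computation.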
Wide Applicability

Exact linearization has been suggested for the control of many other types of electric machines as well. Both input-state
linearization and input-output linearization are used to design controllers for wound-field brush-commutated DC motors in [4, 5, 14]. For various permanent-magnet machines, there are many references illustrating the use of exact linearization. For instance, in [7], a three-phase wye-connected permanent-magnet synchronous motor with sinusoidally distributed windings is modeled with piecewise-constant parameters (which depend on current to account for reluctance variations and magnetic saturation), and an input-state linearization controller is derived from the rotor reference frame model. In [16], input-state linearization is applied to the hybrid permanent-magnet stepper motor with cogging torque accounted for, and it is further shown how constant load torques may be rejected using a nonlinear observer. This work was continued in [3], where the experimental implementation is described, including treatment of practical issues such as speed estimation from position sensors and operation at high speeds despite voltage limitations. Optimization objectives can be considered within exact linearization designs for these other types of machines too.
Review of Theory

In order to appreciate the concept of input-output linearization as it applies to electric motor drives, a brief review of the general theory is called for. See also Chapters 55 and 57. For the present purposes, it is sufficient to consider nonlinear multivariable systems of the form

    ẋ = f(x) + Σ_{i=1}^{m} g_i(x) u_i     (78.30)
    y_j = h_j(x),   j = 1, ..., m     (78.31)
where x ∈ R^n is the state vector, u ∈ R^m is the input vector (i.e., the control), and y ∈ R^m is the output vector (i.e., to be controlled). Note that this is a square system, with the same number of inputs as outputs. Given such a nonlinear system, it is said to possess relative degree {r_1, ..., r_m} at x^0 if

    L_{g_i} L_f^k h_j(x) = 0,   1 ≤ i, j ≤ m,   k < r_j - 1     (78.32)

for all x near x^0, and

    rank A(x^0) = m,   where A_{ji}(x) = L_{g_i} L_f^{r_j - 1} h_j(x)     (78.33)
The notation in Equations 78.32 and 78.33 stands for the Lie derivative of a scalar function with respect to a vector function (see [8]). The first property implies that a chain structure exists, whereas the second property implies that input-output decoupling is possible. For nonlinear systems possessing a well-defined relative degree, a simple diffeomorphic change of coordinates will bring the system into so-called normal form. In particular, under the change of variables

    z_j = [h_j(x), L_f h_j(x), ..., L_f^{r_j - 1} h_j(x)]^T,   j = 1, ..., m     (78.34)
it is easy to show that y_j = z_{1,j}, where

    ż_{i,j} = z_{i+1,j},   i = 1, ..., r_j - 1
    ż_{r_j,j} = L_f^{r_j} h_j(x) + Σ_{i=1}^{m} L_{g_i} L_f^{r_j - 1} h_j(x) u_i     (78.35)

In other words, the output to be controlled, y_j, is the output of a chain of cascaded integrators, fed by a nonlinear but invertible forcing term. If r_1 + ... + r_m < n, then not all of the system's dynamics are accounted for in Equation 78.35. However, any remaining internal dynamics will not influence the input-output response, once the nonlinear feedback described below is applied. More specifically, for nonlinear systems with well-defined relative degree, it is possible to solve for the control vector u from the system of algebraic equations

    L_f^{r_j} h_j(x) + Σ_{i=1}^{m} L_{g_i} L_f^{r_j - 1} h_j(x) u_i = v_j,   j = 1, ..., m     (78.36)

for all x near x^0, given some desired choice v_j for the term on the left-hand side. Hence, the inner-loop nonlinearity compensation is performed by

    u = α(x) + β(x) v     (78.37)

where

    α(x) = -A^{-1}(x) b(x),   β(x) = A^{-1}(x)     (78.38)

with A(x) the decoupling matrix of Equation 78.33 and b(x) the vector with entries L_f^{r_j} h_j(x). This means that, for each j = 1, ..., m,

    y_j^{(r_j)} = v_j     (78.39)

where the superscript (·) denotes time differentiation, or, in state variable form,

    ż_j = A_j z_j + b_j v_j     (78.40)
    y_j = c_j^T z_j     (78.41)

where A_j is the r_j × r_j integrator-chain matrix, b_j = [0 ... 0 1]^T, c_j = [1 0 ... 0]^T, and z_j is computed from x according to

    z_j = [h_j(x), L_f h_j(x), ..., L_f^{r_j - 1} h_j(x)]^T     (78.43)

To implement this inner-loop design, it is necessary to measure the state vector x, and to have accurate knowledge of the system model nonlinearities f, g_1, ..., g_m, h_1, ..., h_m. The outer-loop design, which is problem dependent, is very straightforward due to the linearity and complete decoupling of the input-output dynamics, as given in Equation 78.39. The outer-loop feedback design will be nonlinear with respect to the original state x, but linear with respect to the computed state of the normal form z. As an example of outer-loop design, to guarantee that y_j → y_jd as t → ∞, given a desired output trajectory y_jd and its first r_j time derivatives, it suffices to apply

    v_j = y_jd^{(r_j)} + k_j^T (z_jd - z_j)     (78.44)

where z_j is computed from x according to Equation 78.43, the desired state trajectory vector z_jd is

    z_jd = [y_jd, ẏ_jd, ..., y_jd^{(r_j - 1)}]^T     (78.45)

and where the gain vector k_j is selected such that the matrix A_j - b_j k_j^T has left-half-plane eigenvalues. Implementation of this outer-loop design requires measurement of the state vector x, and accurate knowledge of the nonlinearities f, h_1, ..., h_m. In the remainder of this chapter, the formalism outlined above on the theory of input-output linearization will be applied to a variety of DC and AC motor drive types. This review section has established the notion of relative degree as the critical feature of input-output linearization, and hence this chapter will now focus on this feature in particular and will bypass the explicit computation of the feedback, defined by Equations 78.37 and 78.44, for each motor. The objective is to provide, within a single self-contained document, a catalog of the main issues to be addressed when designing input-output linearizing controllers for electric motors. The consistent notation which will be used also adds to the clarity and usefulness of the results. The process begins with the fundamental modeling tools, such as those presented in [11], but ends with a formulation of the state variable models of various machines, and the corresponding relative degree checks. Implementation of input-output linearization for electric motors will thus be a straightforward extrapolation from the contents of this chapter.
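The two-loop design just reviewed can be sketched numerically on a toy second-order system. The system, gains, and Euler integration below are my own illustrative choices, not a motor model from the chapter:

```python
import math

# Minimal numerical sketch of inner/outer loop input-output linearization on
#     x1' = x2,   x2' = -sin(x1) + u,   y = x1   (relative degree 2).
# Inner loop: u = sin(x1) + v cancels the nonlinearity, so y'' = v.
# Outer loop: v = k1 (yd - x1) - k2 x2 places the closed-loop poles.

def simulate(yd=1.0, k1=4.0, k2=4.0, dt=1e-3, steps=20000):
    x1, x2 = 0.0, 0.0
    for _ in range(steps):
        v = k1 * (yd - x1) - k2 * x2   # outer loop: linear feedback
        u = math.sin(x1) + v           # inner loop: cancels -sin(x1)
        dx1, dx2 = x2, -math.sin(x1) + u
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    return x1, x2

x1, x2 = simulate()   # y converges to yd = 1 (closed-loop poles at s = -2, -2)
```

Because the inner loop cancels the sine term exactly, the closed loop is the linear error system e'' + k2 e' + k1 e = 0, and the gains alone set the transient, which is the practical appeal of the method.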
78.2.3 DC Motors

The most logical point of departure for this catalog of results on electric motor input-output linearizability would be DC motors with separate field and armature windings, since these motors are simpler to model than AC motors yet possess a significant
nonlinearity. Field coils are used to establish an air gap flux between stationary iron poles and the rotating armature. The armature has axially directed conductors which are connected to a brush commutator (a mechanical switch), and these conductors are continuously switched such that those located under a pole carry similarly directed currents. Interaction of axially directed armature currents and radially directed field flux produces a shaft torque.
Dynamic Model

The mechanical dynamic equation which models the rotor velocity ω is the same for all types of electric motors. Under the assumption that the mechanical load consists only of a constant inertia J, viscous friction with friction coefficient B, and a constant load torque T_l, the mechanical dynamics are given by

    J dω/dt = T_e - B ω - T_l     (78.46)
Each type of electric motor has its own unique expression for the electrical torque T_e, which for DC motors is

    T_e = M i_f i_a     (78.47)

where M designates the mutual inductance between the field and armature windings, which carry currents i_f and i_a, respectively.
where M designates the mutual inductance between the field and armature windings which carry currents i f and i,, respectively. The electrical dynamic equations describing the flow of currents in the field and armature windings are
vu
= Ruiu
+ M i f +~ La-di, dt
= Raia
+ Mifw + La-di, dt
= (R,
+ Rx)i, - Rxif + Mifw + La-di, dt
Consider first the separately excited DC motor, in which the field and armature windings are fed from separate voltage sources. In order to match the common notation used for the review of input-output linearization principles, the variables of the separately excited DC motor are assigned according to x1 = O x2 = w x3 = i f x4 = i, ul = vf u2 = u , y1 = O (78.54) Note that only one output, associated with the position control objective, is specified at this point. Using Equations 78.46 to 78.49, these assignments lead to the state variable model defined by
In order to check the relative degree according to the definition given in Equations 78.30 to 78.33, a second output to be controlled needs to be declared, so that the system will be square. Nevertheless, the iarious Lie-derivative calculations associated with the first output (rotor position) are easily verified to be
(78.51)
for the shunt-wound motor, where external resistance Rx is in series with the field winding, and
v
Separately Excited DC Motor
(78.49)
where vf and v, are the voltages applied to the field and armature, Rf and R, are the resistances of the field and armature, and Lf and L , are the self inductances of the field and armature, respectively. The above model is complete for the case where separate voltage sources are used to excite the field and armature windings. For further modeling details, see [ll]. When it is desired to operate the DC motor from a single \ voltage source v , the two windings must be connected together in parallel (shunt connection) or in series (series connection). In either of these configurations, operation of the motor at high velocities without exceeding the source limits is made possible by including an external variable resistance to limit the field flux. The electrical dynamic equations then become
v
for the series-wound motor, where external resistance Rx is in parallel with the field winding. Operation without field weakening requires Rx = 0 for the shunt-wound motor, and Rx = oo for the series-wound motor. For the latter case, as R, + ao an order reduction occurs since the currents flowing in the field and armature windings are identicril in the limit. In order to determine the extent to which the various operating modes of DC motors are input-output linearizable,it is necessary to determine the state-variable models and to assess the relative degree of these models. For sake of brevity, only position control will be taken as a primary control objective; speed control and torque control follow in an obvious manner. Hence, all models will include a state equation for rotor position 9 (although this is typically unnecessary for speed and torque control).
If the second output is chosen to be the field current i f , i.e., if h2(x) = x3, then the remaining Lie-derivative calculations are given by 1
(78.53)
Lglh2(x) = - Lg2h2(x)= 0 Lf
(78.57)
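The Lie-derivative formulas above can be spot-checked numerically against the separately excited motor model. The parameter and state values below are illustrative assumptions of mine; since h_1 = x_1 gives L_f h_1 = x_2 and L_f^2 h_1 = ẋ_2, only the final derivative along g_1 needs a finite-difference approximation:

```python
# Numerical spot-check: Lg1 Lf^2 h1 should equal M x4/(J Lf) for the
# separately excited motor model. Parameter values are assumptions.

J, B, Tl = 0.01, 0.1, 0.05
Lf, La, Rf, Ra, M = 0.5, 0.05, 10.0, 1.0, 0.3

def f(x):  # drift vector field of the state model (inputs set to zero)
    return [x[1],
            (M * x[2] * x[3] - B * x[1] - Tl) / J,
            -Rf * x[2] / Lf,
            (-Ra * x[3] - M * x[2] * x[1]) / La]

def g1(x):  # input vector field for the field voltage u1
    return [0.0, 0.0, 1.0 / Lf, 0.0]

def directional(phi, x, vec, eps=1e-6):
    # Central finite-difference directional derivative of phi along vec(x).
    v = vec(x)
    xp = [a + eps * b for a, b in zip(x, v)]
    xm = [a - eps * b for a, b in zip(x, v)]
    return (phi(xp) - phi(xm)) / (2 * eps)

x0 = [0.1, 2.0, 1.5, 3.0]                      # theta, omega, i_f, i_a
val = directional(lambda x: f(x)[1], x0, g1)   # Lg1 Lf^2 h1, numerically
closed = M * x0[3] / (J * Lf)                  # the claimed closed form

# Determinant of the decoupling matrix for outputs (theta, i_f):
detA = -(M * x0[2]) / (J * Lf * La)            # vanishes exactly when i_f = 0
```

The numerical and closed-form values agree, and the determinant expression makes the singularity structure explicit: the decoupling matrix loses rank only where the field current is zero.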
In this case, the calculation

    det A(x) = det [ M x_4/(J L_f)   M x_3/(J L_a)
                     1/L_f           0            ] = -M x_3/(J L_f L_a)     (78.58)

indicates that the decoupling matrix is nonsingular almost globally, and that the relative degree is well defined almost globally, i.e.,

    (r_1, r_2) = (3, 1)   (if i_f ≠ 0)     (78.59)

Note that the singularity at i_f = 0 corresponds to a particular value of one of the controlled outputs, namely y_2 = 0. Consequently, this singularity is easily avoidable during operation and can be handled at start-up without difficulty. If instead the second output is chosen to be the armature current i_a, i.e., if h_2(x) = x_4, then the remaining Lie-derivative calculations

    L_g1 h_2(x) = 0,   L_g2 h_2(x) = 1/L_a     (78.60)

lead to a decoupling matrix

    A(x) = [ M x_4/(J L_f)   M x_3/(J L_a)
             0               1/L_a          ]     (78.61)

which again is nonsingular almost globally. This implies that the relative degree is again well defined almost globally, i.e.,

    (r_1, r_2) = (3, 1)   (if i_a ≠ 0)     (78.62)

with a singularity when i_a = 0. Again the singularity corresponds to a particular value of one of the controlled outputs, namely y_2 = 0. Consequently, this singularity is easily avoidable during operation and can be handled at start-up without difficulty. Provided that some alternative start-up procedure is used to establish a nonzero current in the appropriate winding and that the commanded second output is chosen to be away from zero, the nonlinearity compensation defined by α(x) and β(x) in Equations 78.37 and 78.38 is well defined and easily implemented for the separately excited DC motor. Unfortunately, the value of input-output linearization for the single-source excitation strategies is significantly more limited.

Shunt Wound DC Motor

Consider now the shunt wound DC motor, in which a single source excites both field and armature windings due to the parallel connection between the two windings (with external resistance in series with the field winding to limit the field flux if desired). The variable assignments

    x_1 = θ,  x_2 = ω,  x_3 = i_f,  x_4 = i_a,  u = v,  y_1 = θ

are essentially the same as before, with the exception that just one source voltage v is present. With this variable assignment, the state variable model of the shunt wound DC motor from Equations 78.50 and 78.51 becomes

    ẋ_1 = x_2
    ẋ_2 = (M x_3 x_4 - B x_2 - T_l)/J
    ẋ_3 = (u - (R_f + R_x) x_3)/L_f
    ẋ_4 = (u - R_a x_4 - M x_3 x_2)/L_a

The evaluation of relative degree is simpler than before, since the shunt wound DC motor is a single-input system, and there is no need to consider defining a second output. The Lie-derivative calculations

    L_g h_1(x) = 0,   L_g L_f h_1(x) = 0,   L_g L_f^2 h_1(x) = (M/J)(x_4/L_f + x_3/L_a)

indicate that this system has a well-defined relative degree, except when the field and armature currents satisfy a particular algebraic constraint. In particular, it is clear that

    r = 3   (if L_a i_a + L_f i_f ≠ 0)

Note that the singularity occurs in a region of the state-space which is not directly defined by the value of the output variable (rotor position). Hence, singularity avoidance is no longer a simple matter. Most important, though, is the fact that the interconnected windings impose the constraint of unipolar torque (i.e., either positive torque or negative torque); hence, it is customary to use just a unipolar voltage source. Consequently, the shunt wound DC motor is not nearly as versatile in operation as the separately excited DC motor, despite the fact that it possesses a well-defined relative degree for a large subset of the state-space.

Series Wound DC Motor

Consider now the series wound DC motor, in which a single source excites both field and armature windings due to the series connection between the two windings (with external resistance in parallel with the field winding to limit the field flux if desired). The variable assignments are the same as for the shunt wound DC motor, with the exception that when field weakening is unnecessary (R_x = ∞) the order of the model effectively drops (x_3 and x_4 are not independent). With this variable assignment, the state variable model of the series wound DC motor from Equations 78.52 and 78.53 becomes

    ẋ_1 = x_2
    ẋ_2 = (M x_3 x_4 - B x_2 - T_l)/J
    ẋ_3 = (R_x x_4 - (R_f + R_x) x_3)/L_f
    ẋ_4 = (u - (R_a + R_x) x_4 + R_x x_3 - M x_3 x_2)/L_a
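The shunt-wound model discussed above is easy to exercise numerically. The Euler simulation below uses parameter values that are my own assumptions, chosen only to give a stable, well-damped example:

```python
# Illustrative Euler simulation of a shunt-wound DC motor:
#   J w' = M i_f i_a - B w - Tl
#   Lf i_f' = u - (Rf + Rx) i_f
#   La i_a' = u - Ra i_a - M i_f w
# All parameter values are assumptions for demonstration.

J, B, Tl = 0.05, 0.2, 0.1
Lf, La = 0.5, 0.05
Rf, Ra, Rx, M = 20.0, 1.0, 0.0, 0.4

def step(x, u, dt=1e-4):
    th, w, i_f, i_a = x
    return (th + dt * w,
            w + dt * (M * i_f * i_a - B * w - Tl) / J,
            i_f + dt * (u - (Rf + Rx) * i_f) / Lf,
            i_a + dt * (u - Ra * i_a - M * i_f * w) / La)

x = (0.0, 0.0, 0.0, 0.0)
for _ in range(200000):   # 20 s of simulated time at dt = 1e-4
    x = step(x, 100.0)
```

At steady state the field current settles to v/(R_f + R_x) = 5 A and the electrical torque M i_f i_a balances the friction and load torques, which provides a quick sanity check on the model equations.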
The evaluation of relative degree for this system is completed by computing the Lie-derivatives, and the result is that the relative degree is well defined provided that the field current is nonzero:

r = 3    (if i_f ≠ 0)

Again the singularity occurs in a region of the state-space which is not directly defined by the value of the output variable (rotor position). Hence, singularity avoidance is again not so simple. Moreover, as before, the interconnected windings impose the constraint of unipolar torque (i.e., either positive torque or negative torque), so a unipolar voltage source would be used. Consequently, the series wound DC motor is also not nearly as versatile in operation as the separately excited DC motor, despite the fact that it possesses a well-defined relative degree for a large subset of the state-space.

78.2.4 AC Induction Motors

The most appropriate AC motor to consider first is the induction motor, due to its symmetry. This motor is without doubt the most commonly used motor for a wide variety of industrial applications.

Dynamic Model    The three-phase, wye-connected induction motor is constructed from a magnetically smooth stator and rotor. The stator is wound with identical sinusoidally distributed windings displaced 120°, with resistance R_s, and these windings are wye-connected. The rotor may be considered to be wound with three identical short-circuited and wye-connected sinusoidally distributed windings displaced 120°, with resistance R_r. The voltage equations in machine variables may be expressed in terms of the winding resistances and flux linkages, where v_s is the vector of stator voltages, i_s is the vector of stator currents, λ_s is the vector of stator flux linkages, i_r is the vector of induced rotor currents, and λ_r is the vector of induced rotor flux linkages. The resistance matrices for the stator and rotor windings are diagonal. Denoting any of the above stator or rotor vectors (voltage, current, flux) by the generic notation f_s or f_r, respectively, the vector structure will be

f_s = [f_as  f_bs  f_cs]^T    (78.73)

and similarly for f_r, where components are associated with phases a, b, c in machine variables. Assuming magnetic linearity, the flux linkages may be expressed by Equation 78.70, with inductance matrices
where L_s is the stator self-inductance, M_s is the stator-to-stator mutual inductance, L_r is the rotor self-inductance, M_r is the rotor-to-rotor mutual inductance, M is the magnitude of the stator-to-rotor mutual inductances (which depend on rotor angle θ), and N is the number of pole pairs. For the mechanical dynamics, the differential equation for rotor velocity ω is driven by the torque of electrical origin, which may be determined from the inductance matrices using a general expression in which L(θ), i, λ denote the complete inductance matrix, current vector, and flux linkage vector appearing in Equation 78.75. For further modeling details, see [11].
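The idea of obtaining torque from the inductance matrices can be illustrated symbolically on a toy two-winding machine. This sketch assumes the standard coenergy form for a magnetically linear machine (torque as the rotor-angle derivative of ½ iᵀL(θ)i); the single-stator, single-rotor winding model below is hypothetical and much simpler than the handbook's three-phase matrices:

```python
import sympy as sp

theta = sp.symbols('theta')
N, Ls, Lr, M, i_s, i_r = sp.symbols('N L_s L_r M i_s i_r', positive=True)

# Toy inductance matrix: one stator and one rotor winding whose mutual
# inductance varies as M*cos(N*theta) (hypothetical machine).
L = sp.Matrix([[Ls,                     M * sp.cos(N * theta)],
               [M * sp.cos(N * theta),  Lr]])
i = sp.Matrix([i_s, i_r])

W = sp.Rational(1, 2) * (i.T * L * i)[0]  # magnetic coenergy (linear magnetics)
Te = sp.simplify(sp.diff(W, theta))       # torque, proportional to sin(N*theta)
print(Te)
```

Only the angle-dependent mutual term survives the differentiation, which is why the constant self-inductances play no role in torque production here.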
78.2. MOTION CONTROL WITH ELECTRIC MOTORS BY INPUT-OUTPUT LINEARIZATION

Reference Frame Transformation    Because of the explicit dependence of the voltage equations on rotor angle θ, direct analysis of induction motor operation using machine variables is quite difficult. Even the determination of steady-state operating points is not straightforward. Hence, it is customary to perform a nonsingular change of variables, called a reference frame transformation, in order to effectively replace the variables associated with the physical stator and/or rotor windings with variables associated with fictitious windings oriented within the specified frame of reference. For the symmetrical induction motor, the interaction between stator and rotor can be easily understood by considering a reference frame fixed on the stator, fixed on the rotor, rotating in synchronism with the applied stator excitation, or even rotating with an arbitrary velocity. The generality with which reference frame transformations may be applied with success to the induction motor is due to the assumed symmetry of both the stator and rotor windings. For present purposes, it suffices to consider a stationary frame of reference located on the stator. In the new coordinates, there will be a q-axis aligned with phase a on the stator, an orthogonal d-axis located between phase a and phase c on the stator, and a 0-axis which carries only trivial information. The transformation matrix K_rs(θ) used to express rotor variables in this reference frame is naturally θ-dependent, and the transformation matrix K_ss used to transform the stator variables is the constant matrix
Both matrices are orthonormal, meaning that they are constructed from orthogonal unit vectors; hence, their inverses are equal to their transposes. Formally stated, the change of variables considered is defined by

f̃_r = [f_qr  f_dr  f_0r]^T = K_rs(θ) f_r    (78.83)
f̃_s = [f_qs  f_ds  f_0s]^T = K_ss f_s    (78.84)

where the tilde represents the appropriate stator or rotor variable in the new coordinates. Since the windings on both the stator and the rotor are wye-connected, the sum of the stator currents, as well as the sum of the rotor currents, must always be equal to zero; in other words, i_as + i_bs + i_cs = 0 and i_ar + i_br + i_cr = 0. Note that the reference frame transformation, primarily intended to eliminate θ from the voltage equations, will also satisfy this algebraic current constraint by construction. After some tedious but straightforward algebra, it can be shown that the transformed voltage equations become free of rotor angle, and the transformed flux linkage equations become those of Equation 78.91.

Note that all dependence on rotor angle θ has been eliminated and, hence, all analysis is substantially simplified. On the other hand, this change of variables does not eliminate nonlinearity entirely. Note also that ĩ_0s = ĩ_0r = 0, so that the 0-axis equations from above can be completely ignored. The transformed electrical model is presently expressed in terms of the two orthogonal components of current and flux linkage, on both the stator and the rotor. Any state variable description will require that half of the transformed electrical variables be eliminated. Six possible permutations are available to select from, namely (i_s, i_r), (λ_s, λ_r), (i_s, λ_r), (λ_s, i_r), (i_s, λ_s), and (i_r, λ_r). Substituting the transformed inductances of Equation 78.91 into the torque expression (Equation 78.80), these permutations result in six possible torque expressions. Any of these expressions is valid, provided that the voltage equations are rewritten in terms of the same set of electrical state variables.
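The orthonormality property invoked for these transformation matrices (inverse equals transpose) can be checked numerically. The √(2/3) power-invariant scaling and the exact matrix entries below are assumptions, since the displayed matrices are not fully legible in this reproduction:

```python
import numpy as np

a = 2.0 * np.pi / 3.0                  # 120-degree phase displacement
s23, s12 = np.sqrt(2.0 / 3.0), 1.0 / np.sqrt(2.0)

def K(theta):
    """Reference frame transformation matrix (assumed power-invariant form)."""
    return s23 * np.array([
        [np.cos(theta), np.cos(theta - a), np.cos(theta + a)],
        [np.sin(theta), np.sin(theta - a), np.sin(theta + a)],
        [s12,           s12,               s12],
    ])

Kss = K(0.0)                           # constant stator transformation (theta = 0)
for th in (0.0, 0.7, 2.1):
    Krs = K(th)                        # theta-dependent rotor transformation
    assert np.allclose(Krs @ Krs.T, np.eye(3))      # rows are orthonormal
    assert np.allclose(np.linalg.inv(Krs), Krs.T)   # inverse equals transpose
assert np.allclose(np.linalg.inv(Kss), Kss.T)
```

Because the inverse is just the transpose, transforming back to machine variables never requires a matrix inversion at run time.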
Input-Output Linearizability    With the above modeling background, it is now possible to proceed with the main objective of determining the extent to which the induction motor is input-output linearizable. The only remaining modeling step is to select which permutation of electrical state variables to use, and then to construct the state variable model. Taking stator current and rotor flux as state variables
and selecting the orthogonal components of stator voltage as the two inputs, rotor angle θ as the primary output, and the rotor flux magnitude squared, λ̃²_qr + λ̃²_dr, as the secondary output, the remaining standard notation will be assigned as before. Using the notations assigned above, the resulting state variable model is defined by Equation 78.95, with constant coefficients given in terms of the machine parameters.

The nonlinearities of the induction motor are clearly apparent in Equation 78.95. In order to check relative degree, the Lie-derivative calculations for the first output and for the second output lead to a decoupling matrix with singularity condition

det[·] ∝ x3² + x4² = 0    (78.99)

and to the conclusion that input-output linearization may be applied to the transformed model of the induction motor (and, hence, to the machine variable model of the induction motor via inverse reference frame transformations) provided that the rotor flux is nonzero. Since the singularity at zero rotor flux corresponds to a particular value of the secondary output variable, i.e., to y2 = 0, it is easy to avoid this singularity in operation by commanding rotor fluxes away from zero and by using a start-up procedure to premagnetize the rotor prior to executing the input-output linearization calculations on-line.

78.2.5 AC Synchronous Motors

An important class of AC machines frequently used as actuators in control applications is the class of synchronous machines. Though these machines essentially share the same stator structure with the induction motor, the construction of the rotor is quite different and accounts for the asymmetry present in the modeling.

Dynamic Model    The class of synchronous machines contains several specific machines that are worth covering in this chapter, and these specific cases differ with respect to their rotor structures. They can all be considered special cases of a general synchronous machine. Hence, this section begins with the basic modeling equations for the general synchronous machine, and then specializes these equations to the specific cases of interest prior to evaluating input-output linearizability for each case. The general synchronous machine, which is commonly used as a generator of electric power, consists of a magnetically smooth stator with identical three-phase wye-connected sinusoidally distributed windings, displaced 120°. The rotor may or may not possess magnetic saliency in the form of physical poles. It may or may not possess a rotor cage (auxiliary rotor windings) for the purpose of providing line-start capability and/or damping rotor oscillations. Finally, it may or may not possess the capability of establishing a rotor field flux, via either a rotor field winding or permanent magnets mounted on the rotor; however, if no provision for rotor field flux exists, then necessarily the rotor must have a salient pole construction. In machine variables, the general expression for the voltage equations will involve the symmetric stator phases (designated by subscripts as, bs, and cs for phases a, b, and c) and the asymmetric rotor windings (designated by subscripts kq and kd for the q-axis and d-axis auxiliary windings and by subscript fd
for the field winding, which is assumed to be oriented along the d-axis). Hence, the voltage equations are written
where v_s is the vector of stator voltages, i_s is the vector of stator currents, λ_s is the vector of stator flux linkages, v_r is the vector of rotor voltages (zero for the auxiliary windings), i_r is the vector of rotor currents, and λ_r is the vector of rotor flux linkages. The stator and rotor resistance matrices are diagonal:

R_s = diag(R_s, R_s, R_s),    R_r = diag(R_kq, R_kd, R_fd)    (78.102)

When denoting any of the above stator or rotor vectors (voltage, current, flux) by the generic notation f_s or f_r, respectively, the vector structure will be

f_s = [f_as  f_bs  f_cs]^T    (78.103)
f_r = [f_kq  f_kd  f_fd]^T    (78.104)

where stator components are associated with the phase windings in machine variables, and rotor components are associated with the two auxiliary windings and the field winding in machine variables. Assuming magnetic linearity, the flux linkages may be expressed by Equation 78.105, with inductance matrices whose stator-to-rotor mutual part is

L_m(θ) = [ M_q cos(Nθ)           M_d sin(Nθ)           M_f sin(Nθ)
           M_q cos(Nθ − 2π/3)    M_d sin(Nθ − 2π/3)    M_f sin(Nθ − 2π/3)
           M_q cos(Nθ + 2π/3)    M_d sin(Nθ + 2π/3)    M_f sin(Nθ + 2π/3) ]    (78.107)

where L_s is the (average) stator self-inductance, M_s is the (average) stator-to-stator mutual inductance, L_m is the stator inductance coefficient that accounts for rotor saliency, L_qs is the self-inductance of the q-axis auxiliary winding, L_ds is the self-inductance of the d-axis auxiliary winding, L_fs is the self-inductance of the field winding, M_r is the mutual inductance between the two d-axis rotor windings, M_q, M_d, and M_f are the magnitudes of the angle-dependent mutual inductances between the stator windings and the various rotor windings, and N is the number of pole pairs. Note that this model does not account for the possibility of stator saliency (which gives rise to magnetic cogging). For the mechanical dynamics, the differential equation for rotor velocity ω is driven by the torque of electrical origin, which may be determined from the inductance matrices using the general expression in which L(θ), i, λ denote the complete inductance matrix, current vector, and flux linkage vector appearing in Equation 78.105. For further modeling details, see [11].

Reference Frame Transformation    The model derived above is not only nonlinear, it is also not in a form convenient for determining the steady-state conditions needed for achieving constant velocity operation, due to the model's periodic dependence on position. This dependence on position can be eliminated by a nonsingular change of variables, which effectively projects the stator variables onto a reference frame fixed to the rotor. Although it is possible to construct transformations to other frames of reference, these would not eliminate the position dependence, due to the asymmetry present in the rotor. Since the asymmetric rotor windings are presumed to be aligned with the rotor frame of reference, just one transformation matrix is necessary; it transforms circuit variables from the stator windings to fictitious windings that rotate with the rotor, and it is given by

K(θ) = √(2/3) [ cos(Nθ)    cos(Nθ − 2π/3)    cos(Nθ + 2π/3)
                sin(Nθ)    sin(Nθ − 2π/3)    sin(Nθ + 2π/3)
                1/√2       1/√2              1/√2 ]    (78.111)

This matrix is orthonormal and, hence, its inverse is equal to its transpose. Formally stated, the change of variables considered is defined by f̃_s = K(θ) f_s, where the tilde represents the stator variables in the new coordinates. Since the windings on the stator are wye-connected, the sum of the stator currents must always be equal to zero; in other words, i_as + i_bs + i_cs = 0. Note that the reference frame transformation, primarily intended to eliminate θ from the voltage equations, will also satisfy the algebraic current constraint by construction. After some tedious but straightforward algebra, it can be shown that the transformed voltage equations become
and the transformed flux linkage equations become, with the constant K = √(3/2), the corresponding rotor-frame expressions. Note that all dependence on rotor angle θ has been eliminated and, hence, all analysis is substantially simplified. However, nonlinearity has not been entirely eliminated. Note also that ĩ_0s = 0, so that the 0-axis equation from above can be completely ignored. Due to the rotor asymmetry, the most convenient set of electrical variables for expressing the electrical torque consists of stator current and stator flux. Using these variables, the torque expression becomes

T_e = N(ĩ_qs λ̃_ds − ĩ_ds λ̃_qs)    (78.120)

This and other expressions will now be specialized to cover three common types of synchronous motors: the rotor-surface permanent magnet motor, the rotor-interior permanent magnet motor, and the reluctance motor.

SPM Synchronous Motor    The surface-magnet PM synchronous motor is obtained when the rotor field winding is replaced by permanent magnets attached to the surface of a smooth rotor, with auxiliary rotor windings removed. In other words, this is the special case where i_fd = constant, i_kq = i_kd = 0, and L_m = 0. Of course, incorporating the effects of auxiliary windings is straightforward, but this is not pursued here. In order to simplify notation, new coefficients concerning magnet flux and inductance are defined, and these coefficients allow the torque to be expressed compactly. Variables are assigned standard notation for states, inputs, and outputs. Since for this motor the torque depends only on ĩ_qs, but not on ĩ_ds, the d-axis current is the appropriate choice of the second output. With the above variable assignments, the state variable model follows. The presence of nonlinearity in the electrical subdynamics is clear. In checking for relative degree, the Lie-derivative calculations for the first output and for the second output lead to a globally nonsingular decoupling matrix. Hence, the conclusion is that the relative degree is globally well defined and

{r1, r2} = (3, 1)  (globally)    (78.128)
IPM Synchronous Motor    The interior-magnet PM synchronous motor is obtained when the rotor field winding is replaced by permanent magnets mounted inside the rotor, thus introducing rotor saliency, and with auxiliary rotor windings removed. In other words, this is the special case where i_fd = constant, i_kq = i_kd = 0, and L_m ≠ 0. Of course, incorporating the effects of auxiliary windings is straightforward, but this is not pursued here. In order to simplify notation, the new coefficients

λ_m = √(3/2) M_f i_fd,    L_q = L_s − M_s − (3/2) L_m,    L_d = L_s − M_s + (3/2) L_m    (78.129)

concerning magnet flux and inductance are defined. Note that these new coefficients allow the torque to be expressed by
Variables are assigned standard notation for states, inputs, and outputs according to the same pattern as before. Since torque depends on both the q-axis and d-axis currents, the appropriate choice of the second output is nonunique and should account for the relative magnitude of the two torque production mechanisms. With the above variable assignments, the state variable model is given by the drift vector field

f(x) = [ x2
         (1/J)(N λ_m x3 + N(L_d − L_q) x3 x4 − B x2 − T_l)
         −(1/L_q)(R_s x3 + N(λ_m + L_d x4) x2)
         −(1/L_d)(R_s x4 − N L_q x3 x2) ]

together with the input vector fields entering through the q-axis and d-axis voltage channels. For this synchronous motor, nonlinearity also exists in the torque expression. In checking for relative degree, each of the currents will be individually selected as the second output, and the Lie-derivative calculations which are common to both cases are computed first. For the case when h2(x) = x4, the remaining Lie-derivative calculations provide the decoupling matrix, whose determinant is proportional to

λ_m + (L_d − L_q) x4

which yields {r1, r2} = (3, 1) provided λ_m + (L_d − L_q) x4 ≠ 0. For the case when h2(x) = x3, the remaining Lie-derivative calculations provide a decoupling matrix which yields

{r1, r2} = (3, 1)  (if x3 ≠ 0)    (78.139)

Hence, for this motor, it is clear that input-output linearization is a viable control strategy provided that isolated singularities are avoided, and this is simply achieved due to the fact that the singularities correspond to particular values of the second controlled output variable.

Synchronous Reluctance Motor    The synchronous reluctance motor is obtained when the rotor field winding and auxiliary windings are removed, but rotor saliency is introduced. In other words, this is the special case where i_fd = i_kq = i_kd = 0 and L_m ≠ 0. Of course, incorporating the effects of auxiliary windings is straightforward, but this is not pursued here. In order to simplify notation, the new coefficients L_q and L_d concerning inductance are defined as for the interior-magnet motor. Note that these new coefficients allow the torque to be expressed by

T_e = N(L_d − L_q) ĩ_qs ĩ_ds

Variables are assigned standard notation for states, inputs, and outputs according to the same pattern as before. Since torque depends on both the q-axis and d-axis currents in a symmetric way, the appropriate choice of the second output is nonunique and either of these currents would serve the purpose equally well. With the above variable assignments, the state variable model is given by Equation 78.135. Again, nonlinearity is present in both the electrical and mechanical subdynamics. In checking for relative degree, two cases are considered, corresponding to the two obvious choices for the second output. The common Lie-derivative calculations are given by

L_{g2} L_f² h_1(x) = (N(L_d − L_q)/(J L_d)) x3

For the case when h2(x) = x4, the remaining calculations provide the decoupling matrix, which suggests that {r1, r2} = (3, 1) (if x4 ≠ 0). For the case when h2(x) = x3, the remaining calculations provide the decoupling matrix with

det[·] = (N(L_d − L_q)/(J L_d L_q)) x3    (78.149)

which suggests that {r1, r2} = (3, 1) (if x3 ≠ 0). Hence, for this motor, it is clear that input-output linearization is a viable control strategy provided that isolated singularities are avoided, and this is simply achieved due to the fact that the singularities correspond to particular values of the second controlled output variable.
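The relative-degree bookkeeping for these synchronous motors can be reproduced symbolically. The state equations below are a standard interior-magnet dq model and should be treated as an assumption, since the handbook's displayed equations are not fully legible here; x1 is rotor angle, x2 speed, x3 the q-axis current, and x4 the d-axis current:

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
N, J, B, Tl, Rs, Ld, Lq, lam = sp.symbols(
    'N J B T_l R_s L_d L_q lambda_m', positive=True)
X = sp.Matrix([x1, x2, x3, x4])

# Assumed IPM dq drift vector field (hypothetical reconstruction).
f = sp.Matrix([
    x2,
    (N * lam * x3 + N * (Ld - Lq) * x3 * x4 - B * x2 - Tl) / J,
    -(Rs * x3 + N * (lam + Ld * x4) * x2) / Lq,
    -(Rs * x4 - N * Lq * x3 * x2) / Ld,
])
g1 = sp.Matrix([0, 0, 1 / Lq, 0])   # q-axis voltage input channel
g2 = sp.Matrix([0, 0, 0, 1 / Ld])   # d-axis voltage input channel

def lie(h, vf):
    """Lie derivative of the scalar h along the vector field vf."""
    return (sp.Matrix([h]).jacobian(X) * vf)[0]

h1, h2 = x1, x4                     # outputs: rotor angle, d-axis current
Lf2h1 = lie(lie(h1, f), f)          # second Lie derivative of h1 along f
D = sp.Matrix([[lie(Lf2h1, g1), lie(Lf2h1, g2)],
               [lie(h2, g1),    lie(h2, g2)]])
det = sp.simplify(D.det())          # vanishes iff lambda_m + (L_d - L_q)*x4 = 0
print(det)
```

Under this assumed model the determinant is proportional to λ_m + (L_d − L_q)x4, reproducing the singularity condition stated for the h2(x) = x4 case.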
78.2.6 Concluding Remarks

This chapter has described how input-output linearization may be used to achieve motion control for a fairly wide class of motors. Although the most crucial issues of state-variable modeling and relative degree have been adequately covered, a few remaining points need to be made. In several cases, namely the shunt-connected and series-connected DC motors and the induction motor, the choice of outputs has led to first-order internal dynamics. It is not difficult to show, however, that these internal dynamics present no stability problems for input-output linearization. Also, it should be emphasized that the control possibilities for shunt-connected and series-connected DC motors are rather limited because of their unipolar operation, but the isolated singularities found for all the other motors should not pose any real problems in practice.

The material presented in this chapter can be extended in several directions, beyond the obvious step of completely specifying the explicit feedback controls. Input-output linearization may be applied also to motors with nonsinusoidal winding distribution, to motors operating in magnetic saturation, or to motors with salient pole stators and concentrated windings (e.g., the switched reluctance motor). Moreover, there are still many supplementary issues that are important to mention, such as the augmentation of the basic controllers with algorithms to estimate unmeasured states and/or to identify unknown parameters. Since model-based nonlinear controllers depend on parameters that may be unknown or slowly varying, an on-line parameter identification scheme can be included to achieve indirect adaptive control. Parameters describing the motor load are especially subject to significant uncertainty. There are several motives for accepting the additional complexity required for implementation of indirect adaptive control, including the potential for augmented performance and improved reliability due to the use of diagnostic parameter checks. Typical examples of parameter identification schemes and adaptive control may be found in [13] for induction motors, and in [15] for permanent-magnet synchronous motors.

One of the challenges in practical nonlinear control design for electric machines is to overcome the need for full state measurement. The simplest example of this is commutation sensor elimination (e.g., elimination of Hall-effect devices from inside the motor frame). For electronically commutated motors, such as certain permanent-magnet synchronous motors, some applications do not require control of instantaneous torque, but can get by with simple commutation excitation with a variable firing angle. Even at this level of control, the need for a commutation sensor is crucial in order to maintain synchronism. The literature on schemes used to drive commutation controllers without commutation sensors is quite extensive.

A more difficult problem in the same direction is high-accuracy position estimation suitable for eliminating the high-resolution position sensor required by most of the nonlinear controls discussed earlier. In some applications, despite the need for high dynamic performance, the use of a traditional position sensor is considered undesirable due to cost, the volume and/or weight of the sensor, or the potential unreliability of the sensor in harsh environments. In this situation, the only alternative is to extract position (and/or speed) information from the available electrical terminal measurements. Of course, this is not an easy thing to do, precisely because of the nonlinearities involved. Some promising results have been reported in the literature on this topic, for permanent-magnet synchronous motors, switched reluctance motors, and induction motors. For the induction motor specifically, there is also a need to estimate either the rotor fluxes or rotor currents, in order to implement the input-output linearizing controller discussed earlier. Thorough treatments of this estimation problem are also available in the literature.
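As a sketch of the parameter identification idea mentioned above (estimating uncertain load parameters from motion data), the following hypothetical example recovers the friction coefficient B and load torque T_l from the rotor equation J dω/dt = T_e − Bω − T_l by least squares; all numerical values are invented for illustration:

```python
import numpy as np

# Hypothetical mechanical parameters and a known excitation torque.
J, B_true, Tl_true = 0.05, 0.1, 2.0
dt, n = 1e-3, 2000
w = np.zeros(n + 1)
Te = 5.0 + np.sin(dt * np.arange(n))          # known applied torque profile
for k in range(n):                            # forward-Euler "measured" speed
    w[k + 1] = w[k] + dt * (Te[k] - B_true * w[k] - Tl_true) / J

# Rearranged model: Te_k - J*(dw/dt)_k = B*w_k + T_l, linear in (B, T_l).
y = Te - J * np.diff(w) / dt
Phi = np.column_stack([w[:-1], np.ones(n)])   # regressor matrix [w_k, 1]
B_hat, Tl_hat = np.linalg.lstsq(Phi, y, rcond=None)[0]
print(round(B_hat, 4), round(Tl_hat, 4))      # -> 0.1 2.0
```

A recursive variant of the same regression is what an on-line indirect adaptive scheme would run alongside the linearizing controller.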
78.3. CONTROL OF ELECTRICAL GENERATORS
References

[1] Blaschke, F., The principle of field orientation applied to the new transvector closed-loop control system for rotating field machines, Siemens Rev., 39, 217-220, 1972.
[2] Bodson, M., Chiasson, J., and Novotnak, R., High-performance induction motor control via input-output linearization, IEEE Control Syst. Mag., 14(4), 25-33, 1994.
[3] Bodson, M., Chiasson, J.N., Novotnak, R.T., and Rekowski, R.B., High-performance nonlinear feedback control of a permanent magnet stepper motor, IEEE Trans. Control Syst. Technol., 1(1), 5-14, 1993.
[4] Chiasson, J. and Bodson, M., Nonlinear control of a shunt DC motor, IEEE Trans. Autom. Control, 38(11), 1662-1666, 1993.
[5] Chiasson, J., Nonlinear differential-geometric techniques for control of a series DC motor, IEEE Trans. Control Syst. Technol., 2(1), 35-42, 1994.
[6] De Luca, A. and Ulivi, G., Design of an exact nonlinear controller for induction motors, IEEE Trans. Autom. Control, 34(12), 1304-1307, 1989.
[7] Hemati, N., Thorp, J.S., and Leu, M.C., Robust nonlinear control of brushless dc motors for direct-drive robotic applications, IEEE Trans. Ind. Electron., 37(6), 460-468, 1990.
[8] Isidori, A., Nonlinear Control Systems, 2nd ed., Springer-Verlag, New York, 1989.
[9] Kim, D.I., Ha, I.J., and Ko, M.S., Control of induction motors via feedback linearization with input-output decoupling, Int. J. Control, 51(4), 863-883, 1990.
[10] Kim, G.S., Ha, I.J., and Ko, M.S., Control of induction motors for both high dynamic performance and high power efficiency, IEEE Trans. Ind. Electron., 39(4), 323-333, 1992.
[11] Krause, P.C., Analysis of Electric Machinery, McGraw-Hill, New York, 1986.
[12] Krzeminski, Z., Nonlinear control of induction motor, Proc. 10th IFAC World Congress, Munich, Germany, 1987, pp. 349-354.
[13] Marino, R., Peresada, S., and Valigi, P., Adaptive input-output linearizing control of induction motors, IEEE Trans. Autom. Control, 38(2), 208-221, 1993.
[14] Oliver, P.D., Feedback linearization of dc motors, IEEE Trans. Ind. Electron., 38(6), 498-501, 1991.
[15] Sepe, R.B. and Lang, J.H., Real-time adaptive control of the permanent-magnet synchronous motor, IEEE Trans. Ind. Appl., 27(4), 706-714, 1991.
[16] Zribi, M. and Chiasson, J., Position control of a PM stepper motor by exact linearization, IEEE Trans. Autom. Control, 36(5), 620-625, 1991.
78.3 Control of Electrical Generators

Thomas M. Jahns, GE Corporate R&D, Schenectady, NY

Rik W. De Doncker, Silicon Power Corporation, Malvern, PA
78.3.1 Introduction

Electric machines are inherently bidirectional energy converters that can be used to convert electrical energy into mechanical energy during motoring operation, or mechanical into electrical energy during generating operation. Although the underlying principles are the same, there are some significant differences between the machine control algorithms developed for motion control applications and those for electric power generation. While shaft torque, speed, and position are the controlled variables in motion control systems, machine terminal voltage and current are the standard regulated quantities in generator applications. In this section, control principles will be reviewed for three of the most common types of electric machines used as electric power generators: DC machines, synchronous machines, and induction machines. Although abbreviated, this discussion is intended to introduce many of the fundamental control principles that apply to the use of a wide variety of alternative specialty electric machines in generating applications.
78.3.2 DC Generators

Introduction    DC generators were among the first electrical machines to be deployed reliably in military and commercial applications. Indeed, DC generators designed by the Belgian Professor F. Nollet and built by his assistant J. Van Malderen were used as early as 1859 as DC sources for arc lights illuminating the French coast [8]. During the French war in 1870, arc lights powered by the same magneto-electric machines were used to illuminate the fields around Paris to protect the city against night attacks [25]. Another Belgian entrepreneur, Zénobe Gramme, who had worked with Van Malderen, developed the renowned Gramme winding and refined the commutator design to its present state in 1869. In that year he demonstrated that the machine was capable of driving a water pump at the World Exposition held in Vienna. Indeed, DC generators are constructed identically to DC motors, although they are controlled differently. Whereas DC motors are controlled by the armature current to regulate the torque production of the machine [23], DC generators are controlled by the field current to maintain a regulated DC terminal voltage. Typical applications for DC generators have included power supplies for electrolytic processes and variable-voltage supplies for DC servo motors, such as in rolling mill drives. Rotating machine uninterruptible power supplies (UPS), which use batteries for energy storage, require AC and DC generators, as well as AC and DC motors. Other important historical applications include DC substations for DC railway supplies. However, since the advent of DC rectifiers, especially silicon rectifiers (1958), rotating DC generators have been steadily replaced in new DC generator installations by diode or thyristor solid-state rectifiers. In those situations where mechanical-to-electrical energy conversion takes place, DC generators have been replaced by synchronous machines feeding DC bridge rectifiers, particularly in installations at high power, i.e., installations requiring several tens of kilowatts. Synchronous generators with DC rectifiers are also referred to as brushless DC generators.

There are multiple reasons for the steady decline in the number of DC generators. Synchronous machines with solid-state rectifiers do not require commutators or brushes that require maintenance. The commutators of DC machines also produce arcs that are not acceptable in mining operations and chemical plants. In a given mechanical frame, synchronous machines with rectifiers can be built with higher overload ratings. Furthermore, as the technology of solid-state rectifiers matured, the rectifier systems and the brushless DC generator systems became more economical to build than their DC machine counterparts.
Figure 78.8(a) shows the DC generator construction, which is similar to that of a DC shunt motor. The space flux vector diagram of Figure 78.8(b) shows the amplitude and relative spatial position of the stator and armature fluxes that are associated with the winding currents. The moving member of the machine is called the armature, while the stationary member is called the stator. The armature current i_a is taken from the armature via a commutator and brushes. A stationary field is provided by an electromagnet or a permanent magnet and is perpendicular to the field produced by the armature. In the case of an electromagnet, the field current can be controlled by a DC power supply.
Despite this declining trend in usage, it is still important to understand the DC generator's basic operating principles because they apply directly to other rotating machine generators and motors (especially DC motors). The following subsections describe the construction and equivalent model of the DC generator and the associated voltage control algorithms.
DC Machine Generator Fundamentals

Machine Construction and Vector Diagram    Most rotating DC generators are part of a rotating machine group where AC induction machines or synchronous machines drive the DC generator at a constant speed. Figure 78.7 illustrates a typical rotating machine lineup that converts AC power to DC power. Note that this rotating machine group has bidirectional power flow capability because each machine can operate in its opposite mode. For example, the DC generator can operate as a DC motor. Meanwhile, the AC motor will feed power back into the AC grid and operate as an AC generator.
Figure 78.7    Typical line-up of a DC generator driven by an AC motor, converting fixed three-phase AC input to variable DC output.

Figure 78.8    (a) Construction of DC generator, showing the stator windings and armature winding. (b) Vector diagram representing flux linked to the field winding (Ψ_f), armature winding (Ψ_a), and series compensation winding (Ψ_c).
Most DC machines are constructed with interpole windings to compensate for the armature reaction. Without this compensation winding, the flux inside the machine would change to Ψ'_f when the armature current (and flux Ψ_a) increases. Due to the magnetic saturation of the pole iron, this change in flux leads to a loss in internal back-emf voltage. Hence, the output voltage would drop faster than predicted by the internal armature resistance whenever the load increases. Conducting the armature current
78.3. CONTROL OF ELECTRICAL GENERATORS

through the series compensation winding creates a compensating flux Ψc which adds to the flux Ψf such that the total flux of the machine remains at the original level Ψf determined by the field current. The vector diagram of Figure 78.8(b) shows another way to illustrate the flux linkages inside the DC generator. One can first construct the so-called stator flux Ψs of the DC machine. The stator flux can be defined as the flux produced solely by the stator windings, i.e., the field winding and the compensation winding. Adding the armature flux Ψa to the stator flux Ψs yields the field winding flux Ψf, which represents the total flux experienced by the magnetic circuit. Another advantage of the compensation winding is a dramatic reduction of the armature leakage inductance because the flux inside the DC machine does not change with varying armature current. In other words, the magnetic stored energy inside the machine is greatly decoupled from changes of load current, making the machine a stiffer voltage source during transient conditions.

DC Generator Equivalent Circuit and Equations The DC generator equivalent circuit is depicted in Figure 78.9. The rotating armature windings produce an induced voltage that is proportional to the speed and the field flux of the machine (from the Blv rule [6]):

ea = k ωm Ψf   (78.151)

Figure 78.9 Equivalent circuit of DC generator showing the armature impedances (Ra and La) and armature back-emf (ea). Also shown are the field winding impedances (Rf and Lf), the field supply voltage (vf), and current (if).

Parasitic armature elements are the armature resistance Ra (resistance of armature and compensation windings, cables, and brushes) and the armature leakage inductance La (flux not mutually coupled between armature and compensation winding). As a result, the armature voltage loop equation is given by:

va = ea − Ra ia − La (dia/dt)
The field winding is perpendicular to the armature winding and the compensation winding and experiences no induced voltages. Hence, its voltage-loop equation simplifies to:

vf = Rf if + Lf (dif/dt)
The electromagnetic torque Te produced by the DC generator is proportional to the field flux and the armature current (from the Bil rule [6]) and can be expressed as:

Te = k Ψf ia
Note that this torque Te acts as a load on the AC motor that drives the DC generator, as shown in Figure 78.7. Hence, to complete the set of system equations that describe the dynamic behavior of the DC generator, a mechanical motion equation describing the interaction between the DC generator and the AC drive motor is necessary. We assume that the mechanical motion equation of a DC generator fed by an electric motor can be characterized by the dynamic behavior of the combined inertia of the generator armature and the rotor of the motor and their speed-proportional damping. Speed-proportional damping is typically caused by a combination of friction, windage losses, and electrically induced losses in the armature and rotor:

J (dωm/dt) = Tm − Te − D ωm
where J is the total combined inertia of the DC generator and drive motor, ωm is the angular velocity of the motor-generator shaft, and D is the speed-proportional damping coefficient. The motor torque Tm is determined by the type of prime mover used and its control functions. For example, synchronous machines do not vary their average speed whenever the load torque (produced by the DC generator) varies. On the other hand, the speed of AC induction machines slips a small amount at increased load torque. The torque response of speed-controlled DC machines or AC machines depends greatly on the speed or position feedback loop used to control the machine. Assuming a proportional-integral speed control feedback loop, the torque of the driving motor can be described by the following function:

Tm = Kp (ωm* − ωm) + Ki ∫ (ωm* − ωm) dt
where ωm is the measured speed of the drive motor, ωm* is the desired speed of the drive motor, Kp is the proportional gain of the drive motor feedback controller, and Ki is the integral gain of the drive motor feedback controller. Assuming no field regulation and assuming a constant armature speed, the DC generator output voltage varies in steady state as a function of the load current according to:

va = ea − Ra ia
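The resulting droop can be checked numerically. The sketch below is illustrative only: the machine constants (k, ωm, Ψf, Ra) are invented for the example, not taken from the text.

```python
# Steady-state DC generator load characteristic (see Figure 78.10):
# v_a = e_a - R_a * i_a, with e_a = k * w_m * Psi_f held constant
# (no field regulation, constant armature speed). All numeric values
# here are illustrative assumptions.

def armature_voltage(i_a, k=1.2, w_m=100.0, psi_f=0.9, r_a=0.5):
    """Terminal voltage at load current i_a for constant field and speed."""
    e_a = k * w_m * psi_f      # induced back-emf, Equation 78.151
    return e_a - r_a * i_a     # minus the resistive armature drop

no_load = armature_voltage(0.0)     # back-emf alone: 108.0 V
full_load = armature_voltage(20.0)  # droops to 98.0 V at 20 A
```

The linear drop with load current is exactly the characteristic sketched in Figure 78.10.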
Figure 78.10 illustrates the DC generator steady-state load characteristic. In some applications the drop of the output voltage with increasing load current is not acceptable, and additional control is required. Also, during dynamic conditions, e.g., a step change in the load current, the armature voltage may respond too slowly or with insufficient damping. The following section analyzes the dynamic response and describes a typical control algorithm to improve the DC generator performance.
THE CONTROL HANDBOOK
Figure 78.10 DC generator steady-state load characteristic.
DC Generator Control Principles Basic Generator System Control Characteristics A control block diagram can be derived from the DC generator equations and is illustrated in Figure 78.11. As indicated by this block diagram, the DC generator system exhibits some complicated coupled dynamics. For example, whenever the generator load increases quickly (e.g., due to switching of resistive-inductive loads), a rapid increase of the armature current will occur because the armature time constant is typically less than 100 msec. This leads to an equally fast increase in the generator load torque Te. The speed regulator of the drive motor will respond to this fast torque increase with some time delay (determined by the inertia of the drive and the speed PI controller). Some speed variation can be expected as a result of this delay. Such speed variations directly influence the armature-induced voltage ea. This causes the armature voltage and the armature current to change, thereby altering the generator load torque. In conclusion, the DC generator drive behaves as a system of higher order. Figure 78.12 shows that a step change in load current has an impact on the armature voltage, the torque of the DC generator, and the speed of the rotating group. In this example, the output voltage settles to its new steady-state value after an oscillatory transient. Under certain circumstances (e.g., low inertia) the drive may ultimately become unstable.

Generator Voltage Regulation A field controller can be added to enhance the dynamic performance of the generator system. However, this controller acts on the proportionality factor between the back-emf voltage and the speed of the drive. Hence, this control loop makes the system nonlinear, and simple feedback control can be difficult to apply. Feedforward back-emf decoupling (from speed) combined with feedback control is therefore often used in machine controllers because of its simplicity and fast dynamic response.
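The coupled higher-order behavior described above can be reproduced with a minimal forward-Euler simulation of the armature voltage loop, the mechanical motion equation, and the drive motor's PI speed loop. All parameter values below are invented for illustration; the structure of the equations, not the numbers, is the point.

```python
# Sketch of the coupled DC generator + drive motor dynamics under a load
# step, with constant field flux and linear magnetics assumed. Every
# numeric parameter is an illustrative assumption, not from the text.

def simulate(steps=20000, dt=1e-4):
    k, psi_f = 1.2, 0.9          # emf/torque constant and field flux
    r_a, l_a = 0.5, 0.05         # armature resistance and leakage inductance
    j_tot, d = 0.2, 0.05         # combined inertia and damping coefficient
    kp, ki = 8.0, 40.0           # drive-motor PI speed controller gains
    w_ref = 100.0                # desired shaft speed (rad/s)
    r_load = 10.0                # resistive load on the armature
    w, i_a, integ = w_ref, 0.0, 0.0
    for n in range(steps):
        if n == steps // 2:
            r_load = 5.0                             # step change: heavier load
        e_a = k * w * psi_f                          # back-emf (78.151)
        di_a = (e_a - (r_a + r_load) * i_a) / l_a    # armature voltage loop
        t_e = k * psi_f * i_a                        # generator load torque
        err = w_ref - w
        integ += err * dt
        t_m = kp * err + ki * integ                  # PI-controlled drive torque
        dw = (t_m - t_e - d * w) / j_tot             # mechanical motion equation
        i_a += di_a * dt
        w += dw * dt
    return w, i_a * r_load       # final shaft speed and load voltage

w_final, v_load = simulate()
```

The load step perturbs the shaft speed through the torque coupling, and the PI integral action returns the speed to its reference, mirroring the transient behavior of Figure 78.12.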
Figure 78.13 shows the block diagram of a field controller which utilizes feedback and feedforward control. The feedforward loop acts to decouple the speed variations from the generator back-emf ea. This controller is obtained by inverting the
model equations of the back-emf and the field winding. Figure 78.14 shows the response of the generator system when only feedback control is applied. The armature voltage does not have a steady-state error but still shows some oscillatory behavior at the start of the transient. To improve the dynamic response of the generator further, it is essential to decouple the influence of the speed variation on the armature back-emf ea. This can be achieved using feedforward control by measuring the drive speed and computing the desired field voltage to maintain a constant back-emf ea, as shown in Figure 78.13. A lead network compensates for the field winding inductive lag (within power limits of the field winding power supply). In practice, the field power supply has a voltage limit, which can be represented by a saturation function in the field controller block diagram. Note that the feedforward control loop relies on good estimates of the machine parameters. The feedback control loop does not require these parameters and guarantees correct steady-state operation. Figure 78.15 illustrates the response of the DC generator assuming a 10% error in the control parameters in the feedforward controller. Clearly, a faster response is obtained with less transient error.
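A sketch of the feedforward computation described above, obtained by inverting the back-emf and field-winding equations. A linear (unsaturated) field model Ψf = Lfm·if is assumed, and all parameter values are invented for illustration:

```python
# Feedforward branch of the field controller (Figure 78.13): given the
# measured speed (and its rate of change), compute the field voltage that
# holds e_a at its reference. Assumes the linear field model
# psi_f = l_fm * i_f; all numeric values are illustrative assumptions.

def feedforward_field_voltage(w_m, dw_m_dt, e_a_ref=108.0,
                              k=1.2, l_fm=0.3, r_f=20.0, l_f=2.0):
    """Field voltage v_f that holds e_a = e_a_ref at measured speed w_m."""
    i_f_ref = e_a_ref / (k * w_m * l_fm)            # invert e_a = k*w_m*psi_f
    # d(i_f_ref)/dt by the chain rule -- this derivative term plays the role
    # of the lead network compensating the field-winding inductive lag:
    di_f_ref = -e_a_ref * dw_m_dt / (k * l_fm * w_m ** 2)
    return r_f * i_f_ref + l_f * di_f_ref           # invert v_f = R_f i_f + L_f di_f/dt

v_ff_nominal = feedforward_field_voltage(100.0, 0.0)   # steady speed
v_ff_dip = feedforward_field_voltage(95.0, -50.0)      # speed dipping
```

When the measured speed dips, the computed field voltage rises immediately, holding ea near its reference without waiting for a feedback error to develop.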
78.3.3 Synchronous Generators Synchronous Machine Fundamentals and Models Introduction Synchronous machines provide the basis for the worldwide AC electric power generating and distribution system, earning them recognition as one of the most important classes of electrical generators. In fact, the installed power-generating capacity of AC synchronous generators exceeds that of all other types of generators combined by many times. The scalability of synchronous generators is truly impressive, with available machine ratings covering at least 12 orders of magnitude from milliwatts to gigawatts. A synchronous generator is an AC machine which, in comparison to the DC generator discussed in Section 78.3.2, has its key electrical components reversed, with the field excitation mounted on the rotor and the corresponding armature windings mounted in slots along the inner periphery of the stator. The armature windings are typically grouped into three phases and are specially distributed to create a smoothly rotating magnetic flux wave in the generator's airgap when balanced three-phase sinusoidal currents flow in the windings. The rotor-mounted field can be supplied either by currents flowing in a directed field winding (wound-field synchronous generator, or WFSG) or, in special cases, by permanent magnets (PMSG). Figure 78.16 includes a simplified cross-sectional view of a wound-field machine showing the physical relationship of its key elements. A wealth of technical literature is available which addresses the basic operating principles and construction of AC synchronous machines in every level of desired detail [7], [22], [24]. Synchronous Generator Model There are many alternative approaches which have been proposed for modeling synchronous machines, but one of the most powerful from the standpoints of both analytical usefulness and physical insight is
Figure 78.11 Block diagram of basic DC generator system.
Figure 78.12 Response of DC generator with constant field vf.
Figure 78.13 Feedforward and feedback control loops for DC generator field control.
Figure 78.14 Response of DC generator with feedback control on field excitation vf.

the dq equivalent circuit model shown in Figure 78.17. These coupled equivalent circuits result from application of the Park-Blondel dq transformation [15], which transforms the basic electrical equations of the synchronous machine from the conventional stationary reference frame locked to the stator windings into a rotating reference frame revolving in synchronism with the rotor. The direct (d) axis is purposely aligned with the magnetic flux developed by the rotor field excitation, as identified in Figure 78.16, while the quadrature (q) axis is orthogonally oriented at 90 electrical degrees of separation from the d-axis. As a result of this transformation, AC quantities in the stator-referenced equations become DC quantities during steady-state operation in the synchronously rotating reference frame. Each of the d- and q-axis circuits in Figure 78.17 takes the form of a classic coupled-transformer equivalent circuit built around the mutual inductances Lmd and Lmq representing the coupled
windings or eddy-current circuits in solid-iron rotor cores. When analyzing the interaction of a synchronous generator (or several generators) with the associated power system, it is often very convenient to reduce the higher-order dq synchronous generator model to a much simpler Thevenin equivalent circuit model, as shown in Figure 78.18, consisting of a voltage source behind an inductive impedance [10]. The corresponding values of equivalent inductance and source voltage in this model change quite significantly depending on whether the analysis is addressing steady-state operation or transient response. In particular, the generator's transient synchronous inductance L′s is significantly smaller than the steady-state synchronous inductance Ls in most cases, while the transient model source voltage E′s is, conversely, noticeably larger than the steady-state source voltage Es.
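A small worked example of how the Thevenin model is used: the initial current into a solid fault is governed by the transient pair (E′s, L′s), while the sustained current is governed by the steady-state pair (Es, Ls). The per-unit values below are assumptions for illustration only.

```python
# Fault-current estimate from the Thevenin model of Figure 78.18: a voltage
# source behind an inductive impedance. Per-unit reactances stand in for
# w*L; all numeric values are illustrative assumptions.

def fault_current(e_source, x_source):
    """Per-unit current magnitude into a solid three-phase fault."""
    return abs(complex(e_source, 0.0) / complex(0.0, x_source))

i_transient = fault_current(1.1, 0.3)  # E's behind L's: large initial current
i_sustained = fault_current(1.0, 1.8)  # Es behind Ls: much smaller sustained value
```

The large ratio between the two results is why the transient and steady-state parameter sets must not be interchanged in system studies.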
Exciters for Wound-Field Synchronous Generators
Figure 78.15 Response of DC generator with feedforward and feedback control on field excitation vf with a 10% error in the feedforward control parameters.
Figure 78.16 Simplified cross-sectional view of wound-field synchronous generator, including identification of d-q axes in rotor-oriented reference frame.

stator-rotor magnetic flux in each of the two axes. The principal difference between the d- and q-axis circuits is the presence of an extra set of external excitation input terminals in the d-axis rotor circuit modeling the field excitation. The externally controlled voltage source vf which appears in this rotor excitation circuit represents the principal input port for controlling the output voltage, current, and power of a wound-field synchronous generator. The other rotor circuit legs which appear in both the d- and q-axis circuits are series resistor-inductor combinations modeling the passive damping effects of special rotor-mounted damper
The exciter is the functional block that regulates the output voltage and current characteristics of a wound-field synchronous generator by controlling the instantaneous voltage (and, thus, the current) applied to the generator's field winding. Exciters for large utility-class synchronous generators (> 1000 MW) must handle on the order of 0.5% of the generator's rated output power, thereby requiring high-power excitation equipment with ratings of 5 MW or higher [27]. Typically, the exciter's primary responsibility is to regulate the generator's output AC voltage amplitude for utility power system applications, leading to the simplified representation of the resulting closed-loop voltage regulation system shown in Figure 78.19. Almost all of the basic exciter control algorithms in use today for large synchronous generators are based on classical control techniques, which are well suited for the machine's dynamic characteristics and limited number of system inputs and outputs. The dynamics of such large generators are typically dominated by their long field winding time constants, which are on the order of several seconds, making it possible to adequately model the generator as a very sluggish low-pass filter for small-signal excitation analyses. While such sluggish generator dynamics may simplify the steady-state voltage regulator design, they complicate the task of fulfilling a second major responsibility of the exciter, which is to improve power system stability in the event of large electrical disturbances (e.g., faults) in the generator's vicinity. This large-disturbance stabilization requires that the exciter be designed to increase its output field voltage to its maximum (ceiling) value as rapidly as possible in order to help prevent the generator from losing synchronism with the power grid (i.e., "pull-out") [21].
Newer "high initial response (HIR)" exciters are capable of increasing their output voltages (i.e., the applied field voltage vf) from rated to ceiling values in less than 0.1 sec [12]. Exciter Configurations A wide multitude of exciter designs have been successfully developed for utility synchronous generators and can be found in operation today. These designs vary in such key regards as the method of control signal amplifi-
Figure 78.17 Synchronous generator d- and q-axis equivalent circuits. ω, electrical synchronous angular frequency; Ra, La, stator resistance and leakage inductance; Lmd, Lmq, d- and q-axis mutual inductances; Rfd, Lfd, field resistance and leakage inductance; damper circuit (transient and subtransient) resistances and inductances; vd, vq, id, iq, stator dq axis voltages and currents; Ld = Lmd + La, d-axis stator inductance; Lq = Lmq + La, q-axis stator inductance; Ψd = Ld id + Lmd idr, d-axis stator flux; Ψq = Lq iq + Lmq iqr, q-axis stator flux.
Figure 78.18 Synchronous generator Thevenin-equivalent circuit model (unprimed variables identify steady-state circuit model variables; primed variables identify transient model).
Figure 78.19 Simplified block diagram of synchronous generator closed-loop voltage regulation system (exciter driving a first-order generator model).
cation to achieve the required field winding power rating, and the selected source of the excitation power (i.e., self-excitation from the generator itself vs. external excitation using an independent power source). Despite such diversity, the majority of modern exciters in use today can be classified into one of two categories based on the means of power amplification control: "rotating exciters" which use rotating machines as the control elements, and "static exciters" which use power semiconductors to perform the amplification.
Rotating Exciters Rotating exciters have been in use for many years and come in many varieties using both DC commutator machines and AC alternators as the main exciter machines [1], [4]. A diagram of one typical rotating exciter using an AC alternator and an uncontrolled rectifier to supply the generator field is shown in Figure 78.20(a). The output of the exciter alternator is controlled by regulating its field excitation, reflecting the cascade nature of the field excitation to achieve the necessary power amplification. One interesting variation of this scheme is the so-called "brushless exciter," which uses an inverted AC alternator as the main exciter with polyphase armature windings on the rotor and rotating rectifiers to eliminate the need for slip rings to supply the main generator's field winding [26]. The standard control representation developed by IEEE for the class of rotating exciters typified by the design in Figure 78.20(a) is shown in the accompanying Figure 78.20(b) [11]. The dynamics of the exciter alternator are modeled as a low-pass filter [1/(KE + sTE)] dominated by its field time constant, just as in the case of the main generator discussed earlier (see Figure 78.19). The time constant (TA) of the associated exciter amplifier is considerably shorter than the exciter's field time constant and plays a relatively minor role in determining the exciter's principal control characteristics. The presence of the rate feedback block [KF, TF] in the exciter control diagram is crucial to the stabilization of the overall voltage regulator. Without it, the dynamics of the voltage regulating loop in Figure 78.19 are dominated by the cascade combination of two low-frequency poles, which yield an oscillatory response as the amplifier gain is increased to improve steady-state regulation.
The rate feedback (generally referred to as "excitation control system stabilization") provides adjustable lag-lead compensation for the main regulating loop, making it possible to increase the regulator loop gain crossover frequency to 30 rad/sec or higher.
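The stabilization argument can be checked numerically. The sketch below uses invented gains and time constants (not IEEE standard parameter values): it models the regulating loop as two cascaded low-frequency poles and shows that adding a lead section, standing in for the effect of the rate feedback, raises the phase margin at gain crossover.

```python
# Phase margin of a voltage-regulating loop dominated by two low-frequency
# poles, with and without lead compensation. All time constants and gains
# are illustrative assumptions.
import cmath
import math

def phase_margin(loop, w_lo=1e-3, w_hi=1e3, n=20000):
    """Phase margin (degrees) at the gain-crossover frequency of loop(jw)."""
    for i in range(n):
        w = w_lo * (w_hi / w_lo) ** (i / (n - 1))   # log-spaced sweep
        if abs(loop(1j * w)) <= 1.0:                # first crossing of |L| = 1
            return 180.0 + math.degrees(cmath.phase(loop(1j * w)))
    return None

# Two cascaded low-frequency poles (main field + exciter field), high gain:
plain = lambda s: 50.0 / ((1 + 5.0 * s) * (1 + 1.0 * s))
# The same loop with a lead section mimicking the rate-feedback effect:
with_lead = lambda s: plain(s) * (1 + 2.0 * s) / (1 + 0.1 * s)

pm_plain = phase_margin(plain)       # small margin: oscillatory response
pm_lead = phase_margin(with_lead)    # noticeably larger margin
```

The lead term both adds phase near crossover and pushes the crossover frequency up, which is precisely the qualitative benefit attributed to the rate feedback block above.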
Static Exciters In contrast, a typical static exciter configuration such as the one shown in Figure 78.21(a) uses large silicon controlled rectifiers (SCRs) rather than a rotating machine to control the amount of generator output power fed back to the field winding [17]. The voltage regulating function is performed
Figure 78.20 Block diagrams of a rotating exciter scheme using an uncontrolled rectifier showing (a) physical configuration, and (b) corresponding exciter standard control representation [12].
Figure 78.21 Block diagrams of a static exciter scheme using phase-controlled SCRs showing (a) physical configuration, and (b) corresponding exciter standard control representation [18]. (Note that either the "Transient Gain Reduction" block or the "Rate Feedback" block is necessary for system stabilization, but not both.)
using classic phase control principles [14], which determine when the SCRs are gated on during each 60-Hz cycle to supply the desired amount of field excitation. Static exciters generally enjoy the advantage of much faster response times than rotating exciters, making them excellent candidates for "high initial response" exciters as discussed above. This is reflected in the standard static exciter control representation shown in Figure 78.21(b), which lacks the sluggish field time constant TE that dominates the dynamics of the rotating exciter [13]. As a result, the task of stabilizing the main voltage regulating loop is simplified in the absence of this low-frequency pole. Nevertheless, lag-lead compensation is often added to the exciter control in order to prevent the generator's high-frequency regulator gain from reducing the stability margin of the interconnected power system [13]. This compensation (referred to as "transient gain reduction") is provided in the Figure 78.21(b) control diagram using either the TB-TC block in the forward gain path or the same type of KF-TF rate feedback block introduced previously in Figure 78.20(b) (only one or the other is necessary in a given installation). Additional Exciter Responsibilities In addition to its basic responsibilities for regulating the generator's output voltage described above, the exciter is also responsible for additional important tasks including power system stabilization, load or reactive power compensation, and exciter/generator protection. A block diagram of the complete generator-plus-exciter system identifying these supplementary functions is provided in Figure 78.22. Each of these exciter functions will be addressed briefly in this section.
Power System Stabilization Since a utility generator is typically one component in a large power grid involving a multitude of generators and loads distributed over a wide geographic area, the interactions of all these mechanical and electrical systems give rise to complex system dynamics. In some cases, the result is low-frequency dynamic instabilities in the range of 0.1 to 2 Hz involving one or more generators "swinging" against the rest of the grid across intervening transmission lines [5]. One rather common source of "local" mode instabilities is a remote generator, located at the mouth of a mine, that is connected to the rest of the power grid over a long transmission line with relatively high per-unit impedance. Other examples of dynamic instabilities involve the interactions of several generators, giving rise to more complicated "inter-area" modes which can be more difficult to analyze and resolve. Unfortunately, the steps that are taken to increase exciter response for improved generator transient stability characteristics (i.e., "high initial response" exciters) tend to aggravate the power system stability problems. Weak transmission systems, which result in large power angles (exceeding 70 degrees) between the generator internal voltage and the infinite bus voltage, tend to demonstrate a particular susceptibility to this type of power system instability while under voltage regulator control. The typical approach to resolving such dynamic instability problems is to augment the basic exciter with an additional feedback control loop [13] known as the power system stabilizer (PSS), as shown in simplified form in Figure 78.23. As indicated in this figure, alternative input signals for the PSS that are presently being successfully used in the field include changes in the generator shaft speed (Δωr), generator electrical frequency (Δωe), and the electrical power (ΔPe). The primary control function performed by the PSS is to provide a phase shift, using one or more adjustable lead-lag stages, which compensates for the destabilizing phase delays accumulated in the generator and exciter electrical circuits. Additional PSS signal processing in Figure 78.23 is typically added to filter out undesired torsional oscillations and to prevent the PSS control loop from interfering with the basic exciter control actions during major transients caused by sudden load changes or power system faults. Proper settings for the primary PSS lead-lag gain parameters vary from site to site depending on the characteristics of the generator, its exciter, and the connected power system. Since the resulting dynamic characteristics can get quite complicated, a combination of system studies and field tests is typically required in order to determine the proper PSS gain settings for the best overall system performance. Effective empirical techniques for setting these PSS control gains in the field have gradually been developed on the basis of a significant experience base with successful PSS installations [16].
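The phase contribution of the PSS lead-lag stages can be evaluated directly. The time constants below are assumptions chosen so that the stages supply lead near a 1-Hz swing mode; they are not recommended settings for any particular installation.

```python
# Phase lead of the cascaded PSS lead-lag stages
# (1 + sT1)/(1 + sT2) * (1 + sT3)/(1 + sT4) at a given swing frequency.
# Time-constant values are illustrative assumptions.
import cmath
import math

def pss_phase_lead_deg(f_hz, t1, t2, t3, t4):
    """Phase (degrees) of the two cascaded lead-lag stages at f_hz."""
    s = 1j * 2.0 * math.pi * f_hz
    h = (1 + s * t1) / (1 + s * t2) * (1 + s * t3) / (1 + s * t4)
    return math.degrees(cmath.phase(h))

# Two identical lead stages (T1 > T2) evaluated at a 1-Hz local swing mode:
lead_at_swing = pss_phase_lead_deg(1.0, 0.25, 0.04, 0.25, 0.04)
```

In practice these time constants are tuned so that the net lead offsets the phase lag accumulated through the generator and exciter at the problematic swing frequency.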
Load or Reactive Power Compensation A second auxiliary function provided in many excitation systems is the tailored regulation of the generator's terminal voltage to compensate for load impedance effects or to control the reactive power delivered by the generator [13]. One particularly straightforward version of this type of compensation is shown in Figure 78.24. As indicated in this figure, the compensator acts to supplement the measured generator terminal voltage that is being fed back to the exciter's summing junction with extra terms proportional to the generator output current. The current-dependent compensation terms are added both in phase and 90° out of phase with the terminal voltage, with the associated compensation gains, Rc and Xc, having dimensions of resistive and reactive impedance. Depending on the values and polarities of these gains, this type of compensation can be used for different purposes. For example, if Rc and Xc are negative in polarity, this block can be used to compensate for voltage drops in power system components, such as step-up transformers, that are downstream from the generator's terminals where the voltage is measured. Alternatively, positive values of Rc and Xc can be selected when two or more generators are bussed together with no intervening impedance in order to force the units to share the delivered reactive power more equally. Although not discussed here, more sophisticated compensation schemes have also been developed which modify the measured terminal voltage based on calculated values of the real and reactive power rather than the corresponding measured current components. Such techniques provide means of achieving more precise control of the generator's output power characteristics and find their origins in exciter development work that was completed several decades ago [20].
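The compensator is a one-line computation on phasors: the voltage actually fed back to the exciter is Vc = |VT + (Rc + jXc)·IT|. The per-unit operating point below is an invented example:

```python
# Load/reactive power compensation: modify the voltage fed back to the
# exciter summing junction with current-proportional terms. Per-unit
# phasor values are illustrative assumptions.

def compensated_voltage(v_t, i_t, r_c, x_c):
    """Feedback voltage Vc = |VT + (Rc + jXc) * IT| (per-unit phasors)."""
    return abs(v_t + complex(r_c, x_c) * i_t)

v_t = complex(1.0, 0.0)    # measured terminal voltage, 1.0 pu
i_t = complex(0.8, -0.3)   # lagging (inductive) load current
# Positive gains raise the fed-back voltage -> regulator droops the terminal
# voltage, forcing bussed units to share reactive power:
sharing = compensated_voltage(v_t, i_t, 0.02, 0.05)
# Negative gains lower it -> regulator boosts the terminal voltage to cover
# a downstream (e.g., step-up transformer) drop:
boosting = compensated_voltage(v_t, i_t, -0.02, -0.05)
```

The two sign conventions correspond directly to the two use cases described in the text.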
Figure 78.22 Block diagram of complete generator-plus-exciter control system, identifying supplementary control responsibilities.
Figure 78.23 Basic control block diagram of power system stabilizer (PSS): the input signal (Δωr, Δωe, or ΔPe) passes through cascaded lead-lag stages [(1 + sT1)/(1 + sT2)][(1 + sT3)/(1 + sT4)], high-frequency torsional filters, and a limiter and transient response decoupler to produce the stabilizing signal Vs.
Exciter/Generator Protection Although a general review of the important issue of generator protection is well beyond the scope of this chapter [2], the specific role of the exciter in the generator's protection system deserves a brief discussion. Some of these key responsibilities include the following:

- Underexcited reactive ampere limit (URAL): the minimum excitation level is limited as a function of the output reactive current, since excessive underexcitation of the generator can cause dangerous overheating in the stator end turns.
- Generator maximum excitation limit: at the other extreme, the maximum excitation current is limited to prevent damage to the exciter equipment and to the generator field winding due to overheating.
- Volts-per-Hertz limiter: excessive magnetic flux levels in the generator iron, which can cause internal overheating, are prevented by using the exciter to limit the generator's output voltage as a function of output frequency (i.e., shaft speed).

Figure 78.24 Example of generator load/reactive power compensation, Vc = |VT + (Rc + jXc) IT|.
78.3.4 Induction Generators Induction Generator Fundamentals and Models Introduction Induction generators are induction machines that operate above synchronous speed and thereby convert mechanical power into electrical power. Induction generators have been extensively used in applications such as wind turbines and hydroelectric storage pumping stations. Frequent direct line starting is required in both of these applications. The AC induction machine offers the advantage that it can be designed for direct line-start operation, thereby avoiding additional synchronization machines and control. Furthermore, induction machines tolerate the thermal load associated with direct AC line-starting transients better than synchronous machines. At high power levels above 1 MVA, efficiency considerations favor synchronous generators, while at very low power levels below 1 kVA, permanent magnet synchronous generators (e.g., automobile alternators) are more cost effective. One can conclude that induction generators are preferred in generator applications that require frequent starting and that are in a power range of 10 to 750 kVA. Induction generators can operate in two distinctively different modes. In the first mode, the induction generator is connected to a fixed-frequency AC voltage source (e.g., utility line voltage) or a variable-frequency voltage-controlled AC source, such as pulse-width-modulated (PWM) inverters. The AC source provides the excitation (i.e., the magnetization current) for the induction machine. In this mode, the magnetizing flux is determined or controlled by the AC source voltage. The second mode of operation is the so-called self-excited
mode. During self-excitation the magnetizing current for the induction generator is provided by external reactive elements (usually capacitors) or voltage-source inverters operating in six-step waveform mode. Neither of these schemes makes it convenient to regulate the terminal voltage of the machine. The output voltage of the generator depends on many variables and parameters, such as generator speed, load current, magnetization characteristics of the machine, and capacitor values. The induction generator itself must deliver the necessary power to offset losses induced by the circulating reactive currents during self-excited operation, and the associated stator copper losses are typically high. As a result, self-excited induction generators are rarely used for continuous operation because they do not achieve high efficiency. Moreover, it is difficult to start the excitation process under loaded conditions. Nevertheless, self-excited operation with capacitors is sometimes used to brake induction motors to standstill in applications demanding rapid system shutdowns. In addition, six-step voltage-source inverters are often used in traction motor drive applications to apply regenerative braking. The control principles of induction generators feeding power into a controlled AC voltage source will be discussed in the following sections. Self-excited operation of induction generators has been analyzed extensively, and interested readers are referred to the available technical literature for more details [18]. Induction Generator Model Induction generators are constructed identically to induction motors. A typical induction machine has a squirrel-cage rotor and a three-phase stator winding. Figure 78.25 illustrates the construction of a two-pole (or one pole pair), two-phase induction machine. The stator winding consists of two windings (d and q) that are magnetically perpendicular.
The squirrel-cage rotor winding consists of rotor bars that are shorted at each end by rotor end rings.
are not to be confused with the total stator or rotor flux linkages ψs and ψr, which represent the superimposed flux coupling from all windings, as discussed later in this section. The basic equivalent circuit of an induction machine for steady-state operation is illustrated in Figure 78.26. This equivalent circuit shows that each phase winding has parasitic resistances and leakage inductances and that the stator and the rotor are magnetically coupled. However, other equivalent circuits can be derived for an induction machine, as illustrated in Figure 78.27. These equivalent circuits are obtained by transforming stator or rotor current and voltage quantities with a turns ratio "a". Figure 78.27 also specifies the different turns ratios and the corresponding flux vector diagrams which identify the flux reference vector used for each of the three equivalent circuits.
Figure 78.26 Single-phase equivalent (steady-state) circuit of induction machine. Lh, main (magnetizing) inductance; Lsl, stator leakage inductance; Lrl, rotor leakage inductance; Ls = Lh + Lsl, stator inductance; Lr = Lh + Lrl, rotor inductance; Rs, stator resistance; Rr, rotor resistance; vs = vsd + j·vsq, stator (line-to-neutral) voltage, dq components; is = isd + j·isq, stator current, dq components; ir = ird + j·irq, rotor current, dq components; ih = is + ir, magnetizing current; s, slip of the induction machine.
Some equivalent circuits are simpler for analysis because one leakage inductance can be eliminated [3]. For example, a turns ratio a = Lh/Lr transforms the equivalent circuit of Figure 78.26 into the topmost circuit of Figure 78.27, which has no leakage inductance in the rotor circuit. Hence, the rotor flux is selected here as the main flux reference vector, and the d-axis of the dq synchronous reference frame that corresponds with this turns ratio "a" is aligned with the rotor flux vector ψr. Torque-Slip Characteristics. The power the induction generator delivers depends on the slip frequency or, equivalently, the slip of the machine. Slip of an induction machine is defined as the relative speed difference of the rotor with respect to the synchronous speed set by the excitation frequency:
Figure 78.25 Two-phase induction machine, showing stator and rotor windings and flux diagram associated with stator and rotor currents. The flux linkages associated with currents in each set of windings (i.e., the stator magnetizing flux ψhs, the rotor magnetizing flux ψhr, and the d- and q-axis stator magnetizing flux components ψhsd and ψhsq) are shown in the flux diagram. (Underlined variables designate vector quantities.) Note that each magnetizing flux vector represents a flux component produced by the current in a particular winding. These flux components
s = (fe − fm)/fe = (ωe − ωm)/ωe = (ne − nm)/ne,

with s being the slip of the induction machine, fe the stator electrical excitation frequency (Hz), fm the rotor mechanical rotation frequency (Hz), ωe the stator excitation angular frequency (rad/s), ωm the rotor mechanical angular frequency (rad/s), ne the rotational speed of the excitation flux in the airgap (r/min), and nm the rotor mechanical shaft speed (r/min). In the case of machines with higher pole-pair numbers, the
THE CONTROL HANDBOOK
where Tem is the electromagnetic torque per phase (Nm), Tk is the per-phase pull-out (maximum) torque (Nm), sk is the pull-out slip associated with Tk, Vs is the rms stator line-to-neutral supply voltage (V), and Ll is the leakage inductance (H). Figure 78.28 illustrates a typical torque-slip characteristic of an induction machine. According to Equation 78.160, the torque of the induction machine at high slip values varies approximately inversely with the slip frequency. Operating an induction machine in this speed range beyond the pull-out slip is inefficient and unstable in the absence of active control, and, hence, this operating regime is of little interest. Stable operation is achieved in a narrow speed range around the synchronous speed (s = 0) between −sk and +sk, which are identified in Figure 78.28.
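The slip definition and the torque-slip characteristic of Figure 78.28 can be sketched numerically. Since Equation 78.160 itself is not reproduced here, the code below assumes the standard Kloss form Tem = 2·Tk/(s/sk + sk/s), which matches the behavior described in the text (pull-out torque Tk at s = sk, torque falling off roughly as 1/s well beyond pull-out); all numerical values are illustrative.

```python
def slip(n_e, n_m):
    """Slip s = (n_e - n_m) / n_e, with n_e the synchronous speed of the
    excitation flux and n_m the rotor shaft speed (same units, e.g. r/min)."""
    return (n_e - n_m) / n_e

def kloss_torque(s, T_k, s_k):
    """Torque-slip characteristic in the assumed Kloss form:
    Tem = 2*T_k / (s/s_k + s_k/s).
    Super-synchronous speed gives negative slip and hence negative
    (generating) torque; for |s| >> s_k the torque varies roughly as 1/s."""
    return 2.0 * T_k / (s / s_k + s_k / s)

# Illustrative 4-pole, 60 Hz machine: synchronous speed n_e = 1800 r/min.
s_gen = slip(1800.0, 1836.0)                  # rotor 2% above synchronous speed
t_gen = kloss_torque(s_gen, T_k=100.0, s_k=0.2)
print(s_gen, t_gen)                           # negative slip, negative torque
print(kloss_torque(0.2, T_k=100.0, s_k=0.2))  # pull-out point: T = T_k
```

Driving the rotor 2% above synchronous speed gives s = −0.02 and a negative (generating) torque, which is exactly the slip-control principle discussed below.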
Figure 78.27 Modified equivalent circuits of induction machine, showing equivalent circuits and corresponding flux vector diagrams identifying the flux reference vector.

ψs = Ls·is + Lh·ir
ψr = Lr·ir + Lh·is
ψh = Lh·ih = Lh·is + Lh·ir = ψhs + ψhr
mechanical speed is usually defined in electrical degrees according to:

nm = p·nrotor   (78.159)

where p is the pole-pair number and nrotor is the rotor speed as measured by an observer in mechanical degrees. The stator resistance Rs of medium and large induction machines can usually be neglected because the designer strives to optimize the efficiency of the induction machine by packing as much copper in the stator windings as possible. Using this approximation together with the equivalent circuit of Figure 78.27 that eliminates the stator leakage inductance (a = Ls/Lh), the steady-state torque per phase of the AC induction machine can easily be calculated as a function of the supply voltage and the slip frequency, yielding:
Figure 78.28 Typical slip-torque characteristic of induction machine.
Induction Generator Control Principles

Basic Slip Control Principle. Whenever the slip is below the pull-out slip sk, the torque varies approximately linearly with slip. Hence, the induction machine behaves similarly to a DC generator with constant armature voltage and constant field excitation. As soon as the speed of the generator exceeds the no-load (synchronous) speed (i.e., negative slip), mechanical power is transformed into electrical power (i.e., negative torque). Conversely, the machine operates as a motor with positive torque when the speed is below the no-load speed (i.e., positive slip). Clearly, control of the slip frequency provides a direct means for controlling the AC induction generator's output torque and power. In applications where the electrical supply frequency is practically constant (e.g., utility systems), slip control can be realized by sensing the rotor speed and driving the generator shaft with the prime mover at the desired slip value with respect
to the measured stator excitation frequency. New induction generator systems use inverters connected to the machine's stator terminals to control the generator power, as illustrated in Figure 78.29. Inverters are power electronic devices that transform DC power to polyphase AC power or vice versa. Both the amplitude and frequency of the output waveforms delivered by the inverter are independently adjustable. In operation, the inverter provides the magnetization energy for the induction generator while the generator shaft is driven by a motor or some other type of prime mover. The AC-to-DC inverter converts the generator AC power to DC power, and this DC power can be transformed to AC power at fixed frequency (e.g., 50 or 60 Hz) using a second DC-to-AC inverter as shown in Figure 78.29. Both inverters are constructed as identical bidirectional units but are controlled in opposite power-flow modes.
Figure 78.29 Inverter-fed induction generator system.
Induction generator applications that have a wide speed range or operate under fast varying dynamic load conditions require variable-frequency control to allow stable operation. Indeed, typical rated slip of induction machines is below 2%. Hence, a 2% speed variation around the synchronous speed changes torque from zero to 100% of its rated value. This poses significant design challenges in applications such as wind turbines operating in the presence of strong wind gusts. The high stiffness of the induction generator's torque-speed characteristic makes it very difficult to implement a speed governor to control the pitch of the turbine blades that is sufficiently fast-acting and precise to adequately regulate the machine's slip within this narrow range. On the other hand, an inverter can rapidly adjust the generator's electrical excitation frequency and slip to provide stable system operation with constant output power under all operating conditions.
Field-Oriented Control Principles. The fundamental quantity that needs to be controlled in an induction generator is torque. Torque control of the inverter-fed induction machine is usually accomplished by means of field-oriented control principles to ensure stability. With field-oriented control, the torque and the flux of the induction generator are independently controlled in a similar manner to a DC generator with a separately excited field winding, as discussed earlier in this chapter. The principles of field orientation can best be explained by recognizing from the flux vector diagrams shown in Figure 78.27 that the rotor current vector ir is perpendicular to the rotor flux ψr. Furthermore, these vector diagrams illustrate that the stator current space vector is is composed of the rotor current ir and the magnetizing current ih. Note that the space vectors in the vector
diagrams are drawn in a reference frame that is rotating at the synchronous excitation frequency ωe. By aligning a dq coordinate system with the rotating rotor flux vector ψr, one can prove that under all conditions (including transient conditions) the torque of the induction machine per phase is given by [22]:
Positive torque signifies motoring operation, while a negative torque indicates generator operation. Hence, in a synchronous dq reference frame linked to the rotor flux (top diagram in Figure 78.27), the q-axis component of the stator current isq corresponds to the torque-producing stator current component, being equal in amplitude to the rotor current component irq, with the opposite sign. The d-axis component of the stator current isd equals the magnetizing current, corresponding to the rotor flux-producing current component. A control scheme that allows independent control of rotor flux and torque in the synchronously rotating rotor flux reference frame can now be derived, and the resulting control block diagram is shown in Figure 78.30. A Cartesian-to-polar coordinate transformation calculates the amplitude and the angle γrs of the stator current commands (in the synchronous reference frame) corresponding to the desired flux- and torque-producing components. (Controller command signals are marked with superscript * in Figure 78.30, with negative torque commands corresponding to generator operation.) It is very important to note that the torque and the flux commands are decoupled (i.e., independently controlled) using this field-oriented control scheme. The controller can be seen as an open-loop disturbance feedforward controller, the disturbance signal being the variation of the flux position γr.
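A minimal sketch of the decoupled command computation just described. The chapter's per-phase torque equation did not survive reproduction, so the code assumes the standard rotor-flux-oriented relations ψr = Lh·isd (steady state) and Tem = kt·(Lh/Lr)·ψr·isq, with kt a placeholder lumping phase and pole-pair constants; function names and numerical values are illustrative, not taken from the text.

```python
def foc_current_commands(psi_r_cmd, T_em_cmd, L_h, L_r, k_t=1.0):
    """Compute decoupled stator current commands in the rotor flux frame.
    Assumed relations (not the chapter's lost equation):
      psi_r = L_h * i_sd                       (steady-state rotor flux)
      T_em  = k_t * (L_h / L_r) * psi_r * i_sq (torque)
    Returns (i_sd*, i_sq*): flux- and torque-producing components."""
    i_sd = psi_r_cmd / L_h
    i_sq = T_em_cmd * L_r / (k_t * L_h * psi_r_cmd)
    return i_sd, i_sq

# A negative torque command corresponds to generator operation,
# so the torque-producing component i_sq* comes out negative.
i_sd, i_sq = foc_current_commands(psi_r_cmd=0.9, T_em_cmd=-50.0,
                                  L_h=0.060, L_r=0.063)
print(i_sd, i_sq)
```

Note how the two commands are computed independently: i_sd* depends only on the flux command and i_sq* only on the torque command (for a given flux), which is the decoupling property the text emphasizes.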
Figure 78.30 Field-oriented controller allowing independent flux and torque control. γr, angular position of rotor flux with respect to stationary reference; γs, angular position of stator current with respect to stationary reference; γrs, angle between stator current and rotor flux.

The C/P block in Figure 78.30 indicates a Cartesian-to-polar coordinate transformation according to the following generalized equations:

x = √(xd² + xq²),   α = arctan(xq/xd),

with xd the d-component or real component of the space vector x = xd + j·xq, xq the q-component or imaginary component of x, x the amplitude of the space vector x, and α the angular position of the space vector x.
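Written out in code, the C/P transformation is just a magnitude and a four-quadrant angle computation; `atan2` is used here (rather than a plain arctangent) because the stator current vector can lie in any quadrant:

```python
import math

def cartesian_to_polar(x_d, x_q):
    """C/P block of Figure 78.30 for a space vector x = x_d + j*x_q:
    amplitude x = sqrt(x_d**2 + x_q**2),
    angular position alpha = atan2(x_q, x_d) in radians."""
    return math.hypot(x_d, x_q), math.atan2(x_q, x_d)

amp, alpha = cartesian_to_polar(3.0, 4.0)
print(amp, math.degrees(alpha))   # 5.0 and about 53.13 degrees
```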
Direct vs. Indirect Field Orientation. To complete the controller calculation loops, one needs the rotor flux position γr to calculate the stator current vector position in the stationary reference frame that is linked to the stator of the machine. In other words, one needs to determine the orientation of the rotating field flux vector. It is for this reason that the control method illustrated in Figure 78.30 was called "field orientation". Two field orientation strategies have been derived to detect the rotor flux position. Direct field orientation methods use sensors to directly track the flux position. Hall sensors are seldom used because of the high temperature inside the induction machine. Typical flux sensors are flux coils (sensing induced voltage) followed by integrators. The latter give satisfactory results at frequencies above 5 to 10 Hz. The direct field orientation control block diagram can be completed as shown in Figure 78.31. However, control problems can arise because most flux sensors are positioned on the stator and not on the rotor. As a result, these sensors monitor the stator flux ψs in a stationary reference frame (marked with superscript "s") and not the rotor flux which is used in the decoupling network. This makes it necessary to use the flux linkage equations to derive the rotor flux from the flux sensor measurements. The required calculations introduce estimated machine parameters (leakage inductances) into the disturbance feedforward path [3], leading to detuning errors. Another approach is to decouple the machine equations in the stator flux reference frame in which the flux sensors are actually operating. This method requires a decoupling network of greater complexity but achieves high accuracy and zero detuning error under steady-state conditions [9], [19].
the slip angle γmr, as follows:
Figure 78.32 illustrates how a controller can be constructed to calculate the slip frequency command ωmr*, the slip angle command γmr*, and the rotor flux position command γr*. As Figure 78.32 shows, most indirect field-oriented controllers are constructed as open-loop disturbance feedforward controllers using the commanded current components instead of measured current quantities. This approach is justified because state-of-the-art, high-frequency, current-regulated inverters produce relatively precise current waveforms with respect to the current commands. Note that indirect field orientation depends on the rotor time constant Lr/Rr, which is composed of estimated machine parameters. As stated above, direct field orientation also needs machine parameters to calculate the flux vector position, and most direct flux sensors do not operate at low frequencies. To circumvent these problems, both methods can be combined using special field-oriented algorithms to ensure greater accuracy. Control Method Comparison. Field orientation is a more advanced control technique than the slip controller discussed above. Field orientation offers greater stability during fast-changing transients because it controls torque while maintaining constant flux. Hence, the trapped magnetic energy in the machine does not change when speed or torque variations occur. This decoupled control strategy allows field orientation to control an AC induction machine exactly the same as a separately excited DC machine with a series compensation armature winding (see the section on DC generators). The reader is invited to compare the vector diagram of Figure 78.8 in the DC generator section (illustrating the independent flux and torque control of a DC generator) and the space vector diagram of an induction generator illustrated in Figure 78.25 or Figure 78.27.
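The indirect field-orientation calculation of Figure 78.32 can be sketched as a discrete-time update. The slip-frequency equation itself was lost in reproduction, so the code assumes the commonly used steady-state relation ωmr* = isq*/(τr·isd*) with τr = Lr/Rr, consistent with the text's stated dependence on the rotor time constant; all numerical values are illustrative.

```python
def indirect_foc_step(i_sd_cmd, i_sq_cmd, L_r, R_r, gamma_m, gamma_mr, dt):
    """One update of the indirect field-orientation calculation (sketch).
    Assumed slip relation (the chapter's own equation was lost):
        w_mr = i_sq* / (tau_r * i_sd*),  tau_r = L_r / R_r
    gamma_m : rotor position from the shaft encoder (rad)
    gamma_mr: slip angle accumulated by integrating w_mr (rad)
    Returns (w_mr, updated gamma_mr, flux position gamma_r)."""
    tau_r = L_r / R_r
    w_mr = i_sq_cmd / (tau_r * i_sd_cmd)  # slip frequency command (rad/s)
    gamma_mr += w_mr * dt                 # slip angle by integration
    gamma_r = gamma_m + gamma_mr          # flux position = rotor pos. + slip angle
    return w_mr, gamma_mr, gamma_r

# Generator operation: a negative i_sq* gives a negative slip frequency.
w_mr, g_mr, g_r = indirect_foc_step(i_sd_cmd=15.0, i_sq_cmd=-58.0,
                                    L_r=0.063, R_r=0.21, gamma_m=0.5,
                                    gamma_mr=0.0, dt=1e-4)
print(w_mr, g_r)
```

The update uses commanded currents rather than measured ones, mirroring the open-loop feedforward structure described above; an error in the estimated τr directly detunes the computed flux angle.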
The rotor flux vector ψhr of the induction machine corresponds to the armature flux Ψa in the DC machine, while the d-component of the stator magnetizing flux ψhsd corresponds to the field winding flux Ψf. The compensation winding flux Ψc of the DC machine relates to the q-component of the stator magnetizing flux vector ψhsq of the AC machine. While the spatial orientation of the flux vectors in the DC machine is fixed, the space vectors of the induction machine rotate at synchronous speed. As a result, independent control of the individual space vectors in the AC machine can only be achieved by controlling the amplitude and the phase angle of the stator flux (i.e., stator voltage and current vectors). Another approach to understanding the difference between a field-oriented controller and a slip controller is to consider the fact that field-oriented controllers control the flux slip angle and the stator current vector angle, while slip controllers only regulate the frequency of these vectors. Controlling the angle of space vectors in electrical systems is analogous to position control
Figure 78.31 Direct field orientation method.
The second category of field orientation methods is called indirect field orientation because these methods derive the flux position using a calculated or estimated value of the angle γmr between the flux and the rotor position. This angle is nothing other than the rotor flux "slip" angle, which varies at the slip frequency ωmr. The rotor position γm is measured using a shaft position sensor, while the flux slip angle is derived from the slip frequency by integration. The dynamic field-oriented system equations of the induction machine are used to derive the slip frequency ωmr and
Figure 78.32 Indirect field orientation method.
in mechanical systems, while frequency control corresponds to speed control. It is immediately recognized that position control always offers greater stiffness than speed control. Hence, field orientation can offer the same stiffness improvements compared to slip-frequency controllers during transient conditions. One disadvantage of field-oriented control is that it requires considerably more computation power than the simpler slip control algorithms. Many field-oriented controllers are implemented using digital signal processors (DSPs). Consequently, increased use of field orientation for induction generators will depend on future trends in the cost of digital controllers and sensors as well as the development of new control algorithms that decrease the controller's field installation time (e.g., machine parameter autotuning) while optimizing generator efficiency.
78.3.5 Concluding Remarks

This chapter has attempted to provide a concise overview of the major classes of electrical generators and their associated control principles. Despite the notable differences between the three types of electrical machines reviewed (DC, synchronous, and induction), there are some important underlying control aspects which they share in common. These include the single-input/single-output nature of the basic regulating control problem in each case, with dynamic response typically dominated by a long magnetic flux (i.e., field) time constant. This long time constant is responsible for the slow dynamic response which characterizes the majority of generator regulating systems now in the field. As pointed out in the chapter, the introduction of power electronics provides access to generator control variables which can circumvent the limitations imposed by the flux time constant, leading to significant improvements in the regulator's dynamic response and other performance characteristics. Such advances have already had a significant impact in many applications, and work is continuing in many locations to extend these techniques to achieve further improvements in generator control performance and economics.
References

[1] Barnes, H.C., Oliver, J.A., Rubenstein, A.S., and Temoshok, M., Alternator-rectifier exciter for Cardinal plant, IEEE Trans. Power Appar. Syst., 87, 1189-1198, 1968.
[2] Berdy, J., Crenshaw, M.L., and Temoshok, M., Protection of large steam turbine generators during abnormal operating conditions, Proc. CIGRE Int. Conf. on Large High Tension Electric Systems, Paper No. 11-05, 1972.
[3] Blaschke, F. and Bayer, K.H., Die Stabilität der feldorientierten Regelung von Asynchron-Maschinen, Siemens Forsch. Entwick. Ber., 7(2), 77-81, 1978.
[4] Bobo, P.O., Carleton, J.T., and Horton, W.F., A new regulator and excitation system, AIEE Trans. Power Appar. Syst., 72, 175-183, 1953.
[5] Bollinger, K.E. (Coordinator), Power System Stabilization via Excitation Control, IEEE Tutorial Course Notes, Pub. No. 81 EHO 175-0 PWR, IEEE Press, New York, 1981.
[6] Brown, D. and Hamilton, E.P., Electromechanical Energy Conversion, Macmillan, New York, 1984.
[7] Concordia, C., Synchronous Machines, John Wiley & Sons, New York, 1951.
[8] Daumas, M., Ed., Histoire Générale des Techniques, Vol. 3, 1978, pp. 330-335.
[9] De Doncker, R.W. and Novotny, D.W., The universal field oriented controller, IEEE Trans. Ind. Appl., 30(1), 92-100, 1994.
[10] Fitzgerald, A.E., Kingsley, C., and Umans, S.D., Electric Machinery, McGraw-Hill, New York, 1983.
[11] IEEE Committee Report, Computer representation of excitation systems, IEEE Trans. Power Appar. Syst., 87, 1460-1464, 1968.
[12] IEEE Standard 421-1972, Criteria and Definitions for Excitation Systems for Synchronous Machines, Institute of Electrical and Electronics Engineers, New York.
[13] IEEE Committee Report, Excitation system models for power system stability studies, IEEE Trans. Power Appar. Syst., 100, 494-507, 1981.
[14] Kassakian, J.G., Schlecht, M.F., and Verghese, G.C., Principles of Power Electronics, Addison-Wesley, Reading, MA, 1991.
[15] Krause, P.C., Wasynczuk, O., and Sudhoff, S.D., Analysis of Electric Machines, IEEE Press, New York, NY, 1995.
[16] Larsen, E.V. and Swann, D.A., Applying Power System Stabilizers, Part II: Performance Objectives and Tuning Concepts, Paper 80 SM 559-5, presented at IEEE PES Summer Meeting, Minneapolis, 1980.
[17] McClymont, K.R., Manchur, G., Ross, R.J., and Wilson, R.J., Experience with high-speed rectifier excitation systems, IEEE Trans. Power Appar. Syst., 87, 1464-1470, 1968.
[18] Novotny, D., Gritter, D., and Studtman, G., Self-excitation in inverter driven induction machines, IEEE Trans. Power Appar. Syst., 96(4), 1117-1183, 1977.
[19] Profumo, F., Griva, G., Pastorelli, M., Moreira, J., and De Doncker, R., Universal field oriented controller based on air gap sensing via third harmonic stator voltage, IEEE Trans. Ind. Appl., 30(2), 448-455, 1994.
[20] Rubenstein, A.S. and Walkey, W.W., Control of reactive kVA with modern amplidyne voltage regulators, AIEE Trans. Power Appar. Syst., 76, 961-970, 1957.
[21] Sarma, M., Synchronous Machines (Their Theory, Stability, and Excitation Systems), Gordon and Breach, New York, NY, 1979.
[22] Say, M.G., Alternating Current Machines, 5th ed., Pitman, Bath, U.K., 1983.
[23] Sen, P., Thyristor DC Drives, John Wiley & Sons, New York, 1981.
[24] Slemon, G.R. and Straughen, A., Electric Machines, Addison-Wesley, Reading, MA, 1980.
[25] Tissandier, G., Causeries Sur La Science, Librairie Hachette et Cie, Paris, 1890.
[26] Whitney, E.C., Hoover, D.B., and Bobo, P.O., An electric utility brushless excitation system, AIEE Trans. Power Appar. Syst., 78, 1821-1824, 1959.
[27] Wildi, T., Electrical Machines, Drives, and Power Systems, Prentice Hall, Englewood Cliffs, NJ, 1991.
Control of Electrical Power
Harry G. Kwatny, Drexel University
Claudio Maffezzoni, Politecnico di Milano
John J. Paserba, Juan J. Sanchez-Gasca, and Einar V. Larsen, GE Power Systems Engineering, Schenectady, NY
79.1 Control of Electric Power Generating Plants .......... 1453
    Introduction • Overview of a Power Plant and its Control Systems • Power Plant Modeling and Dynamical Behavior • Control Systems: Basic Architectures • Design of a Drum Level Controller
    References .......... 1481
    Further Reading .......... 1482
79.2 Control of Power Transmission .......... 1483
    Introduction • Impact of Generator Excitation Control on the Transmission System • Power System Stabilizer (PSS) Modulation on Generator Excitation Systems • Practical Issues for Supplemental Damping Controls Applied to Power Transmission Equipment • Examples • Recent Developments in Control Design • Summary
    References .......... 1494
79.1 Control of Electric Power Generating Plants

Harry G. Kwatny, Drexel University
Claudio Maffezzoni, Politecnico di Milano
79.1.1 Introduction

This chapter provides an overview of the dynamics and control of electric power generating plants. The main goals are to characterize the essential plant physics and dynamical behavior, summarize the principal objectives of power plant control, and describe the major control structures in current use. Because of space limitations the discussion will be limited to fossil-fueled, drum-type steam generating plants. Much of it, however, is also relevant to once-through and nuclear powered plants. The presentation is organized into four major sections. Section 79.1.2 provides a description of a typical plant configuration, explains in some detail the specific objectives of plant control, and describes the overall control system architecture. The control system is organized in a hierarchy, based on time scale separation, in which the highest level establishes set points for lower level regulators so as to meet the overall unit operating objectives. Section 79.1.3 develops somewhat coarse linear models which qualitatively portray the small signal process behavior, characterize the essential interactions among process variables, and can be used to explain and justify the traditional regulator architectures. They are also useful for obtaining initial estimates of control system parameters which can then be fine-tuned using more detailed, nonlinear simulations of the plant.

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
The configurations commonly used in modern power plants for the main process variables are described in Section 79.1.4. These include controllers for pressure and generation, evaporator (drum level), temperature, and combustion control. The discussion in Section 79.1.4 is mainly qualitative, based on the understanding of plant behavior developed in Section 79.1.3.
Once a control configuration is chosen, the various compensator design parameters are established by applying analytical control design methods combined with extensive simulation studies. Because space is limited, it is not possible to provide such an analysis for each of the plant subsystems. However, in Section 79.1.5 we do so for the drum level regulator. Drum level control is chosen for illustration because it is particularly important to plant operation and because it highlights the difficulties associated with low load plant dynamics and control. There are many important and outstanding issues regarding automation at low load steam generation levels. In practice, most plants require considerable manual intervention when maneuvering at low load. The most important concerns relate to the evaporation process (the circulation loop) and to the combustion process (furnace). Section 79.1.5 revisits the circulation loop, examines the behavioral changes that take place as generation level is reduced, and explains the consequences for control.
79.1.2 Overview of a Power Plant and its Control Systems

Overall Plant Structure. A typical power plant using fossil fuel as its energy source is organized into three main subsystems, corresponding to the three basic energy conversions taking place in the process: the steam generator (SG) (or boiler), the turbine (TU) integrated with the feed-water heater train, and the electric generator (EG) (or alternator). The SG converts the chemical energy available in the fuel (either oil, natural gas, or coal) into internal energy of the working fluid (the steam). The TU transforms the internal energy of steam flowing from the SG into mechanical power and makes it available at the shaft for the final conversion into electrical power in the EG. The interactions among the principal subsystems are sketched in Figure 79.1, where only the mass and energy flows at the subsystems' boundaries are displayed. The overall process can be described as follows: the feed-water coming from the feed-water heater train enters the SG where, due to the heat released by fuel combustion, ShS is generated and admitted into the HPT through a system of control valves (TV-hp). Here, the steam expands down to the reheat pressure, transferring power to the HPT shaft, and is discharged into a steam reheater (part of the SG) which again superheats the steam (RhS). RhS is admitted into the RhT through the control valve TV-rh, normally working fully open; the steam expands successively in the RhT and LPT down to the condenser pressure, releasing the rest of the available power to the turbine shaft. Condensed water is extracted from the condenser and fed to the low-pressure feed-water heaters, where the feed-water is preheated using the steam extractions from the RhT and LPT. Then the pressure is increased to its highest value by the FwP and the feed-water gets its final preheating in the high-pressure feed-water heaters using steam extractions from the HPT and RhT.
The mechanical power released by the entire compound turbine is transferred to the EG, which converts that power into electrical power delivered to the grid via a three-phase line. The control objectives in such a complex process can be synthesized as follows: transferring to the grid the demanded electrical power Pe with the maximum efficiency, with the minimum risk of plant trip, and with the minimum consumption of equipment life. As is usual in process control, such a global objective is transformed into a set of simpler control tasks, based on two principal criteria: (1) the outstanding role of certain process variables in characterizing the process efficiency and the operating constraints; (2) the weakness of a number of process interactions, which permits the decomposition of the overall process into subprocesses. Referring to Figure 79.1, we observe that the EG, under normal operating conditions, is connected to the grid and is consequently forced to run at synchronous speed. Under those conditions it acts as a mechanical-electrical power converter with almost negligible dynamics. So, neglecting high frequency, we may assume that the EG merely implies Pe = Pm (where Pm is the mechanical power delivered from the turbine). Of course, the EG is equipped
with its own control, namely voltage control, which has totally negligible interactions with the control of the rest of the system (it works at a much higher bandwidth). Moreover, the turbines have very little storage capacity, so that, neglecting high-frequency effects, turbines may be described by their steady-state equations:
P_HP = αT·wT·(hT − htR)   (79.1)
P_LP = αR·wR·(hR − ho)   (79.2)
Pm = P_HP + P_LP   (79.3)

where P_HP and P_LP are the mechanical power released by the HPT and by the RhT and LPT, respectively; wT is the ShS mass flow-rate, hT the corresponding enthalpy, and htR the steam enthalpy at the HPT discharge; wR is the RhS mass flow-rate, hR the corresponding enthalpy, and ho the fluid enthalpy at the LPT discharge; and αT, αR are suitable constants (≤ 1) accounting for the steam extractions from the HPT and from the RhT and LPT, respectively. With the aim of capturing the fundamental process dynamics, one may observe that the enthalpy drops (hT − htR) and (hR − ho) remain approximately unchanged as the plant load varies, because turbines are designed to work with constant pressure ratios across their stages, while the steam flow varies. This means that the output power Pm consists of two contributions, P_HP and P_LP, which are approximately proportional to the ShS flow and to the RhS flow, respectively. In turn, the flows wT and wR are determined by the state of the SG (i.e., pressures and temperatures) and by the hydraulic resistances that the turbines (together with their control valves) present at the SG boundaries. Steam extractions (see Figure 79.1) mutually influence subsystems SG and TU: any variation in the principal steam flow wT creates a variation in SE flow and, consequently, a change in the feed-water temperature at the inlet of the SG. Feed-water mass flow-rate, on the contrary, is essentially imposed by the FwP, which is generally equipped with a flow control system that makes the FwP act as a "flow generator". Fortunately, the overall gain of the process loop due to the steam extractions is rather small, so that the feed-water temperature variations may be considered a small disturbance for the SG, which is, ultimately, the subprocess where the fundamental dynamics take place.
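The steady-state turbine description above can be sketched directly in code. The equation forms below follow the text (powers proportional to steam flows times roughly fixed enthalpy drops, with extraction constants αT, αR ≤ 1); the particular α values and operating numbers are illustrative assumptions, not plant data.

```python
def turbine_power(w_T, h_T, h_tR, w_R, h_R, h_o, alpha_T=0.97, alpha_R=0.95):
    """Steady-state mechanical power of the compound turbine:
      P_HP = alpha_T * w_T * (h_T - h_tR)   high-pressure turbine
      P_LP = alpha_R * w_R * (h_R - h_o)    reheat + low-pressure turbines
      P_m  = P_HP + P_LP
    Flows in kg/s and enthalpies in kJ/kg give power in kW.
    alpha_T, alpha_R (<= 1) account for steam extractions; the default
    values here are illustrative assumptions."""
    P_HP = alpha_T * w_T * (h_T - h_tR)
    P_LP = alpha_R * w_R * (h_R - h_o)
    return P_HP, P_LP, P_HP + P_LP

# Rough full-load numbers for a large drum-type unit (illustrative only):
P_HP, P_LP, P_m = turbine_power(w_T=300.0, h_T=3400.0, h_tR=3000.0,
                                w_R=280.0, h_R=3550.0, h_o=2350.0)
print(P_m / 1e3)   # total mechanical power in MW
```

Because the enthalpy drops stay roughly constant over the load range, varying the steam flows w_T and w_R in this model scales P_HP and P_LP almost proportionally, which is the load-dependence the text emphasizes.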
In conclusion, power plant control may be studied as a function of steam generator dynamics, with the turbine flow characteristics acting as boundary conditions at the steam side, the feed-water mass flow-rate and the feed-water temperature acting as exogenous variables, and Equations 79.1, 79.2, and 79.3 determining the power output. To understand the process dynamics, it is necessary to analyze the internal structure of the SG. In the following, we will make reference to a typical drum boiler [1]; once-through boilers are not considered for brevity.
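Equations 79.1-79.3 are not reproduced in this excerpt, but the power balance they express can be sketched numerically. The following is an illustrative sketch only, not the handbook's code: the function, its parameter values, and the enthalpy figures are assumed, typical-order numbers for a drum-boiler unit.

```python
# Illustrative sketch: mechanical power from the HPT and from the RhT+LPT,
# following the steady-state turbine equations. All numbers are assumed,
# typical-order values, not taken from the handbook.

def turbine_power(w_T, h_T, h_tR, w_R, h_R, h_O, alpha_T=0.95, alpha_R=0.92):
    """Return (P_HP, P_LP, P_m) in MW.

    w_T, w_R : ShS and RhS mass flow-rates [kg/s]
    h_*      : enthalpies [kJ/kg]
    alpha_T, alpha_R (<= 1) account for the steam extractions.
    """
    P_HP = alpha_T * w_T * (h_T - h_tR) / 1e3   # MW
    P_LP = alpha_R * w_R * (h_R - h_O) / 1e3    # MW
    return P_HP, P_LP, P_HP + P_LP

# Assumed operating point: 300 kg/s ShS at 3400 kJ/kg, HPT discharge at
# 3000 kJ/kg; 270 kg/s RhS at 3550 kJ/kg, LPT discharge at 2350 kJ/kg.
P_HP, P_LP, P_m = turbine_power(300.0, 3400.0, 3000.0, 270.0, 3550.0, 2350.0)
```

Because the enthalpy drops stay approximately constant over the load range, P_HP and P_LP in this sketch scale linearly with w_T and w_R, which is exactly the proportionality argued in the text.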
Typical Structure of a Steam Generator  Figure 79.2 depicts the typical scheme of a fossil-fueled steam generator and its principal components. The air-gas subsystem is clearly recognizable; the combustion air is sucked in by the fan (1) and conveyed
79.1. CONTROL OF ELECTRIC POWER GENERATING PLANTS

Figure 79.1 Subsystems interaction. RhS = Reheated steam; StR = Steam to reheat; ShS = Superheated steam; HPT = High-pressure turbine; RhT = Reheat turbine; LPT = Low-pressure turbine; se = Steam extraction; ExP = Extraction pump; FwP = Feed-water pump; TV-hp = Turbine valve, high pressure; TV-rh = Turbine valve, reheated steam.
through the air heaters (2) (using auxiliary steam) and (3) (exchanging heat counter-flow with the flue gas leaving the furnace backpass) to the furnace wind box (4), where air is distributed to the burners, normally arranged in rows. Fuel and air, mixed at the burner nozzles, produce hot combustion gas in the furnace (5), where heat is released, principally by radiation, from the gas (and the luminous flame) to the furnace walls, usually made of evaporating tubes. The hot gas releases almost 50% of its available heat within the furnace and leaves it at high temperature; the rest of the internal energy of the hot gas is transferred to the steam through a cascade of heat exchangers in the backpass of the furnace ((6) and (9) superheat the steam to high pressure, while (7) and (8) reheat the steam) and, at the end of the backpass, to the feed-water in the economizer (10). The gas is finally used in a special air heater (3) (called Ljungstroem) to capture the residual available energy. The flue gas is conveyed to the stack (12), possibly through induced-draft fans (11), which are employed with coal-fired furnaces to keep the furnace pressure slightly below the atmospheric pressure. The heat exchangers making up the furnace walls and the various banks arranged along the flue-gas path are connected on the steam side to generate superheated steam; this can be split into four subprocesses: water preheating, boiling, superheating, and reheating. The flow diagram of the water-steam subsystems is shown in Figure 79.3, where common components are labeled with the same numbers as in Figure 79.2. In the scheme of Figure 79.3, the evaporator is of the natural-circulation type (also called drum boiler). It consists of the risers, the tubes forming the furnace walls where boiling takes place, and the steam separator (drum), where the steam-water mixture from
the risers is separated into dry steam (flowing to superheating) and saturated water which, after mixing with feed-water, feeds the downcomers. There are two special devices (called spray desuperheaters), one in the high-pressure section and the other in the reheat section, which regulate the superheated and reheated steam temperatures. Process dynamics in a steam boiler is determined by the energy stored in the different sections of the steam-water system, especially in the working fluid and in the tube walls containing the fluid. Storage of energy in the combustion gas is practically negligible, because hot gas has a very low density. For those reasons, it is natural to focus on the steam-water subsystem, except for those special control issues where (fast) combustion dynamics are directly involved.
Control Objectives  A generation unit of the type described in Figures 79.1, 79.2, and 79.3 is asked to supply a certain power output P_m, that is (see Equations 79.1-79.3), certain steam flows to the turbines, while insuring that the process variables determining process efficiency and plant integrity are optimal. Because efficiency increases as the pressure and the temperature at the turbine inlet (i.e., at the throttle) increase, whereas stress on machinery goes in the opposite direction, the best trade-off between steam cycle efficiency and plant life results in prescribing certain values to the throttle pressure p_T and temperature T_T and to the reheat temperature T_R (reheat pressure is not specified because there is no throttling along the reheating section under normal conditions). Moreover, proper operation of the evaporation section requires correct steam separator conditions, meaning a specified
THE CONTROL HANDBOOK
Figure 79.2 Typical scheme of the steam generator. (1) Air fan; (2) Auxiliary air heater; (3) Principal air heater; (4) Wind box; (5) Furnace (with burners); (6) High-temperature superheater; (7) High-temperature part of the reheater; (8) Low-temperature part of the reheater; (9) Low-temperature superheater; (10) Economizer; (11) Flue-gas fan (present only with balanced-draft furnace); (12) Stack.
Figure 79.3 Steam-water subsystem.
water level y_D in the drum. Overall efficiency is substantially affected by combustion quality and by the waste of energy in the flue gas. Because operating conditions also have environmental impact, they are controlled by properly selecting the air-to-fuel ratio, which depends on the condition of the firing equipment (burners, etc.). In coal-fired units, the furnace needs to be operated slightly below atmospheric pressure to minimize soot dispersion to the environment. Furnace pressure requires careful control, integrated with combustion control. Early control systems gave a static interpretation of the above requirements, because the process variables and control values were set at the design stage based on the expected behavior. More recent control systems allow some adaptation to the off-design conditions experienced in operation. This produces a hierarchically structured control system, whose general organization is shown in Figure 79.4. In the scheme of Figure 79.4 there are three main control levels:

- The unit control level, where the overall unit objective in meeting the power system demand is transformed into more specific control tasks, accounting for the actual plant status (partial unavailability of components, equipment stress, operating criteria); the decomposition into control subtasks is generally achieved by computation of set points for the main process variables.
- The principal regulation level, where the main process variables are controlled by a proper combination of feedforward (model-based) and feedback actions. Decoupling of the overall control into independent controllers is based on the special nature of the process.
- The dependent loop level, where the physical devices allowing the modulation of the basic process variables are controlled in a substantially independent manner, with a control bandwidth much wider than the upper-level regulation.
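The three-level hierarchy can be sketched as a chain of computations, each level consuming the outputs of the level above. The sketch below is a deliberately schematic illustration; every function name, dictionary key, gain, and numeric value is an assumption introduced here, not taken from the handbook.

```python
# Schematic sketch of the three control levels (all names and numbers are
# illustrative assumptions). Quantities are in per-unit (p.u.).

def unit_control_level(load_demand, plant_limits):
    """Transform the power-system demand into set points for the main variables,
    accounting for plant status (here: only a maximum-load limit)."""
    load = min(load_demand, plant_limits["max_load"])
    return {"throttle_pressure_sp": 1.0,   # fixed-pressure operation
            "throttle_temp_sp": 1.0,
            "load_sp": load}

def principal_regulation_level(set_points, measurements):
    """Combine a model-based feedforward with a feedback correction."""
    ff = set_points["load_sp"]                                   # feedforward
    fb = 0.5 * (set_points["throttle_pressure_sp"] - measurements["p_T"])
    return {"fuel_cmd": ff + fb,
            "turbine_admittance_cmd": set_points["load_sp"]}

def dependent_loop_level(commands):
    """Wide-band local loops drive the physical actuators."""
    return {"fuel_valve": commands["fuel_cmd"],
            "turbine_valve": commands["turbine_admittance_cmd"]}

sp = unit_control_level(0.8, {"max_load": 1.0})
cmds = principal_regulation_level(sp, {"p_T": 0.98})
act = dependent_loop_level(cmds)
```

The point of the structure is the one made in the text: the unit level only computes set points, while the lower levels act independently, each with a wider bandwidth than the level above.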
These loops are the means by which the principal regulations may be conceived and designed to control process variables (like feed-water flow) rather than acting on positioning devices affected by sensitive nonlinearities (e.g., the Voigt speed control of the feed-water pump). The tendency to decentralize control actions is quite common in process control and should be adopted generally to allow system operability. To avoid conflict with overall unit optimization, the most recent control systems have extended the role of the unit coordinator, which does not interfere with the individual functionality of the lower loops, but acts as a set-point computer finding the optimal solution within the operation allowed by plant constraints. When assessing control objectives, one of the crucial problems is to define system performance. For power plants, one needs to clarify the following:
- the kinds of services the unit is required to perform, usually defined in terms of the maximal rate for large ramp load variations, the participation band for the power-frequency control of the power system, and the maximum amplitude and response time for the primary speed regulation in case of contingencies;
- the maximal amplitude of temperature fluctuations during load variations, to limit equipment stress due to creep or fatigue;
- the maximal transient deviation of throttle pressure and drum level, to avoid potentially dangerous conditions, evaluated for the largest disturbances (e.g., in case of load rejection).

There are a few control issues still debated. The first is whether it is more convenient to operate the unit at fixed pressure (nominal constant pressure at the throttle), to let the pressure slide with a fully open turbine control valve, or to operate the unit with controlled sliding pressure. The second is the question of how much pressure variation during plant transients should be considered deleterious to some aspect of plant performance. The two questions are connected, because adopting a pure sliding-pressure control strategy contradicts the concept of pressure control. Consider the first issue. When the turbine load (i.e., the steam flow) is reduced, the steam pressures at the different turbine stages are approximately proportional to the flow. Therefore, it is natural to operate the turbine at sliding pressure. On the other hand, the pressure in the steam generator is the index of the energy stored in the evaporator. Because drum boilers have a very large energy capacitance in the evaporator, boiler pressure is very difficult to change. Therefore, sliding pressure in the boiler makes load variation slow (not the case for once-through boilers). The best condition would be to keep the pressure fixed in the boiler while sliding the pressure in the turbine: this strategy, however, would require significant throttling on the control valves, with dramatic loss of efficiency in the overall steam cycle.
The most popular strategy for drum boiler power plants is, therefore, controlled sliding pressure, where the boiler is operated at constant pressure above a certain load (this may be either the technical minimum or 50-60% of the MCR¹) and the pressure is reduced at lower loads. To insure high efficiency at any load, the turbine is equipped with a control stage allowing partial-arc admission (i.e., with the control valves always opening in sequence to limit the amount of throttling). The second issue is often the source of misleading design. There is no evidence that loss of performance is directly related to transient pressure deviations within safety limits, which may be very large. On the other hand, it has been demonstrated [2] that, because throttle pressure control can cause furnace overfiring, too strict a pressure control can substantially disturb steam temperature. In the following, we will consider the most common operating condition for a drum boiler, that is, with throttle pressure controlled at a constant value during load variation, with the main objective of returning the pressure to the nominal value within a reasonable time after the disturbance (e.g., a load variation), while strictly insuring that it remains within safety limits (which may also depend on the amplitude of the disturbance).

¹MCR = Maximum Continuous Rate

Figure 79.4 Hierarchical organization of power plant control.
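A controlled sliding-pressure schedule of the kind described above can be sketched as a simple set-point map. The switch load, the proportional sliding law, and the minimum-pressure floor below are illustrative assumptions, not figures from the handbook.

```python
# Sketch of a controlled sliding-pressure set-point schedule (illustrative
# numbers; load_switch and p_min are assumptions).

def pressure_setpoint(load, p_nom=1.0, load_switch=0.6, p_min=0.4):
    """Throttle pressure set point (p.u.) as a function of unit load (p.u.).

    Above load_switch the boiler is operated at constant nominal pressure;
    below it, the set point slides down proportionally, with a floor p_min.
    """
    if load >= load_switch:
        return p_nom
    return max(p_min, p_nom * load / load_switch)

high_load_sp = pressure_setpoint(0.8)   # constant-pressure region
mid_load_sp = pressure_setpoint(0.3)    # sliding region
low_load_sp = pressure_setpoint(0.1)    # clamped at the floor
```

The shape (flat at high load, sliding at low load) is what distinguishes this strategy from both pure fixed-pressure and pure sliding-pressure operation.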
79.1.3 Power Plant Modeling and Dynamical Behavior

Models for Structural Analysis  Investigating the dynamics of power plants [3] requires detailed models with precise representation of the plant components. These models are generally used to build plant simulators, from which control strategies may be assessed. Large-scale models are generally based on first-principle equations (mass, momentum, and energy balances), combined with empirical correlations (like heat transfer relations), and may be considered as knowledge models, i.e., models through which process dynamics can be thoroughly ascertained and understood. Knowledge models are the only reliable way, besides extensive experimentation, to learn
about power plant dynamics, in particular, the many interactions among process variables and their relevance. Today power plant simulators are broadly accepted: overall control system testing and tuning has been carried out successfully with a real-time simulator of a new and novel generating unit design [4]. These detailed models are built by considering many "small" fluid or metal volumes containing matter in homogeneous conditions and writing balance equations for each volume. The resulting system can include from hundreds up to some thousands of equations. A different kind of model has proved very helpful in establishing and justifying the basic structure of power plant control systems. Only first-cut dynamics are captured, revealing the essential input-output interactions. These models, called interpretation models, are based on extremely coarse lumping of the mass and energy balances, whose selection is guided by previous knowledge of the fundamental process dynamics. Interpretation models are credible because they have been derived from and compared with knowledge models, and should be considered as useful tutorial tools to explain fundamental dynamics. Because the scope of this presentation is modeling to support control analysis, only simple interpretation models will be developed. However, these simple models are not useful for dynamic performance evaluation of control systems, because the dynamics they account for are only first-order approximations. Nevertheless, they account for gross process interactions, into which they give good qualitative insight. We may start developing the process model by referring to the considerations on the overall plant features presented in Section 79.1.2 and summarized in Figure 79.5. According to Section 79.1.2, the effect of the feed-water cycle on the main steam-water subsystem (SWS) variables is accounted for by including the feed-water temperature (or enthalpy) as a disturbance among the inputs of the SWS. There are some drastic simplifications in the scheme of Figure 79.5. Thermal energy released from the hot gas to the SWS walls is not totally independent of the SWS state (i.e., of wall temperatures); there is almost full independence for Q_EV (because heat is transferred by radiation from the hot combustion gas); Q_SH and Q_RH are more sensitive to the wall temperature because the temperature of the combustion gas is progressively decreasing; even more sensitive is Q_ECO, where the gas temperature is quite low. However, because the economizer definitely plays a secondary role in boiler dynamics, the scheme of Figure 79.5 is substantially correct. The dominant inputs affecting the thermal energy transferred to the SWS are the fuel and air flows and the other inputs to the combustion system.
In Section 79.1.2, it was also noted that the SWS dynamics (due to massive storages of mass and energy) are far slower than the combustion and air-gas (C&AG) dynamics; for that reason, C&AG dynamics are negligible when the response of the main SWS variables is considered. C&AG dynamics are only relevant for specific control and dynamic problems regarding either combustion stability (relevant at low load in coal-fired furnaces) or the control of the furnace pressure p_g (of great importance to protect the furnace from implosion in case of fuel trip). Then, in most control problems, we may consider the C&AG system together with its feeding system as a nondynamic process segment whose crucial role is determining the energy release (and energy release partition) to the different sections of the SWS. In this regard, it is important to identify how the C&AG inputs may be used to influence steam generation. Increasing the total fuel flow into the furnace will simultaneously increase all heat inputs to the different boiler sections; air flow is varied in strict relation to fuel flow so as to insure the "optimal" air-to-fuel ratio for combustion. Because of the nonlinearity of the heat transfer phenomena, Q_ECO, Q_EV, Q_SH, and Q_RH do not vary in proportion to the fuel input, i.e., while varying the total heat input to the boiler, the partition of the heat release is also changed. For instance, when the fuel input is increased, the heat input to the evaporator Q_EV (that is released in the furnace) increases less than the other heat inputs (Q_ECO, Q_SH, and Q_RH), which are released in the upper section and backpass of the furnace. Thus, while raising the steam generation (roughly proportional to Q_EV), steam superheating and reheating would generally increase if proper corrective measures were not applied to rebalance the heat distribution. Those measures
are usually viewed as temperature control devices, because they allow superheating and reheating control in off-design conditions. One type of control measure acts on the C&AG system: the most popular approaches are 1) the recirculation of combustion gas from the backpass outlet to the furnace bottom, 2) tilting burners for tangentially fired furnaces, and 3) partitioning of the backpass by a suitable screen, equipped with gas dampers to control the gas flow partition between the two branches. The first two approaches influence the ratio between Q_EV and the rest of the heat release. The last varies the ratio between Q_SH and Q_RH. The second type of temperature control measure acts on the SWS. This is the spray desuperheater (see Figure 79.3), which balances heat input variations by injecting spray water into the superheating and the reheating path. Although superheater spray does not affect the global efficiency, reheater spray worsens efficiency, so that it is only used for emergency control (when it is not possible to keep the reheater temperature below a limit value by other control measures). Typical drum boiler power plants provide desuperheating sprays in the high-pressure and reheat parts of the SWS and, in addition, gas recirculation as a "normal" means of controlling reheat temperature. From this discussion, it should also be clear that modulation of the heat input to the furnace and variation of the recirculation gas flow simultaneously affect all process variables, because they influence the heat release to all sections of the SWS. This is the principal source of interactions in the process. Because the air-to-fuel ratio is varied within a very narrow range to optimize combustion, we may assume that the boiler heat rate is proportional to w_f. Thus, to analyze the SWS dynamics, we may consider w_f and the recirculation gas mass flow-rate w_rg as the equivalent inputs from the gas side, because they determine the heat transfer rates Q_ECO, Q_EV, Q_SH, and Q_RH.
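The qualitative fact above (radiant furnace duty growing less than proportionally with fuel flow, convective duties growing more) can be sketched with simple power laws. The exponents below are assumptions chosen only to reproduce that qualitative behavior; they are not handbook data.

```python
# Illustrative sketch of the heat-release partition vs. fuel flow, in p.u.
# of each duty's own nominal value. The exponents are assumed values chosen
# to reproduce the qualitative statement in the text: Q_EV (furnace
# radiation) is sub-proportional in fuel flow, the convective duties
# Q_ECO, Q_SH, Q_RH are super-proportional.

def heat_release(w_f_pu):
    """Return (Q_EV, Q_ECO, Q_SH, Q_RH) in p.u. for a fuel flow w_f (p.u.)."""
    Q_EV = w_f_pu ** 0.8    # radiant section
    Q_ECO = w_f_pu ** 1.1   # convective banks
    Q_SH = w_f_pu ** 1.1
    Q_RH = w_f_pu ** 1.1
    return Q_EV, Q_ECO, Q_SH, Q_RH

Q_EV, Q_ECO, Q_SH, Q_RH = heat_release(1.2)   # 20% extra fuel
```

With 20% extra fuel, Q_EV rises less than 20% while Q_SH and Q_RH rise more, which is why corrective measures (gas recirculation, burner tilting, sprays) are needed to rebalance the heat distribution.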
Pressure Dynamics  A very simple interpretation model of the evaporator dynamics can be derived with the following "drastic" assumptions:

1. The fluid in the whole circulation loop (drum, risers, and downcomers) is saturated.
2. The metal walls in the entire circulation loop are at saturation temperature.
3. The steam quality in the risers is, at any time, linearly dependent on the tube abscissa.
4. The fluid pressure variations along the circulation loop can be neglected for evaluating mass and energy storage.

The first three assumptions can be roughly considered low-frequency approximations because, excluding rapid pressure variations, water subcooling at the downcomers' inlet is very small. Moreover, because of the very high value of the heat transfer coefficient in the risers (on the order of 100 kW/(m²K)), the metal wall very quickly follows any temperature variation in the fluid. Finally, steam quality is nearly linear at steady state because the heat flux to the furnace wall is evenly distributed.
Figure 79.5 Input-output structure of the process. w_a = air mass flow-rate; w_f = fuel mass flow-rate; p_g = gas pressure in the furnace; Q_ECO = thermal energy to economizer; Q_EV = thermal energy to evaporator; Q_SH = thermal energy to superheaters; Q_RH = thermal energy to reheater.
The last assumption is based on the fact that the pressure differences along the loop (which are essential for circulation) are of the order of 1% of the absolute fluid pressure in the drum, so that we may identify the pressure in the evaporator with the pressure p_D in the drum. Then the global energy balance and the global mass balance in the evaporator are

dE_EV/dt = Q_EV + w_w h_E − w_v h_VS(p_D),    (79.4)
dM_EV/dt = w_w − w_v,    (79.5)
where w_w is the feed-water mass flow-rate (mfr), w_v is the steam mfr at the drum outlet, h_E is the water enthalpy at the economizer outlet, h_VS(p) is the steam saturation enthalpy at the pressure p, Q_EV is the heat-rate to the evaporator, E_EV is the total energy stored in the evaporator (fluid and metal of drum, downcomers, and risers), and M_EV is the total mass stored in the evaporator. E_EV and M_EV, besides obvious geometrical parameters like volumes, depend on two process variables: the pressure p_D and the void fraction α in the evaporator, defined as the ratio between the volume occupied by steam and the total volume of the evaporator. A better index for the mean energy level in the evaporator is obtained by subtracting the total mass multiplied by the inlet enthalpy h_E from Equation 79.4:

dE'_EV/dt = Q_EV − w_v [h_VS(p_D) − h_E] − M_EV dh_E/dt.    (79.6)

Noting that h_E is subject to limited and very slow variations (the economizer is a huge heat exchanger exploiting a limited temperature difference between flue gas and water), so that dh_E/dt is usually small, Equation 79.6 can be interpreted by introducing the net energy storage in the evaporator, E'_EV := E_EV − h_E M_EV: the difference between the input heat transfer rate Q_EV and the power spent for steam generation, P_sg := w_v [h_VS(p_D) − h_E], in transient conditions, is balanced by the storage of the net energy E'_EV in the evaporator. Moreover, whereas the mass M_EV depends mainly on α, the net energy E'_EV depends mainly on p_D, because about 50% of the energy is stored in the metal walls, which are insensitive to α. Equation 79.6 can be rewritten approximately as

C_EV dp_D/dt = Q_EV − w_v [h_VS(p_D) − h_E],    (79.7)

where

C_EV = ∂E'_EV(p_D, ᾱ)/∂p_D,

ᾱ being the nominal void fraction. Equation 79.7 yields the fundamental dynamics of drum pressure and justifies the popular claim that drum pressure is associated with the stored evaporator energy. Equation 79.7 may be usefully rewritten in normalized per-unit (p.u.) form, i.e., referring energy and pressure to the nominal conditions, Q°_EV = w°_v (h_VS(p°_D) − h°_E), where the superscript ° denotes nominal value:

τ_EV dp_D,n/dt = Q_EV,n − P_sg,n,  with  τ_EV = C_EV p°_D / Q°_EV,    (79.8)
with the subscript n denoting the variable expressed in p.u. Typical values for the normalized "capacitance" τ_EV are 200-300 sec. It may also be observed that τ_EV is a function of the drum pressure p_D and is roughly inversely proportional to the pressure; thus, the pressure dynamics will slow down while reducing the operating pressure. Although for the evaporator h_E can be considered a slowly varying exogenous variable, w_v depends on the drum pressure p_D and on the total hydraulic resistance opposed by the cascade of the superheaters and the turbine. Let's first characterize the turbine. For simplicity, assume that the turbine control valves are governed in full-arc mode (i.e., with parallel modulation). Then the control stage of the turbine (generally of impulse type) can be viewed as the cascade of a throttle valve and a nozzle. This implies

w_T = C_V(x) X_V(p_N/p_T) √(ρ_T p_T),    (79.9)
w_T = K_N X_N(p'/p_N) √(ρ_N p_N),    (79.10)
where C_V(x) is the flow coefficient of the control valve set (dependent on the valve position x), ρ_T is the steam density at the throttle, p_N is the valve outlet pressure, X_V(·) is a suitable function of the valve pressure ratio, K_N is a nozzle flow constant, ρ_N the density at the nozzle inlet, p' the pressure at the nozzle outlet, and X_N a function similar to X_V. Because the HPT consists of many cascaded stages, the pressure ratio (p'/p_N) across the control stage will remain nearly constant with varying flow w_T, so that X_N(p'/p_N) = constant. Bearing in mind that superheated steam behaves like an ideal gas and that valve throttling is an isenthalpic expansion, so that ρ_N/ρ_T = p_N/p_T, the pressure ratio p_N/p_T can be eliminated by dividing Equation 79.9 by Equation 79.10; the ratio p_N/p_T turns out to be a monotonic function of C_V(x). Substituting this function in Equation 79.9 results in

w_T = f*_T(x) p_T / √(R T_T),    (79.11)

where f*_T(·) is a monotonic function of its argument. Equation 79.11 says that the cascade of the turbine and its control valves behaves like a choked-flow valve with a "special" opening law f*_T(x). Even when the turbine control valves are commanded in partial-arc mode (i.e., with sequential opening), one arrives at a flow equation of the same type as Equation 79.11, but differently dependent on the valve opening command signal x. To summarize, the HPT with its control valves determines a boundary condition for the steam flow given by Equation 79.11; with the typical opening strategies of full-arc and partial-arc admission, the global flow characteristic f*_T(x) looks like that in Figure 79.6. A more elaborate characterization of sequentially opened valves may be found in [7]. To obtain flow conditions for w_v instead of w_T (i.e., at the evaporator outlet), the flow through the superheaters must be described (see Figure 79.3). First-cut modeling of the superheaters' hydrodynamics is based on the following remarks:

1. Mass storage in the superheaters is very limited (as compared with energy storage in the evaporator) because steam has low density and the desuperheating spray flow w_ds is small compared with w_v. Thus, head losses along the superheaters may be computed with the approximation w_v ≈ w_T.
2. Head losses in the superheaters develop in turbulent flow, so that

p_D − p_T = k_SH w_T² / ρ_SH,    (79.12)

where ρ_SH is a mean density of the superheated steam and k_SH a constant. Equations 79.11 and 79.12 can be combined with Equation 79.7 or Equation 79.8 to build a simple model of the fundamental pressure dynamics. To this end, we derive a linearized model for small variations about a given steady-state condition, identified as follows:

- assuming that the unit is operated (at least in the considered load range) at constant throttle pressure, p_T at any steady state equals the nominal pressure p°_T;
- the unit is usually operated at constant throttle temperature (T_T = T°_T), so that the temperature profile along the superheaters does not change significantly; we may therefore assume that the mean superheating temperature T_SH, at any steady state, equals its value in nominal conditions T°_SH;
- based on Equations 79.1-79.3 and the related remarks, the load L (in p.u.) of the plant at any steady state equals the ratio between the steam flow w_T and its nominal value w°_T.

Moreover, the following assumptions are made: (a) superheated steam behaves like an ideal gas, p = ρ R T, where R is the gas constant; (b) in nominal conditions, the desuperheating spray mass flow-rate is zero: w°_ds = 0, so that w°_v = w°_T. The model will be expressed in p.u. variables by defining

δp = Δp/p° for any pressure p,
δw = Δw/w° for any mass flow-rate w,
δh = Δh/h° for any enthalpy h,
δT = ΔT/T° for any temperature T

(° denotes, as usual, nominal conditions). Then Equations 79.8, 79.11, and 79.12 yield the following linearized system:

τ_EV s δp_D = δQ_EV − β δw_v + L β1 δp_D + L β_E δh_E,
δw_T = δY + L δp_T + L β_T δT_SH,
δp_D − δp_T = β_γ δw_T,  with  β_γ := 2 γ° L,

where γ° is the p.u. superheater head loss at nominal load, β, β1, β_E, and β_T are coefficients depending on the nominal operating point, δw_v = δw_T + δw_ds, and

Y := w_T / p_T
Figure 79.6 Flow characteristic of the HPT: (a) full-arc command mode; (b) partial-arc command mode (with 4 control valves).
is the turbine "admittance", and h̄_E is the value of h_E at the linearization steady state. Note that γ° is usually about 0.05 or less, β undergoes limited variations (and is very close to 1), and β1 is positive for p°_D > 30 bar and generally small, because h_VS(p) is a flat thermodynamic function. Moreover, the temperature variations δT_SH are generally slow and of limited amplitude, so that L β_T δT_SH is totally negligible. Then, the pressure dynamics may be approximately represented by the very simple block diagram of Figure 79.7, where

β_p = 1 / (1 + 2 γ° L²).

Let's analyze the scheme of Figure 79.7, bearing in mind that 0 ≤ β1 (for p°_D > 30 bar), 0 < γ° ≤ 0.1, 0.9 < β_p < 1, 0 < β_γ < 0.1 and, of course, L* ≤ L ≤ 1, where L* is the minimal technical load (typically L* ≈ 0.3). We may observe that the pressure dynamics are characterized by a time constant t_p given by

t_p ≈ τ_EV / L,

that is, the ratio between the evaporator energy capacity and the load. Thus, the open-loop response of the pressure to exogenous variables slows down as the load decreases. Neglecting the effects of the small disturbances δh_E and δw_ds, a natural way to follow the plant load demand in the fixed-pressure operating strategy is to let the turbine admittance δY vary according to the load demand ΔL_d and to let the heat transfer rate δQ_EV vary so as to balance the power spent for steam generation, i.e., β δw_v. This means that

δQ_EV = β δw_v.    (79.18)

As a consequence, δp_D = 0 and

δw_T = δY (1 − L β_γ).

Because of the head losses along the superheaters (β_γ ≠ 0), the strategy expressed by Equation 79.18, keeping the energy storage (i.e., p_D) constant in the evaporator, actually determines a drop −β_γ δY of the throttle pressure p_T and, consequently, a reduced power output (δw_T = δY(1 − L β_γ) < ΔL_D). In other words, if one wants to keep the throttle pressure p_T constant when the load is increased, the energy storage in the evaporator also needs to be slightly increased because of the head losses:

δQ_EV = β δw_v + τ_EV β_γ s δw_T.    (79.19)

There is a seeming inconsistency in Figure 79.7, because the variable Y is considered as an input variable, but its definition, Y := w_T/p_T, implies that it depends on the control variable x and also on the throttle temperature T_T. However, it is common practice to equip the turbine control valve with a wide-band feedback loop of the type shown in Figure 79.8. Because valve servomotors today are very fast and no other lags are in the loop, at any frequency of interest for the model of Figure 79.7, Y ≈ Ŷ. So the turbine admittance actually becomes a control variable, and the loop of Figure 79.8 serves two complementary purposes: first, it linearizes the nonlinear characteristic of Figure 79.6 and, second, it rejects the effect of temperature fluctuations on the steam flow to the turbine, thereby decoupling the pressure dynamics from the temperature dynamics.
Figure 79.7 Block diagram of the linearized pressure dynamics.

Figure 79.8 Turbine admittance feedback loop.
In Figure 79.7, observe that only Equation 79.19 implies a boiler overfiring, i.e., a transient extra fuel supply during a load increase to adjust the energy stored in the evaporator. If the feedback compensation of Equation 79.18 is applied to the boiler, the pressure dynamics become slightly unstable because of the "intrinsic" positive feedback due to β1. The same happens if the feedback loop of Figure 79.8 is realized, as is sometimes the case, not as an admittance loop but as a simple mass flow-rate loop (i.e., omitting the division by p_T). Thus, when applying either mass flow-rate feedback, boiler stabilization must be provided.
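A minimal numerical sketch of linearized drum-pressure dynamics of the kind discussed here can make the two firing strategies concrete. The model structure and every coefficient below are illustrative assumptions consistent with the text (τ_EV of a few hundred seconds, small β1, small head-loss coefficient), not the handbook's exact equations.

```python
# Numerical sketch of linearized drum-pressure dynamics in per-unit
# (Euler integration; all coefficients are assumed illustrative values).

def simulate_dp(dY_step, L, tau_EV=250.0, beta=1.0, beta1=0.05, gamma0=0.05,
                dQ_policy=None, t_end=2000.0, dt=0.5):
    """Integrate  tau_EV * d(dpD)/dt = dQ - beta*dwT + L*beta1*dpD,
    with the algebraic relations  dwT = dY + L*dpT  and
    dpD - dpT = 2*gamma0*L*dwT.  dQ_policy(dwT) gives the firing
    response; by default the firing is not changed at all."""
    dpD = 0.0
    t = 0.0
    while t < t_end:
        # solve the algebraic part: dwT = dY + L*(dpD - 2*gamma0*L*dwT)
        dwT = (dY_step + L * dpD) / (1.0 + 2.0 * gamma0 * L ** 2)
        dQ = dQ_policy(dwT) if dQ_policy else 0.0
        dpD += dt / tau_EV * (dQ - beta * dwT + L * beta1 * dpD)
        t += dt
    return dpD

# A +5% admittance step with no extra firing drains the evaporator (dpD < 0);
# with the balancing policy dQ = beta*dwT, dpD is held at zero.
drop = simulate_dp(0.05, L=0.8)
hold = simulate_dp(0.05, L=0.8, dQ_policy=lambda dwT: 1.0 * dwT)
```

Rerunning the first case with a smaller L shows the open-loop response slowing down, consistent with the time constant scaling inversely with the load; and the `L*beta1*dpD` term is the small positive feedback that makes the balanced-firing strategy only marginally stable.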
Drum Level Dynamics  Computing the drum pressure by the scheme of Figure 79.7, we return to Equation 79.5, which establishes the global mass balance in the evaporator. Equation 79.5 may be linearized as

s δα = a_w (δw_v − δw_w) + a_p s δp_D,    (79.20)

where a_p is a coefficient accounting for the dependence of the stored-fluid densities on the pressure and

a_w = w°_v / [V (ρ̄_LS − ρ̄_VS)],

V is the total fluid volume in the evaporator, ρ_LS and ρ_VS are the liquid and vapor densities as functions of the pressure, and the upper bar denotes the steady state of linearization. If the unit is operated at constant pressure, a_w is independent of the load, and a_p only slightly dependent. However, Equation 79.20 determines only the global void fraction α, while the relevant variable for the control is the level in the drum. We may write

α = (V_r α_r + V_D α_d) / V,    (79.21)
where V_r and V_D are the volumes of the risers and of the drum, respectively, and α_r and α_d are the separate void fractions relative to V_r and V_D. If we assume that, at the considered steady state, the level y_D is equal to the drum radius R_D, then,
THE CONTROL HANDBOOK
Combining Equations 79.20, 79.21, and 79.22, the following equation is obtained, with
k_r = π V_r / (2 V_D).
Equation 79.23 shows that the drum level is subject to three different kinds of variations: the first, of integral type, is due to the imbalance between feed-water flow and outlet steam flow; the second, of proportional type, arises because, even with constant stored mass, the mean fluid density depends on the evaporator pressure p_D; the third, more involved, comes from possible variations of the void fraction in the risers and might occur rapidly, because any variation Δα_r immediately reflects onto δy_D. We need to understand where Δα_r comes from. Recall the assumptions at the beginning of Section 79.1.3, write equations similar to Equations 79.4 and 79.5 but limited to the circulation tubes (i.e., to the downcomers and the risers), and derive the "net energy" stored corresponding to Equation 79.6, where, instead of the enthalpy h_E, the inlet enthalpy h_DS of the downcomer tubes is used:
where E_ct is the total energy (fluid + metal) stored in the circulation tubes, M_ct the corresponding fluid mass, h_LS and h_VS the liquid and vapor saturation enthalpies, and x_r and w_r the steam quality and the mass flow-rate at the risers' outlet. Then, based on assumption (3) stated at the beginning of Section 79.1.3, the following relationship is obtained:
To solve the model, we need to derive the circulation mass flow-rate w_r, which is obtained from the momentum equation applied to the circulation tubes:
where ρ_r is the mean density in the risers and C_dc, C_r are suitable constants yielding the head losses in the downcomer and riser tubes, respectively. Equations 79.25 and 79.26 may be used to eliminate w_r and x_r from Equation 79.24; through trivial but cumbersome computations, the following linearized model can be obtained for Δα_r (in L-transform form):
where τ_rt is a normalized capacitance similar to τ_EV in Equation 79.8 but related only to the circulation tubes (typically τ_rt ≈ 0.7 τ_EV), T_2 is a small time-constant (a few seconds) associated with the dynamics of the void fraction within the risers, and h_2 and h_1 are suitable constants. The difference δQ_EV − τ_rt s δp_D is the heat transfer rate available for steam generation in the risers, given by the (algebraic) sum of the input thermal energy δQ_EV and the energy −τ_rt s δp_D released in the case of pressure decrease, corresponding to a reduction of the stored energy. The L-transformation of Equation 79.23 and substitution of Equation 79.27 give
where
The parameters of the model Equation 79.28 are dimensionless, with the following typical values: τ_D ≈ 130 sec., k_2 ≈ 0.25, k_1 ≈ 0.5, T_2 ≈ 4 sec., τ_rt ≈ 0.7, τ_EV = 150 sec. (at nominal pressure), k_r ≈ 0.8. Since T_2 << τ_rt, the time constant T_1 ≈ 70 sec. is always positive and is essentially determined by the "capacitance effect" (τ_rt). The model Equation 79.28 clearly accounts for the well-known shrink and swell effect due to the nonminimum-phase zero (1 − sT_1). Because T_2 is very small, any perturbation producing sudden pressure derivatives causes a sudden variation of the drum level in the direction opposite to the long-term trend. To see this, referring to Figure 79.7, consider a step perturbation of δY, with, e.g., δY = A/s. Then
with μ and T_3 suitable constants. At nominal load (L = 1) and with typical values (ρ_y = 0.1, a = 1, ρ_p = 0.9, = 0.05, τ_EV = 200 sec.), μ′ ≈ 1.06, T_3 ≈ 248 sec., μ″ ≈ 0.054, T_4 ≈ 4600 sec. The effect of the step variation with A = 0.1 is depicted in Figure 79.9. Equation 79.29 is a very useful model for conceiving the level control structure.
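The shrink-and-swell behavior described above can be reproduced with a toy transfer function having the same nonminimum-phase structure. The coefficients of Equation 79.29 are garbled in this reproduction, so the numbers below (a 4 s zero state, a 70 s lag, step amplitude A = 0.1) are only loosely inspired by the typical values quoted above; the point is the initial move opposite to the long-term trend.

```python
# Toy shrink-and-swell response: level = A * (1 - s*Tz) / (s * (1 + s*Tp)),
# driven by a step of amplitude A. Tz, Tp, A are illustrative values only.
def simulate_level(A=0.1, Tz=4.0, Tp=70.0, dt=0.01, t_end=400.0):
    v = 0.0   # integrator state: integral of the step input
    x = 0.0   # lag state of the (1 - s*Tz)/(1 + s*Tp) factor
    samples = {}
    for i in range(int(t_end / dt) + 1):
        t = i * dt
        # output realizes (1 - s*Tz)/(1 + s*Tp) applied to v:
        y = x * (1.0 + Tz / Tp) - v * (Tz / Tp)
        if abs(t - 2.0) < dt / 2 or abs(t - 400.0) < dt / 2:
            samples[round(t)] = y
        x += (v - x) / Tp * dt   # forward-Euler update of the lag
        v += A * dt              # forward-Euler update of the integrator
    return samples

resp = simulate_level()
# Early response moves opposite to the long-term trend (shrink before swell).
```

At t = 2 s the simulated level is still below zero while the long-term trend is a ramp upward, which is the inverse response seen in Figure 79.9.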
(Here ρ_r = ρ_LS − (ρ_LS − ρ_VS)/2 denotes the mean riser density used in Equation 79.26.)

Reheat and Superheat Steamside Dynamics
Superheaters and reheaters are large heat exchangers with steam flowing into the tubes and gas crossing the tube banks
Figure 79.9 Drum level dynamics (time axis in seconds).
in cross-flow.

Figure 79.10 Conceptual scheme of temperature dynamics.

There are some general properties that are worth recalling:
1. The heat transfer coefficient on the gas side is much smaller than the one on the steam side, so that the steady-state behavior is nearly independent of steam-side coefficients;
2. The dynamics of these heat exchangers are essentially due to the considerable energy stored in the metal wall, because flue gas has negligible density and steam has much lower capacitance than the corresponding metal wall;
3. The mass storage creates much faster dynamics than the energy storage, because only steam is involved.

Property 3 can easily be checked bearing in mind that the fundamental time constant of mass storage is
τ_MS = M_V / w_V,
where M_V is the mass of steam within the heat exchanger and w_V the mass flow-rate flowing through it, whereas the fundamental time constant of energy storage is
τ_ES = (M_m C_m + M_V C_v) / (w_V C_p),
where C_v and C_p are the specific heats of steam at constant volume and pressure, and M_m and C_m the mass and the specific heat of the metal wall. Because M_m C_m >> M_V C_v and C_p ≈ 1.3 C_v, τ_ES >> τ_MS. For a typical superheater τ_MS is a few seconds, but τ_ES is more than 20 times τ_MS. Then, when considering temperature dynamics, which, because of property (1), are due to the metal wall capacitance, mass storage may be neglected (i.e., one can consider the steam flow independent of the tube abscissa). The superheater or reheater outlet temperature T_o is influenced by three different variables: the heat transfer rate Q_e to the external wall, the steam flow w_v, and the inlet temperature T_i. Pressure fluctuations within the heat exchanger have a limited influence on T_o and may be neglected. For small variations the situation is described in Figure 79.10. It is relevant to control design to characterize the transfer functions G_T(s), G_w(s), and G_Q(s). It is known [3] that adequate modeling of superheaters and reheaters requires a distributed-parameter approach. However, reasonable lumped-parameter approximations may be used for G_T, G_w, and G_Q.
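A back-of-the-envelope check of the two time constants, using the definitions just given; the numerical figures below are illustrative, not taken from the text.

```python
def mass_storage_tau(m_steam, w_steam):
    """tau_MS = M_V / w_V: residence time of the steam mass."""
    return m_steam / w_steam

def energy_storage_tau(m_metal, c_metal, m_steam, c_v, w_steam, c_p):
    """tau_ES ~ (M_m*C_m + M_V*C_v) / (w_V*C_p): dominated by the metal wall."""
    return (m_metal * c_metal + m_steam * c_v) / (w_steam * c_p)

# Illustrative superheater figures (assumed): 300 kg of steam, 100 kg/s flow,
# 50 t of metal at 0.5 kJ/(kg K), C_v = 1.6 and C_p = 2.1 kJ/(kg K).
tau_ms = mass_storage_tau(300.0, 100.0)                        # a few seconds
tau_es = energy_storage_tau(5.0e4, 0.5, 300.0, 1.6, 100.0, 2.1)
```

With these numbers τ_ES comes out roughly 40 times τ_MS, consistent with the ">> 20 times" statement above.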
G_w and G_Q behave like first-order transfer functions, similar to each other, dominated by a time constant not far from τ_ES and with gain essentially dependent on the gas-to-wall heat transfer coefficient. The function G_T(s) is, on the contrary, approximated well by
G_T(s) ≈ μ_T / (1 + s τ_T)^N,
where N is the integer nearest to S_i γ_i / (2 w_v c_p) (with S_i and γ_i the steam-to-wall exchange surface and heat transfer coefficient) and μ_T is less than 1. Typically, secondary superheaters are short heat exchangers with N ≈ 2. Larger reheaters may have N ≈ 3-4. Moreover, it appears that the process is nonlinear, because τ_T is nearly proportional to the inverse of the load. N is only slightly dependent on w_v because γ_i ∝ w_v^0.8. The temperature dynamics are affected by multiple lags, varying with the load. In addition, transducers for steam temperature are generally affected by a small (a few seconds) and a larger (some tens of seconds) time lag due to the thermal inertia of the cylinder where the sensor is placed. Of course, multiple lags are in the loop when ΔT_i is the control variable. This is the case when desuperheating spray is used to achieve mixing between the superheated steam at the outlet of the preceding component (e.g., the primary superheater) and the water spray, modulated by a suitable valve. Because the attemperator has a very small volume, storage in it is negligible, and its equations are given by steady-state mass and energy balances (see Figure 79.11, referring to the superheating section):
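The weak load dependence of N, together with the roughly 1/load growth of the lag time constant, can be checked numerically. The surface, heat transfer coefficient, flow, and the γ ∝ w^0.8 scaling below are all hypothetical figures chosen only to make the point.

```python
def transfer_order(surface, gamma, w_steam, c_p):
    """N = nearest integer to S*gamma / (2*w*c_p), per the text."""
    return round(surface * gamma / (2.0 * w_steam * c_p))

def gamma_at_load(gamma_nom, load):
    """Steam-side coefficient assumed to scale as flow^0.8."""
    return gamma_nom * load ** 0.8

# Illustrative secondary-superheater figures (assumed, not from the text):
S, gamma_nom, w_nom, c_p = 1000.0, 1.0, 100.0, 2.3
n_full = transfer_order(S, gamma_at_load(gamma_nom, 1.0), w_nom * 1.0, c_p)
n_half = transfer_order(S, gamma_at_load(gamma_nom, 0.5), w_nom * 0.5, c_p)
# N barely moves with load, while the lag time constant ~ 1/load doubles:
tau_full, tau_half = 30.0 / 1.0, 30.0 / 0.5   # tau_T proportional to 1/load
```

Halving the load halves both the flow and (via w^0.8) most of the numerator, so N stays at 2 while the per-stage lag doubles, matching the nonlinearity described in the text.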
w_v + w_ds = w_T   (79.31)
and
w_v h_v + w_ds h_ds = w_T h_i.   (79.32)
In normal plant operation, the steam flow w_T in the secondary superheater is imposed (over a wide band) by the load controller, h_v is determined by the upstream superheater, and h_ds is nearly constant. The secondary superheater inlet temperature T_i is given by the following variation equation:
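The attemperator balances of Equations 79.31 and 79.32 give the mixed inlet enthalpy directly; a minimal sketch with invented enthalpy values:

```python
def attemperator_outlet(w_v, h_v, w_ds, h_ds):
    """Steady-state mass/energy balance (Equations 79.31-79.32):
    w_v + w_ds = w_T and w_v*h_v + w_ds*h_ds = w_T*h_i."""
    w_t = w_v + w_ds
    h_i = (w_v * h_v + w_ds * h_ds) / w_t
    return w_t, h_i

# Illustrative enthalpies (kJ/kg): hot steam at 3100, spray water at 700.
w_t, h_more_spray = attemperator_outlet(95.0, 3100.0, 5.0, 700.0)
_, h_less_spray = attemperator_outlet(98.0, 3100.0, 2.0, 700.0)
# More spray -> lower inlet enthalpy (hence temperature) to the superheater.
```

This is why Δw_ds acts as the temperature control variable: modulating a few percent of spray flow shifts the mixed enthalpy, and therefore T_i, substantially.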
Figure 79.11 Desuperheating spray (from the primary to the secondary superheater; h_v, h_ds, h_i denote enthalpies).
Power Generation
According to Equations 79.1-79.3 and the subsequent remarks, it can be approximately assumed that
δP_e = k_HP δw_T + k_RH δw_R,   (79.34)
with the superscript ° denoting nominal values, and δ_z = Δz/z° for any variable z. We know from Section 79.1.3 that w_T is given by the scheme of Figure 79.7 and is sensitive only to the high-pressure part of the steam-water subsystem. To understand the factors influencing w_R, let us refer to Figure 79.12. Because the HPT has negligible storage,
where
and the overbar denotes the steady state of linearization. In Equation 79.33, Δw_ds is the control variable, which directly modulates the temperature T_i, and Δw_T and ΔT_v represent disturbances to the temperature. Because the influence of Δw_ds on pressure is very small, temperature control via desuperheating spray does not significantly influence the boiler pressure. In Figure 79.10, the heat transfer rate ΔQ_e may also be used to control temperature. This would be very effective because G_Q(s) incorporates fewer phase lags than G_T(s). Unfortunately (see Section 79.1.3) it is impossible to modulate the heat transfer rate to a single heat exchanger in the boiler (e.g., by varying fuel flow or recirculation gas flow) without simultaneously influencing the heat released to all of the other heat exchangers in the boiler. For instance, when recirculation gas dampers are modulated to control the reheat temperature, the heat transfer rates to the evaporator and to the superheaters are also simultaneously varied. This fact generates interaction among the different process variables, to the extent that it is often necessary to introduce feedforward decoupling actions to achieve acceptable control performance. When spray is used to control the superheater temperature, Δw_v and ΔQ_e constitute disturbances for the temperature control, induced, for instance, by the fuel flow variations required by load-pressure control. Fortunately (see Figure 79.7), when the heat transfer rate to the evaporator is increased, the steam generation is also increased, so that Δw_v and ΔQ_e grow nearly as much. Because G_w(s) and G_Q(s) are similar, the global disturbance ΔD_T is much smaller than the two individual disturbances. However, as recalled in Section 79.1.3, when the heat transfer rate Q_EV to the evaporator is varied by the fuel flow, the heat transfer rates to the superheaters and reheater do not vary in the same percentage, so that Δw_v (nearly proportional to ΔQ_EV) does not exactly balance ΔQ_e.
This means that ΔD_T ≠ 0.
The reheater is a large steam heat exchanger; feed-water heaters (FWH) fed by steam extractions are large tube-and-shell heat exchangers where the steam extractions are condensed to heat the feedwater. Both of these components have significant mass storage. The FWH represented in Figure 79.12 accounts (in an equivalent way) for the overall capacitance of the FWHs. Neglecting the desuperheating spray (normally zero), the relevant mass balances are
where M_R is the steam mass in the reheater, M_FW is the steam mass in the FWHs, and w_c is the condensation mass flow-rate in the FWHs. Pressure losses in the reheater are small and can be neglected. Because V_E is normally fully open, we may assume that the pressure in the entire reheater is the same as at the RHT inlet. Moreover, the reheater temperature has much slower dynamics than mass storage (see Section 79.1.3), so that it may be considered constant while evaluating dM_R/dt. Applying an equation similar to Equation 79.10 to the RHT and considering superheated steam as an ideal gas, the RHT flow equation is
w_R = k′_RHT p_R / √(R T_RH),   (79.38)
where T_RH is the reheater outlet temperature, R is the gas constant, and k′_RHT is a suitable constant. Then, taking variations of Equations 79.36, 79.37, and 79.38,
where τ_R is a time constant resulting from the sum of the storage capacitances of the reheater and of the FWHs, multiplied by the flow resistance of the RHT, and the last term results from considering that the
Figure 79.12 Steam reheating (steam extractions w_sr, desuperheating spray, equivalent condenser; w_x = mass flow-rate, V_E = emergency valve).
condensation flow-rate variation Δw_c is essentially due to the feedwater flow variation δw_w. In controlled conditions δw_w strictly follows δw_T, so that Equation 79.39 becomes
Equation 79.39 deserves a couple of remarks [5]: the time constant τ_R has values of about 10-12 sec., nearly 50% due to the FWHs' capacitance; the possibility of varying the principal steam flow δw_R by changing the feed-water flow (δw_w) has suggested one of the most recent expedients for realizing quick power variations even without HPT throttle reserve [6]. When special control of feed-water is not applied, Equation 79.40 can be substituted into Equation 79.34 with the following conclusion:
realizing that about 1/3 (k_HP ≈ 0.3-0.4) of the electric power output is strictly proportional to the steam flow and about 2/3 (k_HP + k_RH = 1) is affected by a time lag τ_R, which cannot be neglected when the load varies rapidly (as in the case of load rejection incidents or turbine speed control problems). Finally, observe that the parameters of Equation 79.41 are only slightly dependent on plant load.
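The 1/3-immediate, 2/3-lagged split can be sketched as a step response with the structure of Equation 79.41. The gains and τ_R below are the typical values quoted in the text, used here only for illustration.

```python
import math

def power_response(t, k_hp=0.33, k_rh=0.67, tau_r=11.0):
    """delta P_e for a unit step in throttle flow: the HP fraction k_HP acts
    immediately, the reheat fraction k_RH is lagged by tau_R (Eq. 79.41 form)."""
    return k_hp + k_rh * (1.0 - math.exp(-t / tau_r))

p0 = power_response(0.0)    # immediate jump: only ~1/3 of the final power
p60 = power_response(60.0)  # close to the full unit step after ~5 tau_R
```

This is the lag that matters in load rejection and speed control transients: two thirds of the power arrives only over several reheater time constants.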
Combustion and Air-Gas Dynamics
Referring to Figure 79.2, we see that the air-gas subsystem forms a complex circuit, where the largest storage (of mass and, thus, of energy) is the furnace. Apart from the air preheaters (2) and (3), the dynamics of the air-gas subsystem are very fast, so that it is substantially decoupled from the dynamics of the steam-water subsystem. Air-gas dynamics are relevant only to two special problems:

1. control of furnace pressure in a balanced draft furnace (recognizable from the induced draft fan (11) of Figure 79.2), and
2. flame stability at low loads in coal-fired plants.

The second problem requires a very complex analysis and is relevant only in very particular situations. To analyze (1), two dynamical phenomena must be studied: combustion kinetics, i.e., the chemical process governing fuel oxidation and its heat release, and mass and energy storage in the furnace, possibly augmenting the furnace storage to account for the rest of the air-gas circuit. Distributed-parameter modeling would be required to describe these phenomena accurately. Again, a simple lumped model may be used to explain the basic concepts. A simple way to account for combustion kinetics is to introduce a combustion time constant t_c relating the fuel flow-rate w_f to the heat transfer rate Q_f in the furnace,

ΔQ_f = (1 / (1 + s t_c)) H_f Δw_f,   (79.42)

where H_f is the heat value of the fuel. The time constant t_c is of the order of a second, smaller for oil or gas, and larger for pulverized coal. Mass and energy balances for the furnace can be derived from the following assumptions: the gas pressure p_g is uniform in the furnace, heat transfer from gas to wall in the furnace is computed from the mean gas temperature T_g, and the combustion gas behaves as an ideal gas.
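A step response of the combustion lag of Equation 79.42 can be written in closed form; the heat value assumed below is a typical figure for oil, and t_c = 1 s, both used only for illustration.

```python
import math

def heat_release(t, d_wf, h_f, t_c):
    """Step response of Eq. 79.42: dQ_f = H_f * d_wf / (1 + s*t_c)."""
    return h_f * d_wf * (1.0 - math.exp(-t / t_c))

# Illustrative: +1 kg/s of oil (H_f ~ 42 MJ/kg assumed), t_c = 1 s.
q_early = heat_release(0.1, 1.0, 42.0, 1.0)  # still far from the final value
q_late = heat_release(5.0, 1.0, 42.0, 1.0)   # ~99% of H_f * d_wf
```

For pulverized coal the same structure holds but with a larger t_c, so the heat release trails the fuel command by several seconds.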
Then, without considering gas recirculation:
and
where w_a is the air flow-rate with enthalpy h_a, Q_w is the heat transfer rate radiated to the wall, w_go is the outlet gas flow-rate with enthalpy h_go, M_g = V_f ρ_g is the total mass of gas in the furnace (ρ_g is the mean density), and E_g = c_vg T_g M_g is the total energy of the gas (c_vg is the specific heat at constant volume). The ideal gas law and radiation law equations are

p_g V_f = M_g R_g T_g  and  Q_w = k_w (T_g^4 − T_w^4) ≈ k_w T_g^4,

where k_w is a constant and T_w is the wall temperature (T_w^4 << T_g^4). Boundary conditions are determined by the head losses along the air-gas circuit and by the forced and induced draft fans. Considering the outlet boundary conditions,

p_g = p_a − p_v + [k_g + k_f(z)] w_go^2,

where p_a is the atmospheric pressure, p_v is the head at w_go = 0 of the induced draft fan, k_g is the constant yielding the head losses along the gas circuit, and k_f(z) is the constant yielding the head losses of the fan depending on the control inlet vane position z. Assuming that the forced draft fans are controlled with the air flow-rate w_a and that, in the mass equation, w_f (<< w_a) may be approximated by Q_f/H_f, linearization of the model Equations 79.43-79.47 yields

where all the variables δy are expressed in p.u., the superscript * denotes the value at the linearization steady state, L is the plant load, and τ_g0, τ_1, τ_2, τ_3, τ_4, τ_5, μ are suitable constants computed from design or operating data. Omitting cumbersome computations for brevity, Equation 79.48 deserves some remarks. The time constants τ_j, j = 2, ..., 5, are proportional to the furnace crossing time t_f = (M_g/w_g)*. M_g* is nearly insensitive to the load above the technical minimum, w_g* is nearly proportional to the load, and the time constants τ_j, j = 2, ..., 5, become larger as the load decreases. The smallest time constant, on the contrary, decreases as the load decreases; because it is only a few tenths of a second at the maximum load, it may be neglected. Typical values are t_f ≈ 2-4 sec., τ_1 < 0.1 t_f, τ_2 ≈ 0.25 t_f, τ_3 ≈ 5-6 t_f, τ_4 ≈ 0.1 t_f; τ_g0 is about 0.1. Due to the large value of τ_3, a fuel trip from maximum load (δQ_f = −1) causes large pressure drops, with extreme risk of implosion. Inlet vane control is introduced to attenuate the effect of such a disturbance.

Concluding Remarks on Dynamics
To summarize the analysis of the preceding sections, it is useful to identify the interactions in the system. Consider the input control variables, namely, the fuel flow-rate w_f, the air flow-rate w_a, the turbine admittance Y, the feed-water flow-rate w_w, the recirculation gas flow-rate w_rg, the desuperheating spray flows w_ds and w_dr, and the induced fan vane position z, and the output variables to be controlled, namely, the electric power P_e, throttle pressure p_T, drum level y_D, throttle temperature T_T, reheater outlet temperature T_RH, furnace pressure p_g, and air-to-fuel ratio λ_af (or combustion efficiency). From Figure 79.7 and Equation 79.41, P_e and p_T are strictly related and simultaneously affected by the heat transfer rate (i.e., w_f) and the turbine admittance Y.

Flow rates w_ds and w_dr slightly affect P_e and p_T because they have only a small mass effect on the steam flow. Similarly, the feed-water flow w_w only marginally affects power (see Equation 79.39). Air flow is nearly proportional to fuel flow, so that trimming actions to optimize λ_af have little influence on the heat transfer rate Q_EV to the evaporator. Only w_rg changes the heat release partition in the boiler; the control bandwidth of this variable (used to control the reheat temperature) is, however, rather narrow for that reason. Therefore, pressure and power form a (2x2) subsystem, tightly coupled but with limited disturbance from outside.

Drum level y_D is the only output variable markedly influenced by the feed-water flow w_w; y_D is also "disturbed" (see Equation 79.28) by the steam flow w_v, by the drum pressure p_D, and by the evaporator heat transfer rate Q_EV, so that w_w does not influence the load-pressure subsystem, but y_D is considerably influenced by the control variables of that subsystem. The superheated temperature T_T, according to Figure 79.10 and Equation 79.33, is influenced by w_ds and is "disturbed" by the steam flow and the heat transfer rate Q_SH which, in turn, follows fuel flow variations. So, the disturbance of the power-pressure subsystem to temperature is relevant, even though there is a natural partial compensation due to the boiler behavior (see ΔD_T in Figure 79.10). Reheater spray follows a rule similar to superheater spray. Furnace pressure is dynamically decoupled from all of the variables related to the steam-water side subsystems and, according to Equations 79.42 and 79.48, is affected by the air flow, the fuel flow, and the induced draft fan inlet vane position (normally used as its control variable). A similar criterion applies to the fuel-air ratio. Input-output relations are summarized in Figure 79.13, where only the major interactions are considered. Finally, in coal-fired units fuel is supplied by pulverizers, not influenced by the rest of the plant. The pulverizers have sluggish dynamics due to the dead time of the grinding process and to the uncertain behavior of such machines caused by coal quality variation and machine wear. For those plants, where fuel flow cannot be considered a directly manipulated variable, the slow response of the coal pulverizers must be cascaded with the evaporator dynamics of Figure 79.7, often creating severe problems for system stability and control.
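The load scaling of the furnace time constants can be sketched directly from t_f = M_g/w_g, assuming M_g load-insensitive and w_g proportional to load, as stated above; the numbers are invented for illustration.

```python
def furnace_crossing_time(m_gas, w_gas_nom, load):
    """t_f = M_g / w_g; M_g nearly load-insensitive, w_g proportional to load."""
    return m_gas / (w_gas_nom * load)

def tau_j(t_f, factor):
    """tau_2..tau_5 are proportional to t_f (factors quoted in the text)."""
    return factor * t_f

# Illustrative: M_g / w_g* = 3 s at full load; tau_3 ~ 5.5 * t_f.
tf_full = furnace_crossing_time(300.0, 100.0, 1.0)
tf_half = furnace_crossing_time(300.0, 100.0, 0.5)
tau3_full, tau3_half = tau_j(tf_full, 5.5), tau_j(tf_half, 5.5)
```

Halving the load doubles the crossing time and hence τ_3, which is why furnace pressure control becomes more sluggish, and implosion protection more critical, at low load.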
Figure 79.13 Process interactions.

79.1.4 Control Systems: Basic Architectures

Introductory Remarks on Control
This section discusses plant control at the "principal regulations level" (see Figure 79.4), which is organized around four control subsystems: load and pressure control: the regulation of power generation and steam pressure at the throttle; drum level control: the regulation of the water level in the drum (steam/water separator); temperature control: the regulation of steam temperature at the superheater and reheater outlets; and combustion control: the regulation of heating rate (fuel flow), excess oxygen (air flow), and furnace pressure. Power plant control systems have evolved over many decades. Today there are thousands of electric generating plants operating throughout the world. It would be difficult to find more than a few with identical control systems. Yet certain basic configurations are almost universally employed with relatively minor variations. These are described in the following paragraphs.

Control of Pressure and Generation
As discussed in Section 79.1.3, steam pressure and power generation are tightly coupled process variables. Both are strongly affected by the energy (fuel) input and the throttle valve position. This two-input, two-output system must be considered as such. Even though single-input, single-output compensation arrangements are successful, they must be designed (tuned) as a unit. Three basic architectures are commonly employed: turbine following: generation is paired with fuel rate and pressure with throttle valve position; boiler following: generation is paired with throttle valve position and pressure with fuel rate; and coordinated control: a true two-input, two-output configuration of which there are variations. The turbine-following arrangement, shown in Figure 79.14, has distinctive attributes. First, the control of the energy input to the boiler is relatively slow compared with the positioning of the throttle valve. As a result, turbine-following control allows rapid regulation of throttle pressure and slow, but stable, regulation of generation. Consequently, turbine-following control is preferred for plants not used for load following.
Figure 79.14 The turbine-following configuration for pressure and generation control.

The boiler-following architecture is illustrated in Figure 79.15. It produces substantially more rapid responses to generation commands, but they can be quite oscillatory. Moreover, the pressure response is typically oscillatory.
Figure 79.15 The boiler-following configuration for pressure and generation control.

Modern requirements for load following have led to the widespread use of two-input, two-output pressure and generation control. There are a number of approaches to coordinated control. One configuration (commonly referred to as "coordinated control" or "integrated control") is shown in Figure 79.16. Properly designed coordinated-control systems can provide excellent response to load demand changes.
"p&p*m -4 y
Valve Tu~bine Servo
~ r p .
A coordinated-control configuration for pressure and generation control.
F i e 79.16
Generator speeds naturally synchronize because of their interconnection via the electrical network. Ultimately, the (steady-state) network synchronous speed is regulated by a system-level controller through the assignment of generation commands to individual units. Nevertheless, speed governing on a substantial fraction of the network's generating units is essential to damping the power system's electromechanical oscillations. As a result, in many plants, the goal of turbine flow control includes speed governing as well as regulating power output. This dual requirement is almost always accomplished with the "frequency bias" arrangement shown in Figure 79.17. Here the turbine speed error is fed directly through a proportional compensator to the turbine valve servo and, simultaneously, a frequency error correction is added to the power generation demand signal through the frequency bias constant B_f. Ideally, B_f is precisely the sensitivity of the system load to the synchronous frequency. The frequency bias arrangement can be incorporated in either the boiler-following or coordinated-control configurations.
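A minimal sketch of the frequency-bias arithmetic just described; B_f and the governor gain below are placeholder numbers, not plant data.

```python
def turbine_demand(p_sched, f_meas, f_nom=50.0, b_f=20.0, k_gov=25.0):
    """Frequency-bias arrangement: the generation demand gets B_f * df added,
    and the speed error also acts directly on the valve with gain k_gov.
    B_f (MW/Hz) and k_gov are illustrative numbers only."""
    df = f_nom - f_meas              # positive when the grid runs slow
    p_demand = p_sched + b_f * df    # biased generation demand
    valve_trim = k_gov * df          # direct proportional governor action
    return p_demand, valve_trim

# A 0.1 Hz under-frequency event raises both the demand and the valve trim:
p_dip, trim_dip = turbine_demand(500.0, 49.9)
```

The direct valve path supplies the fast damping action, while the biased demand makes the slower boiler controls back it up, matching the dual role described above.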
position servo
I
LU!L
cashlion servo Clrculat~on LOOP
Figure 79.18 (a).A typical single-element drum level configuration. (b).A typical two-element drumlevel configuration. (c).A typical three-
element drum level configuration. will be examined in more detail in Section 79.1.5.
Temperature Control The turbine valve may be used to stabilize turbine speed and regulate generation with the "frequency bias" modification of the boiler-following or coordinated-control configurations. Figure 79.17
Drum Level Control The goal of the drum level controller is to manipulate the flow of feedwater into the drum so that the drum water level remains sufficiently close to a desired value. Feedwater flow is typically regulated by a flow control valve or by adjusting the speed of the feedwater pump. Drum level controllers are classified as single-, two-, or three-element. A single-element level controller utilizes feedback of a drum level measurement as illustratedin Figure 79.18(a). Two- and three-element controllers include "feedforward" measurements of steam flow and both steam and water flow, respectively, as illustrated in Figures 79.18(b) and 79.18(c). The importance of the steam flow feedforward can be appreciated by examining the leading term in Equation 79;28 which shows that the drum level deviation is proportional to the integral of the difference between steam and water flow. Any sustained difference between steam flow and water flow can quickly empty or fill the drum. In current practice, three-element controllers are typically used during normal operation but are not suitable at very low loads, where it is common to switch to singleelement configurations. Drum level dynamics are also nonminimum phase (the so-called "shrink and swell" effect) as can be observed in the last term of Equation 79.28. Drum level control
An important goal of plint control is to regulate &team temperature at the turbine entry points, i.e., at the superheater and reheater outlets. There are a number of control means for accomplishing this. The most direct are attemperators which inject water at the heat exchanger inlet. By moderating the fluid temperature enteri?g the heat exchanger, it is possible to control the outlet temperat$re. Other possibilities are associated with adjusting the heat trahsferred to the fluid as it passes through the exchanger. This can be accomplished by changing the mass flow rate of gas past the heat transfer surfaces with recirculated gas or excess air flow, or the gas temperature at the exchanger surfaces by altering the burner positions or "tilt" of the burners. Sometimes a combination of these methods is employed. In the following discussion it is assumed that control is affected by attemperators. The dynamics of superheaters and reheaters have been discussed in Section 79.1.3. Recall that the response of the outlet temperature to a change in inlet temperature is characterized by series of first-order lags with time constants that vary inversely with the steam flow rate through the heat exchanger. Because of the significant time delay of the outlet temperature response, a cascade control arrangement, as illustrated in Figure 79.19, is typically required for temperature regulation. The attemperator outlet temperature is,a convenient intermediate feedback variable although, depending on the heat exchanger construction, other intermediate steam temperatures may also be available for measurement. Because of the strong dependence of she time lag on steam flow, parameterization of the regulator parameters on steam flow is necessary for good performance over a wide load range. Some control systems incorporate disturbance feedfor-
79.1. CONTROL O F ELECTRIC POWER GENERATING PLANTS trollers. Calorimetry information (notably excess oxygen) is provided on a sampled or continuous basis. Minimum fuel and air (primary and secondary) flow constraints are also accommodated.
ward.
79.1.5 Design of a Drum Level Controller Issues Involving Design Over a Wide Load Range
Figure 79.19 A fairly sophistic;ttedcontrol temperature arrangement employs a cascade arrangement, gain scheduling and disturbance feed forward.
Combustion Control The main purpose of the combustion control system is to regulate the fuel and air inputs into the furnace to maintain the desired heat input into the steam generation process while assuring appropriate combustion conditions (excess oxygen). In most instances, regu1,atingthe furnace gas pressure is a secondary, but important, function of the combustion controller. A typical combustion control configuration is illustrated in Figure 79.20. Recall that the heating rate command signal Q is generated by the pressure-generation controller (Figure 79.14).
Figure 79.20 Combustion control includes regulation of fuel and air flow and furnace pressure. The details of the combustion control system depend significantly on the type of fuel. Oil and gas are typically regulated with flow control valves. These controls are usually fast. Also, oil and gas flow and the caloric content of the fuel can be reliably measured. Pulverized coal presents a different situation. Fuel flow is regulated by adjusting the "feeder speed" (which directly changes the flow rate of coal into the pulverizer) and the primary air flow (the air flow through the pulverizer that carries thspulverized coal into the furnace). The pulverizing process is quite slow and adds a delay of 100-300 sec. in the fuel deliveryprocess. Moreover, the flow rate of coal is difficult to estimate accurately and coal's caloric content varies. The master combustion control proportions the fuel and air requirements and establishes set points for the lower level con-
The control architectures described in Section 79.1.4 represent a starting point for detailed design of the plant control system. Once the configuration is chosen, it is necessary to determide compensator parameters that produce acceptable performance over a wide range of operating conditions. Doing this involves applying various analytical control design tools in conjunction with detailed simulation studies and generallyterminating with field tuning during plant commissioning. This section provides an example of the analytical phase in designing a drum level controller. Certain necessary plant operations, e.g., startup, rapid load following and controlled runback, evoke behavior that can only be attributed to the nonlinearity of the steam generation process. Deleterious nonlinear effects are particularly evident at low generation rates and when relatively large changes in generation level are made. Regulating drum level and combustion stability are particularly problematic at low loads. Nonlinear behavior can be readily observed in two ways: by compa~ringlarge and small excursions from a given equilibrium point or by comparing small signal behavior at different equilibria corresponding to distinct load levels. In the following paragraphs, we will examine the steam generation process from the latter point of view. This is particularly useful when designing control systems based on small signal (linear) behavior. Power plant controllers are designed to provide good performance at or near rated conditions. At off-design conditions, e.g., as load is decreased. performance deteriorates because the plant behaves differently than at the design point. Typically, satisfactory performance can be achieved down to about 30% of rated generation. At some point performance becomes unsatisfactory, requiring manual intervention or a switch to a retuned or even restructured control system. 
At low loads the nonlinearities associated with the evaporation and combustion processes [8], in particular, are quite severe. Compounding this complexity, plant operation over a wide generating range may require configuration changes. For example, as generation is reduced, the number of burners and fuel pumps or pulverizers will be reduced (see, for example, [9]), and steam by-pass systems may be employed, particularly in large once-through fossil plants and nuclear plants [10]. The assumptions behind the model of evaporator dynamics developed in Section 79.1.3 are not suitable for capturing dynamics at low steam generation rates. In the following, the assumptions of Equations 79.3 and 79.4 are relaxed to allow more general variations of steam quality and fluid pressure through the evaporator. This model is then used to investigate the small signal dynamical behavior at various load levels. Control system
THE CONTROL HANDBOOK
design is considered in Section 79.1.5. Much of the discussion herein is based on [11], which emphasizes the nonlinear behavior of the circulation loop. Earlier discussions of level control system design include [12] and [13].
Circulation Loop Dynamics Revisited The system (see Figure 79.3) is comprised of a natural circulation loop whose main components are the drum which separates the steam and water, the downcomer piping which carries water from the drum to the bottom of the furnace, and the riser tubes which are exposed to the burning furnace gas and in which boiling takes place. A feedwater valve regulates the flow of water into the system and the throttle valve regulates the steam flow out of it. The heat absorbed by the fluid in the risers is a third input. The drum dynamical equations are considered first and then the remainder of the circulation loop.
where the following nomenclature has been adopted:

w_r, w_dc, w_s — mass flow rates of the riser, downcomer, and turbine steam, respectively,
v_df, v_dg — drum specific volume of liquid and gas, respectively,
P_d — drum pressure,
T_d — drum temperature,
V — total drum volume,
V_w — volume of water in the drum, and
x_d — net drum quality, x_d = V_w/V.
In addition to these differential equations we require the constitutive relations (coexistence curve)

    v_f = v_f(P)   and   v_g = v_g(P)    (79.51)

and the drum level equation.
Drum Dynamics  The dynamical equations for the drum can be formulated in a number of different ways depending on the variables chosen to characterize the thermodynamic state of the drum. Two thermodynamic variables must be selected from the four possible pairs: (T, v), (s, P), (s, v), (T, P), where T, P, v, and s denote temperature, pressure, specific volume, and specific entropy, respectively.² Because the fluid in the drum is in the saturated state, the pair (T, P) is not suitable, but any of the other choices is valid. As an example, we give the equations for the (s, P) formulation.

² By the standard convention, specific entropy is denoted by the symbol s, which is also used to denote the Laplace transform variable. This will not lead to confusion in the subsequent discussion.

The following assumptions are made:

1. Drum liquid and gas are in the saturated state.
2. Pressure is uniform throughout the drum.
3. All liquid resides at the bottom of the drum and all gas at the top.

Then the drum dynamics, derivable from mass and energy balance equations (see [9]), give dP_d/dt and dV_w/dt in terms of the net liquid inflow w_e + (1 - x_r)w_r - w_dc (where x_r is the riser exit quality), the volumes V - V_w and V_w, and the saturation derivatives ∂v_dg/∂P and ∂v_df/∂P.

Under the stated assumptions, the drum thermodynamic state, entropy and pressure (s, P), is equivalent to (x_d, P) or (V_w, P), because x_d = V_w/V and s is determined by x_d and P. Hence, we refer to the above equations as the (s, P) formulation even though s does not explicitly appear. The steam flow out of the drum to the turbine is governed by the relationship
where w_s0 and P_d0 are the throttle flow and drum pressure at rated conditions, respectively, and A_t denotes the normalized valve position, with rated conditions corresponding to A_t = 1.
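The coexistence-curve lookups of Equation 79.51 can be sketched by interpolating tabulated saturation properties. The numerical values below are approximate entries from standard saturated-water steam tables, and the throttle law is an assumed linear-in-pressure form (the text does not reproduce the actual valve relationship), so both the data and the function names here are illustrative:

```python
import numpy as np

# Coexistence-curve lookup, Equation 79.51: v_f(P) and v_g(P).
# Saturated-water values are approximate (standard steam tables);
# pressures in MPa, specific volumes in m^3/kg.
P_TAB  = np.array([5.0, 10.0, 15.0, 20.0])
VF_TAB = np.array([0.001286, 0.001452, 0.001658, 0.002036])
VG_TAB = np.array([0.039440, 0.018030, 0.010340, 0.005830])

def v_f(P):
    """Specific volume of saturated liquid at pressure P [MPa]."""
    return np.interp(P, P_TAB, VF_TAB)

def v_g(P):
    """Specific volume of saturated vapor at pressure P [MPa]."""
    return np.interp(P, P_TAB, VG_TAB)

def throttle_flow(A_t, P_d, w_s0=1.0, P_d0=17.0):
    """Assumed normalized valve law: w_s = A_t * w_s0 * (P_d / P_d0),
    with A_t = 1 and P_d = P_d0 giving the rated flow w_s0."""
    return A_t * w_s0 * (P_d / P_d0)
```

Linear interpolation is crude next to a full steam-property formulation, but it captures the monotone trends that matter here: v_f grows and v_g shrinks as drum pressure rises.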
Circulation Loop Dynamics  The main deficiency of the model in Section 79.1.3 was the simplified treatment of the circulating fluid flow and of the complex two-phase flow dynamics of the riser loop. Consequently, it does not adequately represent the nonminimum phase characteristics associated with the shrink-swell phenomenon at lower load levels. Another approach to riser modeling is based on discretizing the time-dependent, nonlinear partial differential equations of one-dimensional, two-phase flow using the method of collocation by splines, a form of finite element analysis. The circulation loop, composed of the downcomer and riser, is assumed to be characterized by homogeneous single- or two-phase flow. The conservation of mass, energy, and momentum leads to three first-order partial differential equations, which are then discretized using collocation by linear splines. We use a single element for the downcomer, containing fluid in a single phase (liquid), and N elements of equal length for the riser, which contains two-phase flow. In the latter case the fluid properties change significantly along the spatial coordinate. In the entropy-pressure (s, P) formulation, the downcomer equations are
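The flavor of this discretization can be sketched with a minimal method-of-lines example: one-dimensional mass conservation, d(rho)/dt + (1/A) dw/dx = 0, on a riser of N equal elements, with linear interpolation between nodes making dw/dx constant on each element. All numbers are illustrative, not plant data, and the single balance equation stands in for the full mass/energy/momentum set:

```python
import numpy as np

# Minimal method-of-lines sketch in the spirit of collocation by linear
# splines: mass conservation on a riser of N equal elements. Linear
# interpolation of w between nodes gives one density ODE per element.
N  = 4          # riser elements
L  = 20.0       # riser length [m]
A  = 0.1        # flow area [m^2]
dx = L / N      # element length

def drho_dt(w):
    """w: mass flow at nodes 0..N; returns d(rho)/dt on each element,
    from d(rho)/dt = -(1/A) * dw/dx with dw/dx constant per element."""
    return -(w[1:] - w[:-1]) / (A * dx)

# uniform flow gives a stationary density profile
print(drho_dt(np.full(N + 1, 25.0)))
```

The real scheme carries three coupled balances per node and evaluates fluid properties from the (s, P) constitutive relations, but the structure — PDEs reduced to one ODE per spatial element — is the same.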
79.1. CONTROL OF ELECTRIC POWER GENERATING PLANTS
Once again, we need constitutive relations to complete the model. These are required in the form

    v = v(s, P),   T = T(s, P)    (79.59a)

and

    γ_s = (∂v/∂s)_P   and   γ_P = (∂v/∂P)_s.    (79.59b)
The riser equations are analogous.

Reduction of Circulation Loop Equations  The circulation loop model described above contains fast dynamics irrelevant to the control problem. In general terms, the fast dynamics are associated with certain pressure-flow dynamics (hydraulic/acoustic oscillations) and the slow dynamics are associated with thermal (entropy) transients. Formally, we can approach the problem of identifying and approximating the fast dynamics using asymptotic analysis, because the fast dynamics are associated with the fact that the parameter γ_P is small. Our analysis is based on two assumptions:

1. Only the slowest mode of the fast pressure-flow dynamics is significant to the control problem; we shall refer to this as the hydraulic dynamics.
2. Spatial variations in flow and pressure along the circulation loop are negligible as far as the hydraulic dynamics are concerned.

The flow and pressure equations can be written
    a_i (dw_i/dt) = F_i(w_i, w_{i-1}, s_i, s_{i-1}, P_i, P_{i-1}),   a_i := L_i/A_i    (79.60a)

and

    b_i (dP_{i-1}/dt) = (w_i - w_{i-1}) + G_i(w_{i-1}, s_i, s_{i-1}, P_{i-1}),   b_i := A_i L_i / v_{i-1}².    (79.60b)
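The structure of these node equations can be sketched as follows. The source leaves F_i and G_i in general form, so the pressure-drop-minus-friction and zero stand-ins below are hypothetical placeholders, present only so the loop over nodes can be exercised:

```python
import numpy as np

# Structural sketch of the node dynamics, Equations 79.60a,b.
def F(w_i, w_im1, s_i, s_im1, P_i, P_im1):
    """Momentum residual at node i (placeholder: pressure drop vs. friction)."""
    return (P_im1 - P_i) - 0.01 * w_i * abs(w_i)

def G(w_im1, s_i, s_im1, P_im1):
    """Compressibility source term at node i-1 (placeholder: none)."""
    return 0.0

def node_rates(w, P, s, a, b):
    """Evaluate Equations 79.60 over the loop:
    a_i dw_i/dt = F_i(...),  b_i dP_{i-1}/dt = (w_i - w_{i-1}) + G_i(...)."""
    dw = np.zeros_like(w)   # dw[0] is a boundary value, set elsewhere
    dP = np.zeros_like(P)   # final entry is the drum-interface boundary
    for i in range(1, len(w)):
        dw[i] = F(w[i], w[i-1], s[i], s[i-1], P[i], P[i-1]) / a[i]
        dP[i-1] = ((w[i] - w[i-1]) + G(w[i-1], s[i], s[i-1], P[i-1])) / b[i]
    return dw, dP
```

Note how the pressure rate at node i-1 is driven by the flow mismatch w_i - w_{i-1}: this is exactly the term that cancels when the equations are summed in the reduction step below.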
Now, we define the average circulation loop flow and pressure:

    w_av := (1/(N + 1)) Σ_i w_i

and

    P_av := Σ_i ρ_i P_i,   ρ_i := b_i / Σ_j b_j.    (79.62)
Let us also define, for i = 2, ..., N + 1, the functions

    f_i(w_av, s_i, s_{i-1}, P_av) := F_i(w_av, w_av, s_i, s_{i-1}, P_av, P_av)    (79.63)

and similarly g_i(w_av, s_i, s_{i-1}, P_av) := G_i(w_av, s_i, s_{i-1}, P_av). The following nomenclature is employed: N, number of riser sections; L_i, downcomer length and riser section length (total riser length/N); A_i, downcomer and riser cross-section areas; w_i, mass flow rate at the ith node; P_i, pressure at the ith node; T_i, temperature at the ith node; s_i, aggregate entropy at the ith node; and v_i, specific volume at the ith node.

We can state the key assumption:

Assumption 1: For w_i, s_i, and P_i, i = 0, ..., N + 1, the following approximations are valid in the slow time scale:

    f_i(w_av, s_i, s_{i-1}, P_av) ≈ F_i(w_i, w_{i-1}, s_i, s_{i-1}, P_i, P_{i-1})    (79.64)

and

    g_i(w_av, s_i, s_{i-1}, P_av) ≈ G_i(w_{i-1}, s_i, s_{i-1}, P_{i-1}).    (79.65)
It is easy to validate this assumption in the equilibrium state. Invoking these approximations and simply adding Equations 79.60 results in
Equilibrium values are computed by specifying the desired load, drum pressure, and drum level and then computing the required control inputs and the remaining state variables. A Taylor linearization at each equilibrium point yields a linear model of the perturbation dynamics. Thus, it is possible to determine the system poles and zeros and to examine how they change as a function of load. Figure 79.21 gives a sample of the results obtained from solving the equilibrium equations.
and Figure 79.22 is an eigenvalue plot showing how the plant dynamics vary with load. Table 79.2 summarizes a complete plant modal analysis at 100% load.
Assumption 2: The slow time scale flow and pressure distribution through the circulation loop is adequately approximated by equilibrium conditions of Equations 79.60 and the approximations of Assumption 1, i.e.,
These equations are important because they allow the computation of P_0 and w_s = w_{N+1}, which are necessary to establish the interface of the circulation loop with the drum. From Equations 79.60 and the definitions of w_av and P_av, we can derive the following relationships:
Figure 79.21  Typical equilibrium curves show the circulation ratio and average loop pressure (upper) and drum pressure (lower) as a function of load.
Figure 79.22  All but one of the eigenvalues of the circulation loop are illustrated. The missing eigenvalue is relatively far to the left, at approximately -8 to -20 (depending on load level). There is an eigenvalue at the origin for all load levels, as anticipated. The symbol (*) denotes 100% load.
The remaining interface equation is
Equilibria and Perturbation Dynamics  First, we consider the open-loop behavior. The procedure followed is:

1. Trim the system at load levels ranging from near 5% to 100%.
2. Compute the linear perturbation equations.
3. Analyze the pole-zero patterns as a function of load level.
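The trim-and-linearize procedure can be sketched with a toy model. The two-state surrogate below, its coefficients, and the function names are illustrative stand-ins for the eight-state circulation-loop model, chosen only so that the equilibrium and the poles shift with load:

```python
import numpy as np

# Steps 1-3 on a toy 2-state surrogate (coefficients illustrative).
def f(x, u):
    x1, x2 = x
    return np.array([-x1 * (1.0 + x2) + u,     # fast balance
                     -0.1 * x2 + 0.05 * u])    # slow (thermal) state

def jacobian(x, u, eps=1e-6):
    """A = df/dx at (x, u), by central differences."""
    n = len(x)
    A = np.zeros((n, n))
    for j in range(n):
        d = np.zeros(n); d[j] = eps
        A[:, j] = (f(x + d, u) - f(x - d, u)) / (2 * eps)
    return A

def trim(load, x0=(1.0, 1.0), iters=30):
    """Step 1: solve f(x, u) = 0 by Newton iteration, u set by the load."""
    u, x = load, np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - np.linalg.solve(jacobian(x, u), f(x, u))
    return x, u

# Steps 2-3: linearize at each trim point and inspect the poles.
for load in (0.3, 0.6, 1.0):
    x_eq, u = trim(load)
    poles = np.linalg.eigvals(jacobian(x_eq, u))
    print(f"load {load:.0%}: poles {np.sort(poles.real)}")
```

Even in this caricature, the fast pole moves with load while the slow pole does not, which is the kind of load-dependent pole migration the subsequent figures document for the real plant.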
Zero Dynamics  The linearized plant transmission zeros were also calculated as a function of load level using various combinations of inputs and outputs. These results are summarized in Figures 79.23 and 79.24. These figures, which characterize the relation between feedwater flow and drum level, and between heat rate and drum level, respectively, show the nonminimum phase behavior typical of such systems, with zeros in the right half-plane.

TABLE 79.1  Summary of Dynamical Equations (u_1 = q, u_2 = w_e, u_3 = A_t). The eight state equations, for the state (w_av, s_1, s_2, s_3, s_4, P_av, P_d, V_w), are affine in the inputs,

    dx/dt = f(x) + g_1(x)u_1 + g_2(x)u_2 + g_3(x)u_3,

with the heat input u_1 entering the entropy and pressure equations, the feedwater flow u_2 entering the s_1, P_d, and V_w equations, and the valve position u_3 entering the P_d and V_w equations through the terms g_72(P_d, V_w)u_2, g_73(P_d, V_w)u_3, g_82(P_d, V_w)u_2, and g_83(P_d, V_w)u_3. The outputs are y_1 = P_d, y_2 = y_D(V_w) (drum level), and y_3 = w_s = h_3(P_d).

TABLE 79.2  Eigenvalues and Eigenvectors at 100% Load. Eigenvector components are listed in the order (w_av, s_1, s_2, s_3, s_4, P_av, P_d, V_w).

Mode 1 (drum pressure-circulation flow rebalance): eigenvalue -7.7758; eigenvector (0.5288, -0.0066, 0.0005, 0.0001, 0.0001, -0.0323, 0.7644, 0.3675).

Modes 2 & 3 (drum-riser mass balance oscillation): eigenvalues -0.7817 ± 1.8351i; eigenvector (0.6866 ± 0.5787i, -0.0052 ± 0.0063i, 0.0025 ± 0.0148i, 0.0186 ± 0.0222i, -0.0933 ± 0.0099i, 0.0451 ± 0.0178i, 0.0561 ± 0.0283i, -0.3925 ± 0.1526i).

Modes 4 & 5 (riser flow-density oscillation): eigenvalues -0.3854 ± 0.3145i; eigenvector (-0.9500 ± 0.1684i, 0.0034 ± 0.0046i, 0.0097 ± 0.0058i, 0.0029 ± 0.0207i, -0.0178 ± 0.0280i, -0.0107 ± 0.0270i, -0.0321 ± 0.0170i, 0.0879 ± 0.2397i).

Mode 6 (circulation flow-drum level rebalance): eigenvalue -0.3023; eigenvector (0.9692, 0.0045, 0.0058, 0.0071, 0.0087, -0.0267, 0.0187, 0.2436).

Mode 7 (energy-coupled drum level): eigenvalue -0.0274; eigenvector (0.4173, -0.0193, -0.0212, -0.0230, -0.0247, -0.1285, -0.1117, 0.8916).

Mode 8 (drum level mass balance): eigenvalue 0.0000; eigenvector (0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000).
Figure 79.23  The zeros of the transfer function w_e → level show the expected nonminimum phase characteristic of drum level dynamics.

Figure 79.24  Zeros of the transfer function q → level. The nonminimum phase characteristic which produces the shrink/swell phenomenon is clearly evident.
Drum Level Control at High and Low Loads  The control configurations of interest in this section are shown in Figure 79.18, which illustrates typical one-, two-, and three-element feedwater regulators with proportional-integral-derivative (PID) compensation applied to the water level error
signal and "feedforward" of the steam flow. This is the control structure most commonly used for feedwater regulation in drum-type power plants. The unfortunate terminology "feedforward" arises from the view that steam flow is a disturbance vis-à-vis the drum level control loop, in which case such terminology would be appropriate. From the point of view here, however, because throttle valve position is the disturbance and steam flow is an output, we will refer to steam flow "feedback." This seemingly innocuous distinction is critical to understanding low load
performance. If steam flow feedback is omitted, the result is a single-element controller. With the configuration specified, the control system design problem reduces to selecting PID parameters for the compensator. It would be desirable to identify one set of parameters providing acceptable performance at all load levels, but it is well known that this is not possible. Open-loop analysis shows that the dynamics vary dramatically over the load range; in particular, the open-loop zeros, which limit achievable performance, are particularly poorly located at low loads. Thus, it is common practice to use two different sets of parameter values, one for high loads and the other for low loads. Even so, it will be seen that the achievable low load performance is not very good and that feedback of steam flow degrades performance at the very low end.
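The two-parameter-set practice can be sketched as a simple schedule. The gain values, the 30% switching threshold, and the hysteresis band below are all hypothetical; the only point illustrated is that the controller holds one tuning per load regime and switches with hysteresis so it does not chatter near the boundary:

```python
# Hypothetical high- and low-load PID parameter sets (illustrative values).
HIGH_LOAD = {"kp": 2.0, "ki": 0.02, "kd": 20.0}   # zeros near the origin
LOW_LOAD  = {"kp": 0.5, "ki": 0.002, "kd": 5.0}   # zeros moved right, lower gain

class ScheduledPID:
    """Select a PID parameter set from the load level, with hysteresis."""
    def __init__(self, switch=0.30, band=0.02):
        self.switch, self.band = switch, band
        self.params = HIGH_LOAD

    def update_schedule(self, load):
        # switch down below (switch - band), back up above (switch + band)
        if self.params is HIGH_LOAD and load < self.switch - self.band:
            self.params = LOW_LOAD
        elif self.params is LOW_LOAD and load > self.switch + self.band:
            self.params = HIGH_LOAD
        return self.params

sched = ScheduledPID()
print(sched.update_schedule(0.50) is HIGH_LOAD)  # high load: high-load set
print(sched.update_schedule(0.25) is LOW_LOAD)   # dropped below 28%: switch
print(sched.update_schedule(0.29) is LOW_LOAD)   # inside band: no switch back
```

In practice the switch is accompanied by bumpless-transfer logic on the integrator state, which is omitted here.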
Compensator Design Performance Limitations  There is a basic performance trade-off associated with the PID compensator for feedwater regulation. The regulator introduces a (second) pole at the origin and allows for placement of two zeros. The three design parameters may be viewed as these two zero locations and the loop gain. There are four dominant modes: the three modes 6, 7, and 8 described in Table 79.2 and the mode introduced by the compensator, corresponding to the new pole at the origin. The latter will be referred to as the "drum level trim" mode. This terminology is consistent with the intent of compensator integral action, to eliminate steady-state errors from the drum level. One classical design approach is to fix the compensator zero locations and to examine the root locus with respect to loop gain. Figure 79.25 through Figure 79.27 illustrate a root locus plot at 100% load. One of the compensator zeros is placed on the real axis close to the origin because the undesirable, destabilizing phase lag introduced by the integral action is compensated for by the neighboring zero. Of course this traps a pole near the origin. This pole corresponds to the trim mode, and the result is a very slow trimming of the steady-state drum level error. The placement of the second zero is much more critical. Here it is placed on the real axis between the poles corresponding to modes 7 and 8. The logic is to pull mode 8, the critical drum mass balance mode, from the origin to the left, the farther the better, which translates into high gain. As seen in the figures, this draws modes 6 and 7 together, coupled into an oscillatory mode which approaches instability as the loop gain is increased. These poles are attracted by the nonminimum phase zeros. This represents one basic trade-off. A gain must be chosen which provides acceptable speed of response for mode 8, while retaining reasonable damping for modes 6 and 7.
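The root-locus study can be sketched numerically. The plant below is a toy surrogate, not the linearized boiler model: a pole at the origin (drum mass balance), two slow real poles, and a right half-plane zero standing in for the shrink/swell nonminimum phase behavior; the compensator contributes K(s + z1)(s + z2)/s. All pole, zero, and gain values are illustrative:

```python
import numpy as np

# Toy open-loop transfer function (all numbers illustrative).
plant_num = np.array([-20.0, 1.0])         # 1 - 20s: RHP zero at s = +0.05
plant_den = np.poly([0.0, -0.03, -0.3])    # poles at 0, -0.03, -0.3
comp_num  = np.poly([-0.005, -0.05])       # two assignable compensator zeros
comp_den  = np.array([1.0, 0.0])           # compensator pole at the origin

open_num = np.polymul(plant_num, comp_num)
open_den = np.polymul(plant_den, comp_den)

def closed_loop_poles(K):
    """Roots of D(s) + K N(s): the root-locus points at gain K."""
    return np.roots(np.polyadd(open_den, K * open_num))

for K in (0.002, 0.01, 0.05):
    poles = closed_loop_poles(K)
    print(f"K={K}: max Re = {poles.real.max():+.4f}")
```

Sweeping K reproduces the trade-off described above: at low gain the closed loop is stable but slow, and as gain rises the poles are drawn toward the right half-plane zero and the loop destabilizes.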
Performance Under Load Variation It is important to examine the performance of this controller at lower load levels. In terms of pole location, performance degrades as load is reduced:
Figure 79.25  This figure provides an overview of the root locus at 100% load. The designations are: open-loop poles (x), open-loop zeros (o), design poles (*). All but one (the leftmost) pole is shown. There are also two additional zeros, both real and to the left. Notice the two right half-plane zeros. Two of the four dominant poles (near the origin) are not clearly visible on this scale.

Figure 79.26  A closer view of the dominant poles is provided by this figure. Notice that the selected value of the gain provides closed-loop poles for the disturbance-coupled drum level mode which are slightly underdamped. Further increases in gain reduce the damping, and eventually these poles move into the right half-plane, attracted by the right half-plane zeros. Although some additional gain would be acceptable (in fact, desirable) at this load level, it has adverse effects at lower loads, as will be seen.
At 60% load the disturbance-coupled drum dynamics have significantly degraded. Higher gain is called for but, once again, lower load dynamics would suffer. At 50% load the disturbance-coupled dynamics have further degraded, although the drum level mode response time is somewhat improved. This is a critical load which marks a transition in the qualitative closed-loop behavior. As load is reduced, the open-loop poles associated with modes 6 and 7 move to the right along the real axis. At 50% load the mode 7 pole is precisely located at one of the compensator zeros. Because of this, mode 7 is effectively unregulated and remains fixed as gain is varied. In modern terminology, this is an input decoupling zero. It is possible, of course, to place the compensator zero more to the right in order to lower the load level (50%) at which the transition takes place. This strategy reduces the achievable response time of mode 8. Consequently, this is a second basic trade-off which must be addressed. As load is further reduced to 40%, mode 7 couples with mode 8, rather than mode 6, to form a slow, but acceptably damped, oscillatory mode. The damping, however, diminishes as load is further reduced, because the (open-loop) mode 7 pole migrates to the right and also because one of the right half-plane zeros approaches the origin from the right. There is a distinct change in root locus behavior in the transition from 60%-50%-40% load levels. Figure 79.28 illustrates the dominant poles at 40% load. At 30% load there is a fairly dramatic degradation of performance. Although the system remains stable, the dominant dynamic is a slow oscillation (a period of over 600 sec.) which is very lightly damped. A reduction in gain would improve performance, although it would be marginal at best. At 20% load the closed-loop system is unstable. A reduction in gain is required to stabilize it. Performance can be optimized by adjusting all three controller parameters, but even the best achievable performance is not very good. Moreover, even stable performance is attainable only with substantial degradation of high load performance.

Figure 79.27  This figure provides an even closer look at the poles associated with the drum level mode and the level trim mode. These are by far the slowest modes. At the design condition, the drum level mode has a time constant of about 200 sec. This response time could be improved by increasing the gain, but at the expense of lower-load stability. The trim mode is considerably slower.
Low Load Regulator  Eventually, it is necessary to face the dilemma that stable low load operation is not achievable with the conventional control configuration tuned to provide reasonably good performance at higher loads. Thus, a new set of PID parameters is needed for low load operation. This is accomplished by drastically moving both compensator zeros to the right and choosing a new loop gain by examining the root locus. Figure 79.29 shows the root locus for a low load design at 20% load. It is important to note that the critical mode 8 has a closed-loop time constant 4 to 5 times slower than obtainable at high load. Here again a transition for this regulator, of exactly the same type as described earlier at 50% load, takes place between 10% and 5% load.

Figure 79.28  This figure provides a closer look at the dominant poles at 40% load. The damping ratio is already somewhat less than ideal, so that increased gain would not be desirable.
Figure 79.29  The low load regulator root locus at 20% load shows that the system is stable but has much slower response of the dominant modes.
Figure 79.30  At 5% load a transition has taken place and, even though the system is still stable, the damping is substantially reduced.
Effect of Steam Flow Feedback  So far the effect of steam flow feedback has not been addressed. It is possible to examine the effect of the steam flow feedback loop on the otherwise open-loop plant. Computation shows that the effect on most of the eigenvalues is negligible. Only mode 7 is significantly influenced. At all load levels the steam flow feedback moves the eigenvalues to the right. Although the largest changes are associated with the high load levels, the consequences are important only at low load levels. It is necessary to redesign the compensator with the steam flow feedback in place. Figures 79.31 and 79.32 show the closed-loop pole locations for the redesigned high load controller with steam flow feedback for load levels of 100% and 40%. Once again, 50% load is a critical transition point. At 30% load, stability margins are poor and probably not acceptable. The system is unstable at 20% load. Figures 79.33 and 79.34 show the eigenvalue locations when the low load controller defined above is employed with steam flow feedback. By comparison with Figures 79.29 and 79.30, steam flow feedback marginally degrades stability at 20% (and also at 10%, not shown) load, but the system is unstable at 5%. The choice not to use steam flow feedback with the low load controller is made to retain stability at very low loads.
Figure 79.33  A three-element controller retuned for low load operation provides reasonable stability at 20% load, which is only marginally inferior to the single-element control shown above in Figure 79.29.
Figure 79.34  At 5% load the three-element controller results in an unstable system. The essential point is that at some load level the three-element controller will lead to an unstable response unless the controller is again retuned. Even retuning for stability, however, is not likely to result in acceptable performance.
Figure 79.31  Root locus for the redesigned regulator using steam flow feedback (three-element control) at 100% load.
Figure 79.32  Root locus using three-element control at 40% load. Stability is somewhat degraded.
Disturbance Response  It is useful to assess the relative importance of each of the closed-loop dynamical modes in terms of their appearance in the drum level response to disturbances,
either a heat rate disturbance or a throttle valve disturbance, or to a drum level set point change. To do this, we look at the residues associated with each closed-loop pole when the system is subjected to a disturbance input or commanded set point change. Tables 79.3 and 79.4 summarize the closed-loop eigenvalues at high and low loads, respectively. Table 79.3 corresponds to the high load controller and Table 79.4 to the low load controller. Table 79.3 contains data corresponding to the three-element high load controller, whereas Table 79.4 is generated using the low load single-element controller. Thus, the 30% columns in the two tables can be compared for the effect of the change in controllers on the closed-loop eigenvalues. Notice that there are 11 closed-loop eigenvalues or poles. The open-loop modes 6, 7, and 8 correspond to closed-loop poles 8, 9, and 10. Closed-loop pole 11 is the slow drum level trim mode introduced by the controller integral action. In the ramp disturbance case, there is a 12th pole at the origin which contributes a constant offset in the steady-state drum level, because the controller is only of type 1. The drum level response is a weighted sum of the time responses corresponding to the simple poles, and the coefficients are the transfer function residues at the respective poles. Consequently, the relative significance of each mode in the drum level response can be identified by comparing the residue magnitudes. These computations (see Figures 79.36 and
TABLE 79.3  High Load Closed-Loop Eigenvalues

TABLE 79.4  Low Load Closed-Loop Eigenvalues

Pole      30%                    20%                    10%                    5%
 1    -6.4923                -7.2444                -8.8627                -12.3819
 2    -0.7237 - 0.0570i      -0.7765                -0.7189                -0.6922
 3    -0.7237 + 0.0570i      -0.5246                -0.5187                -0.5185
 4    -0.2710 - 0.3133i      -0.2269 - 0.2594i      -0.1247 - 0.2111i      -0.1734 - 0.0660i
 5    -0.2710 + 0.3133i      -0.2269 + 0.2594i      -0.1247 + 0.2111i      -0.1734 + 0.0660i
 6    -0.3300 - 0.0458i      -0.2970 - 0.0372i      -0.2612                -0.0625 - 0.1397i
 7    -0.3300 + 0.0458i      -0.2970 + 0.0372i      -0.2058                -0.0625 + 0.1397i
 8    -0.0130 - 0.0224i      -0.0088 - 0.0182i      -0.0045 - 0.0104i      -0.0009 - 0.0036i
 9    -0.0130 + 0.0224i      -0.0088 + 0.0182i      -0.0045 + 0.0104i      -0.0009 + 0.0036i
10    -0.0008                -0.0009                -0.0012                -0.0039
11    -0.0001                -0.0001                -0.0001                -0.0001
Figure 79.37  Three-element controller response to a 5% step command changing load from 80% to 75%. (The panels show drum pressure, steam flow, drum level, and feedwater flow versus time.)
Figure 79.35  Drum level residues for the closed-loop system responding to a ramp firing rate disturbance at load levels from 30% (x) to 100% (o). Recall that the 12th pole is at the origin and is contributed by the ramp disturbance. The nonzero residue merely confirms that the type 1 controller leads to a constant drum level offset when subjected to a ramp input. Otherwise, the dominant modes in the response are the three open-loop modes 6, 7, and 8 and the trim mode. The trim mode clearly dominates at 100% load but, at lower loads, poles 9 and 10 are quite important also. These poles correspond to open-loop modes 7 and 8 which, through the feedback loop, have been coupled into a lightly damped oscillation.
Figure 79.38  Three-element controller response to a 10% step command changing load from 40% to 50%. (The panels show drum pressure, steam flow, drum level, and feedwater flow versus time.)
Figure 79.36  The low load controller is subjected to the same ramp firing rate disturbance. The residues are shown for the load range 5% (x) to 30% (o). Although there is a 12th pole at the origin, its residue is very small because the controller zeros are much closer to the origin and, as load is dropped, an open-loop plant zero moves toward the origin from the right. The trim mode residue is also extremely small for the same reason. The remaining dominant modes are the same as in the high load case (open-loop modes 6, 7, and 8). Notice that the residue values are considerably larger, however.
Figure 79.39  Single-element controller response to a 5% step command changing load from 15% to 20%. Below 30% load, a single-element controller is used. For this particular plant, the 15%-20% range is particularly troublesome.
79.37) confirm that the dominant dynamics are the four slowest modes identified above in Table 79.3: circulation flow-drum level rebalance (mode 6), energy-coupled drum level (mode 7), drum level mass balance (mode 8), and the drum level trim mode. Control system design must focus on the regulation of these modes.
Figure 79.40  Single-element controller response to a 2% step command changing load from 10% to 8%. At this very low load, larger step changes are not possible. (The panels show drum pressure, steam flow, drum level, and feedwater flow versus time.)
The effect of steam flow feedback can be seen in the 30% load results in Figures 79.35 and 79.36, which correspond to the three-element and single-element controllers, respectively. Consider pole number 10, which corresponds to the mass balance mode (open-loop mode 8). The 30% residue of this pole is the 'x' in Figure 79.35 and the 'o' in Figure 79.36. There is an order of magnitude improvement with the three-element controller in this important mode.
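The residue comparison above rests on a standard computation: for a strictly proper transfer function N(s)/D(s) with simple poles p_k, the residue at p_k is N(p_k)/D'(p_k), and the response is the sum of r_k e^{p_k t}. The transfer function below is illustrative only, not the plant's:

```python
import numpy as np

# Residues at simple poles: r_k = N(p_k) / D'(p_k).
num = np.array([1.0, 0.5])              # N(s) = s + 0.5 (illustrative)
den = np.poly([-0.01, -0.1, -1.0])      # D(s) with simple poles

poles = np.roots(den)
dden  = np.polyder(den)
residues = np.polyval(num, poles) / np.polyval(dden, poles)

for p, r in zip(poles, residues):
    print("pole", np.round(p, 4), "residue", np.round(r, 4))
```

Comparing |r_k| across poles ranks the modes by their contribution to the output, which is exactly how the dominant drum level modes were identified in Figures 79.35 and 79.36. As a sanity check, a relative degree of two or more forces the residues to sum to zero.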
Simulation Responses  Another perspective is obtained from the closed-loop time responses. The following figures correspond to a feedback regulator structure that employs a conventional three-element feedwater controller above 30% load and a single-element controller below 30% load. These controllers cannot tolerate load level step changes of more than about 5%-10%, and somewhat smaller step changes at very low loads. The following figures present a sequence of simulation results for step load change commands spanning the range from high load to very low load. The three-element controller used at higher loads responds much better to load disturbances than the single-element controller because the residues associated with the dominant closed-loop modes are much smaller. However, the destabilizing effect of steam flow feedback (disturbance feedforward) precludes its use at low loads.
References

[1] Steam, Its Generation and Use, 39th Ed., Babcock & Wilcox, New York.
[2] Bolis, V., Maffezzoni, C., and Ferrarini, L., Synthesis of the overall boiler-turbine control system by single loop auto-tuning technique, Proc. 12th World IFAC Congress, 3, 409-414, 1993 (also to appear in Contr. Eng. Practice, 1995).
[3] Maffezzoni, C., Issues in modeling and simulation of power plants, Proc. IFAC Symposium on Control of Power Plants and Power Systems, 1, 19-27, 1992.
[4] Groppelli, P., Maini, M., Pedrini, G., and Radice, A., On plant testing of control systems by a real-time simulator, 2nd Annual ISA/EPRI Joint Control and Instrumentation Conf., Kansas City, 1992.
[5] Colombo, F., De Marco, A., Ferrari, E., and Magnani, G., Considerations upon the representation of turbine and boiler in the dynamic response of fossil-fired electrical units, Proc. Cigré-IFAC Symposium on Control Applications for Power System Security, Florence, Italy, 1983.
[6] Fuetterer, B., Lausterer, G.K., and Leibbrandt, S.R., Improved unit dynamic response using condensate stoppage, Proc. IFAC Symposium on Control of Power Plants and Power Systems, 1, 129-146, 1992.
[7] Kalnitsky, K.C. and Kwatny, H.G., First principle model for steam turbine control analysis, J. Dyn. Syst., Meas. Contr., 103, 61-68, 1981.
[8] Kwatny, H.G. and Bauerle, J., Simulation analysis of the stability of coal fired furnaces at low load, Proc. 2nd IFAC Workshop on Power Plant Dynamics and Control, Philadelphia, 1986.
[9] Kwatny, H.G., McDonald, J.P., and Spare, J.H., Nonlinear model for reheat boiler turbine-generator systems, Part 1 - General description and evaluation, Part 2 - Development, 12th Joint Automat. Contr. Conf., 1971.
[10] Kwatny, H.G. and Fink, L.H., Acoustics, stability and compensation in boiling water reactor pressure control systems, IEEE Trans. Automat. Contr., AC-20, 727-739, 1975.
[11] Kwatny, H.G. and Berg, J., Drum level regulation at all loads: A study of system dynamics and conventional control structures, Proc. 12th IFAC World Congress, Sydney, 1993.
[12] Schulz, R., The drum water level in the multivariable control of a steam generator, IEEE Trans. Ind. Electron. Contr. Instrum., IECI-20, 164-169, 1973.
[13] Nahavandi, A.N. and Batenburg, A., Steam generator water level control, Trans. ASME J. Basic Eng., 343-354, 1966.
[14] Chien, K.L., Ergin, E.I., Ling, C., and Lee, A., Dynamic analysis of a boiler, Trans. ASME, 80, 1809-1819, 1958.
[15] Profos, P., Die Regelung von Dampfanlagen (The Control of Thermal Power Plants), Springer, Berlin, 1962 (in German).
[16] Dolezal, R. and Varcop, L., Process Dynamics, Elsevier, Amsterdam, 1970.
[17] Klefenz, G., La régulation dans les centrales thermiques (The Control of Thermal Power Plants), Editions Eyrolles, Paris, 1974 (in French).
[18] Friedly, J.C., Dynamic Behavior of Processes, Prentice Hall, Englewood Cliffs, NJ, 1972.
[19] Kecman, V., State-Space Models of Lumped and Distributed Systems, Springer, New York, 1988.
[20] Dukelow, S.G., The Control of Boilers, ISA Press, 1986.
[21] Maffezzoni, C., Dinamica dei generatori di vapore (The Dynamics of Steam Generators), Masson Ed., 1989 (also printed by Hartmann and Braun, 1988) (in Italian).
[22] Maffezzoni, C., Controllo dei generatori di vapore (The Control of Steam Generators), Masson Ed., 1990 (also printed by Hartmann and Braun, 1990) (in Italian).
[23] Quazza, G. and Ferrari, E., Role of power station control in overall system operation, in Real-Time Control of Electric Power Systems, Handschin, E., Ed., Elsevier, Amsterdam, 1972.
[24] Nicholson, H., Dynamic optimization of a boiler, Proc. IEE, 111, 1478-1499, 1964.
[25] Anderson, J.H., Dynamic control of a power boiler, Proc. IEE, 116, 1257-1268, 1969.
[26] McDonald, J.P. and Kwatny, H.G., Design and analysis of boiler-turbine-generator controls using optimal linear regulator theory, IEEE Trans. Automat. Contr., AC-18, 202-209, 1973.
[27] Maffezzoni, C., Concepts, practice and trends in fossil fired power plant control, Proc. IFAC Symposium on Control of Power Plants and Power Systems, Pergamon Press, 1-9, 1986.
[28] Calvaer, A., Ed., Power Systems: Modeling and Control Applications, Proc. IFAC Symposium, Pergamon, 1988.
[29] Welfonder, E., Lausterer, G.K., and Weber, H., Eds., Control of Power Plants and Power Systems, Proc. IFAC Symposium, Pergamon, 1992.
[30] 10th World Cong. Automat. Contr., IFAC Proceedings, Pergamon, 1987.
[31] 11th World Cong. Automat. Contr., IFAC Proceedings, Pergamon, 1990.
[32] 12th World Cong. Automat. Contr., IFAC Proceedings, Pergamon, 1993.
[33] Uchida, M. and Nakamura, H., Optimal control of thermal power plants in Kyushu Electric Power Company, Proc. 1st IFAC Workshop on Modeling and Control of Electric Power Plants, Pergamon, 1984.
[34] Mann, J., Temperature control using state feedback in a fossil-fired power plant, Proc. IFAC Symposium on Control of Power Plants and Power Systems, Pergamon, 1992.
Further Reading
Power plant automatic control systems have evolved over many decades, as have the generating plants themselves and
the environment within which they operate. In the early years the main issues revolved around the control devices themselves, along with actuators and sensors. Indeed, the physical design and construction of the early pneumatic, mechanical, and analog electrical control components were quite remarkable and intricate. In contrast, the overall control architectures were very simple, and controller parameter values were always set in the field through carefully controlled tests carried out by engineers possessing keen knowledge of process analysis and feedback concepts combined with extensive operating experience. In this environment, most of the work done to improve the control systems of electric power plants was performed by the major suppliers of control equipment in close cooperation with the utility companies that operated them. Much of the expertise was recorded in numerous obscure corporate documents that generally did not find their way into the public domain. So, assembling a complete technical and historical record is very difficult, and the present work does not attempt that task. By the late 1950s, the maturation of the field of automatic control, both theory and devices, and the demands for improved generating plant performance through automation coalesced. A serious effort at model-based control system analysis and design for power generating plants began, and with it has emerged a sizable technical literature. Our goal is to provide a connection to it. While there are undoubtedly antecedents, much of the literature on boiler modeling, related to the special needs of control system design, traces back to the 1958 paper of Chien et al. [14], which deserves special mention. The open literature contains several books, with different points of view, that deal with boiler dynamics and control while providing a good balance between theory and practice.
Again, for its historical significance, we note the book by Profos [15]. Another important one (1970) is Dolezal and Varcop [16], where attention concentrates on boiler dynamics. The analysis is strongly based on mathematical modeling, with much analytical detail. It is very useful in explaining the origin of basic dynamical behavior and identifying the process parameters on which it depends. The book is a milestone in boiler dynamics, though perhaps not so "friendly" to read. A book (1974) by Klefenz [17], which is the French translation of the original German version, has a limited circulation. Most of the concepts illustrated there are still valid and applied in practice. An excellent general text for methods and examples of process dynamics, with many references, is the book by Friedly [18]. Another is by Kecman [19]. The book by Dukelow [20] is an excellent condensation of engineering skill and field experience. Without equations, boiler control concepts and schematics are plainly discussed and explained. It represents a good first reference for anyone approaching the subject without practical experience. Several books (1989-1990), published in Italian by
one of the authors, propose a systematic approach to (simplified) boiler dynamics [21] based on first principles as a prerequisite to understanding the fundamentals of boiler control [22], together with the most popular control concepts and their underlying motivations. The books cited above all have a "classical" approach to power plant control, following the approach typical of process control, i.e., structuring the control system in a number of hierarchically arranged loops and trying to minimize interactions among them. Nonclassical control strategies have been studied and proposed in the last quarter century with the application of "modern" control techniques, such as optimal control, adaptive control, robust control, and variable structure control. Much of the motivation for improved dynamic performance can be attributed to the plant's importance to overall power system operation [23]. Early investigations that applied modern control methods to power plants include [24]-[26]. The interested reader is referred to the survey paper [27] summarizing that research effort up to 1986 and to the Proceedings of subsequent dedicated events (primarily [28], [29] and the target area sessions within the IFAC World Congresses [30], [31], and [32]). There are many outstanding contributions among the many interesting papers in the literature on nonclassical control concepts applied to power plants; we want to mention a couple of cases where the new concepts have been introduced in continuously operating commercial units: the steam temperature control of once-through boilers based on optimal control theory (see [33] and related references) and an observer-based state feedback scheme applied to superheated steam temperature control [34].
To conclude this short survey of contributed work: testing and installing improved control systems in commercial power plants is very expensive, so it is often impossible for researchers to test their findings; yet it is very important to validate new concepts on a realistic test-bench (possibly the plant itself). Complex, large simulators are now available that allow credible testing [4].
79.2 Control of Power Transmission
John J. Paserba, Juan J. Sanchez-Gasca, and Einar V. Larsen, GE Power Systems Engineering, Schenectady, NY
79.2.1 Introduction
The power transmission network serves to deliver electrical energy from power plant generators to consumer loads. While the behavior of the power system is generally predictable by simulation, its nonlinear character and vast size lead to challenging demands on planning and operating engineers. The experience and intuition of these engineers are generally more important to successful operation of the system than elegant control designs.
In most developed countries, reliability of electrical supply is essential to the national economy; therefore, reliability plays a large role in all aspects of designing and operating the transmission grid. Two key aspects of reliability are adequacy and security [1]. Adequacy means the ability of the power system to meet the power transfer needs within component ratings and voltage limits. Security means being able to cope with sudden major system changes (contingencies) without uncontrolled loss of load. In this section, the focus will be mostly on the security aspect, emphasizing power system stability. There are two main categories of power system performance problems. One category, involving steady-state issues such as thermal and voltage operating limits, is not discussed in this section. The second broad category involves power system dynamics, dealing with transient, oscillatory, and voltage stability. The following three subsections provide an overview of these stability issues.
Transient Stability
Transient stability has traditionally been the performance issue most limiting on power systems, and thus the phenomenon is well understood by power system engineers. Transient stability refers to the ability of all machines in the system to maintain synchronism following a disturbance such as a transmission system fault, a generator or transmission line trip, or sudden loss of load. Unlike the other stability issues described here, transient stability is a relatively fast phenomenon, with loss of synchronism typically occurring within two seconds of a major disturbance. Traditional stability analysis software programs are designed to study and identify transient stability problems. These software programs provide the ability to model the electrical and mechanical dynamics of generators, excitation systems, turbine/governors, large motors, and other equipment such as static Var compensators (SVC) and high-voltage direct current (HVDC) systems. Scenarios studied typically involve major events such as faults applied to a transmission line with line trips. A transient instability is identified by examining the angles (or speeds) of all machines in the system. If the angle (or speed) of any machine does not return toward an acceptable steady-state value within one to three seconds following a power system disturbance, the machine is said to have lost synchronism with the system. Although not always modeled for transient stability analysis, this loss of synchronism will typically result in the generator being tripped off-line by a protective relay. Another way of looking at the transient stability issue is to consider voltage swings in the transmission network. For example, many utilities require that the voltage swing during a transient event remain above a predetermined value (e.g., 80% of nominal) because relays may trip generators or loads if the voltage is too low for too long.
If the voltage swing does not meet this criterion, the event causing the disturbance, while not unstable in the classical definition of transient stability, is still considered transiently unacceptable. In many systems, discrete, open-loop action is applied to insure stable operation in the power system following a major disturbance. These so-called "Special Protection Systems" or "Remedial-Action Schemes" can be as simple as undervoltage load shedding on selected loads or as complex as wide-area generator tripping and intentional network separation [2]. Such schemes have proven highly useful and will likely always be needed. Designing such schemes and keeping them updated as systems evolve is an ongoing challenge. To date, automation has not been applied to developing such schemes; experience and large-scale simulation tools are relied upon.
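The loss-of-synchronism test described above can be illustrated with a minimal simulation of the classical swing equation for a single machine against an infinite bus. This sketch is not from the handbook: the machine constants, the fault clearing times, and the bolted-fault assumption (electrical power drops to zero while the fault is on) are all illustrative.

```python
import math

def simulate_swing(t_clear, H=3.5, D=0.0, Pm=0.9, Emax=1.8, f0=60.0,
                   t_end=3.0, dt=1e-3):
    """Classical single-machine-infinite-bus swing equation.

    Pe = Emax*sin(delta) on the healthy network, Pe = 0 while the fault
    is on (a bolted three-phase fault at the machine terminals). All
    quantities are per unit on the machine base; numbers are illustrative.
    """
    ws = 2.0 * math.pi * f0               # synchronous speed, rad/s
    delta = math.asin(Pm / Emax)          # pre-fault equilibrium angle
    dw = 0.0                              # rotor speed deviation, rad/s
    t = 0.0
    delta_max = delta
    while t < t_end:
        Pe = 0.0 if t < t_clear else Emax * math.sin(delta)
        ddw = (ws / (2.0 * H)) * (Pm - Pe - D * dw / ws)
        dw += ddw * dt                    # semi-implicit Euler step
        delta += dw * dt
        delta_max = max(delta_max, delta)
        t += dt
    # loss of synchronism <=> the angle advances past the unstable
    # equilibrium region and never swings back
    stable = delta_max < math.pi
    return stable, delta_max

stable_fast, _ = simulate_swing(t_clear=0.10)   # fault cleared in 100 ms
stable_slow, _ = simulate_swing(t_clear=0.40)   # fault cleared in 400 ms
print(stable_fast, stable_slow)
```

Sweeping `t_clear` between these two values locates the critical clearing time for this (hypothetical) machine, which is exactly the quantity transient stability programs estimate for real multimachine systems.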
Voltage Stability
With the evolution of modern power systems, voltage stability has emerged as the limiting consideration in many systems. Voltage stability refers to the power system's ability to maintain acceptable voltages under normal operating conditions and after a major system disturbance. A system is said to be voltage unstable (or in voltage collapse) when an increase in load demand, a major disturbance, or a change in the system causes a progressive and uncontrollable decrease in voltage. Means of protecting and reinforcing power systems against voltage collapse depend mostly on the supply and control of reactive power in the system. The phenomenon of voltage instability (collapse) is dynamic, yet frequently evolves very slowly from the perspective of transient stability. A voltage collapse initiated by loss of infeed into a load center, for example, can progress over a one- to five-minute period. This type of phenomenon presents two challenges to the power system engineer. The first challenge is identifying and simulating the voltage collapse. The second is selecting the system reinforcements that correct the problem in the most economic manner. A recently emerging class of computer simulation software provides utility engineers with powerful new tools for analyzing long-term dynamic phenomena. The ability to perform long-term dynamic simulations, either with detailed dynamic modeling or simplified quasi-steady-state modeling, permits assessment of critical power system problems more accurately than with conventional power flow and stability programs. The voltage stability issue is discussed in great detail in [3].
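The progressive, uncontrollable voltage decrease described above can be illustrated with the textbook two-bus "PV curve": a source behind a reactance feeding a PQ load loses its high-voltage equilibrium once the load exceeds the nose of the curve. This is a standard textbook sketch, not a handbook example; the values of E, X, and the load are illustrative.

```python
import math

def load_bus_voltage(P, Q, E=1.0, X=0.5):
    """High-voltage solution of the two-bus power-flow equations.

    A source E behind reactance X feeds a PQ load. The load-bus voltage
    satisfies V**4 + (2*Q*X - E**2)*V**2 + X**2*(P**2 + Q**2) = 0.
    Returns None past the 'nose' of the PV curve, i.e., when no
    equilibrium exists (voltage collapse).
    """
    b = 2.0 * Q * X - E**2
    disc = b * b - 4.0 * X**2 * (P**2 + Q**2)
    if disc < 0.0:
        return None                      # no solution: collapse
    return math.sqrt((-b + math.sqrt(disc)) / 2.0)

# trace the upper branch of the PV curve for a unity-power-factor load
for P in (0.0, 0.5, 0.9, 1.0, 1.1):
    V = load_bus_voltage(P, Q=0.0)
    print(P, "collapse" if V is None else round(V, 3))
```

For this unity-power-factor case the nose sits at P = E**2/(2*X) = 1.0 per unit; the last load level in the sweep lies beyond it and has no voltage solution, which is the static signature of voltage collapse.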
Oscillatory Stability
Power systems contain electromechanical modes of oscillation due to synchronous machine rotor masses swinging relative to one another. A power system having several such machines will act like a set of masses (the rotating inertias of the machines) interconnected by a network of springs (the ac transmission), and will exhibit multiple modes of oscillation. These "power-swing modes" usually occur in the frequency range of 0.1 to 2 Hz. Particularly troublesome are the so-called inter-area oscillations, which usually occur in the frequency range of 0.1 to 1 Hz. The inter-area modes are usually associated with groups of machines swinging relative to other groups across a relatively weak transmission path. The higher-frequency modes (1 to 2 Hz) usually involve one or two machines swinging against the rest of the power system, or electrically close machines swinging against each other [4]. Because there is great incentive to minimize transmission losses in the power system, power-swing modes have very little inherent damping. What damping is present is usually due to steam or water flow against the turbine blades of generating units and to special conductors placed in the rotor surface of the generators, known as amortisseur (damper) windings. High power flows in the transmission system create conditions under which swing modes can be destabilized. The effect of these oscillations on the power system may be quite disruptive if they become too large or are underdamped. The actual power (watts) swings may themselves not be overly troublesome, but they can result in voltage oscillations in the power system that adversely affect the system's performance. Limitations may be imposed on the power transfer between areas to reduce the possibility of sustained or growing oscillations, or special controls may be added to damp these oscillations. Once lightly damped, sustained, or growing oscillations have been observed or predicted by time simulation for a specific system condition, the most common and effective way to investigate them in detail is with linear (small-signal), frequency-domain analysis. With the linear approach to studying power systems, the engineer can view the stability problem from a different perspective than time-domain simulations, which enhances the overall understanding of the system's performance. The basic elements and tools of linear analysis commonly applied to power systems include eigenvalues (poles, modes, or characteristic roots), root locus, transfer functions, Bode plots, Nyquist plots, eigenvectors (mode shapes), participation factors, Prony processing, and others. While power systems are nonlinear in nature, analysis based on small-signal linearization around a specific operating point can be important in solving damping problems. When combined with practical engineering experience, time-domain simulations, and field measurements, linear analysis can assure reasonable stability margins under many operating conditions.
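As a small illustration of the eigenvalue tools mentioned above, the sketch below linearizes the swing dynamics of one machine against the rest of the system and reads the swing-mode frequency and damping ratio directly off the state-matrix eigenvalues. The constants (H, K1, D) are illustrative, not taken from the handbook.

```python
import numpy as np

# Linearized swing dynamics of one machine against the rest of the system:
#   d(d_delta)/dt = d_omega
#   d(d_omega)/dt = -(ws/(2H)) * (K1*d_delta + (D/ws)*d_omega)
# K1 is the synchronizing-torque coefficient and D the damping coefficient
# (both per unit; the numbers here are illustrative).
ws, H, K1, D = 2 * np.pi * 60, 3.5, 1.2, 2.0

A = np.array([[0.0, 1.0],
              [-ws * K1 / (2 * H), -D / (2 * H)]])

eig = np.linalg.eigvals(A)
lam = eig[np.argmax(eig.imag)]          # the mode with positive imaginary part
freq_hz = lam.imag / (2 * np.pi)        # swing-mode frequency
zeta = -lam.real / abs(lam)             # damping ratio
print(round(freq_hz, 2), round(zeta, 3))
```

With these numbers the mode lands near the middle of the 0.1 to 2 Hz power-swing range quoted in the text, with a damping ratio of a few percent, which is typical of the lightly damped modes that motivate supplemental damping controls.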
Considering that the majority of closed-loop supplemental controls in the power system are for damping oscillations, the remainder of this section on "Control of Power Transmission" will focus on the closed-loop controls which affect power system oscillatory stability.
79.2.2 Impact of Generator Excitation Control on the Transmission System
The main control function of the excitation system is to regulate the generator terminal voltage. This is accomplished by adjusting the field voltage in response to terminal voltage variations. A typical excitation system includes a voltage regulator, an exciter, protective circuits, limiters, and measurement transducers. The voltage regulator processes the voltage deviations from a desired set point and adjusts the required input signals to the exciter, which provides the dc voltage and current to the field windings, to take corrective action. Figure 79.41 represents a single generator with its excitation system, supplying power through a transmission line to an "infinite bus" (a power grid having significantly more capacity than the generator under study). Early excitation systems were manually controlled, i.e., an operator manually adjusted the excitation system current with a
rheostat to obtain a desired voltage. Research and development in the 1930s and 1940s showed that applying a continuously acting proportional control in the voltage regulator significantly increased the generator steady-state stability limits. Beginning in the late 1950s and early 1960s, most new generating units were equipped with continuously acting voltage regulators. As these units became a larger percentage of the generating capacity, it became apparent that the voltage regulator action could have a detrimental impact on the overall stability of the power system. Low-frequency oscillations often persisted for long periods of time and in some cases presented limitations to the system's power transfer capability. In the 1960s, power oscillations were observed following the interconnection of the Northwest and Southwest U.S. power grids. Power oscillations were also detected between the Saskatchewan, Manitoba, and Ontario systems in the Canadian power system. It was found that reducing the voltage regulator gain of excitation systems improved the system stability. Great effort has been directed toward understanding the effect of the excitation system on the dynamic performance of power systems. The single generator infinite bus system shown in Figure 79.41 provides a good vehicle to develop concepts related to the stability of a synchronous generator equipped with an excitation system. Figure 79.42 is the block diagram associated with the linear representation of the system shown in Figure 79.41. This diagram has become a standard tool to gain insight into the stability characteristics of the system [5].
The stability of the linearized system can be studied by considering the torques that comprise the accelerating torque Ta: the synchronizing torque Ts, which is in phase with the rotor angle deviations; the damping torque Td, which is in phase with the rotor speed deviations; and the torque Tex, due to the excitation system acting through the field winding of the machine. The mechanical torque, Tm, is assumed constant. In Figure 79.42 the phase relations are indicated for conditions of positive synchronizing torque, which tends to restore the rotor to steady-state conditions by accelerating or decelerating the rotor inertia. The damping torque is also positive and tends to damp rotor oscillations. Tex is shown as contributing positive synchronizing torque and negative damping torque. The primary synchronizing effect is exhibited in the term K1 of Figure 79.42, which represents the change in torque mostly in phase with angle. The damping parameter D represents turbine-generator friction, windage, and the impact of steam or water flow through the turbine. Typically, the torque-angle loop is stable due to the inherent damping and restoring forces in the power system. However, unstable oscillations can result from the introduction of negative damping torques added through the excitation system loop [5], [6]. Phase lags are introduced by the voltage regulator and the generator field dynamics so that the resulting torque is out of phase with both rotor angle and speed deviations. The phase characteristics of this path depend on the generator and voltage regulator characteristics, and on the parameters K4, K5, and K6. K5 plays a dominant role in the phase relationships with respect to the damping torque. For weak systems and heavy loads, K5 is typically negative. Coupled with the phase lags due to the
voltage regulator and generator field winding, the path via K5 can produce a torque Tex which contributes positive synchronizing torque and negative damping torque. This negative damping component can cancel the small inherent positive damping due to the damping torque and lead to an unstable system. A detailed description of these parameters is given in [5].
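The decomposition into synchronizing and damping torque can be made concrete in a few lines of code: for a sinusoidal rotor oscillation, the speed deviation leads the angle deviation by 90 degrees, so a torque phasor expressed relative to the angle deviation splits into a real (synchronizing) part and an imaginary (damping) part. The magnitude and phase used below are illustrative, not handbook data.

```python
import cmath
import math

def torque_components(T_mag, phase_deg):
    """Split a sinusoidal electrical-torque deviation into synchronizing
    and damping parts.

    phase_deg is the angle of the torque phasor measured from the rotor
    angle deviation. Because the speed deviation leads the angle deviation
    by 90 degrees, the real-axis component is synchronizing torque and the
    imaginary-axis component is damping torque.
    """
    T = cmath.rect(T_mag, math.radians(phase_deg))
    return T.real, T.imag   # (synchronizing, damping)

# A torque produced through the exciter path that lags the angle deviation
# (illustrative numbers) yields positive synchronizing but NEGATIVE damping
# torque, exactly the destabilizing situation described for Tex above.
Ts, Td = torque_components(0.2, -30.0)
print(round(Ts, 3), round(Td, 3))
```
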
Figure 79.41 Single generator and excitation system.
Figure 79.42 Phase relationships for simplified model of single machine to infinite bus.
79.2.3 Power System Stabilizer (PSS) Modulation on Generator Excitation Systems
Because a major source of negative damping has been introduced to the system by the application of high-response excitation systems, an effective way to increase damping is to modify the action of these excitation systems. A Power System Stabilizer (PSS) is often used to provide a supplementary signal to the excitation system. The basic function of the PSS is to extend stability limits by modulating generator excitation to provide positive damping torque to power swing modes.
To provide damping, a PSS must produce a component of electrical torque on the rotor in phase with speed deviations. The implementation details differ depending upon the stabilizer input signal employed. PSS input signals which have been used include generator speed, frequency, and power [6]. However, for any input signal, the transfer function of the PSS must compensate for the gain and phase characteristics of the excitation system, the generator, and the power system. These collectively determine the transfer function from the stabilizer output to the component of electrical torque which can be modulated via excitation control. Figure 79.43 is an extension of Figure 79.42 to include a speed-input PSS (PSS(s)). The transfer function G(s) represents the characteristics of the generator, the excitation system, and the power system. A PSS utilizing shaft speed as an input must compensate for the lags in G(s) to produce a component of torque in phase with speed changes so as to increase the damping of the rotor oscillations. An ideal PSS characteristic would therefore be inversely proportional to G(s). Such a stabilizer would be impractical because perfect compensation for the lags of G(s) requires pure differentiation, with its associated high gain at high frequencies. A practical speed PSS must utilize lead/lag stages to compensate for phase lags in G(s) over the frequency range of interest. The gain must be attenuated at high frequencies to limit the impact of noise and minimize interaction with torsional modes of turbine-generator shaft vibration. Low-pass and possibly band-reject filters are required. A washout stage is included to prevent steady-state voltage offsets as system frequency changes. A typical transfer function of a practical PSS which meets the above criteria is given by

PSS(s) = Ks [Tw s / (1 + Tw s)] [(1 + T1 s) / (1 + T2 s)] [(1 + T3 s) / (1 + T4 s)] F(s)
where F(s) represents a filter designed to eliminate torsional frequencies. A PSS must be tuned to provide the desired system performance under the power system condition which requires stabilization, typically weak systems with heavy power transfer, while at the same time being robust in that undesirable interactions are avoided for all system conditions. To develop such a design, it is important to understand the path G(s) through which the PSS operates:
1. The phase characteristics of G(s) are nearly identical to the phase characteristics of the closed-loop voltage regulator.
2. The gain of G(s) increases with generator load.
3. The gain of G(s) increases as the ac system strength increases. This effect is amplified with high-gain voltage regulators.
4. The phase lag of G(s) increases as the ac system becomes stronger. This has the greatest influence with high-gain exciters, because the voltage regulator crossover frequency approaches the frequency of the swing modes.
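A quick way to check the phase-compensation behavior of the washout/lead-lag structure is to evaluate its frequency response over the swing-frequency range. The gain and time constants below are illustrative placeholders, not recommended settings, and the filter F(s) is omitted.

```python
import cmath
import math

def pss_response(f_hz, Ks=10.0, Tw=5.0, T1=0.25, T2=0.05, stages=2):
    """Frequency response of a speed-input PSS structure:
    gain * washout * (lead/lag)**stages. Time constants are illustrative."""
    s = 1j * 2.0 * math.pi * f_hz
    H = Ks * (Tw * s / (1 + Tw * s)) * ((1 + T1 * s) / (1 + T2 * s)) ** stages
    return abs(H), math.degrees(cmath.phase(H))

# sample the response across the 0.1 to 2 Hz power-swing range
for f in (0.1, 0.5, 1.0, 2.0):
    gain, phase = pss_response(f)
    print(f, round(gain, 1), round(phase, 1))
```

The lead/lag stages produce phase lead across the whole swing band (to offset the lag of G(s)), while the washout contributes extra lead only at the lowest frequencies and washes out any steady-state offset, consistent with the design goals listed above.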
G(s) has the highest gain and greatest phase lag under conditions of full load on the generator and the strongest transmission system (i.e., higher short-circuit strength). These conditions therefore represent the limiting case for achievable gain with a speed-input PSS and constitute the base condition for stabilizer design. Because the gain of the plant decreases as the system becomes weaker, the damping contribution for the strong system should be maximized to insure the best performance with a weakened system. In general, the highest compensation center frequency which provides adequate local mode damping will yield the greatest contribution to intertie modes of oscillation. The following are guidelines for setting the lead/lag stages to achieve adequate local mode damping with maximum contribution to intertie modes of oscillation. Two basic criteria in terms of phase compensation are
1. It is most important to maximize the bandwidth within which the phase lag remains less than 90°. This is true even though less than perfect phase compensation results at the local mode frequency.
2. The phase lag at the local mode frequency should be less than about 45°. This can be improved somewhat by decreasing the washout time constant, but too low a washout time constant will add phase lead, and an associated desynchronizing effect, to the intertie oscillations. In general, it is best to keep the washout time constant greater than one second.
The gain and frequency at which an instability occurs also provide an indication of appropriate lead/lag settings. The relationship of these parameters to performance is useful in root locus analysis and in field testing. The following observations hold:
3. The frequency at which an instability occurs is highest for the best lead/lag settings. This is related to maximizing the bandwidth within which the phase lag remains less than 90°.
4. The optimum gain for a particular lead/lag setting is consistently about one-third of the instability gain.
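Observation 4 (operate at about one-third of the instability gain) can be sketched numerically: bisect for the loop gain at which a closed-loop pole crosses into the right half-plane, then take one-third of it. The three-lag plant below is an illustrative stand-in for the path G(s), not a model from the handbook.

```python
import numpy as np

# Open loop: three first-order lags, L(s) = K / ((1 + 0.1s)^2 (1 + 0.2s)),
# an illustrative stand-in for the exciter/generator/network path.
den = np.polymul(np.polymul([0.1, 1.0], [0.1, 1.0]), [0.2, 1.0])

def is_stable(K):
    """Closed-loop characteristic polynomial den(s) + K = 0."""
    char = den.copy()
    char[-1] += K
    return np.max(np.roots(char).real) < 0.0

# bisect for the instability gain, then apply the one-third rule of thumb
lo, hi = 0.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if is_stable(mid) else (lo, mid)
K_inst = 0.5 * (lo + hi)
K_oper = K_inst / 3.0
print(round(K_inst, 2), round(K_oper, 2))
```

For this plant the Routh criterion puts the instability gain at exactly 9, so the bisection converges there and the rule of thumb yields an operating gain of 3. In field testing, the same idea is applied by raising the stabilizer gain until sustained oscillation appears and then backing off to one-third of that value.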
79.2.4 Practical Issues for Supplemental Damping Controls Applied to Power Transmission Equipment
In many power systems, equipment is installed in the transmission network to improve various performance issues such as transient, oscillatory, or voltage stability. Often this equipment is based on power electronics, which generally means that the device can be rapidly and continuously controlled. Examples of such equipment include high-voltage dc systems (HVDC), static Var compensators (SVC), and thyristor-controlled series compensation (TCSC). To improve damping in a power system, a supplemental damping controller can be applied to the primary regulator of a device. The supplemental control action should
modulate the output of a device, in turn affecting power transfer and adding damping to the power system swing modes. This section provides an overview of some practical issues affecting the ability of damping controls to improve power system dynamic performance [7]. Section 79.2.5 on "Examples" illustrates the application of these concepts [8].

Figure 79.43 Simplified model of single machine to infinite bus.

Siting
Siting plays an important role in the ability of a device to stabilize a swing mode. Many controllable power system devices are sited based on issues unrelated to stabilizing the network (e.g., HVDC transmission and generators). In other situations (e.g., SVC or TCSC), the equipment is installed primarily to help support the transmission system, and siting will be heavily influenced by its stabilizing potential. Device cost represents an important driving force in selecting a location. In general, there will be one location which makes optimum use of the controllability of a device. If the device is located at a different location, a larger-sized device would be needed to achieve a desired stabilization objective. In some cases, overall costs may be minimized with non-optimum locations of individual devices, because other considerations must also be taken into account, such as land price and availability, environmental regulations, etc. The inherent ability of a device to achieve a desired stabilization objective in a robust manner, while minimizing the risk of adverse interactions, is another consideration which can influence the siting decision. Most often, these other issues can be overcome by appropriate selection of input signals, signal filtering, and the control design discussed later in this section. This is not always possible, however, so these issues should be included in the decision-making process for choosing a site. For many applications, it will be desirable to apply the devices in a distributed manner. This approach helps to maintain a more uniform voltage profile across the network, during both steady-state operation and after transient events. Greater security may also be possible with distributed devices because the overall system can more likely tolerate the loss of one of the devices.

Objectives for Control Design and Operation
Several aspects of control design and operation must be satisfied during both transient and steady-state operation of the power system, before and after a major disturbance. These aspects suggest that controls applied to the power system should meet these requirements:
1. Survive the first few swings after a major system disturbance with some degree of safety. The safety factor is usually built into a Reliability Council's criteria (e.g., keeping voltages above some threshold during the swings).
2. Provide some minimum level of damping in the steady-state condition after a major disturbance (post-contingent operation) because, in addition to providing security for contingencies, some applications will require "ambient" damping to prevent spontaneous growth of oscillations in steady-state operation.
3. Minimize the potential for adverse side effects, which can be classified as follows: (a) interactions with high-frequency phenomena on the power system, such as turbine-generator torsional vibrations and resonances in the ac transmission network; (b) local instabilities within the bandwidth of the desired control action.
4. Be robust. This means that the control will meet its objectives for a wide range of operating conditions encountered in power system applications. The control should have minimum sensitivity to system operating conditions and component parameters because power systems operate over a wide range of operating conditions and there is often uncertainty in the simulation models used for evaluating performance. The control should also have minimum communication requirements.
5. Be highly dependable. This means that the control has a high probability of operating as expected when needed to help the power system. This suggests that the control should be testable in the field to ascertain that the device will act as expected in a contingency. The control response should be predictable. The security of system operations depends on knowing, with reasonable certainty, what the various control elements will do in a contingency.
Closed-Loop Control Design
Closed-loop control is utilized in many power system components. Voltage regulators are commonplace in generator excitation systems, capacitor and reactor banks, tap-changing transformers, and SVCs. Modulation controls to enhance power system stability have been applied extensively to generator exciters and to HVDC and SVC systems. A notable advantage of closed-loop control is that stabilization objectives can often be met with less equipment and less impact on the steady-state power flows than with open-loop controls. Typically, a closed-loop controller is always active. One benefit of such a continuing response to low-level motion in the system is that it is easy to test continuously for proper operation. Another benefit is that, once the system is designed for the worst-case contingency, the chance of a less severe contingency causing a system breakup is lower than if only open-loop controls are applied. Disadvantages of closed-loop control involve mainly the potential for adverse interactions. Another possible drawback is the need for small step sizes, or vernier control, in the equipment, which will have some impact on cost. If communication is needed, this could also be a problem. However, experience suggests that adequate performance should be attainable using only locally measurable signals. One of the most critical steps in control design is to select an appropriate input signal. The other issues are determining the input filtering and control algorithm to assure attainment of the stabilization objectives in a robust manner with minimal risk of adverse side effects. The following paragraphs discuss design approaches for closed-loop stability controls, so that the potential benefits of such control can be realized in the power system.
Input-Signal Selection
The choice of local signals as inputs to a stabilizing control function is based on several considerations:
1. The input signal must be sensitive to the swings on
the machines and lines. In other words, the swing modes must be "observable" in the input signal selected. This is mandatory for the controller to provide a stabilizing influence. 2. The input signal should have as little sensitivity as possible to other swing modes in the power system. For example, in a transmission line device, the control action will benefit only those modes which involve power swings on that line. Should the input signal respond to local swings within an area at one end of the line, then valuable control range would be wasted in responding to an oscillation over which the damping device has little or no control. 3. The input signal should have little or no sensitivity to its own output in the absence of power swings. Similarly, there should be as little sensitivity as possible to the action of other stabilizing controller outputs. This decoupling minimizes the potential for local instabilities within the controller bandwidth. These considerations have been applied to a number of modulation control designs, which have proven themselves in many actual applications. The application of PSS controls on generator excitation systems was the first such study. The study concluded that speed or power were best and that frequency of the generator substation voltage was an acceptable choice as well [6]. When applied to SVCs, the magnitude of line current flowing past the SVC was the best choice [9]. For torsional damping controllers
on HVDC systems, it was found that the frequency of a synthesized voltage close to the internal voltage of the nearby generator (calculated with locally measured voltages and currents) was best [10]. In the case of a series device in a transmission line, these considerations lead to the conclusion that the frequency of a synthesized remote voltage to "find" the center of an area involved in a swing mode is a good choice. Synthesizing voltages at either end of the line allows determining the frequency difference across the line at the device location, which can then be used to make the line behave like a damper across the inherent "spring" nature of the ac line. Synthesizing input signals is discussed further in the section on "Examples".
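Synthesizing a remote voltage from local measurements, as used for the input signals just described, amounts to adding a reactive drop to the locally measured phasor in each direction; the frequency difference is then the rate of change of the resulting angle difference. The sketch below illustrates the idea with illustrative reactance values (the actual installations tune these to the network):

```python
import cmath

def synthesize_remote(v_local, i_line, x_behind):
    """Synthesize voltage phasors 'behind' a reactance in each direction
    away from the device, from local voltage and current phasors.
    Sketch of the concept only; x_behind is an illustrative assumption."""
    v_a = v_local + 1j * x_behind * i_line   # toward one end of the line
    v_b = v_local - 1j * x_behind * i_line   # toward the other end
    return v_a, v_b

def angle_difference(v_a, v_b):
    """Angle across the synthesized voltages; differentiating this in
    time gives the frequency-difference input signal."""
    return cmath.phase(v_a) - cmath.phase(v_b)
```

With current flowing through the line, the synthesized phasors separate in angle; with zero current they coincide, so the signal is quiescent in steady state, as consideration 3 above requires.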
Input-Signal Filtering To prevent interactions with phenomena outside the desired control bandwidth, low-pass and high-pass filtering is used for the input signal. In certain applications, notch filtering is needed to prevent interactions with specific lightly damped resonances. This has been the case with SVCs interacting with ac network resonances and modulation controls interacting with generator torsional vibrations. On the low-frequency end, the high-pass filter must have enough attenuation to prevent excessive response during slow ramps of power or during the long-term settling following a loss of generation or load. This filtering must be considered while designing the overall control, as it will strongly affect performance and the potential for local instabilities within the control bandwidth. However, finalizing such filtering usually must wait until the design for performance is completed, after which the attenuation needed at specific frequencies can be determined. During the control design work, a reasonable approximation of these filters needs to be included. Experience suggests that a high-pass break near 0.05 Hz (three-second washout time constant) and a double low-pass break near 4 Hz (40-msec time constant), as shown in Figure 79.44, are suitable as a starting point. A control design which adequately stabilizes the power system with these settings for the input filtering will probably be adequate after the input filtering parameters are finalized.
Figure 79.44 Initial input-signal filtering.
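The suggested starting-point filtering can be written down and checked numerically. The washout and double low-pass forms below are the standard ones; the time constants are those quoted above (three-second washout, 40-msec low-pass), though the realization in Figure 79.44 may differ:

```python
import math

def input_filter(f_hz, t_wash=3.0, t_lp=0.040):
    """Frequency response of the initial input-signal filtering:
    a high-pass (washout) break near 0.05 Hz followed by a double
    low-pass break near 4 Hz, per the starting point in the text."""
    s = 2j * math.pi * f_hz
    washout = s * t_wash / (1 + s * t_wash)   # high-pass washout
    lowpass = 1 / (1 + s * t_lp) ** 2         # double low-pass
    return washout * lowpass
```

Evaluating the gain shows the filter passes the 0.2-2 Hz swing-mode range nearly unattenuated while rejecting slow power ramps and high-frequency phenomena.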
Control Algorithm Typically, the control algorithm for damping controllers leads to a transfer function which relates an input signal (or signals) to a device output. When the input is a speed or frequency type and the output affects the real power, the control algorithm should approach a proportional gain to provide a pure damping influence. This is the starting point for understanding how deviations in the control algorithm affect system performance. In general, the transfer function of the control and input-
signal filtering is most readily discussed in terms of its gain and phase relationship versus frequency. A phase shift of 0° in the transfer function means that the output is simply proportional to the input and, for discussion, is assumed to represent a pure damping effect on a lightly damped power swing mode. Phase lag in the transfer function (up to 90°) translates to a positive synchronizing effect, tending to increase the frequency of the swing mode when the control loop is closed. The damping effect will decrease with the sine of the phase lag. Beyond 90°, the damping effect will become negative. Conversely, phase lead is a desynchronizing influence and will decrease the frequency of the swing mode when the control loop is closed. Generally, the desynchronizing effect should be avoided, so the preferred transfer function is one which lags between 0° and 45° in the frequency range of the swing modes the control is designed to damp. After the shape of the transfer function meeting the desired control phase characteristics is designed, the gain of the transfer function is selected to obtain the desired level of damping. The gain should be high enough to insure full utilization of the device for the critical disturbances, but no greater, so that risks of adverse effects are minimized. Typically, the gain selection is done with root locus or Nyquist analysis. This handbook presents many other control design methods that can be utilized to design supplemental controls for the power transmission system.
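The phase discussion above can be made concrete with one common decomposition: at the swing frequency, the controller's action splits into a damping component (in phase with modal speed) and a synchronizing component (in quadrature with it). This is a sketch, not the handbook's equations; the cosine/sine split matches the stated endpoints (pure damping at 0° lag, zero damping at 90°, negative beyond):

```python
import math

def torque_components(gain, phase_lag_deg):
    """Decompose a damping controller's action at the swing frequency
    into damping and synchronizing components, assuming a speed-type
    input and real-power output where zero phase shift gives a pure
    damping torque (illustrative sketch)."""
    lag = math.radians(phase_lag_deg)
    damping = gain * math.cos(lag)        # zero at 90 deg lag, negative beyond
    synchronizing = gain * math.sin(lag)  # positive for lags between 0 and 180 deg
    return damping, synchronizing
```

This is why the preferred transfer function lags between 0° and 45°: most of the gain still buys damping, and the residual synchronizing component is benign.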
Performance Evaluation Good simulation tools are essential for applying damping controls to power transmission equipment for system stabilization. The controls must be designed and tested for robustness with these simulation tools. For many operating conditions, the only feasible means of testing the system is by simulation, so that confidence in the power system model is crucial. A typical large-scale power system model may contain up to 15,000 state variables or more. For design purposes, a reduced-order model of the power system is often desirable. If the size of the study system is excessive, the large number of system variations and parametric studies required becomes tedious and prohibitively resource limited for many linear analysis techniques in general use. Good understanding of the system performance can be obtained with a model of only the relevant dynamics for the problem under study. The key conditions which establish that controller performance is adequate and robust can be identified from the reduced-order model, and then tested with the full-scale model. Recent improvements have been made in deriving meaningful reduced-order equivalent power system models [11]. Field testing is also essential for applying supplemental controls to power systems. Testing needs to be performed with the controller open loop, comparing the measured response at its own input and the inputs of other planned controllers against the simulation models. Once these comparisons are acceptable, the system can be tested with the control loop closed. Again, the test results should correlate reasonably with the simulation program. Methods have been developed for performing testing of the overall power system to provide benchmarks for validating the full-system model. Testing can also be done on the simulation program to obtain the reduced-order models for the advanced
control design methods [12]. Methods have also been developed to improve the modeling of individual components. Adverse Side Effects Historically in the power industry, each major advance in improving system performance has had adverse side effects. For example, adding high-speed excitation systems more than 40 years ago caused the destabilization known as the "hunting" mode of the generators. The solution was power system stabilizers, but it took more than 10 years to learn to tune them properly, and there were some unpleasant surprises involving interactions with torsional vibrations on the turbine-generator shaft [6]. HVDC systems also interacted adversely with torsional vibrations (the so-called subsynchronous torsional interaction (SSTI) problem), especially when augmented with supplemental modulation controls to damp power swings. Similar SSTI phenomena exist with SVCs, although to a lesser degree than with HVDC. Detailed study methods have since been established which permit designing systems with confidence that these effects will not disturb normal operation. Protective relaying exists to cover unexpected contingencies [10], [13]. Another potentially adverse side effect is with SVC systems, which can interact unfavorably with network resonances. This has caused problems in the initial application of SVCs to transmission systems. Design methods now exist to deal with this phenomenon, and protective functions exist within SVC controls to prevent continuing exacerbation of an unstable condition [9]. As technologies continue to evolve, such as the current industry focus on Flexible AC Transmission Systems (FACTS), new opportunities arise for improving power system performance. FACTS devices introduce capabilities that may be an order of magnitude greater than existing equipment for stability improvement. Therefore there may be much more serious consequences if new devices fail to operate properly.
Robust, non-interacting controls for FACTS devices are critically important for stability of the power system [14].
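The field-testing and model-validation work described above often relies on ringdown analysis via Prony's method [12]. The following is a minimal single-mode, noise-free sketch of the underlying idea (a practical Prony fit handles many modes and noisy data by least squares); four samples of a damped sinusoid determine its linear-prediction coefficients exactly, from which the damping and frequency are recovered:

```python
import math

def fit_ringdown(y, dt):
    """Identify damping sigma and frequency omega of a single damped
    sinusoid y(t) = A*exp(-sigma*t)*cos(omega*t + phi) from four
    uniformly spaced samples, in the spirit of Prony analysis [12].
    Minimal sketch: one mode, noise-free data."""
    y0, y1, y2, y3 = y[:4]
    # Linear prediction: y[n] = a1*y[n-1] + a2*y[n-2]; solve the 2x2 system.
    det = y1 * y1 - y0 * y2
    a1 = (y1 * y2 - y0 * y3) / det
    a2 = (y1 * y3 - y2 * y2) / det
    # Characteristic roots are exp((-sigma +/- j*omega)*dt).
    r = math.sqrt(-a2)                 # = exp(-sigma*dt)
    sigma = -math.log(r) / dt
    omega = math.acos(a1 / (2 * r)) / dt
    return sigma, omega
```

Given a measured ringdown, the recovered sigma and omega provide exactly the reduced-order modal information the design process needs.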
79.2.5 Examples
Large power-electronic devices have been applied to power transmission grids for many years. One of the most common and well-established uses of large-scale power electronics is high-voltage dc transmission systems (HVDC). As noted above, current industry activity is focusing on new types of equipment which can affect ac transmission systems, using the acronym "FACTS". This acronym arises from the concept of "Flexible AC Transmission Systems," where power-electronic devices can improve flexibility beyond what is possible with conventional switched compensating equipment. Because these devices are thyristor-controlled, their operating point can be adjusted within a few milliseconds, and there is no mechanical wear associated with rapid cycling of the operating point. These characteristics are the primary features which made power-electronic devices attractive for stabilization functions [8]. In this section, three types of power-electronic systems are described and two examples are presented, emphasizing their
ability to add damping to power system modes of oscillation.
High-Voltage DC (HVDC) High-voltage dc (HVDC) transmission is used to interconnect asynchronous ac grids for power transfer over very long distances. The system includes line-commutated converters at both the sending and receiving ac terminals with a dc link between. In long-distance transmission schemes, this dc link is a transmission line or cable. In back-to-back schemes, both converters are in the same building and the dc link is simply a short length of busbar. HVDC provides a means to control real power transfer directly between ac networks, independent of ac phase angle. This feature is sometimes utilized to provide a modulating function to the controls for damping power-swing oscillations. Because the converters are line commutated, they draw a reactive power load from the ac networks. This reactive power is related to the real power and both ac and dc voltages, which tends to be a detriment to adding modulation controls. Considerable study has been directed at supplemental controls for HVDC systems. Some schemes have benefited from modulation [15].
Static Var Compensator (SVC) The SVC provides rapid control of an effective shunt susceptance on the transmission grid and is typically used to regulate voltage at a bus. This action is similar to conventional switched reactors or shunt capacitors which have been used for many years, except that with the SVC, the control can be accomplished quickly and continuously. The voltage-regulating function of an SVC is sometimes augmented by a supplemental control to modulate the voltage set point to add damping to power-swing modes. Generally, the best location for shunt compensation is the point where the voltage swings are greatest for the dominant swing mode. In a simple remote generation system, this is at the electrical midpoint between the internal voltage of the generator and that of the receiving system. In an interconnected system, the siting decision is complicated by the fact that there are multiple modes of oscillation. Knowing the general characteristics of the dominant swing modes (based on which generators swing relative to the others), siting decisions may be made using the same basic concept as for the simple remote generation system just described. In addition, properly selecting input signals can allow the damping benefits for a specific mode, while not risking instability of others. These important issues are described in detail in [9]. When an SVC is located near a load area, modulation may be ineffective or even detrimental to power-swing damping. The difficulty lies in the fact that the change in power of the generators in the load area is affected in two ways by a variation in voltage. One is due to the change in power flow across the transmission lines from the sending system, and the other is due to the change in local load. These effects counteract each other, and the leverage for controlling the swing mode is very small.
Worse than being small, however, is that the sign of the effect can change with operating point, so that a control designed for a beneficial
effect in some operating conditions may have a detrimental effect in others. This inherent characteristic implies that a robust damping control is difficult to achieve on an SVC sited in a load area for swing modes which involve that area. Note that there is usually value in simply stiffening the voltage in a load area, however, so applying an SVC with only voltage control may help stability. As an example of the effect of an SVC on damping power system oscillations, consider the three-area system in Figure 79.45. This system has two modes of oscillation, one near 0.4 Hz and the other near 0.9 Hz. The lower-frequency mode involves Area 1 swinging against Areas 2 and 3, while the higher-frequency mode involves Area 2 swinging against Area 3. With the SVC located midway along the major intertie (Bus 1 to Bus 5 is the intertie; Bus 4 is midway, as shown in Figure 79.45), the leverage of the SVC on damping the two modes (known as controllability) varies with intertie power transfer. The controllability is greatest at higher power transfer and is different for each of the modes, based on the participation of generators in the specific swing mode. The best input signal for the SVC power-swing damping control, based on work presented in [9], is line current magnitude. Figure 79.46 shows the input-signal filtering and controller configuration, and Figure 79.47 shows the corresponding transfer function. Figure 79.48 shows the power system response following a disturbance in Area 1, while Area 1 is exporting power. The thin curves are for the system with no damping controller on the SVC, and the thicker curves are the case with a damping control. The swings of Area 1 angle reflect primarily the 0.4-Hz mode, while the Area 2 angle contains some of both modes. Figure 79.49 shows the simulation results for a disturbance in Area 2 with Area 1 importing power. Here, Area 2 shows primarily the 0.9-Hz mode, and Area 1 shows primarily the 0.4-Hz mode.
In both cases, the simulation results show that significant damping is added to the system with a control added to modulate the SVC voltage set point, even for greatly varying power system operating conditions. These results suggest that the control design is robust.
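The thin-versus-thick behavior in Figures 79.48 and 79.49 can be mimicked on a toy model: a single lightly damped swing mode with a supplemental term acting on modal speed. This is a sketch under stated assumptions (one decoupled mode at 0.4 Hz with illustrative damping numbers; the three-area system actually has two coupled modes):

```python
import math

def simulate_swing(k_damp, f_hz=0.4, zeta0=0.02, dt=0.01, t_end=20.0):
    """Late-time amplitude of one lightly damped swing mode when a
    supplemental damping term k_damp acts on modal speed.
    Illustrative sketch only; all numbers are assumptions."""
    w = 2.0 * math.pi * f_hz
    delta, speed = 1.0, 0.0            # initial angle perturbation
    peak = 0.0
    for i in range(int(t_end / dt)):
        accel = -w * w * delta - (2.0 * zeta0 * w + k_damp) * speed
        speed += accel * dt            # semi-implicit Euler, stable for oscillators
        delta += speed * dt
        if i * dt > t_end - 5.0:       # track amplitude over the last 5 seconds
            peak = max(peak, abs(delta))
    return peak
```

With the damping path active, the late-time swing amplitude collapses by orders of magnitude, echoing the robustness claim above.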
Figure 79.45 Example of three-area system (SVC is rated at 7.5% of total system generation).
Figure 79.46 Simple power-swing damping control structure for an SVC.
K = 0.85 pu/pu, T1 = 0.32 sec, T2 = 0.78 sec, T3 = 0.13 sec, TF = 0.032 sec
Figure 79.48 Angle swings for a disturbance in Area 1 with Area 1 exporting power (Area 3 is reference).
Figure 79.47 Sample transfer function of a simple SVC damping control.
Figure 79.49 Angle swings for a disturbance in Area 2 with Area 1 importing power (Area 3 is reference).
Thyristor-Controlled Series Compensation (TCSC) Series capacitors have long been applied to transmission systems for reducing the effective impedance between load and source to increase power transfer capability. Adding thyristor control provides significant leverage in directing steady-state power flow along desired paths. In addition, this controllability can be highly effective for damping power swings. The thyristor control also mitigates the adverse side effects from the resonance between the series capacitor and the inductance of the transmission lines, transformers, and generators (known as subsynchronous resonance or SSR) [16]. A TCSC module consists of a series capacitor and a parallel path with a thyristor switch and a surge inductor. Also in parallel, as is typical with series capacitor applications, is a metal-oxide varistor (MOV) for overvoltage protection. A complete compensation system may be made up of several of these modules in series and may also include a conventional (fixed) series capacitor bank as part of the overall scheme, as shown in Figure 79.50. The TCSC can be rapidly and continuously controlled in a wide band of operating points that can make it appear as a net capacitive or net inductive impedance in a transmission line [17]. The following is an example of a damping control for an actual TCSC installation. The objective of the control was to continuously adjust the series reactance (Xc) of the TCSC to add damping to critical electromechanical modes of oscillation. An appropriate input signal with the proper characteristics to meet the stabilization objective of this design was selected. The signal that proved most appropriate for this application is the frequency difference across the system, obtained by synthesizing voltages behind reactances in both directions away from the TCSC. This concept is illustrated in Figure 79.51 and described in [14] and [18]. This signal is synthesized from local voltage and current measurements, eliminating the need for long-distance communication of signals. To prevent interaction with phenomena outside the desired control bandwidth, low-pass, high-pass, and torsional filtering is included in the modulation control path, as well as some phase compensation. The control structure for one specific installation [16] is shown in Figure 79.52. The transfer function of this control and input-signal filtering are shown in Figure 79.53. Note that the 6-Hz and 13-Hz notch filters are
in the input-signal filtering path to minimize the potential for site-specific torsional frequency interaction. The system examined here is a radial machine swinging against the rest of a power system at a mode near 1 Hz. This configuration is significant because actual TCSC controls were factory tested and field tested on a system with a topology such as this [16]. The TCSC for this system has an rms line-to-line voltage rating of 500 kV, an rms line current rating of 2900 Amps, and a nominal reactance rating of 8 Ω. Figure 79.54 shows analog simulator results for a severe fault applied on one of the system buses. At approximately one-half second, a fault is placed on the system and the frequency shows significant oscillations without the supplemental damping control. At six seconds into this test, the TCSC power-swing damping control is engaged and the oscillations are eliminated within three cycles of the 1-Hz oscillation. The remainder of the simulator test results show the ambient damping of the system. The control deadband prohibits further reduction at this point. These results clearly demonstrate the potential leverage of a TCSC for damping power system oscillations.
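The torsional notch filtering can be sketched with the standard second-order notch. The 6-Hz and 13-Hz centers follow the installation described above, while the particular notch form and the Q value are assumptions (the installed realization is not given in the text):

```python
import math

def notch(f_hz, f_notch, q=5.0):
    """Standard second-order notch, zero transmission at f_notch
    (an assumed form; Q is illustrative)."""
    w0 = 2 * math.pi * f_notch
    s = 2j * math.pi * f_hz
    return (s * s + w0 * w0) / (s * s + (w0 / q) * s + w0 * w0)

def torsional_filter(f_hz):
    """Cascade of 6-Hz and 13-Hz notches, per the filters noted for
    the TCSC installation."""
    return notch(f_hz, 6.0) * notch(f_hz, 13.0)
```

The cascade blocks the two torsional frequencies while leaving the 1-Hz swing-mode band essentially untouched, which is exactly the separation the input-signal filtering is meant to enforce.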
Figure 79.50 TCSC compensation scheme including a multimodule TCSC.
Figure 79.52 Input-signal filtering and control structure for a specific TCSC installation.
Figure 79.53 Transfer function of a TCSC damping control.
Figure 79.51 Synthesis of remote voltage phasors, angle difference, and frequency difference.
79.2.6 Recent Developments in Control Design Traditional approaches to aid the damping of power swings include the use of Power System Stabilizers (PSS) to modulate the generator excitation control, for which much experience and insight exist in the industry. Unlike PSS control at a generator location, the speed deviations of the machines of interest are not readily available to a controller sited in the transmission path. Further, since the intent is to damp complex swings involving large numbers of generators, speed signals themselves are not necessarily the best choice for an input signal. It is desired to extract an input signal from the locally measurable quantities at the controller location. Finding an appropriate combination of measurements is the most important aspect of control design.
Approximate Multimodal Decomposition
The approach outlined in this section and described in [18] makes a few approximations to develop engineering insight in the control design process. One key approximation is that the modes of interest exhibit light damping. Another is that the impact of any individual control on the frequency and mode shape of the power swing is small. These assumptions permit breaking the system apart based upon the approximate mode shape information, and determining the incremental effect of controllers on each mode separately. The effect of the controllers upon themselves also becomes apparent, and the design can proceed using
this information to assure minimum self-interaction effects on the final response.
Figure 79.54 Damping benefits of a TCSC as shown on an analog simulator.
Figure 79.55 Multimodal decomposition block diagram.
Formulation In the single-machine model described earlier, the mechanical swing mode is represented in terms of a synchronizing and a damping torque with control loops built around it. The same type of representation for each of the swing modes can be obtained by transforming the linearized representation of the power system into the equivalent form [18]:
dΔδ_mi/dt = Δω_mi,  dΔω_mi/dt = −k_mi Δδ_mi − d_mi Δω_mi + (coupling terms in z_mi and the control),
where Δδ_mi and Δω_mi represent the modal angle and modal speed, respectively, associated with a swing mode λ_i; k_mi and d_mi represent modal synchronizing and damping coefficients; and z_mi consists of all the other state variables. Based on this representation, a block diagram, similar to the one developed in [5], can be constructed for mode λ_i. Such a block diagram is shown in Figure 79.55. In this figure, K_mi(s), K_ci(s), K_oi(s), and K_ILi(s) denote modal, controllability, observability, and inner-loop transfer functions, respectively. These transfer functions evaluated at s = jω are complex, providing both gain and phase information which can be used to select effective transfer functions for feedback control. From Figure 79.55, the effective control action can be described
by the transfer function:
K_ei(s) = K_ci(s) K_PSDC(s) K_oi(s) / [1 − K_PSDC(s) K_ILi(s)].
This relationship describes the impact of a given damping controller, K_PSDC(s), on the modal system and is useful in estimating the eigenvalue sensitivity of the ith swing mode. This relationship shows that the effective control action K_ei(s) directly relates to the controllability K_ci(s) and observability K_oi(s) functions. Writing the perturbation of the complex pair of eigenvalues corresponding to the ith swing mode as Δλ_i = −Δσ_i ± jΔω_i, the following relationships can be shown [18]:
Δσ_i ∝ Re[K_ei(jω_i)],  Δω_i ∝ −Im[K_ei(jω_i)].
Thus, for an ideal damping controller design, K_ei(s) is desired to be real for the frequency range of interest. For a practical controller design, some phase lag at the swing mode frequency is acceptable, since it tends to increase synchronizing torque. Also, a good bandwidth is desired, such that high-frequency interactions can be limited by the controller design, as described earlier in this section. This direct relationship of modal sensitivity to the controller and power system characteristics provides the key to achieving the desired insight into control design. Subsequent discussion will focus on each of the terms in the above equations, to show how they relate to control performance and how they can be used to select effective measurements and controls.
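The sensitivity statements above (a real K_ei adds damping; a lagged K_ei adds synchronizing torque and raises the mode frequency) can be checked on a toy mode. The second-order reduction and all numbers below are illustrative assumptions, not the handbook's model:

```python
import cmath

def closed_loop_mode(k_m, d_m, k_speed=0.0, k_pos=0.0):
    """Eigenvalue of one swing mode with feedback added to modal speed
    (damping path) and modal angle (synchronizing path).  Sketch: the
    mode is assumed to reduce to s^2 + (d_m + k_speed)s + (k_m + k_pos) = 0."""
    a = d_m + k_speed
    b = k_m + k_pos
    return (-a + cmath.sqrt(a * a - 4 * b)) / 2

# k_m chosen so the open-loop mode sits near 0.4 Hz (illustrative numbers)
base = closed_loop_mode(6.32, 0.10)
damped = closed_loop_mode(6.32, 0.10, k_speed=0.5)  # real K_ei: mode moves left
synced = closed_loop_mode(6.32, 0.10, k_pos=0.5)    # lagged K_ei: frequency rises
```

The speed-feedback case shifts the eigenvalue's real part with almost no frequency change, while the angle-feedback case raises the frequency with almost no damping change, mirroring the Δσ_i and Δω_i roles described above.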
Interpretations Controllability The effect of a controller on a given swing mode is defined as the controllability function K_ci(s), and K_ei(s) is directly proportional to K_ci(s). In PSS design, K_ci(s) depends mostly on the excitation system, generator flux dynamics, and network impedances. For network control devices such as a TCSC, K_ci(s) depends mostly on the network structure and loads. When evaluated at s = jω_i, K_ci(jω_i) provides a measure of how controllable the ith mode is by the control signal u. If K_ci(jω_i) is zero, then mode i is not affected by u. When more than one damping controller is used, K_ci(s) is a vector transfer function. In general, K_ci(jω_i) is different for each swing mode. For example, a TCSC sited on a tie line would have significant controllability on the associated inter-area mode, but much smaller controllability over local modes. In cases with multiple inter-area modes, the controllability may be nearly 180° out of phase from one mode to the next. Such a condition would mean that, if the machine speeds were averaged and transmitted to the controller, then the action of improving the damping on one mode would simultaneously decrease the damping of another mode. This situation must be compensated for by using an appropriate set of measurements so that this inherent adverse impact will be minimized or eliminated. The quantity K_ci(jω_i) is a good indicator for evaluating effective locations to apply damping controllers, because the larger it is, the greater the leverage on the swing mode. For example, an SVC on a bus needing voltage support will be more effective for damping control than one close to a generator terminal bus. Observability The effective control action K_ei(s) is also directly proportional to the observability function K_oi(s). The function K_oi(s) relates the measured signal y to the ith modal speed Δω_mi.
For a PSS using the machine speed as the input signal, K_oi(s) = 1 for the local mode of that machine. When evaluated at s = jω_i, K_oi(jω_i) gives an indication of the modal content of the ith swing mode in the measured signal y. Its magnitude can be used to assess the effectiveness of measurements y for damping control applications. In a multimodal system, because K_oi(s) is defined with respect to the modal speed, measurements directly related to machine speeds will have observability gains that are predominantly real. Signals more closely related to angular separation, such as power flow, will have an integral characteristic, i.e., nearly 90° of lag with respect to the speed. If K_oi(jω_i) is small, then the ith mode is weakly observable from the measurement y. Thus, having a large K_oi(jω_i) for the dominant modes of interest is one of the criteria in selecting an input signal for a damping controller. Inner Loop The control design must also consider the effect of the controller output on its input (i.e., the component of the measured signal y due to the control u), other than via the swing mode of interest. This effect may be considered a "feedforward" term, but here we have called it the "inner-loop" effect, symbolized by the transfer function K_ILi(s). The inner-loop transfer function K_ILi(s) is extremely important in damping controller design using input signals other than
generator speeds. In [9] and [18], simple analyses based on the constraints imposed by K_ILi(s) are developed to aid the selection of an appropriate measurement signal.
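The importance of the inner loop can be illustrated with constant complex gains standing in for the transfer functions at one frequency. The closed-loop form below, where the inner loop inflates the effective action by 1/(1 − K_PSDC·K_IL), is an assumed sketch; the exact expressions are in [18]:

```python
def effective_action(k_c, k_o, k_psdc, k_il):
    """Effective modal control action when the controller also sees its
    own output through the inner-loop path.  Constants stand in for
    the transfer functions evaluated at the swing frequency; the
    1/(1 - k_psdc*k_il) closure is an assumed form (see [18])."""
    return k_c * k_psdc * k_o / (1 - k_psdc * k_il)
```

With a negligible inner loop the effective action is simply the controllability-controller-observability product; as the inner-loop gain product approaches unity the effective action is badly distorted, which is why input signals with small K_ILi are preferred.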
79.2.7 Summary This section provides an overview of some key issues for control design, implementation, and operation. Basic stability issues for power systems, such as transient, voltage, and oscillatory stability, were introduced. Given the focus of this book, the remainder of this section was on closed-loop controls that affect power system oscillatory stability. Basic concepts and a historic overview of generator excitation controls as they affect power system stability were introduced and described. Supplemental controls typically applied to generator excitation systems (power system stabilizers (PSS)) were also discussed. Furthermore, practical issues for applying supplemental controls to power system transmission equipment (such as HVDC, SVC, and TCSC) were presented. The issues addressed included siting, objectives for control design and operation, closed-loop control design, input-signal selection, input-signal filtering, control algorithms, performance evaluation, and a discussion on potentially adverse side effects due to the application of supplemental controls. Two detailed examples of power-electronic devices (SVC and TCSC) were provided illustrating these basic concepts. Finally, a presentation on recent developments in control design was included. Several references were provided for further reading on the issues presented in this section. This information shows that supplemental control can be beneficial in increasing the stability and utilization of electric power systems.
References [1] Bertoldi, O. and CIGRE WG 37.08, Adequacy and Security of Power Systems at the Planning Stage: Main Concepts and Issues, Paper 1A-05, Symposium on Electric Power System Reliability, Montreal, Canada, 1991. [2] Anderson, P.M. and LeReverend, B.K., Industry Experience with Special Protection Schemes, IEEE Power Eng. Soc. (PES) Paper 94-WM-184-2-PWRS, New York, 1994. [3] Voltage Stability of Power Systems: Concepts, Analytical Tools and Industry Experiences, IEEE PES Special Publication 90TH0358-2-PWR, 1990. [4] Kundur, P., Power System Stability and Control, McGraw-Hill, New York, 1994. [5] deMello, F.P. and Concordia, C., Concepts of Synchronous Machine Stability as Affected by Excitation Control, IEEE Trans. Power Apparatus Syst., PAS-88, 316-329, 1969. [6] Larsen, E.V. and Swann, D.A., Applying Power System Stabilizers, Parts I, II, and III, IEEE Trans. Power Apparatus Syst., PAS-100, 3017-3046, 1981.
[7] Paserba, J.J., Larsen, E.V., Grund, C.E., and Murdoch, A., Mitigation of Inter-Area Oscillations by Control, IEEE PES Special Publication 95-TP-101 on Inter-Area Oscillations in Power Systems, 1995. [8] Paserba, J.J., et al., Opportunities for Damping Oscillations by Applying Power Electronics in Electric Power Systems, CIGRE Symp. Power Electron. Electr. Power Systems, Tokyo, Japan, 1995. [9] Larsen, E.V. and Chow, J.H., SVC Control Design Concepts for System Dynamic Performance, in Application of Static Var Systems for System Dynamic Performance, IEEE PES Special Publication 87TH0187-5-PWR, 1987, 36-53. [10] Piwko, R.J. and Larsen, E.V., HVDC System Control for Damping Subsynchronous Oscillations, IEEE Trans. Power Apparatus Syst., PAS-101(7), 2203-2211, 1982. [11] Chow, J.H., Date, R.A., Othman, H.A., and Price, W.W., Slow Coherency Aggregation of Large Power Systems, IEEE Symp. Eigenanalysis and Frequency Domain Methods for System Dynamic Performance, IEEE PES Special Publication 90TH0292-3-PWR, 1990, 50-60. [12] Hauer, J.F., Application of Prony Analysis to the Determination of Model Content and Equivalent Models for Measured Power Systems Response, IEEE Trans. Power Syst., 1062-1068, 1991. [13] Bahrman, M.P., Larsen, E.V., Piwko, R.J., and Patel, H.S., Experience with HVDC Turbine-Generator Torsional Interaction at Square Butte, IEEE Trans. Power Apparatus Syst., PAS-99, 966-975, 1980. [14] Clark, K., Fardanesh, B., and Adapa, R., Thyristor-Controlled Series Compensation Application Study-Control Interaction Considerations, IEEE PES Paper 94-SM-478-8-PWRD, San Francisco, CA, 1994. [15] Cresap, R.L., Mittelstadt, W.A., Scott, D.N., and Taylor, C.W., Operating Experience with Modulation of the Pacific HVDC Intertie, IEEE Trans. Power Apparatus Syst., PAS-97, 1053-1059, 1978.
[16] Piwko, R.J., Wegner, C.A., Darnsky, B.L., Furumasu, B.C., and Eden, J.D., The Slatt Thyristor-Controlled Series Capacitor Project -Design, Installation, Commissioning, and System Testing, CIGRE Paper 14-104, Paris, France, 1994. [17] Larsen, E.V., Clark, K., Miske, S.A., and Urbanek, J., Characteristics and Rating Considerations of Thyristor Controlled Series compensation, IEEE Trans. Power Delivery, 992-1000, 1994. [I81 Larsen, E.V., Sanchez-Gasca, J.J., and Chow, J.H., Concepts for Design of FACTS Controllers to Damp Power Swings, IEEE PES Paper 94-SM-532- 1-PWRS, San Francisco, CA, 1994.
SECTION XVIII Control Systems Including Humans
Human-in-the-Loop Control
R. A. Hess, University of California, Davis
80.1 Introduction
80.2 Frequency-Domain Modeling of the Human Operator
    The Crossover Model of the Human Operator • A Structural Model of the Human Operator
80.3 Time-Domain Modeling of the Human Operator
    An Optimal Control Model of the Human Operator
80.4 Alternate Modeling Approaches
    Fuzzy Control Models of the Human Operator
80.5 Modeling Higher Levels of Skill Development
80.6 Applications
    An Input Tracking Problem
80.7 Closure
80.8 Defining Terms
References
Further Reading
80.1 Introduction

Interest in modeling the behavior of a human as an active feedback control device began during World War II, when engineers and psychologists attempted to improve the performance of pilots, gunners, and bombardiers. To design satisfactory manually controlled systems, these researchers began analyzing the neuromuscular characteristics of the human operator. Their approach, e.g., [19], was to consider the human as an inanimate servomechanism with a well-defined input and output. Figure 80.1(a) is a schematic representation of a tracking task in which the human is attempting to keep a moving target within the reticle of a gun sight. Figure 80.1(b) is a feedback block diagram of the gunnery task of Figure 80.1(a) in which the human has been represented as an error-activated compensation element. The input to the human in Figure 80.1 is a visual "signal" indicating the error between desired and actual system output, the gun azimuth. The output of the human is a command to the device which drives the gun in azimuth. This device might be a simple gearing mechanism linked to a wheel which the human turns, or an electric motor which transforms a joystick input into a proportional rate of change of the gun barrel's azimuth angle. What mathematical representation should be used for the block labeled "human controller"? The early researchers in the discipline known as "manual control" hypothesized this answer to the question: the same types of mathematical equations used to describe linear servomechanisms, namely, sets of linear, constant-coefficient differential equations. The hypothesis

0-8493-8570-9/96/$0.00+$.50 © 1996 by CRC Press, Inc.
Figure 80.1  (a) A human-in-the-loop control problem. (b) A block diagram representation of the task of (a).
of these early researchers turned out to be true, and the control-theory paradigm for quantifying human control behavior was born. This paradigm has become fundamental for manual control engineers [13]. The evolution of the control-theory paradigm for the human controller or operator paralleled the development of new synthesis techniques in feedback control. Thus, "optimal control models" (OCMs) of the human operator [8] appeared as linear quadratic Gaussian (LQG) control system design techniques were being developed. "Fuzzy controller" models [12] and "H-infinity" models [1] of the human operator closely followed the appearance of these design techniques. The models discussed here will primarily be those which have successfully solved human-in-the-loop control problems. This approach will neglect some of the more recent human operator models, which are incompletely tested.
80.2 Frequency-Domain Modeling of the Human Operator

Modeling the human operator as a set of linear, constant-coefficient differential equations suggests representing the human as a transfer function. This approach, generalized to describing function descriptions, captured the attention of some of the earliest and most influential manual control engineers [14]. Figure 80.2 shows a describing function representing the human operator or controller in a single-input, single-output (SISO) tracking task, such as an aircraft pitch-attitude tracking task. Here the transfer function representation of the human has been generalized as a quasi-linear describing function [4] by the addition of an additive "remnant" signal, ne(t). This signal represents the portion of the system error signal e(t) unexplainable by linear operator behavior, and not linearly correlated with the system input c(t). The spectral measurements of remnant coalesced best when the remnant was assumed to be injected into the displayed error e(t) rather than the operator's output δ(t). For this reason, the remnant portion of the quasi-linear describing function is almost universally shown with error-injected remnant.
Figure 80.2  A quasi-linear describing function representation of the human operator.

Frequency-domain identification of human operator describing functions in simple laboratory tracking tasks has been actively studied over the past three decades [15]. In these experiments, the describing functions identified were Yp(jω) (or Yp(jω)Yc(jω)) and Φnene(ω), where the latter quantity is defined as the power spectral density of the remnant signal ne(t). The controlled element or plant was a member of a set of "stereotypical" controlled-element dynamics, i.e., Yc(s) = K/s^k, k = 0, 1, 2, and the inputs
or disturbances were random-appearing signals, often generated as sums of sinusoids. The results led to one of the first true engineering models of the human operator, referred to as the crossover model.
80.2.1 The Crossover Model of the Human Operator

The crossover model is based on the following experimentally verifiable fact: In a Bode diagram representing the loop transmission Yp(jω)·Yc(jω) of the system, such as shown in Figure 80.2, the human adopts dynamic characteristics Yp(jω) so that

    Yp(jω)·Yc(jω) = ωc·e^(−jωτe)/(jω), for ω near ωc.    (80.1)
The crossover frequency, ωc, is defined as the frequency where |Yp·Yc(jω)| = 1.0. Equation 80.1 is valid in a broad frequency range (1 to 1.5 decades) around the crossover frequency ωc. The factor τe, referred to as an effective time delay, represents the cumulative effect of actual time delays in the human information processing system (e.g., visual detection times, neural conduction times, etc.), the low-frequency effects of higher frequency human operator dynamics (e.g., muscle actuation dynamics), and higher frequency dynamics in the controlled element itself. Here, "higher frequency" refers to frequencies well above ωc. Associated with Equation 80.1 is a model of Φnene(ω), the power spectral density of the error-injected remnant. Again, extensive experimental evidence suggests the following form (Equation 80.2, not reproduced here):
where e̅² represents the mean-square value of the error signal e(t) in Figure 80.2. Table 80.1 shows approximate parameter values for the crossover and remnant models with related equations. In applying the crossover model, plant dynamics as simple as those in Table 80.1 will be rare. In these cases, one should interpret the stereotypical dynamics in the table to reflect the actual plant characteristics in the crossover region. A detailed summary of empirically derived rules for selecting crossover model parameters, given the controlled-element dynamics and the bandwidth of the input signal, is given by [5]. In addition, simplified techniques are given for estimating human-machine performance (e.g., root-mean-square tracking error √e̅²) in continuous tasks with random-appearing inputs. The reason for beginning this discussion with the crossover model is that it is basic for manual control modeling. Any valid model of the human operator in continuous tasks with random-appearing inputs must exhibit the characteristics of Equation 80.1. The loop transmission prescribed by Equation 80.1 is similar to that which an experienced control system designer would select in a frequency-domain synthesis of a control system with an inanimate compensation element and performance requirements similar to the manually controlled system [16].
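The crossover-model relation (Equation 80.1) lends itself to a short numerical check. The sketch below evaluates the loop transmission and the phase margin it implies; the numbers for ωc and τe are the ones derived later in this chapter's application example and are used here only as illustrative assumptions.

```python
import numpy as np

# Crossover model, Eq. 80.1: Yp(jw)*Yc(jw) = wc * exp(-j*w*tau_e) / (jw)
# near the crossover frequency wc.  Parameter values are borrowed from the
# application example later in the chapter (illustrative only).
wc, tau_e = 4.68, 0.27   # crossover frequency (rad/s), effective delay (s)

def loop_transmission(w):
    """Crossover-model loop transmission Yp*Yc at frequency w (rad/s)."""
    return wc * np.exp(-1j * w * tau_e) / (1j * w)

# By definition of wc, |Yp*Yc| = 1 there; the phase there sets the
# phase margin: 180 - 90 - (180/pi)*wc*tau_e degrees.
mag = abs(loop_transmission(wc))
pm = 180.0 + np.degrees(np.angle(loop_transmission(wc)))
print(round(mag, 3), round(pm, 1))
```

With Yc(s) = K/s, Equation 80.1 implies Yp(jω) ≈ (ωc/K)·e^(−jωτe): near crossover the operator acts as a pure gain plus time delay.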
TABLE 80.1  Parameters and Relations for Crossover Model. (Column headings: Yc around crossover; τe (s); ωco (rad/s); ωR (rad/s); R. The numerical entries are not recoverable from this text.)
Figure 80.3  The structural model of the human operator.

80.2.2 A Structural Model of the Human Operator

A model of the human operator which follows the crossover model, but provides a more detailed representation of human operator dynamics, is offered in [7] and is referred to as a structural model of the human operator. The model is shown in Figure 80.3. Table 80.2 lists the model parameter values, which depend on the order of the controlled-element dynamics in the crossover region. The parameter k in Table 80.2 refers to the controlled-element order around crossover, i.e., k = 0 for zeroth order, k = 1 for first order, etc.

TABLE 80.2  Nominal Parameters for Structural Model. (Column headings: k; Ke; T1 (s); T2 (s); τ (s); K1; K2; ζn; ωn (rad/s), with rows for k = 0, 1, 2. In all rows τ = 0.15 s, ζn = 0.707, and ωn = 10.0 rad/s; the remaining entries cannot be reliably recovered from this text. Footnotes: a — Ke chosen to provide desired crossover frequency; b — selected to achieve K/s-like crossover characteristics; c — parameter not applicable.)

The structural model of the human operator is called an "isomorphic model" because the model's internal feedback structure reflects hypothesized proprioceptive feedback activity in the human. If the control engineer can reasonably estimate the crossover frequency for the manual control task at hand, Figure 80.3 and Table 80.2 can provide a model of the linear human operator dynamics (Yp(jω)) adequate for many engineering applications. The empirically derived rule for determining crossover frequency for the crossover model can be used to define the crossover frequency for the structural model. Remnant parameter estimates can also be obtained in a similar fashion.

80.3 Time-Domain Modeling of the Human Operator

80.3.1 An Optimal Control Model of the Human Operator

Figure 80.4  The optimal control model of the human operator.

The advent of a time-domain control synthesis technique in the mid-1960s, referred to as linear quadratic Gaussian (LQG) design, led to a powerful model of the human operator called the Optimal Control Model (OCM). This model differs from those defined in the preceding because it is algorithmic and is based on a time-domain optimization procedure. The model is algorithmic because the quantitative specification of certain human operator information processing limitations, such as signal-to-noise ratios on observed and control variables and sensory-motor time delays, together with an objective function which the human is assumed to be minimizing in the task at hand, can lead to direct computation of the linear human operator dynamics and remnant. In addition, this algorithmic capability is not limited to SISO systems but can also be extended to human control of multi-input, multi-output (MIMO) systems. Figure 80.4 shows the basic OCM structure. Focusing for the moment on a SISO system, the elements in Figure 80.4 from time delay to neuromuscular dynamics form the human operator dynamics. Strictly speaking, the OCM is never a SISO model, because a fundamental hypothesis in the model formulation is that, if a variable d(t) is displayed to the operator, its time derivative ḋ(t) is also sensed, both signals corrupted with white observation noise. Thus, in its simplest form, the y(t) in Figure 80.4 is a column vector whose two elements are the displayed signal and
THE CONTROL HANDBOOK
its time derivative. Likewise, the vy(t) in Figure 80.4 is also a column vector whose elements are the observation noise signals. The covariances of these observation noise signals are assumed to scale with the covariances of the displayed/observed signals y(t) and ẏ(t). In addition to observation noise, the signal uc(t) is also assumed to be corrupted with "motor" noise. This noise provides performance predictions which are more realistic than those produced when motor noise is absent. Table 80.3 shows "nominal" values of the noise-signal ratios and time delay for typical applications of the OCM to tracking or disturbance regulation tasks with random-appearing inputs or disturbances. The remaining user-defined element in application of the OCM is the quadratic index of performance which the operator is assumed to be minimizing. In many SISO applications, the following index of performance has been employed:
    J = ∫₀^∞ [y²(t) + ρ·u̇²(t)] dt.    (80.3)
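The role of the weighting coefficient ρ can be seen in a stripped-down version of this optimization. The sketch below solves the scalar Riccati equation for an integrator plant with cost ∫(y² + ρu²)dt — a deliberate simplification of Equation 80.3, which penalizes control rate rather than control — to show how decreasing ρ raises the closed-loop bandwidth:

```python
import math

# Scalar LQR for the integrator plant y' = u with cost J = ∫(y^2 + rho*u^2)dt.
# (A simplified stand-in for Eq. 80.3, which penalizes control *rate*.)
# Riccati equation for a = 0, b = 1, q = 1, r = rho:  -P^2/rho + 1 = 0,
# so P = sqrt(rho) and the optimal gain is k = P*b/r = 1/sqrt(rho).
def lqr_gain(rho):
    P = math.sqrt(rho)
    return P / rho           # u = -k*y; closed-loop pole at s = -k

for rho in (1.0, 0.04, 0.01):
    print(rho, lqr_gain(rho))   # bandwidth (rad/s) grows as rho**-0.5
```

Heavier control weighting (larger ρ) slows the closed loop; in the OCM, the analogous tuning of ρ sets the neuromotor lag time constant discussed in the text.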
TABLE 80.3  Nominal Parameter Values for Optimal Control Model (OCM). Observation noise-to-signal ratio^a: 0.01; motor noise-to-signal ratio^b: 0.001; time delay τ (s): 0.15. (a — noise-to-signal ratio for observation noise vy(t); b — noise-to-signal ratio for motor noise vu(t).)

Inclusion of control rate in the index of performance will produce first-order lag dynamics in the OCM [5]. These constitute the neuromotor dynamics shown in Figure 80.4. As in any LQG synthesis application, selection of the index of performance weighting coefficients is nontrivial. One procedure for selecting ρ in Equation 80.3 is a trial and error technique in which ρ is varied until the time constant of the first-order lag (neuromotor dynamics) is approximately 0.1 s. Another procedure for choosing ρ, which does not require trial and error iteration, has been used successfully [6] and is referred to as an "effective time constant" method.

The power of the OCM lies in its ability to provide accurate representations of human control behavior given "standard" values for quantifying the information processing limitations mentioned in the preceding. Finally, it should be noted that other human operator models have been developed which are progeny of the OCM. The biomorphic model of the human pilot [3] is one example, as is the OCM extension and refinement described by [21], and the more recent H-infinity model of [1].

80.4 Alternate Modeling Approaches

80.4.1 Fuzzy Control Models of the Human Operator

Fuzzy set theory leads to a description of cause and effect relationships which differs considerably from the control theoretic approaches discussed in the preceding [20]. In classical set theory, there is a distinct difference between elements which belong to a set and those which do not. Fuzzy set theory allows elements to belong to more than one set and assigns each element a membership value, M, between 0 and 1 for each set of which it is a member. Consider, for example, the use of fuzzy sets to model the manner in which a pilot might control the pitch attitude of an aircraft, creating what can be termed a fuzzy control model of the human. Assume that the aircraft's pitch attitude, Θ(t), can vary within −20° ≤ Θ(t) ≤ +20°. This range is often referred to as the "universe of discourse" in fuzzy set theory. One can define a number of membership functions over this universe of discourse as in Figure 80.5(a), where five functions are shown. The shape defining each function, as well as the number of functions spanning the universe of discourse, is entirely up to the analyst. The five triangular functions of Figure 80.5(a) have been chosen for simplicity. The functions have also been given linguistic definitions, i.e., "Θ(t) very negative," "Θ(t) negative," etc. Now, as in Figure 80.5(a), an aircraft pitch attitude of −12.5° is a member of five sets with membership values of: M(−12.5) = 0.25 (very negative), 0.75 (negative), 0 (around zero), 0 (positive), and 0 (very positive).

Figure 80.5  (a) Defining input membership functions for a fuzzy control model of the human operator. (b) Defining output membership functions for a fuzzy control model of the human operator.

Modeling the human operator with fuzzy sets involves using fuzzy relations between input and output values. Let us assume
the output set here consists of elevator deflections which the pilot commands in response to an observed pitch attitude deviation from zero. Output fuzzy sets can be defined over the universe of discourse for the elevator deflections, here defined as −5° ≤ δ(t) ≤ +5°, using the membership functions shown in Figure 80.5(b). Again, linguistic definitions have been employed to describe these membership functions. Perhaps the pitch attitude of −12.5° produced a pilot-commanded elevator angle of −2.5°, as shown in Figure 80.5(b). Now, as will be discussed in a later section, by observing the pitch-attitude changes and resultant control actions of an experienced pilot, or group of experienced pilots, in a flight simulator, the control engineer can create a "relational matrix" R between Θ(t) and δ(t) as R = M(Θ) × M(δ), where "×" denotes a Cartesian product, or minimum for each element pairing of M(Θ) and M(δ). Assume that, on the basis of simulator observations, the relational matrix of Equation 80.4 is obtained (the matrix itself is not reproduced here).

The R of Equation 80.4 forms the basis of a "fuzzy control" model of the pilot. In exercising the model, one can assume that the pilot makes observations every ΔT sec. For example, at some observation instant assume Θ(nΔT) = −15.0°. As shown in Figure 80.5(a), the memberships of the input are M(Θ) = 0.5 (very negative pitch), 0.5 (negative pitch), 0 (around zero pitch), 0 (positive pitch), and 0 (very positive pitch). Using R and M(Θ), the memberships of the fuzzy output can be obtained [17] as

M(δ) = max of {min[0.6, 0.5], min[0.2, 0.5], min[0, 0], min[0, 0], min[0, 0]} for very negative elev.
     = max of {min[0.5, 0.5], min[0.75, 0.5], min[0.4, 0], min[0, 0], min[0, 0]} for negative elev.
     = max of {min[0, 0.5], min[0.5, 0.5], min[1, 0], min[0.5, 0], min[0, 0]} for around zero elev.
     = max of {min[0, 0.5], min[0, 0.5], min[0.4, 0], min[0.75, 0], min[0.4, 0]} for positive elev.
     = max of {min[0, 0.5], min[0, 0.5], min[0, 0], min[0, 0], min[0.6, 0]} for very positive elev.

The right-hand sides of each of the five equations above constitute the membership values for δ(t) at this instant. These values have been obtained by pairing the elements of R with elements of M(Θ) as would be done in the matrix multiplication M(Θ) ∘ R. The minimum value of each pairing was found, and finally, the maximum of the resulting elements selected. This yields M(δ) = 0.5 (very negative elev.), 0.5 (negative elev.), 0.5 (around zero elev.), 0 (positive elev.), and 0 (very positive elev.).

Notice that the output of this fuzzy relation or fuzzy mapping is not a definite value of δ(nΔT), but a collection of membership values. In order to implement this fuzzy pilot model, say, in a simulation of an aircraft, one needs to "defuzzify" model outputs such as the M(δ) above into nonfuzzy δ(nΔT), which can then serve as useful inputs to the simulation. It is worth emphasizing that determining fuzzy models of human operator behavior is rarely as simple as described. For example, human operator actions are often predicated upon system error and upon error rate. This obviously complicates the definition of the human's input membership functions. Despite these complications, fuzzy models of the human operator have been successful in modeling human-in-the-loop control in a variety of tasks. Examples include modeling motorcycle drivers [12], ships' helmsmen [18], and automobile drivers [9].

80.5 Modeling Higher Levels of Skill Development

In the models described briefly in the preceding paragraphs, it has been tacitly assumed that the human was operating upon displayed system error (and error rate in the case of the OCM). Such tasks are referred to as compensatory tracking tasks. Often displays for human operators contain error information and output information as well. Such displays are referred to as "pursuit" displays, and the resulting tasks are referred to as pursuit tracking tasks. A discussion of such tasks and their associated human operator models can be found in [5]. However, another, more subtle modeling issue can arise in which the human can actually develop pursuit tracking behavior with only a compensatory display. Such human behavior has been the subject of the "Successive Organizations of Perception" (SOP) theory [10]. The highest level of skill development is referred to as "precognitive" behavior and implies human execution of preprogrammed responses to certain stimuli without the necessity of continuous feedback information. Modeling any of these higher levels of skill development is beyond the scope of this chapter.
80.6 Applications

The preceding sections have outlined approaches for modeling the human operator in SISO control systems in which the human is sensing system error and where the system input is random or random-appearing. The various modeling approaches have met with success in engineering analyses. It is useful at this juncture to consider applying a subset of these models to a very simple but illustrative human-in-the-loop control example.

80.6.1 An Input Tracking Problem

Consider again the block diagram of Figure 80.2 representing a human-in-the-loop SISO control problem. Here, the human's task is to null a displayed error signal (difference between command input and system output) by manipulating a control device
producing an output δ(t). The plant dynamics are very simple, consisting of an integrator, i.e., the plant transfer function is Yc(s) = 1/s. Such dynamics are often referred to as rate-command, because a constant control input δ(t) produces a constant output rate ṁ(t). The command input here is filtered white noise, wherein the filter transfer function is F(s) = 1/(s + 1)². Three models of the human will be generated here: 1) a crossover model, 2) a structural model, and 3) an Optimal Control Model. In addition, a brief discussion of how a fuzzy control model might be developed is also included.
Crossover Model

Development of the crossover model of Equation 80.1 begins by estimating the system crossover frequency, ωc, and the effective time delay, τe. These parameters have been found empirically to depend upon the plant dynamics and input bandwidth [15], via the equations shown in Table 80.1 (Equations 80.5, not reproduced here), where ωB refers to the input bandwidth, here approximated by the break frequency of the input filter F(s) as 1 rad/s. Thus, Equations 80.5 yield ωc = 4.68 rad/s and τe = 0.27 s, and Equation 80.1 becomes

    Yp(jω)·Yc(jω) = 4.68·e^(−0.27jω)/(jω), for ω near ωc.

Now the power spectral density of the error-injected remnant is given by Equation 80.2. Using Table 80.1, the remnant power spectral density becomes Equation 80.8 (not reproduced here), in which an intermediate value of R has been used. Simple block diagram algebra, fundamental spectral analysis techniques, and the "1/3 power law" can be used to derive an equation (Equation 80.9, not reproduced here) for estimating human-in-the-loop tracking performance [5], where c̄² refers to the mean-square value of the system input, here assumed to be unity. The I₁ in Equation 80.9 can be obtained from Figure 80.6 [15]. In this case, I₁ ≈ 3.6, and Equation 80.9 gives an estimate of tracking performance as e̅² = 0.037 with root-mean-square (RMS) tracking error √e̅² = 0.19. The one caveat that should accompany the use of Equation 80.9 is that it assumes a rectangular input spectrum. In the task at hand, the input was formed by passing white noise through a second-order filter, and thus was not rectangular. However, to obtain a preliminary estimate of tracking performance, one can use the cutoff frequency of the continuous-input spectrum as the cutoff of the rectangular spectrum. Tracking performance can be more accurately predicted by a computer simulation of the crossover model, with injected remnant. Some iteration will be required, because the magnitude of the injected remnant has to scale with e̅², the quantity one is trying to obtain from the simulation. Nonetheless, the author has found that the iterative process is fairly brief.

Note that the crossover model implies a very simple model of the human operator, i.e.,

    Yp(jω) = ωc·e^(−0.27jω), for ω near ωc.    (80.10)

Figure 80.6  Determining I₁ for Equation 80.9.

Structural Model

Because the structural model follows the crossover model, one can use the first of Equations 80.5 to determine the crossover frequency as ωc = 4.68 rad/s. Referring to Table 80.2, and recalling that the plant exhibits first-order dynamics at all frequencies (i.e., the slope of the magnitude portion of its Bode diagram is −20 dB/dec at all frequencies), one uses the structural model parameters corresponding to k = 1. Thus, all of the structural model parameters can be chosen except Ke, which can be selected by requiring |Yp(jωc)Yc(jωc)| = 1.0. This yields Ke = 18.1. As opposed to Equation 80.10, the structural model offers a more detailed representation of human operator dynamics in this application. One can also utilize the same remnant model, as discussed with the crossover model, and approximate tracking performance with Equation 80.9 and Figure 80.6.
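The computer-simulation route to tracking-performance prediction mentioned above can be sketched as follows. This is a minimal stand-in, not a reproduction of the author's program: Euler integration, an illustrative input-scaling factor, and no injected remnant (so the remnant-scaling iteration the text describes is omitted).

```python
import numpy as np

# Closed-loop simulation of the crossover model for this example:
# plant Yc = 1/s, operator Yp = wc*exp(-tau_e*s), command input = white
# noise shaped by F(s) = 1/(s+1)^2 and scaled toward unit variance.
rng = np.random.default_rng(0)
wc, tau_e = 4.68, 0.27
dt, T = 1.0e-3, 120.0
n = int(T / dt)
delay = int(round(tau_e / dt))          # operator transport delay, in steps

x1 = x2 = m = 0.0                       # shaping-filter states, plant output
ebuf = np.zeros(delay)                  # circular buffer implementing delay
err = np.zeros(n)
cmd = np.zeros(n)
for k in range(n):
    w = rng.standard_normal() / np.sqrt(dt)   # unit-PSD white-noise drive
    x1 += dt * (-x1 + w)                      # two cascaded first-order lags
    x2 += dt * (-x2 + x1)                     # realize F(s) = 1/(s+1)^2
    c = 2.0 * x2                              # scale toward unit variance
    e = c - m
    u = wc * ebuf[k % delay]                  # delayed operator command
    ebuf[k % delay] = e
    m += dt * u                               # integrator plant
    err[k], cmd[k] = e, c

print("RMS input:", round(float(cmd.std()), 2),
      "RMS error:", round(float(err.std()), 2))
```

Because the remnant channel is absent, the simulated RMS error will not match the chapter's 0.19 prediction exactly; the point of the sketch is the structure of the loop (delayed gain operator, integrator plant, shaped random input).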
Optimal Control Model

Given the nominal OCM parameters from Table 80.3, only the selection of the weighting coefficient ρ in Equation 80.3 remains before the LQG synthesis technique produces a pilot model. One approach to selecting ρ has been mentioned in the preceding as an "effective time constant" method [6]. However, a simpler approach will be adopted which again calls on the empirical relation in the first of Equations 80.5 to define the crossover frequency ωc. It can be shown that a relatively simple relationship exists between the plant dynamics and the coefficient ρ in an optimal regulator problem [11] (Equation 80.12, not reproduced here),
where ωBW here is the bandwidth of the closed-loop, human-vehicle system and the parameters m and n are obtained from the plant dynamics when expressed as in Equation 80.13 (not reproduced here). In more precise terms, ωBW is the magnitude of that closed-loop pole closest to the frequency where the magnitude of the closed-loop system transfer function is 6 dB below its zero-frequency value. Now one can approximate the crossover frequency, ωc, given ωBW, from Equation 80.12 by Equation 80.14 (not reproduced here).
Using the nominal parameter values given in Table 80.3, the OCM can now be applied to the tracking task. With Yc(s) = K/s, K = 1, m = 0, and n = 1 in Equation 80.13, Equations 80.12 and 80.14 and the ωc = 4.68 rad/s used in the previous models yield ρ = 2.05 × 10⁻². The OCM computer program utilized in this study is described in [2]. As opposed to the crossover and structural models, error-injected remnant and RMS tracking error are obtained as part of the OCM model solution. The RMS tracking error predicted by the OCM was √e̅² = 0.21, which compares favorably with the 0.19 value obtained with the power law in the crossover and structural models. Finally, Figure 80.7 compares the Bode diagrams for the loop transmission YpYc(jω) obtained with each of the three modeling approaches. Again, the comparison is favorable.
Fuzzy Control Model Formulation

Developing a fuzzy control model for the human operator in the input tracking task defined in the preceding would require actual simulator tracking data, or a less attractive alternative involving detailed discussions with trained human operators who might describe how they select their control actions given
Figure 80.7  Bode diagrams of loop transmissions (YpYc(jω)) for the application example.
certain displayed errors. This procedure assumes that the control engineer is unwilling or unable to employ any of the models just discussed. This might be the case, for example, if the plant dynamics were unknown, were suspected of being highly nonlinear, or if the task were sufficiently removed from that of tracking random-appearing input signals, e.g., tracking transient signals. Space does not permit a detailed discussion of the fuzzy control approach to this problem, other than a very general, rudimentary outline. Assume that a set of human operator input and output time histories (e(t) and δ(t), respectively, in Figure 80.2) are available from a human-in-the-loop control simulation of the task at hand. Let us further assume that, on the basis of discussions with the operator, the manual control engineer has been convinced that only error, and not some combination of error and error rate or error and integral error, is used by the operator in generating appropriate control inputs, δ(t). Next, the engineer would create a set of membership functions for the error and output signals, similar to those shown in Figure 80.5(a). Choosing the number and shapes of the functions is part of the art in applying fuzzy control. Next, the human operator time delay, τ, would be estimated. This delay would be considerably smaller than the effective delay, τe, discussed in the crossover model, as it is intended only to approximate the delay between a visual stimulus and the initiation of a control response by the operator. Values on the order of 0.1 s would be appropriate. The engineer would then tabulate pairs of error and delayed output values e(nΔT) and δ(nΔT + τ). Using the paired input and output values, a relational matrix such as Equation 80.4 might be created as follows: For each input/output pair, form a relational matrix Ri as Ri = M(ei) × M(δi), i.e., the Cartesian product of M(ei) and M(δi), or minimum for each element pairing of M(ei) and M(δi). After
this has been done for all of the input-output pairs, the final relational matrix R is obtained by using, as each element R(i, j), the largest corresponding element found in any Ri. This last operation is sometimes referred to as finding the "union" of the relational matrices. With the relational matrix R thus obtained, one has the basis of a fuzzy control model which can produce membership numbers for a δ(nΔT + τ) for any error e(nΔT). This result is still fuzzy (recall the M(δ) obtained after Equation 80.4) and needs to be "defuzzified" for appropriate use in any simulation employing the fuzzy control model.
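The construction just outlined — triangular memberships, per-sample relations, their union, max-min inference, and a final defuzzification — can be sketched in a few lines. The sketch below uses membership centers matching the ±20° input and ±5° output universes of Figure 80.5, but the data pairs and the resulting relational matrix are hypothetical, not the chapter's identified model:

```python
import numpy as np

# Triangular membership functions, five overlapping triangles of
# half-width equal to the center spacing (as in Figure 80.5).
def memberships(x, centers):
    c = np.asarray(centers, dtype=float)
    width = c[1] - c[0]
    return np.clip(1.0 - np.abs(x - c) / width, 0.0, 1.0)

IN_C = [-20, -10, 0, 10, 20]      # pitch-attitude universe (deg)
OUT_C = [-5, -2.5, 0, 2.5, 5]     # elevator universe (deg)

# Check against the text: theta = -12.5 deg -> (0.25, 0.75, 0, 0, 0)
print(memberships(-12.5, IN_C))

# R_i = M(theta_i) x M(delta_i): minimum over the Cartesian product;
# the final R is the elementwise-maximum "union" over all data pairs.
pairs = [(-12.5, -2.5), (0.0, 0.0), (15.0, 3.0)]   # hypothetical data
R = np.maximum.reduce([np.minimum.outer(memberships(t, IN_C),
                                        memberships(d, OUT_C))
                       for t, d in pairs])

def infer(m_in, R):
    """Max-min composition M(delta) = M(theta) o R."""
    return np.max(np.minimum(m_in[:, None], R), axis=0)

def defuzzify(m_out, centers):
    """Centroid defuzzification of the fuzzy output."""
    m = np.asarray(m_out)
    return float(np.dot(m, centers) / m.sum())

m_out = infer(memberships(-15.0, IN_C), R)
print(m_out, defuzzify(m_out, OUT_C))
```

The centroid step at the end performs the "defuzzification" the text calls for, collapsing the fuzzy output memberships into a single elevator command usable in a simulation.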
80.7 Closure This treatment of human-in-the-loop control has been brief, and many important topics have gone untouched. Two examples of such unexplored topics are the modeling of the human operator in tasks where motion cues are important, and the modeling of human control of MIMO systems. It is hoped that the foregoing material has provided the reader with sufficient introductory background to explore these omitted topics.
80.8 Defining Terms

Compensatory tracking task: A tracking task in which the human is presented with a display of system error only.

Crossover model: A model of the human operator/plant combination (loop transmission) for compensatory tasks stating that the loop transmission in manually controlled SISO systems can be approximated by an integrator and time delay around the crossover frequency.

Fuzzy control model: A model of the human operator derived from fuzzy set theory. In its simplest form, this model is defined by input and output membership functions and a relational matrix describing mappings between these functions.

Optimal control model: An algorithmic, time-domain based model of the human operator based upon linear quadratic Gaussian control system design.

Pursuit tracking task: A tracking task in which the human is presented with a display of system error and system output.

Structural model: A model of the human operator which follows the crossover model, but provides a more detailed representation of human operator dynamics.
References

[1] Anderson, M., Standard optimal pilot models, AIAA 94-3627, 1994.
[2] Curry, R. E., Hoffman, W. C., and Young, L. R., Pilot modeling for manned simulation, Air Force Flight Dynamics Lab., AFFDL-TR-76-124, Vol. 1, 1976.
[3] Gerlach, O. M., The biomorphic model of the human pilot, Delft University of Technology, Dept. of Aerospace Engineering, Report LR-310, 1980.
[4] Graham, D. and McRuer, D. T., Analysis of Nonlinear Control Systems, Dover, New York, 1971, Chapter 10.
[5] Hess, R. A., Feedback control models, in Handbook of Human Factors, Salvendy, G., Ed., John Wiley & Sons, New York, 1212-1242, 1987.
[6] Hess, R. A. and Kalteis, R., Technique for predicting longitudinal pilot-induced oscillations, J. Guidance, Control, Dyn., 14(1), 198-204, 1991.
[7] Hess, R. A., Methodology for the analytical assessment of aircraft handling qualities, in Control and Dynamic Systems, Vol. 33: Advances in Aerospace Systems Dynamics and Control Systems, Part 3, Leondes, C. T., Ed., Academic, San Diego, 1990, 129-150.
[8] Kleinman, D. L., Levison, W. H., and Baron, S., An optimal control model of human response, part I: theory and validation, Automatica, 6(3), 357-369, 1970.
[9] Kramer, U., On the application of fuzzy sets to the analysis of the system-driver-vehicle environment, Automatica, 21(1), 101-107, 1985.
[10] Krendel, E. S. and McRuer, D. T., A servomechanism approach to skill development, J. Franklin Inst., 269(1), 24-42, 1960.
[11] Kwakernaak, H. and Sivan, R., Linear Optimal Control Systems, John Wiley & Sons, New York, 1972.
[12] Liu, T. S. and Wu, J. C., A model for a rider-motorcycle system using fuzzy control, IEEE Trans. Syst., Man, Cybernetics, 23(1), 267-276, 1993.
[13] McRuer, D. T., Human dynamics in man-machine systems, Automatica, 16(3), 237-253, 1980.
[14] McRuer, D. T. and Krendel, E. S., Dynamic Response of Human Operators, Wright Air Development Center, WADC TR 56-524, 1957.
[15] McRuer, D. T., Graham, D., Krendel, E. S., and Reisener, W., Jr., Human pilot dynamics in compensatory systems, Air Force Flight Dynamics Lab., AFFDL-65-15, 1965.
[16] Nise, N., Control Systems Engineering, 1st ed., Benjamin/Cummings, Redwood City, CA, 1992, Chapter 11.
[17] Sheridan, T. B., Telerobotics, Automation, and Human Supervisory Control, MIT Press, Cambridge, MA, 1992, Chapter 1.
[18] Sutton, R. and Towill, D. R., Modelling the helmsman in a ship steering system using fuzzy sets, Proc. IFAC Conference on Man-Machine Systems: Analysis, Design and Evaluation, Oulu, Finland, 1988.
[19] Tustin, A., The nature of the operator's response in manual control and its implication for controller design, J. IEE, 94, IIa(2), 1947.
[20] Zadeh, L., Outline of a new approach to the analysis of
complex systems and decision processes, IEEE Trans. Syst., Man, Cybernetics, SMC-3(1), 28-44, 1973.
[21] Wewerinke, P. H., Models of the human observer and controller of a dynamic system, Ph.D. Thesis, University of Twente, The Netherlands, 1989.
Further Reading

An excellent summary of applications of human operator models to modeling human pilot behavior, circa 1974, is available in the AGARD report referenced herein, Mathematical Models of Human Pilot Behavior, by McRuer and Krendel.

The excellent text by Thomas Sheridan and William Ferrell, Man-Machine Systems, published by MIT Press in 1974, provides a broad view of the topics involved with human-machine interaction.

The 1987 Wiley publication Handbook of Human Factors [5] contains a chapter by the author entitled Feedback Control Models. This chapter covers some MIMO applications of the classical and modern (OCM) human operator models. A new edition of this Handbook is currently in press.

The book Modelling Human Operators in Control System Design, by Robert Sutton, published by Wiley in 1990, offers a very readable treatment of human-in-the-loop control systems. This book includes chapters devoted to modeling the human operator using fuzzy sets.
α(ω) contours, 188
α curves, 190
µ norm, 492
λ-tuning, 821
µ-synthesis, 1407
0-junction, 102
1 + jωT factors, 176
1-junction, 102
a/d conversion, 306 a/d converter, 307 A/F maldistribution, 1266 absolute value, 52 ac induction motors, 1430 ac synchronous motors, 1432 ac/ac converters, 1417 acceleration resolved control, 1355 accelerometer, 1173 acid, 1207 diprotic, 1208 Ackermann's formula, 213, 287 acsl, 420 active controller, 384 active learning, 1018, 1019 actuator, 1192, 1291 binary, 346, 347 control, 1304 monostable, 347 actuator dynamics, 1343 actuator inertia, 1342 actuator nonlinearities, 1321 actuator saturation, 75, 201 acyclic IO reachable decompositions, 789 adaptation, 1019 adaptive control, 524, 927, 1240, 1346, 1357, 1380 adaptive controllers, 384, 1277 adaptive decentralized control, 787 adaptive feedforward, 825 adaptive feedback linearization, 1347 adaptive fuzzy control, 1013 adaptive law, 853, 854 adaptive nonlinear control, 980 adaptive regulation, 853 adaptive techniques, 823 adaptive tracking, 854 addition boolean, 348 matrix, 34 additive modeling perturbations, 553 additive uncertainty, 534 adjoint equation, 454 adjoint system, 459, 768 adjugate-determinant formula, 482 admissible controls, 1148 admissible region, 773 admittance, 1358 admittance control, 1358 advance operator form, 10 affine function, 497 affine plants, 499 affine polynomials, 500, 501 affine uncertainty, 498
agent monoprotic, 1207 air-fuel ratio control, 1263 air-gas dynamics, 1467 Akaike's information theoretic criterion (AIC), 1039 algebra block diagram, 246 control Lie, 1363 Lie, 865, 971,972, 1363 matrix, 34 numerical linear, 403 algebraic Riccati equation, 597, 608, 764 algebraic view, 1060 algebraically equivalent, 455 algorithm, 1287, 1239 branch and bound, 755 closed-loop, 830 computer, 118 control, 1269, 1488 Euclidean, 484 feedback control, 1197 genetic, 994,997 noncausal design, 763 path planning, 1363 velocity, 201,203 algorithm spa, 1045 algorithmic interface, 435 aliasing effect, 1418 all pass filter, 80 almost sure asymptotic Lyapunov stability, 1106 almost sure exponential Lyapunov stability, 1106 almost sure Lyapunov stability, 1106 almost surely asymptotically stable, 1106 almost surely exponentially stable, 1106 a ( w ) contours, 188 a curves, 190 alternative parallel sequence, 357 alternator, 1454 ambient space, 896 amplifiers power, 1388 amplitudes quantized, 239 andog controllers, 358 analog filter, 1198 andog plant input filtering, 295 analog-to-digital converters (ald), 239 analysis bifurcation, 886 dynamic, 1294 frequency, 1043 frequency response, 513 maximum amplification direction, 513 mimo frequency response, 512 /*., 680 performance, 1247 residual, 1051 robust performance, 681 robustness, 499,671 sensitivity, 305 stability, 1383 step-response, 1043
THE CONTROL HANDBOOK stochastic, 307 structural, 783,1458 transient, 1043 trim, 1290 analytic constraints, 542 analytic part, 60 analytic properties, 542 analytical methods, 819 and, 347,348,350-352,354,355 and block (anb), 354 Andronov-Hopfbifurcation, 955 angle delay, 1416 flux slip, 1450 stator current vector, 1450 angular velocity, 1312 anholonomy, 967 annihilating polynomial, 37 anti-windup, 203,386 anticausalpart, 1086 anticipatory behavior, 769 apparatus experimental, 1008 approach KOO~-IOCUS, 841 state space, 72 approximate controllability,1143,1150,1165 approkimate multimodal decomposition, 1492 approximate observabiliv, 1172 approximately controllable, 1150,1152 approximation impulse-invariant, 272 polynomial, 266 ramp-invariant, 272 step-invariant, 272 universal, 1021 approximation methods, 276 approximator, 1021 function, 1020 arbitrary pole placement, 210 architectures, 1469 area dilation factor, 61 argument principle, 58 arithmetic fixed-point, 301,307 fixed-point binary, 301 floating-point, 302,310 floating-point binary, 301 logarithmic, 301 residue, 301 arma, 74 arma (autoregressivg moving-average) representation, 492 arma models, 490,1079 armax models, 1079 ARMAX-model, 1042 Arnold, 1121 Arnold, Crauel and Wihstutz, 1123 array Nyquist, 723 Routh-HuMTitz, 132 ant models, 1034 ARX-model, 1042 assumptions plant, 849 reference model, 849 asymptotic Iqr properties. 639 asymptotic Lyapunov stability, 1105,1106
asymptotic methods, 873 asymptoticproperties, 639 asymptotic sample length (asl), 835 asymptotic stability, 390,890,925 asymptotically p'h moment stable, 1106 asymptotically stable, 457, 1105 asynchronousexecution, 352 attenuation, 817 attitude control, 1313 attitude-hold configuration, 1325 attraction region of, 890 attractor, 952 strange, 957 audio-susceptibility, 1421 augmented potential, 975 auto spectrum, 1045 automatic control, 382 automatic reset, 199 automatic tuning, 823 automation systems, 359 autonomous control, 998 autonomous linear system,'909 autonomous systems, 82,874,926 autonomous trajectories, 919 autonomy, 1161 autoregressions, 1035 autoregressive (ar) process, 836 autoregressive moving average (ARMA) process, 1080, 1082-1084, 1087 autotuning, 369,827,1215 auxiliary conditions, 12 auxiliary equation, 5 average cost, 1103 average a s t problem, 1102 averaged momentum, 976 averaged potential, 967,973-977 averaged system, 874 averaging, 874,1417 generalized, 1422 averagingmethods, 875 axes, multiple, 1391 axis imaginary, 52 inertial, 1311, 1313 red, 52 badanking, 1212 backpropagation, 1025 backstepping, 980 integrator, 935 backward error, 402 backward error analysis, 402 backward partial differentialequations, 1068 backward recursion, 1099 backward shift operator, 1080 backward stability, 402 balance, harmonic, 964 balancing control, 1009 Banach space, 1170,1171 band reject filters, 1393 band-limited, 25 bang-bang control, 424,773 base frame, 1340 base space, 1363 bases, 1207
INDEX
basic logical gates, 347 basic slip control, 1448 basin, 890 basis, 38 orthogonal, 41 orthonormal, 41 P.Hall, 972 basis functions, 1048 basis influence functions, 1021 batch ph control, 1216 batch processes, 568 batch training, 1024 beam, Euler-Bernoulli, 1145, 1146 behavior anticipatory, 769 dynamical, 1458 minimum-phase, 1089 bending modulus, 1173 between-sample responlse, 294 bezoutian identity, 837 bias error, 1038 bifurcated solutions, 953 bifurcation, 951,952,958,960 Andronov-Hopf,955 control, 964 diagrams, 953 global, 953 local, 953 period doubling, 956 pitchfork, 955 saddle-node, 954,955 stationary, 954,955 subcritical,953 supercritical, 953, 954 bifurcation analysis, 886 bilateral Laplace transform, 18 bilateral z transform, 23 bilinear fractional transformations, 61 bilinear systems, 881 bilinear transformation, 61,267,268 binary actuators, 346,347 binary distillation colurnn, 570 binary point, 301 binary sensors, 346 biorthogonal functions, 1172 bispectrum, 1081 bistable unit, 347 Bistritz table, 149 BJ-model, 1042 black box modeling, 806 Blackman-Tukey approach, 1045 bleaching, 1221 blend chest dynamics, 1232 block and (anb), 354 complex full, 675 complex uncertainty, 676 full,673 gain, 86 in series, 86 integrator, 86 Jordan, 45 or (orb), 354 repeated scalar, 67'3 block decoupling, 799 block diagram, 85,93 architectures, 437
cacsd, 436 editors, 416,438 of feedback systems, 88 simulators, 419 block diagram algebra, 246 block jordan form, 250 block matrix, 37 block strict-feedback systems, 992 blockdiagonal perturbation matrix, 558 blocking code, 3'25 blocking state, 1414 bo (modulus optimum), 822 Bode, 173 bode gain-phase relation, 542 Rodeplots, 146,152,155,173-175 bode sensitivity integral, 543 Boeing, 420 boiler, 1454 bond graphs, 102 boolean addition, 348 boolean multiplication, 347 boost converter, 1415, 1417,1419, 1420, 1422 bottom-up design, 207 boundary conditions,760,1170 boundary controls, 1148 boundary layer system, 1346 boundary outputs, 1154,1155 boundary-layer model, 876 boundary-value problem, 61 bounded-input, bounded-output (bibo), 458 bounds disturbance, 705 optimal, 705,712,713 tracking, 705,713 Box-Jenkins-model, 1042 Brachistochrone problem, 772 branch and bound algorithms, 755 branch and bound methods, 755 branches, 93 break-before-make (bbm), 346 Breirnan's hinging hyperplane structure, 1049 Bresse system, 1142 Brockett, 1112 brownian motion, 1068,1069 buck converter, 1414 bumpless transfer, 381,386 cacsd, 429,431 block diagram tools, 436 consolidation, 432 critique, 433 packages, 431 state of the art, 431 CAD, 740 cancellation internal, 231 pole-zero, 1053 candidate operating condition, 952 canonical forms, 456,461 controllable, 127 observable, 126 capacitor, 106,423 cartesian force, 1354 cartesian manipulator velocities, 1353 cartesian space, 1353 cascade, 1239 cascade control, 207
THE CONTROL HANDBOOK cascaded forms, 93 cases, multi-inputlmulti-output(mimo), 121 Cauchy integral formulas, 58 Cauchy's estimate, 59 Cauchy's inequality, 52 Cauchy's theorem, 57 Cauchy-Riemann differential equations, 53 causal part, 1086 causal system, 1081 causality, computational, 101,421 causality condition, 9 causticizing area, 1221 Cayley-Hamilton theorem, 37 cellulose, 1220 centerline, 1139 chain form, 1364 chained systems, 972 Chang-Letov design approach for LQR, 757 chaos, 951,952,957,958,963,964 chaotic system, 957 characteiistic equation, 5,11,43,838 characteristic locus approach, the, 725 characteristic locus method, 722 charatteristic modes, 5,11 characteristic multipliers, 956 characteristic polynomial, 5,11,37, 115, 120, 125, 767 characteristic roots, 5,11 characteristic values, 5, 11 characteristic vector, 456,457 chart Nichols, 190 function, 358 synchronous, 352 chattering, 942,949 chemical pulping, 1221 Cholesky factorization, 44 Chow's theorem, 1363 Christoffel symbols, 1341 cin statement, 325,334,342 circle theorem, 903 circles, m , 188 circuit dc generator equivalent, 1439 exclusive or (xorj, 349 sequential, 352 ttl, 351 circular orbit, 1313 circulation loop dynamics, 1472 circulation loop equations, 1473 class, 332 class definition communicating, 1059 task, 334 class inheritance, 105,422 classical conic sector (circle) theorem, 903 classical control, 759 classical control structure, 2 16 classical method, 4, 14 classical passivity theorem, the, 898 classical small gain theorem, the, 897 classical solution, 4, 11 closed curve, 57 closed loop, 1148 closed loop system, 736 dosed-loop algorithm, 830 control, 830, 1099
control design, 1487 feedback control, 762 frequency response, 372 modes, 838 performance, 530 poles, 777,841 properties, 767 response, I40 return difference, 767 specifications, 541 stability, 760, 765, 1092 system, 744,764,765,924-926 time constant, 1235 tracking, 186 transfer functions, 88, 538, 541,692 closure, involutive, 1362 cmos (complementary metal-oxide semiconductor), 351 code blocking, 325 computer, 205 nonblocking, 325 scanned, 325 coefficient constant, 3 diffusion, 1158 dispersion, 1158 flux transfer, 1159 quantization, 304 unknown virtual control, 992 coeflicient of heat conduction, 1158 column m vectors, 33 combination linear, 96 parallel, 87 series, 86 combinatorial networks, 350 combustion, 1467 combustion control, 1471 command following, 514,667 command tracking, 1329 communicating classes, 1059 communication, 359 commuting matrices, 45 compact disc mechanism, 1405 companion matrix, 46,98 compensation, 1369, 1379 dynamic, 625 lag, 196 lead, 196 load power, 1445 model-based, 1379 reactive power, 1445 compensator, 1390 design, 371,1397,1399,1420,1476 dynamics, 770, 771 lead, 1393 pi, 1393 position loop, 1394 servo, 733 stabilizing, 735 velocity loop, 1393 compensatory tra&ng tasks, 1501 compiler, 333,430 complement, orthogonal, 41 comdementary controller, 734 complementary orthogonal projection, 41 complementary sensitivity functions, 216
complementary sensitivity integral, 545 complementary solution, 4,5, 11 complete equivalence elationsh ship, 126 complete operations, 349 completely controllable, 250 completely observable, 251 complex conjugation, 51 complex exponential input, 7 complex full blocks, 675 complex functions, 53 complex integrals, 57 complex matrices, 508, 509, 51 1 complex numbers, 51 algebra of, 51 conjugation, 51 geometric representation, 52 modulus, 51 purely imaginary, 51 complex plane, 52, 188 complex structured singular value, 673 complex systems, 207,664 complex uncertainty, 675 complex uncertainty blocks, 676 complex vectors, 508 complex-conjugate transpose, 509 complexity, 326 compliance frame, 1353 component controllability, 124 component observability, 124 components feedback, 896 forced, 242 natural, 242 steady-state, 242: transient, 242 zero-input response, 248 composite continuout;-time model plant, 295 composite control, 1345, 1346 composition estimation, 571 composition property, 454 computability, 664, 668 computable methods, 664 computational causality, 101,421 computationally intractable, 1099 computer algorithms, 118 computer code, 2,05 computer simulation, 763 computing, 1395 computing hardware, 430 condition auxiliary, 12 boundary, 760,1170 candidate operating, 952 controllability, 1,317 Dirichlet, 1159 dynamical boundary, 1141 geometric boundary, 1141 initial, 4, 10, 11'70 Lie algebra rank:, 978 Lipschitz, 889 matching, 932, '933,949 necessary?761 operating, 952 pseudogradient, 1285 recurrence, 1103 second-order nonholonomic, 1360 singular, 773
stationarity, 760, 761 strict feedback, 936 condition + action rules, 1004 condition number, 43 conditional expectation, 1070 conditional expectation operator, 806 conditional integration, 202 conditioned invariants, 472 conditioning, 401 numerical, 42 conditioning technique, 386 conducting state, 1414 configuration space, 1148, 1340, 1352 conformable partitions, 37 conformal map, 139 conformal mappings, 60 congruence transformation, 48 coniugate gradient direction, 1040 conjugation, complex, 5 1 connect statement, 106 conservation, 1159 conservation equation, 1159 conservatism, 498 constant, 175 closed-loop time, 1235 velocity error, 134 constant rn contour, 188 constant coefficients, 3 constant input, 7,13 constant lft, 677 constant matrix, 764 constant nominal equilibrium values, 1420 constant output feedback, 799 constant state-variable feedback, 764 constant voltage waveforms, 1413 constrained output feedback, 624 constrained-input design, 771,773 constraint analytic, 542 decentralized information structure, 780 final state, 760 hard, 810 magnitude, 773 overlapping information structure, 787 soft, 810 terminal, 845 constructive controllability, 971 constructor function, 338 content heat, 1158 continuity, 53 continuois conduction mode, 1418 continuous disconduction mode, 1418 continuous duty ratio, 1419 continuous systems, 55 continuous time, 582 continuous time linear systems, 1130 continuous-time feedback, 295 gaussian stochastic processes, 583 linear quadratic regulator, 762 linear systems, 585 optimal control, 760 processes, 1084 stochastic process, 586 systems, 607 state variable models, 273
THE CONTROL HANDBOOK systems, 273,451,775,1099 contour a(@),188 Nyquist, 136 contraction strict, 901 contraction mapping theorem, 49 control, 686, 750,861,1196,1392 acceleration resolved, 1355 adaptive, 384,824,827,1240,1346,1357, 1380 adaptive decentralized, 787 adaptive fuzzy, 1013 adaptive nonlinear, 980 admissible, 1148 admittance, 1358 air-fuel ratio, 1263 analog, 358 attitude, 1313 automatic, 382 autonomous, 998 balancing, 1009 bang-bang, 424,773 basic slip, 1448 batch ph, 1216 boundary, 1148 cascade, 207 classical, 759 closed-loop, 830, 1099 dosed-loop feedback, 762 combustion, 1471 composite, 1345, 1346 continuous-time optimal, 760 current-mode, 1421 damping, 1358 derivative, 199 design, 771 digital, 239 digitizing analog, 265 direct adaptive, 1381 dissipative, 1325 distributed, 1148 drum level, 1470,1475 dynamic dissipative, 1322 dynamic matrix, 224 energy-efficient, 1200 engine, 1261 equivalent, 944 exact, 1163 feedback, 769,775,1153,1266 feedforward, 208,769,1216,1273 feedforward ph, 1216 field-oriented, 1449 flight, 1287,1293,1295,1301 flow, 1196 fuzzy,995,1001,1002,1009 generalized minimum variance, 840,1093 generalized predictive, 224 generator excitation, 1484 generator system, 1440 Xrnand 'Hz, 653 high-gain servomechanism, 737 hybrid forcelposition, 1354 idle speed, 1263,1270 ignition, 1262 impedance, 1352,1357,1358 indirect adaptive, 1381 industrial, 358
inferential, 569 integral, 199,1377 intelligent, 994 internal model, 215,217,382 inverse damping, 1358 joint torque, 1378 learning, 1019,1350,1378 linear, 1214, 1232 linear feedback, 1145 linear multivariable, 408 local-loop, 1197 logical, 1199 long-range predictive, 842 loops, 763 Iqr, 637,639 Iti, 750, 752 manual, 382 mathematical software in, 413 mean-level, 844 mimo, 1240 mini-max, 649 full-state feedback, 648 mini-max and Xrn minimum-variance, 837, 1092 model-based, 1319 model predictive, 224 model predictive heuristic, 224 model reference, 849 multivariable, 1302 mv, 838 neural, 1017 nonlinear, 1214 nonlinear optimal, 971 observer-based,382 open loop optimal, 756 open-loop, 830,967 operational space, 1345 optimal, 595,602,809 optimal feedback, 762,775 oscillatory, 978 passivity-based, 1321 passivity based robust, 1348 passivity-based adaptive, 1348 pd, 1343 pendulum balancing, 1014 pid, 345 position, 1352 predictive, 807 pressurization, 1196 process, 369,825 programmable, 345,353 programmable logical (plc), 345 proportional, 198 proportional-integral-derivative (pid), 221 receding horizon, 810 repetitive, 1350 resolved acceleration, 1345 robust, 411,1346,1405 sampled-data,260 scheduled, 384 selector, 208 self-tuning, 845 siso, 1239 spacecraft attitude, 1303 stabilizing, 691 stand alone feedback, 324 static dissipative, 1321 steady-state and suboptimal, 764
INDEX stiffness, 1358 suboptimal, 765 supplemental damping, 1486 swing up, 1008 system dynamic!;, 770 temperature, 825, 1470 time-optimal, 1349 toilet, 1181 tracking, 1296,1297 ultra-high precision, 1386 velocity, 1377 velocity-resolved, 1356 vibrational, 968 virtual, 980 weighting, 762 control actuators, 1304 control algorithm, 1269, 1488 control canonical form, 456 control design, 681,1091, 1240, 1252, 1392, 1487 control effort, 166 control equipment, 12:29, 1232 control flow diagrams, 441 control horizon, 844 control input, 910,924,926 control law, 198,832,853,854,959, 1311 control Lie algebra, 1363 control Lyapunov functions, 932 control structure, 942 classical, 2 16 control system, 158,379,1327,1345,1454, 1469 control system design, 682, 1213,1345 control system design methods, 282 control system response, 315, 317 control system striicture, 1327 control technology, 1180 control weight matrix, 638 control weighting, 844 control, pid, 206 control-theoretic view (ofautonomy, the, 998 control-theory paradigm, 1497 control-to-state map, 1150 controllability, 121,122,250,407,447,458,462,868,871,1142, 1148, 1165,1171,1317,1363,1493,1494 approximate, 1143,1150,1165 constructive, 971 exact, 1143, 1148$1150 grammian, 122 matrix, 122, 124 rank, 122 spectral, 1148 controllability conditions, 1317 controllability gramian,.458 controllability index, 251 controllability matrix, 2:51 controllable, 458,971,978 approximately, 1150, 1152 completely, 250 exactly, 1150 uniformly, 458 uniformly n-step, 463 controllable canonical form, 76, 128,490 controllable form, 250 controllable state, 1.22 controlled autoregressive moving average (ARMAX) process, 1081 controlled invariants, 472 controlled Markov chain, 1100 controllers, 686,691,750, 1392
active, 384 adaptive, 384 analog, 358 complementary, 734 decentralized robust, 744 digitizing analog, 265 dissipative, 1325 drum level, 1471 dynamic dissipative, 1322 dynamic matrix (dmc), 807 extended self-adaptive (epsac), 808 extended-horizon adaptive (ehac), 808 feedback, 924 fpcc yaw pointing/laterd translation, 629 generalized predictive (gpc), 808 Xm and Hz,653 high-gain servomechanism, 738 internal model (IMC), 382 latent, 385 linear, 930,1214 lqg, 1320 lqr, 637,639 lti, 750, 752 minimax, 649 model algorithmic (mac), 808 model-based, 1319 multivariable three-term, 734 nonlinear, 1214 observer-based, 382 observer-based stabilizing, 735 one-degree-of-freedom tracking, 385 optimal, 760 optimal tracking, 769 passivity-based, 1321 perfect robust, 742 pid, 345 primary, 225 process, 825 programmable, 345,353 programmable logical (plc), 345 properties, 736 proportional-integral-derivative (pid), 221 robust servomechanism, 735 sampled-data, 260 scheduled, 384 self-tuning, 845, 1216 stabilizing, 691 static dissipative, 1321 state-space minimum-variance, 1093 strictly proper dissipative, 1323 supervisory fuzzy, 1012,1014 temperature, 825 variable structure, sliding-mode, 941 controller design, 685,750,754,946,985,1318,1332 controller synthesis, 1010 controller tuning, 382 controller tuning procedures, 1214 convergence, 1128 converse Lyapunov theorem, 921 converter, 1413 ald, 307 aclac, 1417 analog-to-digital (ald), 239 boost, 1415,1417,1419,1420,1422 buck, 1414 dc-link, 1417 digital-to-analog (&a), 240
THE CONTROL HANDBOOK high-frequency pwm dc/dc, 1414 phase-controlled, 1416 resonant, 1416 switched-mode, 1415 voltage step-down, 1414 voltage step-up, 1415 convex optimization, 749 convolution, discrete, 248 convolution integral, 8 convolution sum, 14,831 cooperative scheduling, 328 coordinate vector, 38 coordinates, 38 cyclic, 974 coordination motion, 1384 coprime, 130 coprime denominator polynomials, 289 coprime factor perturbations, 554 coprime factorization, 554,555 coprime fraction, 692, 905 coprime numerator polynomials, 289 coprime polynomial matrices, 123 coprime polynomials, 690 correlation analysis, 1044 cost average, 1103 criterion, 1103 discounted, 1103 infinite horizon total discounted, 1101 optimal, 762,775 optimal infinite horizon, 1101 costate, 761 equation, 760 costing range, 810,844 countable state space, 1103 countably infinite state, 1101 counters, 356 couplink bounds cross, 713 covariance function, 1044,1084 covariance matrix, 1036 coverage properties, 1023 cpu (central processing unit), 353,359 Cramer's rule, 35 credit assignment functions, 1030 criterion, 1086 Akaike's information theoretic, 1039 determinant, 149 discounted cost, 1102 frequency domain, 148 infinite horizon cost, 1103 Markov stability, 150 Nyquist stability, 55 Rissanen's minimum description length, 1039 robustness, 499 Routh-Hurwitz, 132 Schur-Cohn, 148 critical parameter value, 953 cross coupling bounds, 713 cross spectrum, 1045 cross-coupling, 712, 1391 cross-covariance function, 1044, 1083 cross-validation, 1038 crossover frequency, 768 crossover model, 1498, 1502 cubic nonlinearity, 364
current, 101 current-flow-causer model, 101 current-mode control, 1421 curse of dimensionality, 1023 CuNe a,190 closed, 57 lm, 178 M ,190 process titration, 1206 simple closed, 57 Stribeck, 1370 strong acid-strong base, 1207 switching, 774 titration, 1206 cycle kraft liquor, 1221 stable limit, 877 cyclic coordinates, 974 cyclic input, 1363 cyclic vector, 46 cycfoconverter, 1417 cylinder air charge, 1264 d-k iteration, 682 d-step ahead predictor, 1091 dampers, 1193 damping nonlinear, 985 damping control, 1358 Darcy's law, 1158 darma, 833 data estimation, 1039 integrity, 331 validation, 1039 data persistence, 436 dc gain, 770,829 dc generator, 1437,1440 dc generator equivalent circuit, 1439 dc machine generator, 1438 dc motor, 1427-1429 dc term, 832 dc-link converter, 1417 de Moivre's formula, 52 de morgan theorems, 348 dead band, 1207 dead zone, 365,369,845 deadband, 310 deadband effects, 3 10 deadbeat observer, 290 deadbeat response, 170, 171 deadtime, 1235,1236 deadtime compensation, 1239 deadzone nonlinearity, 1198 decentralized control problem, 780 decentralized information structure constraint, 780 decentralized inputs and outputs, 782 decentralized robust controller, 744 decentralized stabilization, 782 decentrally stabilizablestructures, 783 decibel (db), 174 decomposition acyclic I 0 reachable, 789 approximate multimodal, 1492 disjoint, 781 graph-theoretic, 788
INDEX Ibt, 789 nested epsilon, 590 overlapping, 791 singular value, 404 decoupling, 476 block, 749 diagonal, 795 disturbance, 476 nonsquare syste,ms, 800 static, 802 triangular, 802 define statement, 338 definite symmetric matrix, 891 defuzzification, 1007 defuzzificationinterface, 1004 degre: fractional, 486, 187 Macmillan, 481,486 delay angle, 1416 delay operator form, LO delays, time, 225 delta-operator polynomials, 150 density, mass, 1158 dependent loop level, 1457 derivation, physical, 1158 derivative, 53 Lie, 880 partial, 1277 derivative control, 199 derivatives, 844 describing function, 364 sinusoidal, 363 describing funct~onmethod, 363 descriptions fractional, 690 design dosed-loop corltrol, 1487 compensator, 1397,1399,1420,1476 constrained-input, 773 control, 681,771,1091,1240,1392,1487 control system, 682,1213, 1345 controller, 685,750,754,946,985,1318,1332 digital controller, 776 equations, 761 experiment, 1036,1050 feedback, 1273 feedforwad control, 1273 filter, 1336 imc, 2 18 Kalman filter, 1332 linear quadratic, 260 linear quadratic regulator, 1332 Iq regulator, 1335 lqr, 762,770 Lyapunov, 932 modular, 981, 085 nonadaptive, 980 numerical optimization-based, 749 observer, 411,049 one-loop-at-a-time siso, 759 optimal lq, 776 outer loop, 762. output feedback, 989,990 output-feedback modular, 990 recursive, 984 reference model tracking system, 293 response rnodd, 292
siso, 759 steady-state process, 1222 step-invariant discrete-time observer, 290 switching-surface,944 tracker, 770 tracking system, 291,292 vessel, 121X design cycle, 324 design formalism, 326 design model, 1332, 1334 design polynomials, 841 design trade-offs, 669 desire, 421 destabilizing, 1378 detectability, 765,925, 1165, 1171 detectable, 926 detection, 565, 567 determinant, 35,129 properties, 35 determinant criterion, 149 deterministic system, 1099 deviation system, 770 deviations, 770,832 devices discrete-state, 1199 modulating, 1200 programmable logic (pld), 351 diagonal coupling zeros, 797 diagonal decoupling, 795 internal stability, 796 diagonal form, 250 diagonalization, 44 diagram bifurcation, 953 block, 85,93 control flow, 441 grafcet, 356 ladder, 351 lm, 175 Im angle, 185, 190 log magnitude-angle, 175 Nyquist, 137, 140 object, 421,422 phase, 175 signal flow, 437 state-transition, 437 vector, 1438 diagram blocks, 85 gain, 86 summer, 85 diamond turning machine, 1395 diffeomorphism global, 910 local, 910 difference equation, 9,29,66, 146,241,777 difference-equation models, 831 differential equation, 3,28 auxiliary conditions, 3,4 backward partial, 1068 Cauchy-Riemann, 53 Fokker-Planck partial, 1068 initial conditions, 4 Kolmogorov forward partial, 1068 linear, 880 linear ordinary, 409 matrix Riccati, 595, 604 nonlinear partial, 1153
THE CONTROL HANDBOOK numerical approxima:ion, 265 ordinary, 3,66,74 partial, 3, 1170 Riccati, 596 scalar, 115 signal-flow graphs, 96 stochastic, 1101 differential equations, 28 differential expression, 3 differential operator, 125 differential operator controllability, 125, 126 differential operator form, 125 differential operator(observability, 125,128 differential sensitivitv. ,. 539 differentiation, numerical, 271 diffusion coefficient, 1158 digester house, 1221 digital control, 239 digital control system, 239 digital controller design, 776 digital filter, 265, 1198 digital processing, 307 digital sensors, 346 digit;? simulation, 431 digital-to-analog converters (dla), 240 digitizing analog controllers, 265 dilation, 61, 1048 dimension, 38 diodes Schottky, 351 diophantine equation, 777,809 diophantine identity, 837,838,841,842 diprotic acid, 1208 direct adaptive control, 1381 direct criteria optimization, 822 direct feed matrix, 454 direct field orientation, 1450 direct mrac, 850 with normalization, 856 with normalized adaptive laws, 851 without normalization, 850,855 direct plot, 188 direct polar plot, 179, 181,186, 189 direction conjugate gradient, 1040 Gauss-Newton, 1040 gradient, 1040 Levenberg-Marquard, 1040 Newton, 1040 Dirichlet condition, 1159 Dirichlet series, 1166 Dirichlet's problem, 61 discontinuity, 424,942 discount factor, 1101 discounted cost, 1103 cost criterion, 1102 cost problem, 1101 discrete convolution, 248 discrete Hermite-Bieler theorem, 148 discrete inverted pendulum dynamics, 775 discrete systems, 55,742,788 discrete time, 575, 1034 discrete time Fourier transform, 22 discrete-frequency response, 247 discrete-state devices, 1199 discrete-time
algebraic Riccati equation, 775 gaussian stochastic processes, 577 LQR, 775 systems, 61 1, 775 discrete-time algebraic Riccati equation, 611 discrete-time fourier transform (dtft), 253 discrete-time linear systems, single-input single-output, 57 discrete-time models, 273,831 discrete-time signals, 239,240 discrete-time stability tests, 146 discrete-time state equations, 247 discrete-time stochastic processes, 581 discrete-time system, 146, 147, 150,239,459 disease, 1185 disjoint decompositions, 781 dispersion coefficient, 1158 displacement, presliding, 1371, 1380 displays graphical, 430 dissipative controllers, 1325 distillation column, 742 distinguishability, 1169, 1170 distinguishable, 1171 distributed controls, 1148 distributed-parameter system, 1169 distribution involutive, 1361 steady state, 1102 disturbance, 832 exogenous inputs, 666 external, 3 17 matched, 949 rejection problem, 666 self-induced torque, 1330 disturbance attenuation, 692 disturbance bounds, 705 disturbance decoupling, 476 disturbance elimination, 832 disturbance localization, 476-478 dynamic compensator, 478 stability, 477 disturbance model, 703,1090 disturbance rejection, 164,514 disturbance response, 1478 disturbance specifications, 712 disturbancesltracking poles, 732 dither, 1378 divergencetheorem, 1165 divisibility check, 485 division uncertainty, 534 divisors elementary, 485 domain frequency, 676 time, 110 domain of attraction, 890 domain performance frequency, 161 double integrator, 1344 Doyle-Stein condition, 612 drift, 1080 drift-free (class i) systems, 971 drift-free system, 971 driftless systems, 1360 drive trains, 1388 drum dynamics, 1472 drum level control, 1470,1475
drum level controller, 1471 drum level dynamics, 1463 dsblock, 426 dt, 233 dtc, Smith predictor, 225 dual control system, 1117 dual space, 39 duality, 251, 471 duct static pressure, 1203 duty ratio, 1414 dymola, 105, 421 dynamic analysis, 1294 dynamic compensation, 625 dynamic dissipative controller, 1322 dynamic elastic beam models, 1139 dynamic elastic plate models, 1146 dynamic friction, 1376 dynamic friction model, 1376 dynamic matrix control, 224 dynamic matrix controller (dmc), 807 dynamic model, 1428 dynamic modeling, 1413 dynamic output feedback, 781 dynamic plates, 1148, 1153 dynamic programming, 1099 dynamic truth-tables, 437 dynamical behavior, 1458 dynamical boundary conditions, 1141 dynamical form, 235 dynamical nonlinear system, 924 dynamical system, 895, 952 dynamics, 1341 actuator, 1343 air-gas, 1467 blend chest, 1232 circulation loop, 1472 compensator, 770, 771 control system, 770 discrete inverted pendulum, 775 drum, 1472 drum level, 1463 equilibria & perturbation, 1474 equivalent system, 944, 945 f-18 harv lateral directional, 632 f-18 harv linearized lateral, 623, 625, 626 fpcc lateral directional, 633 inverse, 1344 joint space inverse, 1344 lagrangian, 1341 linear time-varying, 393 nonlinear, 1290 plant, 944 pressure, 1459 pulp and paper process, 1234 reduced-order linear, 945 reheat steamside, 1464 spacecraft rotational, 1306 superheat steamside, 1464 system, 390 task space inverse, 1345 time-invariant, 765 unmodeled, 845 zero, 918, 922, 1474 dynasim ab, 421
earth observing system (EOS), 664 easy-5, 420
economizer cooling, 1200 edge, exposed, 499 edge theorem, 495 effect aliasing, 1418 loading, 86 nonlinear, 1052 efficiency, high, 1413 eigenspace, 43 maximal, 44 eigenstructure, 621 eigenstructure assignment, 621, 624, 625, 627, 629 eigenstructure LQR design, 766 eigenvalue, 43, 287, 288, 291, 404, 450, 752 eigenvalue/eigenvector derivative method, 624 eigenvalues, 5, 11, 43, 44, 118, 123, 125, 249, 445, 475, 742 eigenvectors, 43-45, 445, 450 elastic beam equation, 1171 elasticity, joint, 1343 electric generator, 1454 electric power generating plants, 1453 electrical demand limiting, 1201 electrical generators, 1437 electrodes, 1206 electromechanical relay, 347 electronics, power, 1413 elementary divisors, 485 elements resistive source (rs), 104 snubber, 1417 encirclement, 57 end-effector frame, 1340 end-effector space, 1340 energy, 102 kinetic, 1149, 1341 potential, 1341 strain, 1149 energy methods, 973 energy modeling, 102 energy norm, 1154 energy-efficient control, 1200 engine control, 1261 engine models, 1271 envelope, operating, 960 envelope function, 553 environment rigid, 1352, 1355 soft, 1353, 1355 environment forces, 1343 environments, exogenous input, 1330 equation adjoint, 454 algebraic Riccati, 597, 608, 764 auxiliary, 5 Cauchy-Riemann differential, 53 characteristic, 5, 11, 43, 838 circulation loop, 1473 complementary, 4 conservation, 1159 costate, 760 design, 761 difference, 9, 29, 66, 146, 241, 777 differential, 3, 28, 96, 115, 265 diophantine, 777, 809 discrete-time algebraic Riccati, 611, 775 discrete-time state, 247
THE CONTROL HANDBOOK elastic beam, 1171 Euler, 761, 1289 explicit matrix design, 762 filter algebraic Riccati (fare), 592 fixed-point, 49 Fokker-Planck, 1076 force, 1289 Hamilton's, 762 Hamilton-Jacobi-Bellman, 1100 heat, 1157, 1158,1161,1167 homogeneous, 4 homogeneous difference, 242 Instantaneous, 426 Kalman filter, 591 kinematic, 1289 KoImogorov's backward, 1076 Kolmogorov's forward, 1076 Lagrange, 761,1345 Laplace, 53,61 linear, 42 linear algebraic, 403 linear difference, 1034 linear differential, 880 Lyapunov, 409,891 Lyapunov matrix, 592 matrix, 49 matrix design, 760, 763, 764 matrix Riccati, 762 matrix R~ccatialgebraic, 604 matrix Riccati differential, 595,604 moment, 1289 nonlinear partial differential, 1153 nonlinear spacecraft, 1312 nonrecursive difference, 10 normal, 42 ordinary differential, 74 output, 454 plant, 843 partial differential, 1170 Poisson's, 61 power continuity, 102 recursive difference, 10 Riccati, 409,595,762,769 state, 96,249,454,760 steady-state, 1160 stochastic differential, 1072, 1101 Sylvester, 409 system, 1352 trim, 1291 unforced rotational link, 390 wave, 1157,1167,1171 equation, complementary, 4 equation, homogeneous, 4 equations of motion, 1140,1141,1323 equilibria & perturbation dynamics, 1474 equilibrium, 390 equilibrium manifold, 942,946 equilibrium point, 81,889,952 equipment control, 1229,1232 power transmission, 1486 equipotential line, 62 equivalence topological, 456 equivalence relationship, 127 equivalent control, 944 equivalent state-space representation, 125, 127
equivalent system dynamics, 944,945 ergodic theorems, 1063 ergodicity, 577,584 error backward, 402 bias, 1038 estimation, 988 input, 402 integral absolute (iae), 218 integral square (ise), 218 latency, 328 modeling, 523 prediction, 830,833 quantization, 302,307, 319 tracking, 768, 923 tuned, 1284 variance, 1038 worst case (wce), 218 error analysis, 402 error feedback, 924 error feedback output regulation, 925 error matrix, 402 error system, 982 error variable, 924 error variance (ev), 218 essential singularity, 55, 60 estimate, Cauchy's, 59 estimation, optimal, 1395 estimation data, 1039 estimation error, 988 estimator, 832 Is, 838 recursive, 833 Euclidean algorithm, 484 Euclidean norm, 40,457,508 Euler's backward method, 266 Euler's equations, 761, 1289 Euler's forward method, 266 Euler-Bernoulli beam, 1145,1146 Euler-Bernoulli beam model, 1142 Euler-Bernoulli hypothesis, 1140 evaporators, multiple effect, 1221 event task, 335 exact control, 1163 exact controllability, 1143,1148, 1150 exact controllability problem, 1152 exact linearization, 913 exactly controllable, 1150 exciter, 1442 rotating, 1443 static, 1443 exciterfgenerator protection, 1445 exclusive or (xor) circuit, 349 execution, asynchronous, 352 execution shell, 324 exhaust gas recirculation (EGR), 1262 exit functions, 328 exogenous input, 926,930 exogenous input environments, 1330 exogenous input vafiables, 924 exosystem, 552,924,925 expansion Fliess, 882 functional, 880 Laplace, 35 Laurent, 59 expectation, conditional, 1070
expectation operator, 1090, 1094 experiment design, 1036, 1050 experimental apparatus, 1008 expert systems, 994, 996 explicit matrix design equations, 762 explorer i, 1307 exponent, 302 exponential, unit complex, 512 exponential input, 6, 121 exponential Lyapunov stability, 1105 exponential system, 465 exponentially stable, 1105 exponentially stable equilibrium point, 874 exposed edge, 499 expression, differential, 3 extended Kalman filter, 617 extended self-adaptive controller (epsac), 808 extended separation principle, 616 extended state, 478 extended-horizon adaptive controller (ehac), 808 extern, 334 external disturbances, 317 external stabilizability, 475, 476 externs, 332 f-18 harv lateral directional dynamics, 632 f-18 harv linearized lateral dynamics, 623, 625, 626 f-invariant, 39 factors 1/jωτ, 176 area dilation, 61 discount, 1101 forgetting, 834 invariant, 46 jω, 176 quadratic, 177 variable forgetting, 835 factor space, 470 factorization Cholesky, 44 coprime, 554, 555 normalized coprime, 554 plu, 35, 36 qr, 41 spectral, 1081 stable coprime, 490 stable right coprime, 492 fans, 1195 fare (filter algebraic Riccati equation), 592 fast tool servos, 1390 feasible control signals, 1297 feedback, 295, 297, 978 constant output, 799 constant state-variable, 764 constrained output, 624 continuous-time, 295 control, 769, 775 control system, 762 dynamic output, 781 error, 924 full state-variable, 760 local direct state, 961 local dynamic state, 962 loops, 760 nondifferentiable, 392 observer, 291 output, 288, 621, 768
possible, 1038 state, 210,287 static decentralized state, 781 steam flow, 1477 time-varying state-variable, 762, 764 feedback compensation, 289 feedback components, 896 feedback control, 1153,1266 feedback control algorithms, 1197 feedback control law, 464 feedback control system, 180 feedback controller, 924 feedback design, 1273 feedback forms, 93 feedback linearization, 423,424,909,911, 1344 feedback loop, 369,370 feedback stabilization, 893 feedback system unity, 189 feedback system performance specification, 540 feedback systems, 88,538,690,827 feedback transfer function, 88 feedforward, 208,1239 control, 769 loops, 770 system, 769 feedforward control, 208,1216,1273 feedforward control design, 1273 feedforward ph control, 1216 feedforward schemes, 1418 feedforward sp, 230 feedforwardlfeedbackstructure, 959 Feng and Loparo, 1124 fets (field-effecttransistors), 351 fiber, 1220 Fick's law, 1158 fictitious binary point, 301 field, vector, 1361 field orientation, 1425, 1450 field-oriented control, 1449 field-programmable logic arrays (fpla), 351 fieldbus, 359 fields, 33 vector, 862 filter all pass, 80 analog, 1198 band reject, 1393 digital, 265, 1198 inverse, 292 Kalman, 593 noise spike, 1198 notch, 197 optimal, 1086 optimal realizable, 1086 pseudosensitivity, 1277 realizable Wiener, 1087 reference input, 292 sensitivity, 2278 unrealizable, 1087 unrealizable Wiener, 1086 whitening, 1045 Wiener, 1086 filter algebraic Riccati equation (fare), 592 filter design, 1336 filter tuning, 382 filtering, 595,602, 1395 analog plant input, 295
THE CONTROL HANDBOOK higher rate digital, 297 input-signal, 1488 filtration, 1127 final state constraint, 760 final state weighting, 762 final-state weighting function, 760 finite control set, 1099,1103 control system, 1100,1102 horizon, 1099,1100 time horizon, 1099 finite dimensional linear transformations, 510 finite energy pair, 1149 finite energy solution, 1149,1152 finite energy space, 1144,1149 finite gain, 896 finite impulse response, 696 finite-dimensional,452 finite-dimensional continuous-time system, 952 finite-impulse models, 806 finite-state machine, 326 first order subsystems, 309 fixed point, 952 fixed-point arithmetic, 301,30 fixed-point binary arithmetic, 301 fixed-point equation, 49 fixed-point number, 301 flattened Lyapunov function, 939 flexibility, 664,668 flexible space structures, 1316 Fliess expansion, 882 Fliess fundamental formula, 882 Fliess generating power series, 882 Fliess series, 882 Fliess series expansions, 879 flight control, 1287, 1293,1295,1301 flight mechanics, 1288 flip-flop, 350 floating-point arithmetic, 302,310 floating-point binary arithmetic, 301 floating-point number, 301 flop, set-reset (sr) flip, 350 flow control, 1196 flow graphs, signal, 93 flowcharts, 437 flushing, 1183 flux slip angle, 1450 flux transfer coefficient, 1159 Fokker-Planck equation, 1076 Fokker-Planck partial differentialequations, 1068 folding frequency, 270 force cartesian, 1354 environment, 1343 hybrid, 1352 force equations, 1289 force of constraint, 1354 force of motion, 1354 force-controlledsubspace, 1353 forced components, 242 forced linear time-invariant systems, 446 forced response, 4,5,115,831,842 forced systems, 446 forgetting multiparameter models, 835 forgetting factors, 834 form
advance operator, 10 block jordan, 250 canonical, 456,461 cascaded, 93 chain, 1364 control canonical, 456 controllable, 250 controllablecanonical, 76,128,490 delay operator, 10 diagonal, 250 differentialoperator, 125 dynamical, 235 feedback, 93 hamiltonian, 975,976 imc filter, 219 integral, 235 ISA (Instrument Society of America), 200 Jordan, 45 linear, 389 linear multiplicative, 970 linear-in-the-parameters (litp), 833 modal canonical, 121 moving-average (ma), 836 nondynarnical, 236 normal, 913,927 observable, 250 observable canonical, 76, 126 observer canonical, 453 output-feedback, 989 parallel, 93,200,303 parametric strict-feedback,984 polynomial, 776 quadratic, 47,939 rational canonical, 46 reduced, 55 regular, 944 series, 200,203 Smith, 484 Smith canonical, 36 Smith-MacmiUan, 490,491 standard, 200,875 vector additive, 970 form input, 435 formality, 326 formula Ackermann's, 287 adjugate-determinant, 482 Cauchy integral, 58 de Moivre's, 52 Fliess fundamental, 882 Hadamard's, 56 heaviside's expansion, 55 inversion, 59 forward error analysis, 402 forward transfer function, 88 forward-path gains, 95 Fourier transform, 17 numerical computation, 27 fourier transform (ft), 253, 1043 Fourier's law, 1158 fpcc lateral directional dynamics, 633 fpcc yaw pointingllateral translation controller, 629 fpla, 351 fraction coprime, 692 matrix, 776 polynomial matrix, 482
INDEX fractional linear, 677 fractional dead time, 832 fractional degree, 486,487 fractional descriptions, 690 fractional transformations, 61 fractions, coprime, 905 frame base, 1340 compliance, 1353 end-effector, 1340 task, 1340, 1353 world, 1340 Francesco, Jacopo (Count Riccati), 595 free response, 831,842 Freeman, 938 frequency crossover, 768 folding, 270 gain crossover, 155,286 gain-margin, 184 natural, 5 phase crossover, 152,286 phase-margin, 184 resonant, 186 frequency analysis, 1043, 1044 frequency and time responses, 190 frequency content, 1041 frequency domain, 188 418,676 frequency domain criterion, 148 frequency domain equality, 592 frequency domain methods, 4,286 frequency domain performance, 161 frequency prewarping, 268 frequency resolution, 1045 frequency response, 180,408,508, 1043, 1045 frequency response analysis, 513 frequency transfer function, 175 frequency weights, 516 frequency-domain mocleling, 1498 frequency-domain robust performance tests, 678 frequency-response curves, 173 friction, 1342, 1375 dynamic, 1376 off-line, 1375 position-dependent, 1575,1380 rlsing static, 1370 static, 1380 Stribeck, 1370 friction compensation, 1376 friction materials, 1370 friction modeling, 1369, 1370 friction parameters, 1373 frictional memory, 1371,1380 Frobenius norm, 40 Frobenius' theorem, 915 ftime function, 3338 full blocks, 673 full state-variable feedback, 760 full-order observers, 290 function affine, 497 basis, 1048 basis influence, 1021 biorthogonal, 1172 dosed-loop transfer, 88,538,541,692 complementary stsnsitivity,216
complex, 53 constructor, 338 control Lyapunov, 932 covariance, 1044 cross-covariance, 1044 credit assignment, 1030 describing, 364 envelope, 553 exit, 328 feedback transfer, 88 final-state weighting, 760 forward transfer, 88 ftime, 333 gain, 895 generalized predictive control cost, 844 hamiltonian, 761 harmonic, 53 hinge, 1049 holomorphic analytic, 53 homogeneous scalar quadratic, 47 Hurwitz-stable rational, 690 impulse response, 452, 1162 inner loop transfer, 1493 inverse describing, 1379 irrational, 2 Y linear, 38,40 Lipschitz, 889 localized basis influence, 1021 logarithm, 57 loop transmission, 715 Lyapunov, 890,932 main, 334 matrix sign, 605 matrix-valued, 48 membership, 1021 meromorphic, 56 multiaffine, 497 multiple-valued complex, 53 multivariable, 47 nonlinear, 47 norm, 895 open-loop transfer, 88, 181 orthogonal, 41 plant transfer, 838 polynomial, 47,497 positive definite, 921 prefilter, 715 proper, 921 proper rational, 20 proper rational matrix, 481 proper separation, 901 public, 335 quadratic Lyapunov, 934 radial basis, 1021 rational, 20,55 rational transfer, 689 regular analytic, 53 scalar exponential, 1174 Schur-Stable rational, 690 sensitivity, 254, 1277 signum, 773 simulation, 341 single-valued complex, 53 spectral density, 579 splinelike, 1021 stabilizing, 980 switching, 773, 1418
THE CONTROL HANDBOOK transfer, 55,70,75,86,93,96, 147, 179,242,407,827,832,919 transfer feedback, 88 transition, 1075 tuning, 981,982,990 vector Liapunov, 784,785 virtual, 334 Z-transfer, 242 (unction approximators, 1020 function charts, 358 function exchange method, 332 functional expansions, 880 functionals, linear, 39 fundamental, local, 1422 fundamental theorem, 51 fundamental theorem of algebra, 54 fundamental transformations, 61 fuzzification, 1005 fuzzification interface, 1004 fuzzy control, 995,1001,1002,1009 fuzzy control models, 1500, 1503 fuzzy logic, 1004 fuzzy models, 1049 fuzzy sets, 1004 &zy systems, 994 g.a. & t.m. Korn, 421 gain dc, 770,829 finite, 896 forward-path, 95 high, 640 high-frequency, 386 high servo, 1377 Kalman filter, 592 loop, 95, 767 low, 640 margin, 768 multi-loop feedback, 764 nonlinear small, 901 nontouching-loop, 95 optimal feedback, 762,769,775 peak-to-peak, 666 unknown time-varying, 829 gain and phase margins, 162 gain block, 86 gain crossover frequency, 155,286 gain function, 895 gain margin, 152, 184,286,528,739 gain scheduling, 393,824 gain suppression, 621 gain-margin frequency, 184 gain-scheduling, 388,393 gain-suppression structure, 624 gap metric, 905 gate, 348-349 Gauss' mean value theorem, 59 Gauss-Newton direction, 1040 gaussian, 1081 gaussian noise, 589 gaussian optimal filtering problem, 596 gaussian white noise, 1090 gaussian white noise sequence, 807 general conic regions, 903 general linear feedback loops, 676 general linear group, 34 general sde, 1075 generalized averaging, 1422
generalized minimum variance control, 1093 generalized minimum-variance (gmv) control, 840 generalized Nyquist's stability criterion, 182 generalized predictive control, 224, 844 generalized predictive control (gpc) cost function, 844 generalized predictive controller (gpc), 808 generalized state-space models, 1423 generalized structured plant uncertainty, 557 generalized unstructured perturbations, 556 generator dc, 1437, 1440 dc machine, 1438 electric, 1454 electrical, 1437 induction, 1446, 1448 steam, 1454 synchronous, 1440 wound-field synchronous, 1442 generator excitation control, 1484 generator excitation systems, 1485 generator system control, 1440 generator voltage, 1440 generic loop, 1366 genetic algorithms, 994, 997 geometric boundary conditions, 1141 Geometric Control Theory, 872 geometric series, 56 geometrical optics, 1173 geometry higher dimensional, 1167 value set, 501 global approximation structure, 1019 global asymptotic stability, 890 global bifurcations, 953 global diffeomorphism, 910 global rotations, 1140 global scope, 420 global stabilization, 921 gpc, unconstrained, 809 gradient, 1025 gradient descent, 1024 gradient direction, 1040 grafcet (graphe de commande etape-transition), 356-357 gram matrix, 42 Gram-Schmidt orthogonalization, 41 Gram-Schmidt procedure, 41 grammian controllability, 122, 458 observability, 123, 458 graph bond, 102 inverse, 896 signal-flow, 93, 96 graph separation theorem, 896 graph-theoretic decompositions, 788 graphical displays, 430 graphical input, 436 graphical programming, 1203 Green's operator, 1162 groundwood mills, 1221 group general linear, 34 special euclidean, 1340 special orthogonal, 1340 growth, linear, 934 guaranteed closed-loop stability, 770 guaranteed robust properties, 593
guaranteed robustness properties, 762 guaranteed stability, 592, 768, 770 guideposts, 188 gyrator (gy), 103
H∞ controllers, 653 H∞ synthesis methods, 1320 H∞ norm, 652 H2
Hadamard's formula, 56 Hamilton's equations of motion, 762 Hamilton's principle, 1140 Hamilton-Jacobi-Bellman equation, 1100 hamiltonian, 760 hamiltonian control systems, 974 hamiltonian form, 975, 976 hamiltonian function, 761 hamiltonian property, 598 hamiltonian system, 761, 766 hamiltonian viewpoint, 975 hamming window, 1045 hard constraints, 810 hardware, 1395 computing, 430 hardware modification, 1377 harmonic balance, 964 harmonic functions, 53 harmonics, local, 1422 Has'minskii, 1109, 1115 header file, 334 health, 1186 heat conduction process, 1171 heat content, 1158 heat equation, 1157, 1158, 1161, 1167 heat exchangers, 1192, 1464 heat flow, 101 heat flux, 1158 heaviside's expansion formula, 55 heavyweight threads, 326 hemicellulose, 1220 hermitian, 509 hermitian complex-valued matrix, 509 hermitian symmetric, 36 hermitian transpose, 36 HGSC, 738 robustness properties, 738 hierarchical control schemes, 763 hierarchical modeling, 105, 416, 1423 high efficiency, 1413 high gain, 640 high servo gains, 1377 high-frequency gain, 386 high-frequency pwm dc/dc converters, 1414 high-frequency pwm inverter, 1416 high-frequency pwm rectifier, 1415 high-gain servomechanism control, 737 high-gain servomechanism controller (HGSC), 738 high-level languages, 430 high-order coefficient matrix, 488 high-voltage dc (hvdc), 1490 high-voltage dc (hvdc) bulk-power transmission systems, 1413 higher dimensional geometries, 1167 higher index models, 423 higher order holds, 297 higher rate digital filtering, 297 Hilbert norm, 1151 Hilbert space, 1170 Hilbert uniqueness method, 1151
hinge functions, 1049 hinging hyperplanes, 1049 historical approaches to systems, 1187 holomorphic analytic functions, 53 homogeneous, 1073 homogeneous difference equation, 242 homogeneous scalar quadratic functions, 47 horizon control, 844 finite, 1099, 1100 finite time, 1099 lower costing, 844 upper costing, 844 hovering motions, 978 human operator, 1498 Hurwitz, 891 Hurwitz matrix, 874,891 Hurwitz-stable rational functions, 690 hybrid force, 1352 hybrid forcelposition control, 1354 hygiene, 1186 hyperbolic systems, 1142, 1145 hypercube, 773 hypermedia, 431 hyperplanes, hinging, 1049 hyperstatic situation, 1352 hypothesis, Euler-Bernoulli, 1140 hysteresis, 365 ICAFC, 1269 ics, 351 idcom, 224 ideal sliding mode, 942 ideal switch, 1414 ideal tracking, 292 identifier passive, 987 swapping, 988 identity bezoutian, 837 diophantine, 837,838,841,842 predictor, 1085, 1087 identity matrix, 34 idle speed control, 1263, 1270 IEEE, 422 ignition control, 1262 image, 39 imaginary axis, 52 imaginary number, 51 imc, 821 imc design, 218 imc filter forms, 219 imc structure, 217 impedance, 1358 impedance control, 1352,1357, 1358 impulse response, 68,1043-1045 impulse response function, 452 impulse-invariant appoximation, 272 impulse-response function, 1162 incidence matrix, 1059 include, 334 incremental heat capacity, 1158 incremental training, 1025 incrementally stable, 903 independence, linear, 38 independent additive perturbations, 557 independent, identically distributed (iid), 1058
THE CONTROL HANDBOOK index controllability, 251 invariant dynamical, 490 ise, 170 itae, 170 Kronecker controllability,490 performance, 169,760,762,769,775,1090 step, 240 indirect adaptive control, 1381 indirect field orientation, 1450 indirect mrac, 852,856 indoor toilet, 1186 induced euclidean norm, 40 induction generator, 1448 induction generator model, 1447 induction generators, 1446 industrial control, 358 inequality cauchy's, 52 linear matrix (Irni), 675,752 Schwarz, 41 triangle, 39,52 inertia matrix, 1341 inertid alris, 1311,1313 infeasibilities,811 infeasible control signals, 1297 inference mechanism, 1004,1006 inferential control, 569 infinite control set u,1099 horizon cost criterion, 1103 horizon total discounted cost, 1101 infinite gain margin, 768 infinite horizon optimal control problem, 603 infinite horizon performance index, 764 infinite-dimensional lti systems, 507 infinite-horizonlq, 845 infinitesimal strains, 1140 inheritance, class, 422 inhomogeneous Neumann condition, 1159 initial conditions, 4,10,1170 initial state, 1170 inner loop, 1494 inner loop transfer functions, 1493 inner looplouter loop architecture, 1344 inner product spaces, 41 inner products, 39,40,898 innovations representation, 1086 innovations variance, 1038 innovationsvector, 591 input constant, 7, 13 control, 910,924,926 cyclic, 1363 exogenous, 926,930 exponential, 6,12 exponential complex, 7 form, 435 graphical, 436 menu driven, 435 oscillatory, 967,968 reference, 281 sinusoidal, 7,13,109 unit impulse, 711 input and output strictly passive, 898 input error, 402 input of the plant, 523
input signal, 1050, 1488 input sinusoids, 513 input spectra, 1038 input strictly passive, 898 input-output feedback linearization, 917 input-output linearizability, 1431 input-output linearization, 424, 1425 input-output models, 65 input-output stability, 895 input-output stable, 242 input-signal filtering, 1488 instability, practical, 227 instantaneous equations, 426 instantiating tasks, 334 instructions, 439 integrable, 1361 integral bode sensitivity, 543 complementary sensitivity, 545 complex, 57 convolution, 8 Itô, 1071 Wiener, 1071 integral absolute error (iae), 218 integral control, 199, 1377 integral form, 235 integral square error (ise), 218 integral theorems, 57 integral windup, 1213 integrated dynamic friction model, 1372 integrated systems, 426 integrated systems inc., 419 integration conditional, 202 integrator, 86, 832, 1236 double, 1344 integrator backstepping, 935 integrator block, 86 integrator windup, 614, 1421 integrators, 770 intelligent autonomous control systems, 994 intelligent control, 994 interchangeability, 360 interconnection, well-defined, 896 interface algorithmic, 435 operator, 334 user, 431 interior point, 55 intermediate variable, 761 internal cancellation, 231 internal model, 832, 930 internal model control, 215, 217 internal model controllers (IMC), 382 internal model principle, 235 internal stability, 216 internal stabilizability, 475 internal state, 910, 1025 internally stabilizable, 475, 476 interoperability, 360 interpolation, 1049 interpretation models, 1458 interrupt, 326 interrupt scheduling, 329 interrupt signals, 354 interrupts, 329 intertask communication, 331
interval, sampling, 1034 interval polynomials, 501 invariance principle, 890 invariant conditioned, 472 controlled, 472 maximum controlled, 473 minimum conditioned, 473 time, 452 invariant dynamical indices, 490 invariant factors, 46 invariant manifold, 926 invariant polynomials, 36 invariant set, 952 invariant subspaces, 471 invariant zeros, 475 inverse, matrix, 34, 35 inverse damping control, 1358 inverse describing function, 1379 inverse discrete time fourier transform (idtft), 253 inverse dynamics, 1344 inverse filter, 292 inverse fourier transform (ift), 253 inverse graph, 896 inverse kinematics, 1340 inverse Laplace transform, 55, 117 inverse models, 424 inverse Nyquist array, 723 inverse Nyquist array design method, 722 inverse polar plot, 179 inverse stiffness, 1358 inversion, 61 inversion formula, 59 inverted pendulum, 971 inverted pendulum problem, 765 inverter, 1413, 1416 high-frequency pwm, 1416 invertibility, 35 involutive closure, 1362 involutive distribution, 1361 involutivity, 915 ipm synchronous motor, 1434 irrational function, 21 irrational number, 51 irrational transforms, 21 ISA (Instrument Society of America) form, 200 ISA terminology, 1223 ISC, 1271, 1272 calibration, 1273 ise index, 170 isolated invariant set, 952 isolated singularity, 55 isopotential point, 1206 itae index, 170 iteration d-k, 682 policy, 1102 value, 1102 iterative solution, 10 Itô correction term, 1068 Itô integral, 1070, 1071 Itô rule, 1068, 1072 lti plant models, 704 jacobian matrix, 389 jk flip-flop, 350 joint elasticity, 1343
joint space, 1340 joint space inverse dynamics, 1344 joint torque control, 1378 joints, 1340 jω factors, 176 Jordan blocks, 45 Jordan form, 45 jordan representation, 78 junction, 102 summing, 91 Kármán system, 1153 Kalman filter, 593 extended, 617 properties, 592 Kalman filter design, 1332 Kalman filter dynamics, the, 591 Kalman filter equations, 591 Kalman filter gain, 592 kappa-tau tuning, 819 kernel, 39 kf problems, 592 Kharitonov, theorem of, 495, 501 kinematic equations, 1289 kinematic reciprocity relationship, 1353 kinematics, 1340 kinetic energy, 1149, 1341 Kirchhoff model, 1146, 1148, 1153 Kirchhoff plates, 1148, 1154 knowledge models, 145 Kokotovic, 938 Kolmogorov forward partial differential equations, 1068 Kolmogorov's backward equation, 1076 Kolmogorov's forward equation, 1076 kraft liquor cycle, 1221 kraft process, 1221 Kronecker controllability indices, 490 Kronecker products, 47 Kushner, 1108, 1109 La Salle invariance principle, 1171 ladder diagrams, 351 ladder logic, 437 lag compensation, 196 lead, 197 lag-lead compensator, 180 Lagrange equations, 1345 Lagrange's equations of motion, 761 lagrangian, 1140, 1341 reduced, 975 lagrangian control system, 974 lagrangian dynamics, 1341 lagrangian model, 975 lagrangian systems, 974 lambda tuning, 1232, 1234 λ-tuning, 821 lan (local area network), 360 languages, 325 high-level, 430 modeling, 421 simulation, 420, 431 Laplace expansions, 35 Laplace operator, 1157 Laplace transforms, 17, 18 inverse, 117 irrational, 21 rational, 20, 25
THE CONTROL HANDBOOK Laplace's equation, 53,61 Laplace-Bore1 transform, 885 latch, 350 latched relay, 352 latency, minimum, 329 latency error, 328 latent controller, 385 lattice, 599 Laurent expansion, 59 Laurent series, 59 law adaptive, 853, 854 control, 198,832,853,854,959,1311 Darcy's, 1158 feedback control, 464 Fick's, 1158 Fourier's, 1158 minimum-variance control, 1092 nonlinear feedback control, 1324 smooth control, 936 state-feedback control, 762,774 lbt decompositions, 789 lead compensation, 196 lead compensator, 1393 lead lag cornpeasation, 197 learning, 1019 active, 1018, 1019 passive, 1018, 1019 reinforcement, 1030 learning control, 1019,1350,1378 least significant bit (Isb), 301 least squares, 1023 nonlinear, 1023 least-squares (Is) method, 834 left coprime, 13P left eigenvector, 44 left half-plane (Ihp), 152 left matrix fraction description, 483 left singular vectors, 510 level dependent loop, 1457 principal regulation, 1457 quantization, 306 unit control, 1457 Levenberg-Marquard direction, 1040 lft, constant, 677 Li and Blankenship, 1123 Lie algebra, 45,865,971,972,1363 Lie algebra rank condition (larc), 971,978 Lie bracket, 45,861,862,914,968,971,1361 Lie derivative, 880 Lie monomials, 886 Lie polynomials, 886 Lie product, 45,880,914 Lie saturate, the, 867 life cycle, 324 lightweight threads, 326 lignin, 1220 lime kiln, 1221 limit, 53 limit cycles, 310, 366,886, 952 limit set, 952 limit switches, 346 limit theorems, 1062 line equipotential, 62 reference, 1139
line-of-sight pointing, 1326 linear algebraic equations, 403 linear black-box models, 1043 linear black-box systems, 1041 linear combination, 96 linear control, 1232 linear controller, 930,1214 linear detectable system, 930 linear difference equation, 1034 linear differential equations, 880 linear equations, 42 linear feedback controls, 1145 linear filtering, 1080,1081 linear form, 389 linear fractional, 677 linear fractional transformation, 61,676 linear full-order observers, 607 linear function, 38-40 linear growth, 934 linear independence, 38 linear least squares, 41 linear least squares method, 1034 linear least squares problems, 403 linear matrix equations, 49 linear matrix inequality (lmi), 675, 752 linear models, 1042,1146 linear multiplicative form, 970 linear multivariable control, 408 linear observed system, 1171 linear ordinary differential equations, 409 linear parameter varying systems, 395 linear quadratic (Iq) minimum-time problem, 772 linear quadratic design, 260 linear quadratic gaussian (Iqg), 1089, 1095 linear quadratic minimum-time design, 772 linear quadratic regulator (LQR) problem, 762 linear quadratic regulator design, 1332 linear reduced-order observers, 609 linear regression, 1035,1043 linear step-invariant discrete-time system, 242,247 linear stochastic systems, 1113 linear subsystem, 918 linear system, 93,147,239,451,459,890,910,918,929,1041 siso continuous-time, 585 unforced, 445 linear techniques, 960 linear time-invariant (lti), 3,4 linear time-invariant systems, 115 linear time-varying dynamics, 393 linear transformatioe, 590 linear truth model, 1331 linear-in-the-parameters (litp) form, 833 linear-quadratic optimal control problem, 595 linearity, 1161 in parameters, 1022 linearizability input-output, 1431 linearization, 81,82,388,390,396,417,874,891 of functions, 389 about a trajectory, 392 about an equilibrium, 390 adaptive feedback, 1347 exact, 913 feedback, 424,909,911,1344 input-output, 424,1425 input-output feedback, 917 
limitations, 392
INDEX robust feedback, 1346 linearized mathematical model, 1316 linearized model, 1308, 1420 linearizing, 416 linearly dependent, 38 linearly independent, 38 linguistic rules, 1004 links, 1340 LINPACK, 413 Liouville's theorem, 59 Lipschitz condition, 889 Lipschitz functions, 880 liquor, white cooking, 1221 Im and phase diagram, 184 Im angle diagram, 185,190 Im curves, 178 lm diagram, 175 lmi, 752 problems, 754 synthesis, 752 load disturbances, 219,817 load power cornpensatLon, 1445 loading effects, 86 local approximation structure, 1019 local area networks (lans), 360 local bifurcations, 953 local diffeomorphism, 910 local direct state feedback, 961 local dynamic state feedback, 962 local fundamental, 1422 local harmonics, 1422 local methods, 755 local stabilization, 920 local-loop control, 1197 localization, 1160 disturbance, 476-478 localization properties, 1023 localized basis influence functions, 1021 locally Lipschirz functions, 889 location, 1048 log magnitude, 174 log magnitude-angle diagram, 175 logarithm function, 57 logarithmic arithmetic, 301 logarithmic plots, 174 logic fuzzy, 1004 ladder, 437 positive, 347 state-transition, 327 transistor-transistor (ttl), 347 ttl, 351 logical control, 1199 long-range predictive wntrol, 842 loop dosed, 1148 control, 763 design outer, 762 feedback, 760 feedfonvard, 770 gain, 767 general linear feedback, 676 generic, 1366 inner, 1494 open, 1148 outer design, 760 position, 1396
primary, 207 real time control, 760 secondary, 207 unity feedback gain outer, 770 velocity, 1396 loop gains, 95 loop system closed, 736 loop transfer function matrix, 513 loop transmission functions, 715 loop tuning, 1231 loop-shaping, 705 low gain, 640 low load regulator, 1477 low order polynomials, 150 low-power schottky ttl (Is-ttl) elements, 351 lower costing horizon, 844 lq optimd singular value constraint, 767 lq problems, 592 lq regulator design, 1335 lqg controller, 1320 lqr design, 762,770 frequency-domain relationships, 767 frequency weighted cost functionals, 644 gain selection tools, 640 guaranteed robustness, 767 physical motivation, 636 properties, 638 lqr asymptotic properties, 640 lqr controllers, 637,639 lqr loop transfer matrix, 638 lqr stability robustness, 639 1s estimator, 838 lse optimd performance, 218 lti controllers, 750,752 linear time-invariant, 3,4 models, 756 systems, 581,586,750 lubricant modification, 1377 lumped, 452 Lyapunov criterion, 467 Lyapunov design, 932 recursive, 935 Lyapunov equations, 409,891 Lyapunov exponent method, 1113 Lyapunov function, 890,932 flattened, 939 Lyapunov function method, 1107 Lyapunov matrix, 592 Lyapunov method, 890 Lyapunov redesign method, 932-933 Lyapunov stability, 889,1105 Lyapunov surface, 890 Lyapunov theory, 150 Lyapunov transformation, 456 Lyapunov's linearization method, 82
M circles, 188
M curves, 190 m vectors, 33 machine, synchronous, 1440 machine epsilon, 401 McMillan degree, 481,486 magnitude constraint, 773 magnitude spectrum, 18
THE CONTROL HANDBOOK main function, 334 main loop theorem, 677 make-before-break (rnbb), 346 maneuver tracking, 1329 maneuvering, vehicle, 1330 manifold, equilibrium, 942 manipulator jacobian, 1340 mantissa, 302 manual control, 382 manufacturing, uniform, I240 manufacturing automation protocol (map), 360 manufacturing message specification (mms), 360 manufacturing processes, 1221 map, 360 conformal, 139 control-to-state, 1150 measurable, 1101 mapping theorem, 503 mappings conformal, 60,61 margin gain, 152,184,286,528,739, 768 gain and phase, 162 infinite gain, 768 phase, 152,155,184,286,528,768 Markov chain, 1058 controlled, 1100 Markov process, 1127 Markov stability criterion, 150 markovian policy, 1099 martingale, 1070 masking, 348 Mason's rule, 93 mass density, 1158 matched disturbances, 949 matching, pole-zero, 272 matching condition, 932,933,949 matching step, 271 material flux, 1158 mathematical expectation, 1036 mathematical model, 827, 1008, 1197, 1323 mathematical software, 412,413 matrix, 33,45,47,462 addition, 34 algebra, 34 block, 37 blockdiagonal perturbation, 558 characteristicequation, 43 commuting, 45 companion, 46,98 complex, 508,509,513. constant; 764 control weight, 638 controllability, 122,124,251 coprime polynomial, 123 covariance, 1036 definite symmetric, 891 design equations, 760,763,764 direct feed, 454 elements, 33 environments, 433,436 error, 402 exponentials, 409,445 fraction, 776 gram, 42 hermitian complex-valued, 509 high-order coefficient, 488
Hurwitz, 874,891 identity, 34 incidence, 1059 index, 45 inequalities, 752 inertia, I341 inverse, 34,35 inversion, 35 invertible, 34 jacobian, 389 linear equations, 49 loop transfer function, 513 lqr loop transfer, 638 Lyapunov, 592 multiplication, 3 noninvertible, 34 nonnegative definite, 44 nonsingular, 34 nonsingdar square polynomial, 482 normal, 43 observability, 123,124,463,609 orientation, I340 orthogonal, 37,42 output, 454 output-feedback, 754 perturbed, 402 polynomials, 36,37 positive definite, 44 positive definite symmetric, 149 powers, 37 proper rationa!, 481 real polynomial, 482 regular, 34 representations, 39 resolvant, 116,121 resolvent, 48 Riccati equation, 762 similar, 39 singular, 34 skew-symmetric,36 square, 33 stability, 891 stable transfer, 689 state transition, 116, 117, 122,454 stochastic, 1058 symmetric, 36,44 system, 454 transfer, 796 transfer function, 507,557 transposition, 36 unirnodular, 36,61,130 unimodular polynomial, 482 unimodular square polynomial, 482 unitary, 37 unitary complex-valued, 509 unity, 1354 weighting, 1292 zero, 34 matrix equations, 49, matrix polynomial, 37 matrix Riccati algebraic equation, 604 matrix Riccati differential equation, 595,604 matrix sign function, 605 matrix transposition, 36 matrix-valued functions, 48 maximal eigenspace, 44 maximum amplification direction analysis, 513
maximum controlled invariants, 473 maximum modulus tl.teorem, 59 maximum principle, 1159 maximum selector, 208 maximum singular vectors, 511 mbpc, 806 mean uniformly bounded, 11 103 zero, 1090 mean square optimal predictor, 1085 mean-level control, 844 measurable map, 1101 measurement noise, 111,319,515,818 mechanical pulping, 1221 mechanical system, 102 membership function, 1021 memory, frictional, 1371,1380 memoryless system, 924 menu driven input, 435 meromorphic function, 56 meta instructions, 439 method analytical, 8 19 approximation, 276 asymptotic, 873 averaging, 873 branch and bound, 755 characteristic locus, 722 classical, 4, 14 computable, 664 control system design, 282 describing function, 363 eigenvalueleiger~vector derivative, 624 energy, 973 Euler's backward, 266 Euler's forward, 266 frequency-domain, 4,286 function exchange, 332 h,- and p synithesis, 1320 Hilbert uniqueness, 1151 least-squares (Is), 834 local, 755 Lyapunov, 890 Lyapunov exponent, 1113 Lyapunov funct~ion,1107 Lyapunov redesign, 932 mbpc, 806 model-based, 8214 mpca, 568 multiplier, 1173 nonparametric ~dentification,1051 open-loop, 978 optimization-based, 822 parametric estimation, 1051 perturbation, 873 phase plane, 37i! rectangular, 267 root locus design, 283 rule-based, 825 shooting point, 761 step response, 8.24 sweep, 762 time-domain, 4 trapezoidal, 267 Tustin's, 267 two-time-scale, 873 unit solution, 761
Z-transfer function, 242 Ziegler-Nichols, 818 method of convolution, 8, 13 microsynthesis, 1407 mills, groundwood, 1221 mimo, 75,128 mimo, transmission zeros, 447 mimo case, 124 mimo control, 1240 mimo feedback systems, 513 mimo frequency response analysis, 512 mimo linear time-invariant systems, 507 mimo lti systems, 166, 168 mimo plants, 508 mimo qft cad package, 715 mimo state-space system, 122 mimo systems, 528,710,713,760, 767 mimo transfer functions, 78 mini-max andH ', full-state feedback control, 648 mini-max controllers, 649 minimal polynomial, 37,46 minimum conditioned invariants, 473 minimum latency, 329 minimum modulus theorem, 59 minimum phase, 1081, 1095 minimum selector, 208 minimum singular vectors, 511 minimum-phase behavior, 1089 minimum-phase system, 179,732,920 minimum-time design, 771 minimum-variance, self-tuning, 838 minimum-variance (mv) control, 837 minimum-variance control, 1092 minimum-variance regulators, 806 miso analog control system, 702 miso discrete control system, 707 miso qft cad packages, 707 miso qft pc cad, 707 miso sampled-data control system, 707 miso system, 702 mixed p, 675 modal canonical form, 121 mode continuous conduction, 1418 discontinuous conduction, 1418 ideal sliding, 942 scan, 325 sliding, 942 mode decoupling, 621 mode switches, 203 model boundary-layer, 876 crossover, 1498,1502 current-flow-causer, 101 design, 1332, 1334 development, 566 disturbance, 1090 dynamic, 1428 dynamic friction, 1376 Euler-Bernoulli beam, 1142 fuzzy control, 1503 induction generator, 1447 integrated dynamic friction, 1372 internal, 832,930 Kirchhoff, 1146,1148 lagrangian, 975 linear truth, 1331
THE CONTROL HANDBOOK linearized, 1308 linearized mathematical, 1316 Iti, 756 mathematical, 1008, 1323 noise, 1052 nominal, 683 nonlinear averaged, 1419 optimal control, 1499,1503 perturbed, 739 plant, 731,738,750 prediction, 830,833 process, 1090 quantization noise, 309,310 Rayleigh beam, 1142 reduced, 875 scan, 325 seven-parameter friction, 1372 state, 454,459 state space, 1268 state-space averaged, 1419 structural, 1499, 1502 structure, 1038 switched state-space, 1418 synchronous generator, 1440 system, 760,762,769,775,942 time-invariant continuous-time state-space, 1419 time-invariant sampled-data, 1422 uncertainty, 551 voltage-drop-causer, 101 model algorithmic controller (mac), 808 model discretization, 1265 model hverse based, 220 model order, 1052 model output plot, 1052 model poles, 231 model predictive control, 224 model predictive heuristic control, 224 model reference control, 849 model selection, 1050 model stnicture, 831,1036 model uncertainty, 520,533,818 model unstable, 1052 model validation, 559,1050,1051 model-based compensation, 1379 model-based controllers, 1319 model-based methods, 824 model-based observers, 591 modeling, 99,1246,1303,1329,1331,1391,1406 black box, 806 dynamic, 1413 energy, 102 frequency-domain, 1498 friction, 1369,1370 hierarchical, 105,416,1423 object-oriented, 104,421,426 parametiic, 642 plant, 416 power plant, 1458 sampled-data, 1417 time-domain, 1499 uncertainty, 1406 modeling errors, 523 modeling languages, 421 modeling software, 418 models arma, 490,1079 armax, 1079
arx, 1034 continuous-time state variable, 273 difference-equation,831 discrete-time, 273, 831 disturbance, 703 dynamic elastic beam, 1139 dynamic elastic plate, 1146 engine, 1271 finite-impulse,806 fuzzy, 1049 fuzzy control, 1500 generalized state-space, 1423 higher index, 423 input-ovtput, 65 interpretation, 1458 inverse, 424 j [ti plant, 704 Kirchhoff, 1153 knowledge, 1458 linear, 1042, 1146 linear black-box, 1043 linearized, 1420 mathematical, 827, 1197 multiparameter,835 nonlinear, 1033 nonlinear black-box, 1048, 1049 nonminimum-phase discrete-time, 832 physically parameterized, 1047 plant, 551,830 predictive, 836 process, 806 ready-made, 1042 Reissner-Mindlin, 1153 sampled-data, 1422 second-source, 426 singular perturbation, 1343 state variable, 247 state-space, 75,99,1046 step-response, 806 synthesize tracking, 703 transfer function, 807 uncertain plant, 552 uncertainty, 683 unstructured perturbation, 552 unstructured plant uncertainty, 553 models for control, 1248 moderate rotations, 1140 modem paradigm, 651 modes characteristic, 5, 11 closed-loop, 838 natural, 5, 11 oscuatory, 207 sliding, 942 specialized task tailored, 621 structurally fixed, 783 system, 125 uncontrollable, 122, 129 modular design, 981,985 modulating devids, 1200 modulating signal, 1418 modulation power system stabilizer (pss), 1485 pulse-width (pwm), 1415 modulus, 52 modulus optimum (bo), 822 moment equations, 1289
moment exponential :Lyapunov stability, 1106 momentum, averaged, 976 monic polynomial, 36,54 monitoring on-line, 567 process, 565 monomials, 47 Lie, 886 monoprotic agent, 1237 monostable actuator, 347 most significant bit (rnsb), 301 motion brownian, 1068,1069 equations of, 1140,1141 hovering, 978 pilot head, 1330 motion control systems, 1382 motion coordination, 1384 motion planning, 886 motion profiling, 1384 motor ipm synchronous, 1434 spm synchronous, 1434 synchronous reluctance, 1435 motors, 1388 ac induction, 1430 ac synchronous, 1432 dc, 1427 moving-average (ma) form, 836 mpca method, 568 mrac, 848 direct, 850 indirect, 852,856 robust, 852 p analysis, 680 p synthesis, 681 p theory, 557 multi-criterion optimization problem, 756 multi-input systems, 764,992 multi-input-multi-output (mimo), 75 multi-inputlmulti-o~~tput (mimo) cases, 121 multi-loop feedback gain, 764 multiaffine function, 497 mdtiaffine uncertainty structures, 503 multibody flexible space systems, 1323 multiloop design problem, 776 multiloop system, 763 multiobjective optimization problem, 756 multiparameter mod'els, 835 multiple axes, 1391 multiple effect evaporators, 1221 multiple-valued complex function, 53 multiplication boolean, 347 matrix, 34 scalar, 34 multiplicative ergodic theorem, 1114 multiplicative perturbations, 554 multiplicative uncertainty, 768 multiplier methods, 1173 multipliers, characteristic, 956 multivariable bode plot, 767,768 multivariable controll, 130" multivariable functions, 47 multivariable pole-zero cancellations, 449 multivariable poles, 446 multivariable systems, 696, 1053
multivariable three-term controller, 734 multivariable transmission zeros, 447 multivariate statistics, 562 mutual exclusion, 332 mv control, 838 nand, 349,355 nand gate, 349 natural components, 242 natural frequencies, 5 natural modes, 5, 11 natural period, 1211 natural response, 4, 11, 115, 120 Neal-Smith criterion, 1301 nearest neighbors, 1049 necessary conditions, 761 negative definite, 890 negative encirclement, 57 negative limit set, 952 negative semidefinite, 890 neighbors, nearest, 1049 nested epsilon decomposition, 790 network combinatorial, 350 local area, 360 neural, 994,996,1018, 1049 plant wide, 360 radial basis, 1049 sigmoidal neural, 1020 networking, 105 Neumann condition, 1159 Neumann's problem, 61 neural control, 1017 neural networks, 994,996,1018, 1049 neutral stability, 925 new operator, 334,335 Newton direction, 1040 Newton search direction, 1039 Nichols chart, 190 Nichols plots, 173, 185 nodes, 93,105 noise, 108 gaussian, 589 measurement, 111,319,515,818 process, 589 white, 577, 584, 807, 832, 1068, 1080, 1081, 1085, 1090, 1095, 1392 white input, 111 noise model, 1052 noise spike filter, 1198 noise structures, 1052 noise suppression, 164 nominal model, 683 nomind plant, 704 nominal signal response, 538 nominal system, 932,933,1010,1012 non-gaussian process, 1081 nonadaptive design, 980 nonanticipative control policy, 1100 nonautonomous systems, 891 ' nonblocking code, 325 noncausal design algorithm, 763 noncommutative Padt-type approximant, 884 noncommuting vector fields, 967 nondifferentiable feedback, 392 nondynamical form, 236 nonharmonic Fourier series, 1174
THE CONTROL HANDBOOK nonholonomic path planning problem, 1362 nonholonomic systems, 1359 nonholonomy, test, 1361 nonlinear averaged model, 1419 nonlinear black-box models, 1048,1049 nonlinear black-box structures, 1048 nonlinear conic sector theorem, 903 nonlinear control law, 1312 nonlinear control systems, 880,932 nonlinear controllers, 1214 nonlinear damping, 985 nonlinear dynamics, 1290 nonlinear effects, 1052 nonlinear feedback control law, 1324 nonlinear functions, 47 nonlinear inner loop, 1299 nonlinear least squares, 1023 nonlinear minimum-phase systems, 920,921 nonlinear models, 1033 nonlinear observers, 613 nonlinear optimal control, 971 nonlinear optimal control ~roblem.761 nonlinear output regulation, 923 nonlinear partial differential equations, 1153 nonlinear passivity, 901 nonlinear programming, 755 nonlinear small gain, 901-902 nonlinear spacecraft equations, 1312 nonlinear system, 929,952, 1081 feedback linearization, 909 nonlinear systems, 393,879 nonlinear zero dynamics, 917 nodinearities, 1321 nonlinearity cubic, 364 saturation, 365 nonlinearly input and output strictly passive, 901 nonminimum phase, 1081 nonminimum phase plants, 806 nonminimum phase systems, 546 nonminimum-phase discrete-time models,'832 nonminimum-phase system, 179,732,736 nonminimum-phase zeros, 838 nonmodel-based compensation, 1377 nonnegative definite matrix, 44 nonparametric identificationmethods, 1051 nonrecursive difference equation, 10 nonsingular square polynomial matrix, 482 nontouching-loop gains, 95 nor, 348,349,351,355 nor gate, 348 norm, 39,1149 hw, 492 euclidean, 40,457,508 Frobenius, 40 652 Hilbert, 1151 induced euclidean, 40 p, 689 spectral, 509 system, 652 uniform, 40 norm function, 895 norm-minimality, 1143 normal equations, 42 normal form, 913,927 normal matrix, 43 P
normal solution, 1205 normalization, 845 normalized coprime factorization,554 normalized time, 170 norms, 39,40 p, 895,89 vector, 39 not, 347,348,355 notation, operational, 10 notch filter, 197 nullity, 42 nullspace, 39,42 number complex, 51 condition, 43 fixed-point, 301 floating-point, 301 imaginary, 51 irrational, 51 transcendental, 51 winding, 58 numerical conditioning,42 numerical differentiation, 271 numerical linear algebra, 403 numerical optimization-baseddesign, 749 numerical stability, 401 Nyquist, 173 Nyquist array, inverse, 722-723 Nyquist contour, 136 Nyquist criterion, 135 Nyquist diagram, 137, 140 Nyquist plot, 145,152, 155, 173 Nyquist stability criterion, 55,524 Nyquist stability test, 135 Nyquist theorem, 135,138 Nyquist's stability criterion, 181,183 generalized, 182 object diagrams, 421,422 object instantiation, 105 object-oriented modeling, 104,421,426 object-orientedmodeling languages, 421 object-orientedmodeling paradigm, 421 object-oriented modeling systems, 426 objects, 105 observability, 121,123,250,295,447,458,462,1165,1167,1169,1317, 1493,1494 approximate, 1172 grammian, 123 matrix, 123, 124 rank, 123 spectral, 1172 observabilitygramian, 458 observability matrix, 463,609 observability operator, 1167 observabilitytheory, 1169 observable, 123,458,929,1171 completely, 251 unifordp 459 observable canonical fotm, 76, 126 observable form, 250 observations, 615 observer, 287 deadbeat, 290 reduced-order state, 291 observer canonical form, 453 observer design, 411,949
INDEX observer feedback, 291 observer theory, 1171 observer-based controllers, 382 observer-based stabilizing controller, 735 observer-predictor,234 observers full-order, 290 nonlinear, 613 reduced-order, 615 reduced-order state, 291 oe-model, 1042 off point, pick, 90 off-line friction, 1375 offset, steady-state, 807, 1091 omola, 421 on-line monitoring, 567 one hidden layer feedforward sigmoidd net, 1049 one-degree-of-freedomtracking controller, 385 one-loop-at-a-timesiso design, 759 one-sided Laplace transform, 18 one-sided z transform, 23 open kinematic chain, 1340 open left half-plane (olhp), 21 open loop, 1148 open region, 53 open systems, 435 open systems interconnection(osi) scheme, 359 open unit disk, 24 open-loop properties, 765, 767 quantities, 767 system, 765 open-loop control, 830,967 open-loop methods, 978 open-loop optimal control, 756 open-loop specifications,541 open-loop system, 927 open-loop transfer function, 88, 181 operating conditions, 952 operating envelope, 960 operating system, 332,431 operation, stressed, 960 operationalnotation, 10 operationalspace control, 1345 operations, complete, 349 operator backward shift, 1080 conditional expectation, 806 differential, 125 expectation, 1090,1094 Green's, 1162 human, 1498 Laplace, 1157 observability, 1167 reconstruction, 1169 solution, 1161 operator interface, 334 optics, geometrical, 1173 optimal control design, 759 control problem, 760 contro1ler, 760 cost, 762,775 feedback control. 762,775 feedback gain, 762,769,775 lq design, 776 lq tracker, 768
return difference relationship, 767 singular value relationship, 767 systems, 759 tracking controller, 769 optimal bounds, 705,712,713 optimd control, 595,602,809 optimal control model, 1499,1503 optimd estimation, 1395 optimd filter, 1086,1088 optimal filtering, 1088 optimal infinite horizon cost, 1101 optimal prediction, 1088, 1091 optimal predictlor, 809 optimal realizable filter, 1086 optimal smoothing, 1088 optimality,885 optimization, 7155 convex, 749 direct criteria, 822 performance, 1426 optimization problem, 756 optimization-baed methods, 822 optimum control system, 170 or, 348,349,3511,352,354,355 or block (orb), 354.* order, model, 1052 ordinary differentialequation, 3,66,74 orientation, fielid, 1425 orientation matrix, 1340 Ornstein-Uhler~beckprocess, 1075 orthogonal basis, 41 orthogonal complement, 41 orthogonal function, 41 orthogonal matrix, 37,42 orthogonal polynomial series, 1020 orthogonal projection, 41 orthogonal vectors, 41,508 orthogonality, 40 orthonormal basis, 41 oscillatory cont~ol,978 oscillatoryinputs, 967,968 oscillatorymodes, 207 oscillatorystability, 1484 Oseledec, 1114 osi, 359,360 outer design loop, 760 output equation, 454 output feedback; 288,621,748 output feedback designs, 989 output innovations, 1082 output matrix, 454 output replation, 923 error feedback, 925,928 f d information, 925,926 output stabilization, 1367 output strictly passive, 898 output-error-model, 1042 output-feedback design, 990 output-feedbadc form, 989 output-feedback matrix, 754 output-feedback modular design, 990 outputs, 356 boundary,1154,1155 overbounding, 498 overflow, 302 overlappingdecomposition, 791 overlappinginformation structure constl
THE CONTROL HANDBOOK overlapping subsystems, 787 prh moment exponentially stable, 1106 p norm, 689,895,897 P. Hall basis, 972 Pad6-type approximants, 884 Pad* approximation, 80,81 pal, 351 paper making, 1222,1225 paradigm modern, 651 object-oriented modeling, 421 regulator, 1296 parallel combination, 87 parallel form, 93,200, 305 parallel systems, 87 parallelism, 325 parameter, Youla, 750 parameter changes, 203 parameter error vector, 833 parameter estimation, 827 parameter variations, 949 parameterization, Youla, 750 parameters, friction, 1373 parametric estimation methods, 1051 parametric modeling, 642 parametric resonance, 82 parametric strict-feedback form, 984 parametric uncertainty, 535 Parseval's relation, 1081 Parseval's theorem, 897 part analytic, 60 anticausal, 1086 causal, 1086 principal, 60 partial derivative, 1277 partial differential equation, 3, 1170 partial differential equations nonlinear, 1153 partial least squares, 563 partial state, 125 partial state-feedback systems, 992 particular solution, 5,12 partitions, conformable, 37 passive identifier, 987 * passive learning, 1018,1019 passivity nonlinear, 901 strict, 988 passivity based robust control, 1348 passivity property, 1342 passivity theorem, 898,899 passivity-based adaptive control, 1348 passivity-based controllers, 1321 path planning algorithms, 1363 path space, 1365 Paynter, Henry, 102 pcatpcr controller design, 569 pcalpcr estimation and control, 571 pd control, 1343 pd plus feedforward acceleration, 1344 peak values, 665 peak-to-peak gain, 666 pendulum inverted, 971
rotational inverted, 1009 pendulum balancing control, autotuning, 1014 percent overshoot, 119 perfect robust controller, 742 perfect RSP, 732 performance robust, 502,679,680,685 system, 1327 performance analysis, 1247 performance evaluation, 1012 performance index, 169,760,762, 769,775, 1090 infinite horizon, 764 performance optimization, 1426 performance requirements, 1327 performance robustness, 667 period natural, 1211 sampling, 240,315 period doubling bifurcation, 956 period doubling route to chaos, 958 periodic average zero (paz) vector, 969 periodic solution, 874,952 persistency of excitation, 1348 perturbation, singular, 875, 1343 perturbation method, 873 perturbations additive modeling, 553 coprime factor, 554 generalized unstructured, 556 independent addit'lve, 557 multiplicative, 554, 667 switching-time, 1423 perturbed matrix, 402 perturbed model, 739 perturbed system, 1012 Petri-nets, 437 ph measuring system, 1205 ph scale, 1205 phase, minimum, 1095 phase condition, 1278 phase crossover frequency, 152,286 phase diagram, 175 phase margin, 152,155,184,286,528, 768 phase plane, 774 phase plane method, 372 phase properties, 1081 phase spectrum, 18 phase-controlled converters, 1416 phase-margin frequency, 184 physical derivation, 1158 physical principles, 99 physical systems, 958, 968, 1329 physically parameterized models, 1047 pi, 1239 pi compensator, 1393 pick-off point, 90 pid, 359,1239 pid control, 198,206 pid controllers, 345 pid regulators, 832 pid.6 1239 pilot head motion, 1330 pilot-induced oscillations, 1301 pitchfork bifurcation, 955 placement, pole, 819 plane complex, 52,188
phase, 774 planning systems, 994, 996 plant, 295,666,691, 909, 924 assumptions, 849 composite continuous-time model, 295 input, 523 multiple-inputs, 288 nominal, 704 output, 529 power, 1454 structure, 1454 templates, 708 plant and feedback structures, 781 plant dynamics, 944 plant equations, 843 plant input signals, 295 plant model, 551,731, 738, 750, 830 plant modeling, 416 plant transfer function, 838 plant wide networks, 360 plants affine, 499 electric power generating, 1453 mimo, 508 nonminimum phase, 806 time-invariant, 827 uncertain, 499 unstable, 235 plates dynamic, 1148,1153 Kirchhoff, 1148, 1154 plausibility, 839 PIC,345,346,353-355,357-360 programming, 355 plc system, 354 pld, 353 plot Bode, 146,152,155,173-175 direct, 188 direct polar, 179, 186, 189 inverse polar, 179 logarithmic, 174 model output, 1052 multivariable bode, 767, 768 Nichols, 173, 185 Nyquist, 145, 152,155, 173 polar direct, 179. 181 root locus, 192,19 singular value, 512 pls (pulse instruction), 355 plu factorization, 35, 36 point binary, 301 equilibrium, 81,952 exponentially stable equilibrium, 874 fictitious binary, 301 fixed, 952 interior, 55 isopotentid, 1206 pick-off, 90 Poisson stable, 924 radix, 301 singular, 55 turning, 954 upper equilibrium, 971 point sensors, 346 pointing, line-of-sight, 1326
points, equilibri~um,889 Poisson stable point, 924 Poisson's equation, 61 pole, 246 closed-loop, 841 pole assignmenit, 41 1 pole placement, 209,212,289,608,819 arbitrary, 210 pole-placement (pp) self-tuning, 841 pole-zero cancellations, 1053 pole-zero matching, 272 poles, 20,55, 118,121,138, 181,407 closed-loop, 777 di~turbancesltrackin~, 732 model, 231 multivariable, 446 process, 823 right half-plane, 546 unstable, 553 policy iteration, 1102 markovian, 1099 nonanticilpative control, 1100 polling, 354 polynomial, 129, 932 annihilating, 37 characteristic, 5,11,37, 115, 120, 125,767 form, 776 matrix, 37 minimal, 37,46 monic, 36 systems, 776 weighting, 776 polynomial approximation, 266 polynomial functions, 47,497 polynomial matrices, 36, 123 polynomial matrix fraction, 482 polynomial nonlinearity, 370 polynomials, 54 affine, 5001,501 coprime, 690 coprime denominator, 289 coprime numerator, 289 delta-ope~ator,150 design, 84 1 invariant, 36 Lie, 886 low order, 150 uncertain, 501 polytopic systems, 753 Pontryagin's minimum principle, 773 portability, 324 poses, 1340 position control, 1352 position loop, 1390, 1396, 1399 position loop compensator, 1394 position-dependent friction, 1375,1380 positive definite,,890 positive definite function, 921 positive definite matrix, 44 positive definite solution, 764 positive definite symmetric matrix, 149 positive encirclement, 57 positive limit set:,952 positive logic, 347 positive semidefinite, 890 possible feedback, 1038
THE CONTROL HANDBOOK potential augmented, 975 averaged, 967,974477 potential energy, 1341 potential t\eory, 61 power, 351 power amplifiers, 1388 power continuityequations, 102 power electronics, 1413 power generation, 1466,1469 power plant, 1454 power plant modeling, 1458 power series, 56 Fliess gfnerating, 882 power series expansions, 56 power system stabilization, 1445 power system stabilizer (pss) modulation, 1485 power transmission, 1483 power transmission equipment, 1486 practical instability, 227 practicality, 664 precision, relative machine, 401 prediction, optimal, 1091 prediction error, 830,833 prediction horizon, 836 prediction model, 830,833,836 predictive control, 807 predictor, 830 d-step ahead, 1091 mean square optimal, 1085 optimal, 809 predictor identity, 1085,1087 preemptable scheduler, 330 preemptable tasks, 330 preemptive scheduling, 330 prefilter design, 706,709 prefilter functions, 715 prefiltering, 319 prescribed endpoint steering problem, 971 presliding displacement, 1371,1380 pressure dynamics, 1459 pressurization control, 1196 prewarping, 270 primary controller, 225 primary loop, 207 principal component analysis, 562 principal component regression, 563 principal part, 60 principal regulation level, 1457 principle extended separation, 616 Hamilton's, 1140 internal model, 235 invariance, 890 La Salle invariance, 1171 maximum, 1159 Pontryagin's minimum, 773 separation, 607,612 principle of optimality, 1099 principle of the argument, the, 136 principle of virtual work, 1343 principles, physical, 99 prismatic joints, 1340 probabilities, transition, 1075,1100 probability, 1105,1106 problem boundary-value, 61
Brachistochrone, 772 Dirichlet's, 61 exact controllability,1152 gaussian optimal filtering, 596 generalized eigenvalue, 404 infinitehorizon optimal control, 603 inverted pendulum, 765 linear least squares, 403 linear quadratic (lq) minimum-time, 772 linear quadratic regulator (LQR), 762 linear-quadratic optimal control, 595 lmi, 754 multiloop design, 776 Neumann's, 61 optimal control, 760 prescribed endpoint steering, 971 receding horizon optimal control, 603 robust servomechanism,731 servomechanism,731 tracking, 1092 trajectory approximation steering, 971 two-point boundary-value, 761 Zermelo's, 772 procedure Gram-Sdunidt, 41 controller tuning, 1214 process autoregressive (ar), 836 batch, 568 description, 566,568. heat conduction, 1171 Kraft, 1221 Markov, 1127 monitoring, 565 Ornstein-Uhlenbeck, 1075 sulfite, 1221 vector continuous-timestochastic, 586 white noise, 578, 1068 Wiener, 1068,1069 process control, 369 process controllers, 825 process model, 806,1090 process noise, 589 process poles, cancellation, 823 proqss titration curves, 1206 processes continuous-time, 1084 continuous-time gaussian stochastic, 583 discrete-timegaussian stochastic, 577 manufacturing, 1221 vector discrete-time stochastic, 581 Wiener, 807 processing, digital, 307 product inner, 40,898 Lie, 45,880,914 shuffle, 885 tensor. 47 product rule, 35 product variability, 1222 profiling, motion, 1384 program structure, 334 programmable array logic (pal), 351 programmable controllers, 345,353 programmable logic devices (pld), 351 programmable logical controllers (plc), 345 programming
INDEX dynamic, 1099 graphical, 1203 nonlinear, 755 plc, 355 textual, 1203 thrust angle, 761 projection complementary orthogonal, 41 orthogonal, 41 subspace, 1046 proof-of-concept testing, 1253 proper function, 921 proper rational function, 20 proper rational matrix, 481 proper rat~onalmatrix funct~on,481 proper separation function, 901 properties analytic, 542 asymptotic, 639 asymptotic lqr, 639 closed-loop, 767 guaranteed robustness, 762 Iqr asymptotic, 640 open-loop, 765,767 robustness, 638 smoothing, 1070 property composition, 454 coverage, 1023 hamiltonian, 598 localization, 1023 passivity, 1342 universal approximation, 1029 proportional band, 203 proportional control, 198 proportional feedback gain, 195 proportional-integral-derivative (pid) controllers, 221 protected variables, 338 protection, exciterlgenerator, 1445 protocol manufacturing automation (map), 360 technical and office (top), 360 pseudo-inverse, 43 pseudocontrol, 628 pseudogradient condition, 1285 pseudosensitivity filter, 1277 public functions, 335 public section, 335 pulp and paper mill, 1222 pulp and paper mill culture, 1231 pulp and paper process dynamics, 1234 pulping, 1221 pulse, 832 pulse response, 831, 833 pulse waveform, 1414 pulse-width modulation (pwm), 1415 pumps, 1195 pure-feedback systems, 992 purely complex p, 673 purely imaginary complex numbers, 5 1 pursuit tracking tasks, 1501
QFT methods, 713 qr factorization, 41 quadratic factors, 17 quadratic form, 47, 939 quadratic Lyapunov function, 934
quadrature glitch, 1369 quantitative feedback theory (QFT), 701-702 quantities, open-loop, 767 quantization effects, 306 quantization error, 302, 307, 319 quantization level, 306 quantization noise model, 309, 310 quantized amplitudes, 239 quantized observations, 615 quasi-hyperbolic systems, 1144, 1146 quasi-steady-state system, 1346 quaternion, 1324 quotient space, 470 R-equivalence, 36 radial basis functions, 1021 radial basis networks, 1049 radius, spectral, 44 radius of convergence, 56 radix point, 301 ramp-invariant approximation, 272 random walk, 1059, 1080 range, 39, 42 costing, 810, 844 rank, 42 raster graphics displays, 431 rate, sampling, 295 rate limits, 667 ratio continuous duty, 1419 signal-to-noise, 307 rational canonical form, 46 rational function, 20, 1081, 1084 rational functions, 55 Hurwitz-stable, 690 Schur-stable, 690 rational Laplace transforms, 20, 25 rational transfer functions, 689 rational z transforms, 29 Rayleigh beam model, 1142 reachability, 462, 765 reachable set, 474, 1149, 1165 reachable subspace, 474 reactive power compensation, 1445 ready-made models, 1042 reagent delivery systems, 1213 real axis, 52 real polynomial matrix, 482 real time control loops, 760 real uncertainty, 675 realizable Wiener filter, 1087 receding horizon control, 810 receding horizon optimal control problem, 603 receding-horizon strategy, 843 receptivity, 356 reconstruction operator, 1169 recovery boiler, 1221 rectangular methods, 217 rectified waveform, 1416 rectifier, 1413 high-frequency pwm, 1415 recurrence conditions, 1103 recursion, backward, 1099 recursive design, 984 recursive difference equation, 10 recursive estimator, 833 recursive Lyapunov design, 935
THE CONTROL HANDBOOK
recursive prediction error (rpe), 832 redesign, Lyapunov, 933 reduced form, 55 reduced lagrangian, 975 reduced model, 875 reduced-order linear dynamics, 945 reduced-order observers, 615 reduced-order state observer, 291 reference conditioner, 379 reference frame transformation, 1430, 1433 reference governor, 379 reference input filter, 292 reference input model system, 293 reference inputs, 281 reference line, 1139 reference model assumptions, 849 reference model tracking system design, 293 reference signal extrapolation, 1298 reference surface, 1146 reference tracking, 695 reference trajectory, 768 refining, 1222 region admissible, 773 general conic, 903 open, 53 unit-circle stability, 838 region of analyticity, 53 region of asymptotic stability, 890 region of attraction, 890, 942 region of convergence, 19 regression, linear, 1035, 1043 regression vector, 1035, 1048 regressors, 1035, 1342 regular analytic functions, 53 regular form, 944 regularization, 1049 regulated outputs, 666 regulation, 731 structurally stable, 931 regulator, 281 low load, 1477 self-tuning, 838 steady-state lq, 765 two-degrees-of-freedom, 777 regulator paradigm, 1296 regulators minimum-variance, 806 pid, 832 switching, 1415 reheat steamside dynamics, 1464 reheaters, 1464 reinforcement learning, 1030 Reissner-Mindlin models, 1153 Reissner-Mindlin system, 1147, 1148, 1151, 1155 relationship complete equivalence, 126 equivalence, 127 kinematic reciprocity, 1353 optimal return difference, 767 optimal singular value, 767 relative machine precision, 401 relay, 365, 369 electromechanical, 347 latched, 352 self-holding, 352 relay autotuner, the, 824
removable singularity, 55 repeated complex scalar, 675 repeated real scalar, 675 repeated roots, 5, 11 repeated scalar blocks, 673 repetitive control, 1350 representation equivalent state-space, 127 signed-magnitude, 302 two's complement, 302 reset state, 352 residence time, 1210 residual analysis, 1051 residual vector, 591 residual, 608, 611 residue, 60 residue arithmetic, 301 residue theorem, 59, 60 residues, 20 resistive source (rs) elements, 104 resistors, 423 resolution, frequency, 1045 resolvent matrix, 116, 121 resolved acceleration control, 1345 resolvent matrix, 48 resonance, 1235 parametric, 82 resonances, structural, 1393 resonant converters, 1416 resonant frequency, 186 response between-sample, 294 closed-loop, 140 control system, 315, 317 deadbeat, 170, 171 discrete-frequency, 247 disturbance, 1478 forced, 4, 5, 115, 831, 842 free, 831, 842 frequency, 408, 508, 1043 frequency and time, 190 impulse, 68, 1043, 1045 natural, 4, 11, 115, 120 nominal signal, 538 pulse, 831, 833 sinusoidal and time, 187 steady-state, 925 step, 1043 step input, 1391 stepped sinusoidal, 1391 system, 247 transient, 158 unit-step, 831 white noise, 1392 zero-input, 115 zero-state, 115, 248 response model design, 292 response model tracking, 292 response terms, 242 response, natural, 4 return statement, 330 reversible, 460 revolute, 1340 rhp (right half-plane), 709 Riccati, Count (Jacopo Francesco), 595 Riccati algebraic equation, 597 Riccati differential equation, 596
Riccati equation, 409, 595, 762, 769 ridge-type, 1049 right coprime, 130 right half-plane poles, 546 right matrix fraction description, 483 right singular vectors, 510 rigid environment, 1352, 1355 ring, 33 rise time, 119 rising static friction, 1370 Rissanen's minimum description length criterion (MDL), 1039 robot, 1339 robust bidirectional transfer, 385 robust control, 411, 1346, 1405 robust feedback linearization, 1346 robust mrac, 852 robust performance, 502, 679, 680, 685 robust performance analysis, 681 robust performance tests, 677 robust properties, guaranteed, 593 robust servomechanism controller, 734 properties, 734 robust servomechanism controller (RSC), 735 robust servomechanism problem (RSP), 731, 732 robust stability, 502, 539, 676, 680 robust stabilization, 694 robustness, 495, 628, 767, 785, 818, 905, 949, 978, 1012, 1235 guaranteed, 767 lqr stability, 639 stability, 524, 528, 530 robustness analysis, 499, 671 robustness criterion, 499 robustness properties, 638, 738 root loci, 195 root locus design methods, 283 root locus plot, 192, 197 root sensitivity, 1394 root-locus approach, 841 roots, 5, 11 rotating exciters, 1443 rotation, 61 rotational inverted pendulum, 1009 rotations, 870 global, 1140 moderate, 1140 Rouché's theorem, 59 rounding, 302 roundoff, 303 Routh-Hurwitz array, 132 Routh-Hurwitz criterion, 132 Routh-Hurwitz stability criterion, 131 row vectors, 33 rpe (recursive prediction error), 832 rsp, robustness properties, 738 rule, 1021 condition + action, 1004 Cramer's, 35 Itô's, 1068, 1072 linguistic, 1004 product, 35 rule base, 1004 rule-based methods, 825 saddle-node bifurcation, 954, 955 sampled data, 629 sampled data control systems, 254 sampled-data systems, 788
sampled-data controllers, 260 sampled-data modeling, 1417 sampled-data models, 1422 sampled-data stability tests, 146 sampled-data system, 147 samples, 240 sampling, 295, 1417 sampling interval, 1034 sampling period, 240, 315 sampling rate, 295, 319 sampling theorem, 313 sanitation system, 1184 saturation, 667 saturation nonlinearity, 365 scalability, 435 scalar, 34 repeated complex, 675 repeated real, 675 scalar differential equation, 115 scalar exponential functions, 1174 scalar multiplication, 34 scale, ph, 1205 scaling, 1048, 1161 scan mode, 325 scan model, 325 scanf statement, 325, 334, 342 scanned code, 325 scattering theory, 1173 scheduled controllers, 384 scheduler, 358 scheduling, 328 cooperative, 328 gain, 393, 824 interrupt, 329 preemptive, 330 time slice, 330 scheduling variable, 393 scheme open systems interconnection (osi), 359 soft-start, 1421 Schottky diodes, 351 Schur-Cohn criterion, 148 Schur-stable rational functions, 690 Schwarz inequality, 41 scope, global, 420 sde, general, 1075 second order subsystems, 310 second-order nonholonomic condition, 1360 second-order response, 119 second-source models, 426 secondary loop, 207 selector control, 208 self-holding relay, 352 self-induced torque disturbances, 1330 self-tuning controller, 845, 1216 self-tuning regulator, 838 semiconductor switches, 1414 semilinear systems, 1153 sensitivity, 513 differential, 539 root, 1394 weighted, 517 weighted complementary, 517 sensitivity analysis, 305 sensitivity filters, 1278 sensitivity functions, 216, 254, 1277 sensitivity weighted lqr (swlqr), 641
sensor nonlinearities, 1321 sensors, 1195, 1388 binary, 346 digital, 346 point, 346 separately excited dc motor, 1428 separation principle, 607, 611, 612 sequence alternative parallel, 357 gaussian white noise, 807 simple, 357 simultaneous parallel, 357 weighting, 831 sequential circuits, 352 series Dirichlet, 1166 Fliess, 882 geometric, 56 Laurent, 59 nonharmonic Fourier, 1174 orthogonal polynomial, 1020 power, 56 Taylor's, 56 Volterra, 881 series combination, 86 series form, 200, 203 series wound dc motor, 1429 servo compensator, 733 servomechanism problem, 731 set finite control, 1103 fuzzy, 1004 invariant, 952 isolated invariant, 952 limit, 952 negative limit, 952 positive limit, 952 reachable, 474, 1149, 1165 spanning, 38 spectral, 502 value, 499 set membership representation, 520 set point weighting, 199 set-reset (sr) flip flop, 350 setpoint following, 817 setpoint resetting, 1202 settling time, 119 seven-parameter friction model, the, 1372 shape space, 1363 shift operator, 1080 shooting point method, 761 short-circuiting, 1206 shortest path problem, 1097 shuffle product, 885 shunt wound dc motor, 1429 sigmoidal neural networks, 1020 signal discrete-time, 240 input, 1050, 1488 modulating, 1418 truncated, 895 signal conditioning, 1392 signal flow diagrams, 437 signal switching, 351 signal-flow graphs, 93, 96 signal-to-noise ratio, 307 signals, 100, 351, 689
discrete-time, 239, 240 feasible control, 1297 infeasible control, 1297 interrupt, 354 plant input, 295 sustained input, 353 tracking/disturbance, 731 unbounded reference input/disturbance, 736 uncertain, 552 uncertain external input, 551 signed-magnitude representation, 302 signum function, 773 Sihna, 1111 similar matrices, 39 similarity, 1161 similarity transformation, 39, 76, 250 Simnon, 420 simple closed curve, 57 simple nonlinear separation theorems, 901 simple pole, 55 simple second-order system, 186 simple sequence, 357 simstruct, 426 simulation, 706, 1012, 1246, 1373 computer, 763 digital, 431 simulation function, 341 simulation languages, 420, 431 simulation software, 418 simulators, block diagram, 419 simulink, 419 simultaneous parallel sequence, 357 single quadratic cost functional, 1276 single-body flexible spacecraft, 1316 single-input single-output systems, 911 single-input systems, 287 single-input-single-output (siso), 75 single-input/single-output (siso) linear, time-invariant system, 121 single-pole double-throw (spdt), 346 single-pole, single-throw (spst) mechanical switch, 346 single-valued complex function, 53 singular condition, 773 singular perturbation, 875, 1343 singular perturbation models, 1343 singular point, 55 singular value, 40, 767 singular value decomposition (svd), 46, 404, 509, 510 singular value plots, 512 singular values, 46 singularity, 53 essential, 55, 60 isolated, 55 removable, 55 sinusoidal and time response, 187 sinusoidal describing function, 363 sinusoidal input, 7, 13, 109 sinusoidal responses, 173 sinusoidal voltage waveforms, 1413 sinusoids, input, 513 siso, 75, 521 siso continuous-time linear systems, 585 siso control, 1239 siso design, 759 siso lti systems, 158 siso systems, 524 siting, 1487 situation, hyperstatic, 1352
skew-symmetric matrix, 36 sliding mode, 942 sliding surface, 941 slow adaptation, 987 slow manifold, 1281 small gain theorem, 526, 897 Smith canonical form, 36 Smith form, 484 Smith predictor (dtc), 221, 224, 225 Smith-Macmillan form, 490, 491 smooth control laws, 936 smoothing, 1160 smoothing properties, 1070 smoothness, tracking, 1329 snubber elements, 1417 so (symmetrical optimum), 822 Sobolev space, 1173 soft constraints, 810 soft environment, 1353, 1355 soft-start scheme, 1421 software, 1395 mathematical, 412 modeling, 418 simulation, 418 system, 430 software hierarchy, 323 solution bifurcated, 953 classical, 4, 11 complementary, 4, 5, 11 finite energy, 1152 iterative, 10 normal, 1205 particular, 5, 12 periodic, 874, 952 positive definite, 764 viscosity, 1101 solution operator, 1161 spa algorithm, 1045 space ambient, 896 Banach, 1170, 1171 base, 1363 cartesian, 1353 configuration, 1148, 1340, 1352 countable state, 1103 dual, 39 end-effector, 1340 factor, 470 finite energy, 1144, 1149 Hilbert, 1170 joint, 1340 path, 1365 quotient, 470 shape, 1363 Sobolev, 1173 state, 72, 121, 209, 796 task, 1340 vector, 37 space of observations, 1171 spacecraft, single-body flexible, 1316 spacecraft attitude control, 1303 spacecraft attitude sensors, 1304 spacecraft rotational dynamics, 1306 spacecraft rotational kinematics, 1304 spaces inner product, 41
vector, 37 span, 38 spanning set, 38 spdt switch, 346 special euclidean group, 1340 special orthogonal group, 1340 specialized task-tailored modes, 621 specification closed-loop, 541 manufacturing message (mms), 360 open-loop, 541 spectral analysis, 1045 spectral controllability, 1148 spectral density, 1084 spectral density function, 579 spectral factorization, 1081, 1088 spectral norm, 500 spectral observability, 1172 spectral radius, 44 spectral representation, 44 spectral set, 502 spectrum, 1041, 1081, 1084 auto, 1045 cross, 1045 magnitude, 18 phase, 18 splinelike functions, 1021 spm synchronous motor, 1434 square matrices, 383 square of sums, 776 sr, 352 sspa systems, 420 sstosys, 433 stability, 131, 150, 184, 242, 390, 457, 462, 466, 542, 667, 890, 905, 946, 1113, 1321, 1322, 1325 almost sure asymptotic Lyapunov, 1106 almost sure exponential Lyapunov, 1106 almost sure Lyapunov, 1106 asymptotic, 390, 890, 925 asymptotic Lyapunov, 1105, 1106 backward, 402 closed-loop, 760, 765, 1092 exponential Lyapunov, 1105 guaranteed, 592, 768, 770 guaranteed closed-loop, 770 input-output, 895 internal, 216 Lyapunov, 889, 1105 moment exponential Lyapunov, 1106 neutral, 925 numerical, 401 oscillatory, 1484 robust, 502, 539, 676, 680 systems, 895 transient, 1483 uniform asymptotic, 1105 uniform asymptotic in probability, 1106 uniform exponential, 457, 1105 voltage, 1484 weak, 410 stability analysis, 973, 1383 stability augmentation system, 621 stability bound, 704 stability matrix, 891 stability robustness, 524, 528, 530, 739 stabilizability, 925, 1145, 1153 external, 475
internal, 475 stabilizable externally, 475, 476 internally, 475, 476 vibrationally, 969 stabilization, 391, 1329, 1337, 1367 decentralized, 782 feedback, 893 global, 921 local, 920 output, 1367 power system, 1445 robust, 694 state, 1367 stabilization control system, 1326 stabilization control system structure, 1331 stabilizing compensator, 733, 734 stabilizing controller, 691 stabilizing function, 980 stable coprime factorization, 490 stable limit cycle, 877 stable right coprime factorization, 492 stable transfer matrices, 689 stand alone feedback control, 324 standard basis vectors, 38 standard form, 200, 875 state blocking, 1414 conducting, 1414 controllable, 122 extended, 478 initial, 1170 internal, 910, 1025 partial, 125 reset, 352 stochastic finite, 1100, 1102 uncontrollable, 122 waiting, 338 state equation, 249, 454, 760 signal-flow graphs, 96 state estimation problem, 590 state estimation theory, 1171 state feedback, 210, 287 state feedback structure, 945 state model, 454, 459 state model realizations, 452 state space, 72, 73, 121, 209, 796, 1170 state space approach, 72 state space models, 75, 1268 state stabilization, 1367 state transition matrix, 116, 117, 122 state variable models, 247 state variables, 116, 455, 461, 1146 state weighting, 762 state-feedback control law, 762, 774 state-space averaged model, 1419 state-space minimum-variance controller, 1093 state-space models, 99, 1046 state-space system, 123, 127, 128 state-transition diagrams, 437 state-transition logic, 327 state-transition matrix, 454 states, 326 static decentralized state feedback, 781 static decoupling, 802 static dissipative controllers, 1321 static exciters, 1443
static friction, 1380 static mixer, 1210 static system, 350 static var compensator (svc), 1490 statics, 332 stationarity, 577, 584, 1060 stationarity condition, 760, 761 stationary bifurcation, 954, 955 stationary process, 1081 statistics, multivariate, 562 stator current vector angle, 1450 steady state distribution, 1102 steady state offsets, 807 steady-state accuracy, 160 steady-state and suboptimal control, 764 steady-state components, 242 steady-state equation, 1160 steady-state lq regulator, 765 steady-state offset, 1091, 1095 steady-state process design, 1222 steady-state response, 925 steady-state system, 770 steady-state velocity, 1375 steam flow feedback, 1477 steam generator, 1454 steam pressure, 1469 step index, 240 step input response, 1391 step invariant, 239, 248 step reconstruction, 240 step response, 1043 step response methods, 824 step-invariant approximation, 272 step-invariant discrete-time observer design, 290 step-response analysis, 1043 step-response models, 806 step-varying systems, 249 stepped sinusoidal response, 1391 stiffness, inverse, 1358 stiffness control, 1358 stochastic analysis, 307 stochastic differential equation, 1072, 1101 stochastic differential systems, 1101 stochastic finite state, 1100, 1102 stochastic lqr problem, the, 641 stochastic matrix, 1058 stochastic systems, 1100 strain energy, 1149 strain gauge, 1173 strains, infinitesimal, 114 strange attractor, 957 streamline, 62 stressed operation, 960 Stribeck curve, 1370 Stribeck friction, 1370 strict contraction, 901 strict feedback condition, 936 strict passivity, 985, 988 strictly outside, 903 strictly proper, 20 strictly proper dissipative controller, 1323 strong acid-strong base curve, 1207 strong data typing, 436 strongly asymptotically stable, 1154 structural analysis, 783, 1458 structural model, 1499, 1502 structural resonances, 1393
structurally fixed modes, 783 structure Breiman's hinging hyperplane, 1049 control, 216, 942 control system, 1327 feedforward/feedback, 959 global approximation, 1019 imc, 217 local approximation, 1019 model, 831, 1036 program, 334 stabilization control system, 1331 state feedback, 945 structured plant uncertainty, 556 structured uncertainty, 668 structures flexible space, 1316 multiaffine uncertainty, 503 noise, 1052 nonlinear black-box, 1048 uncertainty, 496, 504 subcontroller, 930 subcritical bifurcation, 953 submodel statement, 106 suboptimal control, 765 subspace, 38 force-controlled, 1353 reachable, 474 velocity-controlled, 1353 subspace projection, 1046 subspaces, 38 invariant, 471 subsystem, 921 first order, 309 linear, 918 overlapping, 787 second order, 310 sulfite process, 1221 sum, convolution, 14, 831 summer, 85 summing junction, 85, 91 supercritical bifurcation, 953, 954 superheat steamside dynamics, 1464 superheaters, 1464 superposition, 239 supervisory fuzzy controller, 1012, 1014 supplemental damping controls, 1486 surface Lyapunov, 890 reference, 1146 sliding, 941 switching, 941 sustained input signals, 353 svd, 510, 512 swapping identifier, 988 sweep method, 762 swing up control, 1008 switch ideal, 1414 single-pole, single-throw (spst) mechanical, 346 spdt, 346 switch statement, 328 switched state-space model, 1418 switched-mode converters, 1415 switches, 347 semiconductor, 1414 switches, mode, 203
switching, 351 switching curve, 774 switching function, 773, 1418 switching regulators, 1415 switching surface, 941 switching times, 1423 switching-surface design, 944 switching-time perturbations, 1423 swlqr (sensitivity weighted lqr), 641-642 Sylvester equations, 409 symbols, Christoffel, 1341 symmetric matrices, 44 symmetric matrix, 36 symmetrical optimum (so), 822 synchronous charts, 352 synchronous generator model, 1440 synchronous generators, 1440 synchronous machine, 1440 synchronous reluctance motor, 1435 synthesis lmi, 752 μ, 681 synthesis toolbox, 680 synthesize tracking models, 703 system adjoint, 459, 768 automation, 359 autonomous, 82, 874, 926 autonomous linear, 909 averaged, 874 bilinear, 881 block strict-feedback, 992 boundary layer, 1346 Bresse, 1142 chained, 972 chaotic, 957 closed-loop, 744, 764, 765, 924-926 complex, 207, 664 continuous, 55 continuous time linear, 1130 continuous-time, 273, 451, 607, 760, 775, 1099 control, 158, 379, 1327, 1345, 1454, 1469 delay, 776 deterministic, 1099 deviation, 770 digital control, 239 discrete, 55, 742, 788 discrete-time, 146, 147, 150, 239, 459, 611, 775 discrete-time linear, 579 distributed-parameter, 1169 drift-free, 971 driftless, 1360 dual control, 1171 dynamical, 895, 952 dynamical nonlinear, 924 error, 982 expert, 994, 996 exponential, 465 feedback, 88, 538, 690, 827 feedback control, 762 feedforward, 769 finite control, 1100, 1102 finite-dimensional continuous-time, 952 forced, 446 forced linear time-invariant, 446 fuzzy, 994 generator excitation, 1485
hamiltonian, 761, 766 high-voltage dc (hvdc) bulk-power transmission, 1413 historical approaches to, 1187 hyperbolic, 1142, 1145 infinite-dimensional lti, 507 intelligent autonomous control, 994 Kármán, 1153 lagrangian, 974 lagrangian control, 974 linear, 93, 147, 239, 451, 459, 890, 910, 918, 929, 1041 linear black-box, 1041 linear detectable, 930 linear observed, 1171 linear parameter varying, 395 linear step-invariant discrete-time, 242, 247 linear stochastic, 1113 linear time-invariant (lti), 3, 115, 581, 586, 750 mechanical, 102 memoryless, 924 mimo, 528, 710, 713, 760, 767 mimo feedback, 513 mimo linear time-invariant, 166, 168, 507 mimo state-space, 122 minimum-phase, 179, 732, 920 miso, 702 miso analog control, 702 miso discrete control, 707 miso sampled-data control, 707 model, 760, 762, 769, 775, 942 motion control, 1382 multi-input, 764, 992 multibody flexible space, 1323 multiloop, 763 multivariable, 696, 1053 nominal, 932, 933, 1010, 1012 nonautonomous, 891 nonholonomic, 1359 nonlinear, 393, 879, 909, 929, 952 nonlinear control, 880, 932 nonlinear minimum-phase, 920, 921 nonminimum-phase, 179, 546, 732, 736 object-oriented modeling, 426 open, 435 open-loop, 765, 927 operating, 431 optimal, 759 optimum control, 170 parallel, 87 partial state-feedback, 992 perturbed, 1012 ph measuring, 1205 physical, 958, 968, 1329 planning, 994, 996 plc, 354 polynomial, 776 polytopic, 753 pure-feedback, 992 quasi-hyperbolic, 1144, 1146 quasi-steady-state, 1346 reagent delivery, 1213 reference input model, 293 Reissner-Mindlin, 1147, 1148, 1151, 1155 sampled data control, 254 sampled-data, 147, 788 sanitation, 1184 semilinear, 1153
simple second-order, 186 single-input, 287 single-input single-output, 91 1 single-inputlsingle-output(siso) linear, time-invariant, 121 siso, 524 siso continuous-time linear, 585 siso lti, 158 stability, 895 stability augmentation, 621 stabilization control, 1326 state-space, 123, 127, 128 static, 350 steady-state, 770 step-varying, 249 stochastic, 1100 stochastic differential, 1101 time-invariant (stationary), 959 time-reversibre distributed-parameter, 1171 time-varying, 451,459,827,828 Timoshenko, 1142 tracking, 281 transmission, 1484 type 0,178 type 1,178 type 2, 179 ultra-high precision, 1386 uncertain, 752 uncertain controls, 752 uncontrollable, 122 under-actuated, 1008 unity-feedback, 88,513 unstable, 548 von Karmhn, 1147,1148,1153,1155 vsc, 943 with drift, 973 system dynamics, 390 system equations, 1352 system identification, 108,827,1033, 1037, 1051 system matrix, 454 system modes, 125 system norms, 652 system performance, 1327 system response, 247 system software, 430 system theory, 1165 systembuild, 419 systems, 689, 1325 systoss, 433 tables, truth, 347 target-code instructions, 439 task, 326,330 assignment, 331 class definition, 334 compensatory tracking, 1501 continuous, 331 event, 335 functions, 335 instantiating, 334 intermittent, 330 intertask communication, 331 preemptable, 330 pursuit tracking, 1501 synchronization of, 358 unitary, 330 task frame, 1340,1353 task space, I340
task space inverse dynamics, 1345 Taylor's series, 56 technical and office protocol (top), 360 techniques adaptive, 823 linear, 960 technology, control, 1180 temperature, 101 temperature control, 1470 temperature controllers, 825 templates, plant, 708 tensor product, 47 term, dc, 832 terminal constraints, 845 tests frequency-domain robust performance, 678 robust performance, 677 textual programming, 1203 theorem Cauchy's, 57 Cayley-Hamilton, 37 Chow's, 1363 circle, 903 classical conic sector (circle), 903 classical passivity, 898 classical small gain, 897 contraction mapping, 49 converse Lyapunov, 921 de Morgan, 348 discrete Hermite-Bieler, 148 divergence, 1165 edge, 495 ergodic, 1063 Frobenius, 915 fundamental, 51 Gauss' mean value, 59 graph separation, 896 integral, 57 Kharitonov's, 501 limit, 1062 Liouville's, 59 main loop, 677 mapping, 503 maximum modulus, 59 minimum modulus, 59 multiplicative ergodic, 1114 nonlinear conic sector, 903 nonlinear passivity, 901 nonlinear small gain, 902 Nyquist, 135, 138 Parseval's, 897 passivity, 898, 899 residue, 59, 60 Rouché's, 59 sampling, 313 simple nonlinear separation, 901 small gain, 526, 897 theorem of Kharitonov, 495 Theory, Geometric Control, 872 theory Lyapunov, 150 μ, 557 observability, 1169 observer, 1171 potential, 61 quantitative feedback, 701 scattering, 1173
state estimation, 1171 thermal diffusivity, 1159 thermal storagf:, 1202 thermal-mechamicalpulping (tmp), 1221 third order moment, 1081 threads, 326 thrust angle programming, 761 thyristor-controlled series compensation (tcsc), 1491 time continuous, 582 discrete, 575, 1034 fractional dead, 832 residence, 1210 time delays, 225 time domain, 110 time invariant, 452 time response, 115, 173 time slice scheduling, 330 time slicing sch~eduler,330 time-delay compensation, 224,235 time-domain approach, 1296 time-domain methods, 4 time-domain modeling, 1499 time-domain slpecifications, 1196 time-invariant (stationary) systems, 959 time-invariant continuous-time state-space model, 1419 time-invariant dynamics, 765 time-invariant Iqr problem, the, 635 time-invariant plants, 827 time-invariant sampled-data model, 1422 time-invariant, linear (lti), 3,4 time-optimal csontrol, 1349 time-reversible distributed-parameter systems, 1171 time-scale separation, 1295 time-varying feedback, 762 time-varying state feedback, 762 time-varying state-variable feedback, 764 time-varying systems, 451,459,827, 828 timekeeping, 333 timers, 356 times, switching, 1423 Timoshenko system, 1842 titration curve, 1206 toggle (t) flip-fllop, 350 toilet, 1180 indoor, 1186 toilet control, 1181 toolbox, synthesis, 680 topological equivalence, 456 topological interconnection capability, 105 torque-slip, 1447 tracker design, 770 tracker problem, the, 768 tracking, 202,712,731 closed-lolop, 186 command, 1329 ideal, 292 maneuver, 1329 response model, 292 zero-statr:, 292 tracking boundls, 705, 713 tracking contrail, 1296, 1297 tracking error, 768,923 tracking problem, 1092 tracking smoothness, 1329 tracking specifications, 712 tracking system(,281
tracking system design, 291, 292 tracking/disturbance signals, 731 training batch, 1024 incremental, 1025 trajectories, 1341 autonomous, 919 trajectory, reference, 768 trajectory approximation steering problem, 971 transcendental number, 51 transfer, bumpless, 386 transfer contact, 346 transfer function, 86, 93, 147, 242, 832, 919 closed-loop, 88 open-loop, 88 signal-flow graphs, 96 transfer function matrices, 507, 557 transfer function models, 807 transfer functions, 55, 70, 75, 179, 407, 827 mimo, 78 rational, 689 transfer matrix, 796 transfer-function, 831 transform discrete-time fourier (dtft), 253 fourier (ft), 253, 1043 inverse discrete time fourier (idtft), 253 inverse fourier (ift), 253 inverse Laplace, 55, 117 Laplace-Borel, 885 w, 286 z, 248 transformation bilinear, 61, 267, 268 bilinear fractional, 61 congruence, 48 finite dimensional linear, 510 fractional linear, 61, 676 fundamental, 61 linear, 590 linear fractional, 61 Lyapunov, 456 reference frame, 1430, 1433 similarity, 39, 76, 250 Tustin, 61 upper linear fractional, 55 z, 240 transformer (tf), 103 transforms, 579, 584 bilateral Laplace, 18 bilateral z, 23 discrete time Fourier, 22 Fourier, 17 irrational Laplace, 21 Laplace, 17, 18 one-sided Laplace, 18 one-sided z, 23 rational Laplace, 20, 25 rational z, 23 two-sided Laplace, 18 two-sided z, 23 unilateral Laplace, 18 unilateral z, 23 z, 17, 23 transient analysis, 1043 transient components, 242 transient response, 158
transient stability, 1483 transistor-transistor logic (ttl), 347 transition function, 1075 transition logic, 327 transition matrix, state, 117 transition probabilities, 1075, 1100 transitions, 356, 1062 translation, 61, 1048 transmission, 1388 power, 1483 transmission system, 1484 transmission zero, 732, 742 calculation, 448 nonsquare systems, 449 transmitters, 1195 transparency, 435 transpose complex-conjugate, 509 hermitian, 36 transposition, matrix, 36 trapezoidal method, 267 triangle inequality, 39, 52 triangular decoupling, 802 trigger, 350 trim analysis, 1290 trim equations, 1291 truncated signal, 895 truncation, 302, 303 truth tables, 347 ttl circuits, 351 ttl logic, 351 tuned error, 1284 tuning, 1199 automatic, 823 controller, 382 filter, 382 kappa-tau, 819 lambda, 1232, 1234 λ, 821 loop, 1231 tuning functions, 981, 982, 990 turbine, 1454 turning point, 954 Tustin transformation, 61 Tustin's method, 267 two's complement representation, 302 two-degree-of-freedom tracking loop, 385 two-degrees-of-freedom regulator, 777 two-dimensional contour following, 1357 two-point boundary-value problems, 761 two-sided Laplace transform, 18 two-sided z transform, 23 two-time scale behavior, 766 two-time-scale methods, 873 type 0 feedback control system, 180 type 0 system, 178 type 1 feedback control system, 180 type 1 system, 178 type 2 feedback control system, 180 type 2 system, 179 u-contour, 704 ultra-high precision control, 1386 ultra-high precision systems, 1386 unbounded reference input/disturbance signals, 736 uncertain controls systems, 752 uncertain external input signals, 551
uncertain plant models, 552 uncertain plants, 499 uncertain polynomials, 501 uncertain signals, 552 uncertain systems, 752 uncertainty, 780 additive, 534 affine, 498 complex, 675 division, 534 generalized structured plant, 557 model, 520, 533, 818 multiplicative, 768 parametric, 535 plant output, 529 real, 675 set membership representation, 520 sources, 520 structured, 668 structured plant, 556 types, 520 unstructured, 520 uncertainty model, 551, 683 uncertainty modeling, 1406 uncertainty structure, preservation, 497 uncertainty structures, 496, 504 unconstrained gpc, 809 uncontrollable, 122, 129 under-actuated system, 1008 unforced linear systems, 445 unforced rotational link equations, 390 unidirectional waveform, 1416 uniform asymptotic stability, 1105 uniform asymptotic stability in probability, 1106 uniform exponential stability, 457, 1105 uniform manufacturing, 1240 uniform norm, 40 uniformly asymptotically stable, 1154 uniformly bounded mean, 1103 uniformly controllable, 458 uniformly n-step controllable, 463 uniformly observable, 459 unilateral Laplace transform, 18 unilateral z transform, 23 unimodular matrix, 61, 130 unimodular polynomial matrices, 482 unimodular square polynomial matrix, 482 unit complex exponential, 512 unit control level, 1457 unit impulse input, 711 unit solution method, 761 unit vector, 38 unit-circle stability region, 838 unit-step response, 831 unitary complex-valued matrix, 509 unitary matrix, 37 unitary scheduler, 330 unitary tasks, 330 unity feedback gain outer loop, 770 unity feedback system, 88, 189, 513 unity matrix, 1354 universal approximation, 1021 universal approximation property, 1029 universal gate, 348 unknown time-varying gain, 829 unknown virtual control coefficients, 992 unmodeled dynamics, 845
unobservable, 123 unrealizable filter, 1087 unrealizable Wiener filter, 1086 unstable model, 1052 unstable plants, 235 unstable poles, 553 unstable systems, 548 unstructured multiplicative perturbations, 667 unstructured perturbation models, 552 unstructured plant uncertainty models, 553 upper costing horizon, 844 upper equilibrium point, 971 upper linear fractional transformation, 556 user code block (UCB), 426 user interfaces, 431 validation, 1265 model, 559, 1050 validation data, 1039 value absolute, 52 complex structured singular, 673 critical parameter, 953 singular, 413, 767 value iteration, 1102 value set, 499 value set geometry, 501 values characteristic, 5, 11 constant nominal equilibrium, 1420 peak, 665 singular, 46 valve positioners, 1207 valves, 1193 variable error, 924 intermediate, 761 scheduling, 393 variable forgetting factors, 835 variable structure, sliding-mode controller, 941 variables, 250 exogenous input, 924 protected, 338 state, 116, 455, 461, 1146 weakly coupled, 780 variance, 306, 1045 innovations, 1038 weighted minimum, 1092 variance error, 1038 variations, parameter, 949 vector characteristic, 456, 457 coordinate, 38 cyclic, 46 innovations, 591 parameter error, 833 periodic average zero (paz), 969 regression, 1035, 1048 residual, 591 unit, 38 vector additive form, 970 vector continuous-time stochastic process, 586 vector diagram, 1438 vector discrete-time stochastic processes, 581 vector field, 862, 971, 1361 flow, 924 noncommuting, 967
vector Liapunov functions, 784, 785
vector norms, 39
vector refresh displays, 431
vector space, 37
vector storage displays, 431
vectors
    column m, 33
    complex, 508
    left singular, 510
    maximum singular, 511
    minimum singular, 511
    orthogonal, 41, 508
    right singular, 510
    row, 33
    standard basis, 38
vehicle maneuvering, 1330
vehicle vibration, 1330
velocity
    angular, 1312
    cartesian manipulator, 1353
velocity algorithms, 201, 203
velocity control, 1377
velocity error constant, 134
velocity kinematics, 1340
velocity loop, 1390, 1396, 1397
velocity loop compensator, 1393
velocity of constraint, 1354
velocity of freedom, 1354
velocity-controlled subspace, 1353
velocity-resolved control, 1356
vessel design, 1211
VHDL, 422
vibration, vehicle, 1330
vibrational control, 968
vibrationally stabilizable, 969
viewpoint, hamiltonian, 975
virtual control, 980
virtual functions, 334, 338, 341
viscosity solutions, 1101
voltage, 101
    generator, 1440
voltage stability, 1484
voltage step-down converter, 1414
voltage step-up converter, 1415
voltage-drop-causer model, 101
Volterra kernels, 883
Volterra series, 881, 884
Volterra series expansions, 879
von Kármán system, 1147, 1148, 1153, 1155
vsc systems, 943
w transform, 286
waiting state, 338
wave equation, 1157, 1167, 1171
waveform
    constant voltage, 1413
    pulse, 1414
    rectified, 1416
    sinusoidal voltage, 1413
    unidirectional, 1416
wavelets, 1049
weak stability, 410
weakly coupled variables, 780
weighted complementary sensitivity, 517
weighted minimum variance, 1092
weighted sensitivity, 517
weighting
    control, 762
    final state, 762
    polynomials, 776
    state, 762
weighting matrix, 1292
weighting sequence, 831
weights, frequency, 516
well-defined interconnection, 896
while loop, 325
white cooking liquor, 1221
white input noise, 111
white noise, 577, 584, 807, 832, 1068, 1080, 1081, 1085, 1090, 1095
white noise process, 578, 1068
white noise response, 1392
whitening filter, 1045
Wiener filter, 1086
Wiener integral, 1071
Wiener process, 807, 1068, 1069
Wiees, 1111
Willems, 1112
winding number, 58
windup, 386, 1216
    integrator, 1421
word length, 301
world frame, 1340
worst case error (wce), 218
wound-field synchronous generators, 1442
xor, 349, 351, 355
Youla parameter, 750
Youla parameterization, 750
Yule-Walker equation, 1084
z transformation, 240
z transforms, 17, 23, 248
Z-transfer function, 242
Zermelo's problem, 772
zero, transmission, 732, 742
zero dynamics, 918, 922, 1474
zero exclusion condition, 501
zero matrix, 34
zero mean, 1090
zero of order, 54
zero-crossing observations, 615
zero-input response, 115
zero-input response components, 248
zero-input stable, 250
zero-state response, 115, 248
zero-state stable, 250
zero-state tracking, 292
zeros, 20, 54, 118, 125, 129, 407, 546, 777, 838, 919
    diagonal coupling, 797
    invariant, 475
    mimo transmission, 447
    nonminimum-phase, 838
    transmission, 448
Ziegler-Nichols methods, 818
    modifications, 819
zone temperature, 1196