Lecture Notes in Control and Information Sciences Editors: M. Thoma, M. Morari
374
Biao Huang, Ramesh Kadali
Dynamic Modeling, Predictive Control and Performance Monitoring A Data-driven Subspace Approach
Series Advisory Board F. Allgöwer, P. Fleming, P. Kokotovic, A.B. Kurzhanski, H. Kwakernaak, A. Rantzer, J.N. Tsitsiklis
Authors Prof. Biao Huang University of Alberta Dept. Chemical & Materials Engineering Edmonton AB T6G 2G6 Canada
Dr. Ramesh Kadali Suncor Energy Inc. Fort McMurray AB T9H 3E3 Canada
ISBN 978-1-84800-232-6
e-ISBN 978-1-84800-233-3
DOI 10.1007/978-1-84800-233-3
Lecture Notes in Control and Information Sciences
ISSN 0170-8643
British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library
Library of Congress Control Number: 2008923061
© Springer-Verlag London Limited 2008
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
Printed on acid-free paper.
springer.com
To Yali and Linda - BH To baby Rohan - RK
Preface
Aim of the Book

The aim of this book is: 1) to provide an introduction to conventional system identification, model predictive control, and control performance monitoring, and 2) to present a novel subspace framework for closed-loop identification, data-driven predictive control, and control performance monitoring.

Dynamic modeling, control and monitoring are three central themes in systems and control. Under traditional design frameworks, dynamic models are the prerequisite of control and monitoring. However, models are only vehicles towards achieving these design objectives. Once the design of a controller or a monitor is completed, the model often ceases to play any role. The use of models serves the design purpose well, as most traditional designs are model based; however, it also introduces unavoidable modeling error and the complexity of building the model. If a model is identified from data, it is obvious that the information contained in the model is no more than that within the original data. Can a controller or monitor be designed directly from input-output data, bypassing the modeling step? This book aims to present novel subspace methods to address this question. In addition, as necessary background material, this book also provides an introduction to conventional system identification methods for both open-loop and closed-loop processes, conventional model predictive control design, conventional control loop performance assessment techniques, and state-of-the-art model predictive control performance monitoring algorithms. Thus, readers who are interested in conventional approaches to system identification, model predictive control, and control loop performance assessment will also find the book a useful tutorial-style reference.
Novel Data-driven Subspace Approach

The subspace approach to process identification has been a great success over the last two decades. The interest in subspace methods continues to grow, towards, for example, the data-driven subspace approach to process control and control performance monitoring.
On the other hand, subspace identification methods have also been used in industrial model predictive control design software packages. Preliminary studies have shown that the subspace-based controller design approach can simplify the design procedure and may achieve more desirable properties than conventional design approaches. Similarly, the subspace-based approach has the potential to simplify the multivariable controller performance assessment procedure. These observations motivated the authors to develop a unified framework for predictive control and control performance monitoring, driven directly by input-output data via the subspace approach.

The other motivating factor for the development of the data-driven approach is the explosive growth in the amount of process data available for analysis. In almost all process environments, ranging from petroleum refining to pulp and paper processing, easy data access through distributed control systems (DCS) is now the rule rather than the exception. In fact, in most process industries data are collected and simply archived. Estimates indicate that most chemical plants require over 100 gigabytes of storage space to archive a year's worth of data. The data-warehousing problem is a manifestation of the exponential increase in information flow resulting from recent advances in networks and computers. In summary, we live in an information age in which the process industry world is awash with data. Properly archived, experimentally designed, or routinely operated data can be a tremendous source of information. The question is how to extract useful information from these data and then put it to good use. One of the main objectives of this research monograph is to suggest how data can be used to design predictive control, obtain non-invasive or less-invasive measures of the performance of the control loops in the plant, or even develop models if they are needed.
Prerequisites

Some of the material presented in the book has been published in the archival literature over the last several years by the authors. This book attempts to consolidate the previous results with many new ones in one place. In this respect, the book is likely to be of use for researchers as a monograph and as a place to look for basic information presented in a tutorial style as well as advanced material developed most recently. However, there are also results here that will appeal to industrial practitioners. Portions of the book will also be of use in a graduate level course in process control, control performance monitoring, and system identification. Introductory courses in control and linear algebra are considered the most appropriate prerequisites for the book. It would be an advantage if readers have some background in system identification, but it is not a vital prerequisite for understanding the material, as the book also contains a tutorial chapter on conventional system identification.
Outline of the Book

This book is organized as follows: Chapter 1 gives an overall introduction and motivation of the work presented in the book.

Chapter 2 is devoted to an introduction of the classical system identification method, namely the prediction error method, and open-loop versus closed-loop identification methods, in a tutorial style.

Chapter 3 gives an overview of open-loop subspace identification methods. This chapter provides the background material required for understanding most of the remaining chapters. It also provides a place for readers to find an overview of most open-loop subspace identification algorithms in the literature.

Chapter 4 provides an overview of closed-loop subspace identification methods. It also introduces a novel closed-loop subspace identification approach through an orthogonal projection. As a by-product of this development, it answers the question of why the presence of feedback control results in a bias error in some existing subspace identification methods. A solution is provided to eliminate the bias.

Chapter 5 introduces a practical subspace identification method for directly obtaining the process dynamic matrix and the noise model from closed-loop data, which is useful for industrial model predictive control design.

Chapter 6 provides an introduction to conventional model predictive control design. A simplified step-by-step dynamic matrix control design procedure is illustrated.

Chapter 7 describes a method for designing predictive controllers directly from input-output data via the subspace approach.

Chapter 8 gives a tutorial on conventional univariate as well as multivariate control loop performance assessment methods and introduces representative algorithms behind this important technology.

Chapter 9 presents an overview of most existing model predictive control performance monitoring algorithms. It serves as an introduction to the related research directions and motivations for the remaining chapters.

Chapter 10 provides a novel approach for multivariate feedback control performance assessment directly from input-output data using subspace methods, without relying on a model or interactor matrix. The method consolidates the traditional three-step procedure (model identification, closed-loop time series analysis, and extraction of the minimum variance control (MVC) benchmark) into a one-step solution, leading to an algorithm of considerable compactness.

Chapter 11 explores alternative solutions for multivariate control performance assessment problems. It introduces a prediction error approach and does not need a priori knowledge about the process model or time delays/interactor matrices for control performance assessment. Two data-driven subspace algorithms are presented.

Chapter 12 describes a method for LQG-benchmark based controller performance analysis following the data-driven subspace approach. The LQG benchmark is useful for performance monitoring of model predictive control systems.
Suggested Sequence of Reading

This book aims to provide a relatively complete and coherent view of subspace methods for identification, predictive control, and performance monitoring, each starting from a tutorial introduction or overview of the subject matter, followed by in-depth discussions. Thus each of the three subjects may be read independently with some necessary background on subspace methods. For readers who are interested in general subspace methods for all three subjects as well as future research directions, complete reading of the book is recommended. For readers who are only interested in a specific subject, selected chapters and reading sequences are recommended as follows:

• Open-loop and closed-loop subspace identification: Chapter 2 =⇒ Chapter 3 =⇒ Chapter 4
• Subspace identification of dynamic matrices for model predictive control: Chapter 2 =⇒ Chapter 3 =⇒ Chapter 5
• Model predictive control tutorial and advanced subspace approach for data-driven predictive control: Chapter 6 =⇒ Chapter 3 =⇒ Chapter 7
• Conventional control performance monitoring and MPC performance monitoring: Chapter 8 =⇒ Chapter 9
• Subspace approach to control performance monitoring: Chapter 3 =⇒ Chapter 8 =⇒ Chapter 10 =⇒ Chapter 11 =⇒ Chapter 12
Acknowledgements

The material presented in this book is the outcome of several years of research effort by the authors. We started the research in this direction in 2000. We began preparing this book in 2004, when the first author was on sabbatical leave at the University of Duisburg in Germany. After returning to the University of Alberta, the writing continued over three and a half years to the current version. Many chapters of the book have gone through several rounds of revision through discussions with a number of colleagues, graduate students, and postdoctoral fellows. It is a pleasure to be able to thank the many people who have contributed so generously to the conception and creation of the material in this text.
We would like specifically to thank our colleagues and collaborators, Prof. Anthony Rossiter (Sheffield University), Prof. Joe Qin (University of Texas at Austin), Prof. Nina Thornhill (Imperial College London), Prof. Sirish Shah (University of Alberta), Prof. Steven Ding (University of Duisburg), and Prof. Tongwen Chen (University of Alberta), who have inspired many discussions on these and related topics over the past years; the Chair of the Department of Chemical and Materials Engineering at the University of Alberta, Prof. Fraser Forbes, for his support of the work; former and current graduate students and postdoctoral fellows, Dr. Fangwei Xu, Mr. Sien Xu, Dr. Xiaorui Wang, Dr. Salim Ahmed, Dr. Shrikant Bhat, Mr. Nima Danesh, and Mr. Yu Zhao, for their participation in the discussions; our computing support staff, Bob Barton and Jack Gibeau; and the other support staff of the Department of Chemical and Materials Engineering at the University of Alberta. The support from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Alexander von Humboldt Foundation for this and related research work is gratefully acknowledged. Last but not least, we would also like to acknowledge Mr. Oliver Jackson (Springer) for his editorial comments and detailed examination of the book.
Contents

Notation

1 Introduction
1.1 An Overview of This Book
1.2 Main Features of This Book
1.3 Organization of This Book

Part I Dynamic Modeling through Subspace Identification

2 System Identification: Conventional Approach
2.1 Introduction
2.2 Discrete-time Systems
2.2.1 Finite Difference Models
2.2.2 Exact Discretization for Linear Systems
2.2.3 Backshift Operator and Discrete-time Transfer Functions
2.3 An Example of System Identification: ARX Modeling
2.4 Persistent Excitation in Input Signal
2.5 Model Structures
2.5.1 Prediction Error Model (PEM)
2.5.2 AutoRegressive with Exogenous Input Model (ARX)
2.5.3 AutoRegressive Moving Average with Exogenous Input Model (ARMAX)
2.5.4 Box-Jenkins Model (BJ)
2.5.5 Output Error Model (OE)
2.5.6 MISO (Multi-input and Single-output) Prediction Error Model
2.5.7 State Space Model
2.6 Prediction Error Method
2.6.1 Motivation
2.6.2 Optimal Prediction
2.6.3 Prediction Error Method
2.7 Closed-loop Identification
2.7.1 Identifiability without External Excitations
2.7.2 Direct Closed-loop Identification
2.7.3 Indirect Closed-loop Identification
2.7.4 Joint Input-output Closed-loop Identification
2.8 Summary

3 Open-loop Subspace Identification
3.1 Introduction
3.2 Subspace Matrices Description
3.2.1 State Space Models
3.2.2 Notations and Subspace Equations
3.3 Open-loop Subspace Identification Methods
3.4 Regression Analysis Approach
3.5 Projection Approach and N4SID
3.5.1 Projections
3.5.2 Non-steady-state Kalman Filters
3.5.3 Projection Approach for Subspace Identification
3.6 QR Factorization and MOESP
3.7 Statistical Approach and CVA
3.7.1 CVA Approach
3.7.2 Determination of System Order
3.8 Instrument-variable Methods and EIV Subspace Identification
3.9 Summary

4 Closed-loop Subspace Identification
4.1 Introduction
4.2 Review of Closed-loop Subspace Identification Methods
4.2.1 N4SID Approach
4.2.2 Joint Input-Output Approach
4.2.3 ARX Prediction Approach
4.2.4 An Innovation Estimation Approach
4.3 An Orthogonal Projection Approach
4.3.1 A Solution through Orthogonal Projection
4.3.2 The Problem of Biased Estimation and the Solution
4.3.3 Model Extraction through Kalman Filter State Sequence
4.3.4 Extension to Error-in-variable (EIV) Systems
4.3.5 Simulation
4.4 Summary

5 Identification of Dynamic Matrix and Noise Model Using Closed-loop Data
5.1 Introduction
5.2 Estimation of Process Dynamic Matrix and Noise Model
5.2.1 Estimation of Dynamic Matrix of the Process
5.2.2 Estimation of the Noise Model
5.3 Some Guidelines for the Practical Implementation of the Algorithm
5.4 Extension to the Case of Measured Disturbance Variables
5.5 Closed-loop Simulations
5.5.1 Univariate System
5.5.2 Multivariate System
5.6 Identification of the Dynamic Matrix: Pilot-scale Experimental Evaluation
5.7 Summary

Part II Predictive Control

6 Model Predictive Control: Conventional Approach
6.1 Introduction
6.2 Understanding MPC
6.3 Fundamentals of MPC
6.3.1 Process and Disturbance Models
6.3.2 Predictions
6.3.3 Free and Forced Response
6.3.4 Objective Function
6.3.5 Constraints
6.3.6 Control Law
6.4 Dynamic Matrix Control (DMC)
6.4.1 The Prediction Model
6.4.2 Unconstrained DMC Design
6.4.3 Penalizing the Control Action
6.4.4 Handling Disturbances in DMC
6.4.5 Multivariate Dynamic Matrix Control
6.4.6 Hard Constrained DMC
6.4.7 Economic Optimization
6.5 Generalized Predictive Control (GPC)
6.6 Summary

7 Data-driven Subspace Approach to Predictive Control
7.1 Introduction
7.2 Predictive Controller Design from Subspace Matrices
7.2.1 Inclusion of Integral Action
7.2.2 Inclusion of Feedforward Control
7.2.3 Constraint Handling
7.3 Tuning the Noise Model
7.4 Simulations
7.5 Experiment on a Pilot-scale Process
7.6 Summary

Part III Control Performance Monitoring

8 Control Loop Performance Assessment: Conventional Approach
8.1 Introduction
8.2 SISO Feedback Control Performance Assessment
8.3 MIMO Feedback Control Performance Assessment
8.4 Summary

9 State-of-the-art MPC Performance Monitoring
9.1 Introduction
9.2 MPC Performance Monitoring: Model-based Approach
9.2.1 Minimum-variance Control Benchmark
9.2.2 LQG/MPC Benchmark
9.2.3 Model-based Simulation Approach
9.2.4 Designed/Historical vs Achieved
9.2.5 Historical Covariance Benchmark
9.2.6 MPC Performance Monitoring through Model Validation
9.3 MPC Performance Monitoring: Model-free Approach
9.3.1 Impulse-Response Curvature
9.3.2 Prediction-error Approach
9.3.3 Markov Chain Approach
9.4 MPC Economic Performance Assessment and Tuning
9.5 Probabilistic Inference for Diagnosis of MPC Performance
9.5.1 Bayesian Network for Diagnosis
9.5.2 Decision Making in Performance Diagnosis
9.6 Summary

10 Subspace Approach to MIMO Feedback Control Performance Assessment
10.1 Introduction
10.2 Subspace Matrices and Their Estimation
10.2.1 Revisit of Important Subspace Matrices
10.2.2 Estimation of Subspace Matrices from Open-loop Data
10.3 Estimation of MVC-benchmark from Input/Output Data
10.3.1 Closed-loop Subspace Expression of Process Response under Feedback Control
10.3.2 Estimation of MVC-benchmark Directly from Input/Output Data
10.4 Simulations and Application Example
10.5 Summary

11 Prediction Error Approach to Feedback Control Performance Assessment
11.1 Introduction
11.2 Prediction Error Approach to Feedback Control Performance Assessment
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
11.3.1 Preliminary
11.3.2 Calculation of Multi-step Optimal Prediction Errors
11.3.3 Case Study
11.4 Summary

12 Performance Assessment with LQG-benchmark from Closed-loop Data
12.1 Introduction
12.2 Obtaining LQG-benchmark from Feedback Closed-loop Data
12.3 Obtaining LQG-benchmark with Measured Disturbances
12.4 Controller Performance Analysis
12.4.1 Case 1: Feedback Controller Acting on the Process with Unmeasured Disturbances
12.4.2 Case 2: Feedforward Plus Feedback Controller Acting on the Process
12.4.3 Case 3: Feedback Controller Acting on the Process with Measured Disturbances
12.5 Summary of the Subspace Approach to the Calculation of LQG-benchmark
12.6 Simulations
12.7 Application on a Pilot-scale Process
12.8 Summary

References
Index
Notation

$[A, B, C, D]$ : Dynamic state space system matrices of the process
$[A_{cl}, B_{cl}, C_{cl}, D_{cl}]$ : Dynamic state space system matrices of the closed-loop system
$[A_c, B_c, C_c, D_c]$ : Dynamic state space system matrices of the controller
$[B_m, D_m]$ : Dynamic state space system matrices corresponding to the measured variables
$A(z^{-1}), B(z^{-1}), C(z^{-1}), D(z^{-1})$ : Polynomials in the backshift operator $z^{-1}$
$\Delta u$ : Incremental control moves vector over the control horizon
$\hat{y}$ : Predicted output trajectory vector over the prediction horizon
$\hat{y}_t$ : Predicted value of system output(s) at sampling instant $t$
$r$ : Setpoint (reference) trajectory vector over the prediction horizon
$y^*$ : Predicted free response trajectory vector over the prediction horizon
$\tilde{G}_p(z^{-1})$ : Delay-free transfer function matrix of $G_p$
$a_t$ : Integrated white noise
$d$ : Process time delay for a univariate process, or order of the interactor matrix for a multivariate process
$E[\cdot]$ : Expectation operator
$E_f$ : Future data Hankel matrix for $e_t$
$E_i$ : Polynomial obtained in Diophantine expansion
$E_p$ : Past data Hankel matrix for $e_t$
$e_t$ : White noise (innovations) sequences
$F_i$ : Markov parameter (or impulse response) matrix at the $i$th sample
$f_i$ : Impulse response coefficient at the $i$th sample
$G_i^m$ : Multivariate step response coefficient matrix corresponding to the measured disturbance input at the $i$th sample
$G_{cl}(z^{-1})$ : Transfer function representation of the closed-loop system
$G_c(z^{-1})$ : Transfer function representation of the controller
$G_c^s$ : State space representation of the controller
$G_i$ : Multivariate step response coefficient matrix corresponding to the deterministic input at the $i$th sample
$g_i$ : Univariate step response coefficient at the $i$th sample
$G_l(z^{-1})$ : Transfer function representation of the stochastic part of the system, or disturbance model
$G_p(z^{-1})$ : Transfer function representation of the deterministic part of the system, or process model
$G_p^s$ : State space representation of the process
$h$ : Dimension of measured disturbance(s)
$H_N^{cl}$ : Lower triangular Toeplitz matrix of the closed-loop system, defined as
$$H_N^{cl} = \begin{bmatrix} D_{cl} & 0 & \cdots & 0 \\ C_{cl}B_{cl} & D_{cl} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ C_{cl}A_{cl}^{N-2}B_{cl} & C_{cl}A_{cl}^{N-3}B_{cl} & \cdots & D_{cl} \end{bmatrix}$$
$H_N^c$ : Lower triangular Toeplitz matrix of the controller, defined as
$$H_N^{c} = \begin{bmatrix} D_{c} & 0 & \cdots & 0 \\ C_{c}B_{c} & D_{c} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ C_{c}A_{c}^{N-2}B_{c} & C_{c}A_{c}^{N-3}B_{c} & \cdots & D_{c} \end{bmatrix}$$
$H_N^d$ : Lower triangular deterministic Toeplitz matrix, defined as
$$H_N^{d} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ CB & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{N-2}B & CA^{N-3}B & \cdots & 0 \end{bmatrix}$$
$H_N^m$ : Lower triangular Toeplitz matrix corresponding to the measured disturbances, defined as
$$H_N^{m} = \begin{bmatrix} D^m & 0 & \cdots & 0 \\ CB^m & D^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{N-2}B^m & CA^{N-3}B^m & \cdots & D^m \end{bmatrix}$$
$H_N^s$ : Lower triangular stochastic Toeplitz matrix, defined as
$$H_N^{s} = \begin{bmatrix} I & 0 & \cdots & 0 \\ CK & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{N-2}K & CA^{N-3}K & \cdots & I \end{bmatrix}$$
$I$ : Identity matrix
$J$ : Optimization objective function
$j$ : Number of block-columns in the block-Hankel matrices
$J_{mvc}$ : Minimum variance control objective function
$K$ : Kalman filter gain matrix
$K'$ : $= K - K^*$
$K^*$ : Modified Kalman filter gain matrix
$K_{lqg}$ : LQG state feedback gain
$l$ : Dimension of system input(s)
$L_{ue}^{CL}$ : Closed-loop subspace matrix from $E_f \rightarrow U_f$
$L_{ur}^{CL}$ : Closed-loop subspace matrix from $R_f \rightarrow U_f$
$L_{u}^{CL}$ : Closed-loop subspace matrix from $W_p^{CL} \rightarrow U_f$
$L_{ye}^{CL}$ : Closed-loop subspace matrix from $E_f \rightarrow Y_f$
$L_{yr}^{CL}$ : Closed-loop subspace matrix from $R_f \rightarrow Y_f$
$L_{y}^{CL}$ : Closed-loop subspace matrix from $W_p^{CL} \rightarrow Y_f$
$L_e$ : Subspace matrix containing the noise model Markov parameters; $L_e$ is shorthand for $H_i^s$ where $i$ is typically selected as $N$
$L_m$ : Subspace matrix containing the measured disturbance Markov parameters
$L_u$ : Subspace matrix containing the process Markov parameters; $L_u$ is shorthand for $H_i^d$ where $i$ is typically selected as $N$
$L_w$ : Subspace matrix corresponding to past inputs and outputs (or state)
$L_w^b$ : Subspace matrix corresponding to past inputs and outputs (or state) in the presence of measured disturbances
$m$ : Dimension of system output(s)
$M_f$ : Future data Hankel matrix for $m_t$
$M_p$ : Past data Hankel matrix for $m_t$
$m_t$ : Measured disturbance(s) at sampling instant $t$
$N$ : Number of block-rows in the block-Hankel matrices
$n$ : Dimension of system state(s)
$N_1, N_2$ : Parameters in the prediction horizon
$N_u$ : Control horizon
$Q$ : Non-negative definite weighting matrix
$R$ : Non-negative definite weighting matrix
$R_1, R_2$ : Non-negative definite weighting matrices in the control objective function
$R_f$ : Future data Hankel matrix for the setpoint $r_t$
$R_p$ : Past data Hankel matrix for the setpoint $r_t$
$r_t$ : System setpoint(s) at sampling instant $t$
$S_f$ : Future data Hankel matrix for $s_t$
$S_N$ : Dynamic matrix (with $N$ block rows and $N$ block columns) of step-response coefficients
$s_t$ : Output measurement noise at sampling instant $t$
$U, S, V$ : Matrices from singular value decomposition
$U_1, U_2$ : Left matrices obtained in singular value decomposition
$U_f$ : Future data Hankel matrix for $u_t$, defined as
$$U_f = \begin{bmatrix} u_N & u_{N+1} & \cdots & u_{N+j-1} \\ u_{N+1} & u_{N+2} & \cdots & u_{N+j} \\ \vdots & \vdots & & \vdots \\ u_{2N-1} & u_{2N} & \cdots & u_{2N+j-2} \end{bmatrix}$$
$U_f^*$ : Future data Hankel matrix for the measured inputs $u_t^*$ for EIV systems; see also $U_f$
$U_p$ : Past data Hankel matrix for $u_t$, defined as
$$U_p = \begin{bmatrix} u_0 & u_1 & \cdots & u_{j-1} \\ u_1 & u_2 & \cdots & u_j \\ \vdots & \vdots & & \vdots \\ u_{N-1} & u_N & \cdots & u_{N+j-2} \end{bmatrix}$$
$U_p^*$ : Past data Hankel matrix for the measured inputs $u_t^*$ for EIV systems; see also $U_p$
$u_t$ : System input(s) at sampling instant $t$
$u_t^*$ : Measured system input(s) at sampling instant $t$ for EIV systems
$V_1, V_2$ : Right matrices obtained in singular value decomposition
$V_f$ : Future data Hankel matrix for $v_t$; see also $U_f$
$V_p$ : Past data Hankel matrix for $v_t$; see also $U_p$
$v_t$ : Input measurement noise
$W, W_1, W_2$ : Non-negative definite weighting matrices
$W_f$ : $= \begin{bmatrix} Y_f \\ U_f \end{bmatrix}$
$W_f^*$ : $= \begin{bmatrix} Y_f^* \\ U_f^* \end{bmatrix}$
$W_p^r$ : $= \begin{bmatrix} R_f \\ W_p \end{bmatrix}$
$W_p^{r*}$ : $= \begin{bmatrix} R_f \\ W_f^* \end{bmatrix}$
$W_p$ : $= \begin{bmatrix} Y_p \\ U_p \end{bmatrix}$
$W_p^b$ : $= \begin{bmatrix} Y_p \\ U_p \\ M_p \end{bmatrix}$
$W_p^{CL}$ : $= \begin{bmatrix} Y_p \\ U_p \\ R_p \end{bmatrix}$
$w_t$ : Process noise
$X_f^c$ : Future state matrix of the controller, defined as $\begin{bmatrix} x_N^c & x_{N+1}^c & \cdots & x_{N+j-1}^c \end{bmatrix}$
$X_p^c$ : Past state matrix of the controller, defined as $\begin{bmatrix} x_0^c & x_1^c & \cdots & x_{j-1}^c \end{bmatrix}$
$x_t^c$ : Controller state(s) at sampling instant $t$
$X_f$ : Future state matrix, defined as $\begin{bmatrix} x_N & \cdots & x_{N+j-1} \end{bmatrix}$
$X_f^b$ : Future state matrix when the system has measured disturbance variables, defined as $\begin{bmatrix} x_N^b & \cdots & x_{N+j-1}^b \end{bmatrix}$
$X_f^{cl}$ : Future closed-loop state matrix
$X_p$ : Past state matrix, defined as $\begin{bmatrix} x_0 & \cdots & x_{j-1} \end{bmatrix}$
$x_t$ : System state(s) at sampling instant $t$
$x_t^s$ : Stochastic component of system state(s) at sampling instant $t$
$y_{c,t}$ : Forced response of the process output
$y_{f,t}$ : Free response of the process output
$Y_f$ : Future data Hankel matrix for $y_t$
$Y_f^*$ : Future data Hankel matrix for the measured outputs $y_t^*$ for EIV systems; see also $U_f$
$Y_p$ : Past data Hankel matrix for $y_t$
$Y_p^*$ : Past data Hankel matrix for the measured outputs $y_t^*$ for EIV systems; see also $U_p$
$y_t$ : System output(s) at sampling instant $t$
$y_t^*$ : Measured system output(s) at sampling instant $t$
$y_t^d$ : Deterministic component of the system output(s) at sampling instant $t$
$y_t^s$ : Stochastic component of the system output(s) at sampling instant $t$
$D(z)$ : Interactor matrix
$F, F_i$ : Polynomials obtained in Diophantine expansion
$G_i, \Gamma_i$ : Polynomials obtained in Diophantine expansion
$(\eta)_{fb}, (\eta)_{ff\&fb}$ : LQG-benchmark based controller performance indices with respect to process output variance
$(E)_{fb}, (E)_{ff\&fb}$ : LQG-benchmark based controller performance indices with respect to process input variance
$(I_\eta)_{fb}, (I_\eta)_{ff\&fb}$ : LQG-benchmark based performance improvement indices with respect to process output variance
$(I_E)_{fb}, (I_E)_{ff\&fb}$ : LQG-benchmark based performance improvement indices with respect to process input variance
$\bar{\Gamma}_N$ : $\Gamma_N$ with its first $m$ rows removed
$\Delta$ : Differencing operator $(1 - z^{-1})$
$\Delta_N^c$ : Reversed extended controllability matrix of $\{A_c, B_c\}$, $= \begin{bmatrix} A_c^{N-1}B_c & A_c^{N-2}B_c & \cdots & B_c \end{bmatrix}$
$\Delta_N^d$ : Reversed extended controllability matrix of $\{A, B\}$, $= \begin{bmatrix} A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix}$
$\Delta_N^s$ : Reversed extended controllability matrix of $\{A, K\}$, $= \begin{bmatrix} A^{N-1}K & A^{N-2}K & \cdots & K \end{bmatrix}$
$\eta$ : Performance index
$\Gamma_N^b$ : Extended observability matrix of the expanded system with inputs and measured disturbances
$\Gamma_N^{cl}$ : Extended observability matrix of the closed-loop system
$\Gamma_N^c$ : Extended observability matrix of the controller
$\Gamma_N$ : Extended observability matrix of the process, defined as $\Gamma_N = \begin{bmatrix} C^T & (CA)^T & \cdots & (CA^{N-1})^T \end{bmatrix}^T$
$\gamma_N$ : Markov parameters used in noise model tuning
$\lambda$ : Weighting on the control effort in control objective functions
$\omega_i, \lambda_i, \gamma_i, \psi_i$ : Parameters for calculating input and output variances for LQG benchmarking
$\Sigma_e$ : Covariance of the innovation (white noise) sequence $e_t$
$\underline{\Gamma}_N$ : $\Gamma_N$ with its last $m$ rows removed
$(t + j \mid t)$ : $j$-step-ahead prediction from time instant $t$. For example, $\hat{y}(t+2 \mid t)$ is a two-step-ahead prediction from time instant $t$; for system identification, the prediction is based on past inputs and outputs, while for predictive control, the prediction is based on past outputs, past inputs, and future inputs.
$(t \mid t - j)$ : $j$-step-ahead prediction from time instant $t-j$. For example, $\hat{y}(t \mid t-2)$ is a two-step-ahead prediction from time instant $t-2$; see $(t+j \mid t)$ for additional explanation.
$A /_B C$ : Oblique projection, $= [A/B^{\perp}][C/B^{\perp}]^{\dagger}$
$t_1 \mid t_2$ : Subscript; the first column of the subspace matrix/vector starts from time $t_1$ and ends by time $t_2$. If $x$ is a vector, for example, then $x_{t_1 \mid t_2} = \begin{bmatrix} x_{t_1} \\ x_{t_1+1} \\ \vdots \\ x_{t_2} \end{bmatrix}$
$<$ : $X < Y$ implies that $X - Y$ is semi-positive definite
$*$ : Superscript; measured value of the corresponding variable
$\perp$ : Superscript; orthogonal complement
$\dagger$ : Superscript; Moore-Penrose pseudo-inverse
$T$ : Superscript; transpose transformation
1 Introduction
1.1 An Overview of This Book

A typical industrial plant can have controllers ranging from proportional-integral-derivative controllers (PI/PID) to advanced model predictive controllers (MPC), such as dynamic matrix control (DMC) [1, 2], quadratic dynamic matrix control (QDMC) [3, 4], robust multivariable predictive control technology (RMPCT) (see review in [5, 6]), generalized predictive control (GPC) [7, 8], etc. With the goals of optimal performance, energy conservation and cost effectiveness of process operations in industry, the design of optimal controllers and controller performance assessment (in the literature, control performance assessment has also been called control performance monitoring; this book does not attempt to distinguish between the two terms) have received great attention in both industry and academia. Typically a ‘model’ or some sort of mathematical representation of the process and the control objective are required not only for designing suitable controllers but also for analyzing controller performance. For predictive controllers, which use a model of the system to make predictions, model identification forms the critical part of controller design. Identification aims at finding a mathematical model from the measurement record of inputs and outputs of a system [9, 10, 11]. Parametric model identification, such as of a transfer function or a state space model, involves obtaining reduced-order models of a pre-specified structure for a system that could be of a very high order and complexity. Nonparametric modeling approaches, such as impulse/step response modeling and frequency-domain based modeling, can also be found in the literature for controller design and performance analysis. Interestingly, nonparametric-model based controller design and analysis tools have been used in industry quite successfully. As a convention, identification of parametric or nonparametric models for the process is typically used as a first step in MPC design, and the models are then converted to prediction matrices as the second step before the calculation of control actions is performed. Data-driven approaches that obtain the prediction matrices used in the controller design directly from process data, avoiding the intermediate explicit model identification procedure, have become an area of active research in recent years [12, 13, 14, 15, 16, 17, 18].
Traditionally, prediction error methods are used to identify parametric models. Subspace identification methods, with their computational advantages, have emerged as a powerful alternative to prediction error methods. They estimate state space models directly from the input-output data and eliminate certain constraints of prediction error methods, such as a priori structure selection and non-linear optimization. In subspace identification methods, certain matrices, which capture the correlations between the process inputs and outputs in the form of subspace matrices, are estimated as a first step by projections of the process input-output data. Lower-order state space system matrices are then obtained from these intermediate subspace matrices. It has been found, however, that the intermediate matrices contain the prediction information and can be used to derive predictors directly for the system. There have been attempts to design predictive controllers directly from these matrices. This book is motivated by the idea of designing predictive controllers directly from the intermediate subspace matrices, and extends the data-driven subspace approach to other important process control areas such as closed-loop identification and controller performance assessment.

As a comparison, the conventional control design approach based on system identification, such as subspace identification, consists of 1) calculating subspace matrices through data projections, 2) extracting state space models from the subspace matrices, and 3) deriving prediction matrices from the state space models for calculating control actions. The data-driven approach of this book, however, consists of 1) calculating subspace matrices through data projections, and 2) deriving prediction matrices directly from the subspace matrices. Thus, the intermediate step of obtaining state space models is avoided.
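To give a flavor of the first step shared by both pipelines, the following minimal sketch (our own illustration, not an algorithm from the book; the function name and data sizes are hypothetical) forms the past and future block-Hankel input matrices U_p and U_f defined in the Notation section; the output matrices Y_p and Y_f are built the same way, and subspace methods then work with projections among these matrices.

```python
import numpy as np

def block_hankel(x, num_rows, num_cols, start):
    """Block-Hankel matrix: entry (i, c) is x[start + i + c]."""
    return np.array([[x[start + i + c] for c in range(num_cols)]
                     for i in range(num_rows)])

u = np.arange(20.0)   # a scalar input sequence u_0, u_1, ..., u_19
N, j = 3, 5           # N block rows, j block columns (see Notation)
Up = block_hankel(u, N, j, start=0)   # rows start at u_0, ..., u_{N-1}
Uf = block_hankel(u, N, j, start=N)   # rows start at u_N, ..., u_{2N-1}
print(Up)
print(Uf)
```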
The research on control loop performance monitoring and diagnostics has been and remains one of the most active research areas in the process control community [19, 20, 21, 22, 23, 24]. Despite numerous developments, it remains a challenging problem to obtain a minimum variance control benchmark from routine operating data for a multivariable process, since the solution relies on the interactor matrix (or inverse time delay matrix). Knowing the interactor matrix is tantamount to having a complete knowledge of process models, which are often either not available or not accurate enough for a meaningful calculation of the benchmark. There is growing research interest in reducing the complexity of the a priori knowledge requirement, such as Ko and Edgar (2001) [25], Kadali and Huang (2003) [26], and McNabb and Qin (2003) [27]. Although these attempts have reduced the complexity of the a priori knowledge requirement to some extent, they all require certain information that is computationally simpler but fundamentally equivalent to the interactor matrices; for example, the open-loop process Markov parameter matrices, the lower triangular Toeplitz matrix, or the multivariate time delay (MTD) matrix. That is, they all require a priori knowledge that is beyond the time delays between each pair of the inputs and outputs. Thus, derivation of a method for multivariate control performance monitoring directly from input-output data is of great interest. By exploring the subspace approach to control performance monitoring, a solution is presented that depends on input-output data only. In addition to the challenge facing minimum variance control benchmarking, many other challenges also exist, for example, model predictive control performance monitoring; thus other alternative performance monitoring approaches are also introduced, built upon the same data-driven subspace framework.

Just as model identification is a critical pre-step for the design of optimal predictive controllers, performance monitoring is an important post-step to ensure optimal performance of the designed controller in operation. As process modifications/changes take place over time, the models used in the controller may need to be revisited and re-identified in order to sustain controller performance if a model-based control strategy is used. Open-loop model identification may not always be possible when the process is in operation. Closed-loop identification is a more feasible solution under such a scenario. Hence, closed-loop identification, predictive controller design and controller performance assessment are closely related in industrial control applications. Although this book focuses on a data-driven approach to predictive control design and control performance assessment, model identification, particularly closed-loop subspace model identification, from input-output data is another important ingredient of this book. Conventional prediction error approaches for closed-loop identification have been well addressed [28, 29, 11, 30, 10]. Subspace closed-loop identification remains mostly an outstanding problem, and has received much attention from a number of researchers [31, 13, 14, 30, 32, 33, 34, 35, 36]. It has been found that conventional open-loop subspace identification algorithms often yield biased estimation when applied to closed-loop data [30]. Through an overview of the existing subspace identification methods, this book introduces a novel approach that is based on orthogonal projection of input-output data [36] to address the bias problem.

Why do we call the approach introduced in this book ‘data-driven’? This question may be answered merely by browsing the topics to be discussed, including:

• closed-loop estimation of the dynamic matrix and noise model directly from input-output data;
• predictive controller design without explicit models;
• controller performance analysis using the linear quadratic Gaussian (LQG) controller without an explicit model;
• estimation of the multivariate minimum variance control (MVC) benchmark without calculating the interactor matrix or models;
• a prediction-error approach to performance monitoring directly from subspace matrices.

These topics can be considered ‘data-driven’ in the sense that no explicit models are used. Certain intermediate subspace matrices obtained directly from the process data, which are calculated as a first step in subspace identification, are used in the closed-loop identification, controller design and performance assessment. Therefore, the primary difference between the data-driven approach and the conventional approach is whether an explicit model is used.
1.2 Main Features of This Book

This book presents both theoretical work and applications in the three important areas of advanced control: system identification, predictive control, and performance assessment, using a data-driven subspace approach. Pertinent features include:

1. A tutorial-style overview of conventional system identification, open-loop and closed-loop subspace identification, predictive control design, and control loop performance assessment.
2. A novel data-driven framework for synthesis of modeling, predictive control, and control performance assessment.
3. Development of a closed-loop subspace identification approach via orthogonal projection.
4. Analysis of bias error in closed-loop subspace identification.
5. Identification of dynamic matrices and noise models from closed-loop data.
6. Derivation of a predictive control law directly from process input-output data using subspace matrices, including all the practical features, such as:
   • an integrator in the control law,
   • feedforward control, and
   • noise model tuning,
   required for practical implementation of the data-driven subspace predictive controller.
7. Derivation of algorithms for calculation of the LQG-benchmark from the subspace matrices estimated using closed-loop data.
8. Theoretical proof and derivation of algorithms for obtaining the multivariate minimum variance control benchmark that can be calculated directly from subspace matrices.
9. Derivation of alternative multivariate control performance assessment methods using the data-driven subspace approach.
1.3 Organization of This Book

As reflected in the title, this book is concerned with three subject areas, namely dynamic modeling, predictive control, and control performance monitoring, which are linked through the subspace approach. Consequently, the book can be naturally divided into three parts with a common theme. To appreciate the subspace approach as it is applied to the three subject areas, conventional approaches need to be understood. Fundamental knowledge of conventional system identification, model predictive control, and control performance monitoring is essential for understanding the advanced material presented in this book. Thus, each part of the book starts with a tutorial introduction to the subject area.

Part I starts in Chapter 2, which provides an overview of dynamic modeling through the conventional system identification methods. Chapter 3 serves as a starting point to introduce the data-driven subspace approach. The key subspace matrix equations that form the core of data-driven approaches can be found in this chapter.
Chapter 4 provides an introduction to conventional closed-loop subspace identification, followed by a novel approach to solve a closed-loop subspace identification problem. Part I ends with Chapter 5, which introduces a data-driven closed-loop identification method for dynamic matrix estimation, a critical matrix for conventional model predictive control.

Part II starts with a tutorial-oriented introduction to conventional model predictive control in Chapter 6, followed by a novel data-driven approach to predictive control design in Chapter 7.

In Part III, conventional control performance monitoring is introduced in Chapter 8, followed by an overview of state-of-the-art model predictive control performance monitoring techniques in Chapter 9. The other three chapters of Part III are devoted to novel data-driven subspace methods for control performance monitoring.

Some of the material presented in this book has been published in archival journals by the authors, and is included in this book after necessary modifications or updates (some modifications are major ones) to ensure accuracy, relevance, completeness and coherence. This portion of material includes:

• Section 4.3 of Chapter 4, reprinted from Journal of Process Control, vol. 15, Huang, B., S.X. Ding, J. Qin, “Closed-loop Subspace Identification: an Orthogonal Projection Approach”, 53-66, © 2005 Elsevier Ltd., with permission from Elsevier.
• Chapter 5, reprinted with permission from Ind. Eng. Chem. Res., vol. 41, Kadali, R., B. Huang, “Estimation of the Dynamic Matrix and Noise Model for Model Predictive Control Using Closed-Loop Data”, 842-852, © 2002 American Chemical Society.
• Chapter 7, reprinted from Control Engineering Practice, vol. 11, Kadali, R., B. Huang, A. Rossiter, “A Data Driven Subspace Approach to Predictive Controller Design”, 261-278, © 2003 Elsevier Ltd., with permission from Elsevier.
• Chapter 11, reprinted from Journal of Process Control, vol. 16, Huang, B., S.X. Ding, and N. Thornhill, “Alternative Solutions to Multivariate Feedback Control Performance Assessment Problems”, 457-471, © 2006 Elsevier Ltd., with permission from Elsevier.
• Chapter 12, reprinted from ISA Transactions, vol. 41, R. Kadali, and B. Huang, “Controller Performance Analysis with LQG Benchmark Obtained under Closed Loop Conditions”, 521-537, © 2002 ISA. All Rights Reserved. Reprinted with permission.
2 System Identification: Conventional Approach
2.1 Introduction

A necessary prerequisite for conventional model-based control is a model of the process. Such certainty-equivalence, model-based control schemes rely on an off-line estimated model of the process, i.e., the process is “probed” or excited by a carefully designed input signal under open-loop conditions and the input-output data are used to generate a suitable model of the process. Almost always, reduced-complexity models are generated to capture the most dominant dynamics of the process. Such batch or off-line identification methods represent a major effort and may require anywhere from several hours to several weeks of open-loop tests [22].

In contrast to this, the objective in closed-loop identification is to use closed-loop operating data with external excitations to develop a dynamic model of the process. Practically, it is an appealing idea. In this mode, process identification can begin with the process in its natural closed-loop state. In some cases, the plant has to run under closed-loop conditions for safety reasons. In other cases, if a linearized dynamic model around a nominal operating point is desired, this may be achieved better by closed-loop identification.

Classical open-loop and closed-loop identification theory has been well developed and presented in the celebrated textbooks by Ljung (1999) [11] and Söderström and Stoica (1989) [10]. The most common identification methods are the prediction error method [11] and the instrument variable method [10]. As a prerequisite for understanding the remainder of this book, this chapter is devoted to a tutorial-style introduction to the most representative classical system identification method, namely the prediction error method, following the approach of [10, 11]. The closed-loop identification problem will also be reviewed. For in-depth discussion of these and related subjects, readers are referred to [10, 11].
2.2 Discrete-time Systems

The data used for system identification is discrete in general while the system itself is continuous. In this section, following an approach taken by [37], we will review the conversion between continuous- and discrete-time systems. For detailed derivations, readers are referred to [37] and other textbooks on digital control or sampled-data control systems.
2.2.1 Finite Difference Models
The simplest approach to convert continuous-time models to discrete-time models is through the finite difference technique. A differential equation,

$$\frac{dy(t)}{dt} = f(y(t), u(t)) \qquad (2.1)$$

can be numerically integrated by introducing a finite difference approximation for the derivative. For example, the first-order, backward difference approximation to the derivative is

$$\frac{dy(t)}{dt} \approx \frac{y_t - y_{t-1}}{T_s} \qquad (2.2)$$

where $T_s$ is the integration interval or sampling interval that is specified by the user, and $y_t$ is the sampled quantity of $y(t)$. Substituting Equation 2.2 into Equation 2.1 and evaluating the function $f(y(t), u(t))$ at the previous samples of $y$ and $u$ (i.e., $y_{t-1}$ and $u_{t-1}$) gives

$$\frac{y_t - y_{t-1}}{T_s} \approx f(y_{t-1}, u_{t-1}) \qquad (2.3)$$

or

$$y_t \approx y_{t-1} + T_s f(y_{t-1}, u_{t-1}) \qquad (2.4)$$

Equation 2.4 is a first-order difference equation that can be used to predict the value of $y$ at time instant $t$ based on information at the previous time sample $t-1$, namely $y_{t-1}$ and $f(y_{t-1}, u_{t-1})$. This type of expression is called a recurrence relation. It can be used to numerically integrate Equation 2.1 by calculating $y_t$ for $t = 0, 1, 2, \cdots$ starting from a known initial condition $y_0$ and the sequence of inputs. In general, the resulting numerical solution $\{y_t, t = 1, 2, 3, \cdots\}$ becomes more accurate and approaches the correct solution $y(t)$ as $T_s$ decreases. However, for extremely small values of $T_s$, computer roundoff errors can be a significant source of error. Thus, the finite difference method is only an approximation of continuous-time systems by discrete-time ones.
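As a concrete illustration of the recurrence relation, the following sketch (our own example; the process parameters and function names are hypothetical) integrates a first-order system with Equation 2.4 and shows the approximation improving as $T_s$ decreases.

```python
import numpy as np

def euler_recurrence(f, y0, u, Ts):
    # y_t = y_{t-1} + Ts * f(y_{t-1}, u_{t-1})   (Equation 2.4)
    y = np.zeros(len(u) + 1)
    y[0] = y0
    for t in range(1, len(y)):
        y[t] = y[t - 1] + Ts * f(y[t - 1], u[t - 1])
    return y

# First-order process dy/dt = -(1/tau) y + (k/tau) u (cf. Equation 2.5 below)
tau, k = 5.0, 2.0
f = lambda y, u: (-y + k * u) / tau

t_end = 20.0
for Ts in (2.0, 0.5, 0.01):
    u = np.ones(int(t_end / Ts))          # unit step input
    y = euler_recurrence(f, 0.0, u, Ts)
    exact = k * (1.0 - np.exp(-t_end / tau))
    print(f"Ts = {Ts:5.2f}: y(20) = {y[-1]:.4f} (exact {exact:.4f})")
```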
Exact Discretization for Linear Systems
For a system described by a linear differential equation, an alternative discretetime model can be derived based on the analytical solution for a piecewise constant input. This approach yields an exact discrete-time model if the input variable is actually constant between samplings. Thus, this analytical approach eliminates the discretization error inherent in finite difference approximations for the important practical situation where the digital computer output (process input) is held constant between sampling instants. This is indeed the case if the
2.2 Discrete-time Systems
11
digital to analog device acts as a zero-order hold. We shall illustrate the exact discretization procedure through an example. Consider a first-order differential equation: 1 k dy(t) = − y(t) + u(t) dt τ τ
(2.5)
where u(t) is a piecewise continuous signal. For 0 < t ≤ Ts , u(t) is constant, such that u(t) = u(0). Taking the Laplace transform of Equation 2.5 gives 1 k u(0) sY (s) − y(0) = − Y (s) + τ τ s Solving for Y (s), Y (s) =
1 1 ku(0) [ + y(0)] s + 1/τ τ s
and taking inverse Laplace transform gives y(t) = ku(0)(1 − e−t/τ ) + y(0)e−t/τ
(2.6)
Equation 2.6 is valid for all values of 0 < t ≤ Ts . Thus, after one sampling period, at t = Ts , we have y(Ts ) = ku(0)(1 − e−Ts /τ ) + y(0)e−Ts /τ
(2.7)
We can generalize the analysis by considering the time interval, (t − 1)Ts to tTs . For an initial condition y[(t − 1)Ts ] and a constant input, u(t) = u[(t − 1)Ts ] between (t − 1)Ts and tTs , the analytical solution to Equation 2.5 is y(tTs ) = ku[(t − 1)Ts ](1 − e−Ts /τ ) + y[(t − 1)Ts ]e−Ts /τ
(2.8)
Note that the exponential terms are the same as in Equation 2.7. Finally, Equation 2.8 can be written more compactly as yt = e−Ts /τ yt−1 + k(1 − e−Ts /τ )ut−1 The result of exact discretization discussed can be generalized to higher order systems by following a state space representation of systems. 2.2.3
Backshift Operator and Discrete-time Transfer Functions
The backshift operator z −1 is an operator which moves a signal one step back, i.e., z −1 yt = yt−1 . Similarly, z −2 yt = z −1 yt−1 = yt−2 and zyt = yt+1 . It is convenient to use the backshift operator to represent a difference equation. For example, a difference equation yt = 1.5yt−1 − 0.5yt−2 + 0.5ut−1 can be represented as
(2.9)
12
2 System Identification: Conventional Approach
yt = 1.5z −1 yt − 0.5z −2yt + 0.5z −1ut This can be further written as yt 0.5z −1 = ut 1 − 1.5z −1 + 0.5z −2
(2.10)
This equation is also called a discrete transfer function. Time-delays can be easily represented by the backshift operator. For example, e−5s represents 5 units of time-delay for a continuous-time system. If the sampling period is 1 unit, then this delay can be represented as z −5 for the discrete-time system.
2.3 An Example of System Identification: ARX Modeling System identification is the field of modeling dynamics systems from experimental data, often through discrete-time transfer functions or difference equations. For example, if a dynamics system can be represented by a difference equation: yt + a1 yt−1 + . . . + ana yt−na = b1 ut−1 + b2 ut−2 + . . . + bnb ut−nb
(2.11)
then the task of system identification is to estimate parameters {a1 , · · · , ana } and {b1 , · · · , bnb } using plant input and output data {u1 , u2 , · · · , uN } and {y1 , y2 , · · · , yN }. The advantage of linear difference equation models is that they are particularly convenient to estimate the unknown parameters using linear regression. Consider Equation 2.11, corrupted by noise: yt = −a1 yt−1 − a2 yt−2 − · · · − ana yt−na + +b1 ut−1 + b2 ut−2 + · · · + bnb ut−nb + et
(2.12)
where et is white noise. This is also regarded as an AutoRegressive with eXogenous input (ARX) model. The transfer function form of the ARX model is yt =
b1 z −1 + b2 z −2 + · · · + bnb z −nb ut 1 + a1 z −1 + a2 z −2 + · · · + ana z −na 1 + et 1 + a1 z −1 + a2 z −2 + · · · + ana z −na
For example, a second order ARX model can be written as yt = −a1 yt−1 − a2 yt−2 + b1 ut−1 + b2 ut−2 + et
(2.13)
Suppose that after having excited the system with the input sequence: {u1 , u2 , · · · , uN −1 , uN }, the process response data {y1 , y2 , · · · , yN −1 , yN } are measured. Starting from n = 3 in Equation 2.13, we can write y3 = −a1 y2 − a2 y1 + b1 u2 + b2 u1 + e3 y4 = −a1 y3 − a2 y2 + b1 u3 + b2 u2 + e4 ··· = ··· yN −1 = −a1 yN −2 − a2 yN −3 + b1 uN −2 + b2 uN −3 + eN −1 yN = −a1 yN −1 − a2 yN −2 + b1 uN −1 + b2 uN −2 + eN
2.4 Persistent Excitation in Input Signal
13
Then the parameters {a1 , a2 , b1 , b2 } are unknown in these equations. We need at least 4 equations for these four unknown parameters. Because of the equation a1 , a ˆ2 , ˆb1 , ˆb2 } will not be the exactly error {e3 , e4 , · · · , eN , eN +1 }, the solution {ˆ same as {a1 , a2 , b1 , b2 }, i.e., there exists estimation error. We have to find an estimation method that can minimize the error. One of the estimation methods is known as least squares estimation as stated below. Writing the algebraic equations in a vector-matrix form, we obtain ⎞ ⎛ ⎞ ⎞ ⎛ ⎛ ⎛ ⎞ −y2 y3 e3 −y1 u2 u1 ⎟ ⎜ ⎟ a ⎟ ⎜ ⎜ ⎟ ⎜ ⎟⎜ 1⎟ ⎜ ⎟ ⎜ ⎟ ⎜ −y3 ⎟ ⎜ ⎟ ⎜ e4 ⎟ ⎜ y4 −y2 u3 u2 ⎟ ⎜ ⎟⎜a ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟⎜ 2⎟ ⎜ ⎟ ⎜ + ⎟ ⎜··· ⎟ = ⎜··· ⎜ ··· ··· ··· ⎟ ⎜··· ⎟ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ b1 ⎟ ⎜ ⎟ ⎜ ⎜ yN −1 ⎟ ⎜ −yN −2 −yN −3 uN −2 uN −3 ⎟ ⎝ ⎠ ⎜ eN −1 ⎟ ⎠ ⎝ ⎠ b2 ⎠ ⎝ ⎝ yN −yN −1 −yN −2 uN −1 uN −2 eN This can be written as a compact form: Y = Xθ + The well-known least squares solution is then given by θˆ = (X T X)−1 X T Y where θˆ is the estimate of θ and defined as ⎛ ⎞ a ˆ ⎜ 1⎟ ⎜ ⎟ ˆ2 ⎟ ⎜a ⎟ θˆ = ⎜ ⎜ˆ ⎟ ⎜ b1 ⎟ ⎝ ⎠ ˆb2
2.4 Persistent Excitation in Input Signal Up to this stage, we have only discussed estimation of the unknown parameters using the least squares method. We understand that noise et can affect the quality of the estimation. However, in addition to the noise, the input signal, ut , can also affect the accuracy of the estimate. Take a second-order ARX model as an example again: yt = −a1 yt−1 − a2 yt−2 + b1 ut−1 + b2 ut−2 + et Suppose that there is no noise, i.e., et = 0. We therefore expect an exact solution of the 4 unknown parameters by using only 4 equations. However, if the input,
14
2 System Identification: Conventional Approach
ut , has a correlation with ut−1 , say, ut = cut−1 , i.e., u2 = cu1 , u3 = c2 u1 , u4 = c3 u1 , u5 = c4 u1 , then beginning from t = 3 y3 = −a1 y2 − a2 y1 + b1 cu1 + b2 u1 y4 = −a1 y3 − a2 y2 + b1 c2 u1 + b2 cu1 y5 = −a1 y4 − a2 y3 + b1 c3 u1 + b2 c2 u1 y6 = −a1 y5 − a2 y4 + b1 c4 u1 + b2 c3 u1 .. . i.e.,
⎛
y3
⎞
⎛
−y2
⎜ ⎟ ⎜ ⎜ ⎟ ⎜ ⎜ y4 ⎟ ⎜ −y3 ⎜ ⎟ ⎜ ⎜ ⎟ ⎜ ⎜··· ⎟ = ⎜··· ⎜ ⎟ ⎜ ⎜ ⎟ ⎜ ⎜ yN −1 ⎟ ⎜ −yN −2 ⎝ ⎠ ⎝ yN −yN −1
−y1
cu1
−y2 ··· −yN −3 −yN −2
u1
⎞
⎛ ⎞ ⎟ a 1 ⎟ ⎟ ⎟⎜ c2 u 1 cu1 ⎟ ⎟⎜ ⎜ a 2 ⎟⎜ ⎟ = Xθ ⎟⎜ ⎟ ··· ··· ⎟⎜b ⎟ 1 ⎟ ⎟ cN −3 u1 cN −4 u1 ⎟ ⎝ ⎠ ⎠ b2 cN −2 u1 cN −3 u1
It is clear that the last two columns of X matrix are linearly dependent; thus it is not possible to distinguish b1 and b2 in this experiment. Even if there is more data, X T X will not be invertible and the least squares solution does not exist. When the correlation is not perfectly linear, a strong correlation will, however, also inflate (X T X)−1 , and thus inflates the covariance of the estimation since ˆ = V ar(et )(X T X)−1 [10]. In this case, we say that the input sequence is Cov(θ) not persistent exciting. In other words, the estimation error could be infinitely large due to ill-designed input signals. A signal u(t) is said to be persistently exciting (pe) of order n if [10]: 1. the following limit exists: N 1 ut+τ uTt , N →∞ N t=1
ru (τ ) = lim and 2. the matrix
⎛
ru (0)
ru (1) · · · ru (n − 1)
⎜ ⎜ ⎜ ru (−1) ru (0) Ru (n) = ⎜ .. .. ⎜ ⎜ . . ⎝ ru (1 − n) · · ·
is positive definite.
⎞
⎟ ⎟ · · · ru (n − 2) ⎟ ⎟ .. .. ⎟ ⎟ . . ⎠ ···
ru (0)
In general, a necessary condition for consistent estimation of an nth-order linear system is that the input signal is persistently exciting of order 2n [10].
2.5 Model Structures
15
2.5 Model Structures A general model structure is given by [11]: yt = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et where Gp (z −1 ; θ) is the process/plant transfer function, and Gl (z −1 ; θ) is the disturbance transfer function. We assume: −1 −1 ; θ) and G−1 ; θ)Gp (z −1 ; θ) are asymptotically stable. • G−1 l (z l (z • Gp (0; θ) = 0, Gl (0; θ) = I
However, Gl (z −1 ; θ) is not restricted to be asymptotically stable. Models with unstable Gl (z −1 ; θ) can be useful for describing drift in the data. Some of the most commonly-seen model structures are discussed below. 2.5.1
Prediction Error Model (PEM)
The most general model in the class of commonly used model structures is the prediction error model, illustrated in Figure 2.1 and described by the following equation: B(z −1 ) C(z −1 ) ut + et A(z −1 )yt = −1 F (z ) D(z −1 ) where A(z −1 ) = 1 + a1 z −1 + . . . + ana z −na B(z −1 ) = b1 z −1 + . . . + bnb z −nb C(z −1 ) = 1 + c1 z −1 + . . . + cnc z −nc D(z −1 ) = 1 + d1 z −1 + . . . + dnd z −nd F (z −1 ) = 1 + f1 z −1 + . . . + fnf z −nf The parameter to be estimated is θ = [a1 , . . . , ana , b1 , . . . , bnb , c1 , . . . , cnc , d1 , . . . , dnd , f1 , . . . , fnf ]T 2.5.2
AutoRegressive with Exogenous Input Model (ARX)
By letting C(z −1 ) = D(z −1 ) = F (z −1 ) = 1 in the PEM model, we get an ARX model that has been discussed in the previous section as A(z −1 )yt = B(z −1 )ut + et Alternatively, as illustrated in Figure 2.2, an ARX model can be written as yt =
B(z −1 ) 1 ut + et A(z −1 ) A(z −1 )
Thus, an ARX model is very restrictive comparing to the prediction error model. However, a high order ARX model may be used to approximate most of other model structures [10]. Since an ARX can be estimated easily, this makes an ARX structure attractive as an initial solution to model identification in practice.
16
2 System Identification: Conventional Approach
et
C ( z −1 ) A( z −1 ) D( z −1 )
ut
yt
B ( z −1 ) A( z −1 ) F ( z −1 )
Fig. 2.1. PEM model structure
et
1 A( z −1 )
ut B ( z −1 ) A( z −1 )
yt
Fig. 2.2. ARX model structure
2.5.3
AutoRegressive Moving Average with Exogenous Input Model (ARMAX)
The ARX model is very restrictive. A slight relaxation in the numerator of the disturbance model yields an Autoregressive Moving Average with eXogenous input (ARMAX) model, shown in Figure 2.3, and is given by the following equation A(z −1 )yt = B(z −1 )ut + C(z −1 )et The transfer function form of the ARMAX model is yt =
B(z −1 ) C(z −1 ) u et + t A(z −1 ) A(z −1 )
Thus, an ARMAX model assumes that the process and disturbance models have a common denominator. An ARMAX model is useful in the design of Kalman filter or generalized predictive control (GPC).
2.5 Model Structures
17
et
C ( z −1 ) A( z −1 )
ut
yt
B ( z −1 ) A( z −1 )
Fig. 2.3. ARMAX model structure
2.5.4
Box-Jenkins Model (BJ)
The denominators of both process and disturbance models in ARMAX structure are the same. Relaxing this restrictiveness in the ARMAX model yields the BoxJenkins model, shown in Figure 2.4, and is given in the following equation: yt =
B(z −1 ) C(z −1 ) u et + t F (z −1 ) D(z −1 )
which is a fairly general model in practice where disturbance can have a completely different model from the process.
et
C ( z −1 ) D( z −1 )
ut
yt
B ( z −1 ) F ( z −1 )
Fig. 2.4. BJ model structure
2.5.5
Output Error Model (OE)
If the estimation of the disturbance model is not of interest, a special case of PEM is the OE model, illustrated in Figure 2.5, and given by yt =
B(z −1 ) u t + et F (z −1 )
18
2 System Identification: Conventional Approach
Although the OE model uses et (white noise) as its disturbance, the OE model can be used for other disturbances as well. Because there are fewer parameters to be estimated in the OE model, it is often a good option of model structures in practice.
et ut B ( z −1 ) F ( z −1 )
yt
Fig. 2.5. OE model structure
2.5.6
MISO (Multi-input and Single-output) Prediction Error Model
An l-input and m-output multivariate model may be decomposed into m multiinput and single-output (MISO) models as A(z −1 )yi,t =
2.5.7
B1 (z −1 ) Bl (z −1 ) u ul,t + . . . + 1,t F1 (z −1 ) Fl (z −1 ) C(z −1 ) et i = 1, . . . , m + D(z −1 )
State Space Model
A state space model is typically identified through an innovation or Kalman predictor form: xt+1 = A(θ)xt + B(θ)ut + K(θ)et yt = C(θ)xt + et where A, B, C are system matrices, D matrix is commonly omitted owing to zero-order-hold in the sampling (thus one sample delay results), et is called the innovation sequence, and K is the Kalman predictor gain. This state space model can also be transferred to the general form of the model yt = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et where Gp (z −1 ; θ) = C(θ)[zI − A(θ)]−1 B(θ) Gl (z −1 ; θ) = I + C(θ)[zI − A(θ)]−1 K(θ) Thus state space models and transfer function models (or difference equation models) are transferable to each other.
2.6 Prediction Error Method
19
2.6 Prediction Error Method 2.6.1
Motivation
The ARX model has the following form: yt = −a1 yt−1 − a2 yt−2 − · · · − ana yt−na + +b1 ut−1 + b2 ut−2 + · · · + bnb ut−nb + et Its estimation is given by θˆ = (X T X)−1 X T Y Let the real model be yt = −a10 yt−1 − a20 yt−2 − · · · − ana0 yt−na + +b10 ut−1 + b20 ut−2 + · · · + bnb0 ut−nb + et Then estimation of the ARX model by applying least squares method is consisN →∞ tent, i.e., θˆ → θ0 . This is verified below. Write ⎛ ⎞ −yt−1 ⎜. ⎟ ⎜. ⎟ ⎜. ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ −yt−na ⎟ ⎜ ⎟ φt = ⎜ ⎟ ⎜ ut−1 ⎟ ⎜ ⎟ ⎜. ⎟ ⎜ .. ⎟ ⎝ ⎠ ut−nb Then the ARX model becomes
yt = φTt θ + et and the real system model is yt = φTt θ0 + et
(2.14)
Furthermore, in the least squares estimation, θˆ = (X T X)−1 X T Y , we have ⎛
φT1
⎞
⎟ ⎜ ⎜ T ⎟ ⎜ φ2 ⎟ ⎟ X=⎜ ⎜ .. ⎟ ⎜. ⎟ ⎠ ⎝ φTN
20
2 System Identification: Conventional Approach
Thus N N φt φTt ]−1 [ φt yt ] θˆ = [ t=1
t=1
N N 1 1 φt φTt ]−1 [ φt yt ] =[ N t=1 N t=1
Replacing yt by Equation 2.14 gives N N 1 1 θˆ = [ φt φTt ]−1 [ φt (φTt θ0 + et )] N t=1 N t=1
= θ0 + [
N N 1 1 φt φTt ]−1 [ φt et ] N t=1 N t=1
N provided [ N1 t=1 φt φTt ]−1 exists, i.e., the input is persistent exciting of sufficient order. As N → ∞, N 1 φt φTt → E[φt φTt ] = Σφ N t=1 N 1 φt et → E[φt et ] N t=1
Thus
θˆ = θ0 + Σφ−1 E[φt et ]
Now whether θˆ is consistent depends on whether E[φt et ] = 0. Since ⎛
⎛ ⎞ ⎞ −yt−1 −E[yt−1 et ] ⎜. ⎜. ⎟ ⎟ ⎜. ⎜. ⎟ ⎟ ⎜. ⎜. ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ ⎜ ⎜ ⎟ ⎟ ⎜ −yt−na ⎟ ⎜ −E[yt−na et ] ⎟ ⎜ ⎜ ⎟ ⎟=0 E[φt et ] = E ⎜ ⎟ et = ⎜ ⎟ ⎜ ut−1 ⎟ ⎜ E[ut−1 et ] ⎟ ⎜ ⎜ ⎟ ⎟ ⎜. ⎜. ⎟ ⎟ ⎜ .. ⎜ .. ⎟ ⎟ ⎝ ⎝ ⎠ ⎠ E[ut−nb et ] ut−nb i.e., white noise is independent of past inputs and outputs, θˆ by least squares is a consistent estimate of θ0 . On the other hand, if et in the ARX model is replaced by vt that is not white noise, then the consistency will be no longer guaranteed. That is to say, all model structures other than ARX may not have the property of the consistency if the (ordinary) least squares method is applied. This procedure of verifying the property of consistency of the ARX model by
2.6 Prediction Error Method
21
applying least squares does give us a hint on how to look for consistency of other model structures. Re-write the ARX model as yt = −a1 yt−1 − a2 yt−2 − · · · − ana yt−na + +b1 ut−1 + b2 ut−2 + · · · + bnb ut−nb + et
= yˆ(t|t − 1) + et where yˆ(t|t − 1) = −a1 yt−1 − · · · − ana yt−na + b1 ut−1 + · · · + bnb ut−nb is one-step ahead prediction of yt at time t−1, based on all information available at time t − 1, namely yt−1 , . . . , yt−na and ut−1 , . . . , ut−nb . The prediction error is yt − yˆ(t|t − 1) = et which is white noise. For other models, can we look for a one-step predictor so that the prediction error is white noise? Does the white-noise prediction error imply certain consistency of identification methods based on the corresponding predictor? The question and answer which we look for are whether such a predictor exists and whether the consistency conclusion can be extended to other model structures. 2.6.2
Optimal Prediction
Prediction error is defined by ε(t, θ) = yt − yˆ(t|t − 1) where yˆ(t|t − 1) now denotes a prediction of yt given all data up to and including time t − 1 (i.e., yt−1 , ut−1 , yt−2 , ut−2 , . . .). Consider the general model structure yt = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et with the assumption that Gp (0; θ) = 0. A general linear one-step ahead predictor is described as [10]: yˆ(t|t − 1) = L1 (z −1 ; θ)yt + L2 (z −1 ; θ)ut
(2.15)
which is a function of past data if the filters L1 (z −1 ; θ) and L2 (z −1 ; θ) are constrained by L1 (0; θ) = 0
(2.16)
L2 (0; θ) = 0
(2.17)
22
2 System Identification: Conventional Approach
Thus, the prediction error can be further written as ε(t, θ) = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et − L1 (z −1 ; θ)yt − L2 (z −1 ; θ)ut = Gp (z −1 ; θ)ut + (Gl (z −1 ; θ) − I)et + et − L1 (z −1 ; θ)yt − L2 (z −1 ; θ)ut = (Gp (z −1 ; θ) − L2 (z −1 ; θ))ut + (Gl (z −1 ; θ) − I − L1 (z −1 ; θ))yt + et
= Ψu (z −1 ; θ)ut + Ψy (z −1 ; θ)yt + et Given the conditions Gp (0; θ) = 0, Gl (0; θ) = I, L1 (0; θ) = 0, and L2 (0; θ) = 0, it can be verified that Ψu (0; θ) = 0 Ψy (0; θ) = 0 Namely, both Ψu (z −1 ; θ) and Ψy (z −1 ; θ) have at least one sample time delay. Thus, by expanding transfer functions into impulse response functions, we have Ψu (z −1 ; θ)ut = ψu1 ut−1 + ψu2 ut−2 + . . . Ψy (z −1 ; θ)yt = ψy1 yt−1 + ψy2 yt−2 + . . . Being a future white-noise disturbance relative to Ψu (z −1 ; θ)ut and Ψy (z −1 ; θ)yt , et is independent of both Ψu (z −1 ; θ)ut and Ψy (z −1 ; θ)yt . As a result Cov(ε(t, θ)) = Cov[Ψu (z −1 ; θ)ut + Ψy (z −1 ; θ)yt ] + Cov[et ] Cov(et ) or as a norm expression trace[Cov(ε(t, θ))] ≥ trace[Cov(et )] Therefore, the minimum is given by Cov(et ), i.e., the covariance of white noise et , which is Σe . Consequently, an optimal one-step predictor should give this lower bound as its prediction error. Let’s start from a simple example to demonstrate how the optimal prediction can be derived. Consider the following ARMAX model: yt =
bz −1 1 + cz −1 u + et t 1 + az −1 1 + az −1
The white noise term et can be derived from this equation as et =
1 + az −1 bz −1 (yt − ut ) −1 1 + cz 1 + az −1
(2.18)
The following derivation yields optimal one-step prediction: bz −1 1 + cz −1 ut + et −1 1 + az 1 + az −1 bz −1 (c − a)z −1 = u + (1 + )et t 1 + az −1 1 + az −1 bz −1 (c − a)z −1 = u + et + et t 1 + az −1 1 + az −1
yt =
(2.19)
2.6 Prediction Error Method
23
Replacing et in the second last term of Equation 2.19 by Equation 2.18 yields bz −1 (c − a)z −1 1 + az −1 bz −1 u + (y − u t ) + et t t 1 + az −1 1 + az −1 1 + cz −1 1 + az −1 bz −1 (c − a)z −1 bz −1 = ut + (yt − u t ) + et −1 −1 1 + az 1 + cz 1 + az −1 bz −1 (c − a)z −1 = u + yt + et t 1 + cz −1 1 + cz −1 Let the predictor be yt =
(c − a)z −1 bz −1 u + yt (2.20) t 1 + cz −1 1 + cz −1 which obviously satisfies Equations 2.15-2.17 as a linear one-step ahead predictor. With this prediction, the prediction error is given as yˆ(t|t − 1) =
ε(t, θ) = yt − yˆ(t|t − 1; θ) = et which is white noise; thus the one-step predictor derived in Equation 2.20 is the optimal one-step ahead predictor. For the general model yt = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et
(2.21)
the optimal prediction and the prediction error can be derived similarly as −1 −1 yˆ(t|t − 1) = G−1 ; θ)Gp (z −1 ; θ)ut + [I − G−1 ; θ)]yt l (z l (z −1 −1 −1 ε(t, θ) = Gl (z ; θ)[yt − Gp (z ; θ)ut ]
(2.22) (2.23)
Combining Equation 2.21 with Equation 2.23, it can be verified that ε(t, θ) = et for the predictor described by Equation 2.22; thus it is an optimal one-step ahead predictor. For a state space model xt+1 = A(θ)xt + B(θ)ut + wt yt = C(θ)xt + et where wt and et are mutually uncorrelated white noise sequences with zero means and covariance matrices R1 (θ) and R2 (θ), the optimal predictor is the Kalman predictor: x ˆ(t + 1|t) = A(θ)ˆ x(t|t − 1) + B(θ)ut + K(θ)[yt − C(θ)ˆ x(t|t − 1)] yˆ(t|t − 1) = C(θ)ˆ x(t|t − 1) where the gain is given by K(θ) = A(θ)P (θ)C T (θ)[C(θ)P (θ)C T (θ) + R2 (θ)]−1 and where P (θ) is the solution of the following algebraic Riccati equation: P (θ) = A(θ)P (θ)AT (θ) + R1 (θ) − K(θ)C(θ)P (θ)AT (θ) Thus, the Kalman filter (innovation) form of the state space models is commonly used in system identification.
24
2 System Identification: Conventional Approach
2.6.3
Prediction Error Method
Estimation of parameters θˆ through the prediction error method is to minimize the prediction error ε(1, θ), ε(2, θ), . . . , ε(N, θ). To define a prediction error one has to make the following choices: • choice of a model structure, • choice of a predictor, i.e., choice of L1 (z −1 ; θ) and L2 (z −1 ; θ), and • choice of a criterion, i.e., measure of the prediction error. In the stochastic framework, L1 (z −1 ; θ) and L2 (z −1 ; θ) are typically chosen to obtain an optimal predictor, as discussed above. The criterion to measure the prediction error can be chosen in many different ways. For illustration, let’s choose the trace of the sample covariance matrix of the prediction errors as the measure:
JN (θ) = trace[ΣN (θ)] = trace[
N 1 ε(t, θ)εT (t, θ)] N t=1
Let the real system be described by yt = Gp (z −1 ; θ0 )ut + Gl (z −1 ; θ0 )et
(2.24)
Substituting this in Equation 2.23 gives −1 ε(t, θ) = G−1 ; θ)[yt − Gp (z −1 ; θ)ut ] l (z −1 −1 ; θ)[Gp (z −1 ; θ0 ) − Gp (z −1 ; θ)]ut + G−1 ; θ)Gl (z −1 ; θ0 )et = G−1 l (z l (z
= Φu (z −1 ; θ, θ0 )ut + Φe (z −1 ; θ, θ0 )et
(2.25)
= Φu (z −1 ; θ, θ0 )ut + (Φe (z −1 ; θ, θ0 ) − I)et + et where −1 Φu (z −1 ; θ, θ0 ) = G−1 ; θ)[Gp (z −1 ; θ0 ) − Gp (z −1 ; θ)] l (z −1 ; θ)Gl (z −1 ; θ0 ) Φe (z −1 ; θ, θ0 ) = G−1 l (z
(2.26) (2.27)
Given Gp (0; θ) = 0 and Gl (0; θ) = I, which also implies Gp (0; θ0 ) = 0 and Gl (0; θ0 ) = I , it can be verified that Φu (0; θ, θ0 ) = 0 Φe (0; θ, θ0 ) − I = 0 Namely, both Φu (0; θ, θ0 ) and (Φe (0; θ, θ0 ) − I) have at least one sample time delay. Consequently, by assuming that ut and et are independent, Cov[ε(t, θ)] = Cov[Φu (z −1 ; θ, θ0 )ut ] + Cov[(Φe (z −1 ; θ, θ0 ) − I)et ] +Cov[et ] Cov[et ] or as a norm expression
2.7 Closed-loop Identification
25
trace[Cov(ε(t, θ))] ≥ trace[Cov(et )] The minimum is, once again, the covariance of the white noise Σe . This minimum is achieved when Φu (z −1 ; θ, θ0 ) = 0
(2.28)
Φe (z −1 ; θ, θ0 ) = I
(2.29)
Comparing these results with Equations 2.26-2.27, achieving the minimu implies Gp (z −1 ; θ) = Gp (z −1 ; θ0 ) Gl (z −1 ; θ) = Gl (z −1 ; θ0 )
(2.30) (2.31)
i.e., consistency is achieved. The procedure of the prediction error method can be summarized below: 1. Select a model structure yt = Gp (z −1 ; θ)ut + Gl (z −1 ; θ)et and determine the corresponding optimal predictor: −1 −1 ; θ)Gp (z −1 ; θ)ut + [I − G−1 ; θ)]yt yˆ(t|t − 1) = G−1 l (z l (z
2. According to the experiment data u1 , u2 , . . . , uN and y1 , y2 , . . . , yN , obtain the predictions yˆ(1|0), yˆ(2|1), . . . , yˆ(N |N − 1), all being functions of θ. The corresponding prediction errors are given by ε(1, θ), ε(2, θ), . . . , ε(N, θ). 3. Write the sample covariance of the prediction error: N 1 ε(t, θ)εT (t, θ) ΣN (θ) = N t=1
ˆ This least squares Minimize ΣN (θ) through its norm (trace), JN (θ), to find θ. problem is generally nonlinear and numerical solutions are typically necessary. As N → ∞, ΣN (θ) → Cov[ε(t, θ)]; if the above optimization procedure yields white noise as the prediction error, then the estimated parameters converge to the true parameters. Thus, the prediction error method has the property of N →∞ consistency for the general model structure, namely θˆ → θ0 .
2.7 Closed-loop Identification Open-loop identification problems have been discussed in the previous sections. What is the difference between open-loop and closed-loop identification? What
26
2 System Identification: Conventional Approach
is the challenge in closed-loop identification? From Figure 2.6 (by adding appropriate parameters), one can observe that two equations exist in the closed-loop system: yt = Gp (z −1 , θ0 )ut + Gl (z −1 , θ0 )et ut = −Gc (z −1 )yt + Gc (z −1 )rt These equations can be written as yt = Gp (z −1 , θ0 )ut + Gl (z −1 , θ0 )et
(2.32)
−1 yt = −G−1 )ut + rt c (z
(2.33)
The input and output data should satisfy both these equations. It is clear that et is the disturbance to Equation 2.32 while rt plays a similar role as a disturbance to Equation 2.33. An identification procedure is to find a model that fits the input and output data ut and yt the best; say fit the following model: yt = Gp (z −1 , θ)ut
(2.34)
Depending on the relative “size” of et and rt , disturbances to the two equations respectively, an identification result may end up with Equation 2.32 (i.e., −1 Gp (z −1 , θ) = Gp (z −1 , θ0 )), Equation 2.33 (i.e., Gp (z −1 , θ) = −G−1 )), or c (z somewhere in between. We shall discuss closed-loop identifiability under several different conditions.
et Gl
rt -
Gc
ut
Gp
yt
Fig. 2.6. Closed-loop system
2.7.1
Identifiability without External Excitations
If rt = 0, i.e., there is no external excitation, Equation 2.34 fits Equation 2.33 −1 perfectly. Any identification algorithm will naturally pick up −G−1 ) as the c (z solution; namely, an inverse of the controller transfer function is identified, and closed-loop identifiability is lost. If Gc (z −1 ) is nonlinear, such as a nonlinear control law, or there is constraint on the control actions and the constraint is active, then it is possible for an identification algorithm to pick up Equation 2.32, i.e., Gp (z −1 , θ0 ) may be identified even without external excitation rt . Since the model to be fitted by data is con−1 ) in Equation 2.33 is nonlinear, strained to a linear model structure but G−1 c (z −1 in this case the linear model Gp (z , θ) of Equation 2.34 may fit Gp (z −1 , θ0 ) of −1 Equation 2.32 better than −G−1 ) of Equation 2.33. c (z
2.7 Closed-loop Identification
2.7.2
27
Direct Closed-loop Identification
If there is at least one sample time delay in Gp (z −1 , θ) and rt is persistent excitation of sufficient order, then as N → ∞, both Gp (z −1 , θ) and Gl (z −1 , θ) can converge to their true values by applying the prediction error method directly to input and output data ut , yt . This is known as the direct method for closed-loop identification [10, 11]. The proof of this result is illustrated through a single-input and single-output system below. Let the real process and disturbance model be Gp (z −1 , θ0 ) and Gl (z −1 , θ0 ), respectively. By replacing Gp and Gl with Gp (z −1 , θ0 ) and Gl (z −1 , θ0 ) in Figure 2.6, in the case of a single-input and single-output system, the real closed-loop input response can be derived as ut =
Gc (z −1 ) −Gl (z −1 , θ0 )Gc (z −1 ) rt + et −1 −1 1 + Gc (z )Gp (z , θ0 ) 1 + Gc (z −1 )Gp (z −1 , θ0 )
(2.35)
Substituting the real system described in Equation 2.24 into the prediction error described in Equation 2.23, gives 1 [yt − Gp (z −1 ; θ)ut ] Gl (z −1 ; θ) Gl (z −1 ; θ0 ) 1 −1 −1 [G et (z ; θ ) − G (z ; θ)]u + = p 0 p t Gl (z −1 ; θ) Gl (z −1 ; θ)
ε(t, θ) =
This is so since we are applying the direct identification method that is no different from open-loop identification in the methodology. Using Equation 2.35 in the equation above, we have ε(t, θ) =
Gc (z −1 ) rt Gl (z −1 ; θ) 1 + Gc (z −1 )Gp (z −1 , θ0 ) −Gl (z −1 , θ0 )Gc (z −1 ) Gl (z −1 ; θ0 ) e et + ] + (2.36) t 1 + Gc (z −1 )Gp (z −1 , θ0 ) Gl (z −1 ; θ) 1
[Gp (z −1 ; θ0 ) − Gp (z −1 ; θ)][
Define sensitivity functions: 1
S(z −1 , θ) =
1+
Gc (z −1 )Gp (z −1 , θ) 1
S(z −1 , θ0 ) =
1+
Gc (z −1 )Gp (z −1 , θ0 )
Then Equation 2.36 can be simplified to ε(t, θ) =
1 [Gp (z −1 ; θ0 ) − Gp (z −1 ; θ)]S(z −1 , θ0 )Gc (z −1 )rt Gl (z −1 ; θ) S(z −1 , θ0 ) Gl (z −1 ; θ0 ) et + (2.37) S(z −1 , θ) Gl (z −1 ; θ)
= Φr (z −1 ; θ, θ0 )rt + Φe (z −1 ; θ, θ0 )et
(2.38)
28
2 System Identification: Conventional Approach
where [Gp (z −1 ; θ0 ) − Gp (z −1 ; θ)]S(z −1 , θ0 )Gc (z −1 ) Gl (z −1 ; θ) −1 −1 S(z , θ0 ) Gl (z ; θ0 ) Φe (z −1 ; θ, θ0 ) = S(z −1 , θ) Gl (z −1 ; θ)
Φr (z −1 ; θ, θ0 ) =
(2.39) (2.40)
Since external excitation rt is independent of white noise et , Equation 2.38 has the same form as Equation 2.25. Following the same proof procedure as the proof of Equations 2.28-2.29, the following equations are obtained when Cov[ε(t, θ)] is minimized: Φr (z −1 ; θ, θ0 ) = 0 Φe (z −1 ; θ, θ0 ) = 1 Thus it can be concluded that, under the closed-loop condition and by applying the direct identification method, when Cov[ε(t, θ)] achieves its minimum Cov[et ], the following equalities hold: Gp (z −1 ; θ) = Gp (z −1 ; θ0 ) S(z −1 , θ) = S(z −1 , θ0 ) Gl (z −1 ; θ) = Gl (z −1 ; θ0 ) i.e., consistency is achieved. 2.7.3
Indirect Closed-loop Identification
The closed-loop identification problem can also be solved through an indirect approach; namely identify the closed-loop transfer function first and then extract Gp (z −1 , θ) from the closed-loop transfer function. The procedure is illustrated below. In the case of a single-input and single-output system, the closed-loop response expression (model) according to Figure 2.6 can be written as yt =
Gc (z −1 )Gp (z −1 , θ) Gl (z −1 , θ) + r et t 1 + Gc (z −1 )Gp (z −1 , θ) 1 + Gc (z −1 )Gp (z −1 , θ)
= M (z −1 , θ)rt + N (z −1 , θ)et This is an open-loop identification problem since rt and et are uncorrelated. Thus M (z −1 , θ) and N (z −1 , θ) can be identified first and then Gp (z −1 , θ) and Gl (z −1 , θ) can be recovered from M (z −1 , θ) and N (z −1 , θ), respectively. For example, in the case of single-input and single-output system, Gp (z −1 , θ) can be calculated as 1 Gp (z −1 , θ) = G (z−1 ) c −1 ) M(z −1 ,θ) − Gc (z The limitation of this approach is that the controller Gc (z −1 ) must be known and the estimated Gp (z −1 , θ) may be of high order. Further model reduction is necessary.
2.8 Summary
2.7.4
29
Joint Input-output Closed-loop Identification
Gc (z −1 ) can also be estimated from data together with Gp (z −1 , θ) and Gl (z −1 ,θ). This approach is known as the joint input-output approach for closed-loop identification. It is illustrated below. In the case of a single-input and single-output system, the closed-loop input and output responses expression (model) can be written as yt =
Gc (z −1 )Gp (z −1 , θ) Gl (z −1 , θ) r et + t 1 + Gc (z −1 )Gp (z −1 , θ) 1 + Gc (z −1 )Gp (z −1 , θ)
= M (z −1 , θ)rt + N (z −1 , θ)et Gc (z −1 ) −Gl (z −1 , θ)Gc (z −1 ) ut = + r et t 1 + Gc (z −1 )Gp (z −1 , θ) 1 + Gc (z −1 )Gp (z −1 , θ)
= P (z −1 , θ)rt + Q(z −1 , θ)et They can be written further as a vector equation: ⎞ ⎞ ⎛ ⎛ ⎞ ⎛ N (z −1 , θ) yt M (z −1 , θ) ⎠ rt + ⎝ ⎠ et ⎝ ⎠=⎝ P (z −1 , θ) Q(z −1 , θ) ut
Since rt and et are uncorrelated, M (z −1 , θ), N (z −1 , θ), P (z −1 , θ), Q(z −1 , θ) can be estimated as an open-loop identification problem, from which Gp (z −1 , θ), Gc (z −1 ), Gl (z −1 , θ) can be extracted. For example Gp (z −1 , θ) can be calculated as M (z −1 , θ) Gp (z −1 , θ) = P (z −1 , θ)
2.8 Summary The most representative classical system identification approach, the prediction error method, has been reviewed in a tutorial form in this chapter. The challenges in closed-loop identification have been discussed. The material presented in this chapter is useful for understanding the chapters which follow, particularly those in the remainder of Part I.
3 Open-loop Subspace Identification
3.1 Introduction Conventionally, a system is modeled by a transfer function, which is a fractional representation of two polynomials with real coefficients, identified using an optimization scheme for a nonlinear least-squares fit to the data, as discussed in Chapter 2. Subspace identification methods offer an alternative identification of a model for the systems and are based on computational tools such as QRfactorization and SVD, which makes them intrinsically robust from a numerical point of view. Subspace identification methods are also non-iterative procedures (avoiding local minima and convergence problems) and may also be converted into an adaptive version of model identification [9]. Subspace identification methods are intrinsically suitable for multivariate systems identification compared to the prediction error methods. This chapter gives an overview of subspace identification methods. A more detailed presentation for specific methods can be found in references such as the books on subspace identification [38, 39], the special issues of the journals Automatica [40, 41] and Signal Processing [42], along with references therein. More algorithms have been added to the literature recently, such as [43, 35, 44, 45, 46, 47]. A variant of subspace identification methods is presented in [48, 49]. Although the principal goal of the methods described in this chapter is to identify the state space system matrices { A, B, C, D }, certain subspace matrices are first calculated as an intermediate step. A significant portion of this book, however, uses only these intermediate matrices for identification, control and monitoring purposes.
3.2 Subspace Matrices Description 3.2.1
State Space Models
There are several forms of state space representation of systems. The innovation form is the one most commonly used in system identification, given by B. Huang et al.: Dyn. Model. Predict. Ctrl. & Perform. Monitor., LNCIS 374, pp. 31–53, 2008. c Springer-Verlag London Limited 2008 springerlink.com
32
3 Open-loop Subspace Identification
xt+1 = Axt + But + Ket yt = Cxt + Dut + et
(3.1) (3.2)
where xt ∈ Rn , ut ∈ Rl , yt ∈ Rm , and et ∈ Rm is white noise (innovations) sequence with covariance Σe . Other forms of state space models have also been used in subspace identification literature, such as
where
⎛
E[⎝
wp sp
xt+1 = Axt + But + wt
(3.3)
yt = Cxt + Dut + st
(3.4)
⎞⎛ ⎠⎝
wp sp
⎞T
⎛
⎠ ]=⎝
Qw
S ws
(S ws )T Rs
⎞
⎠ δpq
where δpq is the Kronecker delta. Another variant of the state space models is represented by xt+1 = Axt + But + Fw wt yt = Cxt + Dut + Hwt + st
(3.5) (3.6)
In Equations 3.5 and 3.6, if we denote wt = Fw wt vt = Hwt + st then this straightforward transformation shows that the model represented by Equations 3.5 and 3.6 is equivalent to that represented by Equations 3.3 and 3.4. It has also been shown [50, 51] that the model represented by Equations 3.3 and 3.4 is also transferrable to that represented by Equations 3.1 and 3.2. This is elaborated below. Equations 3.3 and 3.4 consist of two additive subsystems, one deterministic subsystem driven by deterministic input ut and the other stochastic subsystem driven by noise wt and st . Extracting the stochastic subsystem, we get xst+1 = Axst + wt yts
=
Cxst
+ st
Define Λ = E[(yts )(yts )T ] Σ s = E[(xst )(xst )T ]
(3.7) (3.8)
3.2 Subspace Matrices Description
33
It has been shown [50, 51] that the model represented by Equations 3.3 and 3.4 can be transferred to the innovation form represented by Equations 3.1 and 3.2 with E[et eTt ] = Σe = Λ − CΣ s C T K = (Gs − AΣ s C T )(Λ − CΣ s C T )−1 where Σ s = AΣ s AT + (Gs − AΣ s C T )(Λ − CΣ s C T )−1 (Gs − AΣ s C T )T Gs = AΣ s C T + S ws Λ = CΣ s C T + Rs Owing to the equivalence of different representations of the state space model, following the common practice, we shall focus on the innovation form represented by Equations 3.1 and 3.2 in the sequel unless specified otherwise. 3.2.2
Notations and Subspace Equations
From Equation 3.1, we can derive, for t = N , xN +1 = AxN + BuN + KeN
(3.9)
xN +2 = AxN +1 + BuN +1 + KeN +1
(3.10)
For t = N + 1, Substituting Equation 3.9 in Equation 3.10 yields xN +2 = A(AxN + BuN + KeN ) + BuN +1 + KeN +1 u e N N = A2 xN + AB B + AK K uN +1 eN +1 Continue this procedure for t = N +2, t = N +3, until t = 2N −1. For t = 2N −2 ⎞ ⎛ uN ⎟ ⎜ ⎜ uN +1 ⎟ N −1 N −2 N −3 ⎟ ⎜ x2N −1 = A xN + A BA B · · · B ⎜ .. ⎟ ⎠ ⎝. u2N −2 ⎞ ⎛ eN ⎟ ⎜ ⎜ eN +1 ⎟ N −2 N −3 ⎟ ⎜ (3.11) + A KA K · · · K ⎜ .. ⎟ ⎠ ⎝. e2N −2
For t = 2N − 1
34
3 Open-loop Subspace Identification
x2N
⎛
uN ⎜ ⎜ uN +1 = AN xN + AN −1 B AN −2 B · · · B ⎜ ⎜ .. ⎝. ⎛
u2N −1 ⎞
eN ⎟ ⎜ ⎜ eN +1 ⎟ N −1 N −2 ⎟ ⎜ + A KA K · · · K ⎜ .. ⎟ ⎠ ⎝. e2N −1
⎞ ⎟ ⎟ ⎟ ⎟ ⎠ (3.12)
From Equations 3.2, we can derive, for t = N , yN = CxN + DuN + eN
(3.13)
yN +1 = CxN +1 + DuN +1 + eN +1
(3.14)
For t = N + 1,
Substituting Equation 3.9 in Equation 3.14 yields yN +1 = C(AxN + BuN + KeN ) + DuN +1 + eN +1 uN eN = CAxN + CB D + CK I uN +1 eN +1 Similarly for t = N + 2, using Equations 3.2 and 3.10, we can derive
yN +2
⎞ ⎛ ⎞ uN eN ⎟ ⎜ ⎟ ⎜ = CA2 xN + CAB CB D ⎝ uN +1 ⎠ + CAK CK I ⎝ eN +1 ⎠ uN +2 eN +2
⎛
Continue this procedure until t = 2N − 1. For t = 2N − 1, using Equations 3.2 and 3.11, we can derive
y2N −1
⎛
uN ⎜ ⎜ u N +1 = CAN −1 xN + CAN −2 B CAN −3 B · · · D ⎜ ⎜ .. ⎝.
u2N −1
⎛
⎞
eN ⎟ ⎜ ⎜ eN +1 ⎟ ⎟ N −2 N −3 ⎜ + CA K CA K · · · I ⎜ .. ⎟ ⎠ ⎝. e2N −1
⎞ ⎟ ⎟ ⎟ ⎟ ⎠
3.2 Subspace Matrices Description
35
Assembling the results, for t = N, N + 1, . . . , 2N − 1, yields a matrix equation: ⎛ ⎞ ⎛ ⎞ C yN ⎜ ⎟ ⎜ ⎟ ⎜ yN +1 ⎟ ⎜ CA ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ yN +2 ⎟ = ⎜ CA2 ⎟ xN + ⎜ ⎟ ⎜ ⎟ ⎝ ··· ⎠ ⎝ ··· ⎠ y2N −1 CAN −1 ⎛ ⎞ ⎞⎛ uN D 0 0 ··· 0 ⎜ ⎟ ⎟⎜ ⎜ CB D 0 · · · 0 ⎟ ⎜ uN +1 ⎟ ⎜ ⎟ ⎟⎜ + ⎜ CAB CB D · · · 0 ⎟ ⎜ uN +2 ⎟ ⎜ ⎟ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ⎠ N −2 N −3 N −4 CA B CA B CA B ··· D u2N −1 ⎛ ⎞ ⎞⎛ I 0 0 ··· 0 eN ⎜ ⎟ ⎟⎜ ⎜ CK I 0 · · · 0 ⎟ ⎜ eN +1 ⎟ ⎜ ⎟ ⎟⎜ + ⎜ CAK (3.15) CK I · · · 0 ⎟ ⎜ eN +2 ⎟ ⎜ ⎟ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ⎠ e2N −1 CAN −2 K CAN −3 K CAN −4 K · · · I Adding the time subscripts of the variables with 1, Equation 3.15 changes to ⎞ ⎛ ⎞ ⎛ C yN +1 ⎟ ⎜ ⎟ ⎜ ⎜ yN +2 ⎟ ⎜ CA ⎟ ⎟ ⎜ ⎟ ⎜ 2 ⎜ yN +3 ⎟ = ⎜ CA ⎟ xN +1 + ⎟ ⎜ ⎟ ⎜ ⎝ ··· ⎠ ⎝ ··· ⎠ y2N CAN −1 ⎛ ⎞ ⎞⎛ uN +1 D 0 0 ··· 0 ⎜ ⎟ ⎟⎜ ⎜ CB D 0 · · · 0 ⎟ ⎜ uN +2 ⎟ ⎜ ⎟ ⎟⎜ + ⎜ CAB CB D · · · 0 ⎟ ⎜ uN +3 ⎟ ⎜ ⎟ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ⎠ CAN −2 B CAN −3 B CAN −4 B · · · D u2N ⎛ ⎞ ⎞⎛ eN +1 I 0 0 ··· 0 ⎜ ⎟ ⎟⎜ ⎜ CK I 0 · · · 0 ⎟ ⎜ eN +2 ⎟ ⎜ ⎟ ⎟⎜ + ⎜ CAK CK I · · · 0 ⎟ ⎜ eN +3 ⎟ ⎜ ⎟ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ⎠ CAN −2 K CAN −3 K CAN −4 K · · · I e2N
Continue this procedure by adding the time subscripts of the variables with 2, 3, until j − 1. Assembling the resultant j matrix equations, column by column, gives
36
3 Open-loop Subspace Identification
⎞ ⎛ ⎞ C · · · yN +j−1 ⎜ yN +1 · · · yN +j ⎟ ⎜ CA ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ yN +2 ⎟ = ⎜ CA2 ⎟ x x · · · y N +j+1 · · · x N N +1 N +j−1 ⎜ ⎟ ⎜ ⎟ ⎝ ··· ··· ··· ⎠ ⎝ ··· ⎠ · · · y2N +j−2 y2N −1 CAN −1 ⎞ ⎛ ⎞⎛ uN uN +1 · · · uN +j−1 D 0 0 ··· 0 ⎟ ⎜ ⎟⎜ ⎜ CB D 0 · · · 0 ⎟ ⎜ uN +1 uN +2 · · · uN +j ⎟ ⎟ ⎜ ⎟⎜ + ⎜ CAB CB D · · · 0 ⎟ ⎜ uN +2 uN +3 · · · uN +j+1 ⎟ ⎟ ⎜ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ··· ··· ··· ⎠ CAN −2 B CAN −3 B CAN −4 B · · · D u2N −1 u2N · · · u2N +j−2 ⎛ ⎞ ⎞⎛ I 0 0 ··· 0 eN eN +1 · · · eN +j−1 ⎜ ⎟ ⎟⎜ ⎜ CK I 0 · · · 0 ⎟ ⎜ eN +1 eN +2 · · · eN +j ⎟ ⎜ ⎟ ⎟⎜ + ⎜ CAK CK I · · · 0 ⎟ ⎜ eN +2 eN +3 · · · eN +j+1 ⎟ ⎜ ⎟ ⎟⎜ ⎝ ··· ··· ··· ··· ···⎠⎝ ··· ··· ··· ··· ⎠ CAN −2 K CAN −3 K CAN −4 K · · · I e2N −1 e2N · · · e2N +j−2 (3.16) ⎛
yN
yN +1 yN +2 yN +3 ··· y2N
This completes the derivation of the subspace matrix equation for the output expressions. Next, we shall consider subspace state equation. Adding the time subscripts of the variables in Equation 3.12 with −N gets ⎞ ⎛ u0 ⎟ ⎜ u1 ⎟ ⎜ xN = AN x0 + AN −1 B AN −2 B · · · B ⎜ .. ⎟ ⎠ ⎝. uN −1 ⎞ ⎛ e0 ⎟ ⎜ e1 ⎟ ⎜ (3.17) + AN −1 K AN −2 K · · · K ⎜ .. ⎟ ⎠ ⎝. eN −1
Continuing adding time subscripts of the variables in Equation 3.12 with −(N − 1), −(N − 2), · · · , until −(N − j + 1). By adding time subscripts of the variables in Equation 3.12 with −(N − j + 1), we get ⎞ ⎛ uj−1 ⎟ ⎜ ⎟ ⎜ uj N ⎟ ⎜ N −1 N −2 xN +j−1 = A xj−1 + A BA B · · · B ⎜ .. ⎟ ⎠ ⎝. ⎛
uN +j−2 ⎞
ej−1 ⎟ ⎜ ⎟ ⎜ ej ⎟ N −1 N −2 ⎜ + A KA K · · · K ⎜ .. ⎟ ⎠ ⎝. eN +j−2
(3.18)
3.2 Subspace Matrices Description
By assembling all the state equations just derived, we get N xN xN +1 · · · xN +j−1 = A x0 x1 · · · xj−1 ⎞ ⎛ u0 u1 · · · uj−1 ⎟ ⎜ ⎜ u u2 · · · uj ⎟ + AN −1 B AN −2 B · · · B ⎜ 1 ⎟ ⎝ ··· ··· ··· ··· ⎠ uN −1 uN · · · uN +j−2 ⎞ ⎛ e0 e1 · · · ej−1 ⎟ ⎜ ⎜ e e2 · · · ej ⎟ + AN −1 K AN −2 K · · · K ⎜ 1 ⎟ ⎝ ··· ··· ··· ··· ⎠ eN −1 eN · · · eN +j−2
37
(3.19)
The above derivations lead to the following subspace matrix equations: d s Yf = ΓN Xf + HN U f + HN Ef d s Yp = ΓN Xp + HN Up + HN Ep
(3.20) (3.21)
Xf = AN Xp + ΔdN Up + ΔsN Ep
(3.22)
Equation 3.20 is a short-hand version of Equation 3.16, with detailed notations to be defined shortly. Equation 3.21 can be derived in a same way as that of Equation 3.16. Equation 3.22 is a short-hand version of Equation 3.19. Equations 3.20-3.22 are fundamental subspace equations in subspace literature, where subscript p stands for the “past” and f for the “future”. The past and future input block-Hankel matrices are defined as ⎞ ⎛ u0 u1 · · · uj−1 ⎟ ⎜ ⎜ u u2 · · · uj ⎟ (3.23) Up = U0|N −1 = ⎜ 1 ⎟ ⎝ ··· ··· ··· ··· ⎠ uN −1 uN · · · uN +j−2 ⎞ ⎛ uN uN +1 · · · uN +j−1 ⎟ ⎜ ⎜ u u · · · uN +j ⎟ Uf = UN |2N −1 = ⎜ N +1 N +2 (3.24) ⎟ ⎝ ··· ··· ··· ··· ⎠ u2N −1 u2N · · · u2N +j−2
where Up , Uf ∈ RlN ×j ; the subscript t1 |t2 indicates that the first column of the subspace matrix/vector starts from time t1 and ends by time t2 . The notation of ‘past’ and ‘future’ can be understood from the first column of the block-Hankel matrices. The data in the first column of the ‘future’ block-Hankel matrix follows data in the first column of the ‘past’ block-Hankel matrix in the time sequence. This relation holds for all columns in fact. Note that, in Equations 3.23 and 3.24, the row dimension of Uf could be different from that of Up , which would provide an extra freedom to tune the identification algorithm. For simplicity of presentation, we assume that past and future input block-Hankel matrices have the same dimension.
38
3 Open-loop Subspace Identification
The past/future output and innovation block-Hankel matrices Yp , Yf ∈ RmN×j , Ep , Ef ∈ RmN ×j , respectively, are defined conformably with Up , Uf as
Yp =
Yf =
Ep =
Ef =
⎛
⎞ y0 y1 · · · yj−1 ⎜ ⎟ yj ⎟ ⎜ y1 y2 · · · Y0|N −1 = ⎜ ⎟ ⎝ ··· ··· ··· ··· ⎠ yN −1 yN · · · yN +j−2 ⎞ ⎛ yN yN +1 · · · yN +j−1 ⎟ ⎜ ⎜ yN +1 yN +2 · · · yN +j ⎟ YN |2N −1 = ⎜ ⎟ ⎝ ··· ··· ··· ··· ⎠ y2N −1 y2N · · · y2N +j−2 ⎞ ⎛ e0 e1 · · · ej−1 ⎟ ⎜ ej ⎟ ⎜ e1 e2 · · · E0|N −1 = ⎜ ⎟ ⎝ ··· ··· ··· ··· ⎠ eN −1 eN · · · eN +j−2 ⎞ ⎛ eN eN +1 · · · eN +j−1 ⎟ ⎜ e · · · eN +j ⎟ ⎜e EN |2N −1 = ⎜ N +1 N +2 ⎟ ⎝ ··· ··· ··· ··· ⎠ e2N −1 e2N · · · e2N +j−2
(3.25)
(3.26)
(3.27)
(3.28)
where j should be chosen ‘sufficiently large’ (so that the data Hankel matrices contain sufficient ‘sample size’ to identify the system), and typically j max(mN, lN ) (‘very rectangular’ block Hankel matrices), as this reduces noise sensitivity [9]. Thus j plays a similar role as the number of observations in regression analysis. N is closely related to the ‘order’ of the system to be identified; thus N cannot be too small, but it cannot be too large, as a large N causes an ‘over-parameterization’ problem. Since the input and output are of l and m dimensions, respectively, each element in the above data Hankel matrices is a column vector of inputs and outputs, i.e., ⎛
⎞ ui1 ⎜ ⎟ . ⎟ ui = ⎜ ⎝ .. ⎠ uil
⎛
⎞ yi1 ⎜ ⎟ . ⎟ yi = ⎜ ⎝ .. ⎠ yim
The state matrices are defined as Xp = X0 = x0 x1 · · · xj−1 Xf = XN = xN xN +1 · · · xN +j−1
where Xp , Xf ∈ Rn×j .
(3.29) (3.30)
3.2 Subspace Matrices Description
39
The extended observability matrix ΓN is given as ⎞ ⎛ C ⎜ CA ⎟ ⎟ ⎜ ⎟ ⎜ ΓN = ⎜ CA2 ⎟ ⎟ ⎜ ⎝ ··· ⎠ CAN −1
where ΓN ∈ RmN ×n . The reversed extended controllability matrices ΔdN and ΔsN are given as follows: ΔdN = AN −1 B AN −2 B · · · AB B ΔsN = AN −1 K AN −2 K · · · AK K
Namely, they are the reversed extended controllability matrices of {A, B} and {A, K} respectively. d s and HN are given by The lower triangular block-Toeplitz matrices HN ⎛ ⎞ D 0 0 ··· 0 ⎜ ⎟ ⎜ CB D 0 ··· 0 ⎟ ⎜ ⎟ d (3.31) HN = ⎜ CAB CB D ··· 0 ⎟ ⎜ ⎟ ⎝ ··· ··· ··· ··· ···⎠ CAN −2 B CAN −3 B CAN −4 B · · · D ⎛ ⎞ I 0 0 ··· 0 ⎜ ⎟ ⎜ CK I 0 ··· 0 ⎟ ⎜ ⎟ s (3.32) HN = ⎜ CAK CK I ··· 0 ⎟ ⎜ ⎟ ⎝ ··· ··· ··· ··· ···⎠ CAN −2 K CAN −3 K CAN −4 K · · · I d s where HN ∈ RmN ×lN , HN ∈ RmN ×mN . By substituting equation (3.21) in equation (3.22) we can write ⎡
Yp
⎤
⎥ ⎢ ⎥ ⎢ d s Xf = AN ΓN† ΔdN − AN ΓN† HN ΔsN − AN ΓN† HN ⎢ Up ⎥ ⎦ ⎣ Ep
(3.33)
where † represents the Moore-Penrose pseudo-inverse. In subspace identification literature, the following short-hand notation is often used: ⎛ ⎞ Yp Wp = ⎝ ⎠ Up where Wp ∈ R(mN +lN )×j .
40
3 Open-loop Subspace Identification
3.3 Open-loop Subspace Identification Methods In principle, subspace identification methods can be described as the estimation of a covariance model from observed data followed by stochastic realization [52]. In the open-loop subspace state space identification methods, the sequence of the future states, Xf , and the extended observability matrix, ΓN , are estimated using Equations 3.20-3.22, and the estimation requires thatthe pair {A, C} is completely observable. Furthermore, the pair {A, B KΣe1/2 } is required to be
controllable. All modes must also be sufficiently excited (persistent excitation). Note that even though the deterministic subsystem can have unstable modes, the excitation ut has to be chosen in such a way that the deterministic state and output are bounded at all times. Additionally, the deterministic and stochastic subsystem may have common or completely decoupled input-output dynamics. If the pair {A, C} is observable, then the rank of ΓN equals the state order n. Subspace identification involves estimating a basis for the states of the system from the data Hankel matrices. It must be remembered that the states identified using these techniques do not have any physical meaning. The different subspace identification techniques available in the literature also differ in the manner in which the basis of the state space is estimated. The choices for a basis differ in a transformational matrix T that transforms a model {A, B, C, D } into an equivalent model {T −1 AT, T −1B, CT, D } [53]. The numerical tools used in the estimation of this basis range from SVD (used in [9] and N4SID [38]), QR-decomposition (used in MOESP [54, 55, 56]), canonical variables analysis (used in CVA [57, 58, 59, 60, 61, 62, 63]), etc. Some subspace identification methods also differ on how the disturbances are characterized. In the following sections, we will give an overview of several commonly-seen subspace identification methods.
3.4 Regression Analysis Approach The following derivations [64] are based on Equations 3.1 and 3.2 by shifting the time subscripts: xt = Axt−1 + But−1 + Ket−1
(3.34)
yt−1 = Cxt−1 + Dut−1 + et−1
(3.35)
Combining Equations 3.35 and 3.34 yields xt = (A − KC)xt−1 + Kyt−1 + (B − KD)ut−1
(3.36)
Recursively substituting Equation 3.36 results in xt = (A − KC)N xt−N + (A − KC)N −1 (Kyt−N + (B − KD)ut−N ) + + . . . + (A − KC)(Kyt−2 + (B − KD)ut−2 ) + +Kyt−1 + (B − KD)ut−1
(3.37)
Collecting equations corresponding to t = N, N + 1, . . . , N + j − 1 according to Equation 3.37, the following form of a subspace matrix equation can be obtained:
3.4 Regression Analysis Approach
Xf = Φy Yp + Φu Up + Φx Xp
41
(3.38)
where we note that Φx = (A − KC)N Since K is the Kalman filter gain, Φx represents error dynamics of the Kalman filter; thus as N → ∞, Φx → 0 owing to the stability of a Kalman filter. As a result, for large N , Equation 3.38 converges to Xf = Φy Yp + Φu Up
(3.39)
This result will also be reiterated in the CVA method (to be discussed) for subspace identification. Using Wp , Equation 3.39 can be written as (3.40) Xf = Φy Φu Wp = Lp Wp Substituting Equation 3.40 in 3.20, one can write
Yf = Lw Wp + Lu Uf + Le Ef
(3.41)
where • Lw : subspace matrix corresponding to the past inputs and outputs, • Lu : subspace matrix corresponding to the deterministic inputs, and Lu = d , HN s • Le : subspace matrix corresponding to the stochastic inputs, and Le = HN . In view of Equations 3.20 and 3.41, we can see that Lw = ΓN Lp
(3.42)
Subspace matrices can be identified from data Hankel matrices using regression techniques. Several regression techniques, including the use of weighting matrices, have been suggested in the literature for numerical advantages and special cases of data collection. The simplest regression method is the least squares method. The conditions of the process data for using regressions techniques, as evident from Equation 3.41, are as follows: • The deterministic input ut is uncorrelated with et . • ut is persistently exciting. • The number of measurements is sufficiently large. With the above conditions satisfied, the open-loop identification of the subspace matrices using the least squares solution involves finding the prediction of the future outputs Yf using a linear predictor: Yˆf = Lw Wp + Lu Uf
(3.43)
where the parameters to be determined contain system model information. An alternative derivation of Equation 3.43 will be given in Chapter 10 as well.
42
3 Open-loop Subspace Identification
The prediction Yˆf can be found by solving a least squares problem: Wp min ||Yf − Lw Lu ||2F Lw ,Lu Uf where the subscript F stands for Frobenius norm. The solution is given by [38]:
Lw Lu
= Yf
Wp Uf
†
= Yf
WpT
UfT
Wp ( Uf
−1 Wp Uf )
(3.44)
This algorithm1 can also be implemented in a numerically robust way with a QR-decomposition [65, 66, 38, 48, 54] or using PLS [49]. Following is one of the algorithms: Perform QR decomposition: ⎛ ⎞ ⎛ ⎞⎛ ⎞ Wp Q1 R11 0 0 ⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎝ Uf ⎠ = ⎝ R21 R22 0 ⎠ ⎝ Q2 ⎠ Yf Q3 R31 R32 R33 Let
L= Then L can be solved as L=
Lw Lu
R31 R32
R11 0 R21 R22
†
The subspace matrices Lw , Lu and Le will be useful for predictive control and monitoring applications in several chapters of this book. For conventional subspace identification, however, the system matrices A, B, C, D need to be retrieved from results of the regression approach. To retrieve system matrices, the following SVD needs to be performed first: V1T Σ1 L w = U1 U2 (3.45) Σ2 V2T Theoretically, for a finite dimension of state space models, there exists Σ2 = 0 and the dimension of Σ1 determines the dimension of the state space matrix A. In practice, however, Σ2 does not need to be zero owing to the noise in the data. In most subspace literature, it is suggested that the dimension of Σ2 be 1
Strictly speaking, Lw Lu in Equation 3.44 is an estimate and should have been written with a notation different from the original. However, following the common practice in subspace literature without causing confusion, we do not make notational difference for this as well as other subspace matrices between the estimate and the original.
3.5 Projection Approach and N4SID
43
determined through the inspection of the number of significant singular values. A statistical procedure will be presented shortly through Equation 3.77. According to [51] 1/2
ΓN = U1 Σ1 ˆ f = Σ 1/2 V T Wp X 1 1
(3.46) (3.47)
ˆ f consists of the predicted state sequence as a function of past input where X and output data. Equation 3.46 can be understood from Equation 3.42, where we see that the column space of ΓN is the same as that of Lw . Thus the matrix 1/2 U1 Σ1 , spanned by column space of Lw , is an estimate of ΓN . Equation 3.47 can be explained from Equations 3.40, 3.42, 3.45, and 3.46. With the state sequence available, the system matrices may be extracted using several methods. It can be shown that [67, 51] XN +1 = A B XN + K EN |N (3.48) YN |N CD UN |N I where the convention of the subscripts used in Equation 3.48 follows that defined in Equations 3.23-3.30; XN and XN +1 can be constructed by the predicted ˆ f . Another simple linear regression analysis of Equation 3.48 state sequences X will yield an estimation of system matrices A, B, C, D, K. Computational details can be found in [67, 51]. However, because of the non-uniqueness owing to the similarity transformation of state space models, care has to be taken to ensure both XN +1 and XN are extracted from the same basis. This issue has been discussed in [67, 51] and will be further discussed when deriving Equation 4.51. To avoid this basis selection problem, alternatively one may adopt the CVA procedure [57, 58, 59, 60, 61, 62, 68] which will be addressed via Equation 3.76.
3.5 Projection Approach and N4SID 3.5.1
Projections
The following two projections are important for deriving the projection approach. Orthogonal projection: The orthogonal projection of the row space of A on to the row space of B is denoted by A/B and can be calculated through A/B = AB † B where B † is the pseudo inverse of B. In MATLAB , the pseudo inverse can be calculated by using the function pinv or B T (BB T )−1 . Oblique projection: When a vector projects on to two non-orthogonal vectors, the projection is called oblique projection. The oblique projection of the row space of A ∈ Rp×j along the row space of B ∈ Rq×j on the row space of
44
3 Open-loop Subspace Identification
C ∈ Rr×j is defined as A/B C and can be calculated via (following MATLAB matrix notation) A/B C = [A/B ⊥ ][C/B ⊥ ]† = D(:, 1 : r)C where
⎛
D = A⎝
C B
(3.49)
⎞† ⎠
where D(:, 1 : r) denotes the extraction of the matrix elements from the 1st column to the rth column, and B ⊥ is the orthogonal complement of B. The oblique projection, A/B C, can also be visualized through Figure 3.1, where vector A is projected into B and C.
B
A
A /B C
C
Fig. 3.1. Illustration of oblique projection
Two important properties of the oblique projection are often used and they are A/A C = 0
(3.50)
A/B A = A
(3.51)
These two properties are straightforward results by the definition of the oblique projection or by visualizing the corresponding projection through Figure 3.1. 3.5.2
Non-steady-state Kalman Filters
With the above definitions, a non-steady-state Kalman filter can be derived from Equations 3.21-3.22, according to the following procedure:
3.5 Projection Approach and N4SID
45
First, from Equation 3.21, one can obtain −1
s ) Ep = (HN
−1
s Yp − (HN )
−1
s ΓN Xp − (HN )
d HN Up
(3.52)
Substituting Equation 3.52 in Equation 3.22 yields s −1 Xf = (AN − ΔsN (HN ) ΓN )Xp
s −1 s −1 d + ΔsN (HN ) ΔdN − ΔsN (HN ) HN Wp
(3.53)
s −1 Xf /R Wp = (AN − ΔsN (HN ) ΓN )Xp /R Wp
−1 s s −1 d + ΔsN (HN ) ΔdN − ΔsN (HN ) HN Wp /R Wp
(3.54)
It is easy to see that Xf is the non-steady state solution of Kalman filter state with Xp as its initial value since it is derived from Equations 3.21-3.22, which are the solutions of the innovation form of the state space model, Equations 3.1 and 3.2. The innovation form of the state space equations is the Kalman filter [69, 11]. In subspace literature, one often performs an oblique projection of Equation 3.53 on to Wp through R where R is any non-zero constant matrix of appropriate dimension
Clearly, Wp /R Wp = Wp , by the property of the oblique projection, Equation 3.51. Therefore, Equation 3.54 can be simplified to −1
s ) ΓN )Xp /R Wp Xf /R Wp = (AN − ΔsN (HN
s −1 s −1 d + ΔsN (HN ) ΔdN − ΔsN (HN ) HN Wp
(3.55)
s −1 ) ΓN )Xp /Wp Xf /Wp = (AN − ΔsN (HN
−1 s s −1 d + ΔsN (HN ) ΔdN − ΔsN (HN ) HN Wp
(3.56)
ˆ f = Xf /Wp X
(3.57)
ˆ p = Xp /Wp X
(3.58)
Comparing Equation 3.55 with Equation 3.53, one can see that both serve as the solutions of the innovation state space equations 3.1 and 3.2. In Equation 3.53, Xf is the Kalman filter state with Xp as its initial condition. In ˆ f is also a Kalman filter state2 but with the initial Equation 3.55, Xf /R Wp = X ˆ condition Xp /R Wp = Xp . Following the same line, if we now make an orthogonal projection of Equation 3.53 onto Wp , we should have
and we can equally treat Xf /Wp as another Kalman filter state solution but with Xp /Wp as its initial condition, i.e.,
with the initial state as 2
Since the Kalman filter state is a function of past input and output data Wp (i.e., a ˆ f to reflect this fact. prediction), we have used the notation X
46
3 Open-loop Subspace Identification
3.5.3
Projection Approach for Subspace Identification
Let’s now revisit Equation 3.20 d s Yf = ΓN Xf + HN U f + HN Ef
(3.59)
The essential system information is contained in the extended observability matrix ΓN or in the state Xf . That is, the first term on the right hand side of Equation 3.59 deserves special attention. To calculate, for example, ΓN from Equation 3.59, one has to remove the terms containing Uf and Ef . If Ef is independent of past input Up , past output Yp (or equivalently their combination Wp ), and future input Uf , then one can easily achieve the above objective by performing an oblique projection of Equation 3.59 along the row space Uf onto the row space of Wp , i.e., d s Yf /Uf Wp = ΓN Xf /Uf Wp + HN Uf /Uf Wp + HN Ef /Uf Wp
(3.60)
It is easy to see why the last two terms of Equation 3.60 are zero: Uf /Uf Wp = 0 is by the property of the oblique projection according to Equation 3.50; Ef /Uf Wp = 0 is based on the assumption that future white-noise disturbance is independent of past inputs/outputs and future inputs. This assumption holds under the open-loop condition. Thus Equation 3.60 can be simplified to Yf /Uf Wp = ΓN Xf /Uf Wp
(3.61)
This result indicates that the column space of ΓN is the same as the column space of Yf /Uf Wp , which can be calculated by the SVD decomposition of Yf /Uf Wp . Similarly, the row space of Yf /Uf Wp is the same as the row space of Xf /Uf Wp , which is the Kalman filter state solution with Xp /Uf Wp as its initial condition, as has been discussed before. Therefore, the Kalman state sequences can also be calculated from the SVD decomposition of Yf /Uf Wp . Subsequently, the system state space matrices can be recovered either from the extended observability matrix or from the Kalman state sequences. This is a solution to the open-loop subspace system identification problem through projection. It has been shown in Equation 3.40 as well as in [51] that the state matrix Xf can be written as linear combination of Wp ; thus by performing oblique projection of Equation 3.41, we get Yf /Uf Wp = Lw Wp
(3.62)
This result has related the projection approach to the regression approach. Thus by performing SVD of Lw , as shown in Equation 3.45, ΓN and Xf can be estimated according to Equations 3.46 and 3.47. The system matrices may be estimated according to Equation 3.48. This procedure is known as N4SID (Numerical subspace state space identification). Alternatively, following the approach of MOESP (MIMO output error state space model identification) [54], system matrices may also be extracted from subspace matrices as discussed in the following.
3.5 Projection Approach and N4SID
47
It is noted that Γ¯N = Γ N A
(3.63)
where Γ¯N is ΓN by removing its first m rows, and Γ N is ΓN by removing its last m rows. Thus A = Γ † Γ¯N N
C = ΓN (1 : m, :) namely, the system matrix C can be simply extracted from the first m rows of ΓN . It can be further shown [70] that d (ΓN⊥ )T Yf Uf† = (ΓN⊥ )T HN
(3.64)
where ΓN⊥ is the orthogonal complement of ΓN with full rank. This equation can be rewritten, with term-by-term correspondence to Equation 3.64, as [51]
M1 M2 . . . MN
⎛
⎞ ··· 0 ⎜ ··· 0 ⎟ ⎟ ⎜ ⎜ ⎟ ··· 0 ⎟ = L1 L2 . . . LN ⎜ ⎜ ⎟ ⎝ ··· ··· ··· ··· ···⎠ CAN −2 B CAN −3 B CAN −4 B · · · D D CB CAB
0 D CB
0 0 D
By algebraic rearrangement, an explicit equation results [51]: ⎛
⎞ L L · · · L L 1 2 N −1 N M ⎛ ⎞⎛ ⎞ ⎜ 1⎟ ⎜ L2 L3 · · · LN 0 ⎟ ⎟ I ⎜ M2 ⎟ ⎜ 0 D N ⎟ ⎜ . ⎟=⎜ ⎠⎝ ⎠ 0 ⎟⎝ L3 L4 · · · 0 ⎜ . ⎟ ⎜ ⎟ ⎝ . ⎠ ⎜ 0 ΓN (1 : end − m, :) B ⎝··· ··· ··· ··· ··· ⎠ MN 0 LN 0 0 · · · ⎛
⎞
which is a set of linear equations for solving D and B. One may ask, at this stage, whether the calculation of the column space of Yf /Uf Wp alone is sufficient for the estimation of ΓN . The answer is that the exact value of ΓN is immaterial owing to the similarity transformation of the state space model. That is to say, the state space matrices extracted from the column space of ΓN are all equivalent up to the similarity transformation. However, there may exist an optimal solution. For an in-depth discussion, readers are referred to Overschee and De Moor(1996)[67] and Favoreel (1999)[51]. Other related references include [34, 58, 60, 65, 66, 38, 48, 49]. The above procedure can also be explained as a linear regression based on the multi-step ahead prediction error method with certain rank constraints [71].
48
3 Open-loop Subspace Identification
Rather than performing singular value decomposition on Lw , if the following weighted SVD is performed: ⎛ ⎞ ⎞⎛ Σ1 V1T ⎠ ⎠⎝ (3.65) W1 Lw W2 = U1 U2 ⎝ 0 V2T then according to [51], ΓN = W1−1 U1 Σ1 ˆ f = Σ 1/2 V T W −1 Wp X 1 2 1 1/2
(3.66) (3.67)
This is known as a unified approach to various subspace algorithms [51]. The MOESP algorithm, which will be discussed in detail in the next section, corresponds to the following weights: W1 = I and W2 = I −Uf⊥ Uf . This interpretation of subspace identification unifies various subspace identification algorithms including N4SD and CVA [51].
3.6 QR Factorization and MOESP MOESP [54] considers the following state space model:
xt+1 = Axt + But + Fw wt
(3.68)
yt = Cxt + Dut + Hwt + st
(3.69)
where the process noise wt and measurement noise st are zero-mean white noise, both independent of the input ut . The approach taken by MOESP is through QR factorization and then a procedure similar to statistical correlation analysis, by extensive use of the property of white noise. Owing to the use of QR factorization for solving the problem, the MOESP is expected to be quite robust numerically. Let the input ut be persistent excitation of sufficient order. Perform the following QR factorization: ⎛ ⎞ ⎛ ⎞⎛ ⎞ Q1 Uf L11 0 0 0 ⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ Up ⎟ ⎜ L21 L22 0 0 ⎟ ⎜ Q2 ⎟ (3.70) ⎜ ⎟=⎜ ⎟⎜ ⎟ ⎝ Yp ⎠ ⎝ L31 L32 L33 0 ⎠ ⎝ Q3 ⎠ Yf Q4 L41 L42 L43 L44 and SVD:
L42 L43
=
⎛
U1 U2 ⎝
Σ1 0 0 0
⎞⎛ ⎠⎝
V1T V2T
⎞ ⎠
(3.71)
3.7 Statistical Approach and CVA
49
Then the column space of U1 is the same as that of ΓN . Consequently, if T is a non-singular n × n matrix, then ΓN = U1 T 1/2
As an example, taking T = Σ1 , then 1
ΓN = U1 Σ12 Hence the system order is obtained using the SVD. From the estimated ΓN , system matrices A and C are derived. In addition, the following identity also holds from the QR factorization and SVD procedure: (U1⊥ )T
L31 L32 L41
L21 L22 L11
†
d = (U1⊥ )T HN
(3.72)
d can be solved from Equation 3.72. Thus HN d With the estimation of ΓN and HN , the system matrices A, B, C, D can be extracted following the same procedure as discussed in the previous section.
3.7 Statistical Approach and CVA 3.7.1
CVA Approach
CVA (canonical variate analysis) considers the same model structure as that of MOESP, shown in Equations 3.68 and 3.69. In this method canonical variables are used to provide an ordered basis of the state space, ordered in terms of the predictive ability of the states [57, 58, 59, 60, 61, 62, 63]. The canonical correlations between ‘the past, Wp ’ and ‘the future outputs, Yf , conditional on the future inputs, Uf ’ are used as the basis of the state space. If n is the true and finite state order and there is sufficient information to determine reliably the order n, then the first n canonical variables give an optimal selection of system states in terms of maximizing the likelihood function [58]. The CVA [60, 58, 63] algorithm uses Akaike information criteria (AIC) to determine the state order. Following is the detailed algorithm. Define data column vectors over a horizon of N for the output, input, and their combination, respectively, as ⎞ ⎛ y ⎞ ⎞ ⎛ ⎛ ⎟ ⎜ t−N ⎜ .. ⎟ yt ut ⎜ . ⎟ ⎟ ⎟ ⎜ ⎜ ⎟ ⎜ .. .. ⎟ ⎟ ⎜ ⎜ yt−1 ⎟ ⎟ , uf = ⎜ ⎟ , wp = ⎜ . . (3.73) yf = ⎜ ⎟ ⎜ ⎟ ⎟ ⎜ ⎜ ⎜ ut−N ⎟ ⎝ yt+N −2 ⎠ ⎝ ut+N −2 ⎠ ⎜ . ⎟ ⎜ . ⎟ yt+N −1 ut+N −1 ⎝ . ⎠ ut−1
50
3 Open-loop Subspace Identification
By eliminating the effect of future input uf from future output yf , u y˜f = yf − Σyf uf Σu−1 f uf f where y˜f denotes yf after eliminating the effect of uf , Σyf uf = Cov(yf , uf ), and Σuf uf = Cov(uf , uf ). Σyf uf can be estimated from matrices Yf and Uf , and Σuf uf can be estimated from matrix Uf . It is argued [57, 58, 59, 60, 61, 62, 68] that a fundamental property of a linear, time invariant, and strict sense controlled Markov process with the state order n is the existence of a n-dimensional state xt , which can be written as a linear function of the past input-output wp and expressed as xt = Jn wp
(3.74)
where Jn denotes n rows of a matrix J to be determined. Note that this property has also been shown in Equation 3.40. With state xt , the optimal linear prediction y˜f is given by ˆf = Σy˜f xt Σx−1x xt y˜ t
t
Thus the CVA problem boils down to the optimal solution of the state or alternatively the matrix Jn , to minimize the prediction error. The prediction error is measured by ˜f )T Λ† (˜ yf − yˆ˜f ) (3.75) E(˜ yf − yˆ where Λ† is an arbitrary quadratic weighting that can be singular. For an optimal solution, Λ = Σy f y f which, according to [60], results in a near maximum likelihood system identification procedure. The state estimation problem is the following optimization problem: min E(˜ yf − yˆ ˜f )T Λ† (˜ yf − yˆ ˜f ) xt
yf − Σy˜f xt Σx−1 x )T Λ† (˜ yf − Σy˜f xt Σx−1 x) = min E(˜ t xt t t xt t xt
The solution is given by the generalized singular value solution: JΣwp wp J T = I LΛLT = I JΣwp y˜f LT = diag(r1 , r2 , . . . , rn , 0, . . . , 0) where r1 ≥ r2 ≥ . . . ≥ rn > 0 are generalized singular values. The estimation of Jn is given by the first n rows of J. In practice, those theoretically “zero” singular values may not be zero. Thus a rank determination procedure needs to be performed and will be discussed shortly.
3.8 Instrument-variable Methods and EIV Subspace Identification
51
Once the state is estimated according to Equation 3.74, the state space model can be calculated as follows [68]: ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ xt+1 xt x x AB ⎠ = Cov(⎝ ⎠ , ⎝ ⎠)Cov −1 (⎝ t ⎠ , ⎝ t ⎠) ⎝ (3.76) CD yt ut ut ut
Note that Van Overschee and De Moor [66, 51] have classified the CVA method as a special case under a unifying framework and the difference between different subspace algorithms can be expressed in terms of the weighting matrices involved in the SVD. They have argued that, in Equation 3.65, if W1 = [(Yf /Uf⊥ )(Yf /Uf⊥ )T ]−1/2 and W2 = I − Uf⊥ Uf , then the CVA algorithm is obtained. 3.7.2
Determination of System Order
The AIC criterion is given by [60, 72] AICN (n) = j(m(1 + ln2π) + ln|Σn |) + 2δn Mn
(3.77)
where j
Σn = and
1 T t j t=1 t
t = yt − yˆt represents the residual, where yˆt is the one-step prediction based on the identified model. Mn is the total number of unknown parameters in the model. For the CVA model given by Equations 3.68 and 3.69, Mn = (2n + l)m + lm + m(m + 1)/2 The correcting factor, δn , accounting for the limited number of samples, is given by j δn = j − ( Mmn + m+1 2 ) The model order n or the number of significant singular values is the one that minimizes AICN (n).
3.8 Instrument-variable Methods and EIV Subspace Identification Several subspace identification methods can also be unified under the instrument variable framework [73]. With an appropriate selection of the instrument variables, error-in-variable (EIV) problem can also be dealt with.
52
3 Open-loop Subspace Identification
Consider the following EIV state space model, originated from Equations 3.3 and 3.4 without measurement noise: xt+1 = Axt + But + γt yt = Cxt + Dut In the EIV case, both yt and ut cannot be measured directly. The measured input and output are given by u∗t = ut + vt yt∗ = yt + st where γt , vt , st are white noise. With this EIV system, a subspace matrix equation can be derived as γ d v Uf∗ + HN Vf + HN Γf + Sf Yf∗ = ΓN Xf + HN
(3.78)
where Vf , Γf , Sf are data Hankel matrices corresponding to the white noise vt , γt , st , respectively. If the following instrument variable is selected: ⎞ ⎛ Up∗ ⎠ Π=⎝ Yp∗
then it is clear that Π is independent of Vf , Γf , Sf owing to the fact that past input and output are independent of future white noise. Right multiplying Equation 3.78 by 1j Π T yields 1 ∗ T 1 1 d ∗ T 1 v 1 γ 1 Y Π = ΓN Xf Π T + HN U f Π + HN Vf Π T + HN Γf Π T + Sf Π T j f j j j j j
As j → ∞, the following result is obtained: 1 ∗ T 1 1 d ∗ T Y Π = ΓN Xf Π T + HN Uf Π j f j j
(3.79)
To obtain the column space of ΓN , multiply Equation 3.79 by the orthogonal complement of Uf∗ Π T . This gives 1 ∗ T ∗ T ⊥ 1 Y Π (Uf Π ) = ΓN Xf Π T (Uf∗ Π T )⊥ j f j Thus ΓN can be estimated through SVD of 1j Yf∗ Π T (Uf∗ Π T )⊥ . With availability of ΓN , following the same procedure as discussed in previous sections, the system matrices can be retrieved. Detailed derivations of the instrument variable method for subspace identification can be found in [73]. It is claimed that the above procedure can be applied to closed-loop data, provided that the controller is causal, there is at least one sample time delay in the controller, and the closed-loop system is asymptotically stable. The limitation, however, lies in the assumption that there is at least one sample time delay in the controller.
3.9 Summary
53
Remark 3.1. If there is no measurement noise on the input, i.e., vt = 0, then the following instrument variable is also suggested [73]: ⎞ ⎛ Up ⎟ ⎜ ⎟ ⎜ Π = ⎜ Yp∗ ⎟ ⎠ ⎝ Uf
In an earlier paper by Chou and Verhaegen [74], a similar approach for EIV subspace identification based on QR factorization was proposed and the results are provided below. Perform the following QR factorization: ⎛ ⎞ ⎛ ⎞⎛ ⎞ Uf Up∗T Uf Yp∗T Q1 R11 0 ⎝ ⎠=⎝ ⎠⎝ ⎠ Yf∗ Up∗T Yf∗ Yp∗T Q2 R21 R22 Then as j → ∞,
1 1 d √ R21 = √ {ΓN Xf Up∗T Yp∗T QT1 + HN R11 } j j 1 1 √ R22 = √ ΓN Xf Up∗T Yp∗T QT2 j j
(3.80) (3.81)
According to Equation 3.81, the column space of ΓN is the same as that of √1 R22 ; thus SVD of √1 R22 will provide an estimate of ΓN following the N N same SVD procedure as discussed previously in this chapter. Combining Equad can be calculated. Thus all A, B, C, D matrices can be tions 3.80 and 3.81, HN d retrieved from ΓN and HN .
3.9 Summary An overview of a variety of open-loop subspace identification methods has been presented in this chapter. Many new subspace identification methods are being added to the literature, indicating a consistent interest in this direction. Many well-known model predictive control systems have also taken a subspace approach as options in building dynamic models, indicating the practicality of subspace identification. Nevertheless, open-loop subspace identification has been considered a matured subject. Many problems and current research interest lie in closed-loop subspace identification, which will be discussed in the next chapter.
4 Closed-loop Subspace Identification∗
4.1 Introduction The problem of closed-loop identification has been investigated for over 30 years. Important issues such as identifiability under closed-loop conditions have received attention by many researchers [75, 76, 77, 10]. A number of identification strategies have been developed [11, 10]. Closed-loop identification refers to the identification of process models using the data sampled under feedback control. Correlation between the disturbances entering the process and the input offers a fundamental limitation [28, 29, 11, 30, 10] for utilizing the standard open-loop identification methods with closed-loop data. Several closed-loop parametric model identification methods have been suggested in the literature which require either certain assumptions about the model structure or knowledge of the controller model. The closed-loop identification methods found in the literature are broadly classified into direct, indirect and joint input/output identification methods [29]. See Chapter 2 and references in [78, 29, 79] for a review of the features and limitations of different classical closed-loop identification methods. Closed-loop identification has attracted increasing interest over the last two decades owing to the work of integrated identification and control. The key idea in the integrated identification and control strategy (as opposed to a ‘disjoint’ or separate identification and control) is to identify and control with the objective of optimizing a control performance criterion. This topic has received attention under such headings as: control-relevant identification, iterative identification and control, etc. Readers are referred to [80, 81, 82] for detailed discussions on these topics. After the establishment of the classical prediction error methods (PEM) for system identification, the subspace identification method is a relatively new approach used for the state space model identification. In this approach, certain subspace matrices of the process are, at the first step, calculated through data projection [65, 66, 67]. These matrices include, for example, an extended observability matrix, a lower triangular block-Toeplitz matrix, and/or state sequences ∗
Section 4.3 is reprinted from Journal of Process Control, vol 15, Huang, B., S.X. Ding, J. Qin, “Closed-loop Subspace Identification: an Orthogonal Projection c 2005 Elsevier Ltd., with permission from Elsevier. Approach”, 53-66,
B. Huang et al.: Dyn. Model. Predict. Ctrl. & Perform. Monitor., LNCIS 374, pp. 55–78, 2008. c Springer-Verlag London Limited 2008 springerlink.com
56
4 Closed-loop Subspace Identification
as discussed in Chapter 3. These subspace matrices are directly calculated from the input/output data Hankel matrices without any iteration compared to the iterative or nonlinear optimization schemes used in the prediction error methods. At the second step, the system state space matrices are recovered from either the extended observability matrix and the lower triangular block-Toeplitz matrix or directly from the state sequences. Identification of the subspace matrices from closed-loop data has also received attention from a number of researchers [31, 13, 14, 30, 32, 33]. It is found that the conventional open-loop subspace identification algorithm yields a biased estimate when applied to closed-loop data [30]. Van Overschee and De Moor(1997) [33] proposed an N4SID based method for closed-loop subspace identification which requires knowledge of the first i impulse response coefficients of the controller, where i is the maximum order of the state space model to be identified. Through analysis of closed-loop subspace methods, Ljung and McKelvey(1996) [30] presented a method for the identification of subspace matrices from closedloop data using estimated predictors according to ARX models. MOESP and CVA approaches are also proposed for the identification of state space models using closed-loop data [57, 58, 59, 60, 61, 62, 63, 83]. Another class of subspace system identification is called the instrument variable methods [34, 84, 72], discussed in Chapter 3. In the class of instrument variable methods, the effect of disturbances is eliminated by appropriate selection of instrument variables that are independent of the disturbances. Under the framework of MOESP, Chou and Verhaegen (1997) [34] developed an instrument variable subspace identification algorithm as discussed in Chapter 3. It is claimed that the algorithm works for closed-loop systems provided there is at least one sample time delay in the controller, which can be restrictive in practice. Aiming at solving the open-loop error-in-variable (EIV) identification problem, Wang and Qin (2002) [72] developed an instrument variable subspace identification method via principal component analysis, but it may also deliver a bias in closed-loop identification. Using the subspace EIV model structure of [72], in this chapter we will develop a novel closed-loop subspace identification algorithm through an orthogonal subspace projection, and then use either the extended observability matrix/lower block-Toeplitz matrix or the Kalman filter states resulting from the projection, to extract system models. The models are obtained using projections of subspace matrices, as is the case in most other subspace identification algorithms, and has certain additional properties compared to the instrument variable methods. This chapter also provides an analysis of a bias problem in subspace closedloop identification and subsequently discusses a solution to this problem. The remainder of this chapter is organized as follows. An overview of the existing closed-loop subspace identification methods is presented in Section 4.2. The main results, a novel subspace closed-loop identification approach and a solution to eliminate the bias under closed-loop conditions, are discussed in Section 4.3. Simulation results are presented in Section 4.3.5, followed by summary in Section 4.4.
4.2 Review of Closed-loop Subspace Identification Methods
57
4.2 Review of Closed-loop Subspace Identification Methods Figure 4.1 shows a general representation of process under feedback control, referred to as closed-loop system. Closed-loop data cannot be used with the open-loop subspace methods for model identification because of the correlations between ut and et . Several approaches have been used to circumvent this problem in the literature. The various closed-loop subspace identification approaches will be reviewed in this section.
et
wt Gl
εt
rt -
uε ,t Gc1
ut
Gp
yt
Gc 2 Fig. 4.1. Closed-loop system for subspace identification
4.2.1
N4SID Approach
The approach proposed in [33] makes use of only the process input-output data to identify the model Gp . In this framework, Gc1 = 1; Gc2 = Gc ; wt = 0 and rt does not have to be measured, and the knowledge of the first few impulse responses (Markov parameters) of the controller need to be known. The controller equations can be written as xct+1 = Ac xct + Bc yt ut = rt − Cc xct − Dc yt
(4.1) (4.2)
where { Ac , Bc , Cc , Dc } are the controller’s system matrices, and xct is the controller state. The controller cannot be unstable. Following the notations of data Hankel matrices as defined in Chapter 3, certain pseudo data matrices are constructed from the input and output data Hankel matrices using the knowledge of the controller Markov parameters:
c Y0|2N −1 Np|q = U0|2N −1 + H2N
c Mp|q = Up|q + Hp−q+1 Yp|q
(4.3) (4.4)
58
4 Closed-loop Subspace Identification
c where 0 ≤ p ≤ q ≤ 2N − 1, H2N is the lower triangular block-Toeplitz matrix for the controller. Np|q and Mp|q can be shown to be uncorrelated with the disturbances [33]. On the assumptions that:
1. 2. 3. 4.
rt is uncorrelated with the disturbances; the matrix N0|2N −1 has a full row rank 2mN ; j −→ ∞; the closed-loop problem is well posed, i.e., (I + DDc ) is invertible;
then it is shown [33] that ⎛
U0|N −1
⎞
⎟ ⎜ ⎟ ⎜ Z = YN |2N −1 / ⎜ Y0|N −1 ⎟ ⎠ ⎝ MN |2N −1 ⎛ ⎞ N0|2N −1 ⎠ = YN |2N −1 / ⎝ Y0|N −1
ˆ f + H d MN |2N −1 ] = TN [ΓN X N
where
d c −1 HN ) TN = (I + HN
Therefore, to obtain the state estimation apply the following oblique projection and SVD: ⎛ ⎞ U0|N −1 ⎠ O = YN |2N −1 /MN |2N −1 ⎝ (4.5) Y0|N −1 ˆf = TN ΓN X =
(4.6)
⎛
⎞⎛
⎞
Σ1 0 VT ⎠⎝ 1 ⎠ U1 U2 ⎝ 0 0 V2T
(4.7)
where state order = rank(O). Then it can be seen that 1/2
TN ΓN = U1 Σ1 ˆ f = Σ 1/2 V1T X 1 States sequences can therefore be extracted. A state space model for the process is subsequently identified from the estimated states. However, it is pointed by [33] that the state sequence estimated in this way will give a slightly biased result for extracting system matrices; a better algorithm has also been proposed, which is obtained through a more complex procedure. In [33] there exist two algorithms of calculating state space matrices according to the estimated states and another algorithm according to ΓN . The limitation of this method is the requirement of the knowledge of the controller. In practice, accurate knowledge of the impulse response (IR) coefficients/Markov parameters of the controller may not always be available.
4.2 Review of Closed-loop Subspace Identification Methods
4.2.2
59
Joint Input-Output Approach
The overall strategy of these methods is similar to the joint input-output identification strategy familiar in conventional closed-loop system identification. These methods do not require the knowledge of the controller. Apart from setpoint excitation, these approaches [58, 85, 83, 86] need an additional external excitation, wt , added to the controller output, uε,t , to make the process input independent of noise, as shown in Figure 4.1, where Gc1 = Gc and Gc2 = 1; rt is the setpoint that is a white noise sequence; et is the white noise (disturbance) added to the process output, yt , through a noise model, Gl . Then the measurable input T T vector is wtT rtT and measurable output vector is uTε,t εTt uTt ytT , where uε,t = Gc εt = ut − wt and εt = rt − yt . Using this information a global state space model is first identified using the MOESP algorithm. The global state space model is denoted as:
wt (4.8) xt+1 = Axt + B1 B2 rt ⎛ ⎞ ⎛ ⎞ ⎞ ⎛ C1 uε,t D11 D12 ⎜ ⎟ ⎜ ⎟ ⎟ ⎜ ⎜ εt ⎟ ⎜ C2 ⎟ ⎜ D D ⎟ wt (4.9) ⎜ ⎟ = ⎜ ⎟ xt + ⎜ 21 22 ⎟ ⎝ ut ⎠ ⎝ C3 ⎠ ⎝ D31 D32 ⎠ rt yt C4 D41 D42 xt has an order n + nc , where n is the order of the process, Gp , and nc is the order of the controller, Gc . The state space representation for the transfer functions Hw→y and Hw→u is given by the system matrices [A, B1 , C4 , D41 ] and [A, B1 , C3 , D31 ] respectively. Using the rules for concatenating and inverting state space models, the process model in the state space form, Gsp , and the controller model in the state space form, Gsc , are obtained as: Gsp = [A, B1 , C4 , D41 ][A, B1 , C3 , D31 ]−1
and
−1 −1 −1 −1 = [A, B1 , C4 , D41 ][A − B1 D31 C3 , B1 D31 , −D31 C3 , D31 ] ⎞ ⎛ ⎞ ⎞ ⎛⎛ −1 −1 B1 D31 A −B1 D31 C3 −1 ⎠ ⎠,⎝ ⎠ , C4 −D41 D−1 C3 , D41 D31 = ⎝⎝ 31 −1 −1 0 A − B1 D31 D3 B1 D31
Gsc = [A, B2 , C3 , D32 ][A, B2 , C2 , D22 ]−1 ⎞ ⎛ ⎞ ⎞ ⎛⎛ −1 −1 B2 D22 A −B2 D22 C2 −1 ⎠ ⎠,⎝ ⎠ , C3 −D32 D−1 C2 , D32 D22 = ⎝⎝ 22 −1 −1 0 A − B2 D22 D2 B2 D22
The overall deterministic state space model is first identified. The order selection in this step needs the specification of the sum of process model order and
60
4 Closed-loop Subspace Identification
controller model order [83]. The individual plant and controller model orders are determined/selected in the subsequent model reduction step. The approach of [86] based on CVA uses practically the same approach as MOESP except that the CVA algorithm is used to identify the state space matrices of the global closed-loop system, with AIC used in the selection of the state order. The state space representation is then converted to the transfer function representation, followed by the concatenation and inversion of the closed-loop transfer functions to obtain the open-loop process transfer function. 4.2.3
ARX Prediction Approach
Ljung and McKelvey (1996) [30] provide an insight into the problem of using the conventional subspace methods for closed-loop identification, and then demonstrate a feasible approach for closed-loop subspace identification. Consider an impulse response model for the process with input ut and noise et yt =
∞
Fku ut−k +
k=0
∞
Fke et−k
(4.10)
k=0
of which a predictor without considering future control actions can be extracted as yˆ(t + j|t) =
∞
Fku ut+j−k +
k=j
∞
Fke et+j−k
(4.11)
k=j
Define a prediction vector: ⎞ yˆ(t + 1|t) ⎟ ⎜ .. ⎟ ⎜ Yr (t) = ⎜ ⎟ . ⎠ ⎝ yˆ(t + r|t) ⎛
Then it can be shown [30] that the system order n is the rank of Yr (t) where r ≥ n. The Kalman filter state can be reconstructed by picking up a basis from Yr (t), namely xt = LYr (t) The prediction yˆ(t + j|t) can be solved through the following equation that is derived directly from Equation 4.10 by replacing time subscript t with t + j: yt+j =
∞
Fku ut+j−k
k=0
+
∞
Fke et+j−k
(4.12)
k=0
Equation 4.12 can be transferred to: yt+j =
j−1
Fku ut+j−k +
k=0
=
j−1
k=0
Fku ut+j−k +
∞
u Fk+j ut−k +
k=0 ∞
k=0
u Fk+j ut−k +
j−1
Fke et+j−k +
k=0 j−1
k=0
Fke et+j−k +
∞
e Fj+k et−k
k=0 ∞
k=0
e F˜j,k yt−k
(4.13)
4.2 Review of Closed-loop Subspace Identification Methods
61
The last term of Equation 4.13 is obtained by replacing es with ys for s ≤ t since there is one to one correspondence between the innovation es and the output ys [30]. By truncating infinite series into finite ones, Equation 4.13 is approximated by yt+j =
j−1
k=0
Fku ut+j−k +
nb
k=0
u Fk+j ut−k +
na
e F˜j,k yt−k + t+j
(4.14)
k=0
j−1 where t+j = k=0 Fke et+j−k . By removing the first term and the last term on the right hand side of Equation 4.14, we get an approximate prediction without considering the control actions and noise after time t. Thus the prediction defined by Equation 4.11 can be approximated by the middle two terms on the right hand side of Equation 4.14 by noting the correspondence between the innovation et and the output yt . As a result, the key to reconstructing the state is to estimate Equation 4.14. It is shown by [30] that the estimation of Equation 4.14 is consistent if and only if et+j , . . . , et+1 and ut+j , . . . , ut+1 are independent. This condition will not hold if there is feedback control. However, if there is one sample time delay in the system and j = 1 (one step prediction), according to [30] estimation of Equation 4.14 will be consistent despite of the feedback control. Note that, with j = 1, Equation 4.14 becomes an ARX model. Consequently, to circumvent the inconsistency one can recursively use one-step prediction to derive two-step prediction, then three-step, until r-step prediction. In summary, the various steps of this approach are: 1. find yˆ(t + j|t) for j = 1, ..., r by recursively using the one-step prediction of the ARX model; 2. form Yr (t) for t = 1, ..., N and estimate its rank n; 3. select L to obtain a well conditioned basis and x(t) = LYr (t), for t = 1, ..., N ; 4. find the matrices {A, B, C, D, K} by using the estimated states. Note that the states are identified from the estimated predictors and not directly from the data, unlike the previously illustrated approaches. The authors of [30] state that this method is only a ‘feasible’ method rather than the ‘best way’ of identifying systems operating in closed-loop. 4.2.4
An Innovation Estimation Approach
According to Equation 3.42, Equation 3.41 can be written further as the following form: Yf = ΓN Lp Wp + Lu Uf + Le Ef (4.15) There are N block rows in Equation 3.41. Qin and his coworkers [35, 44, 45] proposed a method that partitions Equation 3.41 into N block rows and used the estimated innovations from previous rows to further estimate model parameters of the next row sequentially.
62
4 Closed-loop Subspace Identification
The ith block row equation can be written as Yf i = ΓN i Lp Wp + Lui Uf i + L− ei Ef,i−1 + Ei
(4.16)
where Yf i is the ith block row of Yf , ΓN i the ith block row of ΓN , Lui the ith block row of Lu by excluding the last j − i block columns (they are actually zeros), L− ei the ith block row of Le by excluding the last j − i + 1 block columns, Uf i the first i block rows of Uf , Ef,i−1 the first i − 1 block rows of Ef , and Ei the ith block row of Ef . The above structure is due to the lower triangular nature of Lu and Le . For i = 1, Equation 4.16 can be written as Yf 1 = ΓN 1 Lp Wp + Lu1 Uf 1 + E1
(4.17)
E1 is uncorrelated with Wp since it is future white noise disturbance relative to Wp . Consider that there is a zero-order-hold in the system, i.e, D = 0; thus Lu1 = D = 0. Equation 4.17 is simplified to Yf 1 = ΓN 1 Lp Wp + E1 Hence ΓN 1 Lp can be estimated as ΓN 1 Lp = Yf 1 Wp† The innovation E1 can then be estimated as E1 = Yf 1 − ΓN 1 Lp Wp For i = 2, . . . , N , Yf i = ΓN i Lp Wp + Lui Uf i + L− ei Ef,i−1 + Ei where
⎛
Ef,i−1
E1
(4.18)
⎞
⎟ ⎜ ⎟ ⎜ ⎜ E2 ⎟ ⎟ =⎜ ⎟ ⎜ .. ⎟ ⎜. ⎠ ⎝ Ei−1
has been estimated from the previous steps. If there is at least one sample time delay in the controller, then Ei is uncorrelated with Uf i . It follows from Equation 4.18 that ⎛ ⎞ †
ΓN i Lp Lui
W ⎜ p ⎟ ⎜ ⎟ − = Y ⎜ ⎟ Lei U fi ⎝ fi ⎠ Ef,i−1
4.3 An Orthogonal Projection Approach
63
Assembling all ΓN i Lp for i = 1, 2, . . . , N yields ⎛
ΓN 1 Lp
⎞
⎜ ⎟ ⎜ ⎟ ⎜ ΓN 2 Lp ⎟ ⎟ ΓN Lp = ⎜ ⎜ .. ⎟ ⎜. ⎟ ⎝ ⎠ ΓN N Lp which is an estimate of Lw . The remaining steps to extract system matrices are the same as that given in Equations 3.45, 3.46, and 3.47.
4.3 An Orthogonal Projection Approach In this section, a closed-loop subspace identification procedure based on orthogonal projection [36] is presented. By detailed analysis of feedback effect on subspace projection, a source that introduces bias error to closed-loop subspace identification is identified, and a procedure is proposed for elimination of the bias error. The analysis presented in this section is informative for understanding the limitations of closed-loop subspace identification and for the path to the solutions. Thus this section is elaborated in detail. 4.3.1
A Solution through Orthogonal Projection
Open-loop subspace identification has been discussed in Chapter 3. One of the solutions, N4SID, is obtained through the oblique projection of the following equation: d s U f + HN Ef (4.19) Yf = ΓN Xf + HN Several terms on the right hand side of the equation become zero after the oblique projection. However, the situation becomes more complex for closed-loop identification where the future disturbance Ef is no longer independent of the future input Uf due to the feedback. The implication of this dependency is that the oblique projection of Ef along Uf on to Wp is no longer zero, although the orthogonal projection of Ef onto Wp is zero. This phenomenon can be observed from Figure 4.2, where the orthogonal projection of A to C is zero but the oblique projection of A to C via B is not zero. To circumvent this problem, by adopting the EIV structure of [72], we move the term related to Uf into the left hand side of Equation 4.19 as it would be a problematic term if it remained on the right hand side of the equation. This yields a new equation, very similar to the total least squares equation [87, 88] in the sense that both input and output variables are on the same side of the equation. ⎞ ⎛ Yf s d ⎠ = ΓN Xf + HN ⎝ Ef (4.20) I −HN Uf
64
4 Closed-loop Subspace Identification
B
A
A /B C
C
Fig. 4.2. Illustration of oblique projection under the closed-loop condition
Using a short-hand notation: ⎛
Wf = ⎝
Yf Uf
⎞ ⎠
Equation 4.20 can be simplified to s d Wf = ΓN Xf + HN Ef I −HN
(4.21)
Performing an orthogonal projection of Equation 4.21 on to the row space of Wp yields d W /W = Γ X /W + H s E /W (4.22) I −HN f p N f p p N f The last term of Equation 4.22 is an orthogonal projection of the future disturbance (white noise) on to the row space of past input and output matrix Wp , which is zero. Therefore, Equation 4.22 can be simplified to d ˆf Wf /Wp = ΓN Xf /Wp = ΓN X (4.23) I −HN The second equality is obtained from Equation 3.57.
Remark 4.1. Equation 4.23 is a natural result through the projection as is often done in subspace system identification literature. The orthogonal projection of Equation 4.21 on to the row space of Wp results in Equation 4.23, which includes a multiplication term between the extended observability matrix ΓN and Kalman ˆ f . Wang and Qin (2002) [72], on the other hand, used an instrument state X variable method to arrive at an equation as d W WT = Γ X WT (4.24) I −HN f N f p p
4.3 An Orthogonal Projection Approach
65
where the instrument variable is the past input and output Wp . This equation is derived by multiplying Equation 4.21 by WpT and noticing the independency between Wp and Ef . In [89], an alternative method through a column weighting is proposed aiming at reducing the variance error and providing a consistent estimation. The projection method may also be mathematically regarded as an instrument variable method where the instrument is WpT (Wp WpT )−1 Wp (since the orthogonal projection Wf /Wp = Wf WpT (Wp WpT )−1 Wp ). However, using the projection results in a multiplication term between the extended observability matrix and the Kalman filtered state, which provides some additional features and performance, as will be seen shortly. We shall call the projection method the subspace orthogonal projection identification method, abbreviated SOPIM, while the subspace identification method via PCA of Wang and Qin (2002) [72] is abbreviated as SIMPCA, in this chapter. Now multiplying both sides of Equation 4.23 by the orthogonal column space of ΓN , denoted by ΓN⊥ , yields d W /W = 0 (ΓN⊥ )T I −HN (4.25) f p
Denoting Z = Wf /Wp , the problem is transferred to finding the orthogonal column space of Z, which should equal the column space of T d . (ΓN⊥ )T I −HN Perform SVD decomposition of Z as ⎞ ⎛ ⎞⎛ Σ1 V1T ⎠ ⎠⎝ (4.26) Z = U1 U2 ⎝ 0 V2T where, in practice, Z is not singular and one has to determine its rank by checking its singular values. In theory, its rank should be lN + n [72] assuming that the external perturbation is persistent excitation. The rank determination is equivalent to the determination of system orders (see Section 3.7.2). With Equation 4.26, one can find the orthogonal column space of Z, which is U2 . Therefore T d = U2 M (4.27) (ΓN⊥ )T I −HN
where M ∈ R(mN −n)×(mN −n) is any full-rank constant matrix. Partitioning ⎛ ⎞ P1 U2 M = ⎝ ⎠ P2 then Equation 4.27 can be written as ⎛ ⎞ ⎛ ⎞ (ΓN⊥ ) P ⎝ ⎠= ⎝ 1⎠ d T ⊥ −(HN ) ΓN P2
(4.28)
66
4 Closed-loop Subspace Identification
Therefore,
and
ΓN = P1⊥
(4.29)
d −P2T = (ΓN⊥ )T HN
(4.30)
d , and then to extract the system The remaining problem is to solve for ΓN and HN d . The right hand side of Equation 4.30 has matrices A, B, C, D from ΓN and HN the same format as that of Equation 3.64. The procedure thus discussed in Section 3.5.3 for extracting A, B, C, D can be applied. d depend only Remark 4.2. As can be seen from the above derivation, ΓN and HN on the last few left singular vectors of Z. The left singular vectors of Z can also be calculated from eigen decomposition of ZZ T as ⎛ ⎞ ⎞⎛ Σ2 U1T 1 T ⎠ ⎠⎝ (4.31) ZZ = U1 U2 ⎝ 0 U2T
ZZ T has a dimension of (mN + lN ) × (mN + lN ) while Z has a dimension of (mN + lN ) × j. As typically j >> (mN + lN ), the decomposition of ZZ T is much simpler than that of Z. Therefore, we recommend calculating U2 from the eigen decomposition of ZZ T . U2 is the eigenvectors that correspond to the minor eigenvalues of ZZ T (zero eigenvalues if there is no disturbance). However, according to [90], squaring the singular values in the eigen decomposition method may lead to poor accuracy in the smallest singular values, which is the tradeoff of using the eigen decomposition approach. Remark 4.3. Subspace identification algorithms draw a lot of their strength from the use of QR decomposition for the projections, where in general Q is never calculated. Such a QR decomposition is known to speed up the projection by up to several orders of magnitude. The approach of using QR decomposition for solving this problem is presented below. Perform a QR decomposition ⎞ ⎛ ⎞⎛ ⎞ ⎛ R11 0 QT1 Wp ⎠=⎝ ⎠⎝ ⎠ ⎝ R21 R22 Wf QT2 Then the orthogonal projection can be calculated via the following equation: −1 Z = Wf /Wp = R21 R11 Wp
(4.32)
Can SOPIM and SIMPCA work under closed-loop conditions? Up to now, it appears that there should be no question but that both work under closed-loop conditions as the input Uf , the correlation of which with the future disturbance Ef is the key problem in closed-loop identification, is not involved in the projection. Both Equations 4.23 and 4.24 appear to be able to determine the process model uniquely irrespective of open or closed-loop. However, simulation results indicate that this is not the case, although both work for open-loop systems.
4.3 An Orthogonal Projection Approach
4.3.2
67
The Problem of Biased Estimation and the Solution
The problem must have resulted from the controller since both algorithms work fine under the open-loop condition. To diagnose the problem, let’s consider that the controller is described by the following state space model: xct+1 = Ac xct + Bc (rt − yt )
(4.33)
ut = Cc xct + Dc (rt − yt )
(4.34)
where r is the setpoint excitation. Using subspace notations, we should have the controller expressed as c Uf = ΓNc Xfc + HN (Rf − Yf ) c c c Up = ΓN Xp + HN (Rp − Yp )
(4.35) (4.36)
where, following the similar notations of Section 3.2.2, Rp and Rf are data Hankel matrices of the setpoint, Xpc and Xfc are the state matrices of the controller, c ΓNc is the extended observability matrix, and HN is the lower triangular blockToeplitz matrix. Equation 4.35 can be re-arranged to give c c c c (4.37) HN I Wf = ΓN Xf + HN Rf Combining Equations 4.21 and 4.37 gives ⎞ ⎛
d s I −HN Γ X H E N f f N ⎠ Wf = ⎝ + c c ΓNc Xfc HN Rf HN I
(4.38)
Pre-multiplying Equation 4.38 by
ΓN⊥
ΓNc⊥
T
to remove the state variables yields
ΓN⊥
ΓNc⊥
T
d I −HN c HN I
Wf =
s (ΓN⊥ )T HN
c (ΓNc⊥ )T HN
Ef Rf
(4.39)
In order to remove the effect of noise, post-multiply Equation 4.39 by an instrument variable W T ⎞ ⎛ ⎞T ⎛ ⎞⎛ ⎞ ⎛ d ⊥ T s ΓN⊥ I −HN E (Γ ) H N ⎠ Wf W T = ⎝ N ⎝ ⎠ ⎝ ⎠⎝ f ⎠WT c⊥ c c⊥ T c ΓN HN I (ΓN ) HN Rf (4.40)
68
4 Closed-loop Subspace Identification
where WT =
⎧ ⎨WT
for SIMPCA p ⎩ W T (W W T )−1 W for SOPIM p p p p
(4.41)
For the sake of rigor in the following derivation, multiply Equation 4.40 by and this results in ⎛ ⎞T ⎛ ⎞ d I −HN 1 ⎝ ΓN⊥ ⎠ ⎝ ⎠ Wf W T c j ΓNc⊥ HN I ⎛ ⎞⎛ ⎞ s Ef 1 ⎝ (ΓN⊥ )T HN ⎠⎝ ⎠WT = c j (ΓNc⊥ )T HN Rf
1 j
(4.42)
Let’s first consider SIMPCA where W = Wp . Since the rows of Ef are independent of the rows of Wp , 1j Ef WpT → 0 as j → ∞ and the first block row on the d , right hand side of Equation 4.42 will be zero. Consequently (ΓN⊥ )T I −HN which contains the essential information of the process model, must fall in the subspace orthogonal to the column space of Wf WpT and, therefore, process models can be recovered from the SVD decomposition of Wf WpT , as discussed before. However, if the rows of Rf are also independent of the rows of Wp , e.g., in the case of white noise, then 1j Rf WpT → 0 as j → ∞, and the second block row on the c right hand side of Equation 4.42 will also be zero. This means (ΓNc⊥ )T HN I , which contains the essential information of the controller model, is also part of the subspace orthogonal to the column space of Wf WpT . As a consequence, the subspace orthogonal to the column space of Wf WpT , which is used to identify the process model, contains both the process model relation and the controller relation. By merely selecting this subspace to extract a model, it is likely that the controller relation is also selected in addition to the process model relation, and the SVD decomposition is not able to tell which model, i.e., a process, controller or mixed model, has been selected. In this case the process identifiability is lost. In order to guarantee that the left null space of Wf W T contains only the process model subspace, the instrument variable W must satisfy the following conditions: 1) 1j Ef W T = 0 2) 1j Rf W T = 0 as j → ∞. A good choice for such an instrument in the case of SIMPCA should be ⎞ ⎛ Rf ⎠ (4.43) W = Wpr = ⎝ Wp
4.3 An Orthogonal Projection Approach
69
i.e., the instrument variable should include Rf , since multiplication of Rf by its transpose guarantees the product to be non-zero. The same applies to SOPIM. In this case, rather than projecting to Wp , we should project Equation 4.21 on to Wpr . This yields d W /W ˆ (4.44) I −HN f pr = ΓN Xf /Wpr = ΓN Xf This new projection guarantees Rf /Wpr = 0 in the projection of Equation 4.37 on to Wpr , and therefore eliminates the overlap between the process model subspace (Equation 4.44) and controller subspace (by projecting Equation 4.37 on to Wpr ), which is c c c c (4.45) HN I Wf /Wpr = ΓN Xf /Wpr + HN Rf /Wpr
where the last term of the equation is guaranteed to be non-zero. With the modification, all the computation procedures discussed in the last section are valid for closed-loop identification after replacing Wp by Wpr , and then both SOPIM and SIMPCA can be truly applied to closed-loop data. We shall call the modified algorithms closed-loop SOPIM (abbreviated as CSOPIM) and closed-loop SIMPCA (abbreviated as CSIMPCA), respectively. To summarize, for open-loop identification, the instrument variable could be chosen according to Equation 4.41, while for closed-loop identification, the instrument variable must be chosen according to Equation 4.46 ⎧ ⎨WT for CSIMPCA pr (4.46) WT = ⎩ W T (W W T )−1 W for CSOPIM pr
pr
pr
pr
Of course, the closed-loop algorithms are more general and also applicable to open-loop identification.
Remark 4.4. From the above discussion, one can see that the undesired effect of the feedback control on the identifiability of open-loop instrument and/or projection subspace methods may also be alleviated if the setpoint rt “stays away” from whiteness, i.e., if they are (strongly) autocorrelated or colored. But this “non-whiteness” is ambiguous and there is no a priori indication on how Rf /Wp or Rf WpT will be different from zero even if rt is not white noise. In addition, the effect of whiteness or non-whiteness on the identifiability also depends on the controller in the feedback and the disturbances that are affecting the process. For example, if the output cannot follow the setpoint closely due to large disturbances, then Wp can have a little correlation with Rf even if the setpoint is not white, resulting 1j Rf WpT → 0. Consequently, Wp may not be suitable to be an instrument in this case even though the external excitation is non-white. 4.3.3
Model Extraction through Kalman Filter State Sequence
In this subsection, we shall develop an alternative method for both open-loop and closed-loop identification through the estimated Kalman filter state sequences.
70
4 Closed-loop Subspace Identification
Combine Equations 4.23 and 4.44 as d ˆf Z = ΓN X I −HN
where
⎧ ⎨ W /W for SOPIM f p Z= ⎩ W /W for CSOPIM f pr
(4.47)
(4.48)
d Z should It is easy to see from Equation 4.47 that the row space of I −HN ˆ span the row space of Xf . Therefore, up to similarity transformation, the Kalman state can be calculated, for example, from ˆ f = Γ † I −H d Z X (4.49) N N With the availability of the state sequence, the system matrices may be extracted using several methods [67]. It has been discussed in Chapter 3 that
XN +1 AB XN K (4.50) = + EN |N YN |N CD UN |N I ˆ f ). To calXN can be directly calculated from Equation 4.47 (i.e., XN ← X culate XN +1 , one can simply change data Hankel matrices according to Rf = RN +1|2N −1 Up = U0|N Yp = Y0|N Uf = UN +1|2N −1 Yf = YN +1|2N −1 where, again, the subscripts follow the notations of Equations 3.23-3.30. Z is changed accordingly. Then XN +1 can be calculated as d XN +1 = Γ −1 Z (4.51) I −H N N
where Γ N is obtained by eliminating the last m rows of ΓN , and H dN is obtained d by eliminating the last m rows and the last l columns of HN . With the availability of XN , XN +1 , all parameter matrices, A, B, C, D, K can be estimated by performing least squares on Equation 4.50. We shall call this method the subspace orthogonal projection identification method via the state estimation for model extraction, abbreviated SOPIM-S. However, SOPIM-S may not be applicable to closed-loop data even though the state has been estimated. The main problem is Equation 4.50. Depending on the feedback control law and the strength of the external excitation, UN |N can be highly correlated with XN . A bias in the estimation of the parameters
4.3 An Orthogonal Projection Approach
71
from Equation 4.50 will therefore be expected. The remedy to this problem is through the following three-step procedure: Step 1: First estimate A from ΓN through the following equation: Γ¯N = Γ N A
(4.52)
where Γ¯N is ΓN by removing its first m rows, and Γ N is ΓN by removing its last m rows. Step 2: Assuming that there is at least one sample time delay in the process, i.e., D = 0, then C can be estimated from the following equation through regression analysis: (4.53) YN |N = CXN + EN |N The residual EN |N can be calculated from this equation. Step 3: B and K can be estimated from the following equation: ⎞ ⎛ U N |N ⎠ XN +1 − AXN = B K ⎝ EN |N
(4.54)
We shall call this modified method the closed-loop subspace orthogonal projection identification method via the state estimation for model extraction, abbreviated CSOPIM-S. Model estimation algorithms based on Kalman filter state, SOPIM-S and CSOPIM-S, appear to be more cumbersome than the algorithms that directly d matrices, such as SOPIM and CSOPIM. Howextract models from ΓN and HN ever, SOPIM-S and CSOPIM-S also calculate disturbance model parameters (or Kalman gain matrix K) directly, while SOPIM and CSOPIM need additional steps to extract K. 4.3.4
Extension to Error-in-variable (EIV) Systems
The derived closed-loop identification algorithm can also be directly applied to closed-loop identification of error in variable (EIV) systems with the following input-output measurements: u∗t = ut + vt yt∗ = yt + st
(4.55) (4.56)
where ut and yt are given by the state space equations (3.1) - (3.2); u∗t and yt∗ are measured input and output respectively; vt and st are white noise. Let the future data Hankel matrices of vt and st be written as Vf and Sf , and the future data Hankel matrices of yt∗ and u∗t be written as Yf∗ and Uf∗ . Define short-hand notations ⎛ ⎞ Vf Df = ⎝ ⎠ Sf
72
4 Closed-loop Subspace Identification
⎛
Wf = ⎝
and
⎛
Wf∗ = ⎝
Yf Uf Yf∗ Uf∗
⎞ ⎠
⎞ ⎠
Using these short-hand notations, Equations 4.55 and 4.56 can be written as Wf∗ = Wf + Df
(4.57)
Then for the EIV systems, it follows from Equations 4.21 and 4.57 that d (W ∗ − D ) = Γ X + H s E (4.58) I −HN f N f f N f or
Now define
s d d D Wf∗ = ΓN Xf + HN Ef + I −HN I −HN f ⎛
Rf
(4.59)
⎞
⎟ ⎜ ⎟ ⎜ ∗ Wpr = ⎜ Yf∗ ⎟ ⎠ ⎝ Uf∗
∗ Projecting Equation 4.59 orthogonally on to Wpr yields
d W ∗ /W ∗ = Γ X /W ∗ + H s E /W ∗ + d D /W ∗ (4.60) I −H I −HN N f f f f pr pr N pr pr N
∗ Following the same argument made before, Ef /Wpr = 0. Since vt and st are ∗ white noise, Df /Wpr = 0. Therefore,
∗ ∗ d ˆf Wf∗ /Wpr = ΓN Xf /Wpr = ΓN X I −HN
(4.61)
ˆ f = Xf /W ∗ . Solving Equation 4.61 through the singular value decomwhere X pr position procedure or the Kalman filter state estimation method as we did for d the regular systems yields HN , ΓN and the Kalman filter state sequence for the EIV systems, and the system matrices can be extracted subsequently. 4.3.5
Simulation
In this section, we will use a benchmark problem to evaluate the orthogonal projection approach and compare it with other existing subspace identification algorithms. We will apply the following representative subspace algorithms in the literature: the closed-loop algorithm by Van Overschee and De Moor (1996,1997)
4.3 An Orthogonal Projection Approach
73
[32, 33], the closed-loop algorithm by Verhaegen (1993)[83], the closed-loop algorithm by Ljung and McKelvey(1996) [30], two classical subspace algorithms, N4SID [67] and MOESP [91], and two versions of CVA, i.e., MATLAB N4SID with CVA weighting and CVA according to [60]. To comply with the standard practice in subspace identification literature [32, 33], we will perform MonteCarlo simulations, and the averaged Bode magnitude plot from the Monte-Carlo simulations will be used to represent bias error, while the scatter plot of estimated poles will be used to represent the variance error of the estimation. The system to be considered was first presented by Verhaegen (1993) [83], and was used again as a benchmark problem by Van Overschee and De Moor (1996,1997) [32, 33] for the comparison of closed-loop subspace identification algorithms. The block diagram of the original system is shown in Figure 4.3. The model, expressed in the innovation state space form [32], is given by Equations 3.1 and 3.2 with the following numerical values: ⎛
4.40 1 ⎜ ⎜ −8.09 0 ⎜ A = ⎜ 7.83 0 ⎜ ⎝ −4.00 0 0.86 0
0 1 0 0 0
0 0 1 0 0
⎞ ⎛ ⎞ ⎛ ⎛ ⎞ ⎞ 2.3 0.00098 1 0 ⎟ ⎜ ⎟ ⎜ ⎜ ⎟ ⎟ ⎜ −6.64 ⎟ ⎜ 0.01299 ⎟ ⎜0⎟ 0⎟ ⎟ ⎜ ⎟ T ⎜ ⎟ ⎜ ⎟ 0 ⎟ , B = ⎜ 0.01859 ⎟ , C = ⎜ 0 ⎟ , K = ⎜ 7.515 ⎟ ⎟ ⎜ ⎟ ⎜ ⎜ ⎟ ⎟ ⎝ −4.0146 ⎠ ⎝ 0.0033 ⎠ ⎝0⎠ 1⎠ 0.86336 −0.00002 0 0
The state space model of the feedback control has the following values: ⎛
⎞ ⎛ ⎞ ⎛ ⎞ 2.65 −3.11 1.75 −0.39 1 −0.4135 ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ 1 ⎜0⎟ ⎜ 0.8629 ⎟ 0 0 0 ⎟ Ac = ⎜ ⎟ , Bc = ⎜ ⎟ , CcT = ⎜ ⎟ ⎝ 0 ⎝0⎠ ⎝ −0.7625 ⎠ 1 0 0 ⎠ 0 0 1 0 0 0.2521 with Dc = 0.61. The simulation conditions are exactly the same as those used by Van Overschee and De Moor (1996) [32]: et is a Gaussian white noise sequence with variance 1/9; the reference signal rt is a Gaussian white noise sequence with variance 1. Each simulation generates 1200 data points. We generate 100 simulated data sets, each with the same reference input rt but with a different noise sequence et . Van Overschee and De Moor (1996,1997) [32, 33] used this example to compare several closed-loop identification algorithms, including their proposed one (denoted here as Algorithm Van Overschee and De Moor)1 , the closed-loop algorithm by Verhaegen (1993) [83] (denoted here as Algorithm Verhaegen), the closed-loop algorithm by Ljung and McKelvey(1996) [30] (denoted here as Algorithm Ljung and McKelvey), and the classical N4SID algorithm. The outcome 1
In [32, 33], three algorithms are proposed. Algorithm 1, the best one among the three, is used in this chapter for the comparison.
74
4 Closed-loop Subspace Identification
w s y
u
r
Gp
-
Gc Fig. 4.3. Block diagram of the process example used by van Overschee and De Moor
CSOPIM Imag Part
dB
100
0
0.5 0 −0.5
−1
10
0
10
1
10
Van Overschee−De Moor Imag Part
dB
−100 −2 10 100
0
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5 0 −0.5
−1
10
0
10
1
10
Verhaegen Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−1
10
0
10
1
10
Ljung−McKelvey Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−100 −2 10
−1 10 rad/sec
0
10
1
10
Fig. 4.4. Closed-loop Monte-Carlo simulations. The left column is Bode magnitude plots; the dotted lines are the true values and the solid lines are the estimated values averaged from 100 runs. The right column is the scatter plots of the eigenvalues of the estimated A matrix.
of the comparison by Van Overschee and De Moor (1996,1997) [32, 33] was that algorithm 1 of Van Overschee and De Moor and algorithm Verhaegen gave an unbiased estimate under closed-loop conditions, while all other algorithms gave a certain bias. Although algorithm Verhaegen gave an unbiased estimate, the variance of the estimated poles was significantly larger than other algorithms due to the fact that the high-order closed-loop model has to be fitted first in
4.3 An Orthogonal Projection Approach
CSOPIM, meas. variance=0.2 Imag Part
dB
100
0
75
0.5 0 −0.5
−1
10
0
10
1
10
Van Overschee−De Moor, meas. variance=0.2
Imag Part
dB
−100 −2 10 100
0
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5 0 −0.5
−1
10
0
10
1
10
CSOPIM, meas. variance=0.5 Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−1
10
0
10
1
10
Van Overschee−De Moor, meas. variance=0.5
Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−100 −2 10
−1
10
0
rad/sec 10
1
10
Fig. 4.5. Closed-loop Monte-Carlo simulations for the EIV system. The left column is Bode magnitude plots; the dotted lines are the true values and the solid lines are the estimated values averaged from 100 runs. The right column is the scatter plots of the eigenvalues of the estimated A matrix.
the algorithm. Therefore, the conclusion was that algorithm 1 of Van Overschee and De Moor is superior to other algorithms in terms of performance. However, the algorithms proposed in Van Overschee and De Moor (1996,1997) [32, 33] required precise information of at least first i Markov parameters of the controller model, while others did not. In this simulation, we will reproduce the results of Van Overschee and De Moor (1996,1997) [32, 33] and compare them with the most representative algorithm of this chapter, CSOPIM. The simulation results are shown in Figure 4.4. From this figure, we can see that the algorithm CSOPIM has almost identical performance as that of algorithm Van Overschee and De Moor in both bias and variance aspects. However, the algorithm CSOPIM has the advantage over algorithm Van Overschee and De Moor in the sense that it does not need any knowledge about the controller model while algorithm Van Overschee and De Moor does. To see the further advantage of the algorithm CSOPIM, we consider an EIV case by adding white noises to the measurements of both ut and yt . We perform two Monte-Carlo simulations for the EIV case, one with measurement noise variance 0.2 and the other 0.5. The comparison results are shown in Figure 4.5. From this
4 Closed-loop Subspace Identification
dB
100
N4SID
Imag Part
76
0
0.5 0 −0.5
−1
0
10
10
1
10
MOESP Imag Part
dB
−100 −2 10 100
0
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5 0 −0.5
−1
0
10
10
1
10
CVA−MATLAB Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−1
0
10
10
1
10
CVA Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−100 −2 10
−1
10 rad/sec
0
10
1
10
Fig. 4.6. Closed-loop Monte-Carlo simulations. The left column is Bode magnitude plots; the dotted lines are the true values and the solid lines are the estimated values averaged from 100 runs. The right column is the scatter plots of the eigenvalues of the estimated A matrix.
figure, one can see that the algorithm CSOPIM performs better than algorithm Van Overschee and De Moor in the presence of measurement noises. To appreciate closed-loop subspace identification algorithms, we have also applied classical subspace algorithms, N4SID, MOESP, and two CVAs (CVA according to [60] and MATLAB N4SID with CVA weighting) to closed-loop data. The results are shown in Figure 4.6. From these results, one can see that N4SID, MOESP, and MATLAB based CVA deliver essentially the same performance and all are biased in the presence of feedback control. The CVA programmed according to [60] gives some improved performance compared to the MATLAB N4SID with CVA weighting but the bias is also observed in this example. Since the algorithm CSOPIM is a closed-loop identification algorithm, it must be applicable to open-loop identification. A natural question is how it performs in open-loop identification. Thus, we perform open-loop Monte-Carlo simulations using the open-loop plant without feedback control. The results together with the comparisons with other classical subspace identification algorithms are shown in Figure 4.7. From this figure, one can see that all classical algorithms
4.3 An Orthogonal Projection Approach
CSOPIM Imag Part
dB
100
0
77
0.5 0 −0.5
−1
0
10
10
1
10
N4SID Imag Part
dB
−100 −2 10 100
0
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5
Real Part 1
1.5
0.5 0 −0.5
−1
10
0
10
1
10
MOESP Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−1
10
0
10
1
10
CVA Imag Part
dB
−100 −2 10 100
0
0.5 0 −0.5
−100 −2 10
−1
10 rad/sec
0
10
1
10
Fig. 4.7. Open-loop Monte-Carlo simulations. The left column is Bode magnitude plots; the dotted lines are the true values and the solid lines are the estimated values averaged from 100 runs. The right column is the scatter plots of the eigenvalues of the estimated A matrix.
deliver similar performance for open-loop identification in this example, and CSOPIM delivers essentially identical performance to that of the classical algorithms in both bias and variance errors. Finally, we verify the claim that one of the instrument-variable based subspace identification methods, SIMPCA, can deliver a biased estimate in closed-loop identification, at least under the condition that the external excitation is white noise, but that the bias can be reduced if the external excitation is autocorrelated, and that CSIMPCA can effectively eliminate the bias. The closed-loop simulation results for SIMPCA are presented in Figure 4.8. The results presented in the left column of Figure 4.8 correspond to white-noise external excitation with variance 1. The results presented in the right column correspond to an autocorrelated external excitation r, where r is filtered white noise; the filter is first-order with a pole at 0.9, and the filtered excitation signal has variance 1. The results demonstrate that 1) autocorrelation of the external excitation can indeed reduce the bias of SIMPCA, and 2) CSIMPCA indeed performs significantly better than SIMPCA.
Fig. 4.8. Closed-loop Monte-Carlo simulations of SIMPCA and CSIMPCA. The left column corresponds to white-noise external excitation r; the right column corresponds to autocorrelated external excitation r.
4.4 Summary

In this chapter, several closed-loop subspace identification algorithms have been reviewed. Then, by adopting the EIV model structure, a subspace orthogonal projection identification method (SOPIM) was elaborated. However, it yields a bias in closed-loop identification under certain conditions. Through analysis of the bias error of SOPIM under closed-loop conditions, it was discovered that other existing instrument subspace methods in the literature may also suffer from the same bias error and, therefore, may not be suitable for closed-loop identification. Motivated by this finding, a solution was derived to eliminate the bias error of SOPIM, as well as of an existing instrument subspace identification algorithm, SIMPCA, for the sake of closed-loop identification. As a result, two novel closed-loop subspace identification algorithms, CSOPIM and CSIMPCA, named after SOPIM and SIMPCA respectively, were developed. In addition, the orthogonal projection method discussed in this chapter provides both the extended observability matrix and the Kalman filter state sequence; therefore, system models may also be recovered from the estimated Kalman state sequence, and the closed-loop subspace orthogonal projection identification method via state estimation (CSOPIM-S) was also suggested. Simulations based on a benchmark problem compared the performance of the orthogonal projection algorithms with a number of existing subspace identification algorithms and verified the feasibility and closed-loop applicability of the orthogonal projection algorithms.
5 Identification of Dynamic Matrix and Noise Model Using Closed-loop Data∗
5.1 Introduction

Model predictive controllers (MPC) have found many successful applications in the process industries for about three decades. One of the key aspects of MPC is the prediction of the future process response and the minimization of the output deviation from the setpoint by manipulating the inputs. A model of the process is required to make these predictions based on past data; hence an MPC design starts with identifying a nominal model for the process. One of the industrially successful predictive control schemes is dynamic matrix control, or DMC, which explicitly uses a lower triangular matrix called the 'dynamic matrix', containing the step response coefficients corresponding to the deterministic input(s) to the process [1, 2]. Many other MPC formulations also use the dynamic matrix in one way or another [92, 5, 93]. For constructing the dynamic matrix in the case of DMC, a step response model of the process is first obtained from open-loop data. The step response coefficients are then arranged in a specific lower triangular form in the dynamic matrix, as will be discussed in detail in Chapter 6. However, for safety reasons and other practical limitations, open-loop operation of the process may not always be possible, or in some cases there may be hidden feedback in the system. Estimation of the dynamic matrix from closed-loop data is desirable in such cases. It has been shown [94] that if the model is to be used for model-based control design, then the most favorable experimental conditions are in fact closed-loop conditions.

Even though subspace identification is used as a vehicle, the goal of the data-driven closed-loop identification method discussed in this chapter is not the estimation of the state space matrices {A, B, C, D and K} but the estimation of the dynamic matrices of the process and noise models directly from input-output data. Figure 5.1 illustrates the difference between the data-driven closed-loop subspace approach considered here and the other existing closed-loop subspace identification methods discussed in Chapter 4, for obtaining a dynamic matrix.
∗ This chapter (with revisions) is reprinted with permission from Ind. Eng. Chem. Res., vol. 41, Kadali, R., B. Huang, "Estimation of the Dynamic Matrix and Noise Model for Model Predictive Control Using Closed-Loop Data", 842-852, © 2002 American Chemical Society. Sections 5.2 and 5.4 have major revisions.
The method considered in this chapter may be regarded as a nonparametric approach to closed-loop identification. Nonparametric model identification methods, although known to give less bias error because they impose fewer model structure and order restrictions, can result in higher variance (due to the larger number of parameters) compared to parametric model identification methods. This is a tradeoff between bias error and variance error in process identification. Physical processes are typically high-order and nonlinear in nature, and it is not always possible to represent them by a single linear parametric model; consequently, bias error is inevitable in practice. On the other hand, it is known that the variance error can be reduced with increased sample size [11]. Therefore, depending on the application, for example on the data sample size, one can choose between the parametric and nonparametric identification methods.
Fig. 5.1. Comparing the conventional closed-loop subspace state space identification methods (closed-loop data with sufficient excitation → subspace matrices → system matrices A, B, C, D, K → dynamic matrix) and the proposed data-driven closed-loop subspace approach (closed-loop data with sufficient excitation → subspace matrices → dynamic matrix directly), for estimating dynamic matrices
However, nonparametric model identification methods do have some practical advantages. Consider the case where we want to identify a process model for designing and implementing a model predictive controller. Even if we identify a parametric model for the process (using closed- or open-loop process data), the parametric model often has to be converted first to a nonparametric (impulse or step response) model form and then to a dynamic matrix for designing the MPC. The concern is that this additional intermediate step may introduce additional error and inconvenience. When it comes to industrial implementation, nonparametric-model based MPCs have shown considerable success. The idea explored in this chapter is to identify a nonparametric-like model directly and avoid choosing a 'model structure', which is unknown and is a prerequisite
for parametric model identification methods. In addition, instead of identifying a traditional nonparametric model such as a step response model, the proposed approach identifies the dynamic matrix directly.
The remainder of the chapter is organized as follows. Section 5.2 describes the estimation of the process dynamic matrix and noise model from closed-loop data. Remarks on the different steps of the closed-loop identification, along with some guidelines for the practical implementation of the algorithm, are provided in Section 5.3. The closed-loop identification method is extended to the case of measured disturbances in Section 5.4, illustrated with MATLAB simulations in Section 5.5, and evaluated on a pilot-scale process in Section 5.6. A summary is given in Section 5.7.
5.2 Estimation of Process Dynamic Matrix and Noise Model

Consider the case when the system described by (3.1)-(3.2) is operating under closed loop with a linear time-invariant feedback-only controller G_c, expressed in transfer function form as

    u_t = G_c (r_t - y_t)    (5.1)
where r_t (m × 1) is the setpoint for the process output at sampling instant t and (r_t - y_t) is the output deviation from the setpoint. Assume that the controller does not cancel any plant dynamics, and that the controller has the state space representation

    x^c_{t+1} = A_c x^c_t + B_c (r_t - y_t)    (5.2)
    u_t = C_c x^c_t + D_c (r_t - y_t)    (5.3)

where {A_c, B_c, C_c, D_c} are the state space system matrices of the controller. By recursively using the above state space equations, as discussed in Chapter 3, we can write the input/output subspace equations for the controller as

    U_p = \Gamma^c_N X^c_p + H^c_N (R_p - Y_p)    (5.4)
    X^c_f = A_c^N X^c_p + \Delta^c_N (R_p - Y_p)    (5.5)
    U_f = \Gamma^c_N X^c_f + H^c_N (R_f - Y_f)    (5.6)

where

    \Gamma^c_N = \begin{pmatrix} C_c \\ C_c A_c \\ \vdots \\ C_c A_c^{N-1} \end{pmatrix}, \quad H^c_N = \begin{pmatrix} D_c & 0 & \cdots & 0 \\ C_c B_c & D_c & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ C_c A_c^{N-2} B_c & C_c A_c^{N-3} B_c & \cdots & D_c \end{pmatrix}

    \Delta^c_N = \begin{pmatrix} A_c^{N-1} B_c & A_c^{N-2} B_c & \cdots & B_c \end{pmatrix}
    X^c_p = \begin{pmatrix} x^c_0 & x^c_1 & \cdots & x^c_{j-1} \end{pmatrix}, \quad X^c_f = \begin{pmatrix} x^c_N & x^c_{N+1} & \cdots & x^c_{N+j-1} \end{pmatrix}
The matrices R_p and R_f are the past and future block-Hankel matrices corresponding to r_t. Following the same approach as the derivation of Equation 3.41 from Equations 3.1 and 3.2, we can derive the following equation from Equations 5.2 and 5.3:

    U_f = L^c_w W^c_p + L^c_r R_f - L^c_y Y_f    (5.7)

where L^c_w is a coefficient matrix corresponding to past inputs, outputs and setpoints,

    W^c_p = \begin{pmatrix} Y_p \\ U_p \\ R_p \end{pmatrix}

and L^c_r = L^c_y = H^c_N. Substituting Equation 5.7 into Equation 3.41 yields
    Y_f = L_w W_p + L_u (L^c_w W^c_p + L^c_r R_f - L^c_y Y_f) + L_e E_f    (5.8)

Solving Equation 5.8 for Y_f gives

    Y_f = (I + L_u L^c_y)^{-1} (L_w W_p + L_u L^c_w W^c_p) + (I + L_u L^c_y)^{-1} L_u L^c_r R_f + (I + L_u L^c_y)^{-1} L_e E_f
        = L^{CL}_y W^{CL}_p + L^{CL}_{yr} R_f + L^{CL}_{ye} E_f    (5.9)

where

    L^{CL}_y W^{CL}_p = (I + L_u L^c_y)^{-1} (L_w W_p + L_u L^c_w W^c_p)
    L^{CL}_{yr} = (I + L_u L^c_y)^{-1} L_u L^c_r = (I + L_u L^c_y)^{-1} L_u L^c_y    (5.10)
    L^{CL}_{ye} = (I + L_u L^c_y)^{-1} L_e    (5.11)

and W^{CL}_p = W^c_p. Similarly, substituting Equation 3.41 into Equation 5.7 yields

    U_f = L^{CL}_u W^{CL}_p + L^{CL}_{ur} R_f + L^{CL}_{ue} E_f    (5.12)

where

    L^{CL}_u W^{CL}_p = (I + L^c_y L_u)^{-1} (L^c_w W^c_p - L^c_y L_w W_p)
    L^{CL}_{ur} = (I + L^c_y L_u)^{-1} L^c_r = (I + L^c_y L_u)^{-1} L^c_y    (5.13)
    L^{CL}_{ue} = -(I + L^c_y L_u)^{-1} L^c_y L_e    (5.14)
With the above results, the estimation of the closed-loop subspace matrices from closed-loop data is essentially an open-loop identification problem. Putting Equations 5.9 and 5.12 together, we get

    \begin{pmatrix} Y_f \\ U_f \end{pmatrix} = \begin{pmatrix} L^{CL}_y \\ L^{CL}_u \end{pmatrix} W^{CL}_p + \begin{pmatrix} L^{CL}_{yr} \\ L^{CL}_{ur} \end{pmatrix} R_f + \begin{pmatrix} L^{CL}_{ye} \\ L^{CL}_{ue} \end{pmatrix} E_f    (5.15)

From the above equation, and since the setpoint R_f can be chosen to be uncorrelated with E_f, the closed-loop subspace matrices {L^{CL}_u, L^{CL}_{ur}, L^{CL}_y, L^{CL}_{yr}} can be obtained as the solution of a least squares estimation problem. They are estimated by the orthogonal projection of the row spaces of U_f and Y_f onto the row space spanned by W^{CL}_p and R_f:

    \begin{pmatrix} L^{CL}_u & L^{CL}_{ur} \end{pmatrix} = U_f \begin{pmatrix} W^{CL}_p \\ R_f \end{pmatrix}^{\dagger} = U_f \begin{pmatrix} (W^{CL}_p)^T & R_f^T \end{pmatrix} \left( \begin{pmatrix} W^{CL}_p \\ R_f \end{pmatrix} \begin{pmatrix} (W^{CL}_p)^T & R_f^T \end{pmatrix} \right)^{-1}    (5.16)

    \begin{pmatrix} L^{CL}_y & L^{CL}_{yr} \end{pmatrix} = Y_f \begin{pmatrix} W^{CL}_p \\ R_f \end{pmatrix}^{\dagger} = Y_f \begin{pmatrix} (W^{CL}_p)^T & R_f^T \end{pmatrix} \left( \begin{pmatrix} W^{CL}_p \\ R_f \end{pmatrix} \begin{pmatrix} (W^{CL}_p)^T & R_f^T \end{pmatrix} \right)^{-1}    (5.17)

This projection can be implemented in a numerically robust way with a QR decomposition. With the estimated subspace matrices, the predictions \hat{Y}_f, \hat{U}_f can then be written as

    \begin{pmatrix} \hat{Y}_f \\ \hat{U}_f \end{pmatrix} = \begin{pmatrix} L^{CL}_y \\ L^{CL}_u \end{pmatrix} W^{CL}_p + \begin{pmatrix} L^{CL}_{yr} \\ L^{CL}_{ur} \end{pmatrix} R_f
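As an illustration of this projection step, a minimal MATLAB-style sketch is given below; it assumes the block-Hankel data matrices Yf, Uf, WpCL and Rf have already been built from closed-loop data, and the variable names are our own, not part of the original algorithm statement.

    % Least squares estimation of the closed-loop subspace matrices,
    % Equations 5.16 and 5.17. Z stacks the regressors.
    Z   = [WpCL; Rf];
    LsU = Uf * Z' / (Z * Z');        % = [Lu_CL, Lur_CL]
    LsY = Yf * Z' / (Z * Z');        % = [Ly_CL, Lyr_CL]
    nw  = size(WpCL, 1);
    Lu_CL = LsU(:, 1:nw);  Lur_CL = LsU(:, nw+1:end);
    Ly_CL = LsY(:, 1:nw);  Lyr_CL = LsY(:, nw+1:end);
    % Deterministic predictions:
    Yf_hat = Ly_CL * WpCL + Lyr_CL * Rf;
    Uf_hat = Lu_CL * WpCL + Lur_CL * Rf;

In practice the normal equations above would be replaced by a QR decomposition, as noted in the text.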
The first row of \hat{Y}_f represents the one-step ahead predictions of the output. Therefore the white noise disturbance sequence entering the process can be estimated as

    e_f = \begin{pmatrix} e_N & e_{N+1} & \cdots & e_{N+j-1} \end{pmatrix} = Y_f(1:m, :) - \hat{Y}_f(1:m, :)

where (1:m, :) denotes the submatrix containing rows 1 to m and all columns. Let us define

    \Xi_f = U_f - \hat{U}_f = L^{CL}_{ue} E_f
The block Hankel matrix E_f for the noise can be constructed using the estimated noise sequence e_f. Therefore, L^{CL}_{ue} is estimated as

    L^{CL}_{ue} = \Xi_f / E_f = \Xi_f E_f^{\dagger}
5.2.1 Estimation of Dynamic Matrix of the Process
Typically, if there is no delay in the controller and the system has the same number of inputs and outputs, L^c_y, or equivalently L^c_r, is invertible; thus L^{CL}_{ur} = (I + L^c_y L_u)^{-1} L^c_y is invertible. In the following derivation we assume that this is true; otherwise the inverse has to be replaced by a pseudo-inverse. The well-known matrix inversion lemma is

    [A + BCD]^{-1} = A^{-1} - A^{-1} B [C^{-1} + D A^{-1} B]^{-1} D A^{-1}

Using this lemma, we can write

    (I + L_u L^c_y)^{-1} = I - L_u (I + L^c_y L_u)^{-1} L^c_y    (5.18)
Now, L^{CL}_{yr} (L^{CL}_{ur})^{-1} can be written as

    L^{CL}_{yr} (L^{CL}_{ur})^{-1} = (I + L_u L^c_y)^{-1} L_u L^c_y [(I + L^c_y L_u)^{-1} L^c_y]^{-1}
                                  = (I + L_u L^c_y)^{-1} L_u L^c_y [(L^c_y)^{-1} + L_u]

Applying Equation 5.18 to the equation above gives

    L^{CL}_{yr} (L^{CL}_{ur})^{-1} = [I - L_u (I + L^c_y L_u)^{-1} L^c_y] L_u L^c_y [(L^c_y)^{-1} + L_u]
     = [L_u L^c_y - L_u (I + L^c_y L_u)^{-1} L^c_y L_u L^c_y] [(L^c_y)^{-1} + L_u]
     = L_u [L^c_y - (I + L^c_y L_u)^{-1} L^c_y L_u L^c_y] [(L^c_y)^{-1} + L_u]
     = L_u \{L^c_y - [(L^c_y)^{-1} + L_u]^{-1} L_u L^c_y\} [(L^c_y)^{-1} + L_u]
     = L_u \{L^c_y - L^c_y [I + L_u L^c_y]^{-1} L_u L^c_y\} [(L^c_y)^{-1} + L_u]

Applying the matrix inversion lemma again, we get

    \{L^c_y - L^c_y [I + L_u L^c_y]^{-1} L_u L^c_y\} = [(L^c_y)^{-1} + L_u]^{-1}

Thus

    L^{CL}_{yr} (L^{CL}_{ur})^{-1} = L_u [(L^c_y)^{-1} + L_u]^{-1} [(L^c_y)^{-1} + L_u] = L_u = H^d_N    (5.19)

Hence H^d_N, which contains the Markov parameters corresponding to the deterministic input, can be identified according to the above derivation as

    H^d_N = L^{CL}_{yr} (L^{CL}_{ur})^{-1} = \begin{pmatrix} F_0 & 0 & \cdots & 0 \\ F_1 & F_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ F_{N-1} & \cdots & F_1 & F_0 \end{pmatrix}    (5.20)

where F_i represents the i-th Markov parameter (or impulse response coefficient) of the deterministic input. The dynamic matrix containing the system step response coefficients, S_N, can be obtained as

    S_N = \begin{pmatrix} G_0 & 0 & \cdots & 0 \\ G_1 & G_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ G_{N-1} & \cdots & G_1 & G_0 \end{pmatrix} = \begin{pmatrix} F_0 & 0 & \cdots & 0 \\ F_1 & F_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ F_{N-1} & \cdots & F_1 & F_0 \end{pmatrix} \begin{pmatrix} I & 0 & \cdots & 0 \\ I & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ I & I & \cdots & I \end{pmatrix} = H^d_N \begin{pmatrix} I & 0 & \cdots & 0 \\ I & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ I & I & \cdots & I \end{pmatrix}    (5.21)

where G_i represents the i-th step response coefficient of the deterministic input.
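For a SISO process, Equation 5.21 reduces to a cumulative sum of the identified Markov parameters followed by a lower-triangular Toeplitz arrangement. A minimal MATLAB sketch, assuming H^d_N has already been identified and stored in a variable HdN (our own naming):

    % SISO illustration of Equation 5.21: dynamic matrix SN of step
    % response coefficients from the Markov parameters in HdN.
    F  = HdN(:, 1);              % first column holds F0, F1, ..., F(N-1)
    G  = cumsum(F);              % step response coefficients Gi
    SN = tril(toeplitz(G));      % lower-triangular Toeplitz arrangement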
5.2.2 Estimation of the Noise Model
The noise model can be estimated from the residuals of the input data. Using the definitions of L^{CL}_{ur} and L^{CL}_{ue}, the noise dynamic matrix is obtained as

    H^s_N = L_e = -(L^{CL}_{ur})^{-1} L^{CL}_{ue}    (5.22)

H^s_N contains the impulse response coefficients corresponding to the stochastic input:

    H^s_N = \begin{pmatrix} I & 0 & \cdots & 0 \\ L_1 & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{N-1} & \cdots & L_1 & I \end{pmatrix}
where L_i represents the i-th impulse response coefficient of the stochastic input. Thus the first column of H^s_N represents the noise model H(z^{-1}) in impulse response form:

    H(z^{-1}) = I + L_1 z^{-1} + L_2 z^{-2} + \cdots + L_i z^{-i} + \cdots + L_{N-1} z^{-N+1}

Algorithm 5.1. The implementation procedure for the data-driven closed-loop subspace identification method discussed above is as follows:

1. Construct the data Hankel matrices {U_p, U_f, Y_p, Y_f, R_p, R_f} using closed-loop data. By linear regression, the deterministic closed-loop subspace matrices are identified.
Remark 5.2. The guidelines presented in Section 5.3 can be used in the selection of the number of rows and columns. By adding to the setpoint a persistently exciting signal that is uncorrelated with the process noise, we ensure identifiability of the closed-loop subspace matrices. This step is an open-loop identification problem with the setpoint change(s) as the deterministic external inputs and the closed-loop subspace matrices as the model to be identified.

2. Estimate the vector of noise data from the 'output data Hankel matrix' and the 'residual data Hankel matrix' corresponding to the 'input data Hankel matrix'. Estimate the stochastic closed-loop subspace matrices.

Remark 5.3. The first row of the residual matrix (Y_f - \hat{Y}_f) represents the one-step ahead prediction errors and is an estimate of the noise entering the process. The matrix \Xi_f = (U_f - \hat{U}_f) is the residual data Hankel matrix corresponding to the input. The noise data Hankel matrix E_f is constructed using the vector of estimated process noise. By linear regression, the stochastic closed-loop subspace matrix is estimated.

3. Retrieve the open-loop deterministic subspace matrix from the closed-loop subspace matrices.

Remark 5.4. Closed-loop subspace matrices are just the open-loop subspace matrices weighted by the subspace matrix corresponding to the sensitivity function. The analogies between the process/noise transfer functions and the open-loop deterministic/stochastic subspace matrices are obvious. The method presented in this chapter parallels the 'joint input/output closed-loop identification method', which is well known in the transfer function domain. With the joint input/output method, however, inverting the transfer function (or the transfer function matrix for multivariate systems) can cause problems: the resultant transfer function (matrix) may be improper or of high order. No such problems are encountered in the subspace-matrices based closed-loop approach, since we are dealing with algebraic matrices instead of transfer functions.

4. Retrieve the open-loop stochastic subspace matrix from the closed-loop subspace matrices.
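A minimal MATLAB-style sketch of steps 2-4, continuing the projection sketch of Section 5.2 (variable names are ours; the process is taken to be SISO, and the column count is trimmed so that the noise Hankel matrix can be filled from the estimated sequence):

    % Step 2: noise sequence and stochastic closed-loop subspace matrix
    ef  = Yf(1, :) - Yf_hat(1, :);          % one-step prediction errors
    jj  = length(ef) - N + 1;               % usable number of columns
    Ef  = hankel(ef(1:N), ef(N:N+jj-1));    % noise block-Hankel matrix
    Xif = Uf(:, 1:jj) - Uf_hat(:, 1:jj);    % input residual matrix
    Lue_CL = Xif * pinv(Ef);
    % Steps 3 and 4: recover the open-loop subspace matrices
    HdN = Lyr_CL / Lur_CL;                  % Equation 5.19
    HsN = -(Lur_CL \ Lue_CL);               % Equation 5.22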
5.3 Some Guidelines for the Practical Implementation of the Algorithm

Building the data Hankel matrices is the first step in all subspace based identification methods. If we want to identify a state space model for the system using subspace identification methods, then, as discussed in Chapter 3, the number of rows, N, is chosen to be larger than the order of the state space model to be identified [38]. The number of columns, j, should tend to infinity. Since we have finite data in real situations, we can only choose a finite j, the maximum
number of columns that can be constructed with the available data. As far as the closed-loop identification method presented in this chapter is concerned, we identify only the subspace matrices and not the state space system matrices. The following guidelines can help in deciding the number of rows and columns of the data Hankel matrices:

1. To capture the complete process dynamics, the number of rows, N, should be chosen such that the last impulse response coefficient (the last element of the first block-column of H^d_N) approaches zero.
2. The choice of the number of columns depends on the excitation signal used for identification: the richer the excitation signal, the fewer columns are required. The number of columns should be chosen such that the corresponding impulse response coefficients along the columns of the subspace matrices are very close to each other; the more columns taken in the data Hankel matrices, the closer the corresponding coefficients in the columns will be.
3. Numerical tools like QR decomposition can be used to avoid numerical problems associated with the inversion of large matrices, especially in step 1.
4. The statistical properties of subspace based identification methods are an area of active research and have been considered in [95, 64, 38, 54] and the references therein, where it has been shown that under open-loop conditions subspace identification can yield consistent estimates of the parameters. As the closed-loop method discussed in this chapter is equivalent to an open-loop subspace identification problem, the same conclusion may be applied.
5.4 Extension to the Case of Measured Disturbance Variables

The closed-loop subspace-based identification method explained in the previous section can be extended to the case where some measured disturbances are available for feedforward control. Consider the case when measurements of some of the disturbance variables are available and we want to identify the subspace matrices corresponding to these variables. Let m_t (h × 1) represent the vector of measured disturbance variables. Assume that the measured disturbance variables are uncorrelated with the setpoint changes. Consider a feedback-only controller described by Equations 5.1-5.6 acting on a process represented by
    x_{t+1} = A x_t + \begin{pmatrix} B & B_m \end{pmatrix} \begin{pmatrix} u_t \\ m_t \end{pmatrix} + K e_t    (5.23)

    y_t = C x_t + \begin{pmatrix} D & D_m \end{pmatrix} \begin{pmatrix} u_t \\ m_t \end{pmatrix} + e_t    (5.24)
The matrix input-output Equations 3.20 and 3.41 are modified to include measured disturbances as

    Y_f = \Gamma^b_N X^b_f + H^d_N U_f + H^m_N M_f + H^s_N E_f    (5.25)
        = L^b_w W^b_p + L_u U_f + L_m M_f + L_e E_f    (5.26)

where the superscript b in \Gamma^b_N and X^b_f is used to distinguish them from \Gamma_N and X_f defined in subspace Equations 3.20 and 3.41 (i.e., the case without measured disturbances), and

    L_m = H^m_N = \begin{pmatrix} D_m & 0 & \cdots & 0 \\ C B_m & D_m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ C A^{N-2} B_m & C A^{N-3} B_m & \cdots & D_m \end{pmatrix}    (5.27)

    W^b_p = \begin{pmatrix} Y_p \\ U_p \\ M_p \end{pmatrix}    (5.28)

M_p and M_f are the 'past' and 'future' data Hankel matrices of m_t, defined in the same way as those corresponding to u_t in Equations 3.23 and 3.24. Define

    W^{CLb}_p = \begin{pmatrix} Y_p \\ U_p \\ M_p \\ R_p \end{pmatrix}

Similar to Equation 5.15, we can derive

    \begin{pmatrix} Y_f \\ U_f \end{pmatrix} = \begin{pmatrix} L^{CLb}_y \\ L^{CLb}_u \end{pmatrix} W^{CLb}_p + \begin{pmatrix} L^{CL}_{yr} \\ L^{CL}_{ur} \end{pmatrix} R_f + \begin{pmatrix} L^{CL}_{ym} \\ L^{CL}_{um} \end{pmatrix} M_f + \begin{pmatrix} L^{CL}_{ye} \\ L^{CL}_{ue} \end{pmatrix} E_f    (5.29)

where, in addition to the subspace matrices given in Equations 5.10, 5.11, 5.13, and 5.14, we have

    L^{CL}_{um} = -(I + L^c_y L_u)^{-1} L^c_y L_m = -(I + H^c_N H^d_N)^{-1} H^c_N H^m_N    (5.30)
    L^{CL}_{ym} = (I + L_u L^c_y)^{-1} L_m = (I + H^d_N H^c_N)^{-1} H^m_N    (5.31)
The closed-loop subspace matrices are identified by data projections as shown in Section 5.2. The matrix H^m_N containing the Markov parameters (impulse response coefficient matrices) corresponding to the measured disturbance variables is obtained from the closed-loop subspace matrices as

    H^m_N = L_m = -(L^{CL}_{ur})^{-1} L^{CL}_{um}
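Under the same assumptions as the earlier sketches, the extension is a single additional line once the matrix L^{CL}_{um} has been estimated alongside the other closed-loop matrices (hypothetical variable names as before):

    % Markov parameters of the measured-disturbance path
    HmN = -(Lur_CL \ Lum_CL);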
5.5 Closed-loop Simulations

Comparative simulations were carried out in MATLAB for two cases, a univariate and a multivariate system, comparing the data-driven subspace approach presented in this chapter with the closed-loop versions of MOESP [83] and CVA [86, 85]. The purpose of this exercise is to check the validity of the data-driven closed-loop identification approach, which does not require controller knowledge, for both univariate and multivariate systems, and also to see how it performs compared to the existing subspace identification methods, namely the MOESP and CVA approaches. It has been shown by Van Overschee and De Moor [66] that the difference among the three subspace identification algorithms N4SID, MOESP and CVA lies in the weighting matrices used in the subspace identification algorithm. In fact, MATLAB offers the 'N4SID' command in which the user can specify MOESP or CVA, and the respective weighting matrices will then be used, which is equivalent to using the MOESP/CVA subspace identification algorithms. Hence, in the simulations presented in this section, the MOESP/CVA weighting matrices are used in the 'N4SID' algorithm in MATLAB.

5.5.1 Univariate System
Consider the following system [38]:

    x_{t+1} = \begin{pmatrix} 0.6 & 0.6 & 0 \\ -0.6 & 0.6 & 0 \\ 0 & 0 & 0.7 \end{pmatrix} x_t + \begin{pmatrix} 1.6161 \\ -0.3481 \\ 2.6319 \end{pmatrix} u_t + \begin{pmatrix} -1.1472 \\ -1.5204 \\ -3.1993 \end{pmatrix} e_t

    y_t = \begin{pmatrix} -0.4373 & -0.5046 & 0.0936 \end{pmatrix} x_t + (-0.7759) u_t + e_t
A PID controller, 0.1 + 0.08/s + 0.08s, is tuned online for the above system for good setpoint tracking and disturbance rejection performance. We assume that the controller is unknown for the purpose of the closed-loop identification. Closed-loop input/output/setpoint data are obtained in MATLAB and Simulink by exciting the system with a designed RBS signal of magnitude 1 on the setpoint and random white noise of standard deviation 0.1 as the disturbance. The closed-loop data is plotted in Figure 5.2. Using the closed-loop subspace identification method presented in Section 5.2, with the number of rows N = 30 and the number of columns j = 2000 in the data Hankel matrices, the subspace matrices H^d_N and H^s_N are identified.
Fig. 5.2. Univariate system: closed-loop data

Due to the presence of noise, the upper off-diagonal elements of H^d_N and H^s_N will not be exactly zero but very small numbers (they approach zero as j → ∞). The true impulse response coefficients of the system can be calculated from the state space system matrices provided above. The identified impulse response coefficients are plotted against the true impulse response coefficients in Figure 5.3, which illustrates that the identified impulse response coefficients match the true ones well.
Joint input-output closed-loop MOESP/CVA identification methods are used to identify the deterministic part of the system using the same set of closed-loop data. Impulse response coefficients identified through MOESP and CVA are plotted against the true impulse response coefficients in Figure 5.4. We can see that the impulse response coefficients identified with both the MOESP and CVA methods match the true ones reasonably well for this univariate system. Therefore, all three methods yielded similar results in this case.
5.5.2 Multivariate System

Consider the following system taken from the MATLAB/MPC toolbox manual:

    \begin{pmatrix} y_1(s) \\ y_2(s) \end{pmatrix} = \begin{pmatrix} \frac{12.8 e^{-s}}{16.7s+1} & \frac{-18.9 e^{-3s}}{21.0s+1} \\ \frac{6.6 e^{-7s}}{10.9s+1} & \frac{-19.4 e^{-3s}}{14.4s+1} \end{pmatrix} \begin{pmatrix} u_1(s) \\ u_2(s) \end{pmatrix} + \begin{pmatrix} \frac{3.8 e^{-8s}}{14.9s+1} \\ \frac{4.9 e^{-3s}}{13.2s+1} \end{pmatrix} e(s)
Fig. 5.3. Univariate system: comparing the true (solid) IR-coefficients with those obtained in the subspace matrices (dotted); the top panel shows the process model and the bottom panel the noise model
Fig. 5.4. Univariate system: comparing the true (solid) IR-coefficients with those identified by MOESP/CVA approaches
Fig. 5.5. Multivariate system: closed-loop data
Fig. 5.6. Multivariate system: comparing the true (solid) IR-coefficients with those obtained in the subspace matrices (dotted)
Fig. 5.7. Multivariate system: comparing the true (solid) IR-coefficients with those identified by MOESP approach
where {y_1(s), y_2(s)}, {u_1(s), u_2(s)} and e(s) represent the system outputs, inputs and random noise disturbance, respectively. A state-space based MPC controller is designed in MATLAB for the system. A sampling period of T = 2 time units is used in the simulations. Closed-loop input/output data is obtained in MATLAB and Simulink by exciting the system with a designed RBS signal of magnitude 1 on the setpoint (r_t) and random white noise (e_t) of standard deviation 0.1 as the disturbance. We again assume that the controller is not known for the closed-loop identification. The closed-loop data is plotted in Figure 5.5. The data-driven closed-loop subspace identification algorithm is used with rows N = 50 and columns j = 2500 in the data Hankel matrices to identify the subspace matrices H^d_N and H^s_N. The identified impulse response coefficients are plotted against the true impulse response coefficients in Figure 5.6. It can be seen from the plot that the identified impulse response coefficients match the true ones closely. Next, the MOESP approach is used to identify the deterministic part of the system using the same closed-loop data. The impulse response coefficients identified using the MOESP approach are compared with the true coefficients in Figure 5.7; the match is noticeably worse than that obtained using the data-driven closed-loop subspace method. It should be noted that a better match between the identified and true coefficients using MOESP could be achieved only with a very high order model. The order of the resulting model could probably be reduced using a standard model reduction method, but with a compromise
between bias error and complexity. The closed-loop CVA approach [86] involves the inversion of a transfer function matrix, which may not always be possible in MATLAB because the resultant matrix could contain improper transfer functions.
5.6 Identification of the Dynamic Matrix: Pilot-scale Experimental Evaluation

The data-driven closed-loop subspace method for the estimation of the dynamic matrix is tested on a pilot-scale process. The process considered is shown in Figure 5.8.
Fig. 5.8. Experimental setup: the inlet water flow rate is the manipulated input, the tank level is the controlled variable, and the outlet valve is kept at a constant position
The input (u) is the inlet water flow rate, and the process variable to be controlled (y) is the level of water in the tank. The tank outlet flow valve is kept at a constant position. The head (pressure) of the water in the inlet pipe can be considered an (unmeasured) disturbance. The tank level is controlled by a PID controller, 2.5 + 0.05/s + s. An RBS signal consisting of a series of setpoint changes to the level is designed in MATLAB. Closed-loop data of the process input, setpoint and output is collected and plotted in Figure 5.9.
Fig. 5.9. Pilot scale process: closed-loop system data
Fig. 5.10. Pilot scale process: IR-coefficients from columns 1, 20, 40 and 60 of the identified subspace matrices (top panel: from L_u; bottom panel: from L_e)
Fig. 5.11. Pilot scale process: comparing the IR-coefficients from subspace matrices identified using the open-loop data (dotted line) and those from the closed-loop data (solid line)
Data Hankel matrices with rows N = 200 and columns j = 1500 are constructed using the closed-loop data, and the subspace matrices H^d_N and H^s_N are identified. The columns of the subspace matrices are plotted in Figure 5.10. The figure illustrates that the corresponding impulse response coefficients in the different columns of H^d_N and H^s_N do match after a time shift (the time shift is necessary owing to the lower triangular structure of the block-Toeplitz matrices). The accuracy of the impulse response coefficients in the matrix H^d_N is checked by performing an open-loop identification. Open-loop data is collected by exciting the process with an input RBS signal of magnitude 1. The impulse response coefficients identified using the open-loop subspace identification method are plotted together with the coefficients identified using closed-loop data in Figure 5.11. We can see that there is some mismatch between the impulse response models in the subspace matrices identified using closed-loop data and those identified using open-loop data. The mismatch may be due to the different operating regions of the closed-loop and open-loop identification experiments and to some effect of feedback control.
5.7 Summary

This chapter provides a data-driven subspace method for identification of the process dynamic matrices and the noise model from closed-loop data. Closed-loop subspace matrices are first obtained with persistent setpoint excitation of the closed-loop system. Open-loop subspace matrices are then retrieved from the closed-loop subspace matrices. The process dynamic matrix is obtained from the deterministic subspace matrix, and the noise model in impulse response form is obtained from the stochastic subspace matrix. The method has been extended to the case of measured disturbances under feedback-only control. Results from simulations and from an evaluation on a pilot scale plant are provided to illustrate the proposed data-driven closed-loop subspace identification method.
6 Model Predictive Control: Conventional Approach
6.1 Introduction

Predictive controllers have been widely used in process industries for about three decades [92, 5, 93, 6]. Several forms of predictive controllers, such as IDCOM [96], DMC [1, 2, 5], QDMC [3, 4], RMPCT (reviewed by [5, 6]), and GPC [7, 8], have been successfully implemented in process industries through the years. The term predictive control does not designate a specific control strategy but a wide range of control algorithms which make explicit use of a (predictive) process model in a cost function minimization to obtain the control signal [92, 97]. There are many good references for model predictive control in the literature, such as [92, 98, 99, 100, 6, 101, 102]. In this chapter, following the MPC introductions in [92, 98, 99, 100], the lecture notes [103, 104, 105], and the other references cited above, a tutorial on model predictive control is provided. More technical details can be found in the above references, as well as in the many other publications on model predictive control over the last 30 years.

Model predictive control is an appropriately descriptive name for a class of model based control schemes that utilize a process model for two central tasks [99]: 1) explicit prediction of future process behavior, and 2) computation of appropriate corrective control action required to drive the predicted output as close as possible to the desired target values. The overall objectives of an MPC may be summarized as [6, 100]:

• Prevent violations of input and output constraints.
• Drive some output variables to their optimal setpoints, while maintaining other outputs within specified ranges.
• Prevent excessive movement of the input variables.
• Control as many process variables as possible when a sensor or actuator is not available.

The ideas appearing to a greater or lesser degree in all predictive controls are basically [92]:

• dependence of the control law on predicted behavior,
• explicit use of models to predict the process output at future time instants,
• calculation of a control sequence minimizing an objective function, and
• a receding horizon strategy, i.e., updating of the input and shifting of the horizon towards the future at each time instant.

A number of MPC algorithms are available commercially. The following are some representative MPC technologies [92, 6]:

• DMC Plus (Dynamic Matrix Control) and Aspen Target: ASPEN Tech,
• ADMC (Adaptive DMC): CTC,
• IDCOM (Identification and Command), HIECON (Hierarchical Constraint Control), PFC (Predictive Functional Control): Adersa,
• RMPCT (Robust Multivariable Predictive Control Technology): Honeywell,
• SMCA (Setpoint Multivariable Control Architecture), IDCOM-M (Multivariable): Setpoint Inc.,
• APCS (Adaptive Predictive Control System): SCAP Europa,
• SMOC (Shell Multivariable Optimising Controller): Shell,
• Connoisseur (Control and Identification package): Invensys,
• MVC (Multivariate Control): Continental Controls Inc.,
• NOVA-NLC (NOVA Nonlinear Controller): DOT Products,
• Process Perfecter: Pavilion Technologies.

Some of these technologies are nonlinear MPC, while most are linear MPC.
6.2 Understanding MPC

Predictive control is intuitive and is used in daily activities like walking, driving, studying and so on. Think about the course of study in a school. Basically, one has to do a set of things:

• Predict: When one sets a target for a "desired" grade, one has to plan and work towards the target. It may be too early to consider the final target at the beginning of a term. Instead, one should think a few days or a few weeks ahead and predict what performance may be achieved over this shorter time window. The target within the shorter time period can be, for example, certain "desired" grades in the assignments, quizzes, etc.
• Plan: Compare the predicted performance with the shorter-term target. If a difference is to be expected, for example performance lower than the target, then additional efforts should be considered, subject to constraints of course, such as there being only 24 hours in a day.
• Act: If the additional efforts are expected to meet the target, then they are put into action. Although a set of additional efforts, for today, tomorrow, and so on, has been planned days or weeks ahead, only the effort planned for today can actually be materialized today. On the next day, the procedure of prediction and planning is repeated, and a new set of efforts is determined; the next day's action is then taken according to the new plan. This process proceeds continuously until the end of the term.
Other well-known daily-life analogies in the MPC literature include crossing a road and playing chess. In chess, a good player predicts the game a few steps ahead based on the moves of the opponent, and plans a few future moves. However, only one move can actually be applied each time. Based on the follow-up move of the opponent, a new set of predictions has to be made and a new set of future moves is determined as a result. This procedure is repeated throughout the game.

Analogously, the methodology of all controllers belonging to the MPC family can be characterized by the following steps [92]:

1. The future outputs over a predetermined horizon N_2, called the prediction horizon, are predicted at each instant t using the process model. These predicted outputs \hat{y}(t + i|t) for i = 1 ... N_2 depend on the values known up to instant t (past inputs and outputs) and on the future control signals u_{t+i}, i = 0, 1, ..., N_2 - 1, which are those to be sent to the process and to be calculated.
2. The set of future control signals is calculated by optimizing a specified criterion in order to keep the process as close as possible to the reference trajectory. The criterion usually takes the form of a quadratic function of the errors between the predicted output signal and the reference trajectory. The control effort is included in the objective function in most cases. An explicit solution can be obtained if the criterion is quadratic, the model is linear, and there are no hard constraints; otherwise an iterative optimization method has to be used. Some assumptions about the structure of the future control law are also made in some cases, such as that it will remain constant after a given instant.
3. The control signal u_t, i.e., the control action calculated for the current time, is sent to the process, whilst the calculated future control actions are discarded, because at the next sampling instant y_{t+1} is already known; step 1 is then repeated with the new value and all the sequences are brought up to date. Thus u_{t+1} is calculated at time instant t + 1 (which in principle will be different from the u_{t+1} calculated at time instant t, because of the new information available) using the receding horizon concept.
6.3 Fundamentals of MPC

All MPC algorithms possess common elements, and different options can be chosen for each of these elements, giving rise to different algorithms. These elements are [92]:

• the prediction model,
• the objective function, and
• the algorithm for obtaining the control law.

6.3.1 Process and Disturbance Models
The cornerstone of the MPC is the model. The model generally consists of two parts: process model and disturbance model. While the process model typically
represents the input-output relations of a physical plant, the disturbance model is often used to represent the disturbance or simply to approximate model-plant mismatch. There are different forms of models used in MPC. Some common model structures are illustrated below.

Impulse Response Model

    y_t = \sum_{i=1}^{\infty} f_i u_{t-i}
where f_i is the sampled output when the process is excited by a unit impulse, also known as an impulse response coefficient. Often this sum is truncated and only N_s values are considered, where N_s is the settling time of the process; namely, starting from instant N_s + 1, the impulse response coefficients are approximately zero. Thus

    y_t = \sum_{i=1}^{N_s} f_i u_{t-i} = (f_1 z^{-1} + f_2 z^{-2} + \cdots + f_{N_s} z^{-N_s}) u_t
Advantages:
• intuitive, and clearly reflects the influence of each manipulated variable;
• no a priori information about the process is needed for identification;
• allows complex dynamics such as non-minimum phase behavior or delays to be described with ease.

Disadvantages:
• only stable processes without integrators can be represented;
• a large number of parameters is necessary.
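As a quick illustration of the model itself, a truncated impulse response model is simply an FIR filter acting on the input; a minimal MATLAB sketch with made-up coefficients:

    % FIR simulation of a truncated impulse response model
    f = 0.8 .^ (1:30);             % hypothetical impulse response f1...f30
    u = randn(1000, 1);            % input sequence
    y = filter([0, f], 1, u);      % yt = f1*u(t-1) + ... + f30*u(t-30)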
Step Response Model

    y_t = \sum_{i=1}^{\infty} g_i \Delta u_{t-i}
where the g_i are the sampled output values for a unit step input and \Delta u_t = u_t - u_{t-1}. For a stable system, g_i becomes constant after the settling time N_s. As an impulse response coefficient can be regarded as the difference between two consecutive step response coefficients, the following relations hold:

    f_i = g_i - g_{i-1}, \qquad g_i = \sum_{j=1}^{i} f_j
Step response models have the same advantages and disadvantages as the impulse response models.

Transfer Function Model

    y_t = \frac{B(z^{-1})}{A(z^{-1})} u_t
Advantages:
• valid for unstable processes;
• needs fewer parameters.

Disadvantages:
• the structure of the process is fundamental for identification, especially the orders of A and B;
• some processes may not be described sufficiently well by a parametric model with a limited number of parameters.

State Space Model

    x_{t+1} = A x_t + B u_t
    y_t = C x_t + D u_t

Advantages:
• multivariate processes can be represented in a straightforward manner;
• a large collection of modern control theory and analysis methods can be applied.

Disadvantages:
• the calculations may be complicated, with the additional necessity of including an observer if the states are not accessible;
• some processes may not be described sufficiently well by a parametric model with a limited number of parameters.

Time Series Model for the Disturbance

A widely used model is the AutoRegressive Integrated Moving Average (ARIMA) model, in which the disturbance, i.e., the difference between the measured output and the one calculated by the model, is given by

    \nu_t = \frac{C(z^{-1})}{\Delta D(z^{-1})} e_t

where \Delta = 1 - z^{-1}, \nu_t is the disturbance, and D and C are often chosen as 1 in practical MPC.

6.3.2 Predictions
Models are generally the vehicles towards control design. In MPC, however, models are not used directly for the control design. Instead, predictors are designed first according to the models, and then the control law is designed according to the predictions. Predictions have been discussed extensively in the previous several chapters, where the future outputs are predicted from past inputs and outputs. However, to design an MPC is to determine future inputs in order to drive the process to a desired target; thus, the future inputs are critical components of the predictors. Furthermore, in MPC the process output does not just follow the setpoint at one specific point, but a trajectory of the setpoint. Thus the prediction is not simply one-step ahead but multiple steps ahead.
As an illustration, let us consider the following simple ARX model:

    y_t = -a y_{t-1} + b \Delta u_{t-1} + e_t    (6.1)

This model differs slightly from the conventional ARX model in that u_t is replaced by \Delta u_t. The one-step prediction for the ARX model can be derived straightforwardly by writing Equation 6.1 as

    y_{t+1} = -a y_t + b \Delta u_t + e_{t+1}    (6.2)

and then ignoring the future white noise e_{t+1}, following the procedure shown in Chapter 2. This gives

    \hat{y}(t + 1|t) = -a y_t + b \Delta u_t    (6.3)

Similarly, the two-step prediction can be derived by first writing Equation 6.1 as

    y_{t+2} = -a y_{t+1} + b \Delta u_{t+1} + e_{t+2}    (6.4)

Using Equation 6.2, Equation 6.4 can be written as

    y_{t+2} = -a(-a y_t + b \Delta u_t + e_{t+1}) + b \Delta u_{t+1} + e_{t+2}
            = a^2 y_t + b \Delta u_{t+1} - a b \Delta u_t + e_{t+2} - a e_{t+1}

Ignoring the future white noises e_{t+1}, e_{t+2}, the two-step ahead prediction is

    \hat{y}(t + 2|t) = a^2 y_t + b \Delta u_{t+1} - a b \Delta u_t    (6.5)

Combining Equations 6.3 and 6.5, both the one- and two-step predictions based on the past input, past output, and future inputs are obtained:

    \begin{pmatrix} \hat{y}(t+1|t) \\ \hat{y}(t+2|t) \end{pmatrix} = \begin{pmatrix} -a \\ a^2 \end{pmatrix} y_t + \begin{pmatrix} b & 0 \\ -ab & b \end{pmatrix} \begin{pmatrix} \Delta u_t \\ \Delta u_{t+1} \end{pmatrix}    (6.6)

Predictions over additional steps can be derived similarly. Dealing with other model structures such as ARIMAX, BJ, PEM, etc. is more complicated, but the idea remains the same.
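A small numerical sketch of Equation 6.6 in MATLAB, with hypothetical values of a and b:

    % One- and two-step ARX predictions, Equation 6.6
    a = -0.8;  b = 0.5;
    yt = 1.2;                      % current output
    dU = [0.1; -0.05];             % [du(t); du(t+1)]
    yhat = [-a; a^2] * yt + [b, 0; -a*b, b] * dU;
    % yhat(1) = yhat(t+1|t), yhat(2) = yhat(t+2|t)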
6.3.3 Free and Forced Response
Predictions, such as Equation 6.6, consist of two terms. The second term (on the right hand side of the equation) depends on future inputs¹. The first term depends only on past outputs (as well as past inputs in general). This fact is expressed in the MPC literature as

    \hat{y}(t + j|t) = y_{f,t+j} + y_{c,t+j}    (6.7)

where the two terms on the right hand side are called the free response and the forced (controlled) response, respectively. The free response, y_{f,t+j}, corresponds to the prediction of the output if the manipulated variable is kept constant at its present value over the prediction horizon. The forced response, y_{c,t+j}, corresponds to the prediction of the output due to the future control actions.

¹ At time t, u_t is to be determined and therefore u_t is also classified as a future control action.
6.3.4 Objective Function
The various MPC algorithms propose different cost functions for obtaining the control law. The general aims are that [92]:

• the future output should follow a determined reference signal over the considered horizon, and
• the control effort necessary for doing so should be considered in the objective function.

The general expression for such an objective function is

    J = \sum_{j=N_1}^{N_2} [r_{t+j} - \hat{y}(t + j|t)]^T Q_j [r_{t+j} - \hat{y}(t + j|t)] + \sum_{j=1}^{N_u} [\Delta u_{t+j-1}]^T R_j [\Delta u_{t+j-1}]    (6.8)
where Q_j and R_j are weighting matrices. The other parameters are discussed below.

Prediction Horizon and Control Horizon

Prediction starts at N_1; N_2 is the maximum prediction horizon and N_u is the control horizon. N_2 - N_1 + 1 determines the window of predictions over which the predicted output is desired to follow the setpoint. Thus, taking a large value of N_1 implies that it is not important if there are errors in the first few instants up to N_1, while taking a large value of N_2 - N_1 + 1 implies that output errors are of concern over a long time horizon.

Reference Trajectory

The reference trajectory is also called the setpoint, a sequence of future "desired" process outputs. The desired output is not necessarily the same as the actual output, owing to performance limitations of control systems such as hard constraints on the actuator, time delay of the process, model-plant mismatch, etc. However, it is desirable for the actual process output to follow the desired setpoint eventually. Considering performance limitations, most objective functions use a reference trajectory which does not necessarily have to coincide with the real reference; it is normally a smooth approximation from the current value of the output y_t towards the known reference.

6.3.5 Constraints
Constraints are a reality and exist almost everywhere. For instance, no valve can move beyond its operating limits, and most valves cannot make rapid changes of large magnitude (i.e., with a large slew rate), or at least it is not desirable for them to do so. Many process outputs are also subject to constraints for economic or safety reasons. In chemical reactors, for example, a higher temperature may be
desirable for certain reactions, but the reactor material has limited temperature tolerance and too high a temperature causes safety problems. Therefore, in almost all practical model predictive controllers, constraints on the amplitude and slew rate of the control signal and on the output are considered:

    u_{min} \le u_t \le u_{max}    \forall t
    \delta u_{min} \le u_t - u_{t-1} \le \delta u_{max}    \forall t
    y_{min} \le \hat{y}(t + j|t) \le y_{max}    N_1 \le j \le N_2, \forall t
Note that, although ideally the output constraints should apply directly to the actual output, due to unpredictable disturbances the actual output cannot be directly constrained through the controller. Thus only the predicted output can actually be constrained, and as a result good predictions are critical in MPC.
Control Law
Control actions, Δut+i , are calculated by minimizing the objective function. To do this the prediction yˆ(t + i|t), such as that in Equation 6.6, are calculated and are substituted in Equation 6.8. By taking derivatives of J with respect to Δut , Δut+1 , . . . , Δut+Nu −1 , and then equating the derivatives to zero, an analytical solution can be obtained . This is a typical least squares problem. However, if there are hard constraints on ut , Δut , or yˆ(t+j|t), analytical solutions are not possible, and numerical optimization is necessary. All computation must be completed within the sampling interval. When numerical optimization is needed, obtaining the solution is not trivial because there will be Nu decision variables in the optimization. The control horizon is used to impose a structure on the control law. Under this concept it is considered that after a certain time window Nu , ut becomes constant or equivalently Δut = 0, i.e., Δut+j−1 = 0
j > Nu
To summarize, the design of a model predictive control involves specification of prediction models, objective functions, and optimization to obtain control laws. To illustrate, a design procedure for dynamic matrix control is elaborated next.
6.4 Dynamic Matrix Control (DMC) DMC was initially developed by Cutler and Ramaker of Shell Oil Co. at the end of the seventies. This algorithm has been widely accepted in industry especially in the petrochemical sector. It was mentioned earlier that any MPC algorithm has three basic elements, namely the prediction model, the objective function and the solution procedure to calculate the control action. In this section we will discuss these elements for a simplified DMC algorithm.
6.4.1 The Prediction Model
The process model used in the DMC algorithm is the step response model of the process. Consider an infinite step response model

    y_t = \sum_{k=1}^{\infty} g_k \Delta u_{t-k}

This can be written as

    y_t = \sum_{k=1}^{N_s} g_k \Delta u_{t-k} + \sum_{k=N_s+1}^{\infty} g_k \Delta u_{t-k}    (6.9)
where the second term of the above equation represents the response due to control actions taken from the infinite past up to and including time t - N_s - 1, and N_s is the settling time in samples. Denote the second term as Z_t; i.e., write Equation 6.9 as

    y_t = \sum_{k=1}^{N_s} g_k \Delta u_{t-k} + Z_t    (6.10)
In this way the infinite step response model takes the form of a finite step response model, based on which predictions can be made:

    \hat{y}(t + p|t) = \sum_{k=1}^{N_s} g_k \Delta u_{t+p-k} + Z_{t+p}    (6.11)
As has been discussed, under the MPC design framework the predicted response is divided into two parts: the free response term and the forced response term. To achieve this, we separate Equation 6.11 into two terms: one containing future control actions (\Delta u_t, \Delta u_{t+1}, ...) and the other containing past control actions (\Delta u_{t-1}, \Delta u_{t-2}, ...):

    \hat{y}(t + p|t) = \sum_{k=1}^{p} g_k \Delta u_{t+p-k} + \sum_{k=p+1}^{N_s} g_k \Delta u_{t+p-k} + Z_{t+p}

We can rearrange this equation and write it in terms of the free and forced response as

    \hat{y}(t + p|t) = \sum_{k=1}^{p} g_k \Delta u_{t+p-k} + y^*_{t+p}    (6.12)

where the term

    y^*_{t+p} = \sum_{k=p+1}^{N_s} g_k \Delta u_{t+p-k} + Z_{t+p}    (6.13)

represents the free response, i.e., the response that would be expected if no future control actions were taken. Thus y_{f,t+p} = y^*_{t+p}. On the other hand, the first term
on the right hand side of Equation 6.12 is the forced (controlled) response, i.e., the response due to the future control moves starting from t. Thus this term is y_{c,t+p} under the MPC framework. The objective in DMC, as in MPC, is to minimize the differences between the setpoint and the predicted output over some prediction horizon, say over the next 1 to N_2 sample intervals (i.e., choose N_1 = 1). The set of predicted output values over this prediction horizon is:

    \hat{y}(t + 1|t) = y^*_{t+1} + g_1 \Delta u_t
    \hat{y}(t + 2|t) = y^*_{t+2} + g_2 \Delta u_t + g_1 \Delta u_{t+1}
    \hat{y}(t + 3|t) = y^*_{t+3} + g_3 \Delta u_t + g_2 \Delta u_{t+1} + g_1 \Delta u_{t+2}
    \vdots
    \hat{y}(t + N_u|t) = y^*_{t+N_u} + g_{N_u} \Delta u_t + g_{N_u-1} \Delta u_{t+1} + \cdots + g_1 \Delta u_{t+N_u-1}
    \vdots
    \hat{y}(t + N_2|t) = y^*_{t+N_2} + g_{N_2} \Delta u_t + g_{N_2-1} \Delta u_{t+1} + \cdots + g_1 \Delta u_{t+N_2-1}
Note that \Delta u_{t+N_u} = \Delta u_{t+N_u+1} = \cdots = \Delta u_{t+N_2-1} = 0, owing to the structural constraint on the future control moves specified by the control horizon N_u. Define:

    \hat{y} = \begin{pmatrix} \hat{y}(t+1|t) \\ \hat{y}(t+2|t) \\ \vdots \\ \hat{y}(t+N_2|t) \end{pmatrix}, \quad y^* = \begin{pmatrix} y^*_{t+1} \\ y^*_{t+2} \\ \vdots \\ y^*_{t+N_2} \end{pmatrix}, \quad \Delta u = \begin{pmatrix} \Delta u_t \\ \Delta u_{t+1} \\ \vdots \\ \Delta u_{t+N_u-1} \end{pmatrix}
We can express the predicted output using vector/matrix notation as

    \hat{y} = y^* + G \Delta u    (6.14)
where G is defined as

    G = \begin{pmatrix} g_1 & 0 & 0 & \cdots & 0 \\ g_2 & g_1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ g_{N_u} & g_{N_u-1} & g_{N_u-2} & \cdots & g_1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ g_{N_2} & g_{N_2-1} & g_{N_2-2} & \cdots & g_{N_2-N_u+1} \end{pmatrix}
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
6.4.2 Unconstrained DMC Design
Let the setpoint trajectory over the prediction horizon be

    r = \begin{pmatrix} r_{t+1} \\ r_{t+2} \\ \vdots \\ r_{t+N_2} \end{pmatrix}

Then a set of control actions can be obtained by solving the following optimization problem:

    minimize_{\Delta u}  J(\Delta u) = (r - \hat{y})^T Q (r - \hat{y})
    subject to:  \hat{y} = y^* + G \Delta u

where Q is a weighting matrix. Substituting \hat{y} into the objective function yields the modified objective function:

    minimize_{\Delta u}  J(\Delta u) = (r - y^* - G \Delta u)^T Q (r - y^* - G \Delta u)

The optimum value of \Delta u is obtained by taking the derivative of J:

    \frac{\partial J}{\partial \Delta u} = -2 G^T Q^T (r - y^* - G \Delta u) = 0

This gives the optimal sequence of control moves:

    \Delta u = (G^T Q^T G)^{-1} G^T Q^T (r - y^*)
6.4.3 Penalizing the Control Action
A more common problem formulation for DMC adds a control-action related term to the objective function to penalize possibly "aggressive" control actions:

    minimize_{\Delta u}  J(\Delta u) = (r - \hat{y})^T Q (r - \hat{y}) + \Delta u^T R \Delta u
    subject to:  \hat{y} = y^* + G \Delta u
Following the same approach as the derivation of the unconstrained controller gives us: ∂J = −2GT QT (r − y∗ − GΔu) + 2RT Δu = 0 ∂Δu which can be simplified to yield: −GT QT (r − y∗ ) + (GT QT G + RT )Δu = 0 Thus, the controller becomes:
\[
\Delta u = \left(G^T Q^T G + R\right)^{-1} G^T Q^T (r - y^{*}) \tag{6.15}
\]
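As an illustration, here is a short sketch (Python/NumPy, illustrative only; the step coefficients, horizons and weightings are assumptions) of the control-move calculation of Equation 6.15, which reduces to the unconstrained law when R = 0:

```python
import numpy as np
from scipy.linalg import toeplitz

# Illustrative step-response coefficients and horizons (assumptions)
N2, Nu = 10, 3
g = 1.0 - np.exp(-0.5 * np.arange(1, N2 + 1))
G = toeplitz(g, np.zeros(N2))[:, :Nu]   # dynamic matrix, N2 x Nu

Q = np.eye(N2)                          # output weighting
R = 2.0 * np.eye(Nu)                    # move suppression; R = 0 gives the unconstrained law
r = np.ones(N2)                         # setpoint trajectory
y_free = np.zeros(N2)                   # free response (assumed already computed)

# Equation 6.15: du = (G'QG + R)^{-1} G'Q (r - y*)
du = np.linalg.solve(G.T @ Q @ G + R, G.T @ Q @ (r - y_free))
u_move = du[0]                          # receding horizon: apply only the first move
```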
6.4.4 Handling Disturbances in DMC
Up to now, we have not included a disturbance term in the model. In DMC it is assumed that the disturbance exists but remains constant along the entire prediction horizon at the same value as that at time instant t, i.e., the disturbance is assumed to be a step-type disturbance. Alternatively speaking, we can consider that the disturbance model is specified as 1/(1 − z^{-1}). With the disturbance taken into account, the free response expressed by Equation 6.13 now has an additional disturbance term ν_{t+p}, i.e.,
\[
y^{*}_{t+p} = Z_{t+p} + \sum_{k=p+1}^{N_s} g_k\,\Delta u_{t+p-k} + \nu_{t+p}
\]
Now, for the step-type disturbance, as assumed in DMC, we can calculate the disturbance as
\[
\nu_{t+p} = \nu_t = y_t - \hat{y}(t\,|\,t-1)
\]
where the one-step prediction ŷ(t|t−1) is given by
\[
\hat{y}(t\,|\,t-1) = \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k}
\]
Therefore,
\[
\nu_{t+p} = \nu_t = y_t - \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k}
\]
Since
\[
Z_t = \sum_{k=N_s+1}^{\infty} g_k\,\Delta u_{t-k}, \qquad
Z_{t+p} = \sum_{k=N_s+1}^{\infty} g_k\,\Delta u_{t+p-k}
\]
Consequently,
\[
\begin{aligned}
y^{*}_{t+p} &= \sum_{k=N_s+1}^{\infty} g_k\,\Delta u_{t+p-k} + \sum_{k=p+1}^{N_s} g_k\,\Delta u_{t+p-k} + y_t - \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k} \\
&= y_t + \sum_{k=N_s+1}^{\infty} g_k\,\Delta u_{t+p-k} + \sum_{k=p+1}^{N_s} g_k\,\Delta u_{t+p-k} - \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k} \\
&= y_t + \sum_{k=p+1}^{\infty} g_k\,\Delta u_{t+p-k} - \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k}
\end{aligned}
\]
By changing the index variable with j = k − p, i.e., k = j + p, in the second term on the right-hand side of the equation above, we get
\[
y^{*}_{t+p} = y_t + \sum_{j=1}^{\infty} g_{j+p}\,\Delta u_{t-j} - \sum_{k=1}^{\infty} g_k\,\Delta u_{t-k} \tag{6.16}
\]
By changing the index variable from j back to k in the second term on the right-hand side of Equation 6.16, we get
\[
y^{*}_{t+p} = y_t - \sum_{k=1}^{\infty} \left(g_k - g_{k+p}\right) \Delta u_{t-k}
\]
Steady state is reached within N_s sample intervals after a step change in the input, which implies
\[
g_{N_s+1} = g_{N_s+2} = \cdots
\]
Therefore, we can write the general expression for the DMC free response as
\[
y^{*}_{t+p} = y_t - \sum_{k=1}^{N_s} \left(g_k - g_{k+p}\right) \Delta u_{t-k} \tag{6.17}
\]
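A compact sketch of the free-response calculation in Equation 6.17 (Python/NumPy; the step coefficients and stored past moves below are illustrative assumptions):

```python
import numpy as np

def dmc_free_response(y_t, g, du_past, N2):
    """Free response y*_{t+p}, p = 1..N2, per Equation 6.17.
    g       : step coefficients g_1..g_{Ns+N2} (constant beyond Ns)
    du_past : past moves [du_{t-1}, du_{t-2}, ..., du_{t-Ns}]"""
    Ns = len(du_past)
    y_free = np.zeros(N2)
    for p in range(1, N2 + 1):
        s = sum((g[k - 1] - g[k + p - 1]) * du_past[k - 1] for k in range(1, Ns + 1))
        y_free[p - 1] = y_t - s
    return y_free

# Illustrative data: settled first-order response, Ns = 30
Ns, N2 = 30, 10
g_head = 1.0 - np.exp(-0.2 * np.arange(1, Ns + 1))
g = np.concatenate([g_head, np.full(N2, g_head[-1])])   # constant beyond Ns
y_free = dmc_free_response(y_t=0.5, g=g, du_past=0.01 * np.ones(Ns), N2=N2)
```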
6.4.5 Multivariate Dynamic Matrix Control
Recall that for a SISO system the DMC control problem was stated as:
\[
\min_{\Delta u}\; J(\Delta u) = (r - \hat{y})^T Q\,(r - \hat{y}) + \Delta u^T R\,\Delta u
\]
subject to
\[
\hat{y} = y^{*} + S_{m,l}\,\Delta u
\]
where the dynamic matrix G has been replaced by S_{m,l} for the sake of generalization. The subscripts of S_{m,l} denote that the system has l inputs and m outputs.
Consider a multivariate system with m outputs and l inputs. A set of output variables (y_i) is controlled via a set of manipulated variables (u_j). Then the relationship between the ith output and all inputs (i.e., a MISO system) can be expressed using a finite step response model:
\[
y_{i,t} = \sum_{j=1}^{l}\sum_{k=1}^{N_s} g_{ij,k}\,\Delta u_{j,t-k} + Z_{i,t}, \qquad i = 1,\ldots,m
\]
where g_{ij,k} is the kth step response coefficient from the jth input to the ith output; y_{i,t} is the ith output at time t; Δu_{j,t−k} is the control move in the jth input at time t − k; and Z_{i,t} is defined similarly to Equation 6.10. With the finite step response model, following the derivation of Equation 6.12, the ith output prediction ŷ_i(t+p|t) can be derived as:
\[
\hat{y}_i(t+p\,|\,t) = \sum_{j=1}^{l}\sum_{k=1}^{p} g_{ij,k}\,\Delta u_{j,t+p-k} + y^{*}_{i,t+p}
\]
where y^{*}_{i,t+p} is the free response of the ith output at time instant t + p. Recall that we have defined the vectors for the SISO process; the same notation carries over to the MIMO process. For the ith output and the jth input, let
\[
\hat{y}_i = \begin{bmatrix} \hat{y}_i(t+1\,|\,t) \\ \hat{y}_i(t+2\,|\,t) \\ \vdots \\ \hat{y}_i(t+N_2\,|\,t) \end{bmatrix},\quad
y^{*}_i = \begin{bmatrix} y^{*}_{i,t+1} \\ y^{*}_{i,t+2} \\ \vdots \\ y^{*}_{i,t+N_2} \end{bmatrix},\quad
r_i = \begin{bmatrix} r_{i,t+1} \\ r_{i,t+2} \\ \vdots \\ r_{i,t+N_2} \end{bmatrix},\quad
\Delta u_j = \begin{bmatrix} \Delta u_{j,t} \\ \Delta u_{j,t+1} \\ \vdots \\ \Delta u_{j,t+N_u-1} \end{bmatrix}
\]
Define vectors stacking all inputs and outputs:
\[
\hat{y} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_m \end{bmatrix},\quad
y^{*} = \begin{bmatrix} y^{*}_1 \\ y^{*}_2 \\ \vdots \\ y^{*}_m \end{bmatrix},\quad
r = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix},\quad
\Delta u = \begin{bmatrix} \Delta u_1 \\ \Delta u_2 \\ \vdots \\ \Delta u_l \end{bmatrix}
\]
The multivariable dynamic matrix S_{m,l} is now defined as:
\[
S_{m,l} = \begin{bmatrix}
G_{11} & G_{12} & \cdots & G_{1l} \\
G_{21} & G_{22} & \cdots & G_{2l} \\
\vdots & \vdots & \ddots & \vdots \\
G_{m1} & G_{m2} & \cdots & G_{ml}
\end{bmatrix}
\]
Hence, the multivariable dynamic matrix consists of m × l sub-matrices which contain the step response coefficients relating the individual input-output pairs. The sub-matrix relating the ith output y_i to the jth input u_j is given by:
\[
G_{ij} = \begin{bmatrix}
g_{ij,1} & 0 & 0 & \cdots & 0 \\
g_{ij,2} & g_{ij,1} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
g_{ij,N_u} & g_{ij,N_u-1} & g_{ij,N_u-2} & \cdots & g_{ij,1} \\
\vdots & \vdots & \vdots & & \vdots \\
g_{ij,N_2} & g_{ij,N_2-1} & g_{ij,N_2-2} & \cdots & g_{ij,N_2-N_u+1}
\end{bmatrix}
\]
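As an aside, a short sketch (Python/NumPy; the dimensions and coefficient values are illustrative assumptions) of assembling S_{m,l} by stacking the per-pair Toeplitz blocks:

```python
import numpy as np
from scipy.linalg import toeplitz

def block(g_ij, N2, Nu):
    """Toeplitz sub-matrix G_ij from step coefficients g_ij[0] = g_{ij,1}, ..."""
    return toeplitz(g_ij[:N2], np.zeros(Nu))

def multivariable_dynamic_matrix(step_coeffs, N2, Nu):
    """step_coeffs[i][j] holds the coefficients from input j to output i."""
    m, l = len(step_coeffs), len(step_coeffs[0])
    return np.block([[block(step_coeffs[i][j], N2, Nu) for j in range(l)]
                     for i in range(m)])

# Illustrative 2x2 system, N2 = 8, Nu = 3
k = np.arange(1, 9)
coeffs = [[1 - np.exp(-0.3 * k), 0.5 * (1 - np.exp(-0.2 * k))],
          [-0.4 * (1 - np.exp(-0.1 * k)), 1 - np.exp(-0.4 * k)]]
S = multivariable_dynamic_matrix(coeffs, N2=8, Nu=3)   # (2*8) x (2*3)
```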
With these definitions, we can write the multivariate output prediction equation:
\[
\hat{y} = y^{*} + S_{m,l}\,\Delta u
\]
The multivariate DMC problem can now be stated as:
\[
\min_{\Delta u}\; J(\Delta u) = (r - \hat{y})^T Q\,(r - \hat{y}) + \Delta u^T R\,\Delta u
\]
subject to
\[
\hat{y} = y^{*} + S_{m,l}\,\Delta u
\]
Following the same approach as the derivation of the univariate DMC solution, the multivariate DMC problem has the solution:
\[
\Delta u = \left(S_{m,l}^T Q^T S_{m,l} + R\right)^{-1} S_{m,l}^T Q^T (r - y^{*}) \tag{6.18}
\]
6.4.6 Hard Constrained DMC
Up to this point we have assumed that any control action we calculate can be implemented (i.e., Δu is unbounded). Also, we have similarly assumed that ŷ(t+j|t) is not restricted. This is rarely the case.
A natural formulation for handling hard constraints is through numerical optimization:
\[
\min_{\Delta u}\; J(\Delta u) = (r - \hat{y})^T Q\,(r - \hat{y}) + \Delta u^T R\,\Delta u
\]
subject to
\[
\begin{aligned}
\hat{y} &= y^{*} + S_{m,l}\,\Delta u \\
u_L &\le u \le u_H \\
\delta u_L &\le \Delta u \le \delta u_H \\
y_L &\le \hat{y} \le y_H
\end{aligned}
\]
There is no analytical solution to this optimization problem. It must be solved numerically at each control interval.
6.4.7 Economic Optimization
In practice, there are often more output variables (controlled variables) than input variables (manipulated variables). It is also possible, however, to have more manipulated variables than controlled variables. This situation can occur when, for example, some controlled variables are constrained variables rather than setpoint-tracking variables. As long as these variables stay inside their constraints, there is no need to "control" them; some manipulated variables are thereby released, which may yield "additional" manipulated variables. When extra degrees of freedom are available, a steady-state "economic" optimization problem can be used to specify the economic compromises among the process variables. This economic optimization is perhaps the most distinguishing feature of model predictive control technology compared to other modern control technologies. When there are excess input variables, the economic optimization problem is often specified in a linear programming (LP) form:
\[
\min_{u_{ss}}\; P(u_{ss}) = c_u^T u_{ss}
\]
subject to
\[
\begin{aligned}
r_{ss} &= K_{m,l}\,u_{ss} + b \\
g(r_{ss}, u_{ss}) &\ge 0 \\
u_L &\le u_{ss} \le u_H
\end{aligned}
\]
where K_{m,l} is the gain matrix of the system, r_{ss} is the steady-state setpoint target, and b is a constant vector. The solution to the steady-state economic optimization problem provides steady-state targets u_{ss} to which the input variables (u)
should be driven. The steady-state targets are then introduced into the DMC objective function (for dynamic-layer optimization). These steady-state and dynamic optimization problems must be solved online and can be quite computationally expensive, but the cost is often justified on key process units. In practice, the steady-state economic optimization problem can be solved less frequently than the dynamic one, while the dynamic optimization problem must be solved within a sampling interval. The same economic optimization objective can also be applied to the output variables. A typical industrial MPC performs economic optimization for both input and output variables, as will be discussed further in Chapter 9.
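To illustrate the LP form, the following sketch uses scipy.optimize.linprog as a stand-in for an industrial LP solver; the gains, costs, and bounds are invented for illustration, and the general inequality g(·) is specialized to simple output limits:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative steady-state data: 2 outputs, 3 inputs (one extra degree of freedom)
K = np.array([[1.0, 0.5, -0.2],
              [0.3, -1.2, 0.8]])        # gain matrix K_{m,l}
b = np.array([0.1, -0.2])               # constant offset
c_u = np.array([2.0, 1.0, 3.0])         # input costs to minimize
u_lo, u_hi = -1.0, 1.0                  # input bounds
r_lo = np.array([-0.5, -0.5])           # output lower limits
r_hi = np.array([0.8, 0.8])             # output upper limits

# Output limits r_lo <= K u + b <= r_hi written as A_ub u <= b_ub
A_ub = np.vstack([K, -K])
b_ub = np.concatenate([r_hi - b, -(r_lo - b)])

res = linprog(c_u, A_ub=A_ub, b_ub=b_ub, bounds=[(u_lo, u_hi)] * 3)
u_ss = res.x                            # steady-state input targets
```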
6.5 Generalized Predictive Control (GPC)

GPC and DMC follow the same principles and are two special cases of MPC; thus the MPC framework just introduced applies to both. The main differences between DMC and GPC are: 1) GPC uses parametric process models to derive the predictor, and thus a Diophantine equation needs to be solved, while DMC directly uses step response models for predictions; 2) GPC uses a more general disturbance model than the step-type disturbance model of DMC. With the background information just introduced, in this section an MPC solution that handles a more general disturbance model is introduced through a brief overview of GPC. The DMC design procedure has been illustrated in the previous section. GPC follows the same principle but with different details. Generalized predictive control or GPC design [106, 7, 8] starts by first identifying an ARIMAX model for the process, expressed as
\[
A(z^{-1})\,y_t = B(z^{-1})\,u_{t-1} + \frac{C(z^{-1})}{\Delta}\,e_t \tag{6.19}
\]
A, B and C are polynomials in the backshift operator z^{-1}, with A and C being monic. Δ = (1 − z^{-1}) is the differencing operator. The role of the Δ is to ensure integral action in the controller by including an internal disturbance model of the typical load perturbations arising in the process industry [106]. For the sake of simplicity, consider a univariate process as an illustration. A quadratic cost function to be minimized, as a univariate version of Equation 6.8, is
\[
J = \sum_{i=N_1}^{N_2} \left(r_{t+i} - \hat{y}(t+i\,|\,t)\right)^2 + \sum_{i=1}^{N_u} \lambda\left(\Delta u_{t+i-1}\right)^2 \tag{6.20}
\]
with λ being the weighting on the control effort. N1 is usually chosen as 1 or the process time delay d. We choose N1 = 1 here for simplicity of presentation. For a discussion on the selection of values for N1 , N2 , Nu and λ, specifically for GPC, readers are referred to [7] and [8].
Using the Diophantine equations [7, 8]
\[
\frac{C(z^{-1})}{\Delta A(z^{-1})} = E_i + z^{-i}\,\frac{F_i}{\Delta A(z^{-1})} \tag{6.21}
\]
\[
E_i B = G_i C + z^{-i}\,\Gamma_i \tag{6.22}
\]
and Equation 6.19, by ignoring the future disturbance term E_i e_{t+i}, we obtain the p-step-ahead output prediction equation:
\[
\begin{aligned}
\hat{y}(t+p\,|\,t) &= \frac{F_p}{C}\,y_t + \frac{\Gamma_p}{C}\,\Delta u_{t-1} + G_p\,\Delta u_{t+p-1} \\
&= F_p\,y_t^f + \Gamma_p\,\Delta u_{t-1}^f + G_p\,\Delta u_{t+p-1} \\
&= y^{*}_{t+p} + G_p\,\Delta u_{t+p-1}
\end{aligned} \tag{6.23}
\]
where
\[
\Delta u_t^f = C^{-1}\,\Delta u_t, \qquad y_t^f = C^{-1}\,y_t
\]
and
\[
y^{*}_{t+p} = F_p\,y_t^f + \Gamma_p\,\Delta u_{t-1}^f \tag{6.24}
\]
is the free response of the process. Thus the prediction given by Equation 6.23 is the counterpart of the DMC prediction given by Equation 6.12, and Equation 6.24 is the counterpart of Equation 6.17 but with a more general noise model used here. Predict ŷ(t+p|t), for p = 1, ..., N_2, using Equation 6.23. The predictions can then be formed in vector form as
\[
\hat{y} = y^{*} + G\,\Delta u \tag{6.25}
\]
where G is the dynamic matrix containing the step response coefficients of B/A, derived from G_p of Equation 6.23, and y^{*} is derived from Equation 6.24. Similar to the derivation of the DMC law of Equation 6.15, the GPC law becomes
\[
\Delta u = \left(G^T G + \lambda I\right)^{-1} G^T (r - y^{*}) \tag{6.26}
\]
Other practical issues, such as hard constraints, are dealt with in the same way as in DMC.
6.6 Summary

MPC is the most applied advanced control strategy in the process industries. DMC was the first version of MPC, and its control philosophy is still used in modern MPC. Since the 1980s, model predictive control has been studied extensively in both academia and industry. Notable developments include the variety of models used, the higher dimensions of processes considered, and robustness and stability considerations. Modeling is considered the most challenging and time-consuming work in MPC design. In the next chapter, we shall consider a predictive control design approach that does not need an explicit model but possesses many common MPC features, such as tuning of disturbance models, constraint handling, etc.
7 Data-driven Subspace Approach to Predictive Control∗
7.1 Introduction

A dynamic model of the process is the basic requirement for the design of predictive controllers, and the model is first identified using plant input and output data. Using the process model, predictor matrices are constructed (for example, the dynamic matrix constructed using step response coefficients in DMC [1, 2]), as discussed in Chapter 6. The predictor matrices are used to obtain multi-step-ahead predictions of the process output(s) and are used in the controller design. However, these predictor matrices can be directly obtained from the input/output data using subspace matrices, which eliminates the intermediate step of parametric process model identification, providing a means for designing a predictive controller in the generalized predictive controller (GPC) framework (e.g. [16]) or the model predictive control framework, without first identifying a parametric model. Under the subspace framework, since no traditional parametric model is required for the MPC type of controller design, the subspace approach is also referred to as the "model-free approach", and this term has been adopted in the literature (for example, see [14, 15]). The idea explored in this chapter is to obtain the controller matrices used in predictive controllers directly from process input/output data, without the intermediate parametric model identification step. Hence this approach has also been considered a data-driven approach. Moreover, subspace identification methods involve minimizing the summation of multi-step-ahead prediction errors, making the subspace-matrices-based design approach appealing for predictive control synthesis. The predictive controller based on subspace matrices, discussed in this chapter, uses the same cost function as MPC, and hence an important question is how one obtains the predictions utilized in the cost function. As will be seen, the subspace matrices to be used contain similar information to that of the model used by GPC; thus the analogy to GPC will be the focus of this chapter. One of the key aspects of GPC is the assumption of an ARIMAX model for the process [106, 7]. This requires pre-specification of the order and structure of the model to
∗ This chapter (with revisions) is reprinted from Control Engineering Practice, vol. 11, Kadali, R., B. Huang, A. Rossiter, "A Data Driven Subspace Approach to Predictive Controller Design", pp. 261-278, © 2003 Elsevier Ltd., with permission from Elsevier. Section 7.2 has major revisions.
be identified for controller design. Typically one uses reduced-complexity models, which frequently introduce bias errors. Moreover, the model is usually identified in a nonlinear, iterative manner, and in general Diophantine equations need to be solved to obtain the prediction matrices. On the other hand, the predictive controller designed using subspace matrices makes no pre-assumptions about the structure and order of the process model (thus alleviating bias errors). Moreover, the prediction matrices are obtained through a single matrix algebraic calculation.

In summary, the subspace approach to predictive control will have the key features of GPC [7, 8], such as: (1) long-range prediction over a finite horizon; (2) inclusion of weighting on outputs and control moves in the cost function; (3) choice of a prediction horizon and a control horizon, after which projected control moves are taken to be zero; and (4) use of a more general noise model. It combines these features with the added advantages of: (1) no pre-assumptions about model order or structure; (2) prediction matrices obtained without iteration; and (3) elimination of the step of solving the Diophantine equations. We also note that extension to multivariate systems is straightforward with the subspace framework.

Although the idea of designing predictive controllers using subspace matrices, such as model-free LQG and the subspace predictive controller [12, 13, 14], or using the state space model identified through the subspace approach [107, 108, 109], has been around for many years, designing a predictive controller directly from subspace matrices with all the features of the traditional predictive controller has not been investigated. The equivalence of finite-horizon LQG to GPC is well known [106]; however, there are several other important issues that need to be addressed in the subspace predictive control framework, and they form the main content of this chapter. These issues are: (1) derivation of a predictive control law in the GPC framework (with systematic inclusion of integral action); (2) extension of the predictive control law to include feedforward control to compensate for measured disturbances; (3) inclusion of a constraint-handling facility; and (4) tuning of the noise model.

The chapter is arranged as follows. The subspace approach to predictive controller design with enhanced features is explained in Section 7.2. Inclusion of a noise model for tuning is discussed in Section 7.3. Results from simulations and from an actual implementation on a pilot-scale plant using the data-driven subspace predictive control scheme are presented in Section 7.4 and Section 7.5, respectively. The summary is presented in Section 7.6.
7.2 Predictive Controller Design from Subspace Matrices

The GPC objective function, Equation 6.20, can also be written as:
\[
J = (r - \hat{y})^T (r - \hat{y}) + \Delta u^T (\lambda I)\,\Delta u \tag{7.1}
\]
where the future outputs are over the prediction horizon, t+1 to t+N_2, and the future incremental inputs are over the control horizon, t to t+N_u−1. All notation in Equation 7.1 has been defined in Chapter 6. An alternative version
of the objective function, without differencing the input, is given in the subspace literature by
\[
J = (r - \hat{y})^T (r - \hat{y}) + u^T (\lambda I)\,u \tag{7.2}
\]
Following the subspace notation of Chapter 3, data column vectors over a horizon for the output and input, respectively, are defined as
\[
y_f = y_{t+1|t+N_2} = \begin{bmatrix} y_{t+1} \\ \vdots \\ y_{t+N_2-1} \\ y_{t+N_2} \end{bmatrix} \tag{7.3}
\]
\[
u_f = u_{t+1|t+N_2} = \begin{bmatrix} u_{t+1} \\ \vdots \\ u_{t+N_2-1} \\ u_{t+N_2} \end{bmatrix} \tag{7.4}
\]
\[
w_p = w_{t-N|t} = \begin{bmatrix} y_{t-N} \\ \vdots \\ y_t \\ u_{t-N} \\ \vdots \\ u_t \end{bmatrix} \tag{7.5}
\]
Analogous to the matrix subspace Equation 3.41, a vector-form subspace equation can be written as
\[
y_f = L_w w_p + L_u u_f + L_e e_f \tag{7.6}
\]
Similarly, analogous to Equation 3.43, the prediction version (obtained by omitting future white-noise disturbances) is given by
\[
\hat{y}_f = L_w w_p + L_u u_f \tag{7.7}
\]
or
\[
\hat{y}_{t+1|t+N_2} = L_w w_{t-N|t} + L_u u_{t+1|t+N_2} \tag{7.8}
\]
This is a vector form of the predictive subspace representation of the plant model 3.1-3.2. Extract u_t from L_w w_{t-N|t} as
\[
L_w w_{t-N|t} = \begin{bmatrix} L_{w,1} & L_{w,2} \end{bmatrix} \begin{bmatrix} w_{t-N|t-1} \\ u_t \end{bmatrix} = L_{w,1}\,w_{t-N|t-1} + L_{w,2}\,u_t \tag{7.9}
\]
Substituting Equation 7.9 into Equation 7.8 yields
\[
\begin{aligned}
\hat{y}_{t+1|t+N_2} &= L_{w,1}\,w_{t-N|t-1} + L_{w,2}\,u_t + L_u\,u_{t+1|t+N_2} \\
&= L_{w,1}\,w_{t-N|t-1} + \begin{bmatrix} L_{w,2} & L_u \end{bmatrix} \begin{bmatrix} u_t \\ u_{t+1|t+N_2} \end{bmatrix} \\
&= L_{w,1}\,w_{t-N|t-1} + L_{uu}\,u_{t|t+N_2}
\end{aligned} \tag{7.10}
\]
where L_{uu} = [L_{w,2} \; L_u].
Given u_{t+N_u} = u_{t+N_u+1} = ... = u_{t+N_2} = 0, Equation 7.10 can be written as
\[
\hat{y}_{t+1|t+N_2} = L_{w,1}\,w_{t-N|t-1} + L^{*}_{uu}\,u_{t|t+N_u-1} \tag{7.11}
\]
where L^{*}_{uu} is L_{uu} with the last N_2 − N_u + 1 block columns removed. For simplicity of presentation, we write Equation 7.11 in the same form as Equation 7.7:
\[
\hat{y}_f = L_w w_p + L_u u_f \tag{7.12}
\]
Here f stands for "future", but with time span from t+1 to t+N_2 for ŷ_f and from t to t+N_u−1 for u_f; p in w_p stands for "past", but with time span of y backwards from t and of u backwards from t−1. The predictor in Equation 7.12 is used in the minimization of the objective function, Equation 7.2, noting that ŷ = ŷ_f and u = u_f. For consistency of notation, denote r by r_f. Equation 7.2 can then be written as
\[
J = (r_f - \hat{y}_f)^T (r_f - \hat{y}_f) + u_f^T (\lambda I)\,u_f \tag{7.13}
\]
To derive the 'subspace predictive control' law as in [12], substituting Equation 7.12 into Equation 7.13 and taking the derivative gives the optimal future control moves as
\[
u_f = (\lambda I + L_u^T L_u)^{-1} L_u^T (r_f - L_w w_p) \tag{7.14}
\]
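In code, the law of Equation 7.14 amounts to a single linear solve once L_w and L_u have been identified (a Python/NumPy sketch; the matrices below are random stand-ins for identified subspace matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
N2, Nw = 10, 40                 # prediction horizon; length of the past-data vector w_p
Lw = rng.standard_normal((N2, Nw))            # stand-in for identified L_w
Lu = np.tril(rng.standard_normal((N2, N2)))   # causal stand-in for L_u
lam = 1.0

w_p = rng.standard_normal(Nw)   # stacked past outputs and inputs
r_f = np.ones(N2)               # future setpoint trajectory

# Equation 7.14: u_f = (lam*I + Lu'Lu)^{-1} Lu' (r_f - Lw w_p)
u_f = np.linalg.solve(lam * np.eye(N2) + Lu.T @ Lu, Lu.T @ (r_f - Lw @ w_p))
u_now = u_f[0]                  # receding horizon: implement only the first element
```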
For finite {N_2, N_u}, the control law of Equation 7.14 is called SPC, or subspace predictive controller, in [12]. As {N_2, N_u} → ∞, this control law becomes an LQG controller, as presented in [13, 14]. However, for implementation on real processes the controller should have an integrator, since the objective function, Equation 7.2, does not admit zero static error in the case of a non-zero constant reference unless the open-loop process contains an integrator [106]. Hence we need to use the MPC objective function, with incremental inputs Δu_f (or Δu), shown in Equation 7.1. Only one [107, 108, 109] among the several subspace-matrices-based predictive controller design approaches presented in the literature included an integrator in the predictor equation. In this method, to get an integrator in the predictor, Equation 7.12 is multiplied on both sides by the difference operator, Δ = 1 − z^{-1}, and then rearranged to get a predictor equation with incremental inputs and outputs. A slightly different subspace identification
method called DSR [48, 49] is used in this approach. The approach in [17] uses an integrated noise model directly. Since the subspace model in Equation 3.41 is input-output based, it is logical to use a technique similar to that adopted in conventional MPC/GPC [7, 8]. This approach is equivalent to the original MPC/GPC design, since the innovations form of the state space representation in Equations 3.1-3.2, combined with the integrated noise assumption, is equivalent to the ARIMAX representation.
7.2.1 Inclusion of Integral Action
To include integral action in the subspace-matrices-based predictive controller, we can rewrite the process representation in Equations 3.1-3.2 as
\[
x_{t+1} = A x_t + B u_t + K a_t \tag{7.15}
\]
\[
y_t = C x_t + D u_t + a_t \tag{7.16}
\]
where the noise input a_t is integrated white noise, which is common in the process industry. Therefore,
\[
a_{t+1} = a_t + e_t \tag{7.17}
\]
\[
a_t = \frac{e_t}{\Delta} \tag{7.18}
\]
Note that the system considered in Equations 7.15-7.16 is equivalent to an ARIMAX representation, as in Equation 6.19, which is considered in the GPC design. Substituting Equation 7.18 into Equations 7.15-7.16, we obtain
\[
\xi_{t+1} = A\xi_t + B\,\Delta u_t + K e_t \tag{7.19}
\]
\[
\Delta y_t = C\xi_t + D\,\Delta u_t + e_t \tag{7.20}
\]
where the state ξ_t = x_t − x_{t−1}. Following the same arguments as the derivation of Equation 7.6, a subspace version of Equations 7.19-7.20 can be derived as
\[
\Delta y_{t+1|t+N_2} = L_w\,\Delta w_{t-N|t} + L_u\,\Delta u_{t+1|t+N_2} + L_e\,e_{t+1|t+N_2} \tag{7.21}
\]
where
\[
\Delta y_{t+1|t+N_2} = \begin{bmatrix} \Delta y_{t+1} \\ \vdots \\ \Delta y_{t+N_2-1} \\ \Delta y_{t+N_2} \end{bmatrix},\quad
\Delta u_{t+1|t+N_2} = \begin{bmatrix} \Delta u_{t+1} \\ \vdots \\ \Delta u_{t+N_2-1} \\ \Delta u_{t+N_2} \end{bmatrix},\quad
\Delta w_{t-N|t} = \begin{bmatrix} \Delta y_{t-N} \\ \vdots \\ \Delta y_t \\ \Delta u_{t-N} \\ \vdots \\ \Delta u_t \end{bmatrix}
\]
By omitting future white-noise disturbances in Equation 7.21, we get the predicted version of the subspace equation:
\[
\Delta\hat{y}_{t+1|t+N_2} = L_w\,\Delta w_{t-N|t} + L_u\,\Delta u_{t+1|t+N_2} \tag{7.22}
\]
Following the same procedure as the derivation of Equation 7.12, the following prediction equation is obtained:
\[
\Delta\hat{y}_f = L_w\,\Delta w_p + L_u\,\Delta u_f \tag{7.23}
\]
where, as usual, f stands for "future" with time span from t+1 to t+N_2 (for Δŷ_f) and from t to t+N_u−1 (for Δu_f); p in w_p stands for "past" with time span backwards from t (for Δy) and backwards from t−1 (for Δu). Equation 7.23 can be written as
\[
\hat{y}_f = \frac{1}{\Delta}\,L_w\,\Delta w_p + \frac{1}{\Delta}\,L_u\,\Delta u_f
\]
Thus, it can be shown that
\[
\begin{aligned}
\hat{y}_f &= y_t + \Pi L_w\,\Delta w_p + \Pi L_u\,\Delta u_f \\
&= y_t + L_{wI}\,\Delta w_p + L_{uI}\,\Delta u_f \tag{7.24} \\
&= F + L_{uI}\,\Delta u_f \tag{7.25}
\end{aligned}
\]
where
\[
y_t = \begin{bmatrix} y_t \\ y_t \\ \vdots \\ y_t \end{bmatrix}, \qquad
L_{wI} = \Pi L_w, \qquad L_{uI} = \Pi L_u
\]
and
\[
\Pi = \begin{bmatrix}
I & 0 & \cdots & 0 \\
I & I & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
I & I & \cdots & I
\end{bmatrix}
\]
and F is the free response of the process output:
\[
F = y_t + L_{wI}\,\Delta w_p \tag{7.26}
\]
Note that the matrices L_{wI} and L_{uI} are related in a simple manner to L_w and L_u. Even though L_{wI} and L_{uI} could alternatively be identified directly from differenced data (i.e., by applying the operator 1 − z^{-1} to the data), differencing
data in system identification is not generally recommended. Hence, a simple strategy is to identify L_w and L_u and use these matrices to form L_{wI} and L_{uI}.

With Equation 7.25, noting ŷ = ŷ_f, the objective function in Equation 7.1 can be expanded as
\[
J = (r_f - F - L_{uI}\,\Delta u_f)^T (r_f - F - L_{uI}\,\Delta u_f) + \Delta u_f^T (\lambda I)\,\Delta u_f
\]
Differentiating J with respect to Δu_f and equating it to zero gives the control law
\[
\Delta u_f = (L_{uI}^T L_{uI} + \lambda I)^{-1} L_{uI}^T (r_f - F) \tag{7.27}
\]
Only the first block row of Δu_f is implemented, and the calculation is repeated at each time instant. Note that the above control law has guaranteed integral action, obtained directly from the subspace matrices without any intermediate parametric model identification step.
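A sketch of one iteration of this integral-action law (Python/NumPy; L_w and L_u below are random stand-ins for identified subspace matrices, and the dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
N2, Nu, Nw, lam = 10, 3, 40, 1.0
Lw = rng.standard_normal((N2, Nw))                   # identified from data in practice
Lu = np.tril(rng.standard_normal((N2, N2)))[:, :Nu]  # causal, truncated to Nu moves

Pi = np.tril(np.ones((N2, N2)))     # summation operator (SISO case: scalar blocks)
LwI, LuI = Pi @ Lw, Pi @ Lu

y_t = 0.3                           # current measured output
dw_p = rng.standard_normal(Nw)      # differenced past outputs and inputs
r_f = np.ones(N2)

F = y_t * np.ones(N2) + LwI @ dw_p  # free response, Eq. 7.26
du_f = np.linalg.solve(LuI.T @ LuI + lam * np.eye(Nu), LuI.T @ (r_f - F))  # Eq. 7.27
u_next_increment = du_f[0]          # implement only the first move (added to u_{t-1})
```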
7.2.2 Inclusion of Feedforward Control
Measured disturbances are those process input variables which cannot be manipulated for controlling the process outputs. If some of the process disturbances are measurable then, as shown in Section 5.4, the state space representation of the process in Equations 3.1-3.2 is modified to Equations 5.23-5.24. Further modifying Equations 5.23-5.24 by introducing the differencing operator, in a similar way to the derivation of Equations 7.19-7.20, yields
\[
\xi_{t+1} = A\xi_t + \begin{bmatrix} B & B_m \end{bmatrix} \begin{bmatrix} \Delta u_t \\ \Delta m_t \end{bmatrix} + K e_t \tag{7.28}
\]
\[
\Delta y_t = C\xi_t + \begin{bmatrix} D & D_m \end{bmatrix} \begin{bmatrix} \Delta u_t \\ \Delta m_t \end{bmatrix} + e_t \tag{7.29}
\]
where ξ_t = x_t − x_{t−1}. Now, analogous to Equation 5.26, the subspace representation of Equations 7.28-7.29 is given by
\[
\Delta Y_f = L_w^b\,\Delta W_p^b + L_u\,\Delta U_f + L_m\,\Delta M_f + L_e\,E_f
\]
Omitting the future white-noise term gives the prediction:
\[
\Delta \hat{Y}_f = L_w^b\,\Delta W_p^b + L_u\,\Delta U_f + L_m\,\Delta M_f \tag{7.30}
\]
where
\[
\Delta W_p^b = \begin{bmatrix} \Delta Y_p \\ \Delta U_p \\ \Delta M_p \end{bmatrix}
\]
A vector form of Equation 7.30 can be extracted as
\[
\Delta\hat{y}_f = L_w^b\,\Delta w_p^b + L_u\,\Delta u_f + L_m\,\Delta m_f
\]
For predictive control we have the values of the measured disturbance only up to the current sampling instant t (and no knowledge of its future values), i.e., m_i for i = t, t−1, t−2, ... are known, but m_{t+i} for i = 1, 2, ..., N_2 are not available. Therefore, by ignoring the future measured-disturbance increments Δm_f, we may write the prediction expression for Δŷ_f simply as
\[
\Delta\hat{y}_f = L_w^b\,\Delta w_p^b + L_u\,\Delta u_f \tag{7.31}
\]
Omitting the future Δm_f can be justified if Δm_t is white noise, and this is indeed likely to be the case after the differencing of m_t. However, if there is some a priori knowledge about Δm_t, such as its correlation structure, prediction of Δm_f is also possible. A procedure similar to that of Equations 7.7-7.12 needs to be performed so that Equation 7.31 can be used for predictive control design. Again, analogous to the derivation of Equation 7.25, the following equation is obtained:
\[
\begin{aligned}
\hat{y}_f &= y_t + L_{wI}^b\,\Delta w_p^b + L_{uI}\,\Delta u_f \tag{7.32} \\
&= F^b + L_{uI}\,\Delta u_f \tag{7.33}
\end{aligned}
\]
where L_{wI}^b = ΠL_w^b, L_{uI} = ΠL_u, and F^b = y_t + L_{wI}^b Δw_p^b is the free response. Thus the feedback-plus-feedforward control law becomes
\[
\Delta u_f = (L_{uI}^T L_{uI} + \lambda I)^{-1} L_{uI}^T (r_f - F^b) \tag{7.34}
\]
7.2.3 Constraint Handling
Constraints arise due to physical limitations, quality specifications, safety concerns, and the need to limit equipment wear. One of the main features of MPC, its prediction capability, is useful in anticipating constraint violations and correcting them in an appropriate way [92]. The explicit handling of constraints may allow the process to operate closer to optimal operating conditions [92]. For the constrained case, the computations are more involved. The QP formulation of the constraints is well known and widely available in the literature. Typical process constraints have been discussed in Section 6.3.5. The constraints for the predictive controller can be transferred to
\[
\begin{aligned}
u_{\min} &\le u_{t+i} \le u_{\max}, & i &= 0, 1, 2, \ldots, N_u-1 \\
\delta u_{\min} &\le \Delta u_{t+i} \le \delta u_{\max}, & i &= 0, 1, 2, \ldots, N_u-1 \\
y_{\min} &\le \hat{y}_{t+i} \le y_{\max}, & i &= 1, 2, \ldots, N_2
\end{aligned}
\]
Define
\[
L_1 = \begin{bmatrix} \delta u_{\min} \\ \vdots \\ \delta u_{\min} \end{bmatrix},\quad
U_1 = \begin{bmatrix} \delta u_{\max} \\ \vdots \\ \delta u_{\max} \end{bmatrix},\quad
L_2 = \begin{bmatrix} u_{\min} - u_{t-1} \\ \vdots \\ u_{\min} - u_{t-1} \end{bmatrix},\quad
U_2 = \begin{bmatrix} u_{\max} - u_{t-1} \\ \vdots \\ u_{\max} - u_{t-1} \end{bmatrix}
\]
\[
L_3 = \begin{bmatrix} y_{\min} \\ \vdots \\ y_{\min} \end{bmatrix} - F,\qquad
U_3 = \begin{bmatrix} y_{\max} \\ \vdots \\ y_{\max} \end{bmatrix} - F
\]
and
\[
R = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
1 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{bmatrix}
\]
where F is the free response vector. The constraints can be rewritten as:
\[
\begin{aligned}
L_1 &\le \Delta u_f \le U_1 \\
L_2 &\le R\,\Delta u_f \le U_2 \\
L_3 &\le L_{uI}\,\Delta u_f \le U_3
\end{aligned}
\]
These constraints can then be combined into a single matrix inequality:
\[
A\,\Delta u_f \le B \tag{7.35}
\]
with
\[
A = \begin{bmatrix} -I \\ -R \\ -L_{uI} \\ I \\ R \\ L_{uI} \end{bmatrix}, \qquad
B = \begin{bmatrix} -L_1 \\ -L_2 \\ -L_3 \\ U_1 \\ U_2 \\ U_3 \end{bmatrix}
\]
With the above formulation of the process constraints, the optimization problem for a predictive controller takes the form of a standard quadratic program (QP), and the optimization is done numerically. The quadratic program solved at every instant is
\[
\begin{aligned}
\min_{\Delta u_f} J &= \min_{\Delta u_f}\big[(r_f - F)^T (r_f - F) + \Delta u_f^T (L_{uI}^T L_{uI})\,\Delta u_f - 2(r_f - F)^T L_{uI}\,\Delta u_f + \Delta u_f^T (\lambda I)\,\Delta u_f\big] \\
&= \min_{\Delta u_f}\big[\Delta u_f^T (L_{uI}^T L_{uI} + \lambda I)\,\Delta u_f - 2(r_f - F)^T L_{uI}\,\Delta u_f\big] \\
&= \min_{\Delta u_f}\big[\Delta u_f^T P\,\Delta u_f + C^T \Delta u_f\big]
\end{aligned} \tag{7.36}
\]
subject to
\[
A\,\Delta u_f \le B \tag{7.37}
\]
where
\[
P = L_{uI}^T L_{uI} + \lambda I \tag{7.38}
\]
\[
C = -2 L_{uI}^T (r_f - F) \tag{7.39}
\]
7.3 Tuning the Noise Model The disturbance dynamics of industrial processes may change with time frequently. Tuning of the noise model is a key feature of predictive controller formulations like general MPC or GPC [106]. It is necessary to incorporate such a feature in the predictive controller derived from subspace matrices. For this reason we need to separate the state space model of the system into two parts, a
7.3 Tuning the Noise Model

The disturbance dynamics of industrial processes may change frequently with time. Tuning of the noise model is a key feature of predictive controller formulations like general MPC or GPC [106]. It is necessary to incorporate such a feature in the predictive controller derived from subspace matrices. For this reason we need to separate the state space model of the system into two parts, a
131
deterministic part and a stochastic part, which are similar to the process model and noise model in an equivalent input-output transfer function framework as yt = ytd + yts = [C(zI − A)−1 B + D]ut + [C(zI − A)−1 K + I]et
(7.40)
It can be observed that both deterministic and stochastic parts have the same poles. Hence an equivalent representation for the above equation in the discrete transfer function domain would be an ARMAX model, which is a special case of the general linear model: yt = Gp (z −1 )ut + Gl (z −1 )et where the transfer functions Gp (z −1 ) and Gl (z −1 ) have the same denominators for the ARMAX model. Detailed definition of various model structures can be found in Chapter 2. The noise (or disturbance) model in Equation 7.40 can be written as [C(zI − A)−1 K + I] = I + CKz −1 + CAKz −2 + ...
(7.41)
Thus an assumption that the noise model Gl (z −1 ) = 1 would be equivalent to assuming that the Kalman gain matrix, K, equals zero. Therefore if the user desires to change the stochastic part of the identified innovation model, without changing the deterministic model and hence without changing the system matrices C and A, the only way to do it is by changing the Kalman gain matrix, K. Suppose that a new Kalman gain matrix is represented as K ∗ , which is a matrix for a multiple-output system, and a vector for a single-output system. K ∗ can be expressed as K∗ = K + K
(7.42)
We can then write the new noise model as [C(zI − A)−1 K ∗ + I] = I + CK ∗ z −1 + CAK ∗ z −2 + ... = I + C(K + K )z −1 + CA(K + K )z −2 + ...
= [C(zI − A)−1 K + I] + [CK z −1 + CAK z −2 + ...]
(7.43)
The stochastic part of the output with the new Kalman gain matrix is (yts )∗ = [C(zI − A)−1 K ∗ + I]et
= [C(zI − A)−1 K + I]et + [CK z −1 + CAK z −2 + ...]et
= yts + [CK z −1 + CAK z −2 + ...]et ⎞ et−1 ⎟ ⎜ e ⎜ t−2 ⎟ ≈ yts + CK CAK ... CAN −1 K ⎜ ⎟ ⎝ ... ⎠ et−N ⎛
= yts + γN ep
(7.44)
where
\[
e_p = \begin{bmatrix} e_{t-1} \\ e_{t-2} \\ \vdots \\ e_{t-N} \end{bmatrix}
\]
and
\[
\gamma_N = \begin{bmatrix} C\Delta K & CA\Delta K & \cdots & CA^{N-1}\Delta K \end{bmatrix}
\]
can be calculated from the elements of the estimated Γ_N matrix and a user-specified gain matrix ΔK. Estimation of Γ_N has been discussed in Chapter 3. However, N here should be chosen sufficiently large that CA^{N-1} → 0, to justify the truncation used in deriving Equation 7.44. Since knowledge of the state space system matrices A and C is not required, the new stochastic model can be incorporated in a model-free manner. Now the prediction with the "tuned" noise model can be written as
\[
(\hat{y}_t)^* = \hat{y}_t + \gamma_N\,e_p \tag{7.45}
\]
where γ_N can be considered a vector of impulse response coefficients (Markov parameters for multivariate systems) of the tuned component of the new noise model. Noise model tuning is often used as a tool to compensate for process-model mismatch resulting from changes in the process over time, or simply as a set of tuning parameters. e_p contains the past prediction errors and can be estimated from the data as one-step-ahead prediction errors. In essence, adding the term γ_N e_p is equivalent to filtering the past prediction errors. Hence ΔK is used as a tuning parameter and is chosen in such a way that it minimizes the prediction errors. Thus incorporating a new noise model simply involves the addition of a new term in the calculation of the free response of the process. The free response calculation, for example Equation 7.26, is accordingly modified as
\[
F = y_t + L_{wI}\,\Delta w_p + \Upsilon\,e_p \tag{7.46}
\]
where Υ is an upper-left triangular matrix constructed from the elements of γ_N:
\[
\Upsilon = \begin{bmatrix}
\gamma_N(1) & \gamma_N(2) & \cdots & \gamma_N(N-1) & \gamma_N(N) \\
\gamma_N(2) & \gamma_N(3) & \cdots & \gamma_N(N) & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
\gamma_N(N_2) & \cdots & \gamma_N(N) & \cdots & 0
\end{bmatrix}
\]
where γ_N(i) is the ith block column of γ_N.
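A small sketch of building Υ from γ_N for the SISO case (Python/NumPy; the γ_N values below are illustrative assumptions):

```python
import numpy as np

def build_upsilon(gamma, N2):
    """Matrix of Eq. 7.46: row p holds gamma(p+1), gamma(p+2), ..., gamma(N),
    padded with trailing zeros (SISO case: scalar blocks)."""
    N = len(gamma)
    U = np.zeros((N2, N))
    for p in range(N2):
        tail = gamma[p:]            # gamma(p+1) ... gamma(N), 0-based indexing
        U[p, :len(tail)] = tail
    return U

gamma_N = 0.5 ** np.arange(1, 21)   # illustrative decaying tuned-noise coefficients
Upsilon = build_upsilon(gamma_N, N2=10)
e_p = np.random.default_rng(3).standard_normal(20)  # past one-step prediction errors
F_correction = Upsilon @ e_p        # extra term added to the free response
```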
7.4 Simulations

The data-driven subspace predictive control design method is tested in simulations. The system example is taken from a MATLAB MPC Toolbox working example.
\[
\begin{bmatrix} y_1(s) \\ y_2(s) \end{bmatrix} =
\begin{bmatrix} \dfrac{12.8\,e^{-s}}{16.7s+1} & \dfrac{-18.9\,e^{-3s}}{21.0s+1} \\[2ex] \dfrac{6.6\,e^{-7s}}{10.9s+1} & \dfrac{-19.4\,e^{-3s}}{14.4s+1} \end{bmatrix}
\begin{bmatrix} u_1(s) \\ u_2(s) \end{bmatrix} +
\begin{bmatrix} \dfrac{3.8\,e^{-8s}}{14.9s+1} \\[2ex] \dfrac{4.9\,e^{-3s}}{13.2s+1} \end{bmatrix} w(s) +
\begin{bmatrix} e_1(s) \\ e_2(s) \end{bmatrix} \tag{7.47}
\]
Open-loop input/output data are obtained by exciting the open-loop system, in MATLAB and Simulink, using a designed RBS signal of magnitude 1 for the inputs u_t, and random numbers of standard deviation 0.1 for the white noise sequences e_t. A random-walk signal is designed for the measured disturbance w_t by passing a white-noise signal of standard deviation 0.1 through an integrator. The sampling interval is taken as 2 units of time. Using subspace identification, with N = 50 (row blocks) and j = 2000 (column blocks) in the data Hankel matrices, the subspace matrices L_w, L_u and L_m are identified. The simulation data and the models from the subspace matrices are plotted in Figures 7.1-7.2. As can be seen from Figure 7.2, the impulse response models from the identified subspace matrices match well with the true impulse response models.
Fig. 7.1. Inputs, measured disturbance and outputs data from simulations
In Figure 7.3 the simulation results with the subspace-based predictive controller without an integrator (SPC in [12]) are compared with those of the predictive controller with integral action. As illustrated, the controller with the integrator gives no offset
Fig. 7.2. Comparison of the process and noise models from subspace matrices with the true models
Fig. 7.3. Predictive controller without and with the integrator. wt = 0; et =0
Fig. 7.4. Predictive controller without and with the feedforward control. et =0
Fig. 7.5. Variation of input weighting, λ. wt = 0; et =0
Fig. 7.6. Variation of control horizon, Nu . wt = 0; et =0
Fig. 7.7. Variation of prediction horizon, N2 . wt = 0; et =0
Fig. 7.8. Constrained predictive controller. wt = 0; et =0
Fig. 7.9. Noise model tuning for model mismatch. wt = 0; et =0
for non-zero setpoints. In Figure 7.4 the predictive controller performance is compared for the cases without and with feedforward control. Better controller performance is achieved with feedforward control. For a range of values of λ, N_2, N_u and of constraints on the input moves Δu, a subspace-matrices-based predictive controller is implemented on the above process in MATLAB and Simulink. The closed-loop system responses for different sets of tuning parameters are illustrated in Figures 7.5-7.8. In Figure 7.5 it can be seen that as the weighting λ on the input increases, the controller response becomes less aggressive. For a given prediction horizon, as the control horizon N_u increases, the controller gives more aggressive tracking performance, as shown in Figure 7.6. For a given control horizon, as the prediction horizon N_2 increases, the controller gives better setpoint tracking performance, as shown in Figure 7.7. Figure 7.8 shows the setpoint tracking under different constraints on the incremental control moves Δu. It can be seen that the smaller the magnitude of the maximum allowed control moves, the more sluggish the controller response to setpoint changes.

Noise Model Tuning

To illustrate the tuning of the noise model with the subspace approach, consider that the process model changes with time; in other words, there is a mismatch between the true process model and the identified process model used in the controller design. The model-plant mismatch translates into additional unknown "disturbances". Consider the case when the process model in Equation 7.47 changes to
\[
\begin{bmatrix} y_1(s) \\ y_2(s) \end{bmatrix} =
\begin{bmatrix} \dfrac{12.8\,e^{-s}}{16.7s+1} & \dfrac{-18.0\,e^{-3s}}{21.0s+1} \\[2ex] \dfrac{8\,e^{-7s}}{10.9s+1} & \dfrac{-18\,e^{-3s}}{14.4s+1} \end{bmatrix}
\begin{bmatrix} u_1(s) \\ u_2(s) \end{bmatrix} +
\begin{bmatrix} \dfrac{3.8\,e^{-8s}}{14.9s+1} \\[2ex] \dfrac{4.9\,e^{-3s}}{13.2s+1} \end{bmatrix} w(s) +
\begin{bmatrix} e_1(s) \\ e_2(s) \end{bmatrix}
\]
7.5 Experiment on a Pilot-scale Process The data-driven subspace predictive controller is tested on a multivariate pilotscale system. The system considered, shown in Figure 7.10, is a three tank system with two inlet water flows. The levels of tank-1 and tank-2 are the two controlled variables (CVs). The setpoints (SPs) for the flow rates through the valves-A & B are the two manipulated variables (MVs). The flow rates through the valves-A & B are controlled through local PID controllers on each valve. The setpoints for the flow rates come from a higher-level advanced controller application. The local
7.5 Experiment on a Pilot-scale Process

The data-driven subspace predictive controller is tested on a multivariate pilot-scale system. The system considered, shown in Figure 7.10, is a three-tank system with two inlet water flows. The levels of tank-1 and tank-2 are the two controlled variables (CVs). The setpoints (SPs) for the flow rates through valves A and B are the two manipulated variables (MVs). The flow rates through valves A and B are controlled through local PID controllers on each valve. The setpoints for the flow rates come from a higher-level advanced controller application. The local
139
PID controllers, which are univariate, are at faster sampling rate (1 second). The higher level controller, which is multivariate and does computations to minimize an optimization objective function, sends controller outputs every 6 seconds. The system is configured so as to emulate a typical multivariate system in the industry. Valve-A
Water flow Local CASCADE-B
A/D and D/A converter
SP
FC
FI
SP
FI
LI
Valve-B
FC
Local CASCADE-A
LI
TANK-1
TANK-2
TANK-3
Valve-C Drain
Drain
Fig. 7.10. Experimental setup
Tank-3 and valve-C are used primarily to introduce interactions between the variables in the system. As can be seen in Figure 7.10, a change in the level of tank-1 affects the level in tank-2 via the tank-3 level. The degree of interaction can be manipulated by changing the valve-C position. If valve-C is completely closed, the level in tank-2 is independent of the level in tank-1 (zero interaction). By opening valve-C, interactions are introduced into the tank-2 level. Valve-C is maintained at a fixed open position throughout the exercise. Note that the level in tank-1 is independent of the levels in tanks 2 and 3. The step response models for the system, which are formed from the impulse response coefficients in the subspace matrices, are plotted in Figure 7.11. The interactions between the variables are clear from the step response plots. Open-loop step-test data for the system are collected by sending two uncorrelated PRBS (pseudo-random binary sequence) signals for the SPs of the flow rates through valves A and B. Subspace matrices are identified using the open-loop data. A multivariate subspace-matrices-based predictive controller is then designed for the system. The controller parameters (weighting matrices, prediction horizon, control horizon and noise model) are tuned for a smooth control response. The closed-loop responses for the unconstrained and constrained (|Δu| ≤ 0.5) cases are plotted in Figures 7.12 and 7.13, respectively.
Fig. 7.11. Step response coefficients from the subspace matrices identified using the open-loop data 3 Tank−2 level
Tank−1 level
3 2 1 0 200
400
600
800
0 200
400
600
800
1000
200
400
600
800
1000
200
400
600
800
1000
18
U2 (SP for Valve−B)
18
1
−1
1000
16
16
14
14
12
12
1
U (SP for Valve−A)
−1
2
10
10
200
400
600
800
1000
1
0.5 Δ U2
ΔU
1
0.5 0
0
−0.5 −1
−0.5 200
400
600
800
1000
Fig. 7.12. Subspace based predictive controller on the pilot scale process
Fig. 7.13. Subspace based predictive controller for the constrained case on the pilot scale process
7.6 Summary

In this chapter, the design of a predictive controller in the MPC/GPC framework, using the subspace matrices calculated through the subspace projection method, has been addressed. Important issues in the practical implementation of predictive controllers, such as integral action, constraint handling and feedforward control, have been discussed. It has been shown that the noise model can be independently specified by the user through the addition of a new term to the predictor equation in a model-free manner, which is equivalent to changing the Kalman filter gain matrix. The data-driven subspace predictive controller has been tested on multivariate systems in simulations and on a pilot-scale process.
8 Control Loop Performance Assessment: Conventional Approach
8.1 Introduction

Established in the late 1980s by the work of Harris [19], research on control loop performance monitoring has been and remains one of the most active research areas in the process control community. A number of new approaches have been developed since then [22, 21, 20]. It is estimated that several hundred papers have been published on this and related topics [111]. On the practical side, Eastman Kodak recently reported regular loop monitoring of over 14000 PID loops [111], and commercial control performance assessment software, including multivariate performance assessment, has also become available.

Harris (1989) [19] developed a performance assessment technique using only routine operating data for univariate control loops. Assuming that the control objective is to reduce process variance, minimum variance control is used naturally as the benchmark standard against which current control loop performance is assessed. It has been shown [19, 22] that for a system with time delay d, a portion of the output variance is feedback-control invariant and can be estimated from routine operating data. This portion of the output variance equals the variance achieved under minimum variance control; thus the method for the estimation of the minimum variance from routine operating data is established. To separate this feedback-control-invariant term, one needs to model the closed-loop output data y_t by a moving-average (MA) model such as
\[
y_t = \underbrace{f_0 e_t + f_1 e_{t-1} + \cdots + f_{d-1} e_{t-(d-1)}}_{y_{mv}} + f_d e_{t-d} + f_{d+1} e_{t-(d+1)} + \cdots
\]
where e_t is a white noise sequence and y_{mv} is the portion representing the minimum variance control output, independent of feedback control [22]. This portion of the output can be estimated by time series analysis of routine closed-loop operating data, and its variance is subsequently used as a benchmark measure of the theoretically achievable absolute lower bound of output variance to assess control loop performance. Using minimum variance control as the benchmark does not mean that one has to implement such a controller on the actual process.
This benchmark control may or may not be achievable in practice, depending on several physical constraints [22]. However, as a benchmark, it provides useful information, such as how good the current controller performance is compared to the minimum variance controller and how much potential there is to improve controller performance further. If the controller indicates a good performance measure relative to minimum variance control, further tuning or re-designing of the control algorithm is neither necessary nor helpful. In this case, if further reduction of process variation is desired, implementation of other strategies, such as feedforward control or re-engineering of the process itself, may be necessary. On the other hand, if the controller indicates a poor performance measure, further analysis, such as model analysis, robustness analysis, constraint analysis, etc., may be necessary, since the poor performance may be due to soft or hard constraints, such as poorly damped zeros or hard constraints on control actions.

As an introduction to feedback control performance assessment, problems and solutions for both SISO and MIMO processes are reviewed in tutorial form, and key MATLAB programs are illustrated through examples. Based on the authors' experience in the application of control performance assessment algorithms, we believe that the procedure introduced in this chapter is intuitive and easy to implement. Other advanced algorithms may also be found in [112, 22, 25, 27]. For the detailed theory behind this chapter, readers are referred to [22]. The remainder of this chapter is organized as follows: in Section 8.2 the feedback-control-invariant term is derived, and extraction of the minimum variance from time series analysis of routine operating data for SISO processes is presented. The treatment of MIMO processes is discussed in Section 8.3, followed by a summary in Section 8.4.
8.2 SISO Feedback Control Performance Assessment

Consider a SISO process under feedback control as shown in Figure 8.1, where G_p is the plant transfer function. Furthermore, G_p = z^{-d}\tilde{G}_p, where d is the time delay and \tilde{G}_p is the delay-free plant transfer function. G_l is the disturbance transfer function, e_t is a white noise sequence with zero mean, and G_c is the controller transfer function. It follows from Figure 8.1 that
\[
y_t = \frac{G_l}{1 + G_p G_c}\,e_t = \frac{G_l}{1 + z^{-d}\tilde{G}_p G_c}\,e_t \tag{8.1}
\]
Using the Diophantine identity:
\[
G_l = \underbrace{f_0 + f_1 z^{-1} + \cdots + f_{d-1} z^{-d+1}}_{F} + R\,z^{-d}
\]
where the f_i (for i = 1, ..., d−1) are impulse response coefficients of G_l, and R is the remaining rational, proper transfer function, Equation 8.1 can be written as
Fig. 8.1. Schematic diagram of SISO process under feedback control
\[
\begin{aligned}
y_t &= \frac{F + z^{-d}R}{1 + z^{-d}\tilde{G}_p G_c}\,e_t \\
&= \Big[F + z^{-d}\,\frac{R - F\tilde{G}_p G_c}{1 + z^{-d}\tilde{G}_p G_c}\Big]\,e_t \\
&= F e_t + L e_{t-d}
\end{aligned} \tag{8.2}
\]
where L = (R − F\tilde{G}_p G_c)/(1 + z^{-d}\tilde{G}_p G_c) is a proper transfer function. Since F e_t = f_0 e_t + ... + f_{d-1} e_{t-d+1}, which is independent of white noise occurring before t − d + 1, the two terms on the right-hand side of Equation 8.2 are independent, and as a result,
\[
\mathrm{Var}(y_t) = \mathrm{Var}(F e_t) + \mathrm{Var}(L e_{t-d})
\]
R ˜pF G
Since F is independent of the controller transfer function Gc , the term F et , which is the process output under minimum variance control, is feedback-control invariant. Therefore, if a stable process output yt is modeled by an infinite movingaverage model, then its first d terms constitute an estimate of the minimum variance term F et . The estimation procedure is elaborated below. Consider that a set of representative closed-loop routine operating data is available, y1 , y2 , . . . , yN . Using time series analysis, a model in the difference equation form can be built: yt + a ˆ1 yt−1 + · · · + a ˆp yt−p = et + cˆ1 et−1 + · · · + cˆγ et−γ
(8.3)
148
8 Control Loop Performance Assessment: Conventional Approach
With backshift operator, z −1 , this difference equation can be written as a discretetime transfer function yt 1 + cˆ1 z −1 + . . . + cˆγ z −γ = et 1+a ˆ1 z −1 + . . . + a ˆp z −p
(8.4)
Equation 8.4 is an estimate of Equation 8.1. Owing to the feedback-invariance property, after long division of Equation 8.4: yt = fˆ0 + fˆ1 z −1 + . . . + fˆd−1 z −d+1 + . . . et the first d terms of this long division result, fˆ0 , fˆ1 , . . . , fˆd−1 , are the estimates of f0 , f1 , . . . , fd−1 respectively. Thus minimum variance can be calculated as 2 2 = V ar[(fˆ0 + fˆ1 z −1 + . . . + fˆd−1 z −d+1 )et ] = [fˆ02 + fˆ12 + . . . + fˆd−1 ]σe2 σ ˆmv
where σe2 can be estimated as the variance of the residuals after fitting Equation 8.3. Example 8.1. Consider following SISO process, as used by Desborough and Harris(1992) [113], with time delay d = 2: yt = ut−2 +
\[
y_t = u_{t-2} + \frac{1 - 0.2z^{-1}}{1 - z^{-1}}\,e_t \tag{8.5}
\]
For a simple integral controller, Δu_t = −K y_t, it can be derived that the closed-loop response is given by
\[
y_t = e_t + 0.8e_{t-1} + \frac{0.8\,(1 - K/0.8 - Kz^{-1})}{1 - z^{-1} + Kz^{-2}}\,e_{t-2} \tag{8.6}
\]
where σ_e² = 1. Note that the first two terms are independent of K and represent the process output under minimum variance control. Thus, the theoretical (exact) minimum variance can be calculated as
\[
\sigma^2_{mv} = \mathrm{Var}(e_t + 0.8e_{t-1}) = (1 + 0.8^2)\,\sigma_e^2 = 1.64 \tag{8.7}
\]
Note that this minimum variance is independent of the controller parameter K. Let K = 0.5, for example. A set of data (8000 data points) can be simulated from the closed-loop process, and a time series model is identified as
\[
\frac{y_t}{e_t} = \frac{1.001 - 0.1988z^{-1} + 0.01122z^{-2}}{1 - 0.9859z^{-1} + 0.5024z^{-2}} \tag{8.8}
\]
with a variance of the fitting residuals of 1.0011. Using long division, or the MATLAB function impulse, an infinite impulse response (moving-average) model is obtained as
\[
\frac{y_t}{e_t} = 1 + 0.7873z^{-1} + 0.2850z^{-2} + \cdots
\]
Thus the minimum variance can be calculated as
\[
\hat{\sigma}^2_{mv} = (1 + 0.7873^2)\,\sigma_e^2
\]
where σ_e² can be estimated from the variance of the residuals, which is 1.0011. Therefore,
\[
\hat{\sigma}^2_{mv} = 1.6216 \tag{8.9}
\]
The actual variance of the data can be calculated as σ_y² = 1.8363. Although this actual variance can be calculated directly from the data, it is recommended to use the estimated time series model (Equation 8.8), the variance of the residuals σ_e², and the MATLAB function covar to estimate the variance of the data, σ_y². As a result, the performance index is
\[
\eta = \frac{1.6216}{1.8363} = 0.8831
\]
A new set of data corresponding to K = 0.9 is simulated. The variance is calculated as σ_y² = 6.3046. A different time series model estimated from the closed-loop data is
\[
\frac{y_t}{e_t} = \frac{1.001 - 0.2165z^{-1} + 0.00228z^{-2}}{1 - 1.003z^{-1} + 0.9038z^{-2}}
\]
with a variance of the residuals of 1.0010. The impulse response model can be calculated as
\[
\frac{y_t}{e_t} = 1 + 0.7868z^{-1} - 0.1168z^{-2} + \cdots
\]
and the minimum variance can be estimated as
\[
\hat{\sigma}^2_{mv} = 1.6207 \tag{8.10}
\]
The performance index is calculated as
\[
\eta = \frac{1.6207}{6.3046} = 0.2571
\]
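The arithmetic of Example 8.1 is easy to verify (a small Python sketch; the coefficients are those reported above):

```python
# Minimum variance from the first d = 2 impulse coefficients:
# sigma_mv^2 = (f0^2 + f1^2) * sigma_e^2
f = [1.0, 0.7873]                      # impulse coefficients for K = 0.5
sigma_e2 = 1.0011                      # residual variance
sigma_mv2 = sum(fi**2 for fi in f) * sigma_e2
eta = sigma_mv2 / 1.8363               # actual output variance for K = 0.5
print(sigma_mv2, eta)                  # approximately 1.62 and 0.88, matching Eq. 8.9
```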
Comparing Equations 8.9 and 8.10, one can see that the estimated minimum variance is essentially the same despite the different control parameters used; it is thus indeed feedback-control invariant. The following is an algorithm for SISO feedback control performance assessment using MATLAB:

Algorithm 8.2. SISO feedback control performance assessment:
1. Sample representative routine operating data from the closed-loop system to be evaluated, when there is no setpoint change. Subtract the mean from the data.
2. Perform time series analysis of this set of data using MATLAB functions such as n4sid, refining the result by pem, to identify a time series model. Make sure to find a stable model, as the closed-loop system is stable. Several iterations of the modeling procedure may be necessary to obtain an appropriate model. Let's call this estimated time series model Ĝ_cl.
3. Obtain the time delay, d, according to a priori process knowledge.
4. Expand Ĝ_cl into an impulse response model using MATLAB functions such as impulse, so that
\[
\hat{G}_{cl} = \hat{f}_0 + \hat{f}_1 q^{-1} + \cdots + \hat{f}_{d-1} q^{-d+1} + \cdots
\]
5. Obtain the variance of the noise (residuals), σ_e², directly from the model object after applying the MATLAB function n4sid or pem.
6. Calculate the minimum variance as
\[
\hat{\sigma}^2_{mv} = (\hat{f}_0^2 + \hat{f}_1^2 + \cdots + \hat{f}_{d-1}^2)\,\sigma_e^2
\]
7. Estimate the actual variance σ_y² using MATLAB functions such as covar, with Ĝ_cl and σ_e² as input arguments.
8. The performance index can be calculated as
\[
\eta = \frac{\hat{\sigma}^2_{mv}}{\sigma_y^2}
\]
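A self-contained sketch of the same procedure (Python/NumPy rather than MATLAB; a high-order AR model fitted by least squares stands in for the n4sid/pem step, and the data are simulated from the Example 8.1 closed loop for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
# Simulate closed-loop data from Example 8.1 with K = 0.5:
# y_t = y_{t-1} - 0.5 y_{t-2} + e_t - 0.2 e_{t-1}
N, d = 8000, 2
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = y[t-1] - 0.5*y[t-2] + e[t] - 0.2*e[t-1]
y -= y.mean()

# Step 2 stand-in: fit a high-order AR model y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + e_t
p = 30
X = np.column_stack([y[p-i:N-i] for i in range(1, p+1)])
a = np.linalg.lstsq(X, y[p:], rcond=None)[0]
res = y[p:] - X @ a
sigma_e2 = res.var()                 # step 5: residual variance

# Step 4: impulse response of 1/A(z) by long division, f_0, ..., f_{d-1}
f = np.zeros(d)
f[0] = 1.0
for k in range(1, d):
    f[k] = sum(a[i-1] * f[k-i] for i in range(1, k+1))

sigma_mv2 = (f**2).sum() * sigma_e2  # step 6
eta = sigma_mv2 / y.var()            # steps 7-8; roughly 0.9 for this example
```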
8.3 MIMO Feedback Control Performance Assessment

Huang et al. (1997) [114] developed a filtering and correlation analysis (FCOR) algorithm for MIMO feedback control performance assessment. Harris et al. (1996) [112] developed a parallel approach based on a spectral interactor and spectral factorization, which can also extract the minimum variance term from routine operating data. Since then, Ko and Edgar (2001) [25] have developed an algorithm that avoids the direct use of the interactor matrix by embedding the calculation of the interactor matrix into algebraic manipulation of the Markov parameters of the process model. McNabb and Qin (2003) [27] developed an alternative approach for minimum-variance benchmarking based on state space models estimated from subspace identification. Huang et al. (2005) [115] proposed an estimation of bounds on the minimum variance without knowing the interactor matrix, although the order of the interactor matrix is required. In this section, conventional MIMO feedback control performance assessment is introduced in tutorial form.

A MIMO process can be modeled as
\[
y_t = G_p u_t + G_l e_t \tag{8.11}
\]
where Gp and Gl are proper, rational transfer function matrices for the plant and noise, respectively; yt is an output vector and ut an input vector. et represents a white noise vector with a zero mean and a covariance matrix of Σe .
Furthermore, if G_p is a proper, full-rank transfer function matrix, a unitary interactor matrix D can be factored out such that DG_p = \tilde{G}_p, where \tilde{G}_p is the delay-free transfer function matrix of G_p. Therefore, Equation 8.11 can be expressed as
\[
y_t = G_p u_t + G_l e_t = D^{-1}\tilde{G}_p u_t + G_l e_t \tag{8.12}
\]
Pre-multiplying both sides of Equation 8.12 by z^{-d}D, where d is the order of the interactor matrix D, defined as the maximum power of z in D, gives
\[
z^{-d}D y_t = z^{-d}\tilde{G}_p u_t + z^{-d}D G_l e_t \tag{8.13}
\]
Let y˜t = z −d Dyt and G˜l = z −d DGl , Equation 8.13 becomes y˜t = z −d G˜p ut + G˜l et .
(8.14)
Huang and Shah (1999) [22] showed that, since D is a unitary interactor matrix, the minimum variance control law that minimizes the objective function of the interactor-filtered variable \tilde{y}_t, namely J_1 = E(\tilde{y}_t^T \tilde{y}_t), also minimizes the objective function of the original variable y_t, J_2 = E(y_t^T y_t), and J_1 = J_2, which means that E(\tilde{y}_t^T \tilde{y}_t) = E(y_t^T y_t). It is further shown that if one writes \tilde{G}_l as

\tilde{G}_l = F_0 + F_1 z^{-1} + \ldots + F_{d-1} z^{-d+1} + z^{-d}R = F + z^{-d}R   (8.15)

then the minimum variance can be calculated as

\min E(\tilde{y}_t^T \tilde{y}_t) = trace[Cov(F e_t)]

Furthermore, if a transfer function \hat{G}_{cl} is estimated from time series modeling of the output data y_t (after mean-centering), then, multiplying it by z^{-d}D and expanding the product into a Markov parameter form, z^{-d}D\hat{G}_{cl} can be written as

z^{-d}D\hat{G}_{cl} = F + z^{-d}L

i.e., its first d terms, represented by F, are the same as those expanded from \tilde{G}_l shown in Equation 8.15. This property is also known as the feedback-control invariance property [112, 114]. Therefore, the minimum variance term F e_t can be estimated from routine operating data y_t. The following is an algorithm for MIMO feedback control performance assessment using MATLAB:

Algorithm 8.3. MIMO feedback control performance assessment:

1. Sample representative routine operating data from the closed-loop control system to be evaluated, when there is no setpoint change. Subtract the mean from the data.
2. Perform time series analysis of this set of data using MATLAB functions such as n4sid, refining the resulting model by pem, to identify a multivariate time series model. Make sure to find a stable model. Several iterations of the modeling procedure may be necessary to obtain an appropriate model. Let's call this estimated time series model \hat{G}_{cl}.
3. Obtain the interactor matrix, D, from a priori process knowledge, using the algorithm discussed in Huang and Shah (1999) [22] based on the open-loop model G_p, or based on identification from closed-loop experiment data [22].
4. Multiply \hat{G}_{cl} by the interactor matrix q^{-d}D and then expand the product into a Markov parameter form using MATLAB functions such as impulse to obtain

q^{-d}D\hat{G}_{cl} = F_0 + F_1 q^{-1} + \ldots + F_{d-1} q^{-(d-1)} + \ldots

5. Extract the covariance matrix of the noise (residuals), \Sigma_e, from the model object obtained after applying the MATLAB function n4sid or pem.
6. Calculate the minimum variance matrix for the interactor-filtered output \tilde{y}_t as

\tilde{\Sigma}_{mv} = F_0 \Sigma_e F_0^T + F_1 \Sigma_e F_1^T + \ldots + F_{d-1} \Sigma_e F_{d-1}^T

7. Estimate the actual variance \Sigma_y using MATLAB functions such as covar, with \hat{G}_{cl} and \Sigma_e as the input arguments.
8. Since E(y^T y) = E(\tilde{y}^T \tilde{y}), the multivariate performance index can be calculated as

\eta = \frac{trace(\tilde{\Sigma}_{mv})}{trace(\Sigma_y)}

If only the multivariate performance index is of interest, the computation is complete at this point. However, if one is also interested in estimating performance indices for each individual output, the following additional steps are needed.
9. Write the interactor matrix in the form

D = D_0 q^d + D_1 q^{d-1} + \ldots + D_{d-1} q

and perform the following matrix calculation to obtain the matrices E_0, E_1, \ldots, E_{d-1}:

[E_0, E_1, \cdots, E_{d-1}] = [D_0^T, D_1^T, \cdots, D_{d-1}^T]
\begin{pmatrix}
F_0 & F_1 & \cdots & F_{d-1}\\
F_1 & F_2 & \cdots & 0\\
\vdots & \vdots & & \vdots\\
F_{d-1} & 0 & \cdots & 0
\end{pmatrix}
10. Then the minimum variance matrix of the original output y_t can be calculated as

\Sigma_{mv} = E_0 \Sigma_e E_0^T + E_1 \Sigma_e E_1^T + \ldots + E_{d-1} \Sigma_e E_{d-1}^T
11. Finally, the performance indices for each individual output can be calculated as

\eta_{1:m} = [diag(\Sigma_{mv})] .* [diag(\Sigma_y)]^{-1}

where .* is an element-by-element multiplication operator following MATLAB notation.

Example 8.4. Consider a 2 × 2 multivariable process with open-loop transfer function matrix G_p and disturbance transfer function matrix G_l given by^1

G_p = \begin{pmatrix} \frac{z^{-1}}{1-0.4z^{-1}} & \frac{4z^{-2}}{1-0.1z^{-1}} \\ \frac{0.3z^{-1}}{1-0.1z^{-1}} & \frac{z^{-2}}{1-0.8z^{-1}} \end{pmatrix}

G_l = \begin{pmatrix} \frac{1}{1-0.5z^{-1}} & \frac{-0.6z^{-1}}{1-0.5z^{-1}} \\ \frac{0.5z^{-1}}{1-0.5z^{-1}} & \frac{1.0}{1-0.5z^{-1}} \end{pmatrix}

The white noise, e_t, is two-dimensional and normally distributed with \Sigma_e = I. The output performance is measured by J = E[y_t^T y_t]. The controller consists of two multi-loop minimum variance controllers designed according to the two single loops without considering their interaction:

G_c = \begin{pmatrix} \frac{0.5-0.20z^{-1}}{1-0.5z^{-1}} & 0 \\ 0 & \frac{0.25-0.200z^{-1}}{(1-0.5z^{-1})(1+0.5z^{-1})} \end{pmatrix}

A unitary interactor matrix D can be factored out from G_p as

D = \begin{pmatrix} -0.9578z & -0.2873z \\ -0.2873z^2 & 0.9578z^2 \end{pmatrix}
Then z^{-d}DG_l, where d = 2, is given by

z^{-d}DG_l = \begin{pmatrix} \frac{-1.1015z^{-1}}{1-0.5z^{-1}} & \frac{-0.2873z^{-1}+0.5747z^{-2}}{1-0.5z^{-1}} \\ \frac{-0.2873+0.4789z^{-1}}{1-0.5z^{-1}} & \frac{0.9578+0.1724z^{-1}}{1-0.5z^{-1}} \end{pmatrix}   (8.16)

This can be further expanded as

z^{-d}DG_l = \begin{pmatrix} 0 & 0 \\ -0.2873 & 0.9578 \end{pmatrix} + \begin{pmatrix} -1.1015 & -0.2873 \\ 0.3352 & 0.6513 \end{pmatrix} z^{-1} + \ldots   (8.17)

^1 The off-diagonal elements of G_l need to have at least one sample delay if the System Identification Toolbox is used for the computation, since a standard assumption of the toolbox is G_l(z^{-1} = 0) = I.
Thus, the feedback controller-invariant term is

F e_t = \begin{pmatrix} -1.1015z^{-1} & -0.2873z^{-1} \\ -0.2873 + 0.3352z^{-1} & 0.9578 + 0.6513z^{-1} \end{pmatrix} e_t
The theoretical variance matrix under minimum variance control can be calculated as

\begin{pmatrix} 1.2958 & -0.5563 \\ -0.5563 & 1.5365 \end{pmatrix}
This is the theoretical (exact) minimum variance against which the performance of the feedback controller is assessed. Now a set of closed-loop data (5000 samples) is simulated from the system stated above. By applying n4sid followed by pem to the mean-centered data, a multivariate time series model \hat{G}_{cl} is estimated. The covariance of the residuals is extracted from the model object as

\Sigma_e = \begin{pmatrix} 1.0149 & 0.0073 \\ 0.0073 & 1.0133 \end{pmatrix}

Using the impulse function of MATLAB, z^{-d}D\hat{G}_{cl} (where d = 2) can be expanded into the Markov parameter form:

z^{-d}D\hat{G}_{cl} = \begin{pmatrix} 0 & 0 \\ -0.2843 & 0.9648 \end{pmatrix} + \begin{pmatrix} -0.9651 & -0.2894 \\ 0.3358 & 0.6751 \end{pmatrix} z^{-1} + \ldots
Comparing this to Equation 8.17, one can see that the first d terms (d = 2 in this example) are indeed feedback-control invariant. The following results are obtained according to Algorithm 8.3:

\tilde{\Sigma}_{mv} = \begin{pmatrix} 1.0341 & -0.5322 \\ -0.5322 & 1.6001 \end{pmatrix}

\Sigma_{mv} = \begin{pmatrix} 1.0745 & -0.1462 \\ -0.1462 & 1.5596 \end{pmatrix}

\Sigma_Y = \begin{pmatrix} 3.1688 & -0.2488 \\ -0.2488 & 1.5241 \end{pmatrix}
With these intermediate results, the performance indices can be calculated as

\eta = \frac{trace(\tilde{\Sigma}_{mv})}{trace(\Sigma_Y)} = \frac{trace(\Sigma_{mv})}{trace(\Sigma_Y)} = 0.5613

\begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} = [diag(\Sigma_{mv})] .* [diag(\Sigma_Y)]^{-1} = \begin{pmatrix} 0.3391 \\ 1.0233 \end{pmatrix}
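As a computational illustration, the following MATLAB sketch carries out steps 6-11 of Algorithm 8.3, assuming the first d Markov parameter matrices F{1},...,F{d} of q^{-d}D\hat{G}_{cl}, the interactor blocks Dq{1},...,Dq{d} (i.e., D_0,...,D_{d-1}), and the covariances Sigma_e and Sigma_y have already been computed; the variable names are hypothetical.

```matlab
% Sketch of steps 6-11 of Algorithm 8.3 for an m-output process.
Smv_tilde = zeros(m);
for i = 1:d
    Smv_tilde = Smv_tilde + F{i}*Sigma_e*F{i}';      % step 6
end
eta = trace(Smv_tilde)/trace(Sigma_y);               % step 8: overall index
% Steps 9-10: E-matrices from the block-Hankel matrix of the F's
bigF = cell(d, d);
for r = 1:d
    for c = 1:d
        if r + c - 1 <= d
            bigF{r, c} = F{r + c - 1};
        else
            bigF{r, c} = zeros(m);                   % zeros below anti-diagonal
        end
    end
end
DT = cell2mat(cellfun(@transpose, Dq, 'UniformOutput', false));
E  = DT*cell2mat(bigF);                              % [E0, E1, ..., E_{d-1}]
Smv = zeros(m);
for i = 1:d
    Ei = E(:, (i-1)*m+1 : i*m);
    Smv = Smv + Ei*Sigma_e*Ei';                      % step 10
end
eta_i = diag(Smv)./diag(Sigma_y);                    % step 11: per-output indices
```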
8.4 Summary

A tutorial on control loop performance assessment has been presented in this chapter. It has demonstrated that there exists a term that is independent of feedback control, and this term is the minimum variance of the feedback control loop. Two algorithms, for SISO and MIMO feedback control performance assessment respectively, were presented, followed by illustrative examples with a step-by-step guide for calculating the minimum variance. The techniques presented in this chapter are limited to minimum variance control benchmarking; however, this benchmark constitutes a foundation for control performance assessment. Other advanced and recently developed methods will be discussed in the next few chapters.
9 State-of-the-art MPC Performance Monitoring
9.1 Introduction

Among the many algorithms for control performance monitoring, minimum-variance-control based performance monitoring, as discussed in Chapter 8, remains the most popular owing to its non-intrusiveness. While univariate performance monitoring has found widespread application, its multivariate counterpart has found limited application owing to the complexity of the algorithms and the demand for more a priori information. Recently several simplified multivariate performance monitoring algorithms have been developed, and the prospects for practical applications appear promising. Just as PID is the most widely accepted univariate control algorithm and needs special attention to its control performance, MPC is the most widely accepted multivariate control algorithm, and its performance monitoring technology is in great demand. Some special features of MPC include constraint handling and economic optimization, which require special treatment when monitoring its control performance.

It is often debated whether minimum variance control is a suitable benchmark for MPC. The main argument lies in the fact that minimum variance control has a different control structure or objective from MPC; for example, it does not handle hard constraints. However, minimum variance control yields an absolute lower bound of process variance, and this lower bound is independent of which control law is used. If the output variance of a control system is near minimum variance, no control tuning will improve the variance, no matter what type of control algorithm has been or will be used, and no matter whether the control algorithm is constrained or not. How the variance reduction is translated into MPC's benefit potential, and how MPC's performance can be assessed through the variance, are research topics of current interest, and answers to these problems are the key to making minimum variance benchmarking useful for MPC monitoring.

In addition to variance monitoring, many algorithms and case studies for MPC performance monitoring have been reported in the literature. Broadly speaking, they can be classified into two groups, namely model-based
and model-free approaches. Among the model-based approaches, there are minimum variance control benchmarking, generalized minimum variance control benchmarking, the LQG (linear quadratic Gaussian)/MPC tradeoff curve approach, the design objective versus achieved objective approach, historical-data benchmarking, on-line model validation, and the model-based simulation approach. The common feature of this group of algorithms is the need for process models or interactor matrices. The model-free group includes impulse response curvature, prediction error curvature, Markov chains, and statistical data analysis. This group of algorithms does not assess control performance against a benchmark control and is nonparametric in general. No models or interactor matrices are needed; thus these methods are attractive for practical applications. Most recently, economic performance assessment of MPC has also received attention [116, 117, 118].

A recent work by Ordys et al. (2007) [119] gives control performance monitoring a broader scope than merely evaluating the controller performance. What is discussed in [119] includes business performance benchmarking, economic auditing of control systems, plantwide disturbance diagnosis, oscillation detection, and valve stiction monitoring. As such, the research on control performance monitoring has gone beyond the traditional scope.

Another interesting way of considering MPC performance monitoring is through on-line model validation. Since MPC is an optimizing algorithm based on models, provided the MPC objective function is well formulated, MPC will be optimal by nature if the models are correct. Therefore, if the MPC objective function has been designed well, the model that is being used by MPC needs more attention. Following these arguments, monitoring of MPC performance boils down to two basic problems: evaluation and validation. Evaluation is for the MPC objective function, and validation is for the model used by MPC through the real data. Parameters that affect controller objective functions are often called tuning parameters. Factors that may affect model quality include process nonlinearity, instrument malfunctions, time-varying process and/or disturbance dynamics, inadequate system identification, etc. As pointed out by Jelali (2006) [120], a fundamental question related to the performance assessment of MPC is whether detected poor control performance is due to bad controller tuning or inaccurate modeling. Notably, the model validation problem has been reiterated in Kozub (1997) [121], Kesavan and Lee (1997) [122], and Dumont et al. (2002) [123].
9.2 MPC Performance Monitoring: Model-based Approach

9.2.1 Minimum-variance Control Benchmark
Minimum variance control continues to be the most frequently used benchmark for feedback control performance assessment owing to its theoretical elegance, non-intrusiveness and ability to provide the absolute lower bound of process variance. It originated from Harris’s work [19] for univariate control loop
performance assessment. It has since been extended to multivariate control systems by [112, 114, 25, 27]. A tutorial on this topic has been provided in Chapter 8. The drawback of the minimum variance benchmark, however, is the requirement of the interactor matrix or its equivalent, which is conceptually difficult and computationally challenging. In addition, minimum variance control is an aggressive control and is rarely implemented in practice; thus its control objective may not be directly compatible with that of MPC.

9.2.2 LQG/MPC Benchmark
Practical constraints on control systems, such as valve constraints, imply that the performance of minimum variance control is most likely not achievable in practice, and the best performance of practical controllers is often below that of minimum variance control. Thus, using minimum variance as a benchmark to evaluate a practical control system such as MPC tends to underestimate the performance. In general, tighter quality specifications result in smaller variation in the process output but typically require more control effort. It may therefore be interesting to know how far the control performance is from the "best" achievable performance for any pre-specified control effort; i.e., in mathematical form, the resolution of the following problem may be of interest [22]:

Given ||u_t||^2 \le \alpha, what is \min\{||r_t - y_t||^2\}?

The solution is given by a tradeoff curve as shown in Figure 9.1. This curve can be obtained by solving the LQG problem [124, 125], where the LQG objective function is defined by

J(\lambda) = E[(r_t - y_t)^2] + \lambda E[u_t^2]

By varying \lambda, various optimal solutions of E[(r_t - y_t)^2] and E[u_t^2] can be calculated. Thus a curve with the optimal E[u_t^2] as the abscissa and E[(r_t - y_t)^2] as the ordinate is formed from these solutions. Any linear controller can only operate in the region above the tradeoff curve [125], shown in Figure 9.1. It is clear that, given E[u_t^2] = \alpha, the minimum value (or the Pareto optimal value [125]) of E[(r_t - y_t)^2] can be found from this curve. This curve therefore represents the limit of performance and can be used for performance assessment purposes [22]. The LQG-related performance assessment problem has also been studied by [126] from a signal processing point of view. A similar approach, known as generalized minimum variance benchmarking, was investigated by [127]. Applications of the LQG benchmark can be found in a number of case studies, for example in [128].

By considering a special property of MPC, namely that the disturbance model is normally assumed to be a random walk, Julien et al. (2004) [129] modified the LQG curve to form an MPC tradeoff curve. Since the random-walk disturbance model may differ from the real disturbance model, the MPC tradeoff curve lies above the LQG curve. They further show that it is possible to update process
models and disturbance models using routine operating data, provided certain conditions on the disturbance dynamics are satisfied.

The advantage of the LQG/MPC benchmark is its ability to identify the achievable minimum variance subject to input variance constraints; thus it is a more realistic benchmark than minimum variance control when evaluating constrained control such as MPC. The downside is its requirement of a complete process model and the fact that it can only address input variance constraints, i.e., soft constraints, but not hard constraints.

Fig. 9.1. An LQG tradeoff curve (axes: Var(u) versus Var(y); any linear controller operates in the achievable performance region above the curve, whose lower-right limit corresponds to minimum variance)
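As a concrete illustration of how such a curve can be traced, the following MATLAB sketch sweeps \lambda for a hypothetical first-order innovations model (A, B, C, K); the regulation case (r_t = 0) and unit innovation variance are assumptions made for brevity, not part of the original development.

```matlab
% Sketch: LQG tradeoff curve for x(t+1) = A*x + B*u + K*e, y = C*x + e,
% with Var(e) = 1 and r_t = 0, so E[(r-y)^2] = Var(y).
A = 0.9; B = 1; C = 1; K = 0.5;        % hypothetical first-order example
lambdas = logspace(-2, 2, 30);
varY = zeros(size(lambdas)); varU = zeros(size(lambdas));
for i = 1:numel(lambdas)
    L = dlqr(A, B, C'*C, lambdas(i));  % gain minimizing E[y^2] + lambda*E[u^2]
    P = dlyap(A - B*L, K*K');          % steady-state covariance of predictor state
    varU(i) = L*P*L';                  % optimal E[u^2]
    varY(i) = C*P*C' + 1;              % y = C*xhat + e, Var(e) = 1
end
plot(varU, varY); xlabel('Var(u)'); ylabel('Var(y)');   % the tradeoff curve
```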
9.2.3 Model-based Simulation Approach
If process models are given, a natural approach towards performance assessment of MPC is through direct simulation. Ko and Edgar (2001) [130] studied this approach. The disturbance models can be unknown but may be estimated from the output data with known process models. With both process and disturbance models available, the actual trajectory of MPC can be simulated, and statistical properties such as variance/covariance can be calculated. The authors [130] considered performance assessment of MPC from the following four perspectives:

1. Benchmark based on unconstrained minimum variance control, which is the same as the conventional minimum variance benchmark.
2. Benchmark based on constrained minimum variance control. This provides a more realistic benchmark than the unconstrained one, at the cost of more a priori process information being required.
3. Benchmark based on constrained model predictive control, which is directly applicable to MPC performance monitoring.
4. Comparison of the differences among the above three benchmarks, and study of performance change due to changes in prediction/control horizons and constraints.
The advantage of this approach is that it directly targets MPC and is able to consider the hard constraints. The limitation is the requirement of complete process models; furthermore, the results will be more sensitive to model errors since it is a simulation-based approach.

9.2.4 Designed/Historical vs Achieved
Design of MPC is an optimization of its objective function, typically in a quadratic form. Monitoring of MPC performance may be conducted by comparing the achieved MPC performance objective versus the designed one [131]. Specifically, a multivariate MPC objective function may be written as

J_{des} = \sum_{j=N_1}^{N_2} [r_{t+j} - \hat{y}(t+j|t)]^T Q [r_{t+j} - \hat{y}(t+j|t)] + \sum_{j=1}^{N_u} [\Delta u_{t+j-1}]^T R [\Delta u_{t+j-1}]

The achieved performance objective function can be calculated from input-output data as

J_{ach} = \sum_{j=N_1}^{N_2} [r_{t+j} - y_{t+j}]^T Q [r_{t+j} - y_{t+j}] + \sum_{j=1}^{N_u} [\Delta u_{t+j-1}]^T R [\Delta u_{t+j-1}]

Thus a performance index can be defined as

\eta_{des} = \frac{J_{des}}{J_{ach}}
While J_{ach} may be calculated from actual input-output data, calculation of J_{des} requires the complete process model. To avoid the modeling problem, J_{des} has also been chosen as an estimate of J_{ach} from a set of historical data with "acceptable" performance, which is similar to the historical benchmark [22]. However, calculation of J_{ach} can also be a problem in practice owing to its randomness and possibly large variability. Zhang and Henson (1999) [132] proposed time series modeling of \eta_{des} and then performing a residual test to determine the change of \eta_{des}. A case study was performed by Schafer and Cinar [128] to compare various approaches.

The advantage of this designed-vs-achieved approach is its relative simplicity. However, a control objective function may only be a subjective wish, and is often treated as a tuning "parameter" for MPC design; it may not be directly linked to the truly desired objective, such as economic benefits. Thus, achieving the designed objective does not necessarily mean achieving the optimal performance.

9.2.5 Historical Covariance Benchmark
A historical covariance benchmark is proposed in [24], denoting benchmark data as period I and monitored data as period II. A measure for detecting the change
of covariance as proposed by [24] is to monitor the ratio of the determinants of the covariance matrices:

I_v = \frac{det\{cov(y_{II})\}}{det\{cov(y_I)\}}

It is suggested that, if the ratio is greater than one, the performance of the monitored period is in general worse than that of the benchmark period, and the worst-performance direction of the monitored period should be examined. The direction along which the largest variance inflation occurs is given by

p = \arg\max_p \frac{p^T cov(y_{II}) p}{p^T cov(y_I) p}
The solution is given by the generalized eigenvector problem

cov(y_{II}) p = \mu \, cov(y_I) p

where \mu is the generalized eigenvalue and p is the corresponding eigenvector. The solution maximizes the covariance ratio, indicating that the covariance of the monitored data deviates the most from the benchmark data along the direction of p. Thus, this approach can be used to identify the direction that has the largest variance inflation. The advantage of a historical benchmark is simplicity and the fact that no a priori information is needed. The disadvantage, as also noted by [24], is that it does not establish an absolute lower bound, and the selection of the historical benchmark is subjective.
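Both the determinant ratio and the worst direction are directly computable. The following MATLAB sketch illustrates the calculation on hypothetical 2 × 2 sample covariance matrices from the two periods.

```matlab
% Sketch: worst-direction analysis for the historical covariance benchmark.
covI  = [1.0 0.2; 0.2 1.5];         % benchmark period (hypothetical values)
covII = [2.5 0.1; 0.1 1.6];         % monitored period (hypothetical values)
Iv = det(covII)/det(covI);          % determinant ratio I_v
[V, M] = eig(covII, covI);          % generalized eigenproblem cov_II*p = mu*cov_I*p
[mu, k] = max(diag(M));             % largest variance-inflation ratio
p = V(:, k)/norm(V(:, k));          % direction of largest inflation
```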
9.2.6 MPC Performance Monitoring through Model Validation
MPC, an acronym for model predictive control, is model-critical by nature. Thus, model quality has been considered one of the two essential aspects of MPC performance monitoring [121, 122, 123]. There are many model validation algorithms available in the literature. Most are designed for off-line model validation immediately after model identification; few are for on-line monitoring purposes. A notable work is by Basseville (1998) [133], who proposed maximum likelihood tests for local model parameter validation. Termed the local approach, the monitoring algorithm converts multiplicative changes (parameter changes) into an equivalence of additive changes; thus it greatly simplifies the on-line computations. The local approach originally proposed by [133] has been extended to model-predictive-control-relevant model validation in [134] and disturbance-dynamics-independent model validation in [135], and is summarized below.

Consider a process model given by

y_t = G_p u_t + G_l e_t   (9.1)
where Gp and Gl are process and disturbance models in transfer function form, respectively.
The optimal one-step-ahead predictor can be derived as

\hat{y}(t|t-1) = (I - G_l^{-1}) y_t + G_l^{-1} G_p u_t

and the optimal one-step-ahead prediction error is given by

\epsilon(t|t-1) = y_t - \hat{y}(t|t-1) = G_l^{-1}(y_t - G_p u_t)

In general, system identification seeks G_p and G_l that minimize the following prediction-error objective function:

J = \frac{1}{N}\sum_{t=1}^{N} \frac{1}{2}\epsilon^2(t|t-1)   (9.2)

which is known as the prediction error method [11]. The gradient of Equation 9.2 can be calculated as

\frac{\partial J}{\partial \theta} = -\frac{1}{N}\sum_{t=1}^{N} \left(\frac{\partial \hat{y}(t|t-1)}{\partial \theta}\right)^T \epsilon(t|t-1)   (9.3)

where \theta = [\theta_{G_p}^T, \theta_{G_l}^T]^T, and \theta_{G_p}, \theta_{G_l} are the parameter vectors of G_p and G_l, respectively. Assume that \theta_0 is calculated by equating Equation 9.3 to zero. Then the estimate \theta_0 based on the prediction error method that minimizes Equation 9.2 must satisfy

\frac{\partial J}{\partial \theta}\Big|_{\theta_0} = -\left[\frac{1}{N}\sum_{t=1}^{N}\left(\frac{\partial \hat{y}(t|t-1)}{\partial \theta}\right)^T \epsilon(t|t-1)\right]_{\theta=\theta_0} = 0   (9.4)

As the sample size increases, Equation 9.4 is asymptotically equivalent to

E\left[\left(\frac{\partial \hat{y}(t|t-1)}{\partial \theta}\right)^T \epsilon(t|t-1)\right]_{\theta=\theta_0} = 0   (9.5)

Obviously, for any new data generated from Equation 9.1, if \theta = \theta_0, then Equation 9.5 should continue to hold. If, on the other hand, \theta \neq \theta_0, then Equation 9.5 will be non-zero. Hence, the model validation problem is reduced to that of monitoring the mean of a vector defined as

H(y_t, u_t, \theta) = \left(\frac{\partial \hat{y}(t|t-1)}{\partial \theta}\right)^T \epsilon(t|t-1)   (9.6)

This is the so-called primary residual as defined in [133]. To reduce the false alarm rate, the actual monitoring can be done by defining the normalized residual

\xi_N(\theta) = \frac{1}{\sqrt{N}}\sum_{t=1}^{N} H(y_t, u_t, \theta)   (9.7)
With these transformations, the change detection problem can be formulated as the following statistical test:

H_0: \theta = \theta_0   versus   H_1: \theta = \theta_0 + \frac{1}{\sqrt{N}}\eta

where \eta is the change of model parameters to be detected. It is shown in [133] that the normalized residual has the following asymptotic distribution:

\xi_N(\theta_0) \sim N(-M(\theta_0)\eta, \Sigma(\theta_0))   (9.8)

where

M(\theta) = E\left(\frac{\partial}{\partial \theta} H(y_t, u_t, \theta)\right)

\Sigma(\theta) = \sum_{t=-\infty}^{\infty} Cov(H(y_1, u_1, \theta), H(y_t, u_t, \theta))

Thus detection of parameter changes is transferred to monitoring the mean of the normalized residual, and the maximum likelihood ratio test can be applied. In practice, M(\theta_0) may be approximated by

M(\theta_0) \approx \frac{1}{N}\sum_{t=1}^{N}\left[\frac{\partial}{\partial \theta} H(y_t, u_t, \theta)\right]_{\theta=\theta_0}

and \Sigma(\theta_0) may be approximated by

\Sigma(\theta_0) \approx \frac{1}{N}\sum_{t=1}^{N} H(y_t, u_t, \theta_0) H^T(y_t, u_t, \theta_0) + \sum_{i=1}^{I}\frac{1}{N-i}\sum_{t=1}^{N-i}\left(H(y_t, u_t, \theta_0)H^T(y_{t+i}, u_{t+i}, \theta_0) + H(y_{t+i}, u_{t+i}, \theta_0)H^T(y_t, u_t, \theta_0)\right)

where the value I should be properly selected according to the correlation of the signals; typically, one can increase the value of I gradually until the result converges. With these results, detection of changes in the parameters \theta is asymptotically equivalent to detection of changes in the mean of a Gaussian vector. The generalized likelihood ratio (GLR) test for detecting unknown changes in the mean of a Gaussian vector is a \chi^2 test. It can be shown that the GLR test of H_1 against H_0 can be written as

\chi^2_{global} = \xi_N(\theta_0)^T \Sigma^{-1}(\theta_0) M(\theta_0)\left(M^T(\theta_0)\Sigma^{-1}(\theta_0)M(\theta_0)\right)^{-1} M^T(\theta_0)\Sigma^{-1}(\theta_0)\,\xi_N(\theta_0)
If M(\theta_0) is a square matrix, then this test can be simplified to

\chi^2_{global} = \xi_N(\theta_0)^T \Sigma^{-1}(\theta_0)\,\xi_N(\theta_0)

where \chi^2_{global} has a central \chi^2 distribution under H_0 and a noncentral \chi^2 distribution under H_1. The degrees of freedom of \chi^2_{global} equal the dimension of \theta. A threshold value \chi^2_\alpha can be found from a \chi^2 table, where \alpha is the false alarm rate specified by the user. If \chi^2_{global} is found to be larger than the threshold value, then a change in the parameters is detected.

In many practical applications, it may be desirable that the model validation algorithm should not raise an alarm if parameter changes occur only in the disturbance models. Huang (2000) [135] has demonstrated that, by associating the detection algorithm with the output error method in system identification [11], the model validation algorithm based on the local approach becomes disturbance-dynamics independent. Along the same lines, if parameter changes do not affect control performance, then an alarm should not be issued. This leads to control-relevant model validation, specifically MPC-relevant model validation [134]. Instead of using the original input-output data directly, data filtered by an MPC-relevant filter are sent to the model validation algorithm, so that the validation algorithm is sensitive only to parameter changes critical to control performance.
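As a minimal numerical sketch of the simplified test, suppose the primary residuals have been stacked row-wise into an N × p matrix H; the zero-lag approximation of \Sigma(\theta_0) (i.e., I = 0) is an assumption made here for brevity.

```matlab
% Sketch: simplified chi-square model-validation test from primary residuals.
% H is an N-by-p matrix whose rows are H(y_t, u_t, theta0)' (assumed given).
[N, p] = size(H);
xiN = sum(H, 1)'/sqrt(N);               % normalized residual (Equation 9.7)
Sigma0 = (H'*H)/N;                      % zero-lag covariance approximation
chi2_global = xiN'*(Sigma0\xiN);        % square-M(theta0) case of the GLR test
alpha = 0.05;                           % user-specified false alarm rate
alarm = chi2_global > chi2inv(1 - alpha, p);   % detect a parameter change
```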
9.3 MPC Performance Monitoring: Model-free Approach

Model-based approaches are attractive and give more insight into control performance if the models are indeed available. However, practical applications do not always have models, or at least accurate models, to access. One most likely faces piles of routine operating data without knowing the models; thus the model-free approach becomes an important means for control performance monitoring in practice. There are several model-free approaches available in the literature, mainly based on the closed-loop impulse response [22, 136, 23] and the variance of multi-step prediction errors [112, 137, 136]. Some are simply based on conventional data analysis and signal processing techniques [138]. Earlier work using the model-free approach for performance monitoring may be traced back to [139, 140]. In this section, we shall review some recent developments in this direction.

9.3.1 Impulse-Response Curvature
As discussed in Chapter 8, if G_{cl} is identified from routine operating data y_t, multiplying it with the interactor matrix yields

z^{-d}DG_{cl} = F + z^{-d}L = F_0 + F_1 z^{-1} + \ldots + F_{d-1}z^{-d+1} + F_d z^{-d} + F_{d+1}z^{-d-1} + \ldots

Therefore, the interactor-filtered closed-loop response can be written as

\tilde{y}_t = F_0 e_t + F_1 e_{t-1} + \ldots + F_{d-1}e_{t-d+1} + F_d e_{t-d} + F_{d+1}e_{t-d-1} + \ldots
The coefficient matrices F_0, F_1, \ldots are the impulse response coefficient matrices of z^{-d}DG_{cl}. A normalized 2-norm measure of these coefficient matrices is defined as

r_i = \sqrt{trace(F_i \Sigma_e F_i^T)}
where \Sigma_e = Cov[e_t]. A normalized impulse response curve is the plot of r_i versus i [22]. An impulse response curve represents the dynamic relationship between the whitened disturbance and the process output, and typically reflects how well the controller regulates stochastic disturbances. In the univariate case, the first d impulse response coefficients are feedback-controller invariant, where d is the process time delay. Therefore, if the loop is under minimum variance control, the impulse response coefficients should be zero after d − 1 lags. The normalized multivariate impulse response (NMIR) curve [22] reflects this idea. The first d NMIR coefficients are feedback-controller invariant, where d is the order of the interactor matrix [22]. If the loop is under multivariate minimum variance control, then the NMIR coefficients should decay to zero after d − 1 lags. The sum of squares under the NMIR curve is equivalent to the trace of the covariance matrix of the data. Similar to the conventional impulse response curve, the NMIR can be used to assess control performance of MIMO systems. Shah et al. (2002) [23] further proposed to calculate the NMIR not from z^{-d}DG_{cl} but from G_{cl} directly, thus avoiding the use of the interactor matrix. This latter approach does not have a direct connection with minimum variance control performance, but it does yield a convenient, approximate measure of control performance.
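A sketch of the NMIR computation, assuming the Markov parameter matrices are available in a cell array F and \Sigma_e has been estimated; the square root follows the normalized 2-norm definition above.

```matlab
% Sketch: NMIR coefficients from Markov parameter matrices F{i} of
% z^(-d)*D*Gcl (or of Gcl directly, per Shah et al. (2002) [23]).
nF = numel(F);
r = zeros(nF, 1);
for i = 1:nF
    r(i) = sqrt(trace(F{i}*Sigma_e*F{i}'));
end
plot(0:nF-1, r, 'o-'); xlabel('lag i'); ylabel('r_i');   % NMIR curve
```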
9.3.2 Prediction-error Approach
In the univariate control loop, the output variance under minimum variance control is interpreted as the variance of the optimal prediction error. One can imagine that if a closed-loop output is highly predictable, then one should be able to do better, i.e., to compensate for the predictable content with a better-designed controller. Had a better controller been implemented, the closed-loop output would have been less predictable. Therefore, high predictability of a closed-loop output implies that there is potential to improve its performance by control retuning; in other words, the existing controller may not have been satisfactory in terms of exploiting its potential. The analogy has also been applied to MIMO systems [112, 137, 136]. Most recently, Huang et al. (2006) [141] systematically analyzed the prediction-error based methods and proposed a curvature measure of closed-loop potential according to the optimal prediction error. This method was demonstrated by its application to model predictive control of a simulated distillation column. These prediction-error based methods will be elaborated in Chapter 11.

9.3.3 Markov Chain Approach
The Markov chain analysis provides an alternative approach by which we can assess the performance from a different angle. A Markov-chain model is relatively
unstructured and is flexible in modeling nonlinear or non-Gaussian processes. The simple structure and rich class of Markov chains make them one of the most important models for random processes. Harris and Yu (2003) [142] applied the Markov chain approach to performance monitoring of MPC. By monitoring the degrees of freedom (DOF) of constraints, they demonstrated how Markov chains can be used to analyze industrial data. The Markov chain model can also be used to monitor other important properties of model predictive control systems, namely system stability and economic performance. Two indices have been defined in [143]: one is the out-of-control index (OCI) and the other the transition tendency index (TTI). For a multivariable process, the OCI is defined as the number of out-of-control variables. The Markov chain analysis performed on the OCI can reveal the stability performance of the process under model predictive control. The TTI is defined on the transition probability matrix of the OCI and provides a standardized index of the process transition tendency. By applying the Markov chain model to MPC objective functions estimated from data, a similar analysis of economic benefits can be performed.

Compared with MVC-benchmarking methods, the advantage of Markov-chain approaches is that they can provide insight related to the transient and steady-state behavior of control systems. For example, the transition probability matrix reveals how the process evolves from one state to others; the equilibrium distribution predicts where the process will most likely be at steady state; and the passage time and passage details elaborate how the process behaves in a transition. The metrics for stability and economic performance can provide complementary information about the performance of the control system, which enables a better understanding of how the process evolves from good performance to poor performance or vice versa [144].
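As an illustration of the basic machinery, the following MATLAB sketch estimates a transition matrix and its equilibrium distribution from a hypothetical OCI sequence s; this is a generic Markov-chain estimate, not the specific algorithm of [142, 143].

```matlab
% Sketch: empirical transition matrix and equilibrium distribution for an
% out-of-control index (OCI) sequence s (integers 0..m, assumed given).
m = max(s);
P = zeros(m + 1);
for t = 1:length(s) - 1
    P(s(t)+1, s(t+1)+1) = P(s(t)+1, s(t+1)+1) + 1;   % count transitions
end
P = P ./ sum(P, 2);                 % row-normalize to transition probabilities
[V, D] = eig(P');                   % left eigenvectors of P
[~, k] = min(abs(diag(D) - 1));     % eigenvalue 1 of a stochastic matrix
pi_eq = V(:, k)/sum(V(:, k));       % equilibrium (steady-state) distribution
```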
9.4 MPC Economic Performance Assessment and Tuning

Minimum-variance-based performance assessment (MVPA), as illustrated in the previous sections, appears not to be sufficient for model predictive control performance monitoring, particularly in terms of constraint handling and economic objectives. Recently, an algorithm based on linear matrix inequalities (LMI) for MPC economic performance analysis and for providing MPC tuning guidelines (named LMIPA) has been proposed. Mathematical details can be found in [145, 116]. This algorithm, when provided with process data and the plant steady-state gain matrix, performs economic performance assessment for MPC. The main results are summarized below.

For an m × l system with l inputs and m outputs having steady-state process gain matrix K, controlled by an MPC controller, let (\bar{y}_{i0}, \bar{u}_{j0}) be the current mean operating points for the ith output and jth input, referred to as the base-case operating points. Also, let L_{yi} and H_{yi} be the low and high limits for y_i, and L_{uj} and H_{uj} be the low and high limits for u_j, respectively. If (\bar{y}_i, \bar{u}_j) are the new mean operating points for y_i and u_j, respectively, after some
tunings of the system, then the economic cost function for the system can be defined as a quadratic function:

J = \sum_{i=1}^{m}\left[\alpha_i \bar{y}_i + \beta_i^2 (\bar{y}_i - \mu_i)^2\right] + \sum_{j=1}^{l}\left[\gamma_j \bar{u}_j + \eta_j^2 (\bar{u}_j - \nu_j)^2\right]   (9.9)
where \mu_i and \nu_j are the target values for the ith CV and jth MV, respectively; \alpha_i and \beta_i are the linear and quadratic coefficients for y_i; and \gamma_j and \eta_j are the linear and quadratic coefficients for u_j. It is assumed that the derivative of the quadratic objective function does not equal zero within the range of constraints, so that the optimum does not occur inside the constraints. For a system with the defined objective function, the assessment of economic yields can be done for the various cases described below:

1. assessment of ideal yield,
2. assessment of optimal yield without tuning the controller,
3. assessment of improved yield by reducing variability,
4. assessment of improved yield by constraint relaxation, and
5. constraint tuning for desired yield.
1. Assessment of ideal yield. For assessing the ideal yield, steady-state operations are considered by assuming no variability in either inputs or outputs. Under this scenario the optimization problem for the system is defined as

\min_{\bar{y}_i, \bar{u}_j} J   (9.10)

subject to:

\Delta y_i = \sum_{j=1}^{l} K_{ij}\,\Delta u_j   (9.11)
\bar{y}_i = \bar{y}_{i0} + \Delta y_i   (9.12)
\bar{u}_j = \bar{u}_{j0} + \Delta u_j   (9.13)
L_{yi} \le \bar{y}_i \le H_{yi}   (9.14)
L_{uj} \le \bar{u}_j \le H_{uj}   (9.15)

where i = 1, 2, \ldots, m and j = 1, 2, \ldots, l. Note that in this optimization problem J is the cost. If the nominal yield is \xi_0, then the relation between the cost and the yield \xi is \xi = \xi_0 - J; i.e., there is a straightforward relation between yield and cost, so we may use these two terms interchangeably in the following discussions.
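With a linear-plus-quadratic cost and linear constraints, the ideal-yield problem is a quadratic program. The following MATLAB sketch poses Case 1 for a hypothetical 2 × 2 system in terms of the input moves \Delta u; all numerical values are illustrative assumptions.

```matlab
% Sketch: ideal-yield assessment (Case 1) as a QP in du, with y = y0 + K*du.
K  = [1.0 0.5; 0.3 2.0];                 % steady-state gain matrix (assumed)
y0 = [1; 2];   u0 = [0; 0];              % base-case operating points
Ly = [0; 0];   Hy = [3; 4];              % output limits
Lu = [-1; -1]; Hu = [1; 1];              % input limits
alpha = [-1; -0.5]; beta2 = [0.1; 0.1];  mu = [2.5; 3];   % cost coefficients
gamma = [0.2; 0.2]; eta2  = [0.05; 0.05]; nu = [0; 0];
% Quadratic cost J(du) rewritten as 0.5*du'*H*du + f'*du (+ constant)
H = 2*(K'*diag(beta2)*K + diag(eta2));
f = K'*(alpha + 2*beta2.*(y0 - mu)) + gamma + 2*eta2.*(u0 - nu);
A = [K; -K; eye(2); -eye(2)];            % output and input limit constraints
b = [Hy - y0; y0 - Ly; Hu - u0; u0 - Lu];
du = quadprog(H, f, A, b);               % Optimization Toolbox QP solver
ybar = y0 + K*du;  ubar = u0 + du;       % ideal-yield operating points
```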
2. Assessment of optimal yield without tuning the controller. This means assessing the yield that should be obtained from the controller for the given constraints and the existing variability of the base-case operations. This scenario considers moving the actual operating points \bar{y}_i and \bar{u}_j as close as possible to their optimal operating points, subject to the constraints. Under this scenario the optimization problem for the system is the same as defined in Equation 9.10, subject to the equalities defined in Equations 9.11, 9.12 and 9.13; however, the inequalities are according to Equations 9.16 and 9.17:

L_{yi} + 2\sigma_{i0} \le \bar{y}_i \le H_{yi} - 2\sigma_{i0}   (9.16)
L_{uj} + 2R_{j0} \le \bar{u}_j \le H_{uj} - 2R_{j0}   (9.17)

where i = 1, 2, \ldots, m and j = 1, 2, \ldots, l; \sigma_{i0} and R_{j0} are the standard deviation and the quarter of the range for y_i and u_j, respectively, under the base-case operation. The inequalities defined for the problem allow 5% constraint limit violation, i.e., 95% of the operation is within two standard deviations [116].

3. Assessment of improved yield by reducing variability. This case involves tuning the control system such that the variability of one or more variables can be reduced. Reducing the variability provides the opportunity to push the operating points closer to the optimum and thus improve the yield. Practically, the reduction in variance of one variable (say a quality variable) may be transferred to increased variability of some other variables, such as constrained variables. Since constraint variables do not directly affect the profit, their variability is not of concern as long as they are maintained within the constraint limits. Thus, the variability of a quality variable may be reduced by transferring it to that of the constraint variables. For assessing the improved yield by variability reduction, the optimization problem and its equality conditions are the same as those defined in Equations 9.10-9.13; however, the inequalities are changed and are defined in Equations 9.18 and 9.19:

L_{yi} + 2\sigma_{i0}(1 - S_{yi}) \le \bar{y}_i \le H_{yi} - 2\sigma_{i0}(1 - S_{yi})   (9.18)
L_{uj} + 2R_{j0}(1 - S_{uj}) \le \bar{u}_j \le H_{uj} - 2R_{j0}(1 - S_{uj})   (9.19)

where i = 1, 2, \ldots, m and j = 1, 2, \ldots, l; S_{yi} and S_{uj} are the percentage reductions of the variability of y_i and u_j, respectively, relative to the base-case operation.

4. Assessment of improved yield by relaxing constraints. Relaxing the constraints for one or more process variables in an MPC controller may provide more operating freedom so as to increase the yield. Relaxing the limits for the constraint variables may help move the quality variables closer to their optimum operating points, thus improving the yield. The optimization problem for this case is defined by Equation 9.10, and the equality conditions are defined by Equations 9.11, 9.12 and 9.13;
however, the inequalities defining the constraints are given in Equations 9.20 and 9.21:

L_{yi} + 2\sigma_{i0} - yhol_i\,r_{yi} \le \bar{y}_i \le H_{yi} - 2\sigma_{i0} + yhol_i\,r_{yi}   (9.20)
L_{uj} + 2R_{j0} - uhol_j\,r_{uj} \le \bar{u}_j \le H_{uj} - 2R_{j0} + uhol_j\,r_{uj}   (9.21)

where i = 1, 2, \ldots, m and j = 1, 2, \ldots, l; yhol_i and uhol_j are half of the range of the constraint limits for y_i and u_j, respectively; r_{yi} and r_{uj} are the user-specified percentage relaxations in the constraint limits for y_i and u_j, respectively.

5. Constraint tuning for desired yield. The constraint tuning guidelines for achieving a target value of returns or yields from the system can be obtained by performing the optimization defined for this case. If the ratio of the target yield to the ideal yield is R_c, then the constraint tuning guidelines (expressed as percentages of change), \hat{r}_{yi} and \hat{r}_{uj} for y_i and u_j, respectively, can be obtained as the solution to an optimization problem. The optimization for this scenario is defined as

\min_{\bar{y}_i, \bar{u}_j, \hat{r}_{yi}, \hat{r}_{uj}, r} r   (9.22)

subject to

L_{yi} + 2\sigma_{i0} - yhol_i\,\hat{r}_{yi} \le \bar{y}_i \le H_{yi} - 2\sigma_{i0} + yhol_i\,\hat{r}_{yi}   (9.23)
L_{uj} + 2R_{j0} - uhol_j\,\hat{r}_{uj} \le \bar{u}_j \le H_{uj} - 2R_{j0} + uhol_j\,\hat{r}_{uj}   (9.24)
0 \le \hat{r}_{yi}, \hat{r}_{uj} \le r   (9.25)
\frac{\xi_0 - J}{\xi_0 - J_{ideal}} = R_c   (9.26)
where i = 1, 2, \ldots, m and j = 1, 2, \ldots, l; r is the maximum percentage change of the constraints. Thus minimization of r minimizes the possible change of constraints for the same performance. The other equalities for the optimization problem remain the same as in Equations 9.11, 9.12 and 9.13.

The economic performance assessment of the controller can now be done using the information obtained from the optimizations discussed above. Two terms, the economic performance index (\eta_E) and the theoretical economic performance index (\eta_T), are defined to assess the economic performance:

\eta_E = \frac{\xi_E}{\xi_I}   (9.27)

\eta_T = \frac{\xi_T}{\xi_I}   (9.28)
where ξE is the optimal yield without tuning of the controller, as obtained from Case 2 discussed above, ξI is the ideal yield, as obtained from Case 1, and ξT is the theoretical yield that can be achieved through a minimum variance control for variance reduction.
9.5 Probabilistic Inference for Diagnosis of MPC Performance

9.5.1 Bayesian Network for Diagnosis
Research in monitoring of MPC performance has achieved considerable progress, as discussed in the last few sections. Diagnosis of lower-than-expected performance for an MPC system is, however, relatively underdeveloped, despite a number of available monitoring tools that target individual problem sources. This is due in part to the complexity of MPC and the lack of a systematic framework for the diagnosis. Thus it is necessary to build such a framework for MPC performance diagnosis. The details of this section can be found in [146]; a summary is presented here.

A typical control system consists of at least four components: sensor, actuator, controller and process, each subject to possible performance degradation or abnormality. A problem in any one of these four components can affect control system performance. Each of them may have its own monitoring algorithms to detect its problem, but these algorithms may all be affected by problems in one or more of the four components. The problem sources and the monitors are interconnected. A Bayesian network, or graphical model, which exploits the sparse structure of the interconnections, provides a framework for modeling and solving such a network problem. The building blocks of graphical models are a network of nodes connected by conditional probabilities. These nodes are random variables, which can be continuous, discrete or simply binary.

Consider the case of the simplest binary random variables. If there are n binary random variables, the complete distribution is specified by 2^n − 1 joint probabilities. In the illustrative Figure 9.2, there are 4 binary nodes, each node having two possible outcomes. For example, node A may take the value^1 A or Ā. To completely determine the distribution of the 4 binary variables, we need to determine the joint probability P(A,B,C,D), which has 16 outcomes. Taking into account that the sum of all probabilities must equal 1, we need to calculate 15 probabilities. However, as illustrated in Figure 9.2, by exploiting the graphical relationship of each node, only 7 probabilities need to be determined, a considerable reduction from 15 calculations. The structure of the graphical relationship is an example of incorporating a priori process knowledge, and it takes advantage of the sparse structure of probabilistic relations among the nodes. As the number of nodes increases, the saving in computations is remarkable, making it possible to apply statistical inference theory to solve complex network problems.

If a network chart like Figure 9.2 is available, one can make a variety of inferences. For example, if we have observations of B, C and D, we would like to make a decision to determine whether A = A or A = Ā. The decision process can
^1 We use capital letters such as A to represent a random variable, and italic capital letters such as A to represent a value taken by the random variable.
Fig. 9.2. An example of BN: node A is the parent of B and C, and B is the parent of D; the network is specified by P(A), P(B|A), P(B|Ā), P(C|A), P(C|Ā), P(D|B) and P(D|B̄)
be written under the Bayesian framework as P(A|BCD), which can be calculated, according to Bayes' rule, as

P(A|BCD) = \frac{P(ABCD)}{P(BCD)} = \frac{P(ABCD)}{\sum_A P(ABCD)}

Using a construction rule of the Bayesian network [147] and according to the relationship of the four nodes in Figure 9.2, the joint probability can be calculated as

P(ABCD) = P(A)P(C|A)P(D|B)P(B|A)

The seven probabilities specified in Figure 9.2 are sufficient to calculate P(A|BCD) and thus to make an optimal inference about the state of A. To show the flexibility of the graphical-model approach, consider the inference of C given observation D, where in this case we assume that C cannot be observed. According to Bayes' rule, it can be derived that

P(C|D) = \frac{\sum_{AB} P(ABCD)}{\sum_{ABC} P(ABCD)}

where C can take the values C and C̄.

Owing to the unavoidable uncertainties in practice, no monitoring algorithm can give a definite answer. A decision is usually made according to a certain probabilistic confidence or risk. Any diagnostic algorithm can at most give the most likely problem source according to its probabilistic inference. In addition to the "most likely" problem inferred, there is a "second most likely" problem, a third, and so on. According to the diagnostic results, engineers or instrument technicians have to service the specific controller or instrument following a troubleshooting sequence. There is an associated cost for each service, some more and some less. Interestingly, an optimal troubleshooting sequence does not necessarily follow the order of likelihood of the problems [148]. An optimal service sequence has to be established based not only on the occurrence probability of the problem but also on the cost of the service. As an example, suppose there are two possible problem sources, A and B, and two observations, C and D. The conditional probabilities have been calculated as P(A|CD) = 0.6 and P(B|CD) = 0.4, while the service cost (confirming the problem) for A is $1000 and for B is $2000. Now the question is whether one should check the instruments according to the sequence AB or BA. An objective function considering the probabilities together with the service costs should be formulated and optimized in order to find the best service sequence.

While many monitoring algorithms have been developed, many more are being, or are still to be, developed. The research in this direction not only needs to consolidate and enhance the monitoring algorithms that have been developed, but also needs to integrate them into a probabilistic diagnosis and troubleshooting framework. One of the objectives of this section is to illustrate a novel framework for control system diagnosis and troubleshooting, particularly for the most commonly seen performance-related problems and their diagnosis/troubleshooting, namely problems in controller tunings, process models, actuators, and sensors.
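To make the inference concrete, the following MATLAB sketch computes P(A|BCD) by enumerating the joint probability of the Figure 9.2 network; the seven probability values are hypothetical.

```matlab
% Sketch: exact inference P(A | B, C, D) for the 4-node network of Fig. 9.2.
pA   = 0.1;                         % P(A = fault)            (assumed value)
pB_A = [0.9 0.2];                   % P(B=1|A=1), P(B=1|A=0)  (assumed values)
pC_A = [0.8 0.1];                   % P(C=1|A=1), P(C=1|A=0)  (assumed values)
pD_B = [0.7 0.05];                  % P(D=1|B=1), P(D=1|B=0)  (assumed values)
B = 1; C = 1; D = 1;                % observed evidence
joint = zeros(2, 1);                % joint P(A=a, B, C, D) for a = 1 and a = 0
for a = [1 0]
    pa = pA*a + (1 - pA)*(1 - a);
    pb = pB_A(2 - a); pb = pb*B + (1 - pb)*(1 - B);
    pc = pC_A(2 - a); pc = pc*C + (1 - pc)*(1 - C);
    pd = pD_B(2 - B); pd = pd*D + (1 - pd)*(1 - D);
    joint(2 - a) = pa*pb*pc*pd;     % P(ABCD) = P(A)P(B|A)P(C|A)P(D|B)
end
pA_given_BCD = joint(1)/sum(joint); % Bayes' rule
```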
9.5.2 Decision Making in Performance Diagnosis
Let's assume, at the moment, that monitors for all four problem sources are available. Furthermore, for simplicity of illustration, we assume that the sensor monitor, actuator monitor, and model validation (monitor) are designed in such a way that they are only sensitive to their own problem sources. The performance monitor, on the other hand, is sensitive to all problem sources. For an off-line analysis, we do not have to consider the evolution of problems with time; thus a static Bayesian network can be built, shown in Figure 9.3, where shaded nodes are hidden and others are observed (evidence nodes). Each evidence node is a child of its problem source node (parent node). Clearly the performance monitor is a child of all three problem source nodes. The structure of this network also implies that the four evidence nodes are conditionally independent, conditioned on the three problem source nodes, an assumption that may not be rigorously true but that nevertheless simplifies network inference considerably.

With the three evidence nodes alongside the three problem source nodes available, it may seem sufficient to solve the problem. Why does one need the performance monitor node? In addition to the obvious fact that this additional evidence node increases credibility (certainty) when performing inference (diagnosis), it can be a backup if one of the other nodes becomes unavailable (e.g., due to missing data). Furthermore, if the other three nodes do not provide convincing inference, e.g., when the evidence is somewhere between yes and no, then this fourth node plays an important role in reducing the ambiguity. The performance monitor node plays another interesting role in this network. Without evidence from the performance monitor, the three problem nodes are independent according to the assumption. With the evidence from this node, the three problem nodes become dependent. For example, an actuator fault may
Fig. 9.3. An example of a static Bayesian network for performance monitoring: problem source nodes (sensor, actuator, process model) with their evidence nodes (sensor monitor, actuator monitor, model validation) and a shared performance monitor node
Fig. 9.4. An example of a dynamic Bayesian network for performance monitoring: the static network of Figure 9.3 repeated over two time slices, t and t + 1
be independent of a sensor fault or a process model parameter change. However, if the performance monitor indicates a change of control performance, knowing that the actuator is the problem will reduce the chance of the sensor or the process model being the problem. Bayesian inference is a mathematical method that "squeezes" all available evidence/observations to gain as much information as possible. Inference from the static Bayesian network is relatively simple; many standard Bayesian network inference software packages can readily provide solutions.

For on-line applications, the static Bayesian network may not be sufficient. In this case, the problem sources may be sequentially dependent. Suppose, for example, a sensor has a mean time between failures (MTBF) of 1000 hrs, and the sensor monitor is evaluated every 1 hr. If the sensor has no fault at the current time t, then the probability of no fault at the next evaluation time t + 1 would be 1 − 1/1000 = 0.999. On the other hand, for an old sensor that has an MTBF of
10 hrs, the probability becomes 0.9. Knowing this dependence probability can significantly reduce false alarms, as shown by Smyth (1994) [149]. Therefore, a dynamic Bayesian network should be created for on-line sequential monitoring and diagnosis. An example is shown in Figure 9.4, where two time slices are shown, at time t and t + 1, respectively; repeating these two slices yields the entire dynamic network. Such a network is known as a dynamic Bayesian network and, once again, it can be solved by dynamic Bayesian inference algorithms [150, 149]. Some applications of Bayesian methods for MPC performance assessment have recently been reported in [117, 118].
9.6 Summary

Monitoring of model predictive control performance has attracted great interest both in academia and in industry. The problem is challenging, owing to the fact that industrial MPC applications have to consider many practical issues, and many of the designs and practical implementations may not be easily formulated under a unified mathematical framework. Thus, MPC performance monitoring has to take into account the practicality of MPC applications, and the solution can be problem dependent. The MPC performance monitoring problem remains open in general, and effective, mathematically rigorous solutions have yet to be determined. In the next three chapters, we shall present some possible solutions in this direction.
10 Subspace Approach to MIMO Feedback Control Performance Assessment
10.1 Introduction

As stated in the previous chapters, periodic performance assessment of controllers is important for maintaining normal process operation and for sustaining the performance achieved when the controllers were commissioned. Controller performance assessment using closed-loop data has received much attention over the past decade. Typically, the process response variance is compared with a benchmark variance for assessing the performance of the controller. Several benchmarks, such as minimum variance control [19, 22], linear quadratic Gaussian (LQG) control [22], designed controller performance versus achieved controller performance [151], and many other alternative performance measures have been proposed for assessing controller performance. Among these approaches the MVC benchmark remains the most popular. One of the reasons for the popularity of the MVC benchmark for assessing the performance of control loops in industry [20, 152] is that it is non-intrusive, and routine operating data are sufficient for the calculation. Although its control objective does not directly fit that of MPC, the MVC benchmark does provide a lower bound on the variance achievable by any feedback control. For this reason, it has been an important and necessary component in all performance assessment software.

As demonstrated in Chapter 8, calculation of the MVC-benchmark variance for univariate systems from routine operating data requires a priori knowledge of the process time delay. For multivariate systems, minimum variance control involves the inverse of the delay-free part of the process transfer function matrix, which is calculated as the transfer function matrix pre-multiplied by the interactor matrix. A priori knowledge of the first few Markov parameter matrices is required for calculation of the interactor matrix [22]. Hence calculation of the MVC-benchmark variance is not straightforward in the multivariate case. Furthermore, the concept of the interactor matrix is not well known in practice. As a result, estimation of the MVC benchmark without the interactor matrix has been an active area of research.
The success of subspace identification has inspired an alternative approach to predictive control design, as described in Chapter 7, and to control performance assessment, discussed in this chapter. Subspace identification methods allow the direct identification of a state space model for the system from the data. In Chapter 3, certain subspace matrices are identified as an intermediate step in subspace identification methods; these correspond to the system states (or past inputs and outputs), the deterministic inputs and the stochastic inputs. Normally, subspace identification proceeds in two steps: the first is a data projection step to obtain the subspace matrices; the second is to extract the state space matrices from the subspace matrices. However, as has been demonstrated in Chapter 7, predictive control can be designed directly from subspace matrices. Similarly, the MVC-benchmark variance can also be calculated from the subspace matrices; this provides a novel approach for obtaining the MVC-benchmark variance and eliminates the need to estimate the interactor matrix or to extract the model/Markov parameter matrices. The important difference between the "calculation of the subspace matrix" and subspace identification is that the former does not extract an explicit "model"; it is also known as a model-free approach, as demonstrated in Chapter 7. Under this framework, no interactor matrix, Markov parameters, multivariate transfer function matrix, or state space model are needed, which makes the subspace approach to multivariate controller performance assessment more attractive in practice.

All existing methods for multivariate MVC benchmark computation, including the one discussed in this chapter, start from either an interactor matrix [112, 114], Markov parameters [25], an interactor-matrix-equivalent time-delay matrix [27], or a set of identification experiment data (the approach addressed in this chapter). All of these approaches should yield an equivalent result. What, then, are the advantages of the direct data-driven approach discussed in this chapter? The first advantage is the consolidation of open-loop model identification, closed-loop time series analysis, and extraction of the MVC benchmark into a single step. The second advantage is that the model structure error, or bias error, is avoided or alleviated, since the process data are not forced to fit a parametric model. The third advantage is conceptual simplicity: several complex concepts and tedious procedures involved in identifying a parametric multivariate model and in the multivariate MVC computation are avoided. Certainly, these advantages come at certain costs as well. For example, a model-free approach is known to trade a larger variance error for a smaller bias error. Nevertheless, the variance error can always be reduced by sampling more data, which is usually not a problem in performance assessment.

The remainder of this chapter is organized as follows. Some important subspace matrices are briefly revisited in Section 10.2 for better understanding of this chapter. Section 10.3 is the main section of this chapter, which provides a method for the estimation of the multivariate MVC benchmark directly from input/output data. The main results are illustrated through simulations and an industrial application example in Section 10.4, followed by a summary in
10.2 Subspace Matrices and Their Estimation
179
Section 10.5. For readers who are only interested in the final algorithm expressed as input/output data, we refer them to Algorithm 10.5.
10.2 Subspace Matrices and Their Estimation 10.2.1
Revisit of Important Subspace Matrices
Subspace identification methods allow estimation of a state space model for the system directly from the data. Consider the following state space representation of a linear time-invariant system with l-inputs (ut ), m-outputs (yt ) and n-states (xt ) as: xt+1 = Axt + But + Ket yt = Cxt + et
(10.1) (10.2)
This model is slightly different from that described by Equations 3.1 and 3.2; the D-matrix is taken as zero owing to the zero-order-hold in typical multivariate (computer) process control systems, a common assumption in control performance assessment literatures. The matrix input-output equations used in subspace identification have been given in Chapter 3: d s U p + HN Ep Yp = ΓN Xp + HN d s Yf = ΓN Xf + HN Uf + HN Ef
(10.3) (10.4)
Xf = AN Xp + ΔdN Up + ΔsN Ep
(10.5)
d is slightly modified Owing to the assumption of D = 0, the subspace matrix HN from that of Chapter 3. Other matrices remain unchanged. The following three important system matrices are recalled here:
d HN
s HN
T C T (CA)T ... (CAN −1 )T ⎞ ⎛ 0 0 ... 0 ⎟ ⎜ ⎟ ⎜ ⎜ CB 0 ... 0 ⎟ ⎟ ⎜ =⎜ ⎟ ⎜ ... ... ... ... ⎟ ⎠ ⎝ CAN −2 B CAN −3 B ... 0 ⎞ ⎛ I 0 ... 0 ⎟ ⎜ ⎟ ⎜ ⎜ CK I ... 0 ⎟ ⎟ ⎜ = ⎜ ⎟ ⎜ ... ... ... ... ⎟ ⎠ ⎝ CAN −2 K CAN −3 K ... I
ΓN =
(10.6)
(10.7)
(10.8)
180
10.2.2
10 Subspace Approach to MIMO Feedback Control Performance Assessment
Estimation of Subspace Matrices from Open-loop Data
If Ef is independent of past input Up , past output Yp (or equivalently their combination Wp ), and future input Uf , by performing an oblique projection of Equation 10.4 along the row space Uf on to the row space of Wp , we get d s Yf /Uf Wp = ΓN Xf /Uf Wp + HN Uf /Uf Wp + HN Ef /Uf Wp
(10.9)
It has been shown in Chapter 4 that the last two terms of Equation 10.9 are zero: Uf /Uf Wp = 0 is by the property of the oblique projection according to Equation 3.50; Ef /Uf Wp = 0 is based on the assumption that future disturbance is independent of past input/output and future input. This assumption holds under the open-loop condition. Thus Equation 10.9 can be simplified to ˆf Yf /Uf Wp = ΓN Xf /Uf Wp = ΓN X
(10.10)
where the last equality follows from Section 3.5.2. Since Yf /Uf Wp is nothing but the linear combination of Wp , we can write ˆf Yf /Uf Wp = Lw Wp = ΓN X
(10.11)
where Lw can be calculated using Yf , Uf , Yp and Up according to Equation 3.49. Note Equation 10.11 has also been derived in Equation 3.62. With the optimal state prediction in Equation 10.10 and the linear expression in Equation 10.11, by noting that Ef is white noise, it follows from Equation 10.4 that an optimal prediction of Yf can be written as d Yˆf = Lw Wp + HN Uf
(10.12)
d , rewrite Equation 10.4 as To estimate HN d s U f + HN Ef Yf − ΓN Xf = HN
(10.13)
Right multiplying Equation 10.13 by UfT (Uf UfT )−1 , noting the independence of Uf and Ef , when j → ∞ yields 1 d 1 (Yf − ΓN Xf )UfT (Uf UfT )−1 = HN j j or d (Yf − ΓN Xf )UfT (Uf UfT )−1 = HN
(10.14)
where, for the sake of the estimation, ΓN Xf may be replaced by its estimate ˆ f = Lw Wp according to Equation 10.11, i.e., ΓN X d (Yf − Lw Wp )UfT (Uf UfT )−1 = HN d Now, as in Chapter 3, denote Lu HN . Then
(10.15)
10.3 Estimation of MVC-benchmark from Input/Output Data
181
d L u = HN = (Yf − Lw Wp )UfT (Uf UfT )−1
= (Yf − Yf /Uf Wp )UfT (Uf UfT )−1 = (Yf − Yf /Uf Wp )Uf†
(10.16)
where Lu is also known as the subspace matrix corresponding to the deterministic input, defined in Chapter 3. The procedure discussed above provides an alternative derivation of subspace matrices estimation using open-loop data. With these notations, Equation 10.12 is often written as Yˆf = Lw Wp + Lu Uf
(10.17)
In view of Equation 10.17, Lw and Lu can also be found by solving the following straightforward least squares problem as discussed in Chapter 3: ⎞ ⎛ Wp ⎠ ||2F min ||Yf − Lw Lu ⎝ (10.18) Lw ,Lu Uf The solution is given by the orthogonal projection of the row space of Yf on to the row space spanned by Wp and Uf as discussed in [38]: ⎞ ⎛ Wp ⎠ Yˆf = Yf / ⎝ Uf
Lw and Lu can be calculated according to Equation 3.44.
10.3 Estimation of MVC-benchmark from Input/Output Data 10.3.1
Closed-loop Subspace Expression of Process Response under Feedback Control
Define data column vectors over a horizon of N for the output, input, and white noise disturbance, respectively, as ⎞ ⎛ yt−N ⎞ ⎞ ⎞ ⎛ ⎛ ⎛ ⎜ . ⎟ ⎜ . ⎟ yt ut et ⎜ . ⎟ ⎟ ⎟ ⎟ ⎜ ⎜ ⎜ ⎟ ⎜ .. .. .. ⎟ ⎟ ⎟ ⎜ ⎜ ⎜ ⎟ y ⎟ , uf = ⎜ ⎟ , ef = ⎜ ⎟ , wp = ⎜ . . . yf = ⎜ ⎜ t−1 ⎟ ⎟ ⎟ ⎟ ⎜ ⎜ ⎜ ⎟ ⎜ u ⎝ yt+N −2 ⎠ ⎝ ut+N −2 ⎠ ⎝ et+N −2 ⎠ ⎜ t−N .. ⎟ ⎟ ⎜ yt+N −1 ut+N −1 et+N −1 ⎝ . ⎠ ut−1 (10.19) Using the notations defined in Equation 10.19, it follows from Equation 10.4 that a column vector yf can be written as d s yf = ΓN xt + HN u f + HN ef
(10.20)
182
10 Subspace Approach to MIMO Feedback Control Performance Assessment
This is a vector form of subspace representation of the plant model 10.1-10.2. Consider a regulatory feedback control law expressed in the state space form as xct+1 = Ac xct + Bc yt ut = −Cc xct − Dc yt c By defining ΓNc , HN conformably with Equations 10.6-10.8, we have the vector form of subspace representation of the control model: c uf = −ΓNc xct − HN yf
(10.21)
Substituting Equation 10.21 in 10.20 yields d c c d c s ΓN xt − HN HN y f + HN ef yf = ΓN xt − HN
After rearrangement and simplification, we get
xt d c −1 d c −1 s d c HN ) HN e f + (I + HN yf = (I + HN HN ) ΓN −HN ΓN xct d c −1 cl d c −1 s d c = (I + HN HN ) ΓN −HN ΓN xt + (I + HN HN ) HN ef (10.22) where, to distinguish closed-loop state and open-loop state in this chapter, we have defined
Δ xt xcl t = xct
as closed-loop states. On the other hand, the state space model of the closed-loop system can also be written directly as cl xcl t+1 = Acl xt + Kcl et
(10.23)
Ccl xcl t
(10.24)
yt =
+ et
The corresponding subspace representation can be written as cl yf = ΓNcl xcl t + HN e f
(10.25)
cl HN
where, like Equation 10.8, consists of impulse response coefficients (Markov parameter matrices) of the closed-loop system from white noise et to output yt . cl In particular, the first block column (N m × m) of HN corresponds to the first N impulse response coefficient matrices of the closed-loop system from the white noise et to the output yt . Comparing Equation 10.25 with 10.22, it follows that cl d c −1 s = (I + HN HN ) HN HN
(10.26)
Equation 10.26 is an expression of the closed-loop subspace matrix in relation to the open-loop subspace matrices and will be useful in the next section. In the following, we shall consider the closed-loop expression under minimum variance control. Consider the following theorem derived in [25] under the framework of Markov parameter representation of the process model:
10.3 Estimation of MVC-benchmark from Input/Output Data
183
s s Theorem 10.1. Denote the first block column (N m × m) of HN as HN,1 , which is an assembly of the first N Markov parameter matrices of the disturbance model. Then the output expression under minimum variance control in the sense of min E[ytT yt ] can be written as
yt |mvc =
ω−1
Fi et−i
(10.27)
i=0
where ω is a finite integer and ⎞ ⎛ F0 ⎟ ⎜ ⎜ .. ⎟ d d † s (HN ) )HN,1 ⎜ . ⎟ = (I − HN ⎠ ⎝ FN −1
(10.28)
where N ≥ ω.
Proof. We refer to [25].
In [25], the number of open-loop Markov parameter matrices is chosen to be ω, i.e., N in Equation 10.28 is the same as ω and it is also implicitly shown in [25] that Fi−1 = 0 for i > ω (10.29) This fact will be elaborated again in Remark 10.1. Now define d d † s L1 = (I − HN (HN ) )HN,1
With Equation 10.27, noting Equation 10.29, it follows from [22] that the output variance under minimum variance control can be written as Jmvc min E[ytT yt ] = trace(min Eyt ytT ) = trace(L1 Σe LT1 ) d d † s sT d d † T = trace(I − HN (HN ) )HN,1 Σe HN,1 (I − HN (HN ) )
(10.30)
With Equation 10.30, we are now ready to derive the main results of this chapter in the following section. 10.3.2
Estimation of MVC-benchmark Directly from Input/Output Data
Equation 10.30 is the lower bound of the output variance of yt known as the minimum variance control (MVC) benchmark expressed under the subspace framework. We will show in this section that Equation 10.30 can be directly calculated d from input/output data. By a glance at Equation 10.30, we see that both HN s d and HN,1 are needed. HN can be estimated directly through input/output data s may be estimated together with Lu from the by Equation 10.16. Although HN set of open-loop data according to Equation 10.13, it is not recommended since
184
10 Subspace Approach to MIMO Feedback Control Performance Assessment
the disturbances between the open-loop experiment and the closed-loop operas should be estimated from a set tion can be significantly different. Therefore, HN s of most representative closed-loop routine operating data. We will show that HN under closed-loop conditions can be replaced by a closed-loop subspace matrix cl , defined in Equation 10.25, under the existing control. HN The following two lemmas are useful for proving the main theorem of this section. Lemma 10.1. For any stable feedback control, the closed-loop subspace matrix cl HN , defined as the subspace matrix from white noise Ef to the output Yf , satisfies the following relation as j → ∞: −1
−1
cl = Lh diag(ΦT Σe 2 , · · · , ΦT Σe 2 ) HN
(10.31)
where Φ is a unitary matrix and Lh is a lower triangular matrix resulted from an QR decomposition according to the following equation: 1 Lh Q = √ (Yf − ΓNcl Xfcl ) j
(10.32)
where Q is a unitary matrix. Proof. Expand vector subspace Equation 10.25 to a matrix subspace equation: cl Yf = ΓNcl Xfcl + HN Ef
(10.33)
where the data Hankel matrix Yf is similarly defined as in Equation 3.26 but with closed-loop data; Xfcl is similarly defined as Equation 3.30 but with closed-loop states. Note that, according to Equation 10.33 1 cl 1 √ (Yf − ΓNcl Xfcl ) = √ HN Ef j j cl where HN is a lower triangular Toeplitz matrix corresponding to the stochastic input (disturbance) with all block diagonal matrices being identity matrices (see also Equation 10.8 for a similar matrix), and thus all diagonal elements are 1. Perform a QR decomposition:
1 Lh Q = √ (Yf − ΓNcl Xfcl ) j Then
1 cl L h Q = √ HN Ef j where Lh is a lower triangular matrix and Q is a unitary matrix. Therefore, we cl and Ef given by have the solution of HN cl HN = Lh Λ−1
Ef = jΛQ
where Λ−1 is an invertible block lower triangular matrix and so is Λ.
(10.34) (10.35)
10.3 Estimation of MVC-benchmark from Input/Output Data
185
Now decompose Lh into the sum of a block diagonal matrix containing all ¨ h ) and the reblock diagonal elements (m × m matrices) of Lh (denoted as L ˜ maining matrix (denoted as Lh ): ¨h + L ˜h Lh = L Then according to Equation 10.34 cl ¨ h Λ−1 + L ˜ h Λ−1 =L HN
Since Λ−1 is a block lower triangular matrix and all block diagonal elements of cl cl are identity matrices, the block diagonal elements of HN depend only on HN −1 ¨ Lh Λ and ¨ h Λ−1 = I L and as a result ¨h Λ=L
(10.36)
which indicates that Λ is in fact a block diagonal matrix. On the other hand, since lim
j→∞
1 Ef EfT = E[ef eTf ] = diag(Σe , · · · , Σe ) j
this, together with Equation 10.35, yields lim ΛΛT = diag(Σe , · · · , Σe )
j→∞
(10.37)
Since Λ is a block diagonal matrix according to Equation 10.36, it follows from Equation 10.37, as j → ∞, 1
1
Λ = diag(Σe2 Φ, · · · , Σe2 Φ)
(10.38)
where Φ is a unitary matrix. Substituting Equation 10.38 in Equation 10.34 yields Equation 10.31. d d † s d d † cl Lemma 10.3. (I −HN (HN ) )HN is identical to (I −HN (HN ) )HN . As a result, in calculating the MVC benchmark from Equation 10.30, the first block column d d † s d d † (HN ) )HN can be replaced by that of (I − HN (HN ) ) (N m × m) of (I − HN cl HN , i.e., d d † s d d † cl (I − HN (HN ) )HN,1 = (I − HN (HN ) )HN,1 (10.39)
Proof. We shall show d d † s d d † cl (I − HN (HN ) )HN = (I − HN (HN ) )HN
(10.40)
Substituting Equation 10.26 in Equation 10.40 yields d d † s d d † d c −1 s (HN ) )HN = (I − HN (HN ) ) (I + HN HN ) HN (I − HN
(10.41)
186
10 Subspace Approach to MIMO Feedback Control Performance Assessment
Therefore by observation, we need to show that d d † d d † d c −1 (HN ) ) = (I − HN (HN ) ) (I + HN HN ) (I − HN
(10.42)
to prove the lemma, which is equivalent to showing d d † d c d d † (I − HN (HN ) ) (I + HN HN ) = (I − HN (HN ) )
(10.43)
Expanding the left hand side term in the above equation d d † d c (I − HN (HN ) )(I + HN HN ) d c d d † d d † d c (HN ) HN H N = I + HN HN − HN (HN ) − HN
(10.44)
d d † d c d d † d c = I − HN (HN ) + {HN HN − HN (HN ) H N HN }
(10.45)
=I−
d d † HN (HN )
(10.46)
d d † d d (HN ) HN = HN . The last equation follows since HN
Lemma 10.3 is essentially the subspace version of the control invariance property of the first few Markov parameters of the interactor-filtered disturbance model under the transfer function framework derived by Huang and Shah (1999) [22]. With Lemmas 10.1 and 10.3, we are ready to show the following theorem: Theorem 10.2. Let
ˆT ˆ h = √1 (Yf − Yf Yp† Yp )Q L j
(10.47)
ˆ is a unitary matrix and let L ˆ h,1 be the first block column (N m × m) of where Q ˆ Lh where Yp , Yf are Hankel matrices of closed-loop routine operating data. Let Lu be calculated from Lu = (Yf − Yf /Uf Wp )Uf† (10.48) where Yf , Uf , Wp are Hankel matrices of open-loop input/output data with persistent excitations. Then the multivariate minimum variance benchmark can be estimated directly from input/output data according to ˆ T (I − Lu L† )T ˆ h,1 L Jˆmvc = trace(I − Lu L†u )L h,1 u
(10.49)
This is an explicit expression of the multivariate minimum variance control benchmark directly calculated through an algebraic manipulation of the input/ output data. Proof. Using Lemma 10.3, Equation 10.30 can be written as d d † cl cl T d d † T (HN ) )HN,1 Σe [HN,1 ] (I − HN (HN ) ) Jmvc = trace(I − HN
(10.50)
According to Equation 10.31 of Lemma 10.1, the first block column (N m×m) cl should satisfy of HN − 12
cl HN,1 = Lh,1 ΦT Σe
where Lh,1 is the first block column (N m × m) of Lh .
(10.51)
10.3 Estimation of MVC-benchmark from Input/Output Data
187
Substituting Equation 10.51 in Equation 10.50 yields −1
−1
d d † d d † T Jmvc = trace(I − HN (HN ) )Lh,1 ΦT Σe 2 Σe Σe 2 ΦLTh,1 (I − HN (HN ) ) d d † d d † T = trace(I − HN (HN ) )Lh,1 LTh,1 (I − HN (HN ) )
(10.52)
To estimate Jmvc from data of finite length, we can use Lu in Equation 10.16 d in Equation 10.52. The estimate of Lh,1 or Lh is discussed next. to replace HN Making an orthogonal projection of Equation 10.33 on to Yp yields cl Ef /Yp Yf /Yp = ΓNcl Xfcl /Yp + HN
= ΓNcl Xfcl /Yp
(10.53)
where Ef /Yp = 0 due to the fact that future white noise disturbance is independent of the past output. Write Yf /Yp Lcl y Yp where † Lcl y = Yf Yp
(10.54)
Analogous to the derivation of the state estimate as an orthogonal projection in Section 3.5.2, it can be shown that ˆ fcl = Xfcl /Yp X ˆ cl is the closed-loop state estimate based on the Kalman filter. Therefore where X f † ˆ fcl = Yf /Yp = Lcl ΓNcl Xfcl /Yp = ΓNcl X y Yp = Yf Yp Yp
(10.55)
ˆ cl according to EquaReplacing ΓNcl Xfcl of Equation 10.32 by the expression of ΓNclX f tion 10.55, we get an estimate of Lh through the following QR decomposition: 1 Lh Q = √ (Yf − Yf Yp† Yp ) j
(10.56)
1 Lh = √ (Yf − Yf Yp† Yp )QT j
(10.57)
Consequently
The estimated version of Equation 10.52 can then be written as Jˆmvc = trace(I − Lu L†u )Lh,1 LTh,1 (I − Lu L†u )T where Lh,1 is the first block column of Lh and Lh is estimated from Equation 10.57.
188
10 Subspace Approach to MIMO Feedback Control Performance Assessment
The main result of this chapter is summarized in the following algorithm: Algorithm 10.5. Let a set of open-loop experiment data with persistent input excitations be Sopen = {y1 , u1 , · · · , y2N +j−1 , u2N +j−1 } and a set of closed-loop routine operating data be c c Sclose = {y1c , uc1 , · · · , y2N +j−1 , u2N +j−1 }
Note that it is not necessary to have the same lengths between open-loop and closed-loop data. Form open-loop data Hankel matrices, Up , Uf , Yp , Yf , according to Equations 3.23-3.26. Form closed-loop data Hankel matrices Yp , Yf similarly. We can then use the following procedure to calculate the multivariate MVC benchmark directly from the data Hankel matrices: 1. From Sopen , the estimate of HN , denoted by Lu , is calculated according to Equation 3.44 or the following explicit expression Lu = (Yf − Yf /Uf Wp )Uf† where Uf† is pseudo inverse of Uf . 2. From Sclose , calculate Lh according to a QR decomposition of Yf Yp† Yp ) so that 1 Lh = √ (Yf − Yf Yp† Yp )QT j
√1 (Yf j
−
where Q is a unitary matrix resulted from the QR decomposition, and extract the first block column (N m × m) of Lh , denoted as Lh,1 . 3. Then an estimate of the minimum variance or MVC benchmark can be calculated according to Jˆmvc = trace(I − Lu L†u )Lh,1 LTh,1 (I − Lu L†u )T 4. The performance index can be calculated according to the following formula: η=
Jˆmvc trace(Cov(yt ))
and individual indices for each output can be calculated by replacing trace with diag when calculating the benchmark variance and the actual variance. Remark 10.1. A few practical points in the implementation of Algorithm 10.5 are summarized in the following: • The estimation of Lu is subject to disturbances. Thus, Lu may not be a lower triangular matrix although it should be theoretically. For the same reason, some elements in the lower triangular part, which are supposed to be zero due to time delays, may not be zero. These small valued-elements, particularly in
10.3 Estimation of MVC-benchmark from Input/Output Data
189
the upper triangular part, if not cleaned, may affect or even cause the algorithm to fail. Thus, it is recommended to compare each element of Lu with its confidence intervals and reset it to zero if it is within the confidence interval. Unfortunately, there is no systematic result available to calculate the confidence interval of Lu in the literature. An approximate solution is to calculate the statistics based on the upper triangular elements of Lu , which are supposed to be all zeros theoretically. Therefore, they represent the effect of disturbances if they are not zero, and a statistic such as the standard deviation of the disturbance effect may be calculated from these elements. A confidence interval can then be constructed from the statistic and may be used to clean the elements in the lower triangular part of Lu . All upper triangular elements should be cleaned and set to zero. Considering the uncertainty, care must be taken when choosing the tolerance for the pseudo inverse L†u . The tolerance may be chosen according to the above calculated statistic. • The choice of the horizon N for open-loop data is also important. In view of the quantity (I − Lu L†u ), it is clear that this quantity would have been zero if the lower triangular Toeplitz matrix Lu were invertible. This is not the case due to the existence of time delays. However according to Equation 10.28, d d † s (HN ) )HN,1 represents the impulse response of the closed-loop sys(I − HN tem under minimum variance control, or equivalently, (I − Lu L†u )Lh,1 represents the covariance (Σe )-weighted impulse response of the closed-loop system under minimum variance control. According to Huang and Shah (1999) [22], the impulse response of the closed-loop system under minimum variance control has a finite number of impulse response terms. Therefore, the dimension (I − Lu L†u )Lh,1 will not increase after a certain number, i.e., any additional elements due to the increase of N will be all zero after a certain dimension. This condition has also been shown implicitly in [25] although it is not used for their algorithm. Consequently, if N is initially chosen to be sufficiently large, then the MVC benchmark can be correctly calculated by Equation 10.49. However, since the choice of N is also related to the system “order”, too large an N implies too high an “order” used that can introduce a problem similar to “overparameterization” and can affect performance assessment results. In addition, the algorithm is limited to rectangular or “fat” systems, i.e., the number of outputs should not be more than that of the inputs. • Since the minimum variance benchmark depends on the time delays, it is preferable that the input signals for the open-loop experiment have higher frequency than the lower frequency. This has been discussed in Huang and Shah(1999) [22]. The need of a set of open-loop data seems demanding but is consistent with conventional multivariate control performance assessment algorithms where a process model/interactor matrix is needed. However, the proposed method eliminates the need to calculate the interactor matrix or process model. In addition, the open-loop test data is usually available for advanced multivariable control systems. If the open-loop test is not available, the proposed algorithm may be modified by using the closed-loop subspace approach as discussed in Chapter 4.
190
10 Subspace Approach to MIMO Feedback Control Performance Assessment
10.4 Simulations and Application Example Using two numerical simulation examples we will demonstrate the equivalence of the multivariate MVC-benchmark variance obtained using the data-driven approach discussed in this chapter to that obtained through the interactor-matrix filtering approach presented in [22]. d s cl , HN , and HN can all be Given process models, the theoretical values of HN calculated and the theoretical value of Equation 10.50 can be obtained. This provides a way to demonstrate the identity between the proposed algorithm and the traditional algorithms if no uncertainty results from the disturbances. To this end, we will first calculate the theoretical values of the MVC benchmark using the developed method as well as the traditional interactor-matrix based method to illustrate that they are indeed identical, and then compare the theoretical value with the one directly estimated from the input/output data. Consider a 2 × 2 multivariate process with an open-loop transfer function matrix Gp and a disturbance transfer function matrix Gl given, respectively, by ⎛ ⎞ Gp = ⎝
⎛
Gl = ⎝
z −(d−1) 1−0.4z −1
0.5z −d 1−0.1z −1
0.3z −(d−1) z −d 1−0.4z −1 1−0.8z −1 1 −z −1 1−0.5z −1 1−0.6z −1 z −1 1.0 1−0.7z −1 1−0.8z −1
⎠
⎞ ⎠
The interactor matrix can be calculated as ⎞ ⎛ −0.9578z d−1 −0.2873z d−1 ⎠ D=⎝ −0.2873z d 0.9578z d
The white noise excitation, et , is a two-dimensional normal-distributed white noise sequence with Σe = 0.25I. The output performance is measured by J = E[ytT yt ]. Consider that the following multiloop controller is implemented in the process: ⎞ ⎛ 0.5−0.20z −1 0 −1 ⎠ Gc = ⎝ 1−0.5z 0.25−0.200z −1 0 (1−0.5z −1 )(1+0.5z −1 ) Two sets of data are generated. The first set of data of 1000 samples is generated from open-loop simulation. The input signal is generated from the function idinput of the MATLAB System Identification toolbox with magnitude of 1 and cut-off frequency 0.5. The other set of data is routine operating data of 1000 samples, subject to the disturbance only, without external excitations. We choose N = 15 for both open-loop and closed-loop data Hankel matrices. The theoretical minimum variance benchmark and the performance index using the traditional interactor-matrix based approach are calculated according to
10.4 Simulations and Application Example
191
Huang and Shah(1999) [22]. Using the proposed approach, we can also calculate the theoretical value as well as the estimated one for different values of d. The results are shown in Figure 10.1. It is clear that the traditional approach for calculating the theoretical values, based on the interactor matrix, yields exactly the same results as the theoretical values calculated according to the proposed approach, confirming the identity of the two approaches. One can also see that the estimated values match the theoretical ones reasonably well. 1 traditional approach proposed approach (theoretical) proposed approach (estimated)
0.9
0.8
0.7
0.5
J
MVC
0.6
0.4
0.3
0.2
0.1
0
1
2
3
4 d
5
6
7
Fig. 10.1. Comparison of performance indices for different values of d
Now, consider the following setting of the models: ⎞ ⎛ z −1
−1
Gp = ⎝ 1−0.4z−1
0.5z −2 1−0.1z −1
0.3z z −2 1−0.4z −1 1−0.8z −1
⎛
Gl = ⎝
1 −z −1 1−0.5z −1 1−0.6z −1 z −1 1.0 1−0.7z −1 1−0.8z −1
⎠
⎞ ⎠
The following multiloop controller is implemented on the process: ⎞ ⎛ 0.5−0.20z −1 0 −1 ⎠ Gc = k ⎝ 1−0.5z 0.25−0.200z −1 0 −1 −1 (1−0.5z )(1+0.5z ) The interactor matrix can be calculated as ⎛ ⎞ −0.9578z −0.2873z ⎠ D=⎝ −0.2873z 2 0.9578z 2
192
10 Subspace Approach to MIMO Feedback Control Performance Assessment
The theoretical results as well as the estimated ones for different values of the control gain k together with the theoretical results obtained from the traditional interactor-matrix based approach are shown in Figure 10.2. Once again, the traditional approach for calculating the theoretical values based on the interactor matrix yields exactly the same results as the theoretical values calculated according to the proposed approach. The estimated values match the theoretical ones reasonably well. 1 traditional approach proposed approach (theoretical) proposed approach (estimated)
0.9
0.8
0.7
0.5
J
MVC
0.6
0.4
0.3
0.2
0.1
0 0.8
1
1.2
1.4 k
1.6
1.8
2
Fig. 10.2. Comparison of performance indices for different values of control gain k d d † To illustrate the feature that there is only a limited dimension of (I−HN (HN ) ) beyond which all elements are zero, we use k = 1 and N = 3 as an example shown below: ⎞ ⎛ 10 0 0 00 ⎟ ⎜ ⎟ ⎜ ⎜0 1 0 0 0 0⎟ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ 0 0 0.0826 −0.2752 0 0 d d † ⎟ I − HN (HN ) =⎜ ⎟ ⎜ ⎜ 0 0 −0.2752 0.9174 0 0 ⎟ ⎟ ⎜ ⎟ ⎜ ⎜0 0 0 0 0 0⎟ ⎠ ⎝ 00 0 0 00 s HN ,
d d † The result indicates that beyond N = 2, the dimension of I − HN (HN ) will no longer increase. Thus, the choice of N = 15 in the previous two simulations is sufficient for all computations.
10.5 Summary
193
1 0.9 0.8
Performance indices
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
1
2
3
4
5
6
7 8 9 10 Controlled variable #
11
12
13
14
15
Fig. 10.3. Performance assessment result based on the proposed subspace approach
Next, we consider a practical example. The purpose of this example is to apply the performance assessment algorithms introduced in above sections to a reaction process. The objective is to use the minimum variance control benchmark to assess the performance of the existing process controllers. This will identify the potential for further improvement in terms of variability reduction and this reduction of variability can directly be transferred to economic benefits. The process has 41 controlled variables and 15 manipulated variables, where 15 important controlled variables are selected for this analysis. While the conventional performance assessment algorithm has a considerable challenge in the computation for this example due to the large dimension of the system, the proposed subspace approach with N = 15 gives the solution of performance indices for all 15 controlled variables with reasonable computation time, and the results are shown in Figure 10.3.
10.5 Summary A novel framework based on input/output data is proposed for calculating the minimum variance benchmark for multivariate feedback control systems. No explicit model and interactor matrices are needed to perform the calculation. Calculation of the multivariate performance index based on data without using the interactor matrix or a parametric model is an important step towards practical applications of the multivariate performance assessment technique. The equivalence of the proposed approach to the conventional interactor-matrix based approach for obtaining the MVC-benchmark variance is proved and also illustrated through simulations.
11 Prediction Error Approach to Feedback Control Performance Assessment∗
11.1 Introduction As discussed in previous chapters, performance assessment of multivariate control systems requires a priori knowledge about the process models, such as interactor matrices. In recent years, there has been growing interest in reducing the complexity of the a priori knowledge requirement, such as the work by Ko and Edgar (2001) [25], Kadali and Huang (2003) [26], and McNabb and Qin (2003) [27]. Although these attempts have reduced the complexity of the a priori knowledge requirement to some extent, they all require certain information that is computationally simpler but fundamentally equivalent to the interactor matrices; for example, the open-loop process Markov parameter matrices, the lower triangular Toeplitz matrix, or the multivariate time delay (MTD) matrix. That is, they all require a priori knowledge that is beyond the pure time delays between each pair of the inputs and outputs. Harris et al. (1996, 1999) [112, 21] introduced an extended horizon performance monitoring approach without using the interactor matrix. Kadali and Huang(1998) [137] and Shah et al. (2002) [23] introduced curvature measures of multivariate performance without relying on the interactor matrix. Most recently, Huang et al. (2004) [115] proposed an algorithm for multivariate control performance assessment with a priori knowledge of the order of the interactor matrix (OIM) only. In the univariate case, one interprets output variance under minimum variance control as the variance of the optimal prediction error. One can imagine that if a closed-loop output is highly predictable, then one should be able to do better, i.e., to compensate the predictable content by a better-designed controller. Should a better controller be implemented, then the closed-loop output would have been less predictable. Therefore, the high predictability of a closed-loop output implies the potential to improve its performance by control re-tuning and/or re-design, or in other words, the existing controller may not have been satisfactory in terms of exploiting its potential. Actual process systems often have time delays, which prevent the complete compensation of the predictable content of the output. For example, if a univariate process has two sample time delays, then the compensation control action ∗
This chapter (with revisions) is reprinted from Journal of Process Control, vol 16, Huang, B., S.X. Ding, and N. Thornhill, “Alternative Solutions to Multivariate c 2006 Elsevier Feedback Control Performance Assessment Problems”, 457-471, Ltd., with permission from Elsevier.
B. Huang et al.: Dyn. Model. Predict. Ctrl. & Perform. Monitor., LNCIS 374, pp. 195–211, 2008. c Springer-Verlag London Limited 2008 springerlink.com
196
11 Prediction Error Approach to Feedback Control Performance Assessment
will not take effect on the output until two samples later and the one-step-ahead prediction will not be useful for its compensation. In this case, the best that a controller can do is to compensate the predicted content according to the twostep optimal prediction (multi-step optimal prediction) and the minimum control error will coincide with the two-step optimal prediction error. Therefore, in this example, the two-step optimal prediction error is the lower bound of the output error that can be achieved by a feedback controller. This lower bound is also known as the minimum variance that is often used for control loop performance assessment [19]. Although the same rationale cannot be exactly carried over to multivariate processes due to the relatively complex delay structure of multivariate processes, multi-step optimal predictions can nevertheless provide useful information about the control performance [136]. The plot of multi-step optimal prediction error vs the prediction horizon is analogous to the closed-loop step response of the univariate process to the disturbance. The analogy provides an interesting interpretation of the multi-step optimal prediction error for multivariate processes and results in a novel performance measure. Since MPC is a multivariable control strategy based on optimal multi-step predictions, the multi-step prediction-error based approach for performance assessment is relevant to MPC. Motivated by the above rationale, this chapter is concerned with (1) development and analysis of an alternative performance assessment approach based on optimal predictions, (2) development of two datadriven subspace algorithms to estimate the multi-step optimal prediction error variance. The remainder of this chapter is organized as follows. Prediction-error based multivariate feedback control performance assessment without relying on any a priori knowledge of the process is presented in Section 11.2. Data-driven subspace algorithms for the computation of the multi-step prediction error variance are derived in Section 11.3, followed by summary in Section 11.4. While Chapter 10 represents a novel approach to multivariate feedback control performance assessment with minimum variance control as the benchmark, the current chapter presents an alternative approach to multivariate feedback control performance assessment, which can be useful for MPC performance monitoring.
11.2 Prediction Error Approach to Feedback Control Performance Assessment In this section, we consider assessment of multivariate feedback control loop performance without relying on any a priori knowledge of interactor matrices or process models. There are several interactor-matrix free methods in the literature, mainly based on closed-loop impulse responses [22, 136, 23], and variance of multi-step prediction errors [112, 137, 136]. Earlier work in using the interactor-free approach may be traced back to [139, 140]. In this section, we shall extend the above mentioned methods, mainly the variance-of-predictionerror based methods, to a novel closed-loop potential graphic measure and a single numerical metric to evaluate control performance potential.
11.2 Prediction Error Approach to Feedback Control Performance Assessment
197
Consider a closed-loop multivariate process represented by a moving average or a Markov parameter model: yt = F0 et + F1 et−1 + · · · + Fi−1 et−(i−1) + Fi et−i + · · ·
(11.1)
This moving average model can be estimated from routine operating data via multivariate ARMA (AutoRegressive Moving Average) modeling, followed by a long division, without any a priori knowledge about the interactor matrices. Since et is white noise, the optimal ith step prediction is given by yˆ(t|t − i) = Fi et−i + Fi+1 et−i−1 + · · ·
(11.2)
and the prediction error ε(t|t − i) = yt − yˆ(t|t − i) is given by ε(t|t − i) = F0 et + F1 et−1 + · · · + Fi−1 et−(i−1)
(11.3)
The covariance of the prediction error can be calculated as T Cov(ε(t|t − i)) = F0 Σe F0T + F1 Σe F1T + · · · + Fi−1 Σe Fi−1
and its scalar measure as
T ) si = trace[Cov(ε(t|t − i))] = trace(F0 Σe F0T + F1 Σe F1T + · · · + Fi−1 Σe Fi−1
The incremental prediction error can be calculated as
T ri = trace[Cov(ε(t|t − i)) − Cov(ε(t|t − (i − 1)))] = trace(Fi−1 Σe Fi−1 )
If we plot si versus i, then the plot reflects how the prediction error increases with the prediction horizon. Note that as i → ∞, Cov(ε(t|t − i)) → Cov(yt ). This fact can be seen by comparing Equation 11.1 with Equation 11.3, which becomes identical when i → ∞. Remark 11.1. si is nothing but the disturbance-covariance weighted sum of squares of the closed-loop impulse response coefficient matrices, up to time i − 1. As a result, si is analogous to the step response to some extent, and can be used to determine dynamic information such as the settling time of the closed-loop response to the disturbance. As explained in [23], ri is a 2-norm measure of the impulse response coefficient matrices and is analogous to the squared impulse response coefficients of a univariate process. Therefore, this plot of ri versus i, is also an indication of closed-loop performance of a multivariate controller, which has been discussed in the literature [22, 23]. In [136], the impulse response as a measure of control performance has been extended to the individual impulse response of each output in response to each shock of the disturbances as a measure of the interactions of the variables. The researchers [136] have also proposed the use of forecast-error variance decomposition (FEVD) for the measure of interactions. While si is the overall measure of prediction error variance, the FEVD is the decomposition of prediction error variance of individual variable to each shock of the disturbances.
198
11 Prediction Error Approach to Feedback Control Performance Assessment
Motivated by the interpretation of ri and si , we define the closed-loop potential pi as s∞ − si pi = (11.4) s∞ Since si is monotonically increasing with i, pi is monotonically decreasing. Since s0 = trace[Cov(yt − yˆ(t|t))] = 0, p0 = 1. Therefore, pi starts from 1 at i = 0 and monotonically decreases to 0 and 0 ≤ pi ≤ 1. Unlike the impulse response or variance of prediction error, pi is dimensionless and this facilitates the comparison of control performance. pi can be interpreted as follows: If a deadbeat control action can take effect from time i, then the process output SSE can be reduced by 100 × pi percent. From a stochastic view point, if i equals the interactor matrix order d, it is possible that the variance of the multivariate output can be reduced by 100×pi percent of the current variance for a simple interactor matrix structure [153, 115, 22]. Since the order of the actual interactor matrix may not be known, one would look for the trajectory of the closed-loop potential versus a range of possible d. Potential plots such as those illustrated in Figure 11.1 are useful. The fact that the potential decays faster to zero indicates less possibility to improve the control performance further. Due to the monotonically decreasing nature of the potentials and their fixed starting and ending values, the area below the potential plot reflects the rate of its decaying. Therefore, it is possible to define a scalar index to monitor the change of the closed-loop potential. This index is called the relative closed-loop potential index and can be calculated as (2) p di (11.5) ηp = i(1) − 1 pi di (1)
where pi is a reference potential calculated, for example, from the data sampled (2) before control tuning, and pi is calculated from data sampled after the tuning. The value of ηp gives the percent change of the closed-loop potential with the positive sign indicating an increased potential and the negative sign indicating a decreased potential. Note that an increase of the potential implies a deteriorated tuning performance while a decrease of the potential implies an improved tuning performance. Analogous to FEVD [136], closed-loop potentials can also be defined for the individual output variable and the relative closed-loop potential index for each output can also be derived. To calculate the potential of an individual variable, the trace operator trace[.] should be replaced by diagonalization operator diag[.]. The closed-loop potential is an extension of variance of prediction error as a performance measure. It is naturally related to variance-of-prediction-error based measures for performance assessment, such as P I3 (k) in [112] where, however, only a fixed k was considered. The graphic extension, scalar measure of closedloop potential, and its interpretations presented in this chapter have provided great enhancements to the previous methods. Example 11.1. Consider a 2×2 multivariable process with the open-loop transfer function matrix Gp and the disturbance transfer function matrix Gl given by
11.2 Prediction Error Approach to Feedback Control Performance Assessment
199
1 0.8
p
i
0.6 0.4 0.2 Decreasing control gain 0
0
1
2
3
4
5 i or d
6
7
8
9
10
Fig. 11.1. Illustration of closed-loop potential pi plot
⎛
Gp = ⎝ ⎛
Gl = ⎝
z −1 0.5z −2 1−0.4z −1 1−0.1z −1 0.3z −1 z −2 1−0.4z −1 1−0.8z −1 1 −z −1 1−0.5z −1 1−0.6z −1 z −1 1.0 1−0.7z −1 1−0.8z −1
⎞ ⎠
⎞ ⎠
The disturbance, et , is a two-dimensional normal-distributed white noise sequence with Σe = I. Consider that the following multiloop controller is implemented on the process: ⎞ ⎛ −1 k 0.5−0.20z 0 −1 ⎠ Gc = ⎝ 1−0.5z 0.25−0.200z −1 0 −1 −1 (1−0.5z )(1+0.5z )
In this example, three controller gains, k = 2.8, 2.9, 3.0 respectively, are considered. si and ri for i = 1, 2, · · · , 10 are calculated and plotted in Figure 11.2. The si plot (top panel) indicates that the closed-loop settling time increases with the increasing of the controller gain, so does the SSE. For example, the settling time for k = 2.8 is about 6 samples while the settling time for k = 3.0 is more than 10 samples. The ri plot (bottom panel) of Figure 11.2 presents similar information as si . However, unlike the si plot which is monotonically increasing, the ri plot has a more complicated and hard-to-interpret pattern. We therefore recommend the use of the si plot and the pi plot (to be discussed below). The si plot or its equivalent has also been discussed in [137, 136]. The potential plot of pi shown in Figure 11.3 is more relevant in the interpretation of control performance. For example, pi for k = 3.0, has the slowest rate to approach its steady state and thus its potential decreases to zero at the slowest rate. For a considerable range of the process delays (expressed by interactor order d for example), its potential is significantly different from zero. As an example, for an interactor order up to 5 samples, the potential is larger than 0.3, i.e., 30% reduction of variance is possible for the interactor order up to 5.
200
11 Prediction Error Approach to Feedback Control Performance Assessment
On the other hand, for the tuning of k = 2.8, the potential dies to zero quickly. In this case, there is not much potential left after the interactor order is greater than 5. 20
si
15
10 k=2.8 k=2.9 k=3.0
5
0
0
5
10
15
i 4 k=2.8 k=2.9 k=3.0
ri
3
2
1
0
0
5
10
15
i
Fig. 11.2. si and ri plots
For control tuning of multivariate systems or control upgrading from multiloop control to multivariable control such as MPC, one is interested in whether control performance is indeed improved. If an existing controller gain is k = 2.9, assume that the gain is tuned to 2.8 or 3.0 and representative closed-loop data are sampled before and after the tuning. Then the corresponding scalar measures of the relative closed-loop potentials calculated from the data are −0.16 and 0.28 for tunings k = 2.8 and 3.0 respectively with tuning k = 2.9 as the reference. These results indicate that 1) if the controller gain increases to 3.0, the resulting system has increased closed-loop potential by 28%, indicating a deteriorated performance; 2) if the controller gain decreases to 2.8, the resulting system has reduced closed-loop potential by 16%, indicating an improved performance. The scalar measures of the individual relative closed-loop potentials for the first output calculated from the data are −0.18 and 0.27 for tunings k = 2.8 and 3.0, respectively. The scalar measures of the individual relative closed-loop potentials for the second output calculated from the data are −0.04 and 0.10 for tunings k = 2.8 and 3.0, respectively. These results indicate that the tunings have much less effect on the second output than the first output. This fact can also be visualized from Figure 11.4. More interesting elaboration of individual potentials will be discussed in a case study example shortly.
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
201
1 k=2.8 k=2.9 k=3.0
0.8
p
i
0.6 0.4 0.2 0
0
5
10
15
i
Fig. 11.3. pi plot 1 k=2.8 k=2.9 k=3.0
0.6
i
p ,y1
0.8
0.4 0.2 0
0
5
10
15
i 1 k=2.8 k=2.9 k=3.0
0.6
i
p ,y2
0.8
0.4 0.2 0
0
5
10
15
i
Fig. 11.4. Individual closed-loop potential. The top panel is closed-loop potential of y1 ; the bottom panel is closed-loop potential of y2 .
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors 11.3.1
Preliminary
Consider again a time series model of a closed-loop system in the innovation state space form: xt+1 = Acl xt + Kcl et
(11.6)
yt = Ccl xt + et
(11.7)
202
11 Prediction Error Approach to Feedback Control Performance Assessment
Note that, if one wishes, this parametric state space model (or equivalently ARMA model) can be estimated from closed-loop routine operating data using a standard time series analysis software package. The closed-loop Markov parameter matrices can then be derived from the model, and finally prediction errors can be calculated. Here, we present two data-driven subspace algorithms to calculate the multistep optimal prediction errors, from which both si and pi can easily be calculated without needing a model. The relation between the state space representation, Equations 11.6-11.7, and the moving average model, Equation 11.1, can be established as the following: F0 = I F1 = Ccl Kcl F2 = Ccl Acl Kcl .. . −2 FN −1 = Ccl AN Kcl cl
(11.8)
Following the standard subspace notations as defined in Chapter 3, one can derive, through the iterative substitution of Equations 11.6 and 11.7, subspace matrix equations as
11.3.2
cl Ef Yf = ΓNcl Xf + HN cl cl Yp = ΓN Xp + HN Ep
(11.9) (11.10)
cl Xf = AN cl Xp + ΔN Ep
(11.11)
Calculation of Multi-step Optimal Prediction Errors
Rewrite Equation 11.9 as cl Yf − ΓNcl Xf = HN Ef
(11.12)
We now evaluate the following quantity: lim
j→∞
1 1 cl cl T (Yf − ΓNcl Xf )(Yf − ΓNcl Xf )T = lim HN Ef EfT (HN ) j→∞ j j
Notice that
⎛
⎜ ⎜ ⎜ 1 T lim Ef Ef = ⎜ ⎜ j→∞ j ⎜ ⎝
(11.13)
⎞
Σe Σe ..
. Σe
⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎠
(11.14)
cl has the same Substituting Equation 11.14 in Equation 11.13, noting that HN form as that of 3.32 (by replacing A, C, K with Acl , Ccl , Kcl , respectively), yields
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
1 (Yf − ΓNcl Xf )(Yf − ΓNcl Xf )T = j→∞ j ⎛ T T Σe Σe Kcl Ccl ⎜ C K Σ T T Ccl Kcl Σe Kcl Ccl + Σe ⎜ cl cl e ⎜ ⎝ ··· ··· −2 Kcl Σe ··· Ccl AN cl
203
lim
··· ··· ··· ···
···
−2 T T T Σe Kcl (AN ) Ccl cl
···
···
··· ···
⎞
··· −2 −2 T T T Ccl AN Kcl Σe Kcl (AN ) Ccl cl cl
+ · · · + Σe
⎟ ⎟ ⎟ (11.15) ⎟ ⎠
Comparing Equation 11.15 with Equation 11.8, it is established that diagonal elements of limj→∞ 1j (Yf − ΓNcl Xf )(Yf − ΓNcl Xf )T are given by 1 diag{ lim (Yf − ΓNcl Xf )(Yf − ΓNcl Xf )T } j→∞ j ⎛ F Σ FT ⎜ 0 e 0 ⎜ F1 Σe F1T + F0 Σe F0T =⎜ .. ⎜ . ⎝ FN −1 Σe FNT −1 + · · · + F0 Σe F0T
⎞ ⎟ ⎟ ⎟ ⎟ ⎠
(11.16)
That is to say, the diagonal elements that are the variance of prediction errors from one-step prediction all way to N -step prediction, can be calculated simultaneously. Thus, Equation 11.16 is a very useful integrated formula that simultaneously calculates the prediction errors over consecutive steps with a single computation. The simultaneous calculation of multiple step prediction errors provides a means for the visualization of the prediction errors si and closed-loop potentials pi over the extended horizon when the interactor order is unknown. Next we need to derive ΓNcl Xf from data. Considering the fact that the future disturbance Ef (white noise) is independent of the past output Yp , performing an orthogonal projection of row space of Equation 11.9 on to the row space of Yp yields (11.17) Yf /Yp = ΓNcl Xf /Yp As shown in Section 3.5.2, Xf /Yp is one realization of the Kalman filter state for the system represented by Equations 11.6-11.7. Therefore, we may write Equation 11.17 as ˆf (11.18) Yf /Yp = ΓNcl X and consequently Yf /Yp is an estimate of ΓNcl Xf . The remaining problem is to calculate the orthogonal projection used in Equation 11.17, Yf /Yp , which can be expressed as Yf /Yp = LYp
(11.19)
204
11 Prediction Error Approach to Feedback Control Performance Assessment
The calculation of this projection without relying on the models (such as Equations 11.6 and 11.7) can be formally stated as: Problem 11.3. Given the measurement y0 , y1 , · · · , yN of the closed-loop process, calculate prediction errors over a finite horizon using only the output data. In view of structure of the data Hankel matrices Yp and Yf , this problem can be re-casted as: Given Yp , find an optimal (in the sense of Frobenius norm) linear predictor of Yf in the form Yˆf = LYp (11.20) to minimize the Frobenius norm of the prediction error. The problem may be solved by minimization of the following objective function: min = ||Yf − LYp ||2F L
The solution is well known in subspace literature [67], which is given by the orthogonal projection of the row space of Yf onto the row space of Yp , i.e., Yˆf = Yf /Yp = Yf Yp† Yp = Yf YpT (Yp YpT )−1 Yp
(11.21)
L = Yf YpT (Yp YpT )−1
(11.22)
Therefore, Thus, the optimal predictions over multiple steps can be calculated from the time series data yt without relaying on an explicit time series model. Remark 11.4. Equation 11.22 is a regression equation that is known as a projection in subspace system identification literature. However, for system identification or time series modeling, one has to extract the system matrices/parameters from the calculated L matrix, i.e., another regression is needed in order to estimate model parameters. Therefore, in subspace literature, the projection matrix L has not been treated as a model. However, to obtain an appropriate estimation of L, the row dimension of Yf or Yp should be larger than the order of the underlining system as suggested in [67]. Although there is no limitation on how large this row dimension can be and, thus, there is some freedom to choose it, a too-large row dimension will reduce the column dimension of Yf or Yp , leading to larger variance error of estimation of L, and also induce “overparameterization” problem as discussed in Chapter 10. On the other hand, if the covariances of the output data are calculated, one can also perform the projection using these covariance matrices. The problem is stated as: Problem 11.5. Given the measurement y0 , y1 , · · · , yN of the closed-loop process, calculate prediction errors over a finite horizon using only variance and covariance of the output data.
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
Let’s first define two covariance matrices ⎞ ⎛ Λ0 Λ−1 · · · Λ1−N ⎟ ⎜ Λ0 · · · Λ2−N ⎟ ⎜ Λ1 Σp,p = ⎜ ⎟ ⎝ ··· ··· ··· ··· ⎠ ΛN −1 ΛN −2 · · · Λ0 ⎞ ⎛ ΛN ΛN −1 · · · Λ1 ⎟ ⎜ ⎜ ΛN +1 ΛN · · · Λ2 ⎟ Σf,p = ⎜ ⎟ ⎝ ··· ··· ··· ··· ⎠ Λ2N −1 Λ2N −2 · · · ΛN
205
(11.23)
(11.24)
where
Λi = E[yt+i ytT ] It can be verified by direct substitution that 1 [Yp YpT ] = Σp,p j→∞ j 1 lim [Yf YpT ] = Σf,p j→∞ j lim
(11.25) (11.26)
Now, rewrite Equation 11.22 as L= As j → ∞
1 1 Yf YpT ( Yp YpT )−1 j j
L → Σf,p (Σp,p )−1
(11.27)
(11.28)
Thus the optimal prediction over multiple steps is given by Yˆf = Σf,p (Σp,p )−1 Yp
(11.29)
where Σf,p and Σp,p can be calculated from variance and covariance of time series data yt . No time series modeling is needed. To illustrate the algorithms, we reconsider Example 11.1. With k = 3, a set of 2000 data points is simulated. Optimal prediction errors si and closed-loop potentials pi are estimated from simulated routine operating data and compared to their theoretical values in Figure 11.5. One can see that the estimated results have a good agreement with the theoretical ones. Remark 11.6. To use the algorithms discussed, one should have one set of routine operating data for one controller setting and then produce one p-versus-i plot, which is used as a reference. A natural question is how this set of data can be obtained and the algorithm can be applied. There are several practical scenarios in which this algorithm can be applied. 1) When a control engineer is setting up a new multivariable control structure there would be opportunity to experiment with different tunings and build up a reference family of curves that can be
206
11 Prediction Error Approach to Feedback Control Performance Assessment
20
si
15
10
5
0
theoretical estimated 0
1
2
3
4
5 i
6
7
8
9
10
1 theoretical estimated
0.8
p
i
0.6 0.4 0.2 0
0
1
2
3
4
5 i
6
7
8
9
10
Fig. 11.5. Theoretical and estimated si and pi plots
compared with the potential curves in future. 2) Process control engineers could have a general engineering judgement for how their process should behave. This is similar to the user-defined benchmark [121, 22]. They would often know that it should be settled after a certain elapsed time and therefore be able to make a user-defined potential curve that is a “good enough” target for the multivariable controller. 3) When control engineers perform field tuning/re-tuning of the controllers, this algorithm provides the direction of the tunings without involving complicated calculations. One caution, when applying this method, is that the dynamics and correlation structure of the disturbance should remain the same throughout the performance assessment period to ensure a fair comparison of control performance before and after the tuning. 11.3.3
Case Study
For a petrochemical distillation column, shown in Figure 11.6, which separates chemical petro in a refinery, a multivariable predictive controller has been developed [154]. Its feed is chemical petrol from the desulfurisation, its top product is light petro with boiling point between 30 and 65.o C, and its bottom product is heavy petro with boiling point between 65 and 180.o C. According to the multivariable model provided in [154], we have designed an MPC using MATLAB /MPC toolbox, to simulate the closed-loop responses. In this section, we will use the data-driven subspace algorithms to evaluate closedloop potentials for various tunings.
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
0-50%
207
50-100% Off Gas
PC
LC FC
QT
TT
Top Prod
Steam
TC UC
Feed
FT TT PDT LC
Fig. 11.6. Schematic diagram of distillation column
This process has 10 controlled variables (CVs), 4 manipulated variables (MVs) and one disturbance variable, plus noise in most of CVs. All CVs/MVs and their corresponding parameters are shown in Table 11.1 and 11.2. Table 11.1. MVs and their constraints, weights used in MPC design No 1 2 3 4
MV Reflux PID setpoint Pressure PID setpoint Feed temperature Duty valve
Min 299 t/d 0.145 bar 0% 62%
Max 700 t/d 0.6 bar 100% 100%
Rate 20 t/d 0.001 bar/min 0.5%/min 0.1%/min
weight 0.0059 1000000 0.5153 1.7374
Table 11.2. CVs and their constraints, weights used in MPC design No 1 2 3 4 5 6 7 8 9 10
CV unit Min Max weight o Final boil point (Top) C 67.75 68.25 1 Pressure corrected temperature (bottom) o C 85.5 86.5 0.04 Pressure bar 0.15 0.15 3.33 o Feed temperature C 70.0 70.0 0.01 o Pressure corrected temperature (top) C 56.0 66.0 1.0 Pressure PID valve position % 0 42 25 Duty MW 4.0 5.5 1.0 Bypass valve position % 0 95 1.0 Reflux PID flow t/d 0 650 0.0001 Furnace duty violation MW -1000 0 1.25
208
11 Prediction Error Approach to Feedback Control Performance Assessment
1.2 Tuning 1 Tuning 2 Tuning 3
1
0.8
p
i
0.6
0.4
0.2
0
−0.2
0
5
10
15
20 i
25
30
35
40
Fig. 11.7. Comparison of overall closed-loop potentials from Tuning 1, 2 and 3 for three quality CVs 1.2 FBP Pressure Bottom PCT
1
0.8
p
i
0.6
0.4
0.2
0
−0.2
0
5
10
15
20 i
25
30
35
40
Fig. 11.8. Individual closed-loop potentials from Tuning 1 for three quality CVs
According to [154], the first three CVs are quality variables. The concentration of the heavy petro in the top product (distillate) has to be kept below 2.5%. The concentration is estimated by the final boiling point (FBP), i.e., the boiling temperature of the product if 99% of the product has been evaporated. For bottom product quality, the pressure compensated temperature (PCT) is calculated and used as CV. Meanwhile, the column pressure has to be kept at its minimum value in order to reduce the energy consumption (reboiler heating). CV4 and CV5 are auxiliary variables that are controlled with less accuracy than the first three main controlled variables. The remaining CVs are constrained variables.
11.3 Subspace Algorithm for Multi-step Optimal Prediction Errors
209
1.2 FBP Pressure Bottom PCT
1
0.8
p
i
0.6
0.4
0.2
0
−0.2
0
5
10
15
20 i
25
30
35
40
Fig. 11.9. Individual closed-loop potentials from Tuning 2 for three quality CVs
1.2 FBP Pressure Bottom PCT
1
0.8
p
i
0.6
0.4
0.2
0
−0.2
0
5
10
15
20 i
25
30
35
40
Fig. 11.10. Individual closed-loop potentials from Tuning 3 for three quality CVs
To reflect model-plant mismatch, rather than using the exact model for the MPC design, an approximate model, through the ‘linearization’ function in Simulink by perturbing the original model, is obtained and the MPC controller is designed based on the approximate model. The designed MPC controller is then implemented in the process described by the original model. All CVs except for CV6 and CV8 are subject to disturbances, which are filtered white noise with a filter time constant of 5 minutes. CV6 and CV8 (OPs) can be measured exactly and are not subject to the noise. A disturbance variable (DV) is the feed flow rate, which is simulated by filtered white noise with filter time constant 10 minutes.
Fig. 11.11. Individual closed-loop potentials p_i from Tuning 4 for the three quality CVs (FBP, Pressure, Bottom PCT)

Fig. 11.12. Comparison of overall closed-loop potentials p_i from Tunings 1, 2, 3 and 4 for the three quality CVs
For the MPC design, the sampling time is chosen as 1 minute, the prediction horizon as 10 samples, and the control horizon as 2 samples. Since CV3 and CV4 are setpoint tracking variables (with identical upper and lower constraints), the step disturbance option is selected in the GUI of the MPC toolbox. We consider different tunings by adjusting the "Overall" performance knob from 0 (most robust) through 0.5 (median) to 1 (fastest response); these three tunings are labeled tunings 1, 2 and 3. Tuning 4 has the same parameters as tuning 2 except for the weighting of CV1, which is increased from 1 to 100. In computing the overall closed-loop potential, the data of each CV is scaled by the square root of the weighting of the corresponding CV before performing any further computations.
Comparison of the overall closed-loop potentials for tunings 1, 2 and 3 is shown in Figure 11.7. The numerical measures of the relative potentials are -0.048 for tuning 1 and -0.054 for tuning 3, with tuning 2 as the reference. These numerical measures show that there is not much change in performance among the first three tunings. However, by visualizing the individual closed-loop potentials shown in Figures 11.8, 11.9 and 11.10, one can see that there is some difference between tuning 1 and tunings 2/3, while tunings 2 and 3 are quite similar. The difference lies in the FBP (CV1), which is the most important quality variable. Both tunings 2 and 3 deteriorate the control performance of this critical CV while slightly improving the other two CVs, relative to tuning 1. Therefore, tuning 1 is recommended among these three tunings. The individual closed-loop potentials also indicate that the first three tunings do not yield good control performance for FBP compared with the other two critical CVs; that is, FBP has the slowest-decaying closed-loop potential across all three tunings. Assuming that improving the control of FBP is the most important objective, we simulate tuning 4 (with more weighting on FBP); the comparison of the overall closed-loop potentials for all four tunings is shown in Figure 11.12. The numerical closed-loop relative potential for tuning 4 is -0.17 with tuning 2 as the reference, a clear indication of improved performance. It therefore turns out that tuning 4 is the best among the four tunings in terms of overall performance. The individual closed-loop potentials shown in Figure 11.11 indicate that the improvement comes from the improvement of FBP at the cost of the Bottom PCT, showing a tradeoff between the top and bottom product qualities.
11.4 Summary

In this chapter, we have discussed prediction-error based data-driven subspace solutions to the multivariate feedback control performance assessment problem, which is useful for MPC performance monitoring. We have elaborated a novel performance measure based on the closed-loop potential. The solution builds on the multi-step optimal prediction error, and two data-driven subspace algorithms have been developed for estimating the optimal prediction errors and closed-loop potentials. A case study of an MPC-controlled distillation column has demonstrated the key features of the algorithms.
12 Performance Assessment with LQG-benchmark from Closed-loop Data∗
12.1 Introduction

The MVC-benchmark discussed in Chapter 10 may not be directly applicable to assessing the performance of control systems whose objective is not just minimizing the process output variance but also keeping the input variability (for example, valve movement) within some specified range in order to reduce upsets to other processes, conserve energy and lessen equipment wear. The objective of such controllers may be expressed as minimizing a linear quadratic function of the input and output variances, and the LQG-benchmark is a more appropriate benchmark for assessing their performance. However, calculation of the LQG-benchmark requires a complete process model [155, 22], which is a demanding requirement or simply not possible in practice; an open-loop test for obtaining the process model may not always be feasible. A frequency domain approach is proposed by Kammer [156, 157, 158] for testing the LQ optimality of a controller using closed-loop data with setpoint excitations. However, this approach does not give quantitative values for the controller performance in terms of the process input and output variances; in other words, it does not separate the non-optimality/optimality with respect to the process response (output) variance from that with respect to the process input variance. The idea of using a subspace-matrices based approach to obtain the LQG-benchmark variances of the process input and output for controller performance assessment has been explored in [159]. The required subspace matrices, corresponding to the deterministic and stochastic inputs respectively, are estimated from closed-loop data with setpoint excitation. This method is applicable to both univariate and multivariate systems.

Subspace identification methods allow estimation of a state space model for the system directly from process data. As discussed in previous chapters, certain subspace matrices, corresponding to the states (or past inputs and outputs), deterministic inputs and stochastic inputs, are identified as an intermediate step in subspace identification methods. Chapter 5 illustrates identification of the deterministic subspace matrix and the stochastic subspace matrix from closed-loop data without requiring any a priori knowledge of the controllers.∗
∗ This chapter (with revisions) is reprinted from ISA Transactions, vol. 41, R. Kadali and B. Huang, "Controller Performance Analysis with LQG Benchmark Obtained under Closed Loop Conditions", pp. 521-537, © 2002 ISA. All Rights Reserved. Reprinted with permission. Sections 12.2 and 12.3 have major revisions.
This method requires setpoint excitations and is also applicable to the case of measured disturbances. It provides a means for the calculation of more practical controller performance benchmarks, such as the LQG-benchmark, using closed-loop data. As will be shown in this chapter, an explicit process model is not required for obtaining the LQG-benchmark. Methods for designing the optimal LQG controller directly from subspace matrices have been proposed in [31, 13, 14]; the performance assessment problem with the LQG-benchmark will be elaborated in this chapter.

If some of the disturbance variables are measurable, analysis of feedforward control performance is also a worthwhile study. This analysis requires the subspace matrix corresponding to the measured disturbance variables. Using the subspace approach discussed in Chapter 5, the subspace matrices corresponding to the measured disturbance variables can also be estimated under closed-loop conditions. This provides a means for the profit analysis of implementing feedforward control on the process.

This chapter is organized as follows. In Section 12.2 a method for obtaining the LQG-benchmark from closed-loop data is presented. The LQG-benchmark with measured disturbances is discussed in Section 12.3. Controller performance analysis indices are defined and described in Section 12.4. A summary of all methods is presented in Section 12.5. Simulation results are discussed in Section 12.6, followed by an application on a pilot-scale process in Section 12.7. A chapter summary is given in Section 12.8.
12.2 Obtaining LQG-benchmark from Feedback Closed-loop Data

As shown in Equations 3.1-3.2, a linear time-invariant system can be described in the state space innovations form as

x_{t+1} = Ax_t + Bu_t + Ke_t    (12.1)

y_t = Cx_t + Du_t + e_t    (12.2)
To assess controller performance, we compare the variance of the current controller, in terms of the output, the input or both, with the variance under optimal control. This raises the question of how to select the optimal controller. Though the primary objective of a control system is often to minimize the output variance, we may also want to limit the input variance for reasons such as energy conservation and limiting equipment wear. In other words, a compromise between the process input variance and output variance is necessary. LQG control is one such control: it takes into account both the input and output variances of the process and represents a limit of control performance in terms of input and output variances [125, 155, 22]. To obtain the input and output variances under LQG control or, in short, the LQG-benchmark variances, we need to obtain the closed-loop expressions for the process inputs and outputs in relation to the disturbances entering the process. We make the derivation following the approach of [25].
Let a single random noise e_0 be introduced to the system described by Equations 12.1 and 12.2 at time t = 0, when the system is at steady state. Then e_0 propagates as follows:

\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{N-1} \end{pmatrix} =
\begin{pmatrix}
D & 0 & 0 & \cdots & 0 \\
CB & D & 0 & \cdots & 0 \\
CAB & CB & D & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
CA^{N-2}B & CA^{N-3}B & CA^{N-4}B & \cdots & D
\end{pmatrix}
\begin{pmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_{N-1} \end{pmatrix}
+
\begin{pmatrix} I \\ CK \\ CAK \\ \vdots \\ CA^{N-2}K \end{pmatrix} e_0

or, in short (following the subspace notation of Chapter 3),

y_{0|N-1} = H_N^d u_{0|N-1} + H_{N,1}^s e_0

where H_{N,1}^s is the first block column of H_N^s. Using shorthand notation, this equation can be further written as

y_{0|N-1} = L_u u_{0|N-1} + L_{e,1} e_0    (12.3)
where L_{e,1} is the first block column of L_e. Consider a regulatory finite-horizon LQG objective function defined as

J = \sum_{i=0}^{N-1} ( y_i^T y_i + u_i^T (\lambda I) u_i )

This can be written as

J = y_{0|N-1}^T y_{0|N-1} + u_{0|N-1}^T (\lambda I) u_{0|N-1}

Minimizing J, considering the model expressed in Equation 12.3, yields the optimal control law

u_{0|N-1}^{opt} = -(L_u^T L_u + \lambda I)^{-1} L_u^T L_{e,1} e_0

Substituting it into Equation 12.3 gives the corresponding optimal output expression

y_{0|N-1}^{opt} = [I - L_u (L_u^T L_u + \lambda I)^{-1} L_u^T] L_{e,1} e_0
Define the following two matrices:

\begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{N-1} \end{pmatrix} = -(L_u^T L_u + \lambda I)^{-1} L_u^T L_{e,1}

and

\begin{pmatrix} \gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_{N-1} \end{pmatrix} = [I - L_u (L_u^T L_u + \lambda I)^{-1} L_u^T] L_{e,1}
When the random noise occurs at every sampling instant, we can apply the principle of superposition to get the optimal sequence of control inputs as

u_0^{opt} = \psi_0 e_0
u_1^{opt} = \psi_0 e_1 + \psi_1 e_0
u_2^{opt} = \psi_0 e_2 + \psi_1 e_1 + \psi_2 e_0
  \vdots
u_{N-1}^{opt} = \psi_0 e_{N-1} + \psi_1 e_{N-2} + \cdots + \psi_{N-1} e_0
u_N^{opt} = \psi_0 e_N + \psi_1 e_{N-1} + \cdots + \psi_{N-1} e_1
  \vdots
u_t^{opt} = \sum_{i=0}^{N-1} \psi_i e_{t-i}

and, similarly, the sequence of the corresponding optimal outputs as

y_t^{opt} = \sum_{i=0}^{N-1} \gamma_i e_{t-i}
From the above equations, we can calculate the LQG-benchmark variances (covariances) of the process inputs and outputs as

Cov[u_t] = \sum_{i=0}^{N-1} \psi_i Cov[e_t] \psi_i^T    (12.4)

Cov[y_t] = \sum_{i=0}^{N-1} \gamma_i Cov[e_t] \gamma_i^T    (12.5)
As can be seen from the above equations, only the subspace matrices L_u and L_e are required for obtaining the LQG-benchmark variances of the process input and
output. Therefore the closed-loop subspace identification method presented in Chapter 5 can be used for obtaining the LQG control variances of the process inputs and outputs. For obtaining the LQG-benchmark tradeoff curve, define

u_{lqg} = trace{ Cov[u_t] }    (12.6)

y_{lqg} = trace{ Cov[y_t] }    (12.7)

For different values of λ, the values of u_{lqg} and y_{lqg} are obtained. A plot of u_{lqg} vs. y_{lqg} represents the LQG performance tradeoff curve that can be used for controller performance assessment.
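As an illustration of Equations 12.4-12.7, a minimal Python/NumPy sketch of the λ-sweep is given below. The function name and interface are ours, not from any toolbox; L_u, L_{e,1} and Cov[e_t] are assumed to have already been obtained, e.g., via the closed-loop subspace identification of Chapter 5.

```python
import numpy as np

def lqg_tradeoff_curve(Lu, Le1, cov_e, lambdas, ny, nu):
    """Points (u_lqg, y_lqg) of the LQG tradeoff curve (Equations 12.4-12.7).

    Lu    : (N*ny x N*nu) lower block-triangular Toeplitz matrix of Markov
            parameters (the deterministic subspace matrix).
    Le1   : (N*ny x ny) first block column of the stochastic subspace matrix.
    cov_e : (ny x ny) innovation covariance Cov[e_t].
    """
    N = Le1.shape[0] // ny
    curve = []
    for lam in lambdas:
        # stacked psi blocks: -(Lu'Lu + lam I)^{-1} Lu' Le1; gamma = Le1 + Lu psi
        psi = -np.linalg.solve(Lu.T @ Lu + lam * np.eye(N * nu), Lu.T @ Le1)
        gamma = Le1 + Lu @ psi
        cov_u = sum(psi[i*nu:(i+1)*nu] @ cov_e @ psi[i*nu:(i+1)*nu].T
                    for i in range(N))                      # Equation 12.4
        cov_y = sum(gamma[i*ny:(i+1)*ny] @ cov_e @ gamma[i*ny:(i+1)*ny].T
                    for i in range(N))                      # Equation 12.5
        curve.append((np.trace(cov_u), np.trace(cov_y)))    # Equations 12.6-12.7
    return curve
```

Plotting the second coordinate against the first over the sweep of λ then gives the tradeoff curve of Figure 12.1.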
12.3 Obtaining LQG-benchmark with Measured Disturbances

If measured disturbance variables are available, obtaining optimal benchmark variances helps us analyze two things: (i) performance assessment of an existing feedforward plus feedback controller and (ii) profit analysis of implementing a feedforward controller on the process. This analysis in terms of the process output variance, using the feedback and feedforward MVC-benchmark, has been provided in [160, 155, 161, 162, 163]. In this section we provide the analysis, in terms of both the process output variance and the process input variance, using the LQG-benchmark.

Consider the case where measurements of some of the disturbance variables, m_t (h × 1), are available, where m_t is assumed to be white noise independent of e_t; if this assumption is not true, then pre-whitening is needed. The process state space representation, Equations 3.1-3.2, is modified to include the measured disturbances, as shown in Section 5.4, as

x_{t+1} = Ax_t + \begin{pmatrix} B & B_m \end{pmatrix} \begin{pmatrix} u_t \\ m_t \end{pmatrix} + Ke_t

y_t = Cx_t + \begin{pmatrix} D & D_m \end{pmatrix} \begin{pmatrix} u_t \\ m_t \end{pmatrix} + e_t

The counterpart of Equation 12.3 is

y_{0|N-1} = L_u u_{0|N-1} + L_{e,1} e_0 + L_{m,1} m_0

where L_{m,1} is the first block column of L_m (or H_N^m), defined in Equation 5.27. To obtain the LQG-benchmark variances with measured disturbances we need to derive closed-loop expressions for the process inputs and outputs in relation to the disturbances entering the process. Following the same procedure as in the derivation without measured disturbances, we obtain the optimal input sequence when a single unmeasured noise e_0 and a single measured disturbance m_0 enter the process at time t = 0:

u_{0|N-1}^{opt} = -(L_u^T L_u + \lambda I)^{-1} L_u^T (L_{e,1} e_0 + L_{m,1} m_0)
The corresponding output sequence is

y_{0|N-1}^{opt} = [I - L_u (L_u^T L_u + \lambda I)^{-1} L_u^T] (L_{e,1} e_0 + L_{m,1} m_0)

Define

\begin{pmatrix} \omega_0 \\ \omega_1 \\ \vdots \\ \omega_{N-1} \end{pmatrix} = -(L_u^T L_u + \lambda I)^{-1} L_u^T L_{m,1}

and

\begin{pmatrix} \lambda_0 \\ \lambda_1 \\ \vdots \\ \lambda_{N-1} \end{pmatrix} = [I - L_u (L_u^T L_u + \lambda I)^{-1} L_u^T] L_{m,1}
When the random noises occur at every sampling instant, we can apply the principle of superposition to get the optimal sequence of control inputs as

u_t^{opt} = \sum_{i=0}^{N-1} (\omega_i m_{t-i} + \psi_i e_{t-i})

and the sequence of the corresponding optimal outputs as

y_t^{opt} = \sum_{i=0}^{N-1} (\lambda_i m_{t-i} + \gamma_i e_{t-i})
From the above equations, we can calculate the LQG-benchmark variances (covariances) of the process input and output as

Cov[u_t] = \sum_{i=0}^{N-1} \omega_i Cov[m_t] \omega_i^T + \sum_{i=0}^{N-1} \psi_i Cov[e_t] \psi_i^T    (12.8)

Cov[y_t] = \sum_{i=0}^{N-1} \lambda_i Cov[m_t] \lambda_i^T + \sum_{i=0}^{N-1} \gamma_i Cov[e_t] \gamma_i^T    (12.9)
Hence only the subspace matrices L_u, L_m and L_e are required for obtaining the LQG-benchmark variances of the process input and output. Now,

u_{lqg} = trace{ Cov[u_t] }    (12.10)

y_{lqg} = trace{ Cov[y_t] }    (12.11)
Plotting u_{lqg} vs. y_{lqg} for different values of λ, as explained in the previous section, yields the LQG feedforward-plus-feedback tradeoff curve. It can be compared with the feedback-only LQG performance tradeoff curve to analyze the benefits of implementing feedforward control.
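Under the same assumptions as the earlier sketch, the measured-disturbance case only adds a second driving term: the ω_i and λ_i blocks are obtained from L_{m,1} exactly as the ψ_i and γ_i blocks are obtained from L_{e,1}. A sketch of Equations 12.8-12.11 follows (function names again ours):

```python
import numpy as np

def _block_quad_sum(X, S, b):
    """Sum of X_i S X_i^T over the b-row blocks X_i stacked in X."""
    return sum(X[i*b:(i+1)*b] @ S @ X[i*b:(i+1)*b].T
               for i in range(X.shape[0] // b))

def lqg_tradeoff_curve_ff(Lu, Le1, Lm1, cov_e, cov_m, lambdas, ny, nu):
    """FF+FB LQG tradeoff curve points (u_lqg, y_lqg), Equations 12.8-12.11."""
    N = Le1.shape[0] // ny
    curve = []
    for lam in lambdas:
        S = np.linalg.solve(Lu.T @ Lu + lam * np.eye(N * nu), Lu.T)
        psi, omega = -S @ Le1, -S @ Lm1                 # input coefficient blocks
        gamma, lmb = Le1 + Lu @ psi, Lm1 + Lu @ omega   # output coefficient blocks
        cov_u = _block_quad_sum(psi, cov_e, nu) + _block_quad_sum(omega, cov_m, nu)
        cov_y = _block_quad_sum(gamma, cov_e, ny) + _block_quad_sum(lmb, cov_m, ny)
        curve.append((np.trace(cov_u), np.trace(cov_y)))
    return curve
```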
12.4 Controller Performance Analysis

One of the advantages of the LQG-benchmark is that controller performance can be assessed in terms of both the process response (output) variance and the process input variance. The LQG tradeoff curve in Figure 12.1 represents the limit of controller performance in terms of process input and output variances [125]. That is, all linear controllers (from PID and MPC to any advanced control) can only operate in the region above the curve. Several performance indices useful in the analysis of controller performance can be obtained from the LQG-benchmark curve, as discussed in the following three cases, for which Figure 12.1 will be useful.
Fig. 12.1. LQG control performance tradeoff curve (y_{lqg} vs. u_{lqg}), showing an actual (non-optimal) performance point (V_u, V_y) above the optimal performance limit, with η = V_{yo}/V_y and E = V_{uo}/V_u. V_u and V_y represent the variances obtained from process data, while V_{uo} and V_{yo} represent the LQG-benchmark variances.
12.4.1 Case 1: Feedback Controller Acting on the Process with Unmeasured Disturbances
Consider the case when a feedback-only controller is acting on the process, with the actual input and output variances denoted (V_u)_{fb} and (V_y)_{fb} respectively. The closer (V_u)_{fb} and (V_y)_{fb} are to the tradeoff curve, the closer the controller performance is to the optimal. If the optimal output variance corresponding to (V_u)_{fb} is (V_{yo})_{fb} and the optimal input variance corresponding to (V_y)_{fb} is (V_{uo})_{fb}, then the LQG performance indices can be defined in terms of the process response variance, (η)_{fb}, and the process input variance, (E)_{fb}, as
(η)_{fb} = (V_{yo})_{fb} / (V_y)_{fb}    (12.12)

(E)_{fb} = (V_{uo})_{fb} / (V_u)_{fb}    (12.13)

(η)_{fb} and (E)_{fb} vary between 0 and 1. If (η)_{fb} is equal to 1 for the given input variance, then the controller gives optimal performance with respect to the output variance. If not, the controller is non-optimal and there is scope for improvement in the process output responses. Similarly, if (E)_{fb} is equal to 1 for the given output variance, then the controller gives optimal performance with respect to the input variance. If not, the controller is non-optimal and there is scope to reduce the input variance. The maximum possible percentage improvement in the controller performance with respect to the process output variance, without increasing the input variance, by retuning the controller, can be calculated as

(I_η)_{fb} = [(V_y)_{fb} − (V_{yo})_{fb}] / (V_y)_{fb} × 100%    (12.14)

Similarly, the maximum possible percentage improvement in controller performance with respect to the input variance, without increasing the output variance, by retuning the controller, can be calculated as

(I_E)_{fb} = [(V_u)_{fb} − (V_{uo})_{fb}] / (V_u)_{fb} × 100%    (12.15)
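Reading the curve at the actual operating point gives the indices of Equations 12.12-12.15. The sketch below assumes curve points computed as in Section 12.2; the linear interpolation between neighbouring curve points is our own simplification:

```python
import numpy as np

def fb_performance_indices(curve, Vu, Vy):
    """Indices of Equations 12.12-12.15 from tradeoff-curve points (u_lqg, y_lqg)."""
    u = np.array([c[0] for c in curve])
    y = np.array([c[1] for c in curve])
    iu = np.argsort(u)
    Vyo = np.interp(Vu, u[iu], y[iu])   # optimal output variance at the actual Vu
    iy = np.argsort(y)
    Vuo = np.interp(Vy, y[iy], u[iy])   # optimal input variance at the actual Vy
    eta, E = Vyo / Vy, Vuo / Vu         # Equations 12.12-12.13
    I_eta = 100.0 * (Vy - Vyo) / Vy     # Equation 12.14
    I_E = 100.0 * (Vu - Vuo) / Vu       # Equation 12.15
    return eta, E, I_eta, I_E
```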
12.4.2 Case 2: Feedforward Plus Feedback Controller Acting on the Process
For the case of measured disturbances and a feedforward-plus-feedback controller acting on the process, let the actual input and output variances be denoted by (V_u)_{ff&fb} and (V_y)_{ff&fb} respectively. The LQG curve is then determined using {L_u, L_e, L_m}¹ and represents the limit of performance in terms of process input and output variances for a feedforward-plus-feedback controller. Let the optimal output variance corresponding to (V_u)_{ff&fb} be (V_{yo})_{ff&fb} and the optimal input variance corresponding to (V_y)_{ff&fb} be (V_{uo})_{ff&fb}; then the optimal feedforward-plus-feedback (FF&FB) LQG performance indices can be defined in terms of the process output variance, (η)_{ff&fb}, and the process input variance, (E)_{ff&fb}, as
¹ It should be noted that the subspace matrix corresponding to the measured disturbances, L_m, may not be identifiable when a feedforward plus feedback controller is acting on the process [164]. A feedback-only controller should be acting on the process when identifying L_m.
(η)_{ff&fb} = (V_{yo})_{ff&fb} / (V_y)_{ff&fb}    (12.16)

(E)_{ff&fb} = (V_{uo})_{ff&fb} / (V_u)_{ff&fb}    (12.17)

(η)_{ff&fb} and (E)_{ff&fb} vary between 0 and 1. If (η)_{ff&fb} is equal to 1, then the controller gives optimal feedforward-plus-feedback control performance for the given input variance. If not, the controller is non-optimal and has potential for improvement by retuning. Similarly, if (E)_{ff&fb} is equal to 1, then the controller gives optimal feedforward-plus-feedback control performance for the given output variance. The maximum possible percentage improvement in the controller performance with respect to the process output variance, without increasing the input variance, by retuning the controller, is calculated as

(I_η)_{ff&fb} = [(V_y)_{ff&fb} − (V_{yo})_{ff&fb}] / (V_y)_{ff&fb} × 100%    (12.18)

Similarly, we can define the percentage improvement in terms of the input variance as

(I_E)_{ff&fb} = [(V_u)_{ff&fb} − (V_{uo})_{ff&fb}] / (V_u)_{ff&fb} × 100%    (12.19)
12.4.3 Case 3: Feedback Controller Acting on the Process with Measured Disturbances
Consider the case where a feedback-only controller is acting on the process and measured disturbance variables are available. We want to know how much improvement in the controller performance is possible by implementing feedforward control in addition to the existing feedback-only control. Implementing optimal feedforward control will decrease the process output variance; the same may not hold for the input variance, which may increase or decrease. The following analysis helps determine the incentive for implementing a feedforward controller on the process. We can obtain (V_u)_{fb} and (V_y)_{fb} from process data. We then construct two LQG tradeoff curves: (i) identify {L_u, L_m, L_e} and construct the FF&FB LQG tradeoff curve to obtain (V_{yo})_{ff&fb} and (V_{uo})_{ff&fb}; (ii) treat the measured and unmeasured disturbances as a lumped set of (m × 1) unmeasured disturbances, identify {L_u, L_e}, and construct the FB-only LQG tradeoff curve to obtain (V_{yo})_{fb} and (V_{uo})_{fb}. The maximum possible improvement in the optimal controller performance with the implementation of an optimal feedforward controller is then obtained, in terms of the process output variance and the process input variance, as

[(V_{yo})_{fb} − (V_{yo})_{ff&fb}] / (V_y)_{fb} × 100%    (12.20)

and

[(V_{uo})_{fb} − (V_{uo})_{ff&fb}] / (V_u)_{fb} × 100%    (12.21)

respectively. The performance analysis indices presented for the three cases above can be used to analyze the incentives, in terms of decreasing the process output variance or the process input variance, for retuning the controller.
12.5 Summary of the Subspace Approach to the Calculation of the LQG-benchmark

Controller performance analysis using the LQG-benchmark involves comparing the current process input and output variances with the variances that would result if an LQG controller were implemented on the process. The subspace method allows the calculation of the LQG-benchmark variances directly from the deterministic and stochastic process subspace matrices, thus not requiring an explicit model, and in principle consists of the following steps:

1. Estimation of the deterministic and stochastic process subspace matrices from process data. The subspace matrices can be identified by either
   • using open-loop data [65, 38] as shown in Chapter 3, or
   • using closed-loop data with setpoint excitations [164] as shown in Chapter 5.
2. Estimation of the process stochastic noise and its variance (covariance), Cov[e_t]. Also estimate Cov[m_t] if any measured disturbance is available. If the measured disturbance is not white noise, then pre-whitening is necessary, and the pre-whitened noise should be used in estimating the subspace matrices.
3. For different values of λ, calculation of the LQG-benchmark variances u_{lqg} and y_{lqg}. Plot u_{lqg} vs. y_{lqg} to obtain the LQG performance tradeoff curve.
4. Corresponding to the current process input and output variances, V_u and V_y respectively, obtaining the optimal variance values V_{uo} and V_{yo}, for both the feedback-only and feedforward-plus-feedback control cases, and calculation of the controller performance analysis indices: (η)_{fb}, (E)_{fb}, (I_η)_{fb}, (I_E)_{fb}, (η)_{ff&fb}, (E)_{ff&fb}, (I_η)_{ff&fb}, (I_E)_{ff&fb}.
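A minimal sketch of how these steps chain together for the feedback-only case, reusing the illustrative functions defined earlier in this chapter (all names ours; L_u, L_{e,1} and Cov[e_t] are assumed to come from the identification step, and V_u, V_y from routine operating data):

```python
import numpy as np

def assess_fb_controller(Lu, Le1, cov_e, Vu, Vy, ny=1, nu=1):
    """Steps 3-4 of the procedure above, for the feedback-only case."""
    lambdas = np.logspace(-2, 2, 50)   # sweep of the LQG input weighting
    curve = lqg_tradeoff_curve(Lu, Le1, cov_e, lambdas, ny, nu)  # Section 12.2 sketch
    return fb_performance_indices(curve, Vu, Vy)                 # Section 12.4 sketch
```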
12.6 Simulations

Consider the following state space system (modified from the example in [38]):

x_{t+1} = \begin{pmatrix} 0.6 & 0.6 & 0 \\ -0.6 & 0.6 & 0 \\ 0 & 0 & 0.7 \end{pmatrix} x_t + \begin{pmatrix} 1.616 \\ -0.348 \\ 2.632 \end{pmatrix} u_t + \begin{pmatrix} 0.5 \\ -0.5 \\ 0.4 \end{pmatrix} m_t + \begin{pmatrix} -1.147 \\ -1.520 \\ -3.199 \end{pmatrix} e_t

y_t = \begin{pmatrix} -0.437 & -0.505 & 0.094 \end{pmatrix} x_t - 0.776 u_t - 0.5 m_t + e_t
A process time delay of 3 samples is introduced into the above system. A PID controller, 0.1 + 0.05/s + 0.05s, is tuned; we assume that the controller model is not known in the following analysis. Closed-loop input/output data is obtained by exciting the system through the setpoint using a designed RBS signal (generated with the idinput function in MATLAB), with bandpass limits [0 0.06] and magnitude 1.
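For concreteness, a rough self-contained simulation of this setup is sketched below; the unit sampling time, the velocity-form PID discretization, and the first-order low-pass RBS generator are our own stand-ins for the Simulink/idinput configuration, and the disturbance standard deviations (0.2 and 0.1) are those described next.

```python
import numpy as np

rng = np.random.default_rng(1)

# Discrete state space system of the example
A = np.array([[0.6, 0.6, 0.0], [-0.6, 0.6, 0.0], [0.0, 0.0, 0.7]])
B = np.array([1.616, -0.348, 2.632])
Bm = np.array([0.5, -0.5, 0.4])
K = np.array([-1.147, -1.520, -3.199])
C = np.array([-0.437, -0.505, 0.094])
D, Dm, delay = -0.776, -0.5, 3          # 3-sample input time delay

T = 3030
m = 0.2 * rng.standard_normal(T)        # measured disturbance
e = 0.1 * rng.standard_normal(T)        # unmeasured disturbance (separate stream)

# Crude stand-in for MATLAB's idinput RBS with band [0 0.06]:
# low-pass filter white noise, then take the sign (magnitude 1).
r, f = np.zeros(T), 0.0
for t in range(T):
    f = 0.94 * f + 0.06 * rng.standard_normal()
    r[t] = 1.0 if f >= 0 else -1.0      # setpoint excitation

# Velocity-form PID approximating 0.1 + 0.05/s + 0.05s, assumed sampling time 1
Kp, Ki, Kd, Ts = 0.1, 0.05, 0.05, 1.0
x = np.zeros(3)
u, y = np.zeros(T), np.zeros(T)
err1 = err2 = 0.0
for t in range(T):
    ud = u[t - delay] if t >= delay else 0.0        # delayed input reaching the plant
    y[t] = C @ x + D * ud + Dm * m[t] + e[t]
    err = r[t] - y[t]
    du = Kp * (err - err1) + Ki * Ts * err + (Kd / Ts) * (err - 2 * err1 + err2)
    u[t] = (u[t - 1] if t > 0 else 0.0) + du
    x = A @ x + B * ud + Bm * m[t] + K * e[t]
    err2, err1 = err1, err
# (r, u, y, m) now form the closed-loop data set for the identification step
```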
The measured disturbance and the unmeasured disturbance are random white noise with standard deviations 0.2 and 0.1 respectively. Note that although the measured and unmeasured disturbances can have the same standard deviation, different seeds have to be used for generating them in MATLAB. Using the closed-loop subspace identification of Chapter 5, with N = 30 (rows) and j = 3000 (columns) in the data Hankel matrices, the subspace matrices L_u (30 × 30) and L_e (30 × 30) are identified. Due to the presence of noise, the upper off-diagonal elements of L_u and L_e will not be exactly zero but negligibly small numbers (they approach zero as N → ∞). A feedback-only LQG controller is considered as the benchmark for controller performance assessment. For a range of values of λ (1-30), the LQG-benchmark variances of the process input and output are obtained for both the feedback-only control case and the feedforward-plus-feedback control case, and are plotted in Figure 12.2. The controller performance analysis parameters are obtained from Figure 12.2 and shown in Table 12.1.

From the Feedback-only LQG Tradeoff Curve

From the FB-only LQG-benchmark variances we see that although the controller performance ((η)_{fb} = 0.9217) is close to optimal with respect to the process output variance, the performance index with respect to the input variability is only (E)_{fb} = 0.2560. Hence there is a maximum possible scope of 74.40% to reduce the input variance without increasing the output variance.
Fig. 12.2. Optimal LQG control performance tradeoff curves (variance of y_t vs. variance of u_t): FB-only (solid) and FF&FB (dashed), with the actual variances from the process marked

From the Feedforward and Feedback LQG Tradeoff Curve

From the FF-FB LQG-benchmark variances we see that the controller performance is still close to optimal with respect to the process output variance ((η)_{ff&fb} = 0.8634), whereas the performance index with respect to the input variance is only (E)_{ff&fb} = 0.1440. Hence there is a maximum possible scope of 85.60% to reduce the input variance without increasing the output variance. For the gained profit of implementing a feedforward controller on the process relative to the optimal feedback-only controller, we see that there is an incentive of only a 5.83% possible reduction in the process output variance, and an 11.20% possible scope for decreasing the process input variance. Thus, it can be concluded that there is not much incentive for implementing optimal feedforward control on top of optimal feedback control in this case.
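These figures follow directly from the entries of Table 12.1: (η)_{fb} = 0.0506/0.0549 = 0.9217 and (E)_{fb} = (0.32 × 10^{-3})/(1.25 × 10^{-3}) = 0.2560, while the feedforward incentives of Equations 12.20 and 12.21 evaluate to (0.0506 − 0.0474)/0.0549 × 100% ≈ 5.83% and (0.32 − 0.18) × 10^{-3}/(1.25 × 10^{-3}) × 100% = 11.20%.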
12.7 Application on a Pilot-scale Process

The subspace method for controller performance analysis using the LQG-benchmark is tested on a pilot-scale process, shown in Figure 5.8. The input (u) is the position of the inlet water flow valve and the controlled output (y) is the level of water in the tank. The tank outlet flow valve is kept at a constant position. The head of the water in the inlet pipe can be considered an (unmeasured) disturbance. The tank level is controlled by a PID controller, 5 + 0.05/s + 0.05s, with a controller sampling interval of 5 seconds. RBS sequences in the setpoint of the level are designed in MATLAB, and closed-loop data of the process input and output is collected at the same 5-second sampling rate.
Table 12.1. LQG-benchmark performance assessment parameters for the simulations

Parameter                                              Value
(V_y)_{fb}                                             0.0549
(V_u)_{fb}                                             1.25 × 10^{-3}
(V_{yo})_{fb}                                          0.0506
(V_{uo})_{fb}                                          0.32 × 10^{-3}
(η)_{fb}                                               0.9217
(I_η)_{fb}                                             7.83%
(E)_{fb}                                               0.2560
(I_E)_{fb}                                             74.40%
(V_{yo})_{ff&fb}                                       0.0474
(V_{uo})_{ff&fb}                                       0.18 × 10^{-3}
(η)_{ff&fb}                                            0.8634
(I_η)_{ff&fb}                                          13.66%
(E)_{ff&fb}                                            0.1440
(I_E)_{ff&fb}                                          85.60%
[(V_{yo})_{fb} − (V_{yo})_{ff&fb}]/(V_y)_{fb} × 100%   5.83%
[(V_{uo})_{fb} − (V_{uo})_{ff&fb}]/(V_u)_{fb} × 100%   11.20%
Fig. 12.3. Impulse response models for the process (top: estimated process IR coefficients vs. lag) and noise (bottom: estimated noise IR coefficients vs. lag) identified using closed-loop data
The subspace matrices L_u and L_e, of dimension 200, are identified using the closed-loop subspace identification method of Chapter 5. The impulse response models for the process and noise are plotted in Figure 12.3. The LQG-benchmark tradeoff curve is plotted for a range of values of λ, as shown in Figure 12.4.
Table 12.2. LQG-benchmark performance assessment parameters for the pilot-scale process

Parameter        Value
(V_y)_{fb}       1.885 × 10^{-4}
(V_u)_{fb}       1.52 × 10^{-3}
(V_{yo})_{fb}    0.71 × 10^{-4}
(V_{uo})_{fb}    1.66 × 10^{-4}
(η)_{fb}         0.378
(I_η)_{fb}       62.30%
(E)_{fb}         0.11
(I_E)_{fb}       89.00%
Fig. 12.4. Optimal LQG-benchmark curve for the CSTH (variance of y_t vs. variance of u_t)
The actual process input and output variances are compared with the optimal variances for the controller performance analysis. From Table 12.2 we can see that the controller performance is non-optimal with respect to both the input and output variances. There is a maximum possible scope of 62.30% to reduce the process output variance without increasing the input variance, and of 89.00% to reduce the process input variance without increasing the output variance.
12.8 Summary

A data-driven subspace approach has been discussed in this chapter for obtaining the LQG-benchmark from closed-loop data. It has been shown that, instead of explicit process models, only the subspace matrices corresponding to the deterministic and stochastic inputs, i.e., L_u and L_e, are required to obtain the LQG-benchmark. The closed-loop subspace identification method is used to obtain L_u and L_e, which are subsequently used to compute the LQG-benchmark. The LQG-benchmarking method is extended to the case of feedforward-plus-feedback control, and a profit analysis for the implementation of feedforward control under the LQG-benchmarking framework has also been derived and explained. The results are illustrated through a simulation example and a pilot-scale experiment.
Index

2-norm measure of impulse response coefficients, 197
AIC criterion, 51
Algorithm for estimating minimum variance benchmark directly from input-output data, 188
ARIMAX model, 117
ARMAX model, 16
ARX model, 15
ARX-prediction based closed-loop subspace identification, 60
Backshift operator, 11
Bayes rule, 172
Bayesian network for diagnosis, 171
Benchmark problem for closed-loop subspace identification, 72
Block-Hankel matrices, 37
Block-Toeplitz matrices, 39
Box-Jenkins model, 17
Canonical variables, 49
Closed-loop estimation of dynamic matrices, 79, 81, 84
Closed-loop estimation of noise model, 85
Closed-loop identification of EIV systems, 71
Closed-loop Markov parameter matrices, 202
Closed-loop potential, 166, 198
Closed-loop SIMPCA, CSIMPCA, 69
Closed-loop SOPIM, CSOPIM, 69
Closed-loop subspace identification algorithm, 85
Closed-loop subspace matrices in relation to open-loop subspace matrices, 182
Consistent estimation, 25
Constrained DMC, 115
Constraints, 107, 128
Control horizon, 107, 108
Control-loop performance assessment: conventional approach, 145
Controller performance index, 188
Controller subspace, 69
Conventional MIMO control performance assessment algorithm, 151
Conventional SISO feedback control performance assessment algorithm, 149
Covariance of the prediction error, 197
CVA, 48, 49
Data-driven subspace algorithms to calculate multi-step optimal prediction errors, 202
Decision making in performance diagnosis, 173
Delay-free transfer function, 146
Designed vs. achieved benchmark, 161
Determination of model order, 51
Diophantine equations, 118, 146
Direct closed-loop identification method, 27
Discrete transfer function, 12
Disturbance model, 105
DMC prediction equation, 109
Dynamic Bayesian network, 175
Dynamic matrix, 110, 115
Dynamic matrix control, 108
Economic objective function, 168
Economic optimization, 116
Economic performance assessment and tuning, 167
Economic performance index, 170
EIV, 63
EIV state space model, 52
EIV subspace identification, 51
Estimation of multi-step optimal prediction errors, 202
Estimation of MVC benchmark from input/output data, 183
Estimation of subspace matrices, 180
Exact discretization, 10
Feedback control invariance property, 148, 151
Feedback control invariant, 145
Feedforward control, 127
Finite step response model, 109
Forced response, 106
Free response, 106, 118
Frobenius norm, 42
Fundamentals of MPC, 103
Generalized likelihood ratio test, 164
Generalized predictive control, GPC, 117
Generalized singular value, 50
GPC control law, 118
Graphic model, 171
Guidelines for closed-loop estimation of dynamic matrix, 86
Handling disturbances in DMC, 112
Historical benchmark, 161
Historical covariance benchmark, 161
Impulse response curvature for performance monitoring, 165
Impulse response model, 104
Indirect closed-loop identification, 28
Innovation estimation approach, 61
Innovation form of state space model, 179
Instrument variable, 52, 67, 68
Instrument-variable methods, 51
Integral action, 125
Integrated white noise, 125
Interactor-matrix free methods for control performance assessment, 196
Joint input-output closed-loop identification, 29
Kalman filter states, 69
Least squares, 70, 181
Linear matrix inequality (LMI) for MPC economic performance analysis (LMIPA), 167
Local approach for model validation, 162
LQG benchmark, 159
LQG benchmark from closed-loop data: subspace approach, 214
LQG benchmark tradeoff curve, 217, 219
LQG benchmark variances of inputs, 216
LQG benchmark variances of outputs, 216
LQG benchmark with measured disturbances, 217
LQG benchmark: data-driven subspace approach, 213
LQG performance indices, 219, 220
LQG-benchmark based controller performance analysis, 219
Markov chain approach for performance monitoring, 166
Matrix inversion lemma, 84
Maximum likelihood, 50
Maximum likelihood ratio test, 164
Measured disturbances, 87
MIMO DMC problem formulation, 115
MIMO dynamic matrix, 115
MIMO feedback control performance assessment: conventional approach, 150
Minimum variance benchmark in subspace, 186
Minimum variance control, 145
Minimum variance control benchmark, 158
Minimum variance control law, 147
Minimum variance term, 151
MISO model for DMC, 114
MISO PEM model, 18
Model free approach for performance monitoring, 165
Model predictive control, 101
Model structure selection, 15
Model-based simulation for control performance monitoring, 160
MOESP, 46, 48
Monte-Carlo simulations, 72
MPC performance assessment: prediction error approach, 195
MPC performance monitoring, 157
MPC performance monitoring through model validation, 162
MPC performance monitoring: model-based approach, 158
MPC relevant model validation, 165
MPC solutions, 108
MPC tradeoff curve, 159
Multi-step optimal prediction errors: subspace algorithm, 201
Multivariate dynamic matrix control, 113
Multivariate performance assessment, 146
MVC benchmark from subspace matrices, 181
N4SID, 46, 48, 63
Noise model estimation from closed-loop data, 81
Noise model tuning, 130
Normalized multivariate impulse response (NMIR) curve, 166
Normalized residual, 164
Objective function, 107
Optimal ith step prediction, 197
Optimal prediction, 21
Optimal prediction for general linear models, 23
Order of interactor matrix, 151
Orthogonal complement, 47
Orthogonal-projection based identification, 63
Out of control index (OCI), 167
Output error model, 17
Output variance under minimum variance control expressed in subspace, 183
Penalizing control action, 111
Persistent excitation, 13
Petrochemical distillation column simulation example, 206
Prediction error approach to control performance assessment, 196
Prediction error method, 24
Prediction error method: algorithm, 25
Prediction error model, 15
Prediction horizon, 103
Prediction model for DMC, 109
Prediction-error approach for performance monitoring, 166
Predictions for MIMO DMC, 115
Primary residual, 163
Probabilistic inferencing for diagnosis of MPC performance, 171
Process model subspace, 69
QR decomposition, 48, 184
QR decomposition for projections, 66
Quadratic objective function, 117, 122
Rank determination, 65
Receding horizon, 103
Recurrence relation, 10
Reference closed-loop potential, 198
Reference trajectory, 107
Relative closed-loop potential index, 198
Riccati equation, 23
SISO feedback control performance assessment: conventional approach, 146
Solution of open-loop subspace identification by projection approach, 46
State space model of closed-loop system, 182
State space models, 105
Static Bayesian network, 174
Statistical approach, 49
Step response model, 104
Subspace approach for MIMO feedback control performance assessment, 177
Subspace expression of feedback control invariance property, 186
Subspace identification method via PCA, SIMPCA, 65
Subspace orthogonal projection identification method via the state estimation for model extraction, SOPIM-S, 70
Subspace orthogonal projection identification method, SOPIM, 65
Subspace predictive controller, SPC, 124
SVD, 48, 65
Theoretical economic index, 170
Time-delays, 12
Total least squares, 63
Tradeoff curve, 159
Transfer function model, 104
Transition tendency index (TTI), 167
Unconstrained DMC, 111
Unified approach to subspace algorithms, 48
Unitary interactor matrix, 151
Univariate performance assessment, 146
White noise, 32