Computational Optimal Control: Tools and Practice

Subchan Subchan, Cranfield University at Shrivenham, UK
Rafał Żbikowski, Cranfield University at Shrivenham, UK

A John Wiley and Sons, Ltd, Publication
This edition first published 2009. © 2009 John Wiley & Sons Ltd.

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom.

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data:
Subchan, S.
Computational optimal control : tools and practice / S. Subchan, R. Zbikowski.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-71440-9 (cloth)
1. Control theory. 2. Adaptive control systems. 3. Mathematical optimization. I. Zbikowski, R. (Rafal) II. Title.
TJ213.S789 2013
629.8'312–dc22
2009019927

A catalogue record for this book is available from the British Library.

ISBN 978-0-470-71440-9 (Hbk)

Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire.
Contents

Preface
Acknowledgements
Nomenclature
1 Introduction
  1.1 Historical Context of Computational Optimal Control
  1.2 Problem Formulation
  1.3 Outline of the Book
2 Optimal Control: Outline of the Theory and Computation
  2.1 Optimisation: From Finite to Infinite Dimension
    2.1.1 Finite Dimension: Single Variable
    2.1.2 Finite Dimension: Two or More Variables
    2.1.3 Infinite Dimension
  2.2 The Optimal Control Problem
  2.3 Variational Approach to Problem Solution
    2.3.1 Control Constraints
    2.3.2 Mixed State–Control Inequality Constraints
    2.3.3 State Inequality Constraints
  2.4 Nonlinear Programming Approach to Solution
  2.5 Numerical Solution of the Optimal Control Problem
    2.5.1 Direct Method Approach
    2.5.2 Indirect Method Approach
  2.6 Summary and Discussion
3 Minimum Altitude Formulation
  3.1 Minimum Altitude Problem
  3.2 Qualitative Analysis
    3.2.1 First Arc (Level Flight): Minimum Altitude Flight t0 ≤ t ≤ t1
    3.2.2 Second Arc: Climbing
    3.2.3 Third Arc: Diving (t3 ≤ t ≤ tf)
  3.3 Mathematical Analysis
    3.3.1 The Problem with the Thrust Constraint Only
    3.3.2 Optimal Control with Path Constraints
    3.3.3 First Arc: Minimum Altitude Flight
    3.3.4 Second Arc: Climbing
    3.3.5 Third Arc: Diving
  3.4 Indirect Method Solution
    3.4.1 Co-state Approximation
    3.4.2 Switching and Jump Conditions
    3.4.3 Numerical Solution
  3.5 Summary and Discussion
    3.5.1 Comments on Switching Structure
4 Minimum Time Formulation
  4.1 Minimum Time Problem
  4.2 Qualitative Analysis
    4.2.1 First Arc (Level Flight): Minimum Altitude Flight
    4.2.2 Second Arc (Climbing)
    4.2.3 Third Arc (Diving)
  4.3 Mathematical Analysis
    4.3.1 The Problem with the Thrust Constraint Only
    4.3.2 Optimal Control with Path Constraints
    4.3.3 First Arc: Minimum Altitude Flight
    4.3.4 Second Arc: Climbing
    4.3.5 Third Arc: Diving
  4.4 Indirect Method Solutions
  4.5 Summary and Discussion
    4.5.1 Comments on Switching Structure
5 Software Implementation
  5.1 DIRCOL Implementation
    5.1.1 User.f
    5.1.2 Input File DATLIM
    5.1.3 Input File DATDIM
    5.1.4 Grid Refinement and Maximum Dimensions in DIRCOL
  5.2 NUDOCCCS Implementation
    5.2.1 Main Program
    5.2.2 Subroutine MINFKT
    5.2.3 Subroutine INTEGRAL
    5.2.4 Subroutine DGLSYS
    5.2.5 Subroutine ANFANGSW
    5.2.6 Subroutine RANDBED
    5.2.7 Subroutine NEBENBED
    5.2.8 Subroutine CONBOXES
  5.3 GESOP (PROMIS/SOCS) Implementation
    5.3.1 Dynamic Equations, Subroutine fmrhs.f
    5.3.2 Boundary Conditions, Subroutine fmbcon.f
    5.3.3 Constraints, Subroutine fmpcon.f
    5.3.4 Objective Function, Subroutine fmpcst.f
  5.4 BNDSCO Implementation
    5.4.1 Possible Sources of Error
    5.4.2 BNDSCO Code
  5.5 User Experience
6 Conclusions and Recommendations
  6.1 Three-stage Manual Hybrid Approach
  6.2 Generating an Initial Guess: Homotopy
  6.3 Pure State Constraint and Multi-objective Formulation
  6.4 Final Remarks
Appendix: BNDSCO Benchmark Example
  A.1 Analytic Solution
    A.1.1 Unconstrained or Free Arc (l ≥ 1/4)
    A.1.2 Touch Point Case (1/6 ≤ l ≤ 1/4)
    A.1.3 Constrained Arc Case (0 < l ≤ 1/6)
Bibliography
Index
Preface

Computational Optimal Control

Finding the best possible (optimal) solution reliably and efficiently is a pervasive problem in science and engineering. In many such problems, especially in engineering, we can manipulate the optimised object/process through limited (constrained) influences (control) at our disposal which we can vary over time (dynamically). The dynamic control elicits dynamic responses from the optimised object/process, often in ways which are difficult to guess or intuit. A judicious choice of optimal control, obeying the constraints, must therefore be based on a systematic procedure. The well-documented theory of optimal control is exactly the mathematical tool necessary for that. Given a mathematical description of the optimised object/process, an optimisation criterion (performance index) and constraints, the theory of optimal control gives mathematical equations whose solution is the optimal control needed.

In most realistic (and thus practical) engineering problems, application of the theory leads to complex equations. The complexity of the optimal control equations is not a flaw: it properly reflects the mathematical details of the optimised problems, and it is the details that make the problem realistic and hence practical. But the complexity of the optimal control equations means that, in industrial practice, the theory must be accompanied by calculations with digital computers, especially for advanced problems in aerospace and aeronautics. This blend of mathematical theory and numerical techniques is the essence of computational optimal control. Both the theory and the numerical algorithms involved are rather non-trivial in nature, and their interaction adds another layer of complexity.

Book Focus and Prerequisites

This book focuses on informed use of computational optimal control rather than development of either theory or numerics. The aim of the book is to provide a hitherto unavailable computational optimal control self-study textbook for practising engineers, especially engineers working on challenging, real-world applications in the aerospace and aeronautical industries. Graduate and postgraduate students who want to specialise in advanced applications of optimal control should also find it of interest. The prerequisite knowledge is a general background in numerical analysis and ordinary differential equations plus familiarity with the FORTRAN computer language, usually acquired during graduate engineering studies. Some knowledge of optimal control would be helpful, but is not essential: the relevant theory can be picked up while studying this text.

Case Study

The main thrust of the book is to explain how to use computational optimal control tools in engineering practice, employing an advanced aeronautical case study to provide a realistic setting for both theory and computation. The case study is focused on missile guidance in the form of trajectory shaping of a generic cruise missile attacking a fixed target which must be struck from above. The problem is reinterpreted using optimal control theory, resulting in two formulations: (1) minimum time-integrated altitude and (2) minimum flight time. The resulting trajectory has a characteristic shape and hence the problem is also known as optimisation of the bunt manoeuvre. This eminently realistic and practical problem is quite hard, because realistic missile flight dynamics and practical control and flight path constraints are assumed. Due to its challenging nature, the bunt manoeuvre problem is an excellent illustration of advanced engineering practice, without the comforting simplifications found in many textbooks. More importantly, the problem strongly exercises both the theoretical and numerical aspects of computational optimal control, so it robustly tests the true value of the theoretical and software tools available to the practitioner. A detailed account of the actual performance of these tools is given in this book.

The bunt manoeuvre problem in its minimum time-integrated altitude and minimum flight time formulations is the only problem treated in this book. This allows a detailed, and often tutorial, presentation with insights into the structure and nature of the optimal solutions and also into the advantages and limitations of the available tools. Rather than moving from one simple example to another, and learning little of real-world computational optimal control, we prefer the reader to stay focused on one in-depth project. Introducing other challenging examples would, in our opinion, distract the reader from the main aim of this book: informed use of the tools of computational optimal control in advanced engineering practice.

Approach

Each of the formulations of the bunt manoeuvre problem is solved using a three-stage approach. In stage 1, the problem is discretised, effectively transforming it into a nonlinear programming problem, and hence making it suitable for solution with the public-domain FORTRAN packages DIRCOL and NUDOCCCS or the commercial FORTRAN packages PROMIS or SOCS. The results of this direct approach are used to discern the structure of the optimal solution, i.e. the type of active constraints, the time of their activation, and switching and jump points. The qualitative analysis of the solution structure, employing the results of stage 1 and optimal control theory, constitutes stage 2. Finally, in stage 3, the insights of stage 2 are made precise by rigorous mathematical formulation of the relevant two-point boundary value problems (TPBVPs), using appropriate theorems of optimal control theory. The TPBVPs obtained from this indirect approach are then solved using the public-domain FORTRAN package BNDSCO and the results compared with the appropriate solutions of stage 1. Additionally, a comparison is made with the results obtained by the commercial package GESOP (a software environment with PROMIS and SOCS solvers) whose solution approach can be considered as half-way between the approaches of stages 1 and 3.

For each formulation (minimum time-integrated altitude and minimum time) the influence of boundary conditions on the structure of the optimal solution and the performance index is investigated. Software implementation employing the public-domain packages DIRCOL, NUDOCCCS and BNDSCO, and also the commercial packages SOCS and PROMIS under the GESOP environment, which produced the results, is described and documented.
Book Features

As explained earlier in this preface, this book focuses on informed use of computational optimal control for solving the terminal bunt manoeuvre, rather than development of either the underlying theory or the relevant numerics. In this context, the main features of this book are as follows:

• formulating trajectory shaping missile guidance as an optimal control problem for the case of the terminal bunt manoeuvre;
• devising two formulations of the problem:
  – minimum time-integrated altitude
  – minimum flight time;
• proposing a three-stage hybrid approach to solve each of the problem formulations:
  – stage 1: solution structure exploration using a direct method
  – stage 2: qualitative analysis of the solution obtained in stage 1, using optimal control theory
  – stage 3: mathematical formulation of the TPBVP based on the qualitative analysis of stage 2;
• solving each of the problem formulations using the three-stage hybrid approach:
  – stage 1: by using the DIRCOL/NUDOCCCS solvers
  – stage 2: by using the results of stage 1, understanding the underlying flight dynamics and employing optimal control theory
  – stage 3: by using optimal control theory and the BNDSCO solver;
• analysing the influence of boundary conditions on the structure of the optimal control solution of each problem formulation and the resulting values of the performance index;
• interpreting the results from the operational and computational perspectives, pointing out the trade-offs between the two;
• using DIRCOL, NUDOCCCS, PROMIS and SOCS (under the GESOP environment) and BNDSCO effectively, and documenting their use.

Book Organisation

We have striven to write the book so that each chapter is as independent of the others as possible. Alas, we have not been able to attain the ideal of self-contained chapters, but we hope that a certain degree of independence has been achieved.

We begin with the introductory Chapter 1, which is deliberately brief: it gives a very concise historical context of the subject of computational optimal control, then defines the case study investigated in the rest of the book and concludes with a detailed summary of the book, chapter by chapter. Section 1.2 of Chapter 1 is an essential reference, as it describes the mathematical details of the case study problem. A reader who is pressed for time may want to glance at that section and move on.

Chapter 2 is a friendly (we hope) presentation of all the theory we use in the book. We have made an attempt to produce a reasonably readable narrative, as opposed to a dry recitation of formulae and theorems. It is not strictly necessary to read the whole of Chapter 2 in order to follow later analyses, especially if the reader is familiar with the basics of the theory. However, it might be useful at least to glance through the material: our hard-won experience shows that many "obvious" facts are far from clear, even for those who have already encountered problems of computational optimal control.
Chapters 3 and 4 are the core of the book in that they consider each variant of the case study analysed in this book. Each of these chapters is independent of the other, but both should be studied carefully if the reader is to learn anything real from this book. An impatient user may simply start reading the book from Chapter 3 or 4 and consult earlier chapters if necessary. However, these earlier chapters are not in this book as a result of a contractual obligation: we wrote them because we wished someone had written them when we embarked on the analysis of the terminal bunt problem. Perhaps reading those chapters will save the reader some of the frustration we experienced in our encounter with computational optimal control.

Chapter 5 is the part of our book which we strongly draw to the reader's attention, because it describes in practical detail, including code listings (available electronically from http://www.wiley.com/go/zbikowski), real (and tested) software implementations of the analyses presented in Chapter 4. We have striven not only to explain how to use various software packages, but also to share our (often hard-won) user experience. A separate tutorial on the most challenging of the software packages, BNDSCO, can be found in the Appendix.

Finally, we offer in Chapter 6 pragmatic conclusions based on our user experience with the tools and practice of computational optimal control. Moreover, we suggest a few ways of going beyond the approaches described in the book. If the reader is looking for "the bottom line", this is the place where we give it.

The book arose from a real-world, three-year project given to the authors by the UK Ministry of Defence, which was quite challenging and led, among others, to the first author's PhD thesis. Also, much of the material appearing in Chapter 3 appeared first in Subchan and Żbikowski (2007a), while the bulk of Chapter 4 comes from Subchan and Żbikowski (2007b).
Bibliographic Comments

There are three categories of optimal control books currently available:

1. Introductory textbooks, focused mostly on theory:
   (a) Entry-level texts:
       i. L. M. Hocking, Optimal Control: An Introduction to the Theory with Applications, Oxford University Press, 1991.
       ii. E. R. Pinch, Optimal Control and the Calculus of Variations, Oxford University Press, 1995.
       iii. A. E. Bryson, Dynamic Optimization, Addison-Wesley, 1999.
       iv. D. S. Naidu, Optimal Control Systems, CRC Press, 2002.
       v. D. G. Hull, Optimal Control Theory for Applications, Springer, 2003.
   (b) More advanced texts, but dealing with linear problems only:
       i. F. L. Lewis and V. L. Syrmos, Optimal Control, John Wiley & Sons Ltd, 1995.
       ii. A. E. Bryson, Applied Linear Optimal Control: Examples and Algorithms, Cambridge University Press, 2002.
       iii. B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods, Dover, 2007.

2. Advanced theoretical monographs:
   (a) L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, John Wiley & Sons, Inc., 1962.
   (b) A. E. Bryson and Y. C. Ho, Applied Optimal Control: Optimization, Estimation, and Control, Revised Printing, Hemisphere, 1975.
   (c) G. Leitmann, The Calculus of Variations and Optimal Control, Plenum Press, 1981.
   (d) J. Macki and A. Strauss, Introduction to Optimal Control Theory, Springer, 1982.
   (e) V. M. Alekseev, V. M. Tikhomirov and S. V. Fomin, Optimal Control, Plenum Press, 1987.
   (f) T. L. Vincent and W. J. Grantham, Nonlinear and Optimal Control Systems, John Wiley & Sons Ltd, 1997.
   (g) R. Vinter, Optimal Control, Birkhäuser, 2000.
   (h) M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, Dover, 2006.

3. More practically orientated monographs, including computational methods:
   (a) R. Bulirsch (ed.), Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods, Birkhäuser, 1993.
   (b) R. Bulirsch and D. Kraft (eds), Computational Optimal Control, Birkhäuser, 1994.
   (c) J. T. Betts, Practical Methods for Optimal Control Using Nonlinear Programming, SIAM, 2001.

Among the above books, the Bryson and Ho monograph, see 2(b), must be singled out as highly respected and widely used by practitioners; we benefited from it enormously. It comprehensively presents the theory in a well-organised, clear, readable and systematic way without distracting the reader by lengthy mathematical derivations. It is a ready theoretical reference for any serious user of optimal control. However, our book offers more up-to-date and expansive coverage of numerical methods than is included in Chapter 7 of Bryson and Ho. Titles 3(a) and 3(b), as edited compilations, serve as research references and are not intended as textbooks. The book by Betts, see 3(c) above, is the only self-contained book on computational optimal control and was written by a practitioner from Boeing. However, it is dedicated to one approach to computational optimal control (the direct method) and is focused on one commercial FORTRAN package, namely SOCS. Finally, it mainly gives an in-depth analysis of the numerical aspects of the direct method, with one chapter briefly describing six examples of varying difficulty. By contrast, our book can be considered a self-study guide to the real engineering practice of computational optimal control.

Shrivenham, England
2009
Acknowledgements

The investigation which was the subject of the case study considered in this book was initiated by the Director of Weapons Systems Sector, WS2, Defence Evaluation & Research Agency (DERA), Farnborough, England, and was carried out under the terms of Contract No. WSS/R4519. The study was sponsored by the UK Ministry of Defence and happened because of an exceptionally kind and dedicated professional, David East (formerly of DERA), who was instrumental in setting up the project; his gentlemanly support is gratefully acknowledged. John Cleminson (formerly of DERA and QinetiQ) kindly suggested the terminal bunt problem, provided the defining data and shared his solution. His eager support and personal modesty are much appreciated.

The GESOP (and DIDO) software used here was purchased with the funds of the Data and Information Fusion Defence Technology Centre (DIF DTC), a joint initiative between the UK Ministry of Defence and the UK defence industry. We especially thank Andy Tilbrook from General Dynamics UK for his support in obtaining funds for the GESOP (and DIDO) purchase.
Nomenclature

α   angle of attack
λ   co-state vector
ψ   boundary condition
C   mixed constraint
S   pure state constraint
u, u(t)   control vector
u*, u*(t)   optimal control vector
x, x(t)   state vector
x*, x*(t)   optimal state vector
γ   flight path angle
γ(0), γ0   flight path angle initial condition
γ(tf), γtf   flight path angle terminal condition
ρ   air density
Cd   coefficient of axial aerodynamic force
Cl   coefficient of normal aerodynamic force
D   axial aerodynamic force
g   gravitational constant
H   Hamiltonian
h   altitude
h(0), h0   altitude initial condition
h(tf), htf   altitude terminal condition
hmin   minimum altitude
J   performance criterion
L   normal aerodynamic force
Lmax   normalised maximum normal aerodynamic force
Lmin   normalised minimum normal aerodynamic force
m   mass
Sref   reference area of the missile
T   thrust
t0   initial time
tf   final/terminal time
Tmax   maximum thrust
Tmin   minimum thrust
V   speed
V(0), V0   speed initial condition
V(tf), Vtf   speed terminal condition
Vmax   maximum speed
Vmin   minimum speed
x   horizontal position
x(0), x0   horizontal position initial condition
x(tf), xtf   horizontal position terminal condition

BNDSCO   a software package for the numerical solution of optimal control problems using an indirect method
CAMTOS   Collocation and Multiple Shooting Trajectory Optimization Software
DIRCOL   Direct Collocation Method
GESOP   Graphical Environment for Simulation and Optimization
GNC   Guidance, Navigation and Control
GUI   Graphical User Interface
KKT   Karush–Kuhn–Tucker
MPBVP   Multi-Point Boundary Value Problem
NLP   Nonlinear Programming
NUDOCCCS   Numerical Discretization Method for Optimal Control Problems with Constraints in Control and States
PROMIS   Parameterized Trajectory Optimization by Direct Multiple Shooting
SOCS   Sparse Optimal Control Software
SQP   Sequential Quadratic Programming
TPBVP   Two-Point Boundary Value Problem
TROPIC   Trajectory Optimization by Direct Collocation
1 Introduction

The main focus of this short chapter is to define the case study investigated in this book (Section 1.2) and also to summarise the structure of the rest of the book (Section 1.3). These two sections are preceded by a brief account of the historical context of computational optimal control in Section 1.1.

Cruise missiles are guided weapons designed for atmospheric flight whose primary mission is precision strike of fixed targets. This can be achieved only by a judicious approach to guidance, navigation and control (GNC). Navigation is the process of establishing the missile's location. Based on the location, guidance produces the trajectory that the missile should follow. Finally, control entails the use of actuators, so that the missile follows the desired trajectory.

The computational optimal control case study considered here arises from an approach to cruise missile guidance known as trajectory shaping. The essence of the approach is to compute an optimal trajectory together with the associated control demand. In other words, for given launch and strike conditions, find a missile trajectory which:

• hits the target in a pre-defined way;
• shapes the missile's flight in an optimal fashion;
• defines the control demand for optimal flight.

This setting leads naturally to expressing the guidance problem as an optimal control problem which cannot be solved analytically. Hence the solution approach for the trajectory shaping involves computational optimal control. This is a set of techniques which combines the theory of infinite-dimensional optimisation with numerical methods of finite-dimensional optimisation and boundary value problem solvers. Both the optimal control theory and the numerical algorithms involved are rather non-trivial in nature, and their interaction adds another layer of complexity.

This book focuses on informed use of computational optimal control rather than development of either theory or numerics. The theoretical and computational tools are used to elucidate the features of the special case of cruise missile trajectory shaping, the terminal bunt manoeuvre, defined in detail in Section 1.2 below. The tools are both powerful and complex. Their power gives insights into optimisation of the manoeuvre: operationally valuable knowledge. Their complexity not only challenges the analyst, but uncovers the limitations of the approach and, crucially,
elucidates the trade-offs between the operationally desirable and the computationally tractable. From the practical point of view, the aims of this case study were as follows:

• to formulate several operationally useful variants of trajectory shaping of the terminal bunt manoeuvre;
• to analyse the formulations from the point of view of the solution structure, e.g. type of control demand, number and duration of active constraints, etc.;
• to compute the actual solutions;
• to assess the results from the point of view of the analyst, i.e. insights offered and difficulties encountered.
1.1 Historical Context of Computational Optimal Control

The history of optimal control problems cannot be separated from the history of the calculus of variations. The history of optimal control reaches back to the famous brachistochrone problem, which was posed by the Swiss mathematician Johann Bernoulli in the seventeenth century and may be formulated as an optimal control problem, see Bryson (1996), Sussmann and Willems (1997), Pesch and Bulirsch (1994) and Sargent (2000). The problem is to find the quickest descent path between two points with different horizontal and vertical positions. In other words, the problem can be stated as an optimisation problem: find a function $y = y(x)$ that minimises the objective function

$$J = \int_{x_0}^{x_f} \sqrt{1 + (dy/dx)^2}\, dx \quad (1.1)$$

subject to boundary conditions. The above calculus of variations problem can be converted into an optimal control problem by introducing $u = dy/dx$ as a control. Then, the problem is to find the control $u$ that minimises the performance criterion

$$J = \int_{x_0}^{x_f} \sqrt{1 + u^2}\, dx \quad (1.2)$$

subject to $dy/dx = u$ and the boundary conditions.

Following their brilliant solution of the brachistochrone problem, Euler and Lagrange found a necessary condition for the extremum of a functional, which later became known as the Euler–Lagrange equation. The theory of extrema of functionals became more sophisticated after Legendre, Clebsch and Jacobi found further necessary conditions (the three necessary conditions of Euler/Lagrange, Legendre/Clebsch and Jacobi were later proved to be sufficient for a weak local minimum), and, finally, Weierstrass and Carathéodory found, after Hilbert's contribution, sufficient conditions for a strong local minimum.

A century later, in 1919, Goddard considered the calculus of variations as an important tool to analyse the performance of a rocket trajectory, see Darling (2002). Subsequently, variational formulations of flight paths were developed by Garfinkel (1951), Garfinkel (1963), Breakwell et al. (1963), Breakwell and Dixon (1975) and Lawden (1963) in formulations of the Bolza, Mayer and Lagrange type. The general theory of optimal control was developed by Breakwell (1959), Hestenes (1966) and Pontryagin et al. (1962). The breakthrough, and consequently the birth of a new field in mathematics, optimal control, came with the proof of the minimum principle by Pontryagin, Boltyanskii and Gamkrelidze. (Pontryagin's minimum principle is also known as Pontryagin's maximum principle, because Pontryagin's original work focused on maximising a benefit functional rather than minimising a cost functional. In this book we focus on the latter, so we refer to Pontryagin's result as the minimum principle.) Incidentally, their new necessary condition was first formulated by Hestenes; his proof, however, was still in the context of the calculus of variations and thus not as general as the one by Pontryagin's group. This is because some optimal control problems may be transformed into problems of the calculus of variations, but even "simple" ones, e.g. those with controls appearing linearly, cannot be. Some classical books on optimal control are Bellman (1957), Bryson and Ho (1975), Berkovitz (1974) and Gamkrelidze (1978).

Most early methods were based on finding an analytic solution that satisfied the minimum principle, or related conditions, rather than attempting a direct minimisation of the performance criterion of optimal control problems. However, the pressing aerospace problems which arose from the 1950s onwards did not have analytical solutions. Thus, while the theoretical necessary and sufficient conditions for optimal control were available, effective computation of solutions was still a challenge, compounded by the presence of constraints in real-life problems. The development of digital computers and reliable numerical methods transformed the situation and ushered in the era of computational optimal control: a combination of optimal control theory and the relevant numerics.

Initially, the focus was on approximating the underlying infinite-dimensional problem with a discretised, finite-dimensional version, thus obtaining a nonlinear programming formulation. This approach was given a strong theoretical impetus by the seminal results of Karush, Kuhn and Tucker, see Bazaraa et al. (1993), on optimality conditions for finite-dimensional constrained optimisation. Subsequently, several numerical methods were developed, among which the most important is sequential quadratic programming, or SQP. This method was developed further by many researchers, namely Rosen et al. (1970), Gill et al. (1981) and Gill et al. (1993). Following the rapid development of the SQP methods, it became feasible to obtain numerical solutions of the optimal control problem by transforming the original problem into a nonlinear programming problem. This is done by discretising the state and/or control variables; the approach is known as the direct method.

Bulirsch et al. achieved a major breakthrough, see Stoer and Bulirsch (2002), when they developed their multiple shooting software BOUNDSOL, which was applied successfully to solve several two-point boundary value problems, see Keller (1992) and Osborne (1969). This enabled an alternative approach to computational optimal control: the indirect method. The essence of the method is first to use optimal control theory to derive the necessary conditions for optimality and then to solve the resulting two-point boundary value problem. Subsequently, BOUNDSOL was developed further by the introduction of a modified Newton method by Deuflhard et al. (1976) and Deuflhard and Bader (1983), and by the generalisation of the multiple shooting method to multi-point boundary value problems by Oberle, see the references cited in Oberle and Grimm (1989), which improved convergence of the underlying multiple shooting method.
The resulting BNDSCO software became a package of choice and has been successfully used to solve several optimal control problems via the indirect method.
In this book both approaches of computational optimal control, the direct and indirect methods, are employed not only for comparison of their pros and cons, but also due to the complementary insights into the solution they offer.
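To make the direct-method idea concrete before the case study begins, the following minimal sketch (ours, not code from the book, whose solvers are FORTRAN packages) transcribes the toy problem (1.1)–(1.2) of Section 1.1 into a finite-dimensional nonlinear programme and hands it to an off-the-shelf SQP-type solver, SciPy's SLSQP. For the simplified functional (1.2) the optimum is the straight line joining the endpoints, which gives an easy correctness check; the grid size N and the endpoint values are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Boundary data and grid (arbitrary illustrative choices)
y0, yf, N = 0.0, 1.0, 21
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]

def J(y_interior):
    """Discretised performance criterion (1.2): sum of sqrt(1 + u_k^2) * dx,
    with the control u_k = dy/dx approximated by finite differences."""
    y = np.concatenate(([y0], y_interior, [yf]))  # boundary conditions enforced
    u = np.diff(y) / dx                           # u = dy/dx on each interval
    return np.sum(np.sqrt(1.0 + u**2)) * dx

guess = np.linspace(y0, yf, N)[1:-1] + 0.1        # perturbed initial guess
sol = minimize(J, guess, method="SLSQP")          # SQP-type NLP solver
print(sol.success, J(sol.x))                      # J -> sqrt(2) ~ 1.41421
```

This is, in miniature, stage 1 of the three-stage approach described in the Preface: discretise, solve the resulting nonlinear programme, and inspect the structure of the solution.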
1.2 Problem Formulation

[Figure 1.1: Definition of missile axes and angles, showing the thrust T, normal aerodynamic force L, angle of attack α, speed V, flight path angle γ, horizontal position x, axial aerodynamic force D and weight mg. Note that L is the normal aerodynamic force and D is the axial aerodynamic force with respect to a body-axis frame, not lift and drag.]
Trajectory shaping of a missile is an advanced approach to missile guidance which aims at computing the whole trajectory in an optimal way. The approach underpins the case study considered in this book, focusing on the example of the bunt shaping problem for a cruise missile. The mission is to hit a fixed target while minimising the missile's exposure to anti-air defences or minimising the flight time. The problem is to find the trajectory of a generic cruise missile from the assigned initial state to a final state either: (1) with the minimum altitude along the trajectory; or (2) in minimum time. The first objective can be formulated by introducing the performance criterion

$$J = \int_{t_0}^{t_f} h\, dt, \quad (1.3)$$

while the second objective is

$$J = \int_{t_0}^{t_f} dt. \quad (1.4)$$
The performance criterion is subject to the equations of motion, which may be written as

$$\dot{\gamma} = \frac{T - D}{mV}\sin\alpha + \frac{L}{mV}\cos\alpha - \frac{g\cos\gamma}{V} \quad (1.5a)$$
$$\dot{V} = \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha - g\sin\gamma \quad (1.5b)$$
$$\dot{x} = V\cos\gamma \quad (1.5c)$$
$$\dot{h} = V\sin\gamma \quad (1.5d)$$

where $t$ is the actual time, $t_0 \le t \le t_f$, with $t_0$ as the initial time and $t_f$ as the final time. The state variables are the flight path angle $\gamma$, speed $V$, horizontal position $x$ and altitude $h$ of the missile. The thrust magnitude $T$ and the angle of attack $\alpha$ are the two control variables (see Figure 1.1). The aerodynamic forces $D$ and $L$ are functions of the altitude $h$, velocity $V$ and angle of attack $\alpha$. The following relationships have been assumed by Subchan et al. (2003):

Axial aerodynamic force. This force is written in the form
$$D(h, V, \alpha) = \tfrac{1}{2} C_d \rho V^2 S_{\mathrm{ref}} \quad (1.6)$$
$$C_d = A_1 \alpha^2 + A_2 \alpha + A_3. \quad (1.7)$$
Note that $D$ is not the drag force.

Normal aerodynamic force. This force is written in the form
$$L(h, V, \alpha) = \tfrac{1}{2} C_l \rho V^2 S_{\mathrm{ref}} \quad (1.8)$$
$$C_l = B_1 \alpha + B_2 \quad (1.9)$$
where $\rho$ is the air density given by
$$\rho = C_1 h^2 + C_2 h + C_3 \quad (1.10)$$
and $S_{\mathrm{ref}}$ is the reference area of the missile; $m$ denotes the mass and $g$ the gravitational constant, see also Table 1.1. Note that $L$ is not the lift force. (Since the cruise missile is supposed to be surface launched only, the altitude changes are expected to be small and hence a polynomial approximation of the exponential density model is used.)

Boundary conditions. The initial and final conditions for the four state variables are specified as follows:
$$\gamma(0) = \gamma_0, \quad \gamma(t_f) = \gamma_{t_f} \quad (1.11a)$$
$$V(0) = V_0, \quad V(t_f) = V_{t_f} \quad (1.11b)$$
$$x(0) = x_0, \quad x(t_f) = x_{t_f} \quad (1.11c)$$
$$h(0) = h_0, \quad h(t_f) = h_{t_f}. \quad (1.11d)$$
Table 1.1 Physical modeling parameters.

Quantity   Value          Unit
m          1005           kg
g          9.81           m/s^2
Sref       0.3376         m^2
A1         -1.9431        -
A2         -0.1499        -
A3         0.2359         -
B1         21.9           -
B2         0              -
C1         3.312e-9       kg/m^5
C2         -1.142e-4      kg/m^4
C3         1.224          kg/m^3
In addition, constraints are defined as follows:

• State path constraints
$$V_{\min} \le V \le V_{\max} \quad (1.12)$$
$$h_{\min} \le h. \quad (1.13)$$
Note that the altitude constraint (1.13) does not apply near the terminal condition.

• Control path constraint
$$T_{\min} \le T \le T_{\max}. \quad (1.14)$$

• Mixed state and control constraint (see equations (1.8)–(1.10))
$$L_{\min} \le \frac{L}{mg} \le L_{\max} \quad (1.15)$$
where $L_{\min}$ and $L_{\max}$ are normalised, see Table 1.2.

Note that (1.12)–(1.13) are pure state constraints, so we may expect the problem to be challenging, see e.g. Maurer and Gillessen (1975), Berkmann and Pesch (1995) and Steindl and Troger (2003).

Table 1.2 Boundary conditions and constraints.

Quantity    Value   Unit
Vmin(tf)    250     m/s
Vmin        200     m/s
Vmax        310     m/s
Tmin        1000    N
Tmax        6000    N
hmin        30      m
Lmin        -4      g
Lmax        4       g
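For readers who want to experiment with the model before the software chapters, the sketch below (ours; the book's implementations are in FORTRAN, see Chapter 5) encodes the right-hand side of (1.5a)–(1.5d) with the force and density fits (1.6)–(1.10) and the parameters of Table 1.1. The helper name `load_factor`, the test state and the controls are our illustrative choices, and the angle of attack is assumed to be in radians.

```python
import numpy as np

# Parameters of Table 1.1
m, g, Sref = 1005.0, 9.81, 0.3376
A1, A2, A3 = -1.9431, -0.1499, 0.2359
B1, B2 = 21.9, 0.0
C1, C2, C3 = 3.312e-9, -1.142e-4, 1.224

def rhs(state, T, alpha):
    """Right-hand side of (1.5): state = (gamma, V, x, h), controls (T, alpha)."""
    gamma, V, x, h = state
    rho = C1*h**2 + C2*h + C3                    # (1.10) polynomial air density
    q = 0.5 * rho * V**2 * Sref
    D = (A1*alpha**2 + A2*alpha + A3) * q        # (1.6)-(1.7) axial force
    L = (B1*alpha + B2) * q                      # (1.8)-(1.9) normal force
    gamma_dot = ((T - D)*np.sin(alpha) + L*np.cos(alpha))/(m*V) - g*np.cos(gamma)/V
    V_dot = ((T - D)*np.cos(alpha) - L*np.sin(alpha))/m - g*np.sin(gamma)
    return np.array([gamma_dot, V_dot, V*np.cos(gamma), V*np.sin(gamma)])

def load_factor(state, alpha):
    """Normalised normal force L/(m g) for the mixed constraint (1.15)."""
    _, V, _, h = state
    rho = C1*h**2 + C2*h + C3
    return (B1*alpha + B2) * 0.5 * rho * V**2 * Sref / (m * g)

state = np.array([0.0, 272.0, 0.0, 30.0])        # level flight at h_min (arbitrary)
print(rhs(state, T=6000.0, alpha=0.03))          # near-zero gamma_dot: near trim
print(load_factor(state, 0.03))                  # ~1 g, within [Lmin, Lmax]
```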
1.3 Outline of the Book

The remaining chapters of the book are organised as follows.

Chapter 2 presents an overview of optimal control theory and reviews some numerical methods for solving nonlinear optimal control problems. Section 2.1 is a tutorial guide to the ideas underlying the mathematical theory of optimisation, preparing the ground for the presentation of nonlinear optimal control in the next two sections.
Section 2.2 focuses on the formulation of the nonlinear optimal control problem. In Section 2.3 some details of the variational approach are given; this section discusses the importance of constraints, in particular control constraints, mixed inequality constraints and pure state inequality constraints. In Section 2.4 a nonlinear optimisation approach based on the Karush–Kuhn–Tucker theorem is considered. In Section 2.5 the numerical solution of the optimal control problem is presented, with the direct method covered in Section 2.5.1 and the indirect method in Section 2.5.2. In Section 2.6 a summary and discussion are given.

Chapter 3 describes a detailed analysis of optimal trajectories of a generic cruise missile attacking a fixed target while minimising the missile's exposure to anti-air defences. This is the first variant of the case study analysed in this book. In Section 3.1 the minimum altitude problem formulation is defined. In Section 3.2 the computational results of the direct method are used for a qualitative analysis of the main features of the optimal trajectories and their dependence on several constraints. In Section 3.3 the mathematical analysis based on the qualitative analysis is presented; this section begins by discussing a constraint on the thrust, followed by path and mixed constraints. Section 3.4 presents the indirect method approach; the co-state approximation issue is addressed, and the section ends with some numerical solutions using the multiple shooting method package BNDSCO. Finally, a summary and discussion are given in Section 3.5.

Chapter 4 focuses on the optimal trajectories of a generic cruise missile attacking a fixed target in minimum time. This is the second variant of the case study analysed in this book. In Section 4.1 the problem formulation is defined for the time-optimal control of the terminal bunt manoeuvre. Section 4.2 presents some computational results using the direct method, followed by a qualitative analysis to reveal the structure of the solution. Section 4.3 contains the mathematical analysis of the time-optimal control problem based on the qualitative analysis. In Section 4.4 numerical solutions using the multiple shooting package BNDSCO are obtained and compared with the results of DIRCOL, PROMIS, SOCS and NUDOCCCS. Finally, Section 4.5 gives a summary and discussion.

Chapter 5 deals with software implementation for the minimum time formulation of the terminal bunt problem using different packages. In Section 5.1 the DIRCOL implementation is given for the case of the minimum time problem. In Section 5.2 the minimum time problem is solved using NUDOCCCS. In Section 5.3 the PROMIS and SOCS implementations (under the GESOP environment) are presented for solving the problem. In Section 5.4 the multiple shooting package BNDSCO implementation of the minimum time problem is shown (see also the Appendix). Finally, Section 5.5 summarises user experience for all software packages.

Chapter 6 presents the conclusions of the book and practical recommendations.

The Appendix describes a detailed analysis of the BNDSCO implementation of a relatively simple benchmark example, different from the bunt manoeuvre problem. This is done to further elucidate the implementation idiosyncrasies of the BNDSCO software.
2 Optimal Control: Outline of the Theory and Computation

The main purpose of this chapter is to provide an overview of the aspects of computational optimal control necessary to analyse and solve the terminal bunt manoeuvre. Optimal control theory has been presented in many books, see e.g. Pontryagin et al. (1962), Macki and Strauss (1982), Athans and Falb (1966), Kirk (1970), Leitmann (1962), Leitmann (1981), Bryson and Ho (1975), Lewis and Syrmos (1995), Vinh (1981), Betts (2001) and Naidu (2002), and survey papers, see e.g. Hartl et al. (1995), Pesch (1989a), Pesch (1989b), Pesch (1991) and Pesch (1994). These sources have treated the optimal control problem in depth (including mathematical proofs), but our purpose here is to present only the relevant aspects of computational optimal control in a user-friendly and self-contained way.

This chapter is organised as follows. Section 2.1 is a tutorial guide to the ideas underlying the mathematical theory of optimisation, preparing the ground for the presentation of nonlinear optimal control in the next two sections. Section 2.2 concerns formulation of the general nonlinear optimal control problem. Some details of the variational approach derivation are given in Section 2.3, where the important issue of constraints is also discussed: control constraints in Section 2.3.1, mixed inequality constraints in Section 2.3.2 and pure state inequality constraints in Section 2.3.3. A nonlinear optimisation approach based on the Karush–Kuhn–Tucker theorem is considered in Section 2.4. The numerical solution of the optimal control problem is presented in Section 2.5: the direct method approach is considered in Section 2.5.1, followed by the indirect approach in Section 2.5.2. Finally, Section 2.6 presents a summary and discussion.
2.1 Optimisation: From Finite to Infinite Dimension

This section provides some background information which may help readers appreciate the theory of optimal control. Since optimal control theory is quite complex, a gradual approach is adopted in presenting the optimisation theory: the mathematical exposition progresses from the finite to the infinite dimension. The guiding principle is to provide the theory for the user. Therefore, a conversational tone has been adopted, with intuitive arguments as the main approach, illustrated with indicative pictures and actual simulations. While no proofs are offered (they are readily available in the literature cited), the informal exposition is supported by precise definitions and statements.

The structure of this section is as follows. Section 2.1.1 begins the exploration of finite-dimensional optimisation by considering functions of only one variable. Thus, the existence and quality of solutions are discussed, followed by an outline of the theory of differentiable optimisation and its limitations. Finally, these one-variable results are summarised in preparation for moving beyond the scalar case.

Section 2.1.2 extends the previous discussion to two or more variables, emphasising new phenomena (absent in the scalar case) and the role of constraints. The focus is on n = 2, as geometric illustrations are then possible, while the analytical apparatus readily extends to n > 2. Thus, we first consider differentiable optimisation, which is very useful in the interior of the domain. The next two sections focus on the features of the optimal solution on the boundary of the domain, where the differentiable methods are not applicable. Then we identify possible behaviour of the solution and give its general characterisation. This sets the scene for the description of a practical approach to finding a solution through the powerful method of Lagrange multipliers. Finally, the discussion is completed by drawing together the findings of the previous three sections to treat inequality constraints. The discussion of finite-dimensional optimisation is then closed, emphasising the key points of the optimisation process relevant to the infinite dimension.

Section 2.1.3 is devoted to infinite-dimensional optimisation, where appropriate analogues of the fundamental notions introduced for the finite dimension are discussed. This allows us to proceed to an outline of the essentials of optimal control theory in Section 2.2.
2.1.1 Finite Dimension: Single Variable

This section begins the outline of basic optimisation theory by focusing on the case of a single real variable.

Existence and Quality of Solutions

The basic object on which optimisation is performed is a function. Given two sets, a domain $X$ and a co-domain $Y$, a function $f : X \to Y$ assigns to every point from $X$ a unique point from $Y$, thus mapping $X$ into $Y$. If the co-domain is a subset of real numbers, $Y \subset \mathbb{R}$, then the extremum is defined as follows.

Definition 2.1.1 (Extremum of real-valued mapping) Let $f : X \to Y \subset \mathbb{R}$ be given together with the distance (metric) $d : X \times X \to \mathbb{R}$ for any pair of points from $X$. A point $x^* \in X$ is a local extremum of $f$ if, and only if, there exists a neighbourhood
$$N_\delta(x^*) \triangleq \{x \in X \mid d(x, x^*) < \delta\}, \quad (2.1)$$
with $\delta > 0$, for which either
$$\forall x \in N_\delta(x^*) \quad f(x^*) \le f(x) \quad \text{(minimum)} \quad (2.2)$$
or
$$\forall x \in N_\delta(x^*) \quad f(x^*) \ge f(x) \quad \text{(maximum)} \quad (2.3)$$
holds. If any of the conditions (2.2) or (2.3) is true in the whole of $X$, then $x^*$ is a global extremum.

Deciding on the content of $X$ (and $Y$) has profound consequences for the existence and quality of the solution of the optimisation problem. For example, the function $f : x \mapsto (x - \sqrt{2})^2$ has a parabola as its graph, and $x_{\min}^{\mathbb{R}} = \sqrt{2}$ is a (global) minimum for $X = \mathbb{R}$, the set of real numbers. If, however, $X$ is the set of rationals, $X = \mathbb{Q}$, then $f$ has no minimum (there is no solution), because $\sqrt{2} \notin \mathbb{Q}$, while it can be approached with arbitrary accuracy in $\mathbb{Q}$. On the other hand, if $X$ is the set of integers, $X = \mathbb{Z}$, then $f$ again has a global minimum, $x_{\min}^{\mathbb{Z}} = 1$. Note that $f(x_{\min}^{\mathbb{R}}) = 0$, while $f(x_{\min}^{\mathbb{Z}}) = (1 - \sqrt{2})^2 > 0$. Thus, although for $X = \mathbb{Z}$ the problem has a (global) solution, the optimal value of the performance index $f$ is worse (has poorer quality) than for $X = \mathbb{R}$, as $f(x_{\min}^{\mathbb{Z}}) > f(x_{\min}^{\mathbb{R}})$.

Continuing this example, let us consider the co-domain $Y$. Since $\sqrt{2}$ is involved in the computation of the values of $f$, $\sqrt{2}$ must be a point in $Y$. Hence the co-domain must be a subset of $\mathbb{R}$ in order for the problem to make sense. Suppose that $Y = \mathbb{R}$ is chosen with $X = \mathbb{R}$ and recall that $f$ maps the whole of $X$ into a subset of $Y$. This is made precise by defining the image of $X$ under $f$.

Definition 2.1.2 (Image) Let $f : X \to Y \subset \mathbb{R}$ be given. The image of the domain $X$ under $f$ is the set
$$f(X) = \{y \in Y \mid \exists x \in X \text{ such that } f(x) = y\}. \quad (2.4)$$

In other words, the image of the domain $X$ under $f$ is the set of those points of the co-domain $Y$ which can be generated by applying $f$ to all points of $X$. In the case of $X = \mathbb{R}$, $Y = \mathbb{R}$ and $f : x \mapsto (x - \sqrt{2})^2$, Definition 2.1.2 yields $f(X) = [0, +\infty) \subset (-\infty, +\infty) = \mathbb{R}$. Thus, in the example, the image is a proper subset of $Y$, $f(X) \subset Y$. Similarly, the inverse image can be defined.

Definition 2.1.3 (Inverse image) Let $f : X \to Y \subset \mathbb{R}$ be given together with a non-empty subset $B$ of $Y$, $\emptyset \ne B \subset Y$. The inverse image of $B$ under $f$ is the set
$$f^{-1}(B) = \{x \in X \mid f(x) \in B\}. \quad (2.5)$$

In other words, the inverse image of a subset $B$ of the co-domain $Y$ is the set of those points of the domain $X$ from which $B$ can be generated by $f$. For $X = \mathbb{R}$, $B = [\frac{1}{2}, 1]$ and $f : x \mapsto (x - \sqrt{2})^2$ it follows from Definition 2.1.3 that $f^{-1}([\frac{1}{2}, 1]) = [\sqrt{2} - 1, \frac{1}{2}\sqrt{2}] \cup [\frac{3}{2}\sqrt{2}, \sqrt{2} + 1]$, so it is a disconnected set. On the other hand, $f^{-1}([-1, -\frac{1}{2}]) = \emptyset$, which says that the co-domain $Y = \mathbb{R}$ is too large: there are no points in $X = \mathbb{R}$ that can be mapped by $f$ to $(-\infty, 0)$. The notions of image and inverse image of $f$ are useful in the context of constrained optimisation.

Differentiable Optimisation and Its Limitations

Consider a real function, i.e. $f : X \to Y$ with $X \subset \mathbb{R}$ and $Y \subset \mathbb{R}$. Excluding exotic cases, we assume that the function is continuous and the domain connected (an interval). Further, we require that the interval is closed, i.e. $X = [a, b] \subset \mathbb{R}$ with $a < b$, so that the existence of extrema for a continuous $f$ is guaranteed (this follows from the extreme value theorem, see Sutherland (1975)). In particular, the closedness of $[a, b]$ excludes degeneracies like $X = (0, +\infty)$ with $f : x \mapsto 1/x$; here $f$ is continuous on $X$, but has no extrema: no maximum when $x \to 0$ and no minimum for $x \to +\infty$.
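The "existence and quality" discussion above can be checked in a few lines of code (a sketch of ours; the integer search range is an arbitrary choice): restricting the domain of $f(x) = (x - \sqrt{2})^2$ from $\mathbb{R}$ to $\mathbb{Z}$ preserves existence of the minimum but degrades its quality.

```python
import numpy as np

f = lambda x: (x - np.sqrt(2))**2
x_real = np.sqrt(2)                  # global minimiser over X = R
x_int = min(range(-5, 6), key=f)     # global minimiser over X = Z (here: 1)
print(f(x_real), f(x_int))           # 0.0 versus (1 - sqrt(2))^2 > 0
```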
[Figure 2.1: Possible extrema for a continuous $f : [a, b] \to Y \subset \mathbb{R}$: (1) on the boundary $x_1^* = a$ or $x_4^* = b$, (2) in the interior $x_2^*, x_3^* \in (a, b)$. Note that at $x = x_3^*$ the function $f$ is not differentiable (but still continuous).]
For the set-up of $f : [a, b] \to Y \subset \mathbb{R}$, $f$ continuous, there are two possibilities: (1) the extrema occur on the boundary of $X$, i.e. $x^* = a$ or $x^* = b$, or (2) in the interior $\mathrm{Int}(X)$ of $X$, i.e. $x^* \in (a, b)$, see Figure 2.1. The extrema may be points at which the derivative exists ($x_2^*$ in Figure 2.1), or does not ($x_3^*$ in Figure 2.1). The simple example in Figure 2.1 shows that the familiar necessary condition of differentiable optimisation, $f'(x^*) = 0$, can only work in the interior of $X$. It obviously fails at the points of non-differentiability, but is not useful on the boundary either, as is now explained.

If $f'$ exists at a point $x_0 \in X$, it means that the two-sided limit
$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0^-} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0^+} \frac{f(x) - f(x_0)}{x - x_0} \quad (2.6)$$
must exist, i.e. the limiting value of the ratio $(f(x) - f(x_0))/(x - x_0)$ must be the same when approaching $x_0$ from the left and right. In particular, for $x_0 = a$ the left limit $\lim_{x \to x_0^-}$ cannot be formed, and similarly $\lim_{x \to x_0^+}$ is not well defined for $x_0 = b$. Even if we were to use one-sided limits, then $f'_+(a) = 0$ or $f'_-(b) = 0$ would emphatically not be a necessary condition for an extremum, as seen from Figure 2.1.

If, however, (2.6) exists at $x_0 \in \mathrm{Int}(X)$, then it yields a unique, linear approximation of $f$ in a neighbourhood of $x_0$, namely $f_1(x) = f(x_0) + f'(x_0)(x - x_0)$ for all $x \in N_\delta(x_0) = \{x \in X \mid |x - x_0| < \delta\}$ with $\delta > 0$. When $f'(x_0) < 0$ (Figure 2.2a), or $f'(x_0) > 0$ (Figure 2.2b), then we can always find some neighbouring points $x$ of $x_0$ such that $f(x) > f(x_0)$ or $f(x) < f(x_0)$, thus disqualifying $x_0$ as a candidate for a local extremum (see Definition 2.1.1).
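The role of the two-sided limit (2.6) can also be checked numerically (a sketch of ours; the test function and step size are arbitrary choices): at an interior point the backward and forward difference quotients agree, while at the boundary $x_0 = a$ only one of them is even defined on $[a, b]$.

```python
f = lambda x: (x - 1.0)**2             # smooth on [a, b] = [0, 2]

def quotients(x0, h=1e-6):
    backward = (f(x0) - f(x0 - h)) / h   # approximates the limit x -> x0-
    forward = (f(x0 + h) - f(x0)) / h    # approximates the limit x -> x0+
    return backward, forward

print(quotients(0.5))  # both ~ -1.0, so the two-sided limit (2.6) exists
# At x0 = a = 0 the backward quotient leaves [a, b]; moreover f'(0) = -2 != 0,
# yet a is still a (boundary) local maximum of f on [0, 2] -- cf. Figure 2.1.
```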
[Figure 2.2: Necessary condition for a differentiable extremum of $f : [a, b] \to Y \subset \mathbb{R}$ in $(a, b)$: (a) for $f'(x_0) < 0$ there cannot be an extremum, (b) for $f'(x_0) > 0$ no extremum either and (c) $f'(x_0) = 0$ must necessarily happen at a differentiable extremum.]
Hence, without the condition $f'(x_0) = 0$ there is no possibility of a differentiable (local) extremum. In order to establish whether this necessary condition indeed yields an optimal point, we look at the second-order approximation of $f$ at $x_0$, i.e.
$$f_2(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2$$
for all $x \in N_\delta(x_0)$. If $f''(x_0)$ exists, $f_2$ is well defined. If the necessary condition $f'(x_0) = 0$ holds, then the second-order approximation reduces to $f_2(x) = f(x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2$ with the following conclusions. For $f''(x_0) > 0$ we can "fit" locally a unique minimising parabola (Figure 2.3a), and for $f''(x_0) < 0$ a unique maximising one (Figure 2.3b). However, for $f''(x_0) = 0$ the second-order approximation becomes $f_2(x) = f(x_0)$ and yields no useful information about possible extrema at $x_0$.

A point at which $f'(x_0) = f''(x_0) = 0$ is thus called singular and requires special treatment. This involves higher order approximations to establish whether we can locally "fit" the curve $f^{(n)}(x_0)(x - x_0)^n$ with $n > 2$ odd or even. In the former case, there is no extremum (it is an inflection point, see Figure 2.3c); for $n$ even it is a minimum if $f^{(n)}(x_0) > 0$ and a maximum when $f^{(n)}(x_0) < 0$, see Figure 2.3d. Not only is this procedure more demanding computationally, but also it requires higher order smoothness, which may not hold. Thus the treatment of singular points is another limitation of differentiable optimisation.
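The classification just described, including the higher-order treatment of singular points, can be automated symbolically. The following sketch (ours, using SymPy; the four test functions are arbitrary) finds the first non-vanishing derivative at $x_0 = 0$ and applies the odd/even rule.

```python
import sympy as sp

x, x0 = sp.symbols('x'), 0
for f in (x**2, -x**2, x**3, x**4):
    n, d = 1, sp.diff(f, x)
    while d.subs(x, x0) == 0:              # first non-vanishing derivative
        n, d = n + 1, sp.diff(d, x)
    kind = ('inflection, no extremum' if n % 2 else
            'minimum' if d.subs(x, x0) > 0 else 'maximum')
    print(f, '-> order', n, ':', kind)     # x**3, x**4 are the singular cases
```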
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
14
b)
a) f (x)
f (x)
x
x0
x0
x
d)
c) f (x)
f (x)
x x0
x0
x
Figure 2.3 Sufficient condition for a differentiable extremum of f : [a, b] → Y ⊂ R in (a, b): (a) for f
(x0 ) > 0 there is a minimum; (b) for f
(x0 ) < 0 a maximum occurs; (c) for f
(x0 ) = 0 there may be no extremum (an inflection) point; (d) when f
(x0 ) = 0 an extremum may still occur, but requires a higher order approximation to detect.
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
15
scalar case the sets are made of intervals whose geometry is very limited. Thus, one may expect new phenomena for higher dimensions, having no counterparts for n = 1.
2.1.2 Finite Dimension: Two or More Variables In a regular single variable case the domain X is a closed interval [a, b] and therefore has the very simple interior (a, b) and the trivial boundary {a, b}. This is no longer true for X ⊂ Rn , n > 1. The generic possibilities will be illustrated for n = 2 to facilitate easy geometric visualisation and will be described analytically for an arbitrary n. As for the single variable, we look at the interior of the domain and its boundary. Extrema in the Interior of the Domain Consider a differentiable function of two variables f : R2 → R possessing at least one extremum. Since X = R2 , this is an unconstrained problem (there is no boundary), so the domain equals its interior. By assumption, the graph of f is a smooth surface in R3 and, by analogy with the scalar case, we seek its local linear approximation with a plane in R3 . This is given by f1 (x) = f (x0 ) + [∇f (x0 )]T (x − x0 ), valid for all x ∈ Nδ (x0 ) = {x ∈ X| x − x0 < δ} with δ > 0. Here superscript T denotes the transpose and z ( ni=1 zi2 )1/2 is the Euclidean norm, or the length of vector z. Hence, for n = 2, the neighbourhood Nδ (x0 ) is the open disc centred at x0 , radius δ. Using the same reasoning as for the one-variable case, we conclude that ∇f (x0 ) = 0 is a necessary condition for a differentiable extremum. In order to determine whether x0 is indeed an extremum and, if so, of what kind, we must consider the second-order approximation 1 f2 (x) = f (x0 ) + [∇f (x0 )]T (x − x0 ) + (x − x0 )H (x0 )(x − x0 )T 2 for all x ∈ Nδ (x0 ). Here H (x0 ) is the Hessian matrix2 of f evaluated at x0 . When the necessary condition ∇f (x0 ) = 0 holds, then the second-order approximation reduces to 1 f2 (x) = f (x0 ) + (x − x0 )H (x0 )(x − x0 )T . 2 If H (x0 ) is positive (H (x0 ) > 0) or negative (H (x0 ) < 0) definite, then (x − x0 )H (x0 )(x − x0 )T represents a unique paraboloid which we can “fit” locally when ∇f (x0 ) = 0 and thus an extremum is found. When the Hessian is indefinite, H (x0 ) ≶ 0, then a new phenomenon arises: a saddle point. Thus, the locally approximating quadratic surface (x − x0 )H (x0 )(x − x0 )T is minimised along some directions through x0 , and at the same time is maximised along others. A typical example is f : (x1 , x2 ) → x1 x2 , see Figure 2.4. There is no counterpart of saddle point in the scalar case, because the one-dimensional geometry does not allow such a possibility. If H (x0 ) is singular, det H (x0 ) = 0, then—as before—the second-order approximation yields no new information and x0 is a singular point. In order to resolve this singularity, higher order approximations must be considered. This involves unwieldy tensorial expressions and requires more smoothness. The latter cannot be taken for granted in higher dimensions, where differential geometry of surfaces may be surprisingly complex. 2 The Hessian matrix is the square matrix of all second-order partial derivatives of f , i.e. H ∂ 2 f/∂x ∂x . ij i j
16
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Figure 2.4 Saddle point: the function f : (x1 , x2 ) → x1 x2 is minimised along the direction x1 − x2 = 0, while at the same time maximised along x1 + x2 = 0.
Finally, it should be emphasised that a saddle point, H (x0 ) ≶ 0, is not a singular point, det H (x0 ) = 0. It is a separate phenomenon, characteristic for n > 1 and having no analogue in the scalar case. The above discussion started with the assumption that X = R2 , i.e. no constraints present. However, if we look at local extrema only, all that is needed is the existence of an open neighbourhood of x0 . By definition, for n = 2, this is an open disc centred at x0 , radius δ, which, by an appropriate choice of δ > 0, is entirely contained in the domain X: Nδ (x0 ) = { x ∈ X | x − x0 < δ }. all points are in X
open disc
This cannot be the case if x0 is a boundary point, see xˆ 0 in Figure 2.5. However, it will always be possible (for a sufficiently small δ) when x0 is an interior point of X. Hence, the conditions ∇f (x0 ) = 0 and H (x0 ) > 0 (or H (x0 ) < 0) are relevant for any domain with a non-empty interior. By the same token, these conditions are irrelevant for detecting extrema on the boundary of X; the boundary has no interior. Extrema on the Boundary: Equality Constraints The presence of constraints in the formulation of the optimisation problem alters the nature of the solution radically. Consider two functions p : R2 → R and s : R2 → R given by p : (x1 , x2 ) → x12 + x22 and s : (x1 , x2 ) → x1 x2 , i.e. a circular paraboloid p and a saddle
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
17
x2
x^0
x0 x1
Figure 2.5 Interior and boundary points: x0 is an interior point, as its open disc is wholly inside the domain; xˆ 0 is a boundary point, as every open disc around it must contain both interior and exterior points.
s, together with the equality constraint g(x) = x12 + x22 − 1 = 0. In other words, the only admissible points are those on the unit circle. For p the solution is infinitely many extrema, as this function is simply constant on g(x) = 0. This is very different from a single minimum√(x1∗ , x2∗ )√= (0, 0) in the uncon√ x3∗ = (−1/ 2, strained case. For s there are two global minima at x2∗ = (1/ 2, −1/ 2) and √ √ √ ∗ ) = s(x ∗ ) = − 1 , and two global maxima at x ∗ = (1/ 2, 1/ 2) and 1/ 2) for which s(x 3 1 2 √ √2 x4∗ = (−1/ 2, −1/ 2) with s(x1∗ ) = s(x4∗ ) = 12 . This follows from the fact that the constraint cuts from the surface of the saddle a closed curve which is a sinusoid “trapped” on a unit circle, see Figure 2.6. More formally, the function s is restricted to the image of the constraint g(x) = 0 under s (see Definition 2.1.2 on page 11). This, again, is very different from the unconstrained case when s had no extrema at all. The main reason for this sudden change is that the formulation optimise f (x),
x ∈ R2
subject to g(x) = 0
(2.7) (2.8)
implicitly replaces the function f of two variables with a new function of fewer variables. For the first example optimise p(x) = x12 + x22 subject to g(x) = x12 + x22 − 1 = 0,
(2.9) (2.10)
18
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Figure 2.6 Extrema of the saddle s : (x1 , x2 ) → x1 x2 with the equality constraint g(x) = x12 + x22 − 1 = 0. The admissible part of the saddle surface is the image of the constraint (unit circle) under s (left), resulting in√a sinusoid around a√circle√ in R3 (right). There are √ “trapped” ∗ ∗ two global minima at x2 = (1/ 2, −1/ 2) and x3 = (−1/ 2, 1/ 2) for which s(x2∗ ) = √ √ √ √ s(x3∗ ) = − 12 , and two global maxima at x1∗ = (1/ 2, 1/ 2) and x4∗ = (−1/ 2, −1/ 2) with s(x1∗ ) = s(x4∗ ) = 12 .
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
19
we simply compute x12 + x22 = 1 from (2.10) and substitute it into (2.9), thus obtaining = 1 p(x). ¯ (2.11) p(x) g(x)=0
The new function p¯ depends on zero variables now (as it is constant) and has no constraints.
x2
t2 t1
p1
p2
x1
t3
p3
Figure 2.7 Implicit function theorem for the constraint g(x) = x12 + x22 − 1 = 0. The arc in the vicinity of p1 can be used to express x2 as a function of x1 and vice versa; the tangent of t1 is non-zero. The arc in the neighbourhood of p2 allows only for expressing x1 in terms of x2 : the tangent of t2 is infinite (inverse tangent is zero). Finally, the situation around p3 is the mirror image of p2 . For the second example, optimise s(x) = x1 x2 subject to g(x) =
x12
+
(2.12) x22
− 1 = 0,
(2.13)
we want to express x2 in terms of x1 using (2.13) and substitute the result into (2.12). This way we should produce a new function of one variable, namely x1 , which has no constraints. However, this elimination of the equality constraint is not as simple as before, because x2 cannot be expressed globally as a function of x1 . This is illustrated in Figure 2.7: x2 = (1 − x12 )1/2 (upper semi-circle), or x2 = −(1 − x12 )1/2 (lower semi-circle), but not both. Similarly, x1 = (1 − x22 )1/2 (right semi-circle), or x1 = −(1 − x22 )1/2 (left semi-circle), but not both. However, if g is a more complicated function of x and/or n > 3, then explicit elimination and a graphical illustration are unlikely. A more systematic method is offered by the implicit function theorem, see Krantz and Parks (2002), which is a sufficient condition for the existence of the function expressing x2 in terms of x1 (or vice versa), relying on the smoothness of g. This theorem generalises to any n and also is valid in the infinite dimension.
20
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Before explaining the theorem we need to address the issue of well-posedness of the constrained problem. It is useful to think about the constraint as a level set of the function g. The graph of g : (x1 , x2 ) → x12 + x22 − 1 is the paraboloid x12 + x22 shifted downwards by 1 along the x3 -axis. Let this surface in R3 be intersected by planes parallel to the x1 x2 -plane and positioned at different levels of the x3 -axis, see Figure 2.8. For each plane, the resulting intersection with the graph of g is a level set. The actual curve of the constraint is the zero level set. For n = 2 a collection of orthogonal projections of level sets is the contour plot, see Figure 2.8. In other words, each level set is the inverse image g −1 (B) of a single-element set B = {l}, where l is the level (value on the x3 -axis), see Definition 2.1.3 on page 11. Hence, the constraint is the inverse image of {0}, i.e. g −1 ({0}) = {(x1 , x2 ) ∈ R2 | x12 + x22 − 1 = 0}.
Figure 2.8 The equality constraint g(x) = x12 + x22 − 1 = 0 is the zero level set of the function g. The graph of the function is a paraboloid shifted downwards by 1 along the x3 -axis. When intersected with planes parallel to the x1 x2 -plane at different levels of the x3 -axis (left), this surface will produce level sets. A collection of level sets in R2 is a contour plot, i.e. orthogonal projections of the level sets on the x1 x2 -plane (right). Recasting the problem in this language has the advantage of clarifying the issue of existence of solutions to (2.7)–(2.8), independent of the dimension. The fundamental requirement is that g −1 ({0}) = ∅, i.e. the constraint must allow at least one point. Returning to the threedimensional picture of (2.13) as a downwards-shifted paraboloid, it is useful to imagine a plane parallel to the x1 x2 -plane moving from +∞ towards −∞ along the x3 -axis, see −1 Figure 2.8. For levels l > −1 the √ resulting level set, or inverse image g ({l}), will be a circle centred at (0, 0), radius |l|. When l = −1, the level set becomes a single point, g −1 ({−1}) = (0, 0), and for l < −1 the level sets are empty.
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
21
Suppose now that the constraint is well-posed (the zero level set is non-empty) and that g is a differentiable function, so that ∇g exists everywhere. Our example of g : (x1 , x2 ) → x12 + x22 − 1 is sketched in Figure 2.7, where three tangent lines to the zero level curve (the unit circle) are drawn. Each line determines a local linear approximation of the curve. These approximations can help determine whether it is possible locally to express x2 as a function of x1 , or vice versa. Thus, t1 shows that in a neighbourhood of the tangency point p1 both x2 = φ(x1 ) and x1 = ψ(x2 ) are possible. On the other hand, t2 indicates that x1 = ψ(x2 ) is available around p2 , but not x2 = φ(x1 ), and conversely for t3 and p3 . While the unit circle is the zero level set of the surface corresponding to the graph of g, the lines t1 , t2 , t3 are the zero level sets of the tangent planes to this surface at p1 , p2 , p3 , respectively. It follows that unless the tangent plane is parallel to the x1 x2 -plane, there will be useful tangent lines, as shown in Figure 2.7. Otherwise, the zero level set of the tangent plane is the plane itself and no new information is obtained. Thus, so long as at least one of the partial derivatives gx1 , gx2 in ∇g is non-zero, the variables x1 , x2 can be expressed in terms of one another in at least one way. This is formalised as follows. Theorem 2.1.4 (Implicit function theorem) Let g : R2 → R have continuous partial derivatives gx1 and gx2 at a point x 0 = (x10 , x20 ). If at this point g(x10 , x20 ) = 0
(2.14)
gx2 (x10 , x20 ) = 0,
(2.15)
and then there exists a function φ : R → R such that x2 = φ(x1 )
(2.16)
g(x1 , φ(x1 )) = 0
(2.17)
and in some neighbourhood Nδ (x 0 ) of this point. Moreover, φ is continuously differentiable and its derivative is given by φ (x) = −
gx1 (x) gx2 (x)
(2.18)
for all x in Nδ (x 0 ). A symmetric result holds if the roles of x2 and x1 are exchanged and φ replaced with ψ . This theorem is an existence result: it provides a sufficient condition for resolving (2.8), but does not say how to do it effectively. In general, solving an implicit equation is often difficult, so a brute force extraction of x2 = φ(x1 ), or x1 = ψ(x2 ), from (2.8) and substitution into (2.7) is not always practical. However, Theorem 2.1.4 does give in (2.18) an effective formula for the derivative of φ, or ψ. With this formula it is possible to exploit in practice the basic intuition of eliminating the equality constraint and reducing the dimension of the problem. This is the method of Lagrange multipliers described below.
22
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Extrema on the Boundary: Lagrange Multipliers Suppose the constraint g(x) = 0 satisfies the conditions of the implicit function theorem, i.e. ∇g = 0, or the partial derivatives gx1 and gx2 are never zero at the same time. Recall that for n = 2 graphs of both f and g are surfaces in R3 . In the optimisation process the values of f are allowed to change and to each of such value l there corresponds a level curve. A collection of all these curves, plotted on the x1 x2 plane, is the level set portrait, see Figure 2.9b, on the right. It is the result of the orthogonal projection (on the zero level plane) of the surface which is the graph of f , see Figure 2.9b, on the left. On the other hand, the only manifestation of g is its zero level curve, which can be superimposed on the level set portrait of f , see Figure 2.9b, on the right. The minimisation process may be visualised as follows, see Figure 2.9a. Move across the level set portrait of f in the direction of decreasing values l of f . While doing so, make sure that the level curves of f intersect the constraint curve g(x) = 0, otherwise there will be no feasible points. The lowest level value li for which this is still possible is a minimum.3 The thus obtained solution x ∗ can be characterised analytically. At this point the corresponding level curve of f and the constraint curve have a common tangent: gx1 fx2 gx2 fx1 = or = . (2.19) fx2 x=x ∗ gx2 x=x ∗ fx1 x=x ∗ gx1 x=x ∗ One of the expressions in (2.19) must exist, because g satisfies the assumptions of Theorem 2.1.4, i.e. either gx1 = 0 or gx2 = 0. The reason for the common tangent property is easily noticed from Figure 2.9a, but can be derived formally, using Theorem 2.1.4. Denoting the common ratio by λ, equation (2.19) can be rewritten as fx1 + λgx1 = 0
(2.20)
fx2 + λgx2 = 0
(2.21)
g(x) = 0,
(2.22)
which, together with is a system of three equations in three unknowns and thus—in principle—soluble, yielding x ∗ . By inspection, if F (x, λ) f (x) + λg(x) is defined, then (2.20)–(2.22) are a result of ∇F = 0, or a necessary condition for an unconstrained extremum of F . Solving (2.20)– (2.22) means indirect finding of x2 = φ(x1 ) and substituting it into f . It is indirect, as we work with the derivative of φ, rather than φ itself. Although an extra variable, λ, is introduced, the onerous problem of actually finding φ is avoided altogether. Moreover, λ carries with it interesting information: it is the sensitivity of the optimal solution to the change in constraint. Indeed, from (2.20) we have ∂f fx1 ∂f ∂x1 λ=− = − ∂g =− . gx1 x=x ∗ ∂g x=x ∗ x=x ∗
(2.23)
∂x1
This should be understood as follows. Suppose that we found an optimal solution x ∗ under the constraint g(x) = 0, for which the optimal value is f (x ∗ ). Then we perturb the constraint 3 For maximisation move in the direction of increasing values l of f .
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION Common tangent line x1
23
x*
x2
Decreasing values of f
g( x ) = 0 Constraint curve
f = l 5 < l4 f = l 4 < l3 f = l 3 < l2 f = l 2 < l1 f = l1
a) General illustration
b) Concrete example Figure 2.9 Geometric interpretation of Lagrange multipliers. (a) The optimal solution is x ∗ , the point of common tangency; elsewhere f (x1 ) > f (x ∗ ) < f (x2 ). (b) Example for the function f (x) = x12 − x1 x2 + x22 − 1 and the equality constraint g(x) = x1 + 2x22 − 0.3 = 0. On the right the ellipses are the level curves of f , the black curve is the constraint g(x) = 0, the vectors are the field of −∇f (indicating the decrease of f ), the grey line is the common tangent of f and g, and the black dot is the optimal solution x ∗ . The 3-D geometry is on the left: the surface is the graph of f , the grid of dashed lines is the zero level plane Z in which the constraint curve lies, the black dot is the optimal solution x ∗ = (0.218, 0.203) ∈ Z, the white dot is the optimal value f (x ∗ ) = −0.956; note that f (x ∗ ) lies below Z.
24
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
infinitesimally to g(x) ¯ = 0 and find the corresponding optimal solution x¯ ∗ , so that the new ∗ optimal value is f (x¯ ). The sensitivity ∂f λ=− ∂g x=x ∗ is a quantitative measure of the infinitesimal change of the optimal value from f (x ∗ ) to f (x¯ ∗ ) and its sign indicates the direction of change (increase/decrease). Inequality Constraints: Active Set The earlier developments allow us, in principle, to deal with inequality constraints g(x) 0.
(2.24)
The key observation is that (2.24) can be split into either g(x) < 0 or g(x) = 0. The first situation is the same as for the extrema in the interior of the domain, while the second is identical to that for the extrema on the boundary. However, in order to apply the appropriate method we must know, during the search for an extremum, whether we are on the constraint boundary, g(x) = 0; if we are not, we must be in the interior, g(x) < 0. If we are on the boundary, then we say that the constraint is active. A systematic way of establishing the active set of constraints is given by the Karush–Kuhn–Tucker conditions, lucidly explained in Bryson and Ho (1975, pp. 24–29), and therefore will not be elaborated here. In essence, the Karush–Kuhn–Tucker approach formalises a reasoning not unlike the one applied to (2.19). The result is simple conditions linking the gradient of f with the gradient of g. Summary for Two or More Variables The developments for n > 1 followed the pattern of the scalar case. However, since the underlying geometry is more involved than for intervals, new possibilities arose: saddle points, equality and inequality constraints. Constraints have a very strong influence on the nature of the optimal solution. As in the scalar case, the interior and boundary points must be treated separately, giving rise to the powerful method of Lagrange multipliers and the Karush–Kuhn– Tucker conditions. These findings carry over surprisingly well to the infinite dimension. However, infinitedimensional optimisation involves choosing a function rather than a vector of numbers, so there will arise some possibilities unknown for n < ∞. This is to be expected, as something (conceptually) similar happened in the transition from n = 1 to n > 1.
2.1.3 Infinite Dimension The problems outlined in Sections 2.1.1 and 2.1.2 for the finite dimension have their (more complex) counterparts in infinite-dimensional optimisation. However, the conceptual framework developed for n < ∞ is still applicable as a solution strategy, with appropriate generalisations. These are outlined below.
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
25
Domains in Infinite Dimension In the finite dimension the domain of the optimised function was a subset of Rn , i.e. the solution was sought among n-tuples of real numbers (x1 , x2 , . . . , xn ). In order to compare a candidate solution x ∗ with neighbouring vectors x, the vector length was defined, x ( ni=1 xi2 )1/2 , so that the distance x − x ∗ could be used. Once this quantitative characterisation of proximity was set up, the notions of continuity, derivative and extremum could be defined, see Definition 2.1.1 on page 10. This finite-dimensional set-up is formalised below. Definition 2.1.5 Let a non-empty set X ⊂ Rn be given.
i. A function f : X → R is continuous at x0 ∈ X if, and only if, ∀ε > 0 ∃δ > 0
x − x0 < δ ⇒ |f (x) − f (x0 )| < ε.
ii. A function f : X → R is differentiable at x0 ∈ X if, and only if, lim
h→0
|f (x0 + h) − f (x0 ) − A(x0 )h| = 0, h
where h x − x0 with x ∈ X. The map A : X → R is linear in h and depends on x0 only. iii. A point x ∗ ∈ X is a local extremum of the function f : X → R if, and only if, ∃δ > 0
∀x ∈ X
x − x ∗ < δ ⇒ f (x) f (x ∗ ) ∗
⇒ f (x) f (x )
(minimum) (maximum).
The form of Definition 2.1.5(ii) emphasises the fact that the derivative is a unique, linear approximation A(x − x0 ) of f in the immediate vicinity of x0 , x − x0 → 0. Since the defining limit exists, it is unique and thus so is A. Definition 2.1.5(iii) simply says that the neighbouring points of x ∗ cannot improve the value of f , see also Definition 2.1.1 on page 10. A more important feature of Definitions 2.1.5(i)–(iii) is that they will work unaltered for any set X for whose elements we can define (1) vector addition and scalar multiplication, and (2) vector length. Vectors in Rn are n-tuples, e.g. x = (x1 , x2 , . . . , xn ) and y = (y1 , y2 , . . . , yn ); their addition is performed component-wise: x + y (x1 + y1 , x2 + y2 , . . . , xn + yn ); and similarly for scalar multiplication: αx (αx1 , αx2 , . . . , αxn ), α ∈ R. If we let n → ∞, then we obtain a natural extension of an n-vector into an infinite sequence x = (x1 , x2 , . . . , xn , . . .), and the operations of vector addition and scalar multiplication generalise immediately. The question of extending the definition of vector length x from ( ni=1 xi2 )1/2 to ∞ 2 ∞ 2 1/2 is less straightforward. This is because the infinite series i=1 xi may not ( i=1 xi ) converge, while the finite sum ni=1 xi2 is always a number. A useful infinite-dimensional vector space only allows those infinite sequences which are quadratically convergent. It then
26
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
holds that x =
∞
xi2
1/2
<∞
and y =
∞
i=1
⇒ x + y =
yi2
1/2
<∞
i=1 ∞
1/2 (xi + yi )2 < ∞, i=1
so that the finite length property is preserved under vector addition, despite the infinite dimension. This set-up of vector spaces of infinite sequences generalises easily to vector spaces of functions. For example, if x, y : [a, b] → R are real functions on a time interval, then their addition is performed time-wise: (x + y)(t) = x(t) + y(t) and the length of x is x =
b ( a |x(t)|2 dt)1/2 . Again the defining integral must exist, for every vector must have finite length. However, functions can have properties like (piecewise) continuity or differentiability, not available for sequences, but having important effects. For example, the vector space C([a, b]) = {f : [a, b] → R| f continuous} has more elements than the vector space D([a, b]) = {f : [a, b] → R| f differentiable}, because if f is differentiable, then it is continuous, but not always conversely, e.g. f : t → |t| is in C([a, b]), but not in D([a, b]); thus, D([a, b]) ⊂ C([a, b]). This is not dissimilar (at least conceptually) to the differences between various number sets like R (reals), Q (rationals) or Z (integers). However, the elements of function spaces are much richer objects: each requires infinitely many numbers in order to be defined. The infinite processes involved allow defining for the functions (piecewise) continuity, differentiability, boundedness, periodicity—attributes for which there are no counterparts in number √ sets. The common thread, though, is the conclusion from the example f : x → (x − 2)2 of Section 2.1.1 considered for Z, Q and R: a change of domain profoundly influences the character of the solution to the optimisation problem. For example, in optimal control a bang–bang solution often occurs in time-optimal problems which requires the control functions to be piecewise continuous, otherwise a solution simply does not exist. Differentiable Optimisation in Infinite Dimension The basic object of interest in infinite-dimensional optimisation is a functional, or a mapping from any set X to reals, J : X → R; in our case X will be taken as a subset of a function space. Since a functional is real valued, it can serve as a tangible performance index. From the mathematical point of view Definitions 2.1.5(i)–(iii) apply as soon as · is defined. In particular, if a functional J has a unique linear approximation for a function x0 ∈ X, as in Definition 2.1.5(ii), then we can talk about the derivative of the functional. This then allows defining the first variation, an analogue of the finite-dimensional differential df = f (x0 )dx. The first variation is denoted by δJ and the optimisation theory of
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
27
functionals is known as the calculus of variations, see Elsgolc (1962). Also, the second variation, denoted by δ 2 J , can be defined as the analogue of the unique, quadratic (second-order) approximation of J at x0 . Since the structure of the vector space is independent of dimension, we obtain the necessary and sufficient conditions for a local extremum δJ = 0 and δ 2 J > 0 (δ 2 J < 0), analogous to the finite-dimensional case. Equality constraints are again dealt with by a generalisation of the Lagrange multipliers, which are now functions as well, i.e. if x : [a, b] → R, then also λ : [a, b] → R. The situation becomes somewhat more involved when the calculus of variations is generalised to optimal control. This is discussed next in Section 2.2.
2.2 The Optimal Control Problem The problem is to find an admissible control u = u(t) which minimises the performance index tf min J = φ[x(tf ), tf ] + L[x(t), u(t), t]dt (2.25) u∈U
t0
with respect to the state vector functions X = {x : [0, tf ] → Rn | xi , i = 1, . . . , n, piecewise continuously differentiable},
(2.26)
and the control vector functions U = {u : [0, tf ] → U ⊂ Rm | ui , i = 1, . . . , m, piecewise continuous},
(2.27)
subject to the following constraints: x˙ = f (x(t), u(t)) x(0) = x0 ∈ Rn ψ(x(tf ), tf ) = 0 ∈ R
p
C(x(t), u(t)) 0 ∈ Rq S(x(t)) 0 ∈ Rs
f : Rn+m → Rn
(2.28)
x0 known
(2.29)
ψ : R × R+ → R , p n, tf unknown
(2.30)
C : Rn+m → Rq
(2.31)
S : Rn → Rs .
(2.32)
n
p
The performance index describes a quantitative measure of the performance of the system over time. In aerospace problems, a typical performance index gives an appropriate measure of the quantities such as minimum fuel/energy, optimal time, etc. Here φ : Rn+1 → R1 and L : Rn+m → R1 are assumed to be sufficiently often continuously differentiable in all arguments. The type of performance index (2.25) is said to be in the Lagrange form when φ ≡ 0 and in the Mayer form when L ≡ 0, see Oberle and Grimm (1989). Furthermore, it is in the linear Mayer form when L ≡ 0 and φ is linear. Minimising J with respect to the control function u = u(t) must be accomplished in a way consistent with the dynamics of the system whose performance is optimised. In other words, equation (2.28), expressing the system’s dynamics, is the first fundamental equality constraint. The optimal control u∗ = u∗ (t), when substituted into (2.28), will produce the optimal state x ∗ = x ∗ (t), while minimising J .
28
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
The optimal state x ∗ is further constrained by the boundary conditions (2.29) and (2.30): in our case the launch and strike conditions. These are point constraints, i.e. they act only at selected points of the trajectory—t0 and tf —as opposed to the path constraints, valid for (t0 , tf ) and discussed below. It is a remarkable feature of optimal control problems that changes in the terminal conditions (2.30) may have a profound impact on the structure of the solution throughout the whole interval (t0 , tf ). The optimal control u∗ and optimal state x ∗ are subject to path constraints (2.31) and (2.32). Unlike boundary conditions (2.29)–(2.30), these conditions must be satisfied along the trajectory, i.e. on (t0 , tf ), which is a more demanding requirement than for the point constraints. In further contrast to (2.29)–(2.30), the path constraints are inequality constraints, making their analysis more involved. This is briefly discussed now, separately for (2.31) and (2.32). In the case of mixed state–control constraints (2.31), either (1) C = 0 or (2) C < 0, and establishing the subintervals of (t0 , tf ) when (1) occurs is of fundamental importance. If the constraint is active, case (1), then (2.28) and (2.31) become a system of differential algebraic equations. Indeed, equation (2.31) then implicitly defines the state x as a function of control u, effectively lowering the dimension of the original system of controlled ordinary differential equations (2.28). It is important to note that, when (2.31) is active, the algebraic relationship between x and u is clear (at least in principle), provided that the assumptions of the implicit function theorem hold. In the case of pure state constraints (2.32), again, either (1) S = 0 or (2) S < 0, and the occurrence of (1) is the key issue. However, the situation is now more challenging compared with the previous one, C = 0, because it is not explicit how S = 0 constrains the choice of u and thus how to modify the search for optimal control. Various approaches are possible, and are discussed in Sections 2.3.3 and 3.4.2, but it should be noted that the presence of a pure state constraint (2.32) is always a challenge in the context of optimal control. By contrast, the mixed (state and control) constraint (2.31) is easier to deal with, due to the explicit presence of control u. Finally, it should be mentioned that (2.31) or (2.32) may be active on a subinterval of (t0 , tf ) or just at a point. In the former case, the constrained (active) subarc will be characterised by the entry time t1 and the exit time t2 with t0 t1 < t2 tf . In the latter case, the subarc collapses to a single (touch) point, t1 = t2 . The functions appearing in (2.25)–(2.32) are assumed to be sufficiently continuously differentiable with respect to their arguments. Note that the definition of U allows discontinuities in controls and thus implies corners (cusps) in the states, so that X comprises piecewise smooth functions. This is a practical necessity, as many real-world applications of optimal control involve bang–bang types of inputs. Problem (2.25)–(2.32) is infinite dimensional: its solution is not a finite vector of numbers, but a function. For a real-life application it is impossible to guess the optimal function, so recourse to approximate methods is necessary. 
They attempt to find a finite-dimensional representation of the solution which is accurate at the nodes of the representation, has acceptable error between the nodes and converges to the true function as the number of nodes tends to infinity, if second-order sufficient conditions hold. There are two main approaches to the solution of the problem. The direct approach replaces the continuous time interval with a grid of discrete points, thus approximating it with a finite-dimensional problem, albeit of high dimension (hundreds of discretised variables).
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
29
The indirect approach preserves the infinite-dimensional character of the task and uses the theory of optimal control to solve it. This is the focus of Section 2.5.2, while the direct approach is treated in Section 2.5.1.
2.3 Variational Approach to Problem Solution The indirect approach to solution of the optimal control problem is based on a generalisation of the calculus of variations. Necessary conditions for an extremum are derived by considering the first variation of the performance index J with constraints adjoined in the manner of Lagrange. Since the setting is infinite dimensional, the familiar Lagrange multipliers are now functions of time, λ = λ(t), and are called co-states in analogy to the system state x = x(t). While in the finite-dimensional case the multipliers are computed from algebraic equations, the co-states obey a differential equation. The necessary conditions thus entail both the original differential equations of the underlying dynamical system and the associated adjoint differential equations of the co-states. The end result is a two-point boundary value problem (TPBVP) which is made up of the state and co-state equations together with the initial and terminal conditions. The approach is called indirect, because the optimal control is found by solving the auxiliary TPBVP, rather than by a direct focus on the original problem. Here, we consider the general nonlinear optimal control problem given by equations (2.25)–(2.30). The performance index is given in the Bolza form, so it contains a final cost function in addition to the general cost function. Introducing the Lagrange multipliers λ and ν and adjoining the dynamic equations and the boundary conditions to the performance index, we obtain the following augmented performance index: Ja = φ[x(tf ), tf ] + ν T ψ[x(tf ), tf ] tf L[x(t), u(t), t] + λT f [x(t), u(t), t] − x˙ dt. +
(2.33)
t0
The first-order necessary conditions can be derived by applying the variational approaches as follows: δJa =
∂φ ∂φ ∂ψ ∂ψ δxf + δxf + ν T δtf + δν T ψ + ν T δtf ∂x(tf ) ∂tf ∂x(tf ) ∂tf + (L + λT (f − x)) ˙ |t =tf δtf + (L + λT (f − x)) ˙ |t =t0 δt0 tf ∂L ∂L δx + δu + δλT (f − x) + ˙ ∂x ∂u t0 ∂f ∂f δx + λT δu − λT δ x˙ dt. + λT ∂x ∂u
(2.34)
Integrating by parts the last term of (2.34)
tf t0
−λ δ x˙ = −λ (tf )δx(tf ) + λ (t0 )δx(t0 ) + T
T
T
tf
t0
λ˙ T δxdt.
(2.35)
30
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Since the final time tf is free, the variation between the final state, δxf , and the state at the final time, δx(tf ), are different and can be defined as follows (see Figure 2.10): ˙ f )δtf . δxf = δx(tf ) + x(t
(2.36)
By substituting equations (2.35) and (2.36) into (2.34) we obtain ∂φ T ∂ψ T +ν − λ (tf ) δxf + δν T ψ δJa = ∂x(tf ) ∂x(tf ) ∂φ T ∂ψ T T + +ν + (L + λ (f − x) ˙ + λ x) ˙ |t =tf δtf ∂tf ∂tf + (L + λT (f − x) ˙ + λT x) ˙ |t =t0 + λT δx(t0 ) tf
∂L ∂f ∂f ∂L + λT + λ˙ T δx + +λ δu + ∂x ∂x ∂u ∂u t0 + δλT (f − x) ˙ dt.
(2.37)
The extremum of the functional J is obtained when the first variation δJa vanishes. Thus the necessary conditions can be established by setting the coefficients of the independent variations δx, δu, δλ and δν of (2.37) equal to zero. The initial state x(t0 ) and initial time t0 are given in this case; consequently δx(t0 ) and δt0 are both zero. The boundary at the terminal conditions is given by the first and third components of (2.37). In summary, the necessary conditions for J to have an extremum value are • state equation
x˙ = f [x(t), u(t), t]
(2.38)
∂L ∂f + λT ∂x ∂x
(2.39)
∂L ∂f +λ ∂u ∂u
(2.40)
• co-state equation −λ˙ T = • stationarity condition 0=
• boundary condition ∂φ ∂ψ ∂ψ ∂φ + νT − λT δxf + + νT + H δtf = 0 ∂x ∂x ∂tf ∂tf t
(2.41)
f
where H = L + λT f . If the final time and the final state are both free, the boundary conditions (2.41) can be rewritten as ∂φ ∂ψ + νT λ(tf )T = ∂x(tf ) ∂x(tf ) ∂φ ∂ψ 0= + νT + H (tf ). ∂tf ∂tf
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
x(t)
31
δxf δx(tf ) x(t ˙ f )δxf
xf
x(t) + δx(t) x(t)
x0
t0
tf
tf + δtf
Figure 2.10 The difference between δxf and δx(tf ).
t
32
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Similarly, the derivation of the necessary conditions can be done by defining the Hamiltonian and the auxiliary function as follows: H (x, u, λ) L(x, u) + λT f (x, u)
(2.42)
(x, t, ν) φ(x, u) + ν ψ(x, u).
(2.43)
T
Here λ : [0, tf ] → Rn and ν ∈ Rp denote Lagrange multipliers or adjoint variables. The following necessary conditions (see Bryson and Ho (1975), Pesch (1994)) are obtained: • differential equations of Euler–Lagrange x˙ T = Hλ =
• minimum principle
∂H ∂λ
(2.44)
∂H λ˙ T = −Hx = − ∂x
(2.45)
u = arg min H (x, λ)
(2.46)
u∈U
• transversality conditions λT (tf ) = x |t =tf ( t + H )|t =tf = 0.
(2.47) (2.48)
If u appears nonlinearly in H , the control function can eliminated as a function of x and λ. This can be obtained in most practical problems explicitly: u = u(x, λ).
(2.49)
Otherwise, the solution can be computed iteratively from the implicit equation Hu = 0, provided that the assumptions of the implicit function theorem hold, too. Note that Hu = 0 and Huu > 0 (positive definite) are sufficient conditions for the necessary condition of the minimum principle (2.46) to hold, if U is an open set. The latter condition Huu > 0, respectively Huu 0 (positive semi-definite), is also known as the necessary condition of Legendre–Clebsch, respectively the strengthened necessary condition of Legendre–Clebsch, in the calculus of variations.
2.3.1 Control Constraints In this case the constraint contains the control u(t) only: C(u(t)) 0. Therefore the constraint can be adjoined directly to the Hamiltonian by a Lagrange multiplier. If u appears linearly in H , firstly we assume that m = 1 and U = [umin, umax ]. Equation (2.42) can be written in the form H (x, u, λ) = H1 (x, λ) + uH2 (x, λ).
(2.50)
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
33
Hu = 0 does not determine the optimal control solution. If the second term H2 (x, λ) does not vanish identically on subinterval [tentry , texit ] of [0, tf ] with tentry < texit , the minimum principle yields umax if H2 < 0 u= umin if H2 > 0. H2 is called the switching function associated with the control variable u. However, if H2 vanishes on a subinterval of [0, tf ], the control variable u has a singular subarc. In this case the optimal control variable can be computed by successive differentiation of the switching function H2 with respect to time t until the control variable appears explicitly, see Bryson and Ho (1975, p. 110). The case for vector u is treated similarly, but may be more involved.
2.3.2 Mixed State–Control Inequality Constraints In this section, the constraint includes the state and control variables: C(x(t), u(t)) 0. The mixed inequality constraint can be adjoined directly to the Hamiltonian as in the previous section. For simplicity, we assume that m = q = 1 and using the augmented Hamiltonian H (x, u, λ) = L(x, u) + λT f (x, u) + µC(x, u).
(2.51)
Necessary conditions for minimising the Hamiltonian can then be derived. The Lagrangian parameter µ is 0 if C < 0 µ= µ 0 if C = 0. The Euler–Lagrange equations become Lx − λT fx T λ = −Hx = Lx − λT fx − µCx
if C < 0 if C = 0.
The control u(t) along the constrained arc can be derived from the mixed constraints: C(x, u) = 0 for all t with t1 t t2 and t1 < t2 ,
(2.52)
and the control variable can be represented by a function u = u(x)
(2.53)
if equation (2.52) can be uniquely solved for u. If Cu = 0, the multiplier µ is given by (2.46): Hu Lu + λT fu + µCu .
(2.54)
2.3.3 State Inequality Constraints We now summarise some results of optimal control theory for problems with a state variable inequality constraint (2.32) based on Bryson’s formulation. Consider now the following equation: S(x(t)) 0, S : Rn → Rs .
34
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
For simplicity, we assume that m = s = 1 and that the constraint is active on a subinterval S(x(t)) ≡ 0
for all t ∈ [t1 , t2 ] ⊂ [0, tf ].
(2.55)
We take successive total time derivatives of (2.32) and substitute f (x(t), u(t)), until we obtain explicit dependence on u. We obtain on [t1 , t2 ] S(x) ≡ 0, S (1) (x) ≡ 0, . . . , S (r−1) (x) ≡ 0
(2.56)
(x, u) = 0.
(2.57)
with
S
(r)
If r is the smallest non-negative number such that (2.57) holds, r is called the order of the state constraint. Here S (r) (x, u) plays the role of C(x, u) in (2.51) so that the Hamiltonian is H (x, u, λ, µ) = L + λT f + µS (r) . Again for µ we have
µ=
0 µ0
(2.58)
if S (r) < 0 if S (r) ≡ 0.
The control u on the constrained arcs can be derived from (2.57) and µ from (2.46). The right-hand sides of the differential equations for the adjoint variables (2.45) are to be modified along [t1 , t2 ]. In order to guarantee that not only (2.57) but also (2.56) are satisfied, we have to require that the so-called entry conditions are fulfilled: N T (x(t1 ), t1 ) := (S(x(t1 )), S (1) (x(t1 )), . . . , S (r−1) (x(t1 ))) = 0.
(2.59)
Therefore λ generally is discontinuous at t1 and continuous at t2 . Sometimes boundary points occur instead of boundary arcs. If, for example, the order is r = 2, the following conditions hold: S(x(tb )) = 0 S (1) (x(tb )) = 0. (2.60) The first condition is regarded as an interior point condition and yields a possible discontinuity of λ; the second condition determines tb . Singular arcs are treated in a similar manner, leading to multi-point boundary value problems with jump conditions and switching functions.
2.4 Nonlinear Programming Approach to Solution As seen from Section 2.3, the indirect approach entails a considerable amount of rather nontrivial theoretical conditions. The relevant conditions have to be applied judiciously in order to formulate a TPBVP appropriate to the optimal control problem in question. Then the resulting TPBVP has to be solved and the optimal control u∗ calculated from the TPBVP solution including the optimal state x ∗ and co-state λ∗ . An alternative, aimed at avoiding the above complications, is to discretise the original problem (2.25)–(2.32) and interpret the result as a finite-dimensional optimisation problem. This approximation will result in a nonlinear programming (NLP), problem with equality and inequality constraints, possibly of high dimension due to the fineness of the discretisation grid.
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
35
Hence we begin by recalling the basics of the NLP problem and associated necessary conditions for optimality. Suppose we have the optimisation problem as follows: min f (x)
(2.61a)
subject to gi (x) 0,
i = 1, . . . , m
(2.61b)
where the objective and the constraint functions (2.61) are assumed to be continuously differentiable. The problem is to find such a solution x ∗ that minimises the objective function, out of all possible solutions x that satisfy the constraints. Introducing the Lagrange function L(x, µ), m L[x, µ] = f (x) + µi gi (x), i = 1, . . . , m (2.62) i=1
where µi , i = 1, . . . , m, are known as the Lagrange multipliers. The theorem of Karush, Kuhn and Tucker gives first-order necessary conditions for a point x to be a local minimum. Theorem (Karush–Kuhn–Tucker (KKT) necessary conditions) Given the optimisation problem (2.61), where f (x), gi (x), i = 1, . . . , m, are differentiable. Let x ∗ be a point satisfying all constraints, and let ∇gi (x ∗ ), i = 1, . . . , m, be linearly independent at x ∗ . If x ∗ is a local optimum of (2.61) then there exist scalars µ∗i , i = 1, . . . , m, such that ∇f (x ∗ ) +
m
µ∗i ∇gi (x ∗ ) = 0
(2.63a)
i=1
µ∗i gi (x ∗ ) = 0,
i = 1, . . . , m
(2.63b)
µ∗i 0,
i = 1, . . . , m.
(2.63c)
Equation (2.63a) is equivalent to the equation ∇x L(x ∗ , µ∗ ) = 0; the scalars µi are called Lagrange multipliers. For a proof of the above theorem, see e.g. Bazaraa et al. (1993). Let us focus on the Mayer type of the performance index of (2.25): J = φ[x(tf ), tf ]
(2.64)
x˙ = f [x(t), u(t), t].
(2.65)
subject to dynamic equations Consider the nonlinear programming problem as follows: Y T = [u(t1 ), . . . , u(tN ), x(t1 ), . . . , x(tN )]
(2.66)
where t0 = t1 < t2 < · · · < tN = tf , defining hd = tf /N or tk+1 = tk + hd . Equation (2.65) can be approximated by the Euler method: x˙ = f [x(t), u(t), t] ≈
xk+1 − x(tk ) hd
(2.67)
for sufficiently small hd . The optimal control problem (2.64)–(2.65) can be transformed as follows. The objective function (2.64) can be rewritten in a discrete approach: J = φ[x(tN )].
(2.68)
36
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
The dynamic equation (2.67) can be rewritten as follows: xk+1 − x(tk ) − hd f [x(tk ), u(tk ), tk ] = 0
(2.69)
and becomes an equality constraint. Thus the Lagrangian for the discrete optimal control above is L[x, µ] = φ[x(tN )] +
m−1
µi [xk+1 − x(tk ) − hd f [x(tk ), u(tk ), tk ]].
(2.70)
i=1
The KKT necessary conditions for the discrete approach are ∂L = xk+1 − x(tk ) − hd f [x(tk ), u(tk ), tk ] = 0 ∂µk
(2.71)
∂L ∂f = (µk − µk−1 ) + hd µTk =0 ∂xk ∂xk
(2.72)
∂L ∂f = hd µTk =0 ∂uk ∂uk
(2.73)
∂φ ∂L = µtN −1 + = 0. ∂xtN ∂xtN
(2.74)
Equations (2.71)–(2.74) can be used as estimators of equations (2.38)–(2.42) by letting N → ∞ and hd → 0. Thus the KKT necessary conditions can be used as estimators of the optimal control necessary conditions.
2.5 Numerical Solution of the Optimal Control Problem The numerical solution of the optimal control problem can be categorised into two main approaches. The first approach corresponds to the direct method which is based on discretisation of state and/or control variables over time using (2.25)–(2.32), so that an NLP solver can be used. The second approach corresponds to the indirect method where (2.38)–(2.41) are used. The first step of this method is to formulate an appropriate TPBVP and the second step is to solve the TPBVP numerically.
2.5.1 Direct Method Approach Direct methods are based on the transformation of the original optimal control problem (2.25)–(2.32) into an NLP problem by discretising the state and/or control history and then solving the resulting NLP problem. A variety of direct methods has been developed and applied. Gradient algorithms were proposed by Kelley (1962) and by Bryson and Denham (1962). Pytlak solved a state-constrained optimal control problem using a gradient algorithm and applied it for some problems, see Pytlak (1999) and Pytlak (1998). Hargraves and Paris (1987) reintroduced the direct transcription approach, by discretising the dynamic equations using a collocation method. A cubic polynomial is used to approximate the state variables and linear interpolation for the control variables. The collocation scheme was originally used by Dickmanns and Well (1974) to solve TPBVPs. Seywald et al. introduced an approach
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION
37
based on the representation of the dynamical system in terms of differential inclusions. This method employs the concepts of hodograph space and attainable sets, see Kumar and Seywald (1995), Seywald et al. (1994), Seywald (1994) and Seywald and Kumar (1997). Direct transcriptions have been presented in detail by many researchers, e.g. Betts (2001), Betts and Huffman (1991), Betts and Huffman (1992), Betts (1994b), Betts (1994a), Betts and Huffman (1993), Betts (1998), Betts and Huffman (2003), Enright and Conway (1991b), Enright and Conway (1991a), Herman and Conway (1996), Tang and Conway (1995), Fahroo and Ross (2002), Ross and Fahroo (2004b), Ross and Fahroo (2004a), Elnagar et al. (1995), Elnagar and Razzaghi (1997), Elnagar (1997) and Elnagar and Kazemi (1998). Based on the discretisation of the state and/or control, direct methods can be categorised into three different approaches. The first approach is based on state and control variable parameterisation. Both the control and the state are discretised and then the resulting discretisation is solved using an NLP solver. The direct collocation approach, based on the full discretisation of the state and control, is considered below and compared with partial discretisation in which just the control is discretised while the state is obtained recursively. Then the Legendre pseudospectral method is considered where the state and control variables are approximated using the Lagrange interpolation polynomial. The second approach is control parameterisation, so that the state and performance index can be solved by numerical integration. This approach is known as control parameterisation and its idea is to approximate the control variables and compute the state variables by integrating the state equations. The control variables can be approximated by choosing an appropriate function with finitely many unknown parameters. This method was described by Rosenbrock and Storey (1966) and Hicks and Ray (1971) by representing the control as
u(t) =
n
ai φi (t)
(2.75)
i=0
where ai denote unknown parameters and φi (t) are some polynomial functions. Hicks and Ray (1971) reported some difficulties with these methods. Brusch (1974) introduced piecewise polynomials for the control approach in equation (2.75) and this modification can handle constraints efficiently. These methods can be found in many papers and books, e.g. Teo et al. (1999) have studied control parameterisation by introducing variable switching time into equivalent standard optimal control problems involving piecewise constant or piecewise linear control functions with pre-fixed switching times. Control parameterisation with the direct shooting method has been studied and applied for mechanical multi-body systems by Gerdts (2003) where the control is parameterised by the B-spline function. Further references include Goh and Teo (1988), Teo et al. (1991), Teo and Wong (1992), Kraft (1994) and Kraft (1985). The third approach is based on state parameterisation only, see Sirisena and Chou (1981). Jaddu and Shimemura (1997) solved unconstrained nonlinear optimal control problems by transforming them into a sequence of quadratic programming problem and state parameterisation. They extended it for constrained nonlinear optimal control problems using Chebyshev polynomials for the state parameterisation, see Jaddu and Shimemura (1999a) and Jaddu and Shimemura (1999b).
38
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
Direct Collocation Approach The basic approach for solving optimal control problems by direct collocation is to transform the optimal control problem into a sequence of nonlinear constrained optimisation problems by discretising the state and/or control variables. Two approaches will be considered. The first approach is based on the discretisation of both the state and control variables. The following derivation is based mainly on that of von Stryk and Bulirsch (1992). The duration time of the optimal trajectory is divided into subinterval as follows: t0 = t1 < t2 < t3 < · · · < tk = tf .
(2.76)
The state and control variables at each node are xj = x(tj ) and uj = u(tj ), such that the state and control variables at the nodes are defined as NLP variables: Y = [u(t1 ), . . . , u(tk ), x(t1 ), . . . , x(tk )].
(2.77)
The controls are chosen as piecewise linear interpolating functions between u(tj ) and u(tj +1 ) for tj t tj +1 as follows: uapp (t) = u(tj ) +
t − tj [u(tj +1 ) − u(tj )]. tj +1 − tj
(2.78)
The value of the control variables at the centre is given by u(tj ) + u(tj +1 ) . (2.79) 2 The piecewise linear interpolation is used to prepare for the possibility of discontinuous solutions in control. The state variable x(t) is approximated by a continuously differentiable and piecewise Hermite–Simpson cubic polynomial between x(tj ) and x(tj +1 ) on the interval tj t tj +1 of length qj : 3 j t − tj xapp (t) = cr (2.80) pj r=0 u(tc,j ) =
j
c0 = x(tj ) j
c1 = qj fj j
c2 = −3x(tj ) − 2qj fj + 3x(tj +1 ) − qj fj +1 j
c3 = 2x(tj ) + qj fj − 2x(tj +1 ) + qj fj +1 where fj = f (x(tj ), u(tj ), tj ),
qj = tj +1 − tj
tj t tj +1 , j = 1, . . . , k − 1. The value of the state variables at the centre point of the cubic approximation is xc,j =
x(tj ) + x(tj +1 ) q(f (tj ) + f (tj +1 )) + 2 8
(2.81)
OPTIMAL CONTROL: OUTLINE OF THE THEORY AND COMPUTATION and the derivative is 3(x(tj ) + x(tj +1 )) q(f (tj ) + f (tj +1 )) dxc,j =− − . dt 2q 4
39
(2.82)
In addition, the chosen interpolating polynomial for the state and control variables must satisfy the midpoint conditions for the differential equations as follows: f [xapp(tc,j ), uapp (tc,j ), tc,j ] − x˙ app (tc,j ) = 0.
(2.83)
Equations (2.25)–(2.32) in Section 2.2 can now be defined as a discretised problem as follows: min f (Y ), (2.84) subject to f (xapp (t), uapp (t), t) − x˙ app = 0
(2.85)
xapp (t1 ) − x1 = 0
(2.86)
ψ(xapp (tk ), tk ) = 0
(2.87)
C(xapp (t), uapp (t), t) 0
(2.88)
S(xapp (t), t) 0
(2.89)
where xapp , uapp are the approximations of the state and control, constituting Y in (2.84). This above discretisation approach has been implemented in the DIRCOL package which employed the sequential quadratic programming (SQP) method SNOPT by Gill et al. (2002) and Gill et al. (1993). In contrast with the DIRCOL approach, Büskens and Maurer (2000) proposed to discretise the control only and use an NLP solver with respect to the discretised control only. The corresponding discretised state variables can be determined recursively using a numerical integration scheme (e.g. Euler, Heun, Runge–Kutta, etc.). This approach has been implemented in the NUDOCCCS package by Büskens (1996). NUDOCCCS has more flexibility in choosing the numerical method approach for both the control and state variables. For simple problems, low-order (e.g. Euler) numerical integration of the state is sufficient, but for complex problems, especially when a pure state inequality constraint occurs, the numerical integration approach can be more advanced (e.g. Runge–Kutta). One of the main advantages of DIRCOL and NUDOCCCS is that both packages provide an approximation for the co-state variables λ which can then be used as an initial guess in the indirect multiple shooting approach, see Section 2.5.2. Each of the packages uses a different approach to obtain the co-state variables. In DIRCOL the co-state variables are derived as follows. Consider equations (2.84)–(2.89) and define the Lagrangian equation as follows: Lf = f (Y ) +
k
λ(f (xapp (t), uapp (t), t) − x˙ app )
i=1
+ κ(xapp(t1 ) − x1 ) + νψ(xapp (tk ), tk ) +ς
q i=1
C(xapp (t), uapp (t), t) +
s i=1
S(xapp (t), t).
(2.90)
40
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
By using the necessary conditions and the Lagrange multiplier of the discretised equality and inequality constraints from equation (2.90), the co-state variables can be approximated (see the KKT necessary condition on equations (2.71)–(2.74)). Büskens and Maurer employed a different way by using a recursive approach to compute the state variable. In this case, a discretised version of equation (2.90) can be solved recursively after optimal u and x are obtained, see Büskens and Maurer (2000, p. 92). Another way of obtaining the co-state approximation is by exploiting the Lagrangian equation Lp = f (Y ) + κ(xapp(t1 ) − x1 ) + νψ(xapp (tk ), tk ) +ς
q
C(xapp (t), uapp (t), t) +
i=1
s
S(xapp (t), t)
(2.91)
i=1
and then determining the derivative of equation (2.91) with respect to the state p
λ = Lx
(2.92)
Pseudospectral Method for Optimal Control

Among the direct transcription methods for the optimal control problem there is also the Legendre pseudospectral method, see Benson (2005), Elnagar et al. (1995), Elnagar and Kazemi (1998), Fahroo and Ross (2001), Fahroo and Ross (2002), Ross and Fahroo (2004b) and Ross and Fahroo (2004a). This method is based on spectral collocation, in which the trajectories of the state and control variables are approximated by Nth-degree Lagrange interpolating polynomials. The values of the variables at the interpolating nodes are the unknown coefficients; in this technique the nodes are the Legendre–Gauss–Lobatto points.

Consider the Legendre–Gauss–Lobatto (LGL) points t_i, i = 0, ..., N, distributed on the interval τ ∈ [−1, 1]. These points are given by t_0 = −1, t_N = 1 and, for 1 ≤ i ≤ N − 1, t_i are the zeros of \dot{L}_N, the derivative of the Legendre polynomial L_N. The transformation between the LGL domain τ ∈ [−1, 1] and the physical domain t ∈ [t_0, t_f] is given by the linear relation

t = \frac{t_f - t_0}{2}\,\tau + \frac{t_f + t_0}{2}.   (2.93)

The approximations of the state and control variables at the LGL points are given by the Nth-degree Lagrange interpolating polynomials as follows:

x(t) \approx X(t) = \sum_{i=0}^{N} x(t_i)\, L_i(t)   (2.94)

u(t) \approx U(t) = \sum_{i=0}^{N} u(t_i)\, L_i(t)   (2.95)

where L_i(t) are the Lagrange interpolating polynomials of order N, defined by

L_i(t) = \frac{1}{N(N+1)\, L_N(t_i)} \, \frac{(t^2 - 1)\, \dot{L}_N(t)}{t - t_i}, \qquad L_i(t_l) = \begin{cases} 1 & \text{if } l = i \\ 0 & \text{if } l \neq i. \end{cases}   (2.96)

The state approximation (2.94), substituted into the dynamic equations, must satisfy the condition of the exact derivative of (2.94) at the LGL points. The derivative of (2.94) is given by

\dot{x}(t_k) \approx \dot{X}(t_k) = \sum_{i=0}^{N} x(t_i)\, \dot{L}_i(t_k) = \sum_{i=0}^{N} D_{ki}\, x(t_i)   (2.97)

where D_{ki} = \dot{L}_i(t_k) are the entries of the (N + 1) × (N + 1) pseudospectral Legendre derivative matrix, defined by Elnagar (1997) as

D_{ki} = \begin{cases} \dfrac{L_N(t_k)}{L_N(t_i)} \, \dfrac{1}{t_k - t_i} & \text{if } k \neq i \\ -N(N+1)/4 & \text{if } k = i = 0 \\ N(N+1)/4 & \text{if } k = i = N \\ 0 & \text{otherwise.} \end{cases}   (2.98)

The objective function (2.25) is discretised using the Gauss–Lobatto quadrature rule:

\min J = \phi[X(t_N), t_N] + \sum_{k=0}^{N} L[X(t_k), U(t_k), t_k] \, w_k   (2.99)

where w_k are the LGL weights. The boundary conditions are imposed on the approximated state variables at X_1 and X_N:

\psi(X_1, X_N) = 0.   (2.100)
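The LGL nodes, the quadrature weights of (2.99) and the derivative matrix of (2.98) can all be computed in a few lines. The following sketch (Python/NumPy; the Newton iteration started from Chebyshev–Gauss–Lobatto points is the standard device, and all function names are ours, not from any of the packages cited above) builds these quantities.

```python
import numpy as np

def lgl_nodes_weights(N):
    # Newton iteration for the LGL points, i.e. the zeros of
    # (1 - t^2) * L_N'(t), started from the Chebyshev-Gauss-Lobatto points.
    # P[:, k] holds the Legendre polynomial L_k at the current iterate.
    t = np.cos(np.pi * np.arange(N + 1) / N)
    P = np.zeros((N + 1, N + 1))
    t_old = 2.0 * np.ones_like(t)
    while np.max(np.abs(t - t_old)) > 1e-14:
        t_old = t.copy()
        P[:, 0], P[:, 1] = 1.0, t
        for k in range(2, N + 1):          # Bonnet three-term recursion
            P[:, k] = ((2 * k - 1) * t * P[:, k - 1]
                       - (k - 1) * P[:, k - 2]) / k
        t = t_old - (t * P[:, N] - P[:, N - 1]) / ((N + 1) * P[:, N])
    w = 2.0 / (N * (N + 1) * P[:, N] ** 2)  # LGL weights w_k of (2.99)
    return t[::-1], w[::-1], P[::-1, N]     # nodes in ascending order

def lgl_diff_matrix(t, LN):
    # Pseudospectral Legendre derivative matrix, entry-by-entry as in (2.98);
    # LN holds L_N evaluated at the nodes.
    N = len(t) - 1
    D = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for i in range(N + 1):
            if k != i:
                D[k, i] = (LN[k] / LN[i]) / (t[k] - t[i])
    D[0, 0] = -N * (N + 1) / 4.0
    D[N, N] = N * (N + 1) / 4.0
    return D

t, w, LN = lgl_nodes_weights(16)
D = lgl_diff_matrix(t, LN)   # then sum_i D[k, i] * x(t_i) approximates xdot(t_k)
```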
The optimal control problem can now be solved as an NLP problem by using an established NLP solver. The pseudospectral method has been implemented in the commercially available software DIDO, see Ross and Fahroo (2002), and is (in principle) capable of producing estimates of co-states. However, recent work by Benson (2005) shows that this does not work for the pure state constraint case, even for the simple benchmark problem of Bryson and Ho (1975), see the Appendix.

Direct Multiple Shooting

The basic idea of the direct multiple shooting method is to transform the original optimal control problem (2.25)–(2.32) into an NLP problem by coupling the control parameterisation with a multiple shooting discretisation of the state variables (see Keller (1992), Stoer and Bulirsch (2002), Ascher et al. (1995)). The control is approximated by piecewise functions and the state variables are approximated at the shooting nodes t_i (see Figure 2.11). The initial values x(t_i) of the state variables at the nodes t_i must be guessed. Then, in each interval, the state equations are integrated individually from t_i to t_{i+1}. In addition, the continuity (matching) conditions must be satisfied, which require that at each internal node the value x(t_{i+1}) equals the final value of the preceding trajectory segment. This method has been implemented in PROMIS. Consider now the following boundary value problem:

\dot{x} = f[x(t), u(t)], \qquad r[x(t_0), x(t_f)] = 0.   (2.101)

Figure 2.11 Multiple shooting.

The basic idea of multiple shooting is to find simultaneously the values

s_i = x(t_i), \quad i = 1, \ldots, n,   (2.102)

for the solution of the boundary value problem (2.101) at the discretised nodes

t_0 < t_1 < t_2 < \cdots < t_n = t_f.   (2.103)

We assume that the discretisation nodes for the control parameterisation are the same as for the state parameterisation. Suppose x[t; s_i, v_i] is the solution of the initial value problem (IVP)

\dot{x} = f[t, x, u(t, v_i)], \quad x(t_i) = s_i, \quad t \in [t_i, t_{i+1}].   (2.104)

The problem now is to find the vectors s_i, i = 0, 1, ..., n, and v_i, i = 0, 1, ..., n − 1, such that the function x(t) is pieced together, continuously, by the following IVP solutions:

x(t) := x[t; s_i, v_i] \quad \text{for } t \in [t_i, t_{i+1}), \quad i = 0, 1, \ldots, n - 1,   (2.105)
x(t_n) := s_n.   (2.106)

In addition, the boundary condition r[x(t_0), x(t_f)] = 0 must be satisfied by x(t). Hence, the boundary value problem (2.101) is solved on the whole interval. Consider now the following equation X(s):

X(s) = \begin{pmatrix} x[t_1; s_0, v_0] - s_1 \\ x[t_2; s_1, v_1] - s_2 \\ \vdots \\ x[t_n; s_{n-1}, v_{n-1}] - s_n \\ r[x(s_0), x(s_n)] \end{pmatrix} = 0   (2.107)

where the unknown variables

s = (s_0, s_1, \ldots, s_{n-1}, s_n)^T   (2.108)

must be found. The optimal control problem can now be rewritten as an NLP problem:

\min J(s, v) = \sum_{i=0}^{n-1} J_i(s_i, v_i)   (2.109)

subject to

x[t_{i+1}; s_i, v_i] - s_{i+1} = 0, \quad i = 0, 1, \ldots, n - 1,   (2.110)
r[x(s_0), x(s_n)] = 0.   (2.111)
The path constraints are transformed into vector inequality constraints at the multiple shooting nodes. The resulting NLP problem can then be solved by an established NLP solver. In contrast to the indirect multiple shooting method (described in Section 2.5.2 below), direct multiple shooting (DMS) does not require an analytic derivation of the optimality conditions and co-state equations, so the need for an initial guess for the co-state variables is avoided. In other words, for DMS, multiple shooting is applied directly to equations (2.25)–(2.32), not to equations (2.38)–(2.41). The main difference from direct collocation is that DMS has extra matching conditions, which improve convergence of the NLP and improve numerical stability by preventing the growth of errors introduced by a poor initial estimate, see Bock and Plitt (1984).
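As an illustration of the mechanics of (2.107)–(2.111), the following sketch (Python with SciPy; the double-integrator dynamics, cost and boundary values are our own placeholders, not PROMIS code) treats the node states s_i and piecewise-constant controls v_i as NLP variables and imposes the matching conditions as equality constraints for an SQP solver.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

n, t_nodes = 4, np.linspace(0.0, 1.0, 5)   # shooting nodes t_0 .. t_n

def f(t, x, v):
    return [x[1], v]             # toy dynamics, control constant per interval

def shoot(si, vi, ti, ti1):
    # Integrate one subinterval: returns x[t_{i+1}; s_i, v_i]
    sol = solve_ivp(f, (ti, ti1), si, args=(vi,), rtol=1e-8)
    return sol.y[:, -1]

def unpack(z):
    s = z[: 2 * (n + 1)].reshape(n + 1, 2)   # node states s_0 .. s_n
    v = z[2 * (n + 1):]                      # controls v_0 .. v_{n-1}
    return s, v

def matching(z):
    # X(s) = 0 of (2.107): continuity at interior nodes plus boundary conditions
    s, v = unpack(z)
    res = [shoot(s[i], v[i], t_nodes[i], t_nodes[i + 1]) - s[i + 1]
           for i in range(n)]
    res.append(s[0] - np.array([0.0, 0.0]))   # r: initial condition
    res.append(s[n] - np.array([1.0, 0.0]))   # r: terminal condition
    return np.concatenate(res)

def cost(z):
    _, v = unpack(z)
    return np.sum(v**2) * (t_nodes[1] - t_nodes[0])

z0 = np.zeros(2 * (n + 1) + n)
sol = minimize(cost, z0, constraints={"type": "eq", "fun": matching},
               method="SLSQP")
```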
2.5.2 Indirect Method Approach

In the indirect method the original problem (2.25)–(2.32) is not solved directly. Instead, the required solution is obtained by numerically solving equations (2.38)–(2.41), which constitute the optimality conditions for (2.25)–(2.32). Since (2.38)–(2.41) define a TPBVP, they must be approached using the multiple shooting method.

Multiple Shooting

The mathematical TPBVP is best understood by the simple example of firing a shell. Given the initial barrel orientation and the initial shell speed, one can compute the trajectory and, in particular, the impact point. This is known as the initial value problem (IVP), as all we need to know is the starting point. The main observation is that for a given initial condition, the terminal condition (impact point) is uniquely determined, because it follows from integration of the known differential equation of motion. However, if both the initial and terminal conditions are specified, then this is a TPBVP: the trajectory must be a solution of the defining differential equation, but must pass through
prescribed points at both ends (boundaries). In the shell example, this means that we must find such a combination of barrel orientation and projectile speed that the shell indeed lands at the prescribed impact point. This underpins the idea of the numerical method of shooting. It solves the TPBVP by repeated use of readily available procedures (e.g. Runge–Kutta) for solving the IVP. A guess of the initial point is made and the corresponding terminal point is computed. If it is not the prescribed one, the guess is optimally modified and serves as the starting point for the next use of an IVP solver. This process is repeated until convergence is obtained.

Figure 2.12 Shooting method procedure.

The procedure is illustrated in Figure 2.12 and can be explained as follows. The initial point has two parameters: position (always at the origin) and speed (variable). Trajectory a clearly overshoots the prescribed terminal point, so the speed was modified to get b, which now undershoots. Finally, c shows that systematic improvement can be attained. The actual details for a second-order equation \ddot{x} = f(t, x, \dot{x}) are given in Figure 2.13. The initial position x_0 = x(t_0) is fixed and so is the terminal one x_f = x(t_f). Thus the initial speed \dot{x}(t_0) has to be iteratively modified until the end of the trajectory is within the desired accuracy ε. The first guess s^{(1)} of the initial speed \dot{x}(t_0) is made to start the procedure and the corresponding initial value problem (IVP 1) is solved (block 1). The error X between the thus obtained terminal value x(t_f; s^{(1)}) and the desired one x(t_f) is formed (block 2) and checked against the desired accuracy ε (block 3). If the accuracy requirement is met, the desired trajectory has been found; if not, then the guess of the initial speed must be improved. The improvement is based on the idea that, ideally, the error X should be zero. In other words, we should try to find a value of the guess s^{(i)} of the initial speed \dot{x}(t_0) which yields X(s^{(i)}) = x(t_f; s^{(i)}) − x_f = 0. This is done by the well-known Newton procedure in block 7.
[Flowchart summary: block 1 solves IVP 1, \ddot{x} = f(t, x, \dot{x}), x(t_0) = x_0, \dot{x}(t_0) = s^{(i)}; block 2 forms the error X(s^{(i)}) = x(t_f; s^{(i)}) − x_f; block 3 tests |X(s^{(i)})| < ε; blocks 4–5 solve IVP 2 with \dot{x}(t_0) = s^{(i)} + Δs^{(i)} and form X(s^{(i)} + Δs^{(i)}); block 6 forms the finite-difference derivative ΔX(s^{(i)}) = [X(s^{(i)} + Δs^{(i)}) − X(s^{(i)})]/Δs^{(i)}; block 7 performs the Newton update s^{(i+1)} = s^{(i)} − X(s^{(i)})/ΔX(s^{(i)}).]
Figure 2.13 Shooting method flowchart for a second-order equation with boundary conditions x0 = x(t0 ) and xf = x(tf ).
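A minimal rendering of this flowchart in code is given below (Python with SciPy; the right-hand side f, the tolerance and the finite-difference step are placeholder assumptions). It solves the IVP of block 1 for the current guess s^{(i)}, forms the error X(s^{(i)}), and applies the Newton update of block 7 with the finite-difference derivative of blocks 4–6.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative second-order BVP: x'' = f(t, x, x') with x(t0) = x0, x(tf) = xf.
def f(t, x, xdot):
    return -x                        # placeholder right-hand side

def terminal_value(s, t0, tf, x0):
    # Block 1 / block 4: solve the IVP with initial speed s, return x(tf; s)
    rhs = lambda t, y: [y[1], f(t, y[0], y[1])]
    sol = solve_ivp(rhs, (t0, tf), [x0, s], rtol=1e-10)
    return sol.y[0, -1]

def shooting(t0, tf, x0, xf, s=0.0, eps=1e-8, ds=1e-6):
    for _ in range(50):
        X = terminal_value(s, t0, tf, x0) - xf        # block 2
        if abs(X) < eps:                              # block 3
            return s
        dX = (terminal_value(s + ds, t0, tf, x0) - xf - X) / ds  # blocks 4-6
        s = s - X / dX                                # block 7: Newton update
    raise RuntimeError("shooting did not converge")

s_star = shooting(0.0, 1.0, x0=0.0, xf=1.0)   # initial speed x'(t0)
```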
The preceding blocks 4–6 perform the auxiliary computations: block 6 is the approximation ΔX of the derivative of X and needs the results of blocks 4 and 5. As a consequence, another IVP must be solved (block 4), so that the computation becomes more expensive. The main drawback of the shooting method is its sensitivity to the initial guess, because of the use of the Newton iteration (block 7). To overcome this problem, the trajectory must be split into subintervals and the same shooting method applied on each subinterval, which results in the method of multiple shooting. The theoretical background for multiple shooting is the same as for DMS in Section 2.5.1. However, indirect multiple shooting solves the problem using a Newton iteration. In a highly constrained optimal control problem, jump and switching conditions on the co-state or control variables may occur. In order to handle those conditions, new nodes must be inserted into the subintervals. Consider the following boundary value problem, see Oberle and Grimm (1989, p. 30):

\dot{x}(t) = f_k(t, x(t)), \quad \xi_k \leq t \leq \xi_{k+1}, \quad 0 \leq k \leq s   (2.112a)
x(\xi_k^+) = h_k(\xi_k, x(\xi_k^-)), \quad \text{for } k = 1, \ldots, s   (2.112b)
r_i(x(t_0), x(t_f)) = 0, \quad \text{for } 1 \leq i \leq n_1   (2.112c)
r_i(\xi_{k_i}, x(\xi_{k_i}^-)) = 0, \quad \text{for } i = n_1 + 1, \ldots, n + s.   (2.112d)

In the optimal control framework, equation (2.112a) represents the state and co-state equations, which are piecewise smooth functions, and \xi_k, with k = 1, ..., s, are switching points. For each k, equation (2.112b) is a jump condition at the switching point \xi_k. The n_1 boundary conditions at the initial, t_0, and final, t_f, time are described by equation (2.112c) and the s conditions at the switching points are given by equation (2.112d). Suppose S_j are the initial guesses for x(t_j) and \Xi_j are the initial guesses for the switching points \xi_j. Let us define

y(t) = \begin{pmatrix} x(t) \\ \xi \end{pmatrix}   (2.113)

and Y_j = \begin{pmatrix} S_j \\ \Xi_j \end{pmatrix}. The problem now is to find the solution of the IVP

\dot{y}(t) = \begin{pmatrix} f(t, x(t)) \\ 0 \end{pmatrix}, \quad t_j \leq t \leq t_{j+1},   (2.114)
y(t_j) = Y_j, \quad j = 1, \ldots, n - 1,   (2.115)
where y(t) consists of the switching point ξ and must be computed simultaneously in the numerical processes. A modified Newton method is used to determine y(t). A professional version of the modified Newton algorithm, tailored to optimal control applications, has been implemented in FORTRAN and is available as the package BNDSCO, see Oberle and Grimm (1989). However, it should be emphasised that even the best TPBVP solver cannot overcome the fundamental problem of a narrow convergence interval inherent in the TPBVP.
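For readers without access to BNDSCO, the setup of a simple TPBVP can be illustrated with SciPy's collocation-based solver solve_bvp. The example below is a toy problem without switching points or jump conditions, shown only to make the problem structure concrete; it is in no way a substitute for BNDSCO's modified Newton multiple shooting.

```python
import numpy as np
from scipy.integrate import solve_bvp

# Toy TPBVP: x'' = -x with x(0) = 0, x(1) = 1, as a first-order system.
def rhs(t, y):
    return np.vstack([y[1], -y[0]])

def bc(ya, yb):
    # Boundary residuals r[x(t0), x(tf)] = 0
    return np.array([ya[0] - 0.0, yb[0] - 1.0])

t = np.linspace(0.0, 1.0, 11)
y0 = np.zeros((2, t.size))      # initial guess for the solution profile
sol = solve_bvp(rhs, bc, t, y0)
```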
2.6 Summary and Discussion

This chapter presented an overview of the optimal control problem and its numerical solution. Real-life nonlinear optimal control problems cannot, in general, be solved analytically and must be solved numerically. Numerical solutions of continuous optimal control problems can be categorised into two different approaches: (1) the direct and (2) the indirect method. Direct methods are based on the transformation of the original optimal control problem into a nonlinear programming (NLP) problem by discretising the state and/or control history and then solving the resulting problem using an NLP solver. The indirect method solves the optimal control problem by deriving the necessary conditions based on Pontryagin's minimum principle. In the indirect method the user must derive the appropriate equations for the co-state variables, together with the transversality and optimality conditions, before the problem can be solved using a boundary value problem solver. Furthermore, the problem is more involved when path constraints are present. First, the sequence of constrained/unconstrained arcs must be guessed and the corresponding switching and jump conditions must be derived, which is a non-trivial task. Secondly, the narrow convergence domain of the multiple shooting method must be considered. Finally, the co-state variables must be guessed, which is a non-intuitive task because these variables have no physical meaning. In contrast, the direct method is easy to implement because all it requires is a fairly straightforward discretisation of the original problem, but its accuracy is lower than that of the indirect method.
3 Minimum Altitude Formulation

In Section 1.2 the problem formulation for the terminal bunt manoeuvre was given for the minimum time and minimum altitude problem. This chapter presents the analysis and computation for the minimum altitude version of the terminal bunt manoeuvre. The cruise missile must hit the fixed target from above while minimising the missile's exposure to anti-air defences. This means that the flight altitude should be as low as possible, but the impact must be achieved by a vertical dive. This leads to the generic trajectory shape where the missile initially flies straight and level at the minimum altitude. When it approaches the target, it must climb (nose up) to gain enough height for the final dive (nose down). This up-and-down terminal manoeuvre is known as the bunt, and establishing its optimal parameters is an example of trajectory shaping. The quantity to be minimised is the integrated altitude, but this minimisation must take into account, among others, missile dynamics (manoeuvrability constraints of the platform), the final dive specifications and limits on controls (thrust and angle of attack). This chapter is organised as follows. In Section 3.1 the problem formulation for the minimum altitude version of the terminal bunt manoeuvre is given. The computational results are based on the DIRCOL package and are then used for a qualitative analysis of the main features of the optimal trajectories and their dependence on several constraints, as discussed in Section 3.2; a comparison with PROMIS and SOCS solutions is also given. The mathematical analysis, building on the qualitative analysis, is presented in Section 3.3. This section begins by discussing the constraint on the thrust, followed by path and mixed constraints. Section 3.4 focuses on the indirect method approach; the co-state approximation issue is considered in that section, which concludes with some numerical solutions using the multiple shooting package BNDSCO. Finally, Section 3.5 presents a summary and discussion.
3.1 Minimum Altitude Problem

The objective, as given in Section 1.2, is to determine the trajectory of the generic cruise missile from an initial state to a final state with minimum altitude along the trajectory. The objective can be formulated by introducing the performance index:

J = \int_{t_0}^{t_f} h \, dt.   (3.1)
This objective function is subject to the dynamic equations and some constraints as defined in Section 1.2.
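For reference, (3.1) is exactly the quantity a direct method evaluates by quadrature after discretisation; a minimal sketch (Python/NumPy, with a placeholder time grid and altitude history, not mission data) is:

```python
import numpy as np

# Integrated-altitude cost (3.1) on a time grid, by trapezoidal quadrature.
def integrated_altitude(t, h):
    dt = np.diff(t)
    return np.sum(0.5 * dt * (h[:-1] + h[1:]))   # J = integral of h dt

t = np.linspace(0.0, 40.0, 401)
J = integrated_altitude(t, np.full_like(t, 30.0))   # level flight at 30 m
```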
3.2 Qualitative Analysis

This section gives a qualitative discussion of the optimal trajectory of a cruise missile performing a bunt manoeuvre. The computational results of the terminal bunt manoeuvre are obtained using the direct collocation method package DIRCOL by von Stryk (1999), with the resulting nonlinear programming problem solved using the SNOPT solver, which is based on sequential quadratic programming, due to Gill et al. (1993). The important feature of DIRCOL is that it provides an approximation for the co-state variables. In this simulation the missile is assumed to be launched horizontally from the minimum altitude constraint h_0 = 30 m. The initial and final conditions are as follows:

\gamma_0 = 0°, \quad \gamma_{t_f} = -90°   (3.2a)
V_0 = 272 m/s, \quad V_{t_f} = 250, 270, 310 m/s   (3.2b)
x_0 = 0 m, \quad x_{t_f} = 10000 m   (3.2c)
h_0 = 30 m, \quad h_{t_f} = 0 m.   (3.2d)
Figure 3.1 shows a comparison of the DIRCOL and differential inclusion results, which are taken from Subchan et al. (2003). It can be seen that the DIRCOL results give a smoother solution for the controls; see also the comparison of DIRCOL, PROMIS and SOCS in Figures 3.13–3.17. Based on Figures 3.2–3.17, an attempt is made to identify characteristic arcs of the trajectory, classify them according to the constraints active on them, and suggest physical or mathematical explanations for the observed behaviour. The trajectory is split into three subintervals: level flight, climbing and diving. Each of the trajectory arcs corresponding to the subintervals is now discussed in turn.
3.2.1 First Arc (Level Flight): Minimum Altitude Flight, t_0 ≤ t ≤ t_1

The thrust and altitude constraints are active immediately at the start of the manoeuvre. In this case the altitude h of the missile remains constant at the minimum value h_min until the missile must start climbing, while the thrust is at the maximum value. The flight time depends on the final speed V_{t_f} (see Figure 3.3); in this case, the flight time is longer for the smaller final speed. The right-hand side of equation (1.5d) equals zero during this flight because the altitude remains constant. This means that the flight path angle γ equals zero, because the velocity V is never equal to zero during flight. In addition, γ(t) = 0 for t_0 ≤ t ≤ t_1 (see Figure 3.2 for the definition of t_1, t_2 and t_3) causes the derivative of the flight path angle γ̇ to be equal to zero.
Figure 3.1 Comparison of DIRCOL and differential inclusion results for the minimum altitude problem for final speed V_{t_f} = 250 m/s: (a) altitude versus downrange; (b) angle of attack versus time; (c) normal acceleration versus time; (d) thrust versus time.
Figure 3.2 Computational results for the minimum altitude problem using DIRCOL for V_{t_f} = 250 m/s: (a) altitude versus time; (b) angle of attack versus time; (c) normal acceleration versus time; (d) thrust versus time. t_1 is the time when the missile starts to climb, t_2 is the time when the thrust switches to the minimum value and t_3 is the time when the missile starts to dive.
Figure 3.3 Altitude versus time histories for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.

Figure 3.4 Speed versus time histories for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.
Figure 3.5 Flight path angle versus time histories for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.

Figure 3.6 Angle of attack versus time histories for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.
Figure 3.7 Thrust versus time histories for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.

Figure 3.8 Normal acceleration versus time histories for the minimum altitude problem using DIRCOL for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s.
Figure 3.9 Co-state λ_γ versus time histories for the minimum altitude problem using DIRCOL for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s. Compared with the flight path angle plots, Figure 3.5, the point of non-smoothness in λ_γ corresponds to deactivation of the altitude constraint as the missile starts climbing, see also Figure 3.3.
Figure 3.10 Co-state λ_V versus time histories for the minimum altitude problem using DIRCOL for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s. Compared with the speed plots, Figure 3.4, note that no constraint is active for V and therefore there is nothing special in the plots of λ_V.
Figure 3.11 Co-state λ_x versus time histories for the minimum altitude problem using DIRCOL for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s. Co-state λ_x is constant and therefore there is nothing special in the plots of λ_x, see equation (3.8).
Figure 3.12 Co-state λ_h versus time histories for the minimum altitude problem using DIRCOL for a varying final speed. Solid line is for V_{t_f} = 250 m/s, dashed line is for V_{t_f} = 270 m/s and dashdot line is for V_{t_f} = 310 m/s. Compared with the altitude plots, Figure 3.3, the jump point in λ_h corresponds to deactivation of the altitude constraint, which is a pure state constraint.
Figure 3.13 Downrange versus altitude comparison of DIRCOL, PROMIS and SOCS results for the minimum altitude problem for final speed V_{t_f} = 250 m/s.

Figure 3.14 Speed versus time histories comparison of DIRCOL, PROMIS and SOCS results for the minimum altitude problem for final speed V_{t_f} = 250 m/s.
Figure 3.15 Flight path angle versus time histories comparison of DIRCOL, PROMIS and SOCS results for the minimum altitude problem for final speed V_{t_f} = 250 m/s.

Figure 3.16 Angle of attack versus time histories comparison of DIRCOL, PROMIS and SOCS results for the minimum altitude problem for final speed V_{t_f} = 250 m/s.
Figure 3.17 Thrust versus time histories comparison of DIRCOL, PROMIS and SOCS results for the minimum altitude problem for final speed V_{t_f} = 250 m/s.
The dynamic equation (1.5) is therefore reduced as follows:

\dot{\gamma} = \frac{1}{V}\Big[ \frac{T - D}{m}\sin\alpha + \frac{L}{m}\cos\alpha - g \Big] = 0   (3.3a)
\dot{V} = \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha   (3.3b)
\dot{x} = V   (3.3c)
\dot{h} = 0.   (3.3d)

We now consider the consequences of the right-hand side of equation (3.3a) being zero. This condition means that the normal acceleration L/m remains almost constant, because the angle of attack α is very small. The first term on the right-hand side of equation (3.3a) is small, because sin α ≈ α ≈ 0, and we are left with L/m ≈ g due to cos α ≈ 1. During this time the speed increases, because for small α, from (3.3b),

\dot{V} \approx \frac{T - D}{m} > 0, \quad \text{as } T > D,

which, in turn, means that the angle of attack α must slowly decrease in order to maintain L/m approximately equal to g, in accordance with equation (1.8).
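Since (3.3a) is transcendental in α, a Newton iteration is a natural way to extract the angle of attack on this arc. The sketch below (Python/NumPy) uses lift and drag polynomials of the form C_l = B_1 α + B_2 and C_d = A_1 α² + A_2 α + A_3, as quoted with (3.12) below; every numerical value is a placeholder assumption, not the book's missile data.

```python
import numpy as np

# Minimal sketch: solve (3.3a) for the angle of attack on the level-flight
# arc, where gamma = 0 and gamma_dot = 0. All numbers are placeholders.
g, m, rho, S_ref = 9.81, 1000.0, 1.2, 0.5
B1, B2 = 10.0, 0.1            # assumed lift coefficient C_l = B1*alpha + B2
A1, A2, A3 = 1.0, 0.1, 0.01   # assumed drag coefficient C_d = A1*a^2 + A2*a + A3

def residual(alpha, V, T):
    q = 0.5 * rho * V**2 * S_ref
    L = q * (B1 * alpha + B2)
    D = q * (A1 * alpha**2 + A2 * alpha + A3)
    # right-hand side of (3.3a), multiplied through by V
    return (T - D) / m * np.sin(alpha) + L / m * np.cos(alpha) - g

def alpha_level_flight(V, T, alpha=0.0, tol=1e-12, h=1e-7):
    # Newton iteration with a finite-difference derivative; the guess
    # alpha = 0 keeps the iteration on the physically meaningful root
    for _ in range(50):
        r = residual(alpha, V, T)
        if abs(r) < tol:
            return alpha
        dr = (residual(alpha + h, V, T) - r) / h
        alpha -= r / dr
    raise RuntimeError("Newton iteration did not converge")
```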
Table 3.1 Performance index and final time for the minimum altitude problem for different terminal speeds using DIRCOL.

Final speed V_{t_f} (m/s)   J (m·s)       t_f (s)    No. of grid points
250                         13563.3003    40.3476    176
270                         16764.8403    40.7562    169
310                         31562.5639    41.3805    169
3.2.2 Second Arc: Climbing

The analysis presented here deals mainly with the case of a final speed of 250 m/s, because it exhibits switching in the thrust (see Figure 3.2d). The missile must eventually climb in order to achieve the final condition on the flight path angle γ_{t_f}. This occurs in the interval t_1 ≤ t ≤ t_3.

Climbing: Full Thrust and Maximum Normal Acceleration (t_1 ≤ t ≤ t_2)

At the beginning of the climbing manoeuvre the thrust is at the maximum value. Since altitude above h_min is penalised, the climb occurs as late as possible, so it must be done sharply and last as short a time as possible. Hence, at the beginning of the ascent the angle of attack must increase to facilitate a rapid nose-up motion, and the thrust is at the maximum value. During this time, the normal acceleration is saturated at the maximum value L_max due to the jump in the angle of attack α. The speed keeps decreasing while the angle of attack α and altitude h increase. This arc ends at t_2, when the thrust switches to the minimum value T_min. At that point the normal acceleration jumps to the minimum value L_min due to the jump in the angle of attack α.

Climbing: Minimum Thrust (t_2 ≤ t ≤ t_3)

While rapid climbing is necessary, the missile should also turn over to begin its dive as soon as possible, so that the excess of altitude (above h_min) is minimised. Thus, the thrust should soon be switched to its minimum value and, at the same time, the angle of attack should be decreased to negative values, further to promote pitching down. From the computational results (see Figure 3.2d) it follows that the thrust is switched to the minimum value before turnover. This switching occurs at approximately t_2 after firing. Immediately after the thrust is switched, the flight path angle γ decreases rapidly while the angle of attack α jumps. This jump in α causes the normal acceleration to jump, saturating at the minimum value L_min. When the normal acceleration is saturated at the minimum value, the angle of attack decreases further. At the same time the speed decreases and the altitude increases until the missile turns over. The missile turns over when γ = 0 and ḣ = 0. At the same time the thrust switches back to the maximum value to facilitate rapid arrival at the target. For the final speeds of 270 m/s and 310 m/s the thrust does not exhibit any switching.
3.2.3 Third Arc: Diving (t_3 ≤ t ≤ t_f)

The missile starts diving at approximately t_3 seconds. At the end of the manoeuvre the missile should hit the target with a prescribed speed V_{t_f}. The speed during turnover is smaller than the final speed V_{t_f}, so the speed must increase, and hence the thrust switches back to the maximum value for the case V_{t_f} = 250 m/s. This means that the thrust will facilitate the missile's arrival at the target as soon as possible. In this case the normal acceleration is at the minimum value. Obviously, the altitude goes down to reach the target (γ < 0 → ḣ < 0, see equation (1.5d)), while the speed goes up to satisfy the terminal speed condition V_{t_f}. Finally, the missile satisfies the terminal condition of the manoeuvre approximately t_f seconds after firing. Table 3.1 shows that the objective function is larger for a greater final speed. The performance index for the case V_{t_f} = 310 m/s is more than twice that for the case V_{t_f} = 250 m/s, while the final times differ by only about 1 second. The co-state approximation is given in Figures 3.9–3.12. It can be seen that the co-state approximation for λ_h has a jump at the exit of the pure state constraint (the minimum altitude constraint), see Figure 3.12.
3.3 Mathematical Analysis

The qualitative analysis of Section 3.2 is now made precise using optimal control theory.
3.3.1 The Problem with the Thrust Constraint Only

Firstly, we investigate the minimum altitude problem when the initial and final conditions (1.11) are active and the control is constrained in thrust T only (1.14). Necessary conditions for optimality can be determined by applying Pontryagin's minimum principle, see Pontryagin et al. (1962) and Bryson and Ho (1975). For this purpose, we first consider the following Hamiltonian:

H^{af} = h + \frac{\lambda_\gamma}{V}\Big[ \frac{T - D}{m}\sin\alpha + \frac{L}{m}\cos\alpha - g\cos\gamma \Big] + \lambda_V\Big[ \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha - g\sin\gamma \Big] + \lambda_x V\cos\gamma + \lambda_h V\sin\gamma,   (3.4)

where the co-state variables λ = (λ_γ, λ_V, λ_x, λ_h) have been adjoined to the dynamic system of equation (1.5). The co-state equations are determined by

\dot{\lambda} = -\frac{\partial H^{af}}{\partial x}.   (3.5)

The components of the co-state vector λ satisfying the preceding equation are

\dot{\lambda}_\gamma = -\Big[ \frac{\lambda_\gamma}{V} g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma \Big]   (3.6)

\dot{\lambda}_V = -\Big\{ \lambda_\gamma\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d \rho S_{ref}\sin\alpha}{2m} + \frac{C_l \rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2}\cos\gamma \Big] - \frac{\lambda_V \rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x\cos\gamma + \lambda_h\sin\gamma \Big\}   (3.7)

\dot{\lambda}_x = 0   (3.8)

\dot{\lambda}_h = -\Big\{ 1 + \lambda_\gamma\Big[ -\frac{C_d V S_{ref}\rho_h\sin\alpha}{2m} + \frac{C_l V S_{ref}\rho_h\cos\alpha}{2m} \Big] + \lambda_V\Big[ -\frac{C_d V^2 S_{ref}\rho_h\cos\alpha}{2m} - \frac{C_l V^2 S_{ref}\rho_h\sin\alpha}{2m} \Big] \Big\},   (3.9)
where ρ_h = 2C_1 h + C_2. The optimal values of the control variables are generally to be determined from Pontryagin's minimum principle. A necessary condition for optimal control is the minimum principle

\min_u H^{af},   (3.10)

i.e. the Hamiltonian must be minimised with respect to the vector of controls u = (T, α). Since the control T appears linearly in the Hamiltonian, the condition H_T^{af} = 0 does not determine the optimal thrust, but T is bounded, so the following provides the minimum of the Hamiltonian:

T = \begin{cases} T_{max} & \text{if } H_T^{af} < 0 \\ T_{sing} & \text{if } H_T^{af} = 0 \\ T_{min} & \text{if } H_T^{af} > 0 \end{cases}

with

H_T^{af} = \lambda_\gamma\frac{\sin\alpha}{Vm} + \lambda_V\frac{\cos\alpha}{m} \quad \text{(switching function).}   (3.11)

• Case when T is on the boundary (T = T_max or T = T_min). In this case α can be determined from

H_\alpha^{af} = \frac{T - D + L_\alpha}{m}\Big( \frac{\lambda_\gamma}{V}\cos\alpha - \lambda_V\sin\alpha \Big) - \frac{D_\alpha + L}{m}\Big( \frac{\lambda_\gamma}{V}\sin\alpha + \lambda_V\cos\alpha \Big) = 0   (3.12)

with

L_\alpha = \frac{1}{2}\rho V^2 S_{ref} B_1, \qquad D_\alpha = \frac{1}{2}\rho V^2 S_{ref}(2A_1\alpha + A_2).

The value of α cannot be derived in closed form from (3.12), and must be obtained numerically. Note that D, D_α and L depend on α, see (3.12).

• Case when T = T_sing (singular control). When the switching function H_T^{af} becomes zero on an interval (t_1, t_2) ⊂ (t_0, t_f), the control corresponding to the magnitude of the thrust T is singular. In these circumstances, there are finite control variations of T which do not affect the value of the Hamiltonian. From Bryson and Ho (1975, p. 246), singular arcs occur when

H_u^{af} = 0 \quad \text{and} \quad \det H_{uu}^{af} = 0.   (3.13)

Substituting (3.4) into (3.13) with the components u = (T, α) yields

H_T^{af} = \lambda_\gamma\frac{\sin\alpha}{Vm} + \lambda_V\frac{\cos\alpha}{m} = 0   (3.14)

H_\alpha^{af} = (T - D + L_\alpha)\Big( \frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha \Big) - (D_\alpha + L)\Big( \frac{\lambda_\gamma}{Vm}\sin\alpha + \frac{\lambda_V}{m}\cos\alpha \Big) = 0   (3.15)

\det H_{uu}^{af} = 0 \iff \lambda_\gamma\frac{\cos\alpha}{Vm} - \lambda_V\frac{\sin\alpha}{m} = 0.   (3.16)

Conditions (3.14)–(3.16) cannot be satisfied simultaneously, so we conclude that there are no singular arcs. However, jump discontinuities in the control T may appear if, at a time t, the switching function (3.11) changes sign.

The Hamiltonian is not an explicit function of time, so H^{af} is constant along the optimal trajectory.
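In a numerical implementation, the case analysis above reduces to evaluating the switching function (3.11) and selecting the thrust bound accordingly; a minimal sketch (Python, with placeholder thrust bounds that are not the book's missile data) is:

```python
import numpy as np

T_MIN, T_MAX = 1000.0, 6000.0    # placeholder bounds

def switching_function(lam_gamma, lam_V, alpha, V, m):
    # H_T^{af} of (3.11)
    return lam_gamma * np.sin(alpha) / (V * m) + lam_V * np.cos(alpha) / m

def thrust(lam_gamma, lam_V, alpha, V, m):
    H_T = switching_function(lam_gamma, lam_V, alpha, V, m)
    # Since there are no singular arcs, H_T = 0 only at isolated switching
    # times; any tie-breaking rule is acceptable there.
    return T_MAX if H_T < 0.0 else T_MIN
```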
3.3.2 Optimal Control with Path Constraints

In Section 3.3.1 we derived necessary conditions for optimality by considering only the boundary conditions and the thrust constraint. In this section the level of complexity is increased by considering some additional constraints as defined in Section 1.2. The first state path constraint (1.12) can be split as V_min − V ≤ 0 and V − V_max ≤ 0. Both of them are of order 1, because V̇ explicitly depends on the controls, see Bryson and Ho (1975, pp. 99–100). Since the speed constraint is not active during the manoeuvre in this case, it will not be taken into account in the Hamiltonian (see Figure 3.4). The second path constraint (1.13) is of order 2 and the mixed state–control constraint (1.15) is split as L_min − L/(mg) ≤ 0 and L/(mg) − L_max ≤ 0, where L depends on the control explicitly. The Hamiltonian can be defined as follows:

H^{ac} = H^{af} + \mu_1\Big( -\frac{L}{mg} + L_{min} \Big) + \mu_2\Big( \frac{L}{mg} - L_{max} \Big) + \mu_3\ddot{h}.

The differential equations for the co-state vector λ = (λ_γ, λ_V, λ_x, λ_h) can be written as

\dot{\lambda} = -\frac{\partial H^{ac}}{\partial x}.   (3.17)

Since these equations are rather lengthy, they are omitted here. For the Lagrange multipliers μ_i, i = 1, ..., 3, there must hold

\mu_i = 0 \text{ if the associated constraint is not active}; \quad \mu_i \geq 0 \text{ if the associated constraint is active.}

The necessary conditions are completed by deriving the junction conditions at the switching points t̄_i as follows (Bryson and Ho 1975, p. 101):

H^{ac}(\bar{t}_i^-) = H^{ac}(\bar{t}_i^+) - \vartheta^T \frac{\partial N}{\partial \bar{t}_i}   (3.18)

\lambda^T(\bar{t}_i^-) = \lambda^T(\bar{t}_i^+) + \vartheta^T \frac{\partial N}{\partial x}   (3.19)

which requires finding the additional multipliers ϑ.
3.3.3 First Arc: Minimum Altitude Flight

In this section we consider only the state path constraint h_min ≤ h and the thrust control constraint (T is at the maximum value). In this case we assume that the missile is launched at the initial altitude h = h_min. Therefore, the constraints are active immediately at the beginning of the manoeuvre. The constraint h_min ≤ h has no explicit dependence on the control variables, therefore we must take time derivatives of the constraint until explicit dependence on the control occurs. Consider the following equations:

h - h_{min} = 0   (3.20a)
\dot{h} = V\sin\gamma = 0 \Rightarrow \gamma(t) = 0, \text{ since } V \neq 0 \text{ for } t \in [t_0, t_1]   (3.20b)
\ddot{h} = \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma = 0 \Rightarrow \dot{\gamma}(t) = 0 \text{ for } t \in [t_0, t_1].   (3.20c)

The controls appear explicitly after differentiating the constraint h_min ≤ h twice, therefore the order of the constraint is 2. Substituting equation (3.20) into the equations of motion (1.5), we obtain the reduced state equations (3.3). The angle of attack α can be obtained numerically from equation (3.3a). Then, substituting α into equations (3.3b) and (3.3c), these equations can be solved as an initial value problem (IVP). Thus we can find the first arc easily, but we do not know how long it will last. For this purpose we should formulate the appropriate boundary value problem (BVP), which involves finding the co-state variables by defining the Hamiltonian as follows:

H^{aa} = H^{af} + \mu_3\big( \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma \big).   (3.21)
The components of the co-state vector λ satisfying the preceding equation are

\dot{\lambda}_\gamma = -\Big\{ \frac{\lambda_\gamma}{V} g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma + \mu_3\Big[ \frac{d\dot{V}}{d\gamma}\sin\gamma + \dot{V}\cos\gamma + V\frac{d\dot{\gamma}}{d\gamma}\cos\gamma - \dot{\gamma} V\sin\gamma \Big] \Big\}   (3.22)

\dot{\lambda}_V = -\Big\{ \lambda_\gamma\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} + \frac{C_l\rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2}\cos\gamma \Big] - \frac{\lambda_V\rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x\cos\gamma + \lambda_h\sin\gamma + \mu_3\Big[ \frac{d\dot{V}}{dV}\sin\gamma + V\frac{d\dot{\gamma}}{dV}\cos\gamma + \dot{\gamma}\cos\gamma \Big] \Big\}   (3.23)

\dot{\lambda}_x = 0   (3.24)

\dot{\lambda}_h = -\Big\{ 1 - \frac{\lambda_\gamma V S_{ref}\rho_h}{2m}\big( C_d\sin\alpha - C_l\cos\alpha \big) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \mu_3\Big[ \frac{d\dot{V}}{dh}\sin\gamma + V\frac{d\dot{\gamma}}{dh}\cos\gamma \Big] \Big\}.   (3.25)

Using Pontryagin's minimum principle for (3.21) yields

H_\alpha^{aa} = \frac{T - D + L_\alpha}{m}\Big( \frac{\lambda_\gamma}{V}\cos\alpha - \lambda_V\sin\alpha \Big) - \frac{D_\alpha + L}{m}\Big( \frac{\lambda_\gamma}{V}\sin\alpha + \lambda_V\cos\alpha \Big) + \mu_3\Big[ \frac{d\dot{V}}{d\alpha}\sin\gamma + V\frac{d\dot{\gamma}}{d\alpha}\cos\gamma \Big] = 0.

The thrust is at the maximum value. From (3.20b)–(3.20c) it follows that γ = 0 and γ̇ = 0, so we obtain the following reduced co-state equations:

\dot{\lambda}_\gamma = -\Big\{ -\lambda_V g + \lambda_h V + \mu_3\Big[ \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha \Big] \Big\}

\dot{\lambda}_V = -\Big\{ (\lambda_\gamma + \mu_3 V)\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} + \frac{C_l\rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2} \Big] - \frac{\lambda_V\rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x \Big\}

\dot{\lambda}_x = 0

\dot{\lambda}_h = -\Big\{ 1 + \frac{(\lambda_\gamma + \mu_3 V) V S_{ref}\rho_h}{2m}\big( -C_d\sin\alpha + C_l\cos\alpha \big) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\big( C_d\cos\alpha + C_l\sin\alpha \big) \Big\}
with the optimality condition

H_\alpha^{aa} = \frac{T - D + L_\alpha}{m}\Big[ \Big( \frac{\lambda_\gamma}{V} + \mu_3 V \Big)\cos\alpha - \lambda_V\sin\alpha \Big] - \frac{D_\alpha + L}{m}\Big[ \Big( \frac{\lambda_\gamma}{V} + \mu_3 V \Big)\sin\alpha + \lambda_V\cos\alpha \Big] = 0.   (3.26)
The angle of attack α can be obtained from equation (3.3a). Lagrange multiplier µ3 can be derived explicitly from equation (3.26) and substituted into the state and co-state equations. Since we know the flight path angle and the altitude during this manoeuvre, the number of differential equations reduces to six. The main difficulty in solving this problem is that we do not know an initial guess for the co-state variables.
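Putting the pieces together, the first arc can be propagated forward as the IVP described above: γ = 0 and h = h_min are frozen, α is extracted pointwise from (3.3a), and only (3.3b)–(3.3c) are integrated. A minimal sketch follows (Python with SciPy, using a bracketing root solve this time); it reuses the same placeholder aerodynamic model as the earlier sketch, and the arc length of 10 s and the maximum thrust are likewise assumptions, not the book's data.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

g, m, rho, S_ref = 9.81, 1000.0, 1.2, 0.5    # placeholder data
B1, B2 = 10.0, 0.1
A1, A2, A3 = 1.0, 0.1, 0.01
T_MAX = 6000.0

def lift_drag(alpha, V):
    q = 0.5 * rho * V**2 * S_ref
    return q * (B1 * alpha + B2), q * (A1 * alpha**2 + A2 * alpha + A3)

def alpha_from_3_3a(V):
    # root of (3.3a): (T - D)/m sin(a) + L/m cos(a) - g = 0, near a = 0
    def res(a):
        L, D = lift_drag(a, V)
        return (T_MAX - D) / m * np.sin(a) + L / m * np.cos(a) - g
    return brentq(res, -0.5, 0.5)

def reduced_rhs(t, y):
    V, x = y
    a = alpha_from_3_3a(V)
    L, D = lift_drag(a, V)
    # (3.3b) and (3.3c); h and gamma stay frozen on this arc
    return [(T_MAX - D) / m * np.cos(a) - L / m * np.sin(a), V]

sol = solve_ivp(reduced_rhs, (0.0, 10.0), [272.0, 0.0], rtol=1e-8)
```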
3.3.4 Second Arc: Climbing

In this analysis we focus on the case of final speed 250 m/s only and consider the thrust and normal acceleration constraints. From the qualitative analysis in Section 3.2, we know that the thrust control switches to the minimum value during climbing for final speed 250 m/s, therefore the switching function must change sign from negative to positive. Consider the mixed state–control inequality constraints mentioned in equation (1.15):

L_{min} \leq \frac{L}{mg} \leq L_{max}

where L explicitly depends on the control α. The inclusion of the mixed constraints above leads to the augmented Hamiltonian

H^{al} = H^{af} + \mu_1\Big( -\frac{L}{mg} + L_{min} \Big) + \mu_2\Big( \frac{L}{mg} - L_{max} \Big).   (3.27)

The right-hand sides of the differential equations for the co-state variables are to be modified along subarcs of this second arc. Additionally, we have necessary sign conditions for the Lagrange multipliers:

\mu_1 = \begin{cases} 0 & \text{on unconstrained subarcs} \\ \dfrac{H_\alpha^{af}\, mg}{L_\alpha} & \text{on constrained subarcs} \end{cases}

and

\mu_2 = \begin{cases} 0 & \text{on unconstrained subarcs} \\ -\dfrac{H_\alpha^{af}\, mg}{L_\alpha} & \text{on constrained subarcs.} \end{cases}

The angle of attack α can be determined as follows:

• Optimality condition when the normal acceleration is at the maximum value:

H_\alpha^{al} = \frac{T - D + L_\alpha}{m}\Big( \frac{\lambda_\gamma}{V}\cos\alpha - \lambda_V\sin\alpha \Big) - \frac{D_\alpha + L}{m}\Big( \frac{\lambda_\gamma}{V}\sin\alpha + \lambda_V\cos\alpha \Big) + \mu_2\frac{L_\alpha}{mg} = 0.   (3.28)
• Optimality condition when the normal acceleration is at the minimum value:

H_\alpha^{al} = \frac{T - D + L_\alpha}{m}\Big( \frac{\lambda_\gamma}{V}\cos\alpha - \lambda_V\sin\alpha \Big) - \frac{D_\alpha + L}{m}\Big( \frac{\lambda_\gamma}{V}\sin\alpha + \lambda_V\cos\alpha \Big) - \mu_1\frac{L_\alpha}{mg} = 0.   (3.29)

When the normal acceleration constraint is active (L_max), the angle of attack can be determined from (1.8) as

\alpha = \frac{2mgL_{max} - B_2\rho V^2 S_{ref}}{B_1\rho V^2 S_{ref}}.   (3.30)

Equation (3.30) is valid until the normal acceleration and the thrust switch to the minimum value (see Figures 3.2c–3.2d). Below we summarise the results for the case when the normal acceleration is saturated at the maximum value (L_max):

• State equations, as in equations (1.5).
• Co-state equations:

\dot{\lambda}_\gamma = -\Big[ \frac{\lambda_\gamma}{V} g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma \Big]

\dot{\lambda}_V = -\Big\{ \lambda_\gamma\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} + \frac{C_l\rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2}\cos\gamma \Big] - \frac{\lambda_V\rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x\cos\gamma + \lambda_h\sin\gamma + \mu_2\frac{\partial}{\partial V}\Big( \frac{L}{mg} \Big) \Big\}

\dot{\lambda}_x = 0

\dot{\lambda}_h = -\Big\{ 1 - \frac{\lambda_\gamma V S_{ref}\rho_h}{2m}\big( C_d\sin\alpha - C_l\cos\alpha \big) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \mu_2\frac{\partial}{\partial h}\Big( \frac{L}{mg} \Big) \Big\}.

• Optimality condition as in (3.28) and (3.30).

The thrust switches to the minimum value when H_T^{al} changes sign from negative to positive.
3.3.5 Third Arc: Diving

In this analysis we consider only the normal acceleration constraint. During this arc the thrust is at the maximum value and the normal acceleration is saturated at the minimum value. The angle of attack can be determined from (1.8) as follows:

\alpha = \frac{2mgL_{min} - B_2\rho V^2 S_{ref}}{B_1\rho V^2 S_{ref}}.   (3.31)
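Since (3.30) and (3.31) differ only in the saturation level, both can be evaluated by the same closed-form inversion of the lift model (1.8); a minimal sketch (Python, with the placeholder lift coefficients used in the earlier sketches) is:

```python
import numpy as np

g, m, rho, S_ref = 9.81, 1000.0, 1.2, 0.5   # placeholder data
B1, B2 = 10.0, 0.1

def alpha_saturated(V, L_bar):
    # (3.30)/(3.31): L_bar is L_max or L_min (in g) on the saturated arc
    q = rho * V**2 * S_ref
    return (2.0 * m * g * L_bar - B2 * q) / (B1 * q)
```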
The Hamiltonian and co-state equations are nearly the same as in the previous section, therefore the derivation is omitted here. The equations can be summarised as follows:
• State equations, as in equations (1.5).
• Co-state equations:

\dot{\lambda}_\gamma = -\Big[ \frac{\lambda_\gamma}{V} g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma \Big]

\dot{\lambda}_V = -\Big\{ \lambda_\gamma\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} + \frac{C_l\rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2}\cos\gamma \Big] - \frac{\lambda_V\rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x\cos\gamma + \lambda_h\sin\gamma - \mu_1\frac{\partial}{\partial V}\Big( \frac{L}{mg} \Big) \Big\}

\dot{\lambda}_x = 0

\dot{\lambda}_h = -\Big\{ 1 - \frac{\lambda_\gamma V S_{ref}\rho_h}{2m}\big( C_d\sin\alpha - C_l\cos\alpha \big) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\big( C_d\cos\alpha + C_l\sin\alpha \big) - \mu_1\frac{\partial}{\partial h}\Big( \frac{L}{mg} \Big) \Big\}.

• Optimality condition H_\alpha^{al} as in equations (3.29) and (3.31).

The schematic representation of the boundary value problem associated with the switching structure can be seen in Figures 3.26–3.30 below.
3.4 Indirect Method Solution

BNDSCO is a software package developed by Oberle, see Oberle and Grimm (1989), which implements a multiple shooting algorithm, see Stoer and Bulirsch (2002), Keller (1992) and Osborne (1969), and is a reliable solver of multi-point boundary value problems (MPBVPs) with discontinuities, specially written for solving optimal control problems. However, it has the weakness of all shooting methods in that it has a narrow domain of convergence. Therefore initial guesses for the state and co-state variables are crucial for successful computation, especially for the co-state variables, which have no physical interpretation. Moreover, the task becomes more difficult when the problem has pure state constraints.
3.4.1 Co-state Approximation

von Stryk (1993) shows that the co-state variables can be estimated from the necessary conditions of the discretised optimal control problem. He developed the DIRCOL package, von Stryk (1999), based on a direct collocation method, and it has been used for solving several real-life problems, see von Stryk and Bulirsch (1992) and von Stryk and Schlemmer (1994). Grimm and Markl (1997) estimated the co-state variables using the direct multiple shooting method. Their co-state approximation is accurate for the unconstrained problem but does not work well for the constrained problem. Fahroo and Ross (2001) proposed a Legendre pseudospectral method to estimate the co-state variables and presented an accurate estimator for the unconstrained problem. Benson (2005) proposed a Gauss pseudospectral transcription to solve the optimal control problem and used it to approximate the co-state variables. Again, the co-state approximation does not give a good initial guess for problems with pure state constraints. This section presents an example of a problem with pure state constraints, namely the first arc of the terminal bunt manoeuvre. In this example, Bryson's and Jacobson's formulations are compared and the DIRCOL package is used to approximate the co-state variables. Jacobson et al. (1971) presented a direct adjoining of the pure state constraint to the Hamiltonian, while Bryson et al. (1963) proposed an indirect adjoining of pure state constraints to the Hamiltonian. In Bryson's approach the pure state constraint is differentiated until the control u appears explicitly and then the resulting equation is adjoined to the Hamiltonian (see Section 3.3.3). Consider now Bryson's formulation for the first arc (flying at the minimum altitude):
S = h - h_{min} = 0   (3.32a)
S^{(1)} = \dot{h} = V\sin\gamma = 0 \Rightarrow \gamma(t) = 0, \text{ since } V \neq 0 \text{ for } t \in [t_0, t_1]   (3.32b)
S^{(2)} = \ddot{h} = \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma = 0 \Rightarrow \dot{\gamma}(t) = 0 \text{ for } t \in [t_0, t_1].   (3.32c)

Thus the constraint is of second order, see also Section 3.3.3, as the controls appear in γ̇ and V̇, see equations (1.5a) and (1.5b). The Hamiltonian for Bryson's formulation can be defined as

H^B = \lambda^B f + \mu S^{(2)}.   (3.33)

In contrast to Bryson's formulation, the Hamiltonian for Jacobson's formulation is given by

H^J = \lambda^J f + \nu S.   (3.34)

Note that λ^B ≠ λ^J, in general, because of the different definitions of the Hamiltonian. The direct method approach for optimal control mainly uses Jacobson's formulation in the derivation of the Karush–Kuhn–Tucker (KKT) necessary conditions. Therefore the co-state estimation from the direct method is accurate for a problem having a mixed constraint, but it may not work well for a problem having a pure state constraint. Thus, in general, for a pure state constraint situation DIRCOL will compute λ^J, while BNDSCO will need λ^B. The following example gives some insight into the different co-state estimates for Bryson's and Jacobson's formulations using DIRCOL. In this example DIRCOL is applied to the first arc only, because the minimum altitude constraint is active on this arc. Both co-state estimates are then used as initial guesses for BNDSCO, but this does not work well. Figures 3.18 and 3.19 show DIRCOL solutions obtained
using the following data:

\gamma_0 = 0°, \quad \gamma_{t_f} = 0°
V_0 = 272 m/s, \quad V_{t_f} = 306.32400 m/s
x_0 = 0 m, \quad x_{t_f} = 5813.44774 m
h_0 = 30 m, \quad h_{t_f} = 30 m.
The following equations show the differences in the co-state equations for Bryson's and Jacobson's formulations:

• State and co-state equations for Bryson's formulation

\dot{\gamma} = \frac{1}{V}\Big[ \frac{T - D}{m}\sin\alpha + \frac{L}{m}\cos\alpha - g \Big] = 0   (3.35)
\dot{V} = \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha   (3.36)
\dot{x} = V   (3.37)
\dot{h} = 0   (3.38)

\dot{\lambda}_\gamma^B = -\Big\{ -\lambda_V^B g + \lambda_h^B V + \mu\Big[ \frac{T - D}{m}\cos\alpha - \frac{L}{m}\sin\alpha \Big] \Big\}   (3.39)

\dot{\lambda}_V^B = -\Big\{ (\lambda_\gamma^B + \mu V)\Big[ -\frac{T\sin\alpha}{mV^2} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} + \frac{C_l\rho S_{ref}\cos\alpha}{2m} + \frac{g}{V^2} \Big] - \frac{\lambda_V^B\rho V S_{ref}}{m}\big( C_d\cos\alpha + C_l\sin\alpha \big) + \lambda_x^B \Big\}   (3.40)

\dot{\lambda}_x^B = 0   (3.41)

\dot{\lambda}_h^B = -\Big\{ 1 + \frac{(\lambda_\gamma^B + \mu V) V S_{ref}\rho_h}{2m}\big( -C_d\sin\alpha + C_l\cos\alpha \big) - \frac{\lambda_V^B V^2 S_{ref}\rho_h}{2m}\big( C_d\cos\alpha + C_l\sin\alpha \big) \Big\}.   (3.42)
74
• State and co-state equations for Jacobson’s formulation L g cos γ T −D sin α + cos α − mV mV V L T −D cos α − sin α − g sin γ V˙ = m m x˙ = V cos γ γ˙ =
(3.43) (3.44) (3.45)
h˙ = V sin γ (3.46) ! J " λγ g sin γ − λJV g cos γ − λJx V sin γ + λJh V cos γ (3.47) λ˙ Jγ = − V ! Cl ρSref cos α g Cd ρSref sin α T sin α + + 2 cos γ − λ˙ JV = − λJγ − 2 2m 2m V m V λJ ρV Sref Cd cos α + Cl sin α + λJx cos γ + λJh sin γ (3.48) − V m λ˙ Jx = 0
(3.49)
! $ λJ V S ρ # ˙λJh = − 1 − γ ref h Cd sin α − Cl cos α 2m −
" $ λJV V 2 Sref ρh # Cd cos α + Cl sin α + ν . 2m
(3.50)
Figures 3.18 and 3.19 show that the computational results for the co-state variables are very different. It is obvious that they must be different in general, because the constraint which is adjoined to the Hamiltonian differs for both cases. However, the state variables give approximately the same solutions. Note that, when solving the TPBVP corresponding to (3.43)–(3.50), finding ν in (3.50) does not lend itself to a systematic iterative procedure, as pointed out by Maurer and Gillessen (1975, p. 111). On the other hand, µ in (3.35)–(3.42) can be found readily using the conditions γ˙ = 0 and HαB = 0.
3.4.2 Switching and Jump Conditions In the previous section we showed that Bryson’s and Jacobson’s formulations produce very different λ, see Figures 3.18 and 3.19. In this section we compare the differences in switching and jump conditions for Bryson’s and Jacobson’s formulations, further to emphasise the consequences of different definitions of co-state variables λ. Bryson’s Formulation The switching and jump conditions at entry ten and exit tex will be considered. The jump conditions at the entry point can be derived from tangency constraint P (x) = 0 as follows.
MINIMUM ALTITUDE FORMULATION
75
Ŧ4000
1000
Ŧ4500
0
Ŧ5000
Ŧ1000
Ŧ5500
Ov
OJ
2000
Ŧ2000
Ŧ6000
Ŧ3000
Ŧ6500
Ŧ4000
Ŧ7000
Ŧ5000
Ŧ7500
Ŧ6000 0
5
10
15
20
25
Ŧ8000 0
5
time (s)
(a) λγ versus time
10
time (s)
15
20
15
20
(b) λV versus time
10
33.5
5
33
0
Oh
Ox
34
32.5
Ŧ5
32
Ŧ10
31.5 0
5
10
time (s)
(c) λx versus time
15
20
Ŧ15 0
5
10
time (s)
(d) λh versus time
Figure 3.18 DIRCOL computational results for the co-state variables of Bryson’s formulation.
Figure 3.19 DIRCOL computational results for the co-state variables of Jacobson's formulation: (a) λ_γ versus time; (b) λ_V versus time; (c) λ_x versus time; (d) λ_h versus time. Note the different magnitudes compared with Figure 3.18.
Consider the following equations of Bryson's formulation:

P(x) = \begin{pmatrix} P_0(x) \\ P_1(x) \end{pmatrix} = \begin{pmatrix} S(x) \\ \dot{S}(x) \end{pmatrix} = \begin{pmatrix} h - h_{min} \\ V\sin\gamma \end{pmatrix} = 0.   (3.51)

If we assume the jump occurs at the entry point, then the jump conditions are given by

\lambda^+ = \lambda^- - N_x^T\sigma   (3.52)

with

N_x = \begin{pmatrix} (\nabla_x P_0)^T \\ (\nabla_x P_1)^T \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ V\cos\gamma & \sin\gamma & 0 & 0 \end{pmatrix}.   (3.53)

Thus equation (3.52) can be rewritten as

\begin{pmatrix} \lambda_\gamma(t_{en}^+) \\ \lambda_V(t_{en}^+) \\ \lambda_x(t_{en}^+) \\ \lambda_h(t_{en}^+) \end{pmatrix} = \begin{pmatrix} \lambda_\gamma(t_{en}^-) \\ \lambda_V(t_{en}^-) \\ \lambda_x(t_{en}^-) \\ \lambda_h(t_{en}^-) \end{pmatrix} - \sigma_1\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} - \sigma_2\begin{pmatrix} V\cos\gamma \\ \sin\gamma \\ 0 \\ 0 \end{pmatrix}   (3.54)

or

\lambda_\gamma(t_{en}^+) = \lambda_\gamma(t_{en}^-) - \sigma_2 V\cos\gamma   (3.55a)
\lambda_V(t_{en}^+) = \lambda_V(t_{en}^-) - \sigma_2\sin\gamma   (3.55b)
\lambda_x(t_{en}^+) = \lambda_x(t_{en}^-)   (3.55c)
\lambda_h(t_{en}^+) = \lambda_h(t_{en}^-) - \sigma_1.   (3.55d)
By substituting γ = 0 into equation (3.55), the above equations reduce to

\lambda_\gamma(t_{en}^+) = \lambda_\gamma(t_{en}^-) - \sigma_2 V   (3.56a)
\lambda_V(t_{en}^+) = \lambda_V(t_{en}^-)   (3.56b)
\lambda_x(t_{en}^+) = \lambda_x(t_{en}^-)   (3.56c)
\lambda_h(t_{en}^+) = \lambda_h(t_{en}^-) - \sigma_1.   (3.56d)
The switching conditions at the entry point are given by

P(x) = \begin{pmatrix} h - h_{min} \\ V\sin\gamma \end{pmatrix} = 0   (3.57)

and

H(t_{en}^+) - H(t_{en}^-) = 0 \Rightarrow -\sigma_2 V\dot{\gamma} - \sigma_1\dot{h} + \mu_3\big( \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma \big) = 0   (3.58)

with

H(t_{en}^-) = \lambda_\gamma^-\dot{\gamma} + \lambda_V^-\dot{V} + \lambda_x^-\dot{x} + \lambda_h^-\dot{h}   (3.59)
H(t_{en}^+) = \lambda_\gamma^+\dot{\gamma} + \lambda_V^+\dot{V} + \lambda_x^+\dot{x} + \lambda_h^+\dot{h} + \mu_3\big( \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma \big).   (3.60)

The switching conditions at the exit point t_ex are

H(t_{ex}^+) - H(t_{ex}^-) = 0 \Rightarrow \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma = 0.   (3.61)
Kreindler's Remarks

Equation (3.20) shows that the altitude constraint is of order 2 (q = 2). Based on Kreindler's remarks (Kreindler 1982, p. 244), we note that the constant multipliers σ_i are unique except possibly σ_{q−1}. If an arbitrary constant ξ is added to σ_{q−1}, then a discontinuity of −ξ ∇_x P_1 occurs at the exit point. Consider now the following equations:

P_1 = V\sin\gamma   (3.62)

\nabla_x P_1 = \begin{pmatrix} V\cos\gamma \\ \sin\gamma \\ 0 \\ 0 \end{pmatrix}.   (3.63)
Thus the jump conditions at the entry and exit points can be derived as follows:

• Jump conditions at the entry point:

\lambda_\gamma(t_{en}^+) = \lambda_\gamma(t_{en}^-) - (\sigma_2 + \xi) V\cos\gamma   (3.64a)
\lambda_V(t_{en}^+) = \lambda_V(t_{en}^-) - (\sigma_2 + \xi)\sin\gamma   (3.64b)
\lambda_x(t_{en}^+) = \lambda_x(t_{en}^-)   (3.64c)
\lambda_h(t_{en}^+) = \lambda_h(t_{en}^-) - \sigma_1.   (3.64d)

• Jump conditions at the exit point:

\lambda_\gamma(t_{ex}^+) = \lambda_\gamma(t_{ex}^-) - \xi V\cos\gamma   (3.65a)
\lambda_V(t_{ex}^+) = \lambda_V(t_{ex}^-) - \xi\sin\gamma   (3.65b)
\lambda_x(t_{ex}^+) = \lambda_x(t_{ex}^-)   (3.65c)
\lambda_h(t_{ex}^+) = \lambda_h(t_{ex}^-).   (3.65d)
Necessary Conditions of Jacobson et al.

In this case the state constraint (3.20a) is adjoined to the Hamiltonian directly with a multiplier function ν, ν ≥ 0, see Jacobson et al. (1971) and Kreindler (1982). The necessary conditions can be derived by defining the Hamiltonian as follows:

H^{mas}(x, u, \lambda, \nu) = \lambda^T f(x, u) + \nu S(x).   (3.66)

The jump conditions at the entry and exit points can be derived as follows:

\lambda^+(t_i) = \lambda^-(t_i) - \nu_i S_x   (3.67)

where the scalar ν_i is non-negative,

\nu_i \geq 0.   (3.68)

The jump conditions at the entry point are

\begin{pmatrix} \lambda_\gamma(t_{en}^+) \\ \lambda_V(t_{en}^+) \\ \lambda_x(t_{en}^+) \\ \lambda_h(t_{en}^+) \end{pmatrix} = \begin{pmatrix} \lambda_\gamma(t_{en}^-) \\ \lambda_V(t_{en}^-) \\ \lambda_x(t_{en}^-) \\ \lambda_h(t_{en}^-) \end{pmatrix} - \nu_1\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}   (3.69)
or

\lambda_\gamma(t_{en}^+) = \lambda_\gamma(t_{en}^-)   (3.70a)
\lambda_V(t_{en}^+) = \lambda_V(t_{en}^-)   (3.70b)
\lambda_x(t_{en}^+) = \lambda_x(t_{en}^-)   (3.70c)
\lambda_h(t_{en}^+) = \lambda_h(t_{en}^-) - \nu_1.   (3.70d)

The jump conditions at the exit point are

\begin{pmatrix} \lambda_\gamma(t_{ex}^+) \\ \lambda_V(t_{ex}^+) \\ \lambda_x(t_{ex}^+) \\ \lambda_h(t_{ex}^+) \end{pmatrix} = \begin{pmatrix} \lambda_\gamma(t_{ex}^-) \\ \lambda_V(t_{ex}^-) \\ \lambda_x(t_{ex}^-) \\ \lambda_h(t_{ex}^-) \end{pmatrix} - \nu_2\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}   (3.71)

or

\lambda_\gamma(t_{ex}^+) = \lambda_\gamma(t_{ex}^-)   (3.72a)
\lambda_V(t_{ex}^+) = \lambda_V(t_{ex}^-)   (3.72b)
\lambda_x(t_{ex}^+) = \lambda_x(t_{ex}^-)   (3.72c)
\lambda_h(t_{ex}^+) = \lambda_h(t_{ex}^-) - \nu_2.   (3.72d)

The jump conditions given by Jacobson et al. are consistent with the DIRCOL results (see Figure 3.12), which is not surprising, because the constraints are adjoined directly in the KKT necessary conditions (2.71)–(2.74).
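The practical difference between the two sets of jump conditions is easy to see in code; a minimal sketch follows (Python/NumPy, with the co-state vector ordered as (λ_γ, λ_V, λ_x, λ_h) and γ = 0 on the constrained arc, as in (3.56) and (3.70); the multiplier values are inputs, not computed here).

```python
import numpy as np

def bryson_entry_jump(lam, sigma1, sigma2, V):
    lam_plus = np.array(lam, dtype=float)
    lam_plus[0] -= sigma2 * V     # lambda_gamma jumps, (3.56a)
    lam_plus[3] -= sigma1         # lambda_h jumps,     (3.56d)
    return lam_plus

def jacobson_entry_jump(lam, nu1):
    lam_plus = np.array(lam, dtype=float)
    lam_plus[3] -= nu1            # only lambda_h jumps, (3.70d)
    return lam_plus
```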
3.4.3 Numerical Solution

The computational results for the terminal bunt problem were computed with the multiple shooting package BNDSCO, see Oberle and Grimm (1989). The problem is split into two phases: the first phase is the level flight and the rest of the manoeuvre is the second phase. The time t_i at which the transition from phase 1 to phase 2 occurs must be determined as part of the BVP. In the presented solution, t_i was estimated from the direct method approximation using DIRCOL. The DIRCOL approximation gives a sub-optimal solution obtained using the following data:

\gamma_0 = 0°, \quad \gamma_{t_f} = -90°
V_0 = 272 m/s, \quad V_{t_f} = 310 m/s
x_0 = 0 m, \quad x_{t_f} = 10000 m
h_0 = 30 m, \quad h_{t_f} = 0 m
where the intermediate time is t_i = 10.45 s. The minimum performance index is 40445.48347 m·s and the final time is 41.4789 s. In the first phase the minimum altitude constraint h_min is active. The missile starts to climb and then dives to reach the target by a bunt manoeuvre, which is considered in the second phase. The minimum normal acceleration constraint is active during diving. Figures 3.20–3.24 show that the DIRCOL solutions for the state variables are close enough to the BNDSCO solutions. However, the co-state approximation does not work well in the first phase.
Figure 3.20 Altitude versus time histories using BNDSCO. Note the good agreement between the DIRCOL and BNDSCO solution due to absence of points of non-smoothness.
3.5 Summary and Discussion

The study of computational results for the minimum altitude version of the terminal bunt manoeuvre with varying final speed is important from the operational viewpoint. Since the mission is to strike a fixed target while minimising the missile's exposure to anti-air defences, one should consider both the type of target and the exposure of the missile during the manoeuvre. If the mission is to strike a bunker, it is important to hit the target with the maximum capability of the missile. If the target's prosecution may lead to collateral damage, then a more measured impact is advisable, so that the final speed should be lower. It is always important to avoid anti-air defences during the manoeuvre, so optimal exposure must be taken into account. Based on the computational results using DIRCOL, the exposure for a final speed of 310 m/s is much greater than for a final speed of 250 m/s, while the manoeuvre time is not much different (see Table 3.1). Hence, if the mission has a risk of collateral damage, a final speed of 250 m/s or 270 m/s is a better choice than 310 m/s, also because the anti-air defences have the least time to intercept the missile (see Figure 3.3). While the final speed 310 m/s trajectory has higher exposure, it has a comparable flight time, but much higher terminal kinetic energy. Even though direct collocation does not give a highly accurate solution, the numerical solutions give a starting point for analysing the performance of the missile during the bunt manoeuvre. The optimal trajectory of the manoeuvre can be split into three main arcs. The first arc is level flight at the minimum altitude. The thrust is at the maximum value and the pure state constraint (the minimum altitude constraint) is active. The flight time is longer for the smaller final speed, which means that the missile tries to climb as late as possible to gain enough power to perform the bunt manoeuvre and satisfy the final speed.
Figure 3.21 Speed versus time histories using BNDSCO. Note two points of non-smoothness (at t = 10.4500 s and t = 32.0407 s) in the BNDSCO solution, absent in the DIRCOL approximation.
Figure 3.22 Flight path angle versus time histories using BNDSCO. Note that the points of non-smoothness in the BNDSCO solution (at t = 10.4500 s and t = 32.0407 s) are quite closely matched by the DIRCOL approximation.
Figure 3.23 Angle of attack versus time histories using BNDSCO. Note two points of non-smoothness (at t = 10.4500 s and t = 32.0407 s) in the BNDSCO solution, absent in the DIRCOL approximation.
Figure 3.24 Normal acceleration versus time histories using BNDSCO. Note two points of non-smoothness (at t = 10.4500 s and t = 32.0407 s) in the BNDSCO solution, absent in the DIRCOL approximation.
This arc is the most difficult one to compute because the pure state constraint is active. The DIRCOL package can solve this arc and gives good insight into the problem.

In the second arc the missile must climb in order to achieve the final condition. The thrust remains at the maximum value for some cases, while for the case of 250 m/s it switches to the minimum value. The details of the switching structure of the equations and the constraints can be seen in Figures 3.26–3.30 below. For the case of 250 m/s the normal acceleration constraint is active immediately: at the beginning of the climb the normal acceleration is at the maximum value, and it switches to the minimum value following the switching of the thrust to the minimum value. It is worth noticing that for the case of 270 m/s the DIRCOL solutions produce a free arc (no constraint active except the thrust saturated at the maximum value) at the beginning of the climb, after which the normal acceleration saturates at the maximum value until the missile approaches its dive; just before the missile turns over to dive, another short free arc appears, and then the normal acceleration constraint becomes active again, saturated at the minimum value. For the case of 310 m/s the thrust is the only active constraint in this arc.

The third arc is the diving phase. Since the speed at the start of the dive is lower than the required final speed, the missile must gain energy to reach the target, and the thrust is therefore at the maximum value; it can be seen in Figure 3.26 below that for the case of 250 m/s the thrust switches back to the maximum value. The normal acceleration is saturated at the minimum value for the cases of 250 m/s and 270 m/s, while for 310 m/s the minimum normal acceleration constraint is active only for a few seconds after the missile starts to dive.
3.5.1 Comments on Switching Structure

An intriguing feature of the DIRCOL solutions in Figures 3.3–3.12 is the discontinuous jumps in the angle of attack α, particularly for the final speeds 250 m/s and 270 m/s, see Figure 3.6. Both the thrust T and the angle of attack α are controls, but T enters the Hamiltonian linearly (thus allowing jumps via bang–bang control, see equation (3.11)), while α enters nonlinearly. Thus, on free arcs (no constraints active), the optimal value of α must be computed from the condition Hα = 0, see equation (3.12). For constrained arcs, however, one should not use Hα = 0, but an appropriate equality corresponding to the active constraint. For example, if the normal acceleration is saturated at L = Lmax, then optimal α is computed from equation (3.30); similarly, if L = Lmin is active, optimal α is obtained from equation (3.31).

Consider now the jumps in the angle of attack α in Figure 3.6, see also Figure 3.2. These jumps might, in principle, be caused by the multiplicity of solutions for α in the Hamiltonian, see Figure 3.25. While there are three possible solutions of Hα = 0, defining optimal α for a free arc, only the solution closest to zero is physically meaningful, as the other two are approximately −114.65° and 131.85°, clearly infeasible values. Moreover, if the equation Hα = 0 is solved at each time step using the Newton–Raphson method with an initial guess of α close to 0, the very steep slope will prevent the method from finding the outlying solutions, see Figure 3.25. Thus, we may conclude that the jumps in the value of α, observed in Figure 3.6 for Vtf = 250 m/s or Vtf = 270 m/s, are not due to the multiplicity of solutions of Hα = 0. In other words, the fact that the Hamiltonian is not regular does not affect the numerical solution for free arcs.

In order to understand what does cause the discontinuities in α in Figure 3.6, let us first investigate the case of final speed 310 m/s.
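To make the root-selection argument concrete, here is a minimal sketch (ours, not the authors' code) of such a Newton–Raphson iteration; HA and DHA are hypothetical procedures standing in for Hα and its derivative ∂Hα/∂α, evaluated at the current states and co-states:

      ! Minimal sketch of Newton-Raphson for H_alpha(alpha) = 0, started
      ! near alpha = 0. HA and DHA are hypothetical procedures standing in
      ! for H_alpha and its alpha-derivative at the current states and
      ! co-states.
      subroutine newton_alpha(ha, dha, alpha, tol, maxit, ok)
        implicit none
        double precision, external     :: ha, dha
        double precision, intent(inout) :: alpha   ! in: guess, out: root
        double precision, intent(in)    :: tol
        integer, intent(in)             :: maxit
        logical, intent(out)            :: ok
        double precision :: step
        integer :: k
        ok = .false.
        do k = 1, maxit
          step = ha(alpha) / dha(alpha)            ! Newton correction
          alpha = alpha - step
          if (abs(step) < tol) then
            ok = .true.
            return
          end if
        end do
      end subroutine newton_alpha

Because the slope of Hα near the middle root is of the order of 5 × 10⁵ (see Figure 3.25), the first correction step from α ≈ 0 is minute, so the iteration converges to the middle root and cannot stray towards the two outlying, infeasible solutions.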
Figure 3.25 Hamiltonian (3.4) is not regular, as the optimality condition Hα = 0, equation (3.12), has multiple solutions. The above plot of Hα versus α, revealing three possible values of optimal α, was obtained for t = 10.4500 s; similar curves were obtained for other times. Note the very steep slope in the vicinity of the middle solution (the one closest to zero): the slope is almost vertical, as the tangent is approximately 500000.
Figure 3.26 Switching structure of the minimum altitude formulation for the terminal bunt manoeuvre.
Since the missile is launched at h = hmin, the first arc (level flight) is a constrained arc and therefore optimal α is computed not from Hα = 0, but from equation (3.3a). After 8.0013 s the first arc ends, as the missile starts climbing, thus beginning the second arc. Although the altitude increases, see Figure 3.3, the normal acceleration is not saturated during climbing, because the speed is not high enough, see Figure 3.4, to cause saturation, see Figure 3.8 and equations (1.8)–(1.10). Thus, while the thrust is saturated at the maximum value, T = Tmax, no other constraints are active, so that optimal α is computed from Hα = 0, see equation (3.12), where Tmax should be substituted for T. From the point of view of calculating optimal α, this fragment of the trajectory is a free arc (note that T = Tmax throughout).

During climbing, the speed V decreases while the angle of attack α increases to facilitate climbing. While rapid climbing is necessary, the missile should also turn over to begin its dive as soon as possible, so that the excess of altitude is minimised. Therefore the angle of attack α starts decreasing, and reaches a negative value at time 25.2162 s. During diving the speed V increases to satisfy the terminal speed condition, which, in turn, causes the activation of the minimum normal acceleration constraint Lmin at time 34.9148 s, marking the end of the free arc which began at 8.0013 s when the constrained first arc (level flight) ended, see Figure 3.3. The remainder of the trajectory is another constrained arc, with L = Lmin, so that α is computed from equation (3.31), see also Figure 3.26.

In summary, for the case Vtf = 310 m/s, the thrust T is saturated at Tmax throughout the whole trajectory, and the trajectory starts with (1) a constrained arc (h = hmin), lasting from 0 to 8.0013 s, followed by (2) a free arc between 8.0013 and 34.9148 s, and finishes with (3) another constrained arc (L = Lmin). Optimal α is computed from equation (3.3a) for (1), then from equation (3.12) for (2), and from equation (3.31) for (3). This computation results in α being a continuous function of time t, but with two points of non-smoothness, coinciding with the (1)→(2) and (2)→(3) transitions, see Figure 3.6.

For the case of final speed 270 m/s, the time of the level flight is longer than for the case of final speed 310 m/s. This means that the missile has a higher speed at the end of the first arc, and hence, when it starts climbing with rapidly increasing altitude, the maximum normal acceleration constraint Lmax becomes active after a short free arc on [t1−, t1+] (see Figure 3.8). At the start of climbing at t1− = 19.2637 s, optimal α is computed from equation (3.12) until it hits the maximum normal acceleration at t1+ = 19.5821 s. Then optimal α is computed from equation (3.30). Note that the activation of the maximum normal acceleration constraint is caused by the high speed of the missile, see equations (1.8)–(1.10), while the thrust is T = Tmax throughout the whole trajectory. The angle of attack must then decrease to facilitate the missile's turnover at t2− = 26.2686 s. While a rapid decrease in α is needed, it soon causes the activation of the minimum normal acceleration constraint Lmin at t2+ = 28.4975 s; the thrust is still at the maximum value. The angle of attack α is then computed from equation (3.31), see also Figure 3.26. In summary, for the case Vtf = 270 m/s, the thrust T is saturated at Tmax throughout the whole trajectory.
The trajectory starts with (1) a constrained arc (h = hmin), lasting from 0 to 19.2637 s, followed by (2) a short free arc between t1− and t1+, then by (3) another constrained arc (L = Lmax) from t1+ to t2−, then by (4) another short free arc on [t2−, t2+], and, finally, by (5) the last constrained arc (L = Lmin). This computation of α results in α still being a continuous function of time t, but with steep slopes on the short intervals [t1−, t1+] and [t2−, t2+], and non-smoothness points at t1−, t1+, t2− and t2+. As in the case Vtf = 310 m/s, the non-smoothness points are due to the joining of constrained and free arcs: (1)→(2) at t1−, (2)→(3) at t1+, (3)→(4) at t2− and (4)→(5) at t2+.
For the case of final speed 250 m/s, the time of level flight is longer than for the case of final speed 270 m/s. Now t1− and t1+ merge into one point t1, because the missile has a very high speed at the end of level flight and the altitude increases rapidly, so that the constraint L = Lmax is activated immediately after the end of level flight, i.e. at t1 = 20.6861 s. Thus, optimal α is computed from equation (3.3a) to the left of t1, and from equation (3.30) to the right, causing a jump in α; note that T = Tmax on both sides of t1. The constrained arc (L = Lmax) to the right of t1 continues until the speed V decreases enough for L < Lmax to become true, see Figure 3.4 and equations (1.8)–(1.10); note that V dominates in equation (1.8).

When this happens, two factors contribute to the computation of the optimal solution at that point: (1) the need to decrease the speed V further in order to meet the terminal condition Vtf = 250 m/s; (2) the occurrence of a free arc, as on [t2−, t2+] for Vtf = 270 m/s. To achieve (1), optimal α should decrease rapidly towards negative values (to facilitate turnover), while satisfying Hα = 0 according to (2). However, a rapid decrease in α activates the L = Lmin constraint, and the decrease is limited via equation (3.31). In view of the arrested decrease in α, the only other way of facilitating the required turnover is a more rapid decrease in the speed V. This decrease in V is indeed achieved by switching the thrust from Tmax to Tmin and holding it at Tmin for a short period of time, see Figure 3.7. Hence, the short free arc between t2− and t2+, seen for Vtf = 270 m/s, collapses now to a point t2− = t2+ = t2 = 28.0369 s, at which optimal α is computed from Hα = 0, or equation (3.12). However, in that equation T changes from T = Tmax to the left of t2 into T = Tmin to the right of t2, thus effecting a jump in α at t2. This discontinuity in α at t2 immediately activates the L = Lmin constraint, which remains active till tf. Still before tf, the thrust switches back to Tmax, once its short-lasting lowering to Tmin has accomplished the necessary facilitation of the missile's turnover, see also the switching function in Figure 3.27.

In summary, for the case Vtf = 250 m/s, the switching structure of the case Vtf = 270 m/s occurs in a limiting form: the free arcs [t1−, t1+] and [t2−, t2+] each collapse to a point, t1− = t1+ = t1 and t2− = t2+ = t2. In the latter case, a switch from Tmax to Tmin also happens to facilitate the missile's turnover (the thrust was at Tmax all the time for Vtf = 270 m/s). This computation results in α no longer being a continuous function of time t, but the jumps at t1 and t2 are of different origin. In the case of t1, the collapsed free arc does not show itself in the optimal solution: optimal α is computed from equation (3.3a) to the left of t1 and from equation (3.30) to the right of t1, while T = Tmax on both sides of t1 (and at t1). On the other hand, optimal α at t2 is computed in a more subtle way: to the left of t2 it is obtained from equation (3.30) and to the right of t2 from equation (3.31), but its transition between these two values takes it through equation (3.12), where T jumps from Tmax to Tmin. Thus, the jump in α at t2 is caused by the jump in T, affecting it through the optimality equation for a (collapsed) free arc, Hα = 0.
Figure 3.27 Switching function HT versus time, see equation (3.11).
Figure 3.28 Schematic representation of the boundary value problem associated with the switching structure for the minimum altitude problem, case 250 m/s.
Figure 3.29 Schematic representation of the boundary value problem associated with the switching structure for the minimum altitude problem, case 270 m/s.
Figure 3.30 Schematic representation of the boundary value problem associated with the switching structure for the minimum altitude problem, case 310 m/s.
4  Minimum Time Formulation
This chapter focuses on the optimal trajectories of a generic cruise missile attacking a fixed target in minimum time. The target must be struck from above, subject to the missile dynamics and path constraints. The generic shape of the optimal trajectory is: level flight, climbing, dive; this combination of the three flight phases is called the bunt manoeuvre. In Chapter 3 we analysed and solved the terminal bunt manoeuvre of a generic cruise missile for which the exposure to anti-air defences was minimised; this resulted in a nonlinear optimal control problem for which the time-integrated flight altitude was minimised. In this chapter we consider the same missile model, but we analyse and solve the terminal bunt manoeuvre for the fastest attack. This leads to a minimum time optimal control problem, which is solved in two complementary ways. A direct approach based on a collocation method is used to reveal the structure of the optimal solution, which is composed of several arcs, each identified by the corresponding manoeuvre executed and the active constraints. The DIRCOL and NUDOCCCS packages used in the direct approach produce approximate solutions for both states and co-states. The indirect approach is employed to derive optimality conditions based on Pontryagin's minimum principle. The resulting multi-point boundary value problem is then solved via multiple shooting with the BNDSCO package, with the DIRCOL and NUDOCCCS results providing an initial guess.

This chapter is organised as follows. In Section 4.1 the problem formulation is defined. Section 4.2 presents computational results for the minimum time problem using the direct collocation package DIRCOL, followed by a qualitative analysis of the resulting optimal trajectory. Section 4.3 focuses on the mathematical analysis of the problem, based on the qualitative analysis. Numerical results using the BNDSCO package are presented in Section 4.4, where the BNDSCO, DIRCOL, PROMIS, SOCS and NUDOCCCS results are compared. Finally, a summary and discussion are presented in Section 4.5.
4.1 Minimum Time Problem

In this section we consider the same missile model as defined in Section 1.2; the only difference from Chapter 3 is the objective function. The problem is to find the optimal trajectory of a generic cruise missile from the assigned initial state to a final state with the minimum time along the trajectory. This problem can be formulated by introducing the performance criterion

J = \int_{t_0}^{t_f} dt.    (4.1)
4.2 Qualitative Analysis

This section gives a qualitative discussion of the optimal trajectory of a cruise missile performing a bunt manoeuvre. The computational results are obtained using the direct collocation method DIRCOL, based on von Stryk (1999), with the resulting nonlinear programming problem solved using the SNOPT solver, based on sequential quadratic programming, due to Gill et al. (1993). Figures 4.1–4.6 show the computational results for the following boundary conditions:

γ0 = 0°,          γtf = −90°
V0 = 272 m/s,     Vtf = 250, 270, 310 m/s
x0 = 0 m,         xtf = 10000 m
h0 = 30 m,        htf = 0 m.
Based on Figures 4.1–4.6, an attempt is made to identify characteristic arcs of the trajectory, classify them according to the constraints active on them, and suggest physical or mathematical explanations for the observed behaviour. In this analysis the missile is assumed to be launched horizontally from the minimum altitude constraint (h0 = 30 m). The trajectory is split into three subintervals: level flight, climbing and diving. Each of the trajectory arcs corresponding to the subintervals is now discussed in turn.
4.2.1 First Arc (Level Flight): Minimum Altitude Flight

The missile flies at the minimum altitude with the thrust at the maximum value. Thus the thrust and altitude constraints are active immediately at the start of the manoeuvre. In this case the altitude h of the missile remains constant at the minimum altitude hmin until the missile must start climbing. The duration of the cruise flight depends on the final speed Vtf (see Figure 4.1). The right-hand side of equation (1.5d) is zero during this flight because the altitude remains constant. This means that the flight path angle γ is zero, because the velocity V is never zero during flight. Obviously, γ(t) = 0 causes the derivative of the flight path angle, γ̇, to be equal to zero.
Figure 4.1 Altitude versus time histories for minimum time problem using DIRCOL for a varying final speed.
Figure 4.2 Speed versus time histories for minimum time problem using DIRCOL for a varying final speed.
Figure 4.3 Flight path angle versus time histories for minimum time problem using DIRCOL for a varying final speed.
Figure 4.4 Angle of attack versus time histories for minimum time problem using DIRCOL for a varying final speed.
Figure 4.5 Thrust versus time histories for minimum time problem using DIRCOL for a varying final speed.
Figure 4.6 Normal acceleration versus time histories for minimum time problem using DIRCOL for a varying final speed.
The dynamics equations (1.5) are therefore reduced as follows:

\dot{\gamma} = \frac{1}{V}\left[\frac{T-D}{m}\sin\alpha + \frac{L}{m}\cos\alpha - g\right] = 0    (4.2a)

\dot{V} = \frac{T-D}{m}\cos\alpha - \frac{L}{m}\sin\alpha    (4.2b)

\dot{x} = V    (4.2c)

\dot{h} = 0.    (4.2d)
We now consider the consequences of the right-hand side of equation (4.2a) being zero. This condition means that the normal acceleration L/m remains almost constant, because the angle of attack α is very small. The first term on the right-hand side of equation (4.2a) is small, because sin α ≈ α ≈ 0, and we are left with L/m ≈ g due to cos α ≈ 1. In this arc the speed increases, because for small α equation (4.2b) gives

\dot{V} \approx \frac{T-D}{m} > 0, \quad \text{as } T > D,

which, in turn, means that the angle of attack α slowly decreases, in accordance with the lift, in order to maintain L/m approximately equal to g.
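As a rough, back-of-envelope check of this trim condition (our illustration, using the model data of Table 5.1 with ρ ≈ 1.22 kg/m³ at h = 30 m and B2 = 0), L/m ≈ g with the normal force model (5.5)–(5.7) gives

\alpha \approx \frac{2mg}{B_1 \rho V^2 S_{ref}} = \frac{2 \times 1005 \times 9.81}{21.9 \times 1.22 \times 272^2 \times 0.3376} \approx 0.03\ \text{rad} \approx 1.7^\circ,

consistent with the small, slowly decreasing angle of attack at the start of the histories in Figure 4.4.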
4.2.2 Second Arc (Climbing)

The missile eventually climbs in order to achieve the final condition on the flight path angle γtf. Figure 4.1 shows that for the case of final speed Vtf = 310 m/s the missile climbs immediately after launch. The thrust constraint is the only active constraint at the beginning of climbing. Although the missile needs full power to reach the target as soon as possible, it must also satisfy the final speed boundary condition. Therefore, for the cases of final speeds 250 m/s and 270 m/s, the thrust switches to the minimum value when the missile nearly turns over. At the end of climbing the angle of attack decreases rapidly; for the case of final speed 250 m/s it first increases and then decreases rapidly. This increase makes the maximum normal acceleration constraint active for the case of final speed 250 m/s. The minimum normal acceleration constraint becomes active at the end of climbing because of the rapidly decreasing angle of attack.
4.2.3 Third Arc (Diving)

The missile dives with the minimum thrust at the beginning of diving for the cases of final speeds 250 m/s and 270 m/s. However, the missile must hit the target at a specified speed at the end of the manoeuvre, and the speed during the turnover is lower than the final speed. Therefore the speed must increase, and hence the thrust switches back to the maximum value; this also facilitates the missile's arrival at the target as soon as possible. In this phase the normal acceleration is still saturated at the minimum value. Obviously, the altitude decreases to reach the target (γ < 0 ⇒ ḣ < 0, see equation (1.5d)), while the speed increases to satisfy the terminal speed condition Vtf. Finally, the missile satisfies the terminal conditions of the manoeuvre at the final time tf, approximately 40 s after firing (see Table 4.1).
Table 4.1 Performance index for the minimum time problem using DIRCOL.

Final speed Vtf (m/s)    J (s)
250                      39.6471
270                      39.9068
310                      40.9078
4.3 Mathematical Analysis This section describes the mathematical analysis of the minimum time terminal bunt problem by considering qualitative analysis results from Section 4.2. The basic premise of the analysis is to exploit the clearly identifiable arcs of the trajectory and obtain the full solution by piecing them together. The theoretical basis of this approach is Bellman’s optimality principle (Macki and Strauss 1982, p. 118): Any piece of an optimal trajectory is optimal, a result which follows easily from a proof by contradiction. While the analysis of the trajectory is thus considerably simplified, establishing the length (duration) of each arc still requires the formulation and solution of consistent boundary value problems (BVPs). In Section 4.3.1 the problem with only the thrust constraint is considered. Section 4.3.2 explains the derivations relevant to optimal control problems with path constraints. Section 4.3.3 presents the first arc of the bunt manoeuvre, which is constrained by the altitude. The climbing manoeuvre is described in Section 4.3.4. The last part of the trajectory, the diving manoeuvre, is discussed in Section 4.3.5.
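For orientation, before the arc-by-arc derivations (a textbook-style summary of the method eventually used in Section 4.4, not a description of BNDSCO internals): a multi-point BVP of this kind is solved by multiple shooting, which subdivides [t0, tf] by nodes t0 < t1 < · · · < tN = tf, integrates the combined state–co-state system ẏ = F(y, t) on each subinterval from guessed node values s_k, and applies Newton's method to the matching and boundary conditions

y(t_{k+1}^{-};\, t_k, s_k) - s_{k+1} = 0, \quad k = 0, \dots, N-1, \qquad r(s_0, s_N) = 0,

augmented by the interior-point (switching) conditions at the junctions between arcs.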
4.3.1 The Problem with the Thrust Constraint Only

Firstly, we investigate the problem when the initial and final conditions are active and the control is constrained on the thrust T only. Necessary conditions for optimality can be determined by applying Pontryagin's minimum principle, see Pontryagin et al. (1962) and Bryson and Ho (1975). For this purpose, we first consider the Hamiltonian for the unconstrained case:

H^{mtf} = 1 + \frac{\lambda_\gamma}{V}\left[\frac{T-D}{m}\sin\alpha + \frac{L}{m}\cos\alpha - g\cos\gamma\right] + \lambda_V\left[\frac{T-D}{m}\cos\alpha - \frac{L}{m}\sin\alpha - g\sin\gamma\right] + \lambda_x V\cos\gamma + \lambda_h V\sin\gamma,    (4.3)

where the co-states λ = (λγ, λV, λx, λh) have been adjoined to the dynamic system of equation (1.5). The co-states are determined by

\dot{\lambda} = -\frac{\partial H^{mtf}}{\partial x}.    (4.4)
The components of the co-state vector λ satisfying the preceding equation are

\dot{\lambda}_\gamma = -\left[\frac{\lambda_\gamma}{V} g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma\right]    (4.5)

\dot{\lambda}_V = -\left[\lambda_\gamma\left(\frac{C_l\rho S_{ref}\cos\alpha}{2m} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} - \frac{T\sin\alpha}{V^2 m} + \frac{g}{V^2}\cos\gamma\right) - \frac{\lambda_V\rho V S_{ref}}{m}\left(C_d\cos\alpha + C_l\sin\alpha\right) + \lambda_x\cos\gamma + \lambda_h\sin\gamma\right]    (4.6)

\dot{\lambda}_x = 0    (4.7)

\dot{\lambda}_h = -\lambda_\gamma\frac{V S_{ref}\rho_h}{2m}\left(C_l\cos\alpha - C_d\sin\alpha\right) + \lambda_V\frac{V^2 S_{ref}\rho_h}{2m}\left(C_d\cos\alpha + C_l\sin\alpha\right)    (4.8)
where ρh = 2C1h + C2. The optimal values of the control variables are generally to be determined from Pontryagin's minimum principle. A necessary condition for optimal control is the minimum principle

\min_{u} H^{mtf},    (4.9)

i.e. the Hamiltonian must be minimised with respect to the vector of controls u = (T, α). Since the control T appears linearly in the Hamiltonian, the condition H_T^{mtf} = 0 does not determine the optimal thrust; but T is bounded, so the following provides the minimum of the Hamiltonian:

T = \begin{cases} T_{\max} & \text{if } H_T^{mtf} < 0 \\ T_{sing} & \text{if } H_T^{mtf} = 0 \\ T_{\min} & \text{if } H_T^{mtf} > 0 \end{cases}

with

H_T^{mtf} = \frac{\lambda_\gamma\sin\alpha}{Vm} + \frac{\lambda_V\cos\alpha}{m} \quad \text{(switching function)}.    (4.10)

• Case when T is on the boundary (T = Tmax or T = Tmin). In this case α can be determined from

H_\alpha^{mtf} = (T - D + L_\alpha)\left[\frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha\right] - (D_\alpha + L)\left[\frac{\lambda_\gamma}{Vm}\sin\alpha + \frac{\lambda_V}{m}\cos\alpha\right] = 0    (4.11)

with

L_\alpha = \tfrac{1}{2}\rho V^2 S_{ref} B_1, \qquad D_\alpha = \tfrac{1}{2}\rho V^2 S_{ref}(2A_1\alpha + A_2).
The value of α cannot be derived in closed form from (4.11), and must be obtained numerically.

• Case when T = Tsing (singular control). When the switching function H_T^{mtf} vanishes on an interval (t1, t2) ⊂ (t0, tf), the control corresponding to the magnitude of the thrust T is singular. In these circumstances, there are finite control variations of T which do not affect the value of the Hamiltonian. From Bryson and Ho (1975, p. 246), singular arcs occur when

H_u^{mtf} = 0 \quad \text{and} \quad \det H_{uu}^{mtf} = 0.    (4.12)

Substituting (4.3) into (4.12) with u = (T, α) yields

H_T^{mtf} = \frac{\lambda_\gamma\sin\alpha}{Vm} + \frac{\lambda_V\cos\alpha}{m} = 0    (4.13)

H_\alpha^{mtf} = (T - D + L_\alpha)\left[\frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha\right] - (D_\alpha + L)\left[\frac{\lambda_\gamma}{Vm}\sin\alpha + \frac{\lambda_V}{m}\cos\alpha\right] = 0    (4.14)

\det H_{uu}^{mtf} = 0 \iff \frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha = 0.    (4.15)

Conditions (4.13)–(4.15) cannot be satisfied simultaneously, so we conclude that there are no singular arcs. However, jump discontinuities in the control T may appear if, at some time t, the switching function (4.10) changes sign. The Hamiltonian is not an explicit function of time, so H^{mtf} is constant along the optimal trajectory and must be equal to zero because of the minimum time formulation.
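The bang–bang logic translates directly into code; the following minimal sketch (ours, not taken from any of the packages) evaluates the switching function (4.10) and selects the thrust accordingly. Since the analysis above rules out singular arcs, only the two bounds can occur:

      ! Minimal sketch of bang-bang thrust selection from the switching
      ! function (4.10): H_T = lamgam*sin(alpha)/(v*m) + lamv*cos(alpha)/m.
      ! No singular arcs exist for this problem, so T takes only its bounds.
      function thrust_bb(lamgam, lamv, alpha, v, m, tmin, tmax) result(t)
        implicit none
        double precision, intent(in) :: lamgam, lamv, alpha, v, m
        double precision, intent(in) :: tmin, tmax
        double precision :: t, ht
        ht = lamgam*sin(alpha)/(v*m) + lamv*cos(alpha)/m
        if (ht < 0.0d0) then
          t = tmax     ! H_T < 0: Hamiltonian minimised by maximum thrust
        else
          t = tmin     ! H_T > 0: Hamiltonian minimised by minimum thrust
        end if
      end function thrust_bb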
4.3.2 Optimal Control with Path Constraints

In Section 4.3.1 we derived necessary conditions for optimality by considering only the initial and terminal conditions. In this section the level of complexity is increased by considering the additional constraints defined in Section 1.2. The first state path constraint, on the speed (1.12), can be split as Vmin − V ≤ 0 and V − Vmax ≤ 0. Both are of order 1, because V̇ depends explicitly on the controls, see Bryson and Ho (1975, pp. 99–100). Since the speed constraint is not active during the manoeuvre in this case, it will not be taken into account in the Hamiltonian (see Figure 4.2). The second path constraint, on the altitude (1.13), is of order 2, and the mixed state–control constraint (1.15) is split as Lmin − L/(mg) ≤ 0 and L/(mg) − Lmax ≤ 0, where L/(mg) depends on the control explicitly. The Hamiltonian can be defined as follows:

H^{mtc} = H^{mtf} + \mu_1\left(-\frac{L}{mg} + L_{\min}\right) + \mu_2\left(\frac{L}{mg} - L_{\max}\right) + \mu_3\,\ddot{h}.
The differential equations for the co-state vector λ = (λγ, λV, λx, λh) can be written as

\dot{\lambda} = -\frac{\partial H^{mtc}}{\partial x}.    (4.16)

Since these equations are rather lengthy, they are omitted here. For the Lagrange multipliers µi, i = 1, ..., 3, there must hold µi = 0 if the associated constraint is not active, and µi ≥ 0 if the associated constraint is active. From Section 4.2 we know that the state path constraint on the speed is not active during the entire manoeuvre. Therefore, in the following sections we consider only the altitude and normal acceleration constraints.
4.3.3 First Arc: Minimum Altitude Flight

In this analysis we consider only the state path constraint hmin ≤ h and the thrust control constraint (T is at the maximum value). In this case we assume that the missile starts at the initial altitude h = hmin with T = Tmax. Therefore, the constraints are active immediately at the start of the manoeuvre. The constraint hmin ≤ h has no explicit dependence on the control variables, so we must take time derivatives of the constraint until explicit dependence on the controls occurs. Consider the following equations:

h - h_{\min} = 0    (4.17a)

\dot{h} = V\sin\gamma = 0 \;\Rightarrow\; \gamma(t) = 0 \quad \text{for } t \in [t_0, t_1]    (4.17b)

\ddot{h} = \dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma = 0 \;\Rightarrow\; \dot{\gamma}(t) = 0 \quad \text{for } t \in [t_0, t_1].    (4.17c)

The controls appear explicitly after differentiating the constraint hmin ≤ h twice, so the order of the constraint is 2. Substituting equation (4.17) into the equations of motion (1.5), we obtain the reduced equations (4.2). The angle of attack α can be obtained numerically from equation (4.2a). Substituting this α into equations (4.2b) and (4.2c), these equations can be solved as an initial value problem (IVP). Thus we can find the first arc easily, but we do not know how long it will last. For this purpose we should formulate the appropriate boundary value problem (BVP), which involves finding the co-state variables by defining the Hamiltonian as follows:

H^{mta} = H^{mtf} + \mu_3\left(\dot{V}\sin\gamma + V\dot{\gamma}\cos\gamma\right).    (4.18)

The appropriate co-state equations must be derived. The necessary condition for optimality is given by

H_\alpha^{mta} = (T - D + L_\alpha)\left[\left(\frac{\lambda_\gamma}{Vm} + \mu_3 V\right)\cos\alpha - \frac{\lambda_V}{m}\sin\alpha\right] - (D_\alpha + L)\left[\left(\frac{\lambda_\gamma}{Vm} + \mu_3 V\right)\sin\alpha + \frac{\lambda_V}{m}\cos\alpha\right] = 0.    (4.19)
The angle of attack α can be obtained from equation (4.2a) while the Lagrange multiplier µ3 can be derived explicitly from equation (4.19) and substituted into the state and co-state equations. Since we know the flight path angle and the altitude during this manoeuvre, the number of differential equations reduces to six.
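As an illustration of this IVP (a self-contained sketch of ours, not the production code), the level-flight arc can be integrated forward with α recovered at each step from the trim condition (4.2a) by bisection; the model data are those of Table 5.1, and T = Tmax = 6000 N is assumed from the thrust bound given in Chapter 5, equation (5.11):

      program level_flight_arc
        implicit none
        ! Model data from Table 5.1; Tmax from the control constraint (5.11).
        double precision, parameter :: m = 1005d0, g = 9.81d0, sref = 0.3376d0
        double precision, parameter :: a1 = -1.9431d0, a2 = -0.1499d0, a3 = 0.2359d0
        double precision, parameter :: b1 = 21.9d0, b2 = 0d0
        double precision, parameter :: c1 = 3.312d-9, c2 = -1.142d-4, c3 = 1.224d0
        double precision, parameter :: tmax = 6000d0, h0 = 30d0
        double precision :: v, x, t, alpha, dt
        v = 272d0; x = 0d0; t = 0d0; dt = 0.01d0
        do while (t < 8.0d0)            ! arbitrary 8 s window; true duration from BVP
          alpha = trim_alpha(v)         ! enforce gamma-dot = 0, equation (4.2a)
          v = v + dt*((tmax - d(v,alpha))*cos(alpha) - l(v,alpha)*sin(alpha))/m
          x = x + dt*v                  ! equation (4.2c)
          t = t + dt
        end do
        print *, 'V =', v, '  x =', x
      contains
        double precision function rho()          ! air density at h0, equation (5.7)
          rho = c1*h0**2 + c2*h0 + c3
        end function rho
        double precision function d(vv, al)      ! axial force, equations (5.3)-(5.4)
          double precision, intent(in) :: vv, al
          d = 0.5d0*(a1*al**2 + a2*al + a3)*rho()*vv**2*sref
        end function d
        double precision function l(vv, al)      ! normal force, equations (5.5)-(5.6)
          double precision, intent(in) :: vv, al
          l = 0.5d0*(b1*al + b2)*rho()*vv**2*sref
        end function l
        double precision function gdot(vv, al)   ! right-hand side of (4.2a) times V
          double precision, intent(in) :: vv, al
          gdot = (tmax - d(vv,al))*sin(al)/m + l(vv,al)*cos(al)/m - g
        end function gdot
        double precision function trim_alpha(vv) ! bisection for gdot(vv, alpha) = 0
          double precision, intent(in) :: vv
          double precision :: lo, hi, mid
          integer :: k
          lo = -0.1d0; hi = 0.2d0                 ! bracket: gdot(lo) < 0 < gdot(hi)
          do k = 1, 60
            mid = 0.5d0*(lo + hi)
            if (gdot(vv, lo)*gdot(vv, mid) <= 0d0) then
              hi = mid
            else
              lo = mid
            end if
          end do
          trim_alpha = 0.5d0*(lo + hi)
        end function trim_alpha
      end program level_flight_arc

Such a forward integration reproduces the first arc for any assumed duration; the actual duration t1 has to come from the BVP, as explained above.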
4.3.4 Second Arc: Climbing

In this analysis we consider thrust and normal acceleration constraints. From the qualitative analysis we know that the thrust control switches to the minimum value during climbing for the final speed cases of 250 m/s and 270 m/s; the switching function must therefore change sign from negative to positive. In this section we do not derive optimality conditions for the "free" arc cases, because we can refer to Section 4.3.1 for them. Consider the mixed state–control inequality constraints, see equation (1.15),

L_{\min} \le \frac{L}{mg} \le L_{\max}

where L explicitly depends on the control α. The inclusion of the mixed constraints above leads to the augmented Hamiltonian

H^{mtl} = H^{mtf} + \mu_1\left(-\frac{L}{mg} + L_{\min}\right) + \mu_2\left(\frac{L}{mg} - L_{\max}\right).    (4.20)

The right-hand side of the differential equations for the co-states is to be modified along subarcs of this second arc. Additionally, we have necessary sign conditions for the Lagrange multipliers:

\mu_1 = \begin{cases} 0 & \text{on unconstrained subarcs} \\ \dfrac{H_\alpha^{mtf}\, mg}{L_\alpha} & \text{on constrained subarcs} \end{cases}
\qquad
\mu_2 = \begin{cases} 0 & \text{on unconstrained subarcs} \\ -\dfrac{H_\alpha^{mtf}\, mg}{L_\alpha} & \text{on constrained subarcs} \end{cases}

where

• the optimality condition when the normal acceleration is at the maximum value is

H_\alpha^{mtl} = (T - D + L_\alpha)\left[\frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha\right] - (D_\alpha + L)\left[\frac{\lambda_\gamma}{Vm}\sin\alpha + \frac{\lambda_V}{m}\cos\alpha\right] + \mu_2\frac{L_\alpha}{mg} = 0;    (4.21)

• the optimality condition when the normal acceleration is at the minimum value is

H_\alpha^{mtl} = (T - D + L_\alpha)\left[\frac{\lambda_\gamma}{Vm}\cos\alpha - \frac{\lambda_V}{m}\sin\alpha\right] - (D_\alpha + L)\left[\frac{\lambda_\gamma}{Vm}\sin\alpha + \frac{\lambda_V}{m}\cos\alpha\right] - \mu_1\frac{L_\alpha}{mg} = 0.    (4.22)
When the maximum normal acceleration constraint Lmax is active (case 250 m/s), the angle of attack can be determined from the lift (see equation (1.9)) as

\alpha = \frac{2mgL_{\max} - B_2\rho V^2 S_{ref}}{B_1\rho V^2 S_{ref}}.    (4.23)
Equation (4.23) is valid until the switching function (4.10) changes sign to positive. Below we summarise the results for the case when the normal acceleration is saturated at the maximum value Lmax:

• State equations, as in equations (1.5).

• Co-state equations:

\dot{\lambda}_\gamma = -\left[\frac{\lambda_\gamma}{V}g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma\right]

\dot{\lambda}_V = -\left[\lambda_\gamma\left(\frac{C_l\rho S_{ref}\cos\alpha}{2m} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} - \frac{T\sin\alpha}{V^2 m} + \frac{g}{V^2}\cos\gamma\right) - \frac{\lambda_V\rho V S_{ref}}{m}\left(C_d\cos\alpha + C_l\sin\alpha\right) + \lambda_x\cos\gamma + \lambda_h\sin\gamma + \mu_2\frac{L_V}{mg}\right]

\dot{\lambda}_x = 0

\dot{\lambda}_h = -\left[-\frac{\lambda_\gamma V S_{ref}\rho_h}{2m}\left(C_d\sin\alpha - C_l\cos\alpha\right) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\left(C_d\cos\alpha + C_l\sin\alpha\right) + \mu_2\frac{L_h}{mg}\right]

where L_V = ∂L/∂V and L_h = ∂L/∂h.

• Optimality condition, as in equations (4.21) and (4.23). The thrust switches to the minimum value when H_T^{mtf} changes sign from negative to positive.
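Equations (4.23) and, below, (4.24) are explicit, so on saturated arcs α is a direct evaluation; a one-line sketch (ours):

      ! Angle of attack on a saturated normal-acceleration arc,
      ! equations (4.23)/(4.24): pass lsat = Lmax or Lmin (in g units).
      function alpha_sat(lsat, v, rho, m, g, sref, b1, b2) result(alpha)
        implicit none
        double precision, intent(in) :: lsat, v, rho, m, g, sref, b1, b2
        double precision :: alpha
        alpha = (2.0d0*m*g*lsat - b2*rho*v**2*sref) / (b1*rho*v**2*sref)
      end function alpha_sat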
4.3.5 Third Arc: Diving

In this analysis we consider only the normal acceleration constraint. At the start of diving the thrust is at the minimum value and then switches back to the maximum value for the final speeds of 250 m/s and 270 m/s. In addition, the normal acceleration is saturated at the minimum value for all cases. The angle of attack can then be determined as follows:

\alpha = \frac{2mgL_{\min} - B_2\rho V^2 S_{ref}}{B_1\rho V^2 S_{ref}}.    (4.24)

The Hamiltonian and co-state equations are nearly the same as in the previous section, so the derivation is omitted here. The equations can be summarised as follows:
• State equations, see equations (1.5).

• Co-state equations:

\dot{\lambda}_\gamma = -\left[\frac{\lambda_\gamma}{V}g\sin\gamma - \lambda_V g\cos\gamma - \lambda_x V\sin\gamma + \lambda_h V\cos\gamma\right]

\dot{\lambda}_V = -\left[\lambda_\gamma\left(\frac{C_l\rho S_{ref}\cos\alpha}{2m} - \frac{C_d\rho S_{ref}\sin\alpha}{2m} - \frac{T\sin\alpha}{V^2 m} + \frac{g}{V^2}\cos\gamma\right) - \frac{\lambda_V\rho V S_{ref}}{m}\left(C_d\cos\alpha + C_l\sin\alpha\right) + \lambda_x\cos\gamma + \lambda_h\sin\gamma - \mu_1\frac{L_V}{mg}\right]

\dot{\lambda}_x = 0

\dot{\lambda}_h = -\left[-\frac{\lambda_\gamma V S_{ref}\rho_h}{2m}\left(C_d\sin\alpha - C_l\cos\alpha\right) - \frac{\lambda_V V^2 S_{ref}\rho_h}{2m}\left(C_d\cos\alpha + C_l\sin\alpha\right) - \mu_1\frac{L_h}{mg}\right]

• Optimality condition, Hα^{mtl} as in equations (4.22) and (4.24).
4.4 Indirect Method Solutions

The multi-point boundary value problem is solved by means of the multiple shooting code BNDSCO, see Oberle and Grimm (1989). In contrast to the minimum altitude problem, the optimal solutions of the minimum time problem obtained by the indirect method are available and are compared with the DIRCOL, PROMIS, SOCS and NUDOCCCS results. This problem can be solved because the pure state constraint on the altitude is inactive, while the mixed state–control constraint is still active. The direct method results based on the DIRCOL, PROMIS, SOCS and NUDOCCCS packages give a good approximation for the state and co-state variables, although the problem involves an active mixed state–control inequality constraint. Figures 4.7–4.15 show the computational results for BNDSCO, DIRCOL, PROMIS, SOCS and NUDOCCCS for the following boundary conditions. The initial conditions are

γ0 = 0°, V0 = 272 m/s, x0 = 0 m, h0 = 30 m.
Figure 4.7 Comparison of PROMIS, SOCS, BNDSCO, DIRCOL and NUDOCCCS results of the flight path angle versus time, constrained minimum time problem.
Table 4.2 Performance index for the minimum time problem for the case of final speed Vtf = 310 m/s.

Software    J (s)      Grid points    Tolerance    Hamiltonian
DIRCOL      40.9078    87             10⁻⁹         10⁻⁴
NUDOCCCS    40.9074    105            10⁻⁸         10⁻⁶
BNDSCO      40.9476    9              10⁻⁶         10⁻⁵
PROMIS      40.9042    141            10⁻⁶         –
SOCS        40.9025    300            10⁻⁶         –

The final conditions are γtf = −90°, Vtf = 310 m/s, xtf = 10000 m, htf = 0 m.
Figure 4.8 Comparison of PROMIS, SOCS, BNDSCO, DIRCOL and NUDOCCCS results of the velocity versus time, constrained minimum time problem.
Figure 4.9 Comparison of PROMIS, SOCS, BNDSCO, DIRCOL and NUDOCCCS results of the altitude versus downrange, constrained minimum time problem.
Figure 4.10 Comparison of PROMIS, SOCS, BNDSCO, DIRCOL and NUDOCCCS results of the normal acceleration versus time, constrained minimum time problem.
Figure 4.11 Comparison of PROMIS, SOCS, BNDSCO, DIRCOL and NUDOCCCS results of the angle of attack versus time, constrained minimum time problem.
Figure 4.12 Comparison of BNDSCO, DIRCOL and NUDOCCCS results of λγ versus time, constrained minimum time problem.
Figure 4.13 Comparison of BNDSCO, DIRCOL and NUDOCCCS results of λV versus time, constrained minimum time problem.
Figure 4.14 Comparison of BNDSCO, DIRCOL and NUDOCCCS results of λx versus time, constrained minimum time problem.
Figure 4.15 Comparison of BNDSCO, DIRCOL and NUDOCCCS results of λh versus time, constrained minimum time problem.
4.5 Summary and Discussion

The purpose of the analysis in this chapter was to find the fastest trajectory to strike a fixed target which must be hit from above. Firstly, computational results were obtained using the direct method packages DIRCOL, PROMIS, SOCS and NUDOCCCS for varying final speed. They show that varying the final speed produces no significant differences in the final time (see Table 4.1). If we consider the minimum time only, then the maximum final speed of 310 m/s will inflict the greatest damage on the target. But if we consider both the optimal time and the minimum exposure during the manoeuvre, then further analysis must be done. The minimum time solution gives a flight time less than a second shorter than the minimum altitude solution, see Tables 3.1 and 4.1, while the exposure during the manoeuvre is smaller for the minimum altitude problem than for the minimum time problem. Hence, the trade-off between the two objectives (minimum time and minimum altitude) must be taken into account.

The generic trajectory for the minimum time problem is nearly the same as for the minimum altitude one. Since only the time is optimised, the missile climbs earlier than in the minimum altitude problem for the same final speed. Thus a level flight arc occurs only for the case of final speed 250 m/s; for the cases of final speeds 270 m/s and 310 m/s the missile climbs immediately after launch. During climbing, the thrust is at the maximum value for the case of final speed 310 m/s, while for the cases of final speeds 250 m/s and 270 m/s the thrust switches to the minimum value. The maximum normal acceleration constraint is active only for the case of 250 m/s, in the middle of climbing, for a few seconds; the normal acceleration and the thrust then switch to the minimum value. For the case of final speed 270 m/s the thrust switches to the minimum value at the end of climbing, followed by the normal acceleration switching to the minimum value. At the start of diving, the minimum normal acceleration constraint is active while the thrust is at the maximum value for the case of final speed 310 m/s. In the middle of diving for the cases of final speeds 250 m/s and 270 m/s the thrust switches back to the maximum value to gain enough energy to achieve the final speed, while the normal acceleration is saturated at the minimum. The structure of the equations and the switching times are given in Figures 4.16–4.19.

The computational results of the direct and indirect methods are compared for the case of 310 m/s, in which the minimum normal acceleration constraint is active during diving. DIRCOL, PROMIS, SOCS and NUDOCCCS produce nearly the same trajectories for the state variables. For the co-state variables, DIRCOL and NUDOCCCS produce nearly the same trajectories on the unconstrained arc, but not on the constrained arc; both nevertheless give a good initial guess for BNDSCO. The minimum time obtained by the indirect method is greater than the direct method results, see Table 4.2. This is possible because in the direct method the constraints may not be accurately satisfied, due to the approximation, see von Stryk and Bulirsch (1992).
4.5.1 Comments on Switching Structure

In this section we explain the jumps in the angle of attack α and the switching structure of the thrust T. Consider firstly Figures 4.1–4.6, where for the case Vtf = 310 m/s the missile climbs immediately after launch.
Although the altitude increases rapidly, see Figure 4.1, and the angle of attack α is relatively constant, see Figure 4.4, the normal acceleration does not saturate during climbing, see Figure 4.6. Thus, while the thrust is saturated at the maximum value, T = Tmax, no other constraints are active, so optimal α is obtained from Hα = 0, see equation (4.11), where Tmax should be substituted for T. While rapid climbing is necessary, the missile should also turn over to begin its dive as soon as possible. Therefore the angle of attack α and the speed V decrease to facilitate the missile's turnover. The angle of attack α reaches a negative value at time 28.1241 s, which activates the minimum normal acceleration constraint (L = Lmin) at time 30.6809 s, marking the end of the free arc. The remainder of the trajectory is a constrained arc, with L = Lmin, so that α is obtained from equation (4.24).

In summary, for the case Vtf = 310 m/s, the thrust T is saturated at Tmax throughout the whole trajectory, and the trajectory starts with (1) a free arc, lasting from 0 to 30.6809 s, and finishes with (2) a constrained arc (L = Lmin) from 30.6809 s till tf. Optimal α is computed from equation (4.11) for (1) and from equation (4.24) for (2). This computation results in α being a continuous function of time t, but with one point of non-smoothness, coinciding with the (1) → (2) transition, see Figure 4.4.

Let us now investigate the case Vtf = 270 m/s. The missile is launched at h = hmin, so the first arc (level flight) is a constrained arc and therefore optimal α is obtained not from Hα = 0, but from equation (4.2a). The first arc ends at time 1.6628 s, as the missile starts climbing, thus beginning the second arc. Although the altitude increases, see Figure 4.1, the normal acceleration does not saturate during climbing. While rapid climbing is necessary, the missile should also turn over to begin its dive as soon as possible; therefore optimal α should decrease rapidly towards negative values (to facilitate turnover). However, the rapid decrease in α via equation (4.11) is not sufficient for the required turnover, so it is helped by switching the thrust from Tmax to Tmin. The rapidly decreasing α immediately activates the L = Lmin constraint, which remains active till tf. Still before tf, the thrust switches back to Tmax to facilitate the speed V reaching the final condition.

In summary, for the case Vtf = 270 m/s, the trajectory starts with (1) a constrained arc (h = hmin), lasting from 0 to 1.6628 s, followed by (2) a free arc between 1.6628 and 29.0987 s, and finishes with (3) another constrained arc (L = Lmin). Optimal α is computed from equation (4.2a) for (1), then from equation (4.11) for (2), and from equation (4.24) for (3). This computation results in α being a continuous function of time t, but with two points of non-smoothness, coinciding with the (1) → (2) and (2) → (3) transitions, see Figure 4.4.

For the case Vtf = 250 m/s, the time of level flight is longer than for the case of final speed 270 m/s. The first arc (level flight) is a constrained arc (h = hmin) and therefore optimal α is computed from equation (4.2a). The first arc ends at time 4.9559 s, as the missile starts climbing, thus beginning the second arc. Although the altitude increases, see Figure 4.1, and the speed is relatively large, the normal acceleration does not saturate until 27.7736 s.
The free arc ends when the maximum normal acceleration constraint (L = Lmax) becomes active. During the free arc α is obtained from equation (4.11); while the maximum normal acceleration constraint is active, α is obtained from equation (4.23). Just after that, the thrust switches to the minimum value to facilitate the missile's turnover. Although the thrust switches, the normal acceleration remains saturated. To facilitate the turnover, the angle of attack α decreases rapidly and reaches a negative value at time 28.5996 s. The negative value of α causes the normal acceleration to switch to the minimum value,
L = Lmin, at 28.5996 s. The normal acceleration constraint, L = Lmin, remains active till tf, and therefore α is computed from equation (4.24). Still before tf, the thrust switches back to Tmax to facilitate the speed V reaching the final condition.

In summary, for the case Vtf = 250 m/s, the trajectory starts with (1) a constrained arc (h = hmin), lasting from 0 to 4.9559 s, followed by (2) a free arc between 4.9559 and 27.7736 s, then by (3) another short constrained arc (L = Lmax) from 27.7736 to 28.5996 s, during which the thrust switches from Tmax to Tmin, and, finally, by (4) the last constrained arc (L = Lmin), during which the thrust switches back from Tmin to Tmax. As a result, optimal α is no longer a continuous function of time, having a discontinuity at the (3) → (4) transition. It also has two other points of non-smoothness: at the (1) → (2) and (2) → (3) transitions.

Figure 4.16 Switching structure of the minimum time formulation for the terminal bunt manoeuvre.
Figure 4.17 Schematic representation of the boundary value problem associated with the switching structure for the minimum time problem, case 250 m/s.
Figure 4.18 Schematic representation of the boundary value problem associated with the switching structure for the minimum time problem, case 270 m/s.
Figure 4.19 Schematic representation of the boundary value problem associated with the switching structure for the minimum time problem, case 310 m/s.
5  Software Implementation

This chapter presents example software implementations performed in order to solve the terminal bunt manoeuvre problem discussed in Chapter 4. The direct method implementations were done using the DIRCOL, NUDOCCCS, PROMIS and SOCS packages, while for the indirect method the BNDSCO package was used. Section 5.1 focuses on the DIRCOL implementation, followed by the NUDOCCCS implementation in Section 5.2. The final direct method implementations are done with PROMIS and SOCS under the GESOP environment, as described in Section 5.3. The indirect method package BNDSCO is discussed in Section 5.4. All packages illustrate the minimum time case of the terminal bunt manoeuvre problem with the final speed of 310 m/s. Our user experience with the packages is summarised in Section 5.5.

The mathematical model of the terminal bunt manoeuvre from Section 1.2 is shown again for a self-contained presentation. The problem is to find the optimal trajectory of a generic cruise missile from an assigned initial state to a final state in minimum time. The objective is formulated by introducing the performance criterion

J = \int_{t_0}^{t_f} dt.    (5.1)
The performance criterion is subject to the equations of motion, which are

\dot{\gamma} = \frac{T-D}{mV}\sin\alpha + \frac{L}{mV}\cos\alpha - \frac{g\cos\gamma}{V}    (5.2a)

\dot{V} = \frac{T-D}{m}\cos\alpha - \frac{L}{m}\sin\alpha - g\sin\gamma    (5.2b)

\dot{x} = V\cos\gamma    (5.2c)

\dot{h} = V\sin\gamma    (5.2d)
where t is the actual time, t0 ≤ t ≤ tf, with t0 as the initial time and tf as the final time. The state variables are the flight path angle γ, speed V, horizontal position x and altitude h of the missile. The thrust magnitude T and the angle of attack α are the two control variables. The aerodynamic forces D and L are functions of the altitude h, velocity V and angle of attack α. The following relationships have been assumed:
Axial aerodynamic force  This force is written in the form

D(h, V, \alpha) = \frac{1}{2} C_d \rho V^2 S_{ref}    (5.3)

C_d = A_1\alpha^2 + A_2\alpha + A_3.    (5.4)

Note that D is not the drag force.

Normal aerodynamic force  This force is written in the form

L(h, V, \alpha) = \frac{1}{2} C_l \rho V^2 S_{ref}    (5.5)

C_l = B_1\alpha + B_2,    (5.6)

where ρ is the air density (see footnote 2 on page 5), given by

\rho = C_1 h^2 + C_2 h + C_3    (5.7)
and Sref is the reference area of the missile; m denotes the mass and g the gravitational constant, see also Table 5.1. Note that L is not the lift force.

Boundary conditions  The initial and final conditions for the four state variables are specified as follows:

γ0 = 0°,          γtf = −90°        (5.8a)
V0 = 272 m/s,     Vtf = 310 m/s     (5.8b)
x0 = 0 m,         xtf = 10000 m     (5.8c)
h0 = 30 m,        htf = 0 m.        (5.8d)
In addition, constraints are defined as follows:

• State path constraints

200 ≤ V ≤ 310    (5.9)
30 ≤ h.          (5.10)

Note that the altitude constraint (5.10) does not apply near the terminal condition.

• Control path constraint

1000 ≤ T ≤ 6000.    (5.11)

• Mixed state and control constraint (see equations (5.5)–(5.7))

−4 ≤ L/(mg) ≤ 4.    (5.12)
Table 5.1 Physical modelling parameters.

Quantity    Value            Unit
m           1005             kg
g           9.81             m/s²
Sref        0.3376           m²
A1          −1.9431          –
A2          −0.1499          –
A3          0.2359           –
B1          21.9             –
B2          0                –
C1          3.312 · 10⁻⁹     kg/m⁵
C2          −1.142 · 10⁻⁴    kg/m⁴
C3          1.224            kg/m³
[Figure 5.1 flowchart stages: initial estimates for x(t), u(t), tf and an initial grid; set tolerances; macro iteration: solve the NLP using NPSOL or SNOPT (major iteration: solve QP; minor iteration: QP iteration); accuracy check; check of the switching structure; refinement of the switching structure and grid refinement; stop.]
Figure 5.1 DIRCOL flowchart, after von Stryk (1999, p. 51).
[Figure 5.2 shows the DIRCOL file structure.
Input files: DATDIM (dimensions, tolerances, etc.), DATLIM (lower and upper bounds, boundary conditions), DATGIT (optional; grid points), DATSKA (optional; constants for scaling), GDATX (optional; state variables of a previous run), GDATU (optional; control variables of a previous run).
DIRCOL subroutines: DIRCOM (initialising user-supplied COMMON blocks), USROBJ (objective function), USRDEQ (differential equations), USRNBC (nonlinear boundary conditions), USRNIC (nonlinear inequality constraints), USRNEQ (nonlinear equality constraints), USRSTV (starting values).
Output files: DATRES (protocol and résumé of optimisation), DATGIT (grid points used), DATSKA (constants used for scaling), GDATX (computed state variables), GDATU (computed control variables), GDATD (local errors for accuracy checks: defects, optimality, constraints), GDATL (optional; estimates of adjoint variables), GDATM (optional; estimates of constraint multipliers), GDATS (optional; estimates of new switching points).]
Figure 5.2 DIRCOL structure, after von Stryk (1999, p. 59).
5.1 DIRCOL Implementation

DIRCOL (Direct Collocation Method for the Numerical Solution of Optimal Control Problems) is a collection of FORTRAN subroutines designed to solve optimal control problems for systems described by first-order differential equations, subject to general equality or inequality constraints on the control and/or state variables, see von Stryk (1999). The direct collocation method transforms the optimal control problem into a sequence of nonlinearly constrained optimisation problems by discretising the state and control variables. In DIRCOL, the controls are chosen as piecewise linear interpolating functions and the states as continuously differentiable, piecewise cubic functions. The resulting NLP is solved by the sequential quadratic programming method SNOPT, see Gill et al. (1993). The flowchart of DIRCOL can be seen in Figure 5.1. One of the advantages of DIRCOL is that it also computes estimates of the adjoint variables (co-states), see von Stryk (1993) and also Section 2.5.1.

This section describes in detail how the terminal bunt manoeuvre problem was implemented in DIRCOL by displaying and commenting on essential subroutines of the main file user.f and two input files, DATDIM and DATLIM. The section does not attempt to explain how to implement a general optimal control problem in DIRCOL, as this is done in von Stryk's user's guide, see von Stryk (1999), and also the structure of DIRCOL in Figure 5.2. The minimum time case will be given as an illustration of how to implement the terminal bunt problem in DIRCOL.
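To fix ideas about the discretisation (a standard construction for collocation methods of this type; see von Stryk (1999) for DIRCOL's exact scheme), on a grid interval [t_j, t_{j+1}] of length h_j the cubic state interpolant is determined by the node values x_j, x_{j+1} and the dynamics f_j = f(x_j, u_j, t_j), f_{j+1}, and the NLP imposes the defect constraint at the interval midpoint:

x_{mid} = \frac{x_j + x_{j+1}}{2} + \frac{h_j}{8}\left(f_j - f_{j+1}\right), \qquad u_{mid} = \frac{u_j + u_{j+1}}{2},

f\left(x_{mid}, u_{mid}, t_{mid}\right) + \frac{3}{2h_j}\left(x_j - x_{j+1}\right) + \frac{1}{4}\left(f_j + f_{j+1}\right) = 0.

Each interval thus contributes one vector equality constraint, and the path constraints are enforced at the grid points.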
5.1.1 User.f

File user.f defines the important components of the simulation, see the subroutine part of Figure 5.2. It contains the following subroutines:

• Subroutine DIRCOM defines some basic data.
• Subroutine USRSTV defines the initial estimates for the state and control variables.
• Subroutine USROBJ defines the objective function.
• Subroutine USRDEQ defines the dynamic equations.
• Subroutine USRNBC defines the nonlinear boundary conditions.
• Subroutine USRNIC defines the nonlinear inequality constraints.
• Subroutine USRNEC defines the nonlinear equality constraints.

Subroutine DIRCOM

This subroutine defines the basic data for the problem, such as the gravitational constant g (AGRA), the mass m (AM), the reference area of the missile Sref (SREF), the polynomial coefficients for D, see equation (5.4), (CA2, CA1, CA0), the polynomial coefficient for L, see equation (5.5), (CN0) and the polynomial coefficients of the air density ρ (CRHO2, CRHO1, CRHO0), see Table 5.1. The user can exploit COMMON blocks such as /USRCOM/, which may be shared by all subroutines that need the variables and constants. The data of the terminal bunt problem can be described as:
      IMPLICIT NONE
C
C--------BEGIN---PROBLEM------------------------------------------
C
C*    COMMON /USRCOM/
C
C---------END----PROBLEM------------------------------------------
      DOUBLE PRECISION PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
C
      COMMON /USRCOM/ PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
C ---
      PI    = 3.141592653589793238D0
      AM    = 1005
      AGRA  = 9.8
      SREF  = 0.3376
      CA2   = -1.9431
      CA1   = -0.1499
      CA0   = 0.2359
      CN0   = 21.9
      CRHO2 = 3.312D-9
      CRHO1 = -1.142D-4
      CRHO0 = 1.224
C
      RETURN
      END
Subroutine USRSTV

This subroutine contains the initial estimates of the state and control variable histories, the control parameters, and initial estimates of the events as provided by the user. The initial estimates for the terminal bunt problem can be given as:

      IMPLICIT NONE
      INTEGER IPHASE, NX, LU, IFAIL
      DOUBLE PRECISION
     +     TAU, X(NX), U(LU)
C
C--------BEGIN---PROBLEM------------------------------------------
C
      DOUBLE PRECISION PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
      DOUBLE PRECISION AGAM0, V0, X0, AH0, Z0, XF
C
      COMMON /USRCOM/ PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
      PARAMETER (X0 = 0.D0, AH0 = 30.D0, V0 = 270.0D0)
      PARAMETER (AGAM0 = 0.0D0, Z0 = 30.0D0, XF = 10000.0D0)
C
c     IF (IPHASE .GT. 0) THEN
C
C ------ Initial estimates of X(t) and U(t)
C
      X(1) = AGAM0
      X(2) = V0
      X(3) = X0 + TAU * (XF - X0)
      X(4) = AH0
      U(1) = 0.000D0
      U(2) = 6000.D0
C
C---------END----PROBLEM------------------------------------------
C
      RETURN
C --- End of subroutine USRSTV
      END
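The only non-constant part of this guess is the downrange position, which is interpolated linearly in the normalised time $\tau \in [0,1]$ supplied by DIRCOL:

$$x(\tau) = x_0 + \tau\,(x_f - x_0), \qquad x_0 = 0\ \mathrm{m}, \quad x_f = 10000\ \mathrm{m},$$

which is what the line X(3) = X0 + TAU * (XF - X0) implements.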
Subroutine USROBJ

This subroutine contains the objective of the optimal control problem in the Mayer form (see Section 2.2). The parameter NR below specifies the required component of the objective in each phase. The objective function of the terminal bunt problem, see equation (5.1), can be defined as:

      IMPLICIT NONE
      INTEGER NX, LU, LP, NR, IFAIL
      DOUBLE PRECISION
     +     ENR, XL(NX), UL(LU), P(LP), FOBJ, XR(NX), UR(LU), TF
C
C--------BEGIN---PROBLEM------------------------------------------
C
      IF (NR .EQ. 1) THEN
         FOBJ = TF
      ELSE
         FOBJ = 0.0D0
      END IF
C
C---------END----PROBLEM------------------------------------------
C
      RETURN
C --- End of subroutine USROBJ
      END
Subroutine USRDEQ

This subroutine provides the right-hand side of the dynamic equations, see equations (5.2):

      IMPLICIT NONE
      INTEGER IPHASE, NX, LU, LP, IFAIL
      DOUBLE PRECISION
     +     X(NX), U(LU), P(LP), T, F(NX)
      DOUBLE PRECISION RHO,RHOV,A,AN,COSGAM,SINGAM,COSALP,SINALP,TMA,ANM
      DOUBLE PRECISION PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
C
      COMMON /USRCOM/ PI,AM,AGRA,SREF,CA2,CA1,CA0,CN0,CRHO2,CRHO1,CRHO0
C
C--------BEGIN---PROBLEM------------------------------------------
C
      INTEGER I
      INTRINSIC COS, SIN
C --- the air density
      RHO  = CRHO2*X(4)**2+CRHO1*X(4)+CRHO0
C --- 0.5 RHO V^2 SREF
      RHOV = 0.5 * RHO * X(2)**2 * SREF
C
C --- the drag
      A  = (CA2 * U(1)**2 + CA1 * U(1) + CA0) * RHOV
C
C --- the lift
      AN = CN0 * RHOV * U(1)
C
C --- the differential equations
C     -------------------------
C
      COSGAM = COS(X(1))
      SINGAM = SIN(X(1))
      COSALP = COS(U(1))
      SINALP = SIN(U(1))
      TMA    = (U(2)-A)/AM
      ANM    = AN/AM
C
      F(1) = (TMA*SINALP + ANM*COSALP - AGRA*COSGAM)/X(2)
      F(2) =  TMA*COSALP - ANM*SINALP - AGRA*SINGAM
      F(3) =  X(2) * COSGAM
      F(4) =  X(2) * SINGAM
C
C---------END----PROBLEM------------------------------------------
C
      RETURN
C --- End of subroutine USRDEQ
      END
Subroutine USRNBC

This subroutine describes the nonlinear boundary conditions of the problem. The parameter IKIND below specifies the type of boundary/switching conditions (explicit or implicit) that has to be computed. The parameter XR contains the final conditions of the state variables, which are defined by explicit boundary conditions for the terminal bunt problem. The boundary conditions are defined in equations (5.8). The subroutine is given as follows:

      IMPLICIT NONE
C
      INTEGER IKIND, NRNLN, NX, LU, LP, IFAIL
C**** REAL
      DOUBLE PRECISION
     +     XL(NX), XR(NX), UL(LU), UR(LU), P(LP), EL, ER, RB(NRNLN)
C
C--------BEGIN---PROBLEM----------------------------------
C
C     This problem doesn't have any
C     (nonlinear) implicit boundary/switching conditions
C
C     There are some explicit boundary conditions of the second kind.
      IF (IKIND .EQ. -1) THEN
         XR(1) = -1.57D0
         XR(2) = 310.0D0
         XR(3) = 10000.0D0
         XR(4) = 0.0D0
      END IF
C
C---------END----PROBLEM----------------------------------
C
      RETURN
C --- End of subroutine USRNBC
      END
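In terms of equations (5.8), the values set in XR correspond to the terminal conditions

$$\gamma(t_f) = -1.57\ \mathrm{rad} \approx -90^{\circ}, \qquad V(t_f) = 310\ \mathrm{m/s}, \qquad x(t_f) = 10000\ \mathrm{m}, \qquad h(t_f) = 0\ \mathrm{m}.$$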
Subroutine USRNIC

This subroutine provides the nonlinear inequality constraints of the problem. The following inequality constraints should be defined for the terminal bunt problem:

$$200 \le V \le 310, \qquad (5.13)$$
$$30 \le h, \qquad (5.14)$$
$$1000 \le T \le 6000, \qquad (5.15)$$
$$-4 \le \frac{L}{mg} \le 4. \qquad (5.16)$$

• Altitude constraint (5.14). In the terminal bunt problem, the final condition of the altitude is $h(t_f) = 0$. Therefore the altitude constraint is always violated by the final condition. It is necessary to impose the altitude constraint conditionally (here, only for x ≤ 7500 m, see the code below) to overcome this problem.

• Normal acceleration (5.16). The normal acceleration constraint can be rewritten as
$$0 \le 16 - \left(\frac{L}{mg}\right)^2. \qquad (5.17)$$
The normal acceleration can now be implemented as one constraint instead of two, as spelled out below.

• Air speed (5.13). The air speed constraint can be split into two inequality constraints:
$$0 \le V - 200 \quad \text{and} \quad 0 \le 310 - V. \qquad (5.18)$$
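As a quick check that (5.17) is indeed equivalent to the two-sided bound (5.16):

$$-4 \le \frac{L}{mg} \le 4 \iff \left(\frac{L}{mg}\right)^2 \le 16 \iff 16 - \left(\frac{L}{mg}\right)^2 \ge 0,$$

which is exactly the quantity assigned to G(1) in the subroutine below.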
The constraints on the thrust are not defined here, as they will be defined as lower and upper bounds in file DATLIM, see Section 5.1.2. The following subroutine defines the nonlinear inequality constraints of the terminal bunt problem:

      IMPLICIT NONE
C
      INTEGER NGNLN, NX, LU, LP, IPHASE, NEEDG(NGNLN), IFAIL
      DOUBLE PRECISION
C**** REAL
     +     T, X(NX), U(LU), P(LP), G(NGNLN)
      DOUBLE PRECISION Rhok, ACCN
      DOUBLE PRECISION RHO,RHOV,A,AN,COSGAM,SINGAM,COSALP,SINALP,TMA,ANM
      DOUBLE PRECISION PI,AM,AGRA,SREF,CA2,CA1,CA0,CN1,CRHO2,CRHO1,CRHO0
C
      COMMON /USRCOM/ PI,AM,AGRA,SREF,CA2,CA1,CA0,CN1,CRHO2,CRHO1,CRHO0
C
C--------BEGIN---PROBLEM----------------------------------
C
      Rhok = CRHO2*X(4)*X(4)+CRHO1*X(4)+CRHO0
      ACCN = (21.9/2 * U(1)*Rhok*X(2)*X(2)*SREF)/(1005*9.8)
C --- Normal Acceleration Constraint
      G(1) = 16 - ACCN*ACCN
C --- Altitude Constraint
      IF (X(3).LE.7500.0D0) THEN
         G(2) = X(4)-30.0D0
      ENDIF
C
C---------END----PROBLEM----------------------------------
C
      RETURN
C --- End of subroutine USRNIC
      END
Subroutine USRNEC

This subroutine describes the nonlinear equality constraints of the problem. The terminal bunt problem does not have any equality constraints.
5.1.2 Input File DATLIM

Input file DATLIM is always needed to run DIRCOL. It prescribes the values of the state and control variables at the initial and final time, as well as their lower and upper bounds. The lower and upper bounds of the final time are also prescribed in this file; a fixed final time can be imposed by setting its lower and upper bounds to the same value.

************************************************************************
* file DATLIM                                                          *
* (prescribed values at initial time, final time and switching points, *
*  lower and upper bounds for all variables X, U, P, E)                *
************************************************************************
*
* the NX values of X(1) through X(NX) at E(1)=T0, E(M)=TF are
  1 , 1,     0.00D0
  1 , 1,   272.0D0
  1 , 1,     0.0D0
  1 , 1,    30.D0
  1 , 0,     0.0D0
* the LU values of U(1) through U(LU) at E(1)=T0, E(M)=TF are
  0 , 0
  0 , 0
*
* 1. switching point E(2):
* -----------------------
* the NX values of X(1) through X(NX) at the switching point are
* ...
* the LU values of U(1) through U(LU) at the switching point are
*
* 2. switching point E(3):
* -----------------------
* the NX values of X(1) through X(NX) at the switching point are
* ...
* the LU values of U(1) through U(LU) at the switching point are
* ...
*
* the lower and upper bounds of the events E(2),...,E(M)=TF are
* -------------------------------------------------------------
*      MIN    ,     MAX
     30.0D0   ,   50.0D0
*
* 1. phase:
* ---------
* the NX lower and upper bounds of the state variables X are
*   X(I)MIN   ,   X(I)MAX
     -1.57D0  ,      1.57D0
    +200.0D0  ,    +310.0D0
       0.0D0  ,  +10000.0D0
       0.00D0 ,   +1900.00D0
       0.00D0 , +100000.00D0
*
* the LU lower and upper bounds of the control variables U are
*   U(K)MIN   ,   U(K)MAX
      -0.3D0  ,      +0.3D0
   +1000.0D0  ,   +6000.0D0
*
* 2. phase:
* ---------
* the NX lower and upper bounds of the state variables X are
* X(I)MIN , X(I)MAX
*
* the LU lower and upper bounds of the control variables U are
* U(K)MIN , U(K)MAX
*
*
* the LP lower and upper bounds of the control parameters P are
* -------------------------------------------------------------
* P(K)MIN , P(K)MAX
*
*
*23456789012345678901234567890123456789012345678901234567890123456789012
5.1.3 Input File DATDIM

The input file DATDIM supplies further data needed to run DIRCOL. It prescribes the following blocks of information:

• name of the problem
• type of run to be performed by DIRCOL and the maximum number of iterations
• the major optimality tolerance of SNOPT
• the nonlinear feasibility tolerance
• major print level
• iScale (the type of scaling) and iDiff (the type of finite difference approximation of non-zero derivatives)
• the dimensions of the state, control and control parameter vectors
• number of phases
• number of nonlinear implicit boundary constraints
• numbers of nonlinear inequality and equality constraints
• number of grid points
• grid point parameters
• starting values
• estimates of the adjoint variables and the switching structure
• names of the state variables
• names of the control variables
• definition of the constraints for the angle variables
• names of the inequality constraints.
************************************************************************
* file DATDIM                                                          *
* (Dimensions of the parameterized optimal control problem)            *
************************************************************************
*
* NAME of the OPTIMAL CONTROL PROBLEM
* -----------------------------------
*2345678901234567890123456789* (<-- max. length of name)
Terminal Bunt Manoeuvre
*
* iAction:
* --------
* - OPTIMIZATION using NPSOL .................................... (0)
* - a check of all dimensions of feasibility .................... (1)
* - a check of subroutines & computation of starting trajectory . (2)
*   or computation of a FEASIBLE TRAJECTORY by
* - objective min-max1 / use NPOPT .............................. (3)
* - objective min-max1 / use NPSOL .............................. (4)
* - objective min-max2 / use NPSOL .............................. (5)
*   or actions involving SNOPT:
* - OPTIMIZATION using NPOPT .................................... (6)
* - OPTIMIZATION using SNOPT (dense Jacobian) ................... (7)
* - OPTIMIZATION using SNOPT (sparse Jacobian) .................. (8)
* - FEASIBLE TRAJECTORY using SNOPT (sparse Jacobian) ........... (9)
*
* iAction, MajItL = ?,?
*
* 0, -5
* 1
* 2, -1
* 4, -1
* 5, -1
* 6, -1
* 7, -1
  8, -11
*
* Optional SQP-Parameters:
* ------------------------
* Optimality Tolerance EPSOPT = ?
  1.0E-9
*
* Nonlinear Feasibility Tolerance EPSNFT = ?
  1.0E-9
*
* Major Print Level (0, 5 or 10) = ?
  5
*
* which SCALINGS and DIFFERENCE APPROXIMATIONS are to be used:
* ------------------------------------------------------------
* iScale:
* -------
* - automatic scaling (but for X, U, E in each phase the same)      (0)
* - read scalings from file 'DATSKA'                                (1)
* - use no scaling                                                  (2)
* - automatic scaling (X, U, E in each phase different)             (3)
* - automatic scaling (X, U in each phase the same, but E different)(4)
*
* iDiff:
* ------
* - forward difference approximations of DIRCOL (default)           (0)
* - internal difference approximation of NPSOL or SNOPT            (-1)
*
* iScale, iDiff = ?,?
  0, -1
*
* NUMBER of STATE VARIABLES    ( NX ),
* ------ of CONTROL VARIABLES  ( LU ),
*        of CONTROL PARAMETERS ( LP ),
* NX, LU, LP = ?
  5, 2, 0
*
* NUMBER of PHASES M1 = ?
* ------------------------
  1
*
* NUMBERS of NONLINEAR IMPLICIT BOUNDARY CONSTRAINTS
* --------------------------------------------------
* NRNLN(1)
* ...
* NRNLN(M1)
  0
*
* NUMBERS of NONLINEAR INEQUALITY and EQUALITY CONSTRAINTS
* --------------------------------------------------------
* in phases 1 through M1:
* NGNLN(1) , NHNLN(1)
* ...
* NGNLN(M1), NHNLN(M1)
*
  2,0
*
* NUMBER of GRID POINTS in phases 1 through M1 ( NG(k) >= 3 ):
* ------------------------------------------------------------
* NG(1)
* ...
* NG(M1)
  7
*
* GRID POINTs parameters:
* iStartGrid                | iOptGrid (during optimization):
* ----------                | --------
* (starting positions):     | - fixed grid points            (0)
* - equidistant         (0) | - movable (collocation error)  (1)
* - as in file DATGIT   (1) | - movable (variation)          (2)
* - as Chebyshev points (2) | - movable (no add. eq. cons.)  (3)
*
* iStartGrid, iOptGrid = ?,?
  0, 0
*
* STARTING VALUES of X(t), U(t), P, and E:
* ----------------------------------------
* - as specified in subroutine USRSTV                           (0)
* - as in files GDATX, GDATU (unchanged number of phases)       (1)
* - X, U, P as in files GDATX, GDATU and
*   E as specified in USRSTV (changed number of phases)         (2)
  1
*
* ESTIMATES of the ADJOINT VARIABLES and of
* -----------------------------------------
* the SWITCHING STRUCTURES of state and control constraints
* - are NOT required (0)
* - are required     (1)
  1
*
* NAMES of the NX state variables:
* --------------------------------
* X(1)_Name
* ...
* X(NX)_Name
*2345678901234* (<-- max. length of name)
g(t)
s(t)
x(t)
h(t)
z(t)
*
* NAMES of the LU control variables:
* ----------------------------------
* U(1)_Name
* ...
* U(LU)_Name
*2345678901234* (<-- max. length of name)
alpha(t)
Thrust(t)
*
* the I-th STATE VARIABLE (I = 1,.., NX) is an UNCONSTRAINED ANGLE
* and varies only in [ -PI, PI [ : 1 (if yes) or 0 (if not)
*
  1
  0
  0
  0
  0
*
* the K-th control variable (K = 1,.., LU) is an UNCONSTRAINED ANGLE
* and varies only in [ -PI, PI [ : 1 (if yes) or 0 (if not)
*
  1
  0
*
* NAMES of the NGNLN(1) nonlinear INEQUALITY CONSTRAINTS of the 1-st phase:
* ------
* 1-st name
* ...
* NGNLN(1)-th name
*2345678901234* (<-- max. length of name)
*
16 - (N/Mg)^2
h - hmin
*
*
* NAMES of the NGNLN(2) nonlinear INEQUALITY CONSTRAINTS of the 2-nd phase:
* 1-st name
* ...
* NGNLN(1)-th name
*2345678901234* (<-- max. length of name)
*
*
*23456789012345678901234567890123456789012345678901234567890123456789012
5.1.4 Grid Refinement and Maximum Dimensions in DIRCOL Grid refinement can be done either by editing input file DATGIT or by increasing the number of grid points in the input file DATDIM. However, the dimensions of variables defined in DATDIM (including the number of grid points) must not exceed the maximum values declared in dircol.h. These maximum values can be changed by editing dircol.h.
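As an illustration only (the parameter names below are invented for this sketch; the identifiers actually used in dircol.h must be taken from the DIRCOL distribution), such a header typically fixes the maximum dimensions through FORTRAN PARAMETER statements:

C     Hypothetical sketch of a dircol.h fragment: NGMAX, NXMAX and
C     NUMAX are placeholder names, not DIRCOL's real identifiers.
      INTEGER    NGMAX, NXMAX, NUMAX
      PARAMETER (NGMAX = 101, NXMAX = 10, NUMAX = 5)

After increasing such a limit, DIRCOL must be recompiled for the new maximum to take effect.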
5.2 NUDOCCCS Implementation

NUDOCCCS (Numerical Discretization Method for Optimal Control Problems with Constraints in Control and States) is a collection of FORTRAN codes developed by Büskens (1996) to solve optimal control problems by discretising the state and control variables (see Section 2.5.1). This section presents the subroutines which must be prepared before solving
the problem using NUDOCCCS. The problem of the minimum time terminal bunt manoeuvre is given as an example.
5.2.1 Main Program

In the main program the user must define mainly:

• number of ordinary differential equations
• number of controls
• maximum number of grid points
• number of unknown initial values
• number of multiple shooting nodes
• number of constraints (state and mixed constraints)
• number of inequality constraints related to state and mixed constraints
• number of point equality constraints, e.g. terminal conditions
• maximum precision
• order of the constraints
• initial guess for the control variables, etc.

C------------------------------------------------------------------
C     Bunt Manoeuvre Problem - Minimum Time Problem
c------------------------------------------------------------------
      PROGRAM MAIN
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER(
     A     NDGL     = 5,         ! #ODE (>0)
     B     NSTEUER  = 2,         ! #CONTROLS (>0)
     C     NDISKRET = 201,       ! MAX # OF GRIDPOINTS (>1)
     D     NUNBE    = 1,         ! #UNKNOWN INITIAL VALUES (>0)
     E     NSTUETZ  = 1,         ! #MULTIPLE SHOOTING NODES (>0)
     F     NNEBEN   = 1,         ! #STATE OR MIXED CONSTRAINTS
     G     NUGLNB   = 1,         ! #THEREOF INEQUALITY CONSTRAINTS
     H     NRAND    = 4,         ! #POINT EQUALITIES (E.G. TERMINAL CONDITIONS)
     I     NARTADJ  = 1,         ! TYPE OF APPROXIMATING ADJOINTS
     J     IPRINT   = 5,         ! PRINT LEVEL
     K     DEL1     = 1.0D-6,    ! FINITE DIFFERENCE FOR OPTIMIZER
     L     DEL2     = 1.0D-4,    ! FINITE DIFFERENCE FOR GRIDFIT
     M     EPS      = 1.0D-12,   ! HIGHEST REACHABLE PRECISION OF SOLUTION
     N     EPS2     = 1.0D-120,  ! SMOOTHING OPERATOR FOR TRUNCATION ERRORS
     O     EPS3     = 1.0d-6,    ! PRECISION OF GRIDREFINEMENT
     P     NZUSATZ  = NUNBE*NSTUETZ,
     Q     N        = (NDISKRET+2)*NSTEUER+NZUSATZ,
     R     M        = NDISKRET*NNEBEN+NRAND+NZUSATZ-NUNBE,
     S     ME       = M-NDISKRET*NUGLNB,
     T     MAX1M    = M)
      DIMENSION
     A     X(NDGL,NDISKRET),U(NSTEUER,NDISKRET+2),DFDU(N),
     B     G(MAX1M),T(NDISKRET),UNBE(NUNBE,NSTUETZ),
     C     UHELP(N),DCDU(MAX1M,N),BL(N+M),BU(N+M),
     D     WORK(3*N*N+2*N*M+21*N+22*M),IWORK(4*N+3*M),
     E     MSDGL(NUNBE),MSSTUETZ(NSTUETZ),IUSER(22+NUNBE+NSTUETZ),
     F     USER(10+7*NDGL+NNEBEN+NSTEUER+NDISKRET*(NDGL+NSTEUER+5)),
     G     ADJ(NDGL,NDISKRET),ADJH(NDGL),DISERR(NDISKRET),
     H     U2(NSTEUER,NDISKRET+2),X2(NDGL,NDISKRET+2),T2(2*NDISKRET),
     I     DSDXH(NNEBEN*NDGL+2*NDGL),DFDXH(NDGL*NDGL+2*NDGL),
     J     PDSDX(NDGL),PD2SD2X(NDGL,NDGL),PDFDX(NDGL,NDGL),
     K     CONORDER(NNEBEN+1),CONH(4*NNEBEN+1)
COMMON/RK/rkeps,tol
C ORDER OF CONSTRAINTS --( OPTIONAL, MUST NOT BE SET)--------
c     conorder(1) = 2

C NUMBER OF DISCRETE POINTS ---------------------------------
      write(*,*) 'NDISKRET at the beginning (e.g. 21):'
      read(*,*) ndis1            !#OF GRIDPOINTS AT THE BEGINNING (>1)

C FIT DYNAMICAL DIMENSIONS ----------------------------------
      N1     = (NDIS1+2)*NSTEUER+NZUSATZ
      M1     = NDIS1*NNEBEN+NRAND+NZUSATZ-NUNBE
      ME1    = M1-NDIS1*NUGLNB
      MAX1M1 = MAX(1,M1)

C EQUIDISTANT DISCRETISATION -------------------------------
C DISCRETIZATION OF TIME -----------------------------------
C HERE: EQUIDISTANT ----------------------------------------
      DO 104 I=1,NDIS1
         T(I) = 1.0d0/(NDIS1-1)*(I-1)
 104  CONTINUE

C RKEPS AND TOL ARE ONLY USED FOR NART=8,9,18,19,28,29 -----
C INITIAL STEPSIZE OF RKF ----------------------------------
C IF RKF=0 THEN RADAU5 SOLVER IS USED (COMPARE SUBROUTINE MAS)
      rkeps = 0.0d0
C TOLERANCE OF RKF- OR DAE-SOLVER --------------------------
      tol = 1.0d-8
C INITIAL PRECISION OF SOLUTION ----------------------------
C TENDS TO EPS IF GRIDFIT IS USED SEVERAL TIMES ------------
      epsgit = 1.0d-5

C INITIAL GUESS FOR CONTROL VARIABLES ----------------------
      do 202 i=1,ndis1
         u(1,i) = 0.0112d0
         u(2,i) = 6000.0d0
 202  CONTINUE
C ONLY USED FOR CUBIC INTERPOLATION OF CONTROL -------------
      u(1,ndiskret+1) = 0.011d0
      u(2,ndiskret+2) = 6000.0d0

C INITIAL GUESS FOR FREE INITIAL VALUES AND SHOOTING NODES -
C HERE: NOT USED -------------------------------------------
      DO 300 I=1,NUNBE
      DO 300 J=1,NSTUETZ
         UNBE(1,J) = 25.0d0
 300  CONTINUE

C --------TYPE OF INTERPOLATION, INTEGRATION, OPTIMIZATION----------------
C --------NART=0..9 ARE FASTEST AND WORK WELL FOR MOST PROBLEMS-----------
C NART= 0: EULER INTEGRATION, NO CONTROL INTERPOL.
C NART= 1: HEUN INTEGRATION, NO CONTROL INTERPOL.
C NART= 2: IMPR. POLY. EULER INTEGRATION, CONST. CONTROL INTERPOL.
C NART= 3: IMPR. POLY. EULER INTEGRATION, LIN. CONTROL INTERPOL.
C NART= 4: RUKU 4 ENGLAND INTEGRATION, LIN. CONTROL INTERPOL.
C NART= 5: RUKU 5 ENGLAND INTEGRATION, LIN. CONTROL INTERPOL.
C NART= 6: RUKU 4 ENGLAND INTEGRATION, CUBIC CONTROL INTERPOL.
C NART= 7: RUKU 5 ENGLAND INTEGRATION, CUBIC CONTROL INTERPOL.
C NART= 8: RKEPS>0: RKFEHLBERG 7/8 INTEGRATION, CONST. CONTROL INTERPOL.
C NART= 8: RKEPS=0: IMPLIC. RADAU5 INTEGRATION, CONST. CONTROL INTERPOL.
C NART= 9: RKEPS>0: RKFEHLBERG 7/8 INTEGRATION, LIN. CONTROL INTERPOL.
C NART= 9: RKEPS=0: IMPLIC. RADAU5 INTEGRATION, LIN. CONTROL INTERPOL.
C --------NART=10...19 ARE SLOWER BUT SAFER-------------------------------
C NART=10: AS NART=0 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=11: AS NART=1 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=12: AS NART=2 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=13: AS NART=3 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=14: AS NART=4 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=15: AS NART=5 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=16: AS NART=6 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=17: AS NART=7 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=18: AS NART=8 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C NART=19: AS NART=9 BUT SLOWER AND WITH CENTRAL DIFFERENCES IF NECESSARY
C --------NART=20...29 ARE MUCH SLOWER BUT MUCH MORE SAFE-----------------
C NART=20: AS NART=10 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=21: AS NART=11 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=22: AS NART=12 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=23: AS NART=13 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=24: AS NART=14 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=25: AS NART=15 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=26: AS NART=16 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=27: AS NART=17 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=28: AS NART=18 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
C NART=29: AS NART=19 WITH ORIGINAL OPTIMIZERSETTINGS, PLUS DIFF. CHECKS
      write(*,*) 'NART (e.g. 4):'
      read(*,*) nart             ! IN GENERAL BEST RESULTS WITH NART=4

 401  write(*,*) 'NDISKRET for next calculations'

C Set before each call to NUDOCCCS --------------------------
      iter  = 0                  ! NO. OF ITERATIONS
      ifail = -1                 ! ERROR MESSAGE

C START OPTIMIZATION ----------------------------------------
      CALL NUDOCCCS(NDGL,NSTEUER,NDIS1,NUNBE,NNEBEN,NUGLNB,NRAND,
     1     NZUSATZ,nart,N1,M1,ME1,MAX1M1,ITER,IFAIL,IPRINT,DEL1,EPSGIT,
     2     X,U,DFDU,FF,G,DCDU,BL,BU,T,UNBE,UHELP,
     3     NSTUETZ,MSDGL,MSSTUETZ,IWORK,WORK,IUSER,USER)
C POSTOPTIMAL CALCULATION OF ADJOINTS ------------------------
      CALL ADJUNG(NDGL,NSTEUER,NDIS1,NUNBE,NNEBEN,NUGLNB,NRAND,
     1     NZUSATZ,NART,NARTADJ,N1,M1,ME1,MAX1M1,ITER,IFAIL,IPRINT,DEL1,
     2     EPSGIT,X,U,DFDU,FF,G,DCDU,BL,BU,T,UNBE,UHELP,NSTUETZ,MSDGL,
     3     MSSTUETZ,IWORK,WORK,IUSER,USER,ADJ,DSDXH,DFDXH,ADJH)

C SAVE RESULTS ----------------------------------------------
      CALL AUSGABE(FF,x,adj,UHELP,T,G,NDGL,NSTEUER,NDIS1,NNEBEN,
     1     NRAND,N1,M1)

      write(*,*) 'State, adjoints and control saved.'

C ADAPTIVE AUTOMATIC GRIDREFINEMENT --------------------------
 1243 CALL GITTERFIT(NDGL,NSTEUER,NDISKRET,NUNBE,NNEBEN,NUGLNB,NRAND,
     1     NZUSATZ,NART,N1,M1,ME1,MAX1M1,ITER,IFAIL,IPRINT,DEL1,
     2     DEL2,EPSGIT,EPS,EPS3,X,U,DFDU,FF,G,DCDU,BL,BU,T,UNBE,UHELP,
     3     NSTUETZ,MSDGL,MSSTUETZ,IWORK,WORK,IUSER,USER,CONORDER,
     4     NDIS1,N1,M1,ME1,MAX1M1,DISERR,X2,U2,T2,pdsdx,pd2sd2x,pdfdx,
     5     conh,FINISH)
      WRITE(*,*) 'Take new grid and optimize new :<ENTER>'
      WRITE(*,*) '       for termination ...'
      READ(*,*)
      GOTO 401
      STOP
      END
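As a quick sanity check of the dimension fit, assume the dialog answer NDIS1 = 21 (the value suggested in the prompt); with NSTEUER = 2, NNEBEN = 1, NRAND = 4, NUNBE = 1, NUGLNB = 1 and NZUSATZ = NUNBE*NSTUETZ = 1:

$$N_1 = (21+2)\cdot 2 + 1 = 47, \qquad M_1 = 21\cdot 1 + 4 + 1 - 1 = 25, \qquad ME_1 = 25 - 21\cdot 1 = 4,$$

i.e. 47 NLP variables and 25 constraints, of which 4 are equalities (the terminal conditions).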
5.2.2 Subroutine MINFKT

This subroutine contains the objective function in the Mayer form (see Section 2.2).

      SUBROUTINE MINFKT(X,U,T,MIN,NDGL,NSTEUER,NDISKRET)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DOUBLE PRECISION MIN,LAGINT
      DIMENSION U(NSTEUER,NDISKRET),X(NDGL,NDISKRET),T(NDISKRET)
C OBJECTIVE: BOLZA FORMULATION -----------------------------
C HERE: Mayer-Formulation ----------------------------------
      min = x(5,ndiskret)
      RETURN
      END
5.2.3 Subroutine INTEGRAL

This subroutine contains the objective function in the Lagrange form (see Section 2.2).

      SUBROUTINE INTEGRAL(INT,X,U,T,NDGL,NSTEUER)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DOUBLE PRECISION INT
      DIMENSION U(NSTEUER),X(NDGL)
C CAN BE USED FOR LAGRANGE-PART IN OBJECTIVE----------------
C BUT BETTER USE MAYER-FORMULATION -------------------------
c     INT=u(1)*u(1)+u(2)*u(2)+u(3)*u(3)
      RETURN
      END
5.2.4 Subroutine DGLSYS

This subroutine provides the right-hand side of the dynamic equations, see equations (5.2). The physical parameters can be seen in Table 5.1.

      SUBROUTINE DGLSYS(X,U,T,DX,NDGL,NSTEUER)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DOUBLE PRECISION PI,AM,AGRA,SREF,CA2,CA1,CA0,CN1,CRHO2,CRHO1,CRHO0
      DOUBLE PRECISION RHO,RHOV,A,AN,COSGAM,SINGAM,COSALP,SINALP,TMA,ANM
      DIMENSION X(NDGL),U(NSTEUER),DX(NDGL)
c
C --- Constants from the reference
      PI    = 3.141592653589793238D0
      AM    = 1005.0d0
      AGRA  = 9.8D0
      SREF  = 0.3376D0
      CA2   = -1.9431D0
      CA1   = -0.1499D0
      CA0   = 0.2359
      CN1   = 21.9D0
      CRHO2 = 3.312D-9
      CRHO1 = -1.142D-4
      CRHO0 = 1.224D0
c
C RIGHT HAND SIDE OF DGL SYSTEM ----------------------------
      RHO  = CRHO2*x(4)**2+CRHO1*x(4)+CRHO0
c
C --- 0.5 RHO V^2 SREF
      RHOV = 0.5 * RHO * x(2)**2 * SREF
C
C --- the drag
      A    = (CA2 * u(1)**2 + CA1 * u(1) + CA0) * RHOV
C
C --- the lift
      AN   = CN1 * RHOV * u(1)
C
C --- the differential equations
C
      COSGAM = dcos(x(1))
      SINGAM = dsin(x(1))
      COSALP = dcos(u(1))
      SINALP = dsin(u(1))
c
      TMA = (u(2)-A)/AM
      ANM = AN/AM
C
      dx(1) = x(5)*(TMA*SINALP + ANM*COSALP - AGRA*COSGAM)/x(2)
      dx(2) = x(5)*(TMA*COSALP - ANM*SINALP - AGRA*SINGAM)
      dx(3) = x(5)*x(2)*COSGAM
      dx(4) = x(5)*x(2)*SINGAM
      dx(5) = 1
      RETURN
      END
5.2.5 Subroutine ANFANGSW

This subroutine defines the known and unknown initial conditions as follows: γ0 = 0 deg, V0 = 272 m/s, x0 = 0 m, h0 = 30 m, z0 = unknown.

      SUBROUTINE ANFANGSW(AWX,UNKNOWN,NDGL,NUNBE)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION AWX(NDGL),UNKNOWN(NUNBE)
C INITIAL CONDITIONS ----------------------------------------------
      AWX(1) = 0.0d0
      AWX(2) = 272.0d0
      AWX(3) = 0.0d0
      AWX(4) = 30.0d0
      AWX(5) = unknown(1)
      RETURN
      END
5.2.6 Subroutine RANDBED

This subroutine provides the point equality constraints, i.e. the boundary values γ(tf) = −90 deg, V(tf) = 310 m/s, x(tf) = 10000 m, h(tf) = 0 m.

      SUBROUTINE RANDBED(X,U,T,R,NDGL,NSTEUER,NDISKRET,NRAND)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION X(NDGL,NDISKRET),U(NSTEUER,NDISKRET),R(NRAND)
      DIMENSION T(NDISKRET)
C POINTWISE EQUALITY CONSTRAINTS ---------------------------------
C HERE: TERMINAL CONDITIONS --------------------------------------
      r(1) = X(1,ndiskret) +     1.57d0
      r(2) = X(2,ndiskret) -   310.0d0
      r(3) = X(3,ndiskret) - 10000.0d0
      r(4) = X(4,ndiskret) -     0.0d0
      RETURN
      END
5.2.7 Subroutine NEBENBED

This subroutine describes the state and mixed constraints. The following constraint should be defined:

$$-4 \le \frac{L}{mg} \le 4. \qquad (5.19)$$

The normal acceleration constraint can be rewritten as

$$0 \le 16 - \left(\frac{L}{mg}\right)^2, \qquad (5.20)$$

so that it can be implemented as one constraint instead of two, as in (5.17).
      SUBROUTINE NEBENBED(X,U,T,CON,NDGL,NSTEUER,NNEBEN)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION X(NDGL),CON(NNEBEN),U(NSTEUER)
C CONSTRAINTS -----------------------------------------------
C NO BOX CONSTRAINTS OF CONTROLS IN THIS ROUTINE-------------
      PI    = 3.141592653589793238D0
      AM    = 1005.0d0
      AGRA  = 9.8D0
      SREF  = 0.3376D0
      CA2   = -1.9431D0
      CA1   = -0.1499D0
      CA0   = 0.2359
      CN1   = 21.9D0
      CRHO2 = 3.312D-9
      CRHO1 = -1.142D-4
      CRHO0 = 1.224D0
      Rhok  = CRHO2*X(4)*X(4)+CRHO1*X(4)+CRHO0
      ACCN  = (21.9/2 * U(1)*Rhok*X(2)*X(2)*SREF)/(1005*9.8)
      CON(1) = 16.0D0 - ACCN*ACCN
      RETURN
      END
5.2.8 Subroutine CONBOXES

This subroutine prescribes the lower and upper bounds for the controls and for the constraints defined in subroutine NEBENBED.

      SUBROUTINE CONBOXES(NSTEUER,NDISKRET,NNEBEN,BL,BU,BLCON,BUCON,T)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION BL(NSTEUER),BU(NSTEUER),BLCON(NNEBEN),BUCON(NNEBEN)
C LOWER AND UPPER BOUNDS OF THE CONTROL FUNCTIONS -----------------
      BL(1) =    -0.3d0
      BU(1) =     0.3d0
      BL(2) =  1000.0d0
      BU(2) =  6000.0d0
C LOWER AND UPPER BOUNDS OF THE CONSTRAINTS IN SUBROUTINE NEBENBED-
      BLCON(1) = 0.0d0
      RETURN
      END
5.3 GESOP (PROMIS/SOCS) Implementation

GESOP is a computing environment for calculating optimal trajectories for complex multiphase optimal control problems. It consists of four well-developed optimisation solvers, PROMIS, TROPIC, CAMTOS and SOCS, which handle large and highly discretised problems, see GESOP – Software User Manual (2004). The general description of the flowchart is given in Figure 5.3. An advantage of GESOP is the possibility of switching between the optimisation methods provided in that environment. To do so, the user needs to select an optimiser (optimisation solver) in the Tool Inspector and to provide an appropriate initial grid. A useful characteristic of GESOP is a graphical user interface (GUI) with drop-down menus for most functions.

Figure 5.3 GESOP flowchart, after GESOP – Software User Manual (2004, p. 69). (Flowchart stages: modelling (initialising the model) and initial guess settings; modification of grids, states, controls, parameters and constraints (on/off) in the Parameter & Grid Inspector (Graphic I/O); simulation/evaluation with modification of states, controls and constraints via Graphic I/O; optimisation with TROPIC, PROMIS or SOCS; monitoring the result with Review and iterations (GISMO); simulation and viewing; finally, export of simulation data to MATLAB.)

GESOP has three components:

• The definition of the problem, supplied by the user in C, FORTRAN, Ada or in MATLAB script language, see the medium grey (MODEL) part of Figure 5.4.
• The optimiser packages, e.g. TROPIC, PROMIS, CAMTOS and SOCS, provided by GESOP, see the dark grey (PROGRAMS) part of Figure 5.4.
• The input–output features, which are GUI-based user-friendly tools, e.g. for editing the discretised problem, see the light grey (FEATURES) part of Figure 5.4.

Figure 5.4 GESOP software architecture, after GESOP – Software User Manual (2004, p. 34). (MODEL: cost function, right-hand side, phase connect, parameter constraints, boundary constraints, path constraints, model initialisation, description interface, output function descriptions, control law interface. PROGRAMS: TOPS generator (initial guess), simulation (TOPS), PROMIS, TROPIC. FEATURES: export to MATLAB, simulation data, Quick View, Result Summary, Graphic I/O, GISMO.)
5.3.1 Dynamic Equations, Subroutine fmrhs.f

This subroutine specifies the right-hand side of the dynamic equation interface for GESOP, see equations (5.2). The dynamic equations can be defined as a phased dynamic system and
must be given in the following form:

$$\frac{dx_i}{dt} = f_i(t, x_i, u_i, \mathrm{rpar}) \qquad (5.21)$$

where $f_i$ is the right-hand side of phase $i$, $x_i$ is the corresponding state vector, $u_i$ is the corresponding control vector and rpar contains the design parameters in phase $i$. The possibility of defining phases is particularly useful for multi-stage rockets and other problems with distinct flight regimes. Note that in this implementation, unlike in the DIRCOL and NUDOCCCS subroutines above, u(1) is the thrust and u(2) the angle of attack.

      SUBROUTINE fmrhs ( phase, dimx, dimu, dimip, dimrp,
     +                   fazinf, t, x, u, ipar, rpar, dx )
c
c INPUT:
c   phase - (Integer) Number of current phase (stage).
c   dimx  - (Integer) Dimension of state vector in this stage
c   dimu  - (Integer) Dimension of control vector in this stage
c   dimip - (Integer) Dimension of user integer model parameters
c           in this stage
c   dimrp - (Integer) Dimension of user real model parameters
c           in this stage
c   fazinf- (Double Precision) vector of phase information (see above).
c   t     - (Double Precision) independent variable (time)
c   x     - (Double Precision) state vector at time t
c   u     - (Double Precision) control vector at time t
c   ipar  - (Integer) model parameter vector
c   rpar  - (Double precision) model parameter vector
c
c OUTPUT:
c   dx    - (Double Precision) rhs value at time t
C
      implicit logical(a-h,j-z)
c
c .. Formal parameters ..
c
      integer phase, dimx, dimu, dimip, dimrp, ipar(*)
      double precision fazinf(2), t, x(*), u(*), dx(*)
      double precision rpar(*)
C
C .. RHS evaluation counter. An optional parameter.
      integer nfunc
      common /cfunc/ nfunc
      double precision m,g,Sref,a1,a2,a3,b1,c1,c2,c3,rho,CL,CD,d,l,
     +                 tmd,ldm
C
      nfunc = nfunc+1
C
      IF (dimu.LT.1) THEN
         call fmstop('From RHS: incorrect dimensions')
      ENDIF
      m    = 1005.0d0
      g    = 9.810d0
      Sref = 0.3376d0
      a1   = -1.9431d0
      a2   = -0.1499d0
      a3   = 0.2359d0
      b1   = 21.9d0
      c1   = 3.312d-9
      c2   = -1.142d-4
      c3   = 1.224d0
      rho  = (c1*x(4)+c2)*x(4)+c3
      CL   = b1*u(2)
      CD   = (a1*u(2)+a2)*u(2)+a3
      d    = 0.5*CD*rho*x(2)*x(2)*Sref
      l    = 0.5*CL*rho*x(2)*x(2)*Sref
      tmd  = (u(1) - d)/m
      ldm  = l/m

      dx(1) = (tmd*sin(u(2))+ldm*cos(u(2))-g*cos(x(1)))/x(2)
      dx(2) =  tmd*cos(u(2))-ldm*sin(u(2))-g*sin(x(1))
      dx(3) =  x(2)*cos(x(1))
      dx(4) =  x(2)*sin(x(1))
      RETURN
      END
* * *
IF ( dimx .LT. 2 ) THEN call fmstop(’dimension error in BCONST: dimx’) ELSE IF( phase .eq. 0 ) THEN
c c c c
Initial boundary constraints ---------------------------IF ( dimrp .ne. 2 ) THEN
COMPUTATIONAL OPTIMAL CONTROL: TOOLS AND PRACTICE
144
call fmstop(’dimension error in BCONST: dimrp’) ENDIF c bcon(1)= bcon(2)= bcon(3)= bcon(4)=
x(1) - 0.0d0 (x(2) - 272.d0)/100.0d0 x(3) - 0.0d0 x(4) - 30.0d0
ELSE IF( phase .eq. 3 ) THEN c c c c
Final boundary constraints -------------------------IF ( dimrp .ne. 2 ) THEN call fmstop(’dimension error in BCONST: dimrp’) ENDIF bcon(1)= bcon(2)= bcon(3)= bcon(4)=
*
x(1) + 1.57 (x(2) - 310.d0)/100.0d0 (x(3) - 10000.0d0)/10000.d0 (x(4) - 0.0d0)/100.d0
END IF END IF return end
5.3.3 Constraints, Subroutine fmpcon.f

This subroutine defines the state and control constraints along the trajectory. Both equality and inequality constraints must be defined in this subroutine, see equations (5.9)–(5.12) for the constraints.

      SUBROUTINE fmpcon ( phase, dimx, dimu, dimip, dimrp,
     +                    fazinf, t, x, u, udot, ipar, rpar,
     +                    dimpc, evalc, pcon )
cDEC$ ATTRIBUTES DLLEXPORT :: fmpcon
c INPUT:
c
c   phase - (Integer) Number of current phase (stage).
c   dimx  - (Integer) Dimension of state vector in this stage
c   dimu  - (Integer) Dimension of control vector in this stage
c   dimip - (Integer) Dimension of user integer model parameters
c           in this stage
c   dimrp - (Integer) Dimension of user real model parameters
c           in this stage
c   dimpc - (Integer) Dimension of total constraint vector for
c           this stage
c   fazinf- (Double Precision) vector of phase information (see above).
c   t     - (Double Precision) independent variable (time)
c   x     - (Double Precision) state vector at time t
c   u     - (Double Precision) control vector at time t
c   ipar  - (Integer) model parameter vector
c   rpar  - (Double precision) model parameter vector
c   evalc - (Integer) vector of dimension dimpc containing the
c           evaluation tags for the constraints
c
c OUTPUT:
c
c   pcon  - (Double Precision) Vector of length dimpc for the
c           constraint values
c
      implicit logical(a-h,j-z)
c
c .. Formal parameters ..
c
      integer phase, dimx, dimu, dimpc, dimip, dimrp, ipar(*), evalc(*)
      double precision t, x(dimx), u(*), udot(*), rpar(*)
      double precision pcon(*), fazinf(2)
      double precision m,g,Sref,a1,a2,a3,b1,c1,c2,c3,rho,CL,l,ldm

      m    = 1005.0d0
      g    = 9.810d0
      Sref = 0.3376d0
      a1   = -1.9431d0
      a2   = -0.1499d0
      a3   = 0.2359d0
      b1   = 21.9d0
      c1   = 3.312d-9
      c2   = -1.142d-4
      c3   = 1.224d0
      rho  = (c1*x(4)+c2)*x(4)+c3
      CL   = b1*u(2)
      l    = 0.5*CL*rho*x(2)*x(2)*Sref
      ldm  = l/m
c
c     1. Altitude constraint
c
      IF ( evalc(1) .eq. 1 ) THEN
         pcon(1) = (x(4) - rpar(1))/100.0d0
      ENDIF
c
c     2. Normal acceleration
c
      IF ( evalc(2) .eq. 1 ) THEN
         pcon(2) = (4.0d0 - ldm/g)*(ldm/g + 4.0d0)
      ENDIF
c
      RETURN
      END
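The factored form used for pcon(2) is simply the squared normal acceleration constraint (5.17) again, since

$$\left(4 - \frac{L}{mg}\right)\left(\frac{L}{mg} + 4\right) = 16 - \left(\frac{L}{mg}\right)^2,$$

where ldm/g in the code equals L/(mg).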
5.3.4 Objective Function, Subroutine fmpcst.f

This subroutine provides the cost function in the Lagrange form. The objective function of the terminal bunt problem, see equation (5.1), can be defined as:

      SUBROUTINE fmpcst ( phase, dimx, dimu, dimip, dimrp,
     +                    fazinf, t, x, u, udot, ipar, rpar, costd )
cDEC$ ATTRIBUTES DLLEXPORT :: fmpcst
c INPUT:
c
c   phase - (Integer) Number of current phase (stage).
c   dimx  - (Integer) Dimension of state vector in this stage
c   dimu  - (Integer) Dimension of control vector in this stage
c   dimip - (Integer) Dimension of user integer model parameters
c           in this stage
c   dimrp - (Integer) Dimension of user real model parameters
c           in this stage
c   fazinf- (Double Precision) vector of phase information (see above).
c   t     - (Double Precision) independent variable (time)
c   x     - (Double Precision) state vector at time t
c   u     - (Double Precision) control vector at time t
c   udot  - (Double Precision) derivative of control vector at time t
c   ipar  - (Integer) model parameter vector
c   rpar  - (Double precision) model parameter vector
c
c OUTPUT:
c   costd - (Double Precision) Lagrange term of the cost functional.
c           The value to return is the value of the integrand in
c           this stage (phase)
C
      implicit logical(a-h,j-z)
c
c .. Formal parameters ..
c
      integer phase, dimx, dimu, dimip, dimrp, ipar(*)
      double precision fazinf(2),t,x(*),u(*),udot(*),rpar(*),costd
c
c     minimum time: the Lagrange integrand is identically one
c
      costd = 1
      return
      end
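Setting the Lagrange integrand to one yields the flight time as the cost functional:

$$J = \int_{t_0}^{t_f} 1\, dt = t_f - t_0,$$

so minimising $J$ is exactly the minimum time objective of equation (5.1).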
5.4 BNDSCO Implementation

In this section, several aspects of BNDSCO are discussed in the context of the indirect method implementation of the minimum time problem. Additionally, the classical benchmark problem with a pure state constraint due to Bryson and Ho (1975) is presented in the Appendix to highlight other important details of the BNDSCO implementation not covered here. The discussion begins with a troubleshooting list in Section 5.4.1 for the BNDSCO implementation developed in the course of the present work, based on the BNDSCO manual and hard-won experience with the terminal bunt problem. With this aid in hand, the actual code for the minimum time formulation is given in Section 5.4.2.
5.4.1 Possible Sources of Error

There are three broad categories of error:

• Analytical: wrong formulae and/or wrong data.
• Numerical: wrong accuracy, singular data (division by 0).
• Coding: wrong FORTRAN implementation of formulae/data.

Analytical Errors: A Discussion

Since the ultimate numerical tool for solving the underlying TPBVP is the BNDSCO package, it is prudent to formulate the optimal control problem, and apply Pontryagin's minimum principle, in the way summarised in the BNDSCO manual, see Oberle and Grimm (1989). However, one should remember that the DIRCOL and/or NUDOCCCS packages are also needed, in general, to generate an initial guess of the solution (especially the co-states). Thus, the formulae must be consistent for both packages. With the above in mind, there follows a list of possible analytical errors, produced by a close study of the BNDSCO manual and experience with the terminal bunt problem; the relevant pages of the manual are given.

1. Bolza → Mayer; p. 10.
2. Boundary conditions for co-states λ, appropriate to the problem (unconstrained, free-time, state constraint only, etc.); pp. 12–16, 21–22, 24–25.
Figure 5.5 The structure of BNDSCO, after Oberle and Grimm (1989, p. 38). (Boxes: BNDSCO; NEWTON; BROYDN; JACOBI; TRAJEC; SUBST; HOUSE; EXIT; METHOD; and the user-supplied subroutines R and F.)
3. Jump points for co-state λ; pp. 22, 24–25.
4. Additional checks: consistency of the Hamiltonian, etc.; p. 50.
Numerical Errors: A Discussion

BNDSCO is a reliable solver of the MPBVP with discontinuities, specially written for optimal control problems. However, it has the weakness of all shooting methods in that it has a narrow domain of convergence. In other words, if the data are inaccurate or finite precision arithmetic errors accumulate, it will fail. Hence, the following list of possible sources of error reflects these concerns.

1. BNDSCO requires that time instants (solution nodes) are different from switching points; p. 31.
2. Solution nodes need not be equidistant: concentrate them where rapid changes are expected; p. 47.
3. Accuracy of integration: play with the parameters
   • Highly accurate solution of the IVP required; p. 42.
   • Tolerance in integration TOL; p. 42.
   • Maximum number of iterations ITMAX; p. 40.
4. Accuracy of Newton's method (inside BNDSCO): play with the EPMACH parameter; p. 49.
5. Scaling of boundary conditions (stored in variable W of subroutine R): all components should be of the same order; pp. 52–53.

See also Figure 5.5 for details of the BNDSCO structure.

Coding Errors: A Discussion

The first check in this category is obvious, i.e. whether the properly derived formulae and data (constants and initial guesses) were coded correctly into FORTRAN. Assuming that this indeed has been done, the list below addresses more subtle possibilities.

1. The whole of BNDSCO is written in double precision, so all variables, constants and data must be in that format, using the intrinsic DBLE, when necessary.
2. The all-important initial guesses are not passed through a COMMON block, but via the array PAR. The entries of PAR and their subsequent use should be checked carefully; pp. 40, 49.
3. Integer parameters J and L of subroutine F must be consistent with each other; p. 48.
5.4.2 BNDSCO Code

Main Program

C ********************************************************************
C     Minimum time of terminal bunt manoeuvre. The normal acceleration
C     constraint is active during diving-arc
C ********************************************************************
C
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (MMAX=150,MSMAX=140,MMS=170,NMS=170,NP=96,
     *     NDW=NMS*(120+MMAX*(120+6*NMS)),NDIW=115*NMS)
      DIMENSION X(MMS),XS(MSMAX),Y(NMS,MMS),WORK(NDW)
      DIMENSION TI(NP),GI(NP),VI(NP),PI(NP),AI(NP),
     *     TC(NP),CG(NP),CV(NP),CP(NP),CA(NP)
      INTEGER JS(MMS,MSMAX),IWORK(NDIW)
      EXTERNAL F, R, DIFSYB
      COMMON /OUT/ DALP
      COMMON /ITEIL/ ITEIL
      COMMON /PARA1/ B1,B2,B3,E1,E2,E3,D1,AM,T,GR,SREF,TM,
     +     HGM,HV,HXP,HH,ALP
c
C     Initial guesses
c
      OPEN(3,FILE='test1.dat',STATUS='old')
      OPEN(5,FILE='gnuad310.m',STATUS='old')
C
C     Result file's name
      OPEN(6,FILE='nmin.txt',STATUS='UNKNOWN')
      OPEN(7,FILE='nminp.dat',STATUS='UNKNOWN')
C
C     Reading the initial guesses
      DO I=1,NP
         READ(3,*)TI(I),GI(I),VI(I),PI(I),AI(I),CG(I),CV(I),CP(I),CA(I)
      enddo
C----------------------------------------------------------------
C     Parameters, see Table 1.1 on page 6.
C----------------------------------------------------------------
      B1   = 3.312D-9
      B2   = -1.142D-4
      B3   = 1.224D0
      E1   = -1.9431D0
      E2   = -0.1499D0
      E3   = 0.2359D0
      D1   = 21.9D0
      AM   = 1005.D0
      SREF = 0.3376D0
      T    = 6000.0D0
      GR   = 9.8D0
      TM   = 1.0D0/(2.0D0*AM)
C
      N     = 9
      M     = NP
      TOL   = 1.D-06
      KS    = 1
      MS    = 1
      XS(1) = 0.786818979D0
      NFILE = 6
      WRITE(6,1000)
C
C --- Initial trajectory
C
      DO 100 K=1,M
         X(K)   = FLOAT(K-1)/FLOAT(M-1)
         Y(1,K) = GI(K)
         Y(2,K) = VI(K)
         Y(3,K) = PI(K)
         Y(4,K) = AI(K)
         Y(5,K) = CG(K)
         Y(6,K) = CV(K)
         Y(7,K) = CP(K)
         Y(8,K) = CA(K)
         Y(9,K) = 40.9D0
 100  CONTINUE
C
      KP    = 0
      ITMAX = 20
      CALL BNDSCO(F,R,DIFSYB,X,XS,Y,WORK,IWORK,JS,N,M,MS,
     *     KS,TOL,ITMAX,KP,MMAX,MSMAX,MMS,NMS,NDW,NDIW,NFILE)
      IF(KP.LT.0) GOTO 900
C
      KP    = 0
      IFEIN = 5
      NPLOT = 7
      IPLOT = 1
      ICASE = 1
      CALL AWP(DIFSYB,F,X,XS,Y,N,M,MS,
     *     JS,MMS,NMS,IFEIN,IPLOT,KP,NFILE,NPLOT,ICASE)
C
 900  CONTINUE
 1000 FORMAT(//' TERMINAL BUNT PROBLEM:')
      STOP
      END
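The free final time is handled by the standard transformation to the normalised time $\tau = t/t_f \in [0,1]$: $t_f$ is appended as the ninth component of Y, with

$$\frac{dx}{d\tau} = t_f\, f(x, u), \qquad \frac{dt_f}{d\tau} = 0,$$

which is why subroutine F below multiplies every right-hand side by TF = Y(9) and sets DY(9) = 0, and why the main program supplies the initial guess Y(9,K) = 40.9 for the final time.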
Subroutine F

The equations of motion for each subarc are defined in this subroutine.

C********************************************************************
      SUBROUTINE F(X,Y,DY,J,L,JS,MMS)
C********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION Y(*),DY(*),AROOT(2)
      INTEGER JS(MMS,*)
      COMMON /OUT/ DALP
      COMMON /ITEIL/ ITEIL
      COMMON /PARA1/ B1,B2,B3,E1,E2,E3,D1,AM,T,GR,SREF,TM,
     +     HGM,HV,HXP,HH,ALP
C     ---------------------------------
      GM = Y(1)
      V  = Y(2)
      XP = Y(3)
      H  = Y(4)
      GL = Y(5)
      VL = Y(6)
      XL = Y(7)
      HL = Y(8)
      TF = Y(9)
*-----------------------------------------------------
C     Diving-arc
*-----------------------------------------------------
      IF (JS(J,1).GT.0) THEN
C---------------------------------
         V2   = V*V
         RHO  = B1*H*H+B2*H+B3
         DRHO = 2*B1*H+B2
         ALNN = -8*AM*GR/(21.9*RHO*V2*SREF)
         dalp = alnn
C *********************************************
C        determine mu_1
C *********************************************
         A     = ((E1*ALNN+E2)*ALNN+E3)*0.5*RHO*V2*SREF
         AA    = (E1*2*ALNN+E2)*0.5*RHO*V2*SREF
         AN    = D1*ALNN*0.5*RHO*V2*SREF
         ANA   = D1*0.5*RHO*V2*SREF
         CGU   = DCOS(ALNN)
         SGU   = DSIN(ALNN)
         COMP1 = GL*CGU/(V*AM)-VL*SGU/AM
         COMP2 = GL*SGU/(V*AM)+VL*CGU/AM
         FAU   = (T-A+ANA)*COMP1 - (AA+AN)*COMP2
         CONC  = -(21.9*RHO*V2*SREF)/(2*AM*GR)
         CMU1  = -FAU/CONC
C *********************************************
C        end determine mu_1
C *********************************************
         CG = DCOS(GM)
         SG = DSIN(GM)
*-----------------------------------------------------
*        DYNAMIC EQUATIONS
*-----------------------------------------------------
         DY(1) = TF*(((T-A)*SGU+AN*CGU)/AM-GR*CG)/V
         DY(2) = TF*(((T-A)*CGU-AN*SGU)/AM-GR*SG)
         DY(3) = TF*V*CG
         DY(4) = TF*V*SG
         DY(5) = -TF*(GL*GR*SG/V-VL*GR*CG-XL*V*SG+HL*V*CG)
*-----------------------------------------------------
         VLC1 = GL*(-T*SA/(V2*AM)-(ACOEF*SA-D1*ALP*CA)*RHO*SREF*TM
     +          +GR*CG/V2)
         VLC2 = -VL*(ACOEF*CA+D1*ALP*SA)*RHO*V*SREF/AM
         VLC3 = -CMU1*(21.9*ALNN*RHO*V*SREF)/(AM*GR)
*-----------------------------------------------------
         DY(6) = -TF*(VLC1+VLC2+XL*CG+HL*SG+VLC3)
         DY(7) = 0.D0
*-----------------------------------------------------
         DRHO = 2*H*B1+B2
         HLC1 = GL*(-ACOEF*SA+D1*ALP*CA)*V*SREF*DRHO*TM
         HLC2 = -VL*(ACOEF*CA+D1*ALP*SA)*V2*SREF*DRHO*TM
         HLC3 = -CMU1*(21.9*ALNN*0.5*DRHO*V2*SREF)/(AM*GR)
         DY(8) = -TF*(HLC1+HLC2+HLC3)
         DY(9) = 0.D0
*-----------------------------------------------------
      Else
C------------------------------------------------------
C        UNCONSTRAINED PART
C------------------------------------------------------
         RHO  = B1*H*H+B2*H+B3
         DRHO = 2*B1*H+B2
         GALP = 0.02
         V2   = V*V
c
c----- Newton Raphson ----------------
c
         DO 12 I=1,100
            A     = ((E1*GALP+E2)*GALP+E3)*0.5*RHO*V2*SREF
            AA    = (E1*2*GALP+E2)*0.5*RHO*V2*SREF
            AN    = D1*GALP*0.5*RHO*V2*SREF
            ANA   = D1*0.5*RHO*V2*SREF
            CGU   = DCOS(GALP)
            SGU   = DSIN(GALP)
            COMP1 = GL*CGU/(V*AM)-VL*SGU/AM
            COMP2 = GL*SGU/(V*AM)+VL*CGU/AM
            FA    = (T-A+ANA)*COMP1 - (AA+AN)*COMP2
**********************************************
            AAPANX = (E1*2+D1)*0.5*RHO*V2*SREF
*******************************************************
***         Derivative of FA wrt alpha
*******************************************************
            FAP  = -AA*COMP1 - (AA+AN)*COMP1-(T-A+ANA)*COMP2
     +             -AAPANX*COMP2
            XRN  = GALP-FA/FAP
            ceck = abs(XRN-GALP)
            toln = 1.0E-9
            if (ceck.le.toln) goto 10
            GALP = XRN
 12      enddo
 10      ALP   = GALP
         DALP  = ALP
         CA    = DCOS(ALP)
         SA    = DSIN(ALP)
         ACOEF = (E1*ALP+E2)*ALP+E3
         ALP2  = ALP*ALP
C     -----------------------------------
         AEQ = ACOEF*0.5*RHO*V2*SREF
         ANO = D1*ALP*0.5*RHO*V2*SREF
         CG  = DCOS(GM)
         SG  = DSIN(GM)
*-----------------------------------------------------
*        DYNAMIC EQUATIONS
*-----------------------------------------------------
         DY(1) = TF*(((T-AEQ)*SA+ANO*CA)/AM-GR*CG)/V
         DY(2) = TF*(((T-AEQ)*CA-ANO*SA)/AM-GR*SG)
         DY(3) = TF*V*CG
         DY(4) = TF*V*SG
         DY(5) = -TF*(GL*GR*SG/V-VL*GR*CG-XL*V*SG+HL*V*CG)
*-----------------------------------------------------
         VLC1 = GL*(-T*SA/(V2*AM)-(ACOEF*SA-D1*ALP*CA)*RHO*SREF*TM
     +          +GR*CG/V2)
         VLC2 = -VL*(ACOEF*CA+D1*ALP*SA)*RHO*V*SREF/AM
*-----------------------------------------------------
         DY(6) = -TF*(VLC1+VLC2+XL*CG+HL*SG)
         DY(7) = 0.D0
*-----------------------------------------------------
         DRHO = 2*H*B1+B2
         HLC1 = GL*(-ACOEF*SA+D1*ALP*CA)*V*SREF*DRHO*TM
         HLC2 = -VL*(ACOEF*CA+D1*ALP*SA)*V2*SREF*DRHO*TM
         DY(8) = -TF*(HLC1+HLC2)
         DY(9) = 0.D0
      endif
*-----------------------------------------------------
      RETURN
      END
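On the diving arc the normal acceleration constraint is active at its lower bound, and the angle of attack follows directly from it: with $L = B_1\alpha\,\tfrac{1}{2}\rho V^2 S_{ref}$ and $L/(mg) = -4$,

$$\alpha = \frac{-8\,mg}{B_1\,\rho V^2 S_{ref}},$$

which is the expression ALNN = -8*AM*GR/(21.9*RHO*V2*SREF) in the listing.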
Subroutine R

Subroutine R defines the boundary, jump and switching conditions.

C********************************************************************
      SUBROUTINE R(YA,YB,ZZ,W,NYA,NSK,J,L,LS,JS,MMS)
C********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION YA(*),YB(*),ZZ(*),W(*)
      INTEGER NYA(*),NSK(*)
      INTEGER JS(MMS,*)
      COMMON /ITEIL/ ITEIL
      COMMON /OUT/ DALP
      COMMON /PARA1/ B1,B2,B3,E1,E2,E3,D1,AM,T,GR,SREF,TM,
     +     HGM,HV,HXP,HH,ALP
C
      GTF = -1.57D0
      V0  = 272.0D0
      VTF = 310.0D0
      H0  = 30.0D0
      XF  = 10000.0D0
      HF  = 0.0D0
C
C---- Boundary conditions
C
      W(1) = YA(1)
      W(2) = YA(2)/V0 - 1.D0
      W(3) = YA(3)
      W(4) = YA(4)/H0 - 1.D0
      W(5) = YB(1)/GTF - 1.D0
      W(6) = YB(2)/VTF - 1.D0
      W(7) = YB(3)/XF - 1.D0
      W(8) = YB(4)
C
      NYA(1) = 1
      NYA(2) = 2
      NYA(3) = 3
      NYA(4) = 4
C*****************************************
C     Hamiltonian at final time
C*****************************************
      YB2  = YB(2)*YB(2)
      ROI  = B1*YB(4)*YB(4)+B2*YB(4)+B3
      DROI = 2*B1*YB(4)+B2
      ALPI = -8*AM*GR/(21.9*ROI*YB2*SREF)
C *********************************************
C     determine mu_1
C *********************************************
      A     = ((E1*ALPI+E2)*ALPI+E3)*0.5*ROI*YB2*SREF
      AA    = (E1*2*ALPI+E2)*0.5*ROI*YB2*SREF
      AN    = D1*ALPI*0.5*ROI*YB2*SREF
      ANA   = D1*0.5*ROI*YB2*SREF
      CGU   = DCOS(ALPI)
      SGU   = DSIN(ALPI)
      COMP1 = YB(5)*CGU/(YB(2)*AM)-YB(6)*SGU/AM
      COMP2 = YB(5)*SGU/(YB(2)*AM)+YB(6)*CGU/AM
      FAU   = (T-A+ANA)*COMP1 - (AA+AN)*COMP2
      CONC  = -(21.9*ROI*YB2*SREF)/(2*AM*GR)
      CMU1  = -FAU/CONC
C *********************************************
C     end determine mu_1
C *********************************************
      CAI    = DCOS(ALPI)
      SAI    = DSIN(ALPI)
      ACOEFI = (E1*ALPI+E2)*ALPI+E3
C
      ATFRI = ACOEFI*0.5*ROI*YB2*SREF
      ANTFI = D1*ALPI*0.5*ROI*YB2*SREF
      SGTFI = DSIN(YB(1))
      CGTFI = DCOS(YB(1))
      HGMI  = (((T-ATFRI)*SAI+ANTFI*CAI)/AM-GR*CGTFI)/YB(2)
      HVRI  = ((T-ATFRI)*CAI-ANTFI*SAI)/AM-GR*SGTFI
      HXPRI = YB(2)*CGTFI
      HHRI  = YB(2)*SGTFI
      COMU  = -CMU1*((21.9*ALPI*0.5*DROI*YB2*SREF)/(AM*GR)+4.D0)
      CONA  = YB(5)*HGMI+YB(6)*HVRI+YB(7)*HXPRI+YB(8)*HHRI+COMU+1
C
C************************************************
C     Determine alpha using Newton at the initial time
C************************************************
      YA2  = YA(2)*YA(2)
      RHO  = B1*YA(4)*YA(4)+B2*YA(4)+B3
      DRHO = 2*B1*YA(4)+B2
      GALP = 0.02
      DO 12 I=1,100
         A     = (-0.328*GALP*GALP-0.0253*GALP+0.03982)*RHO*YA2
         AA    = (-0.328*2*GALP-0.0253)*RHO*YA2
         AN    = 3.69672*GALP*RHO*YA2
         ANA   = 3.69672*RHO*YA2
         CGU   = DCOS(GALP)
         SGU   = DSIN(GALP)
         COMP1 = YA(5)*CGU/(YA(2)*AM)-YA(6)*SGU/AM
         COMP2 = YA(5)*SGU/(YA(2)*AM)+YA(6)*CGU/AM
         FA    = (T-A+ANA)*COMP1 - (AA+AN)*COMP2
***************************************************************************
         AX     = (-0.656*GALP-0.0253)*RHO*YA2
         AAPANX = 3.04072*RHO*YA2
***************************************************************************
***      derivative of FA wrt alpha
***************************************************************************
         FAP  = -AX*COMP1 - (AA+AN)*COMP1-(T-A+ANA)*COMP2-AAPANX*COMP2
         XRN  = GALP-FA/FAP
         ceck = abs(XRN-GALP)
         toln = 1.0E-9
         if (ceck.le.toln) goto 10
         GALP = XRN
 12   enddo
 10   ALPB = GALP
C****************************************
c     Hamiltonian at initial time
C****************************************
      CAR    = DCOS(ALPB)
      SAR    = DSIN(ALPB)
      ACOEFR = (E1*ALPB+E2)*ALPB+E3
      ATFR   = ACOEFR*0.5*RHO*YA(2)*YA(2)*SREF
      ANTFR  = D1*ALP*0.5*RHO*YA(2)*YA(2)*SREF
      SGTFR  = DSIN(YA(1))
      CGTFR  = DCOS(YA(1))
      HGMR   = (((T-ATFR)*SAR+ANTFR*CAR)/AM-GR*CGTFR)/YA(2)
      HVR    = ((T-ATFR)*CAR-ANTFR*SAR)/AM-GR*SGTFR
      HXPR   = YA(2)*CGTFR
      HHR    = YA(2)*SGTFR
      CONB   = YA(5)*HGMR+YA(6)*HVR+YA(7)*HXPR+YA(8)*HHR+1
C
      W(9) = CONA-CONB
C**************************************************************
      ZZ2   = ZZ(2)*ZZ(2)
      ZRHO  = B1*ZZ(4)*ZZ(4)+B2*ZZ(4)+B3
      ALPI  = -8*AM*GR/(21.9*ZRHO*ZZ2*SREF)
      W(10) = -(21.9*ALPI*0.5*ZRHO*ZZ2*SREF)/(AM*GR)-4.0D0
      NSK(10) = 1
C
      RETURN
      END
5.5 User Experience

This section summarises some of our user experience for all computational optimal control software packages which worked (produced convergent solutions) for the case study considered in this book, see Table 5.2 below. There is no overall winner in Table 5.2, because the notion of the 'winner' in the context of software for challenging optimal control problems is not obvious: the main scientific aim is to get an accurate solution, while the main engineering aim is to do it quickly. For a practical, realistic (and thus demanding) problem, satisfying both of these aims is seldom possible and it is prudent to use more than one package to cross-verify the results, anyway.

In principle, the direct method packages—all but BNDSCO in Table 5.2—can be used as stand-alone pieces of software, but the accuracy of their results will usually be inferior to the accuracy of BNDSCO. However, in order for its solutions to converge, BNDSCO requires good-quality initial guesses for the states and also co-states, which can be provided by running DIRCOL and/or NUDOCCCS first, so BNDSCO is not a stand-alone computational optimal control software package. Moreover, BNDSCO is not easy to learn and its documentation is sometimes cryptic. Despite the lack of user-friendliness, BNDSCO offers an important reward: its solutions are very accurate and trustworthy, and no other package can match its solution quality, especially for problems with discontinuities.

DIRCOL and NUDOCCCS are reasonably easy to use, but are command-line FORTRAN packages—they have no GUIs. They, like BNDSCO, are academic (as opposed to commercial) software packages of professional quality. From our experience, it seems that a good, practical way of attacking a challenging optimal control problem is to start with DIRCOL and NUDOCCCS to get a feel for the problem relatively quickly. If there are doubts about the quality of the initial DIRCOL or NUDOCCCS solutions, then it may be necessary to progress to BNDSCO.
PROMIS and SOCS are commercial FORTRAN packages, but when used under the GESOP environment they are much more user-friendly than academic software like BNDSCO, DIRCOL and NUDOCCCS. Indeed, GESOP has a menu-driven GUI and also good documentation; training in GESOP use can also be included in the licence fee. The solvers available in GESOP, especially PROMIS and SOCS, are solid implementations of the direct method, but no co-states are produced, so these solvers cannot be used to provide an initial guess for BNDSCO. A downside of the neat GESOP packaging is that the user cannot “peek under the hood”, i.e. to analyse in depth the sources of errors. Such error analysis may be important when dealing with challenging problems for which solutions often converge only after judicious manipulation of parameters. Table 5.3 gives basic details of the computational optimal control software packages used in this book in terms of their licensing, authorship and compatibility with popular operating systems.
Table 5.2 User experience with computational optimal control software packages: each package (BNDSCO, DIRCOL, NUDOCCCS, GESOP/PROMIS and GESOP/SOCS) is rated for ease of install, ease of use, running time, solution quality, learning time, documentation and interface, on a five-point scale ranging from very good to very bad.
Table 5.3 Details of computational optimal control software packages.

Software       Licence      Authorship/availability                Remarks
BNDSCO         academic     Prof. Hans Joachim Oberle              works under both Windows and Linux
                            http://www.math.uni-hamburg.de
DIRCOL         academic     Prof. Oskar von Stryk                  (1) needs SNOPT
                            http://www.sim.tu-darmstadt.de             http://www.sbsi-sol-optimize.com
                                                                   (2) works under Linux
NUDOCCCS       academic     Prof. Christof Büskens                 (1) needs NAG Library www.nag.co.uk
                            http://www.math.uni-bremen.de          (2) works under Linux
GESOP/PROMIS   commercial   Astos Solutions                        works under Windows
                            http://astos.de/
GESOP/SOCS     commercial   Astos Solutions                        (1) works under Windows
                            http://astos.de/                       (2) SOCS is a Boeing FORTRAN package
                            http://www.boeing.com/phantom/socs/
6

Conclusions and Recommendations

In this book we analysed a realistic optimal control case study of a mission-critical application, the terminal bunt manoeuvre optimisation of a generic cruise missile, and tried to share our user experience gained in the process of solving the bunt manoeuvre problem. In this chapter we present our pragmatic conclusions based on our experience with the tools and practice of computational optimal control. We also suggest a few ways of going beyond the approaches described in the book. Overall technical conclusions are presented in Section 6.1, while possible extensions of the approaches explored in this book are given in Sections 6.2 and 6.3. Final comments are given in Section 6.4.
6.1 Three-stage Manual Hybrid Approach

In this book a combination of the direct and indirect approaches (and the relevant codes) was used, resulting, in effect, in a hybrid approach. The main direct solver, DIRCOL, was used to discern the solution structure, including characteristic subarcs, constraints’ activation and switching times. Whenever possible, DIRCOL results were compared with those of other direct solvers, namely NUDOCCCS, PROMIS and SOCS. The DIRCOL and NUDOCCCS codes produce initial guesses for the co-state, an essential feature to enable subsequent use of the BNDSCO code for solving the relevant two-point boundary value problem (TPBVP). The hybridisation was done manually, i.e. DIRCOL, NUDOCCCS, PROMIS and SOCS were run first, their results analysed to help formulate an appropriate TPBVP, and then the results were fed to BNDSCO as an initial guess (with the co-states’ guess from DIRCOL or NUDOCCCS).

There are two main reasons for opting for the manual hybrid approach. Firstly, the three-stage approach:

• direct solution (NLP via DIRCOL/NUDOCCCS/PROMIS/SOCS)
• analysis (optimal control theory, TPBVP formulation)
• indirect solution (TPBVP solution via BNDSCO)

offers valuable insights into the problem, its solution structure, the role of constraints and boundary conditions. The focus of the case study considered in this book was trajectory shaping, i.e. not just computing a bunt manoeuvre, but exploring a family of terminal bunt problems. Thus, insights into the influence of constraints and boundary conditions on the solution structure (e.g. the number of switching points, the number of active constraints, the duration of their activation) are of significant operational and engineering value.

Secondly, trajectory shaping of the bunt manoeuvre naturally leads to a pure state constraint formulation, a difficult type of optimal control problem. The arising difficulties can be handled—if not always fully resolved—due to the gradual progression of the three-stage, manual hybridisation approach.
6.2 Generating an Initial Guess: Homotopy

The main difficulty in the indirect approach is finding a converging initial guess for the state and co-state variables. In some problems, inserting new shooting points might help, but for complex problems this strategy may not work. However, continuation, or homotopy, methods can be employed to overcome these difficulties, see Allgower and Georg (1990). The homotopy method solves the original problem by solving a “family” of problems: the original problem is embedded into a family of problems characterised by a parameter. The choice of this parameter is the crucial aspect of the method. If the selected parameter is physically appropriate and mathematically convenient, then the family of problems will include the original problem. The parameter is then used to find the solution of the original problem by gradually correcting the parameter until the required solution is achieved. Progress is made by using the previous solution as an initial guess for the next problem, starting with a parameter value for which the corresponding problem is tractable.

For a highly constrained optimal control problem, Bulirsch (Stoer and Bulirsch 2002, chapter 7, p. 563) has warned that the parameter must be intrinsic to the problem, as opposed to an artificial parameter. If a simple or arbitrary parameter that has nothing to do with the problem is used, one may not succeed in solving the problem. A natural parameter which is related to the problem may give a good starting solution. In addition, one may need to use more than one natural parameter to find the solution of the original problem, see Steindl and Troger (2003). Ehtamo et al. (2000) proposed the final time as a natural parameter to solve the minimum time problem. They converted the minimum time optimal control problem into a sequence of terminal cost minimisations by fixing the terminal time. The starting point is a small value of the fixed final time; then, by using an appropriate search procedure, the final time of the original problem is gradually found. They applied these continuation methods to the minimum time flight of an aircraft. Bulirsch et al. (1991b) and Pesch (1991) proposed a combination of the Chebyshev and Bolza functionals¹ as the objective function to find the solution of the abort landing in windshear problems. They solved the unconstrained problem and then gradually activated the constraints until the original problem was solved.

¹Both functionals are originally derived from the minimax optimal control problem, therefore they are related naturally, see Bulirsch et al. (1991a, p. 4).
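The mechanics of warm-started continuation are easiest to see on a toy problem. The following sketch is our illustration only, not taken from any of the packages discussed in this book: it applies homotopy to the scalar root-finding family F(x, ε) = x³ − εx − 2 = 0, whose member ε = 0 has a root known in closed form, while ε = 1 plays the role of the original problem; each converged root seeds Newton’s method for the next value of ε.

      PROGRAM HOMOT
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
C     Root of the easy member F(x,0) = x**3 - 2 seeds the sweep
      X = 2.0D0**(1.0D0/3.0D0)
      NEPS = 10
      DO 20 K = 1, NEPS
         EPS = DBLE(K)/DBLE(NEPS)
C        Newton iteration, warm-started from the previous root
         DO 10 IT = 1, 50
            F  = X**3 - EPS*X - 2.0D0
            DF = 3.0D0*X**2 - EPS
            DX = F/DF
            X  = X - DX
            IF (DABS(DX) .LT. 1.0D-12) GO TO 15
   10    CONTINUE
   15    WRITE (*,'(A,F5.2,A,F16.12)') ' EPS = ', EPS, '   X = ', X
   20 CONTINUE
      END

In the optimal control setting, the scalar x is replaced by the full vector of unknowns of the multipoint boundary value problem (states, co-states, switching times), and the Newton step by a complete multiple shooting solve; the warm-starting pattern is identical.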
Three formulations² might be employed to attack the problem considered in this book. Firstly, consider the minimum altitude problem of Chapter 3. We can begin by solving the unconstrained problem, which means that we neglect most constraints, especially the difficult state constraint h ≥ hmin, and retain only a constraint on the thrust T. Thus, the missile immediately dives and “hits” the ground (or “goes underground”), followed by climbing and, finally, finished by diving to reach the target. The next step is finding the touch point (h = htouch) of minimum altitude (ḣ = 0, ḧ > 0) on the unconstrained trajectory from the previous step. This way, the difficult state constraint h ≥ hmin is put back into the problem formulation, albeit with a lower hmin than required, i.e. with hmin = htouch. This introduction of a touch point results in a computation similar to the unconstrained case, but a new switching point must be introduced. The following steps consist of gradually “lifting” the minimum altitude constraint from htouch, activating a boundary arc starting from a short boundary arc close to the touch point (h = htouch + Δh) and, finally, by increasing Δh the problem can be solved for the fully constrained formulation.

Another approach would be to retain all the constraints, but to shorten the distance between the launch point and the target. The down range is then squeezed so that the missile climbs immediately, which avoids activation of the minimum altitude constraint. The next step is to increase the down range gradually, activating the constraint at a point, a short arc, etc., until the original problem is solved. Effectively, this simply is homotopy on x(tf), one of the terminal conditions.

Finally, a new objective function can be introduced which is a combination of both cases, minimum altitude and minimum time. Consider the following new objective function

    J = (1 − ε) tf + (ε/κ) ∫₀^tf h dt                                  (6.1)

where κ has a big value and 0 ≤ ε ≤ 1. This approach starts by solving the minimum time problem, for which the minimum altitude constraint is not active (ε = 0). Then the original problem is solved by gradually increasing ε.

²We are grateful to Prof. Pesch for his advice on the homotopy approach to the terminal bunt problem.
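As a sketch of how the sweep on (6.1) might be organised in practice, consider the following driver skeleton. The routine SOLVOC is hypothetical, a stand-in for the package-specific set-up and solve calls into, say, DIRCOL or BNDSCO, which are too lengthy to reproduce; the point is only the warm-starting pattern, in which the array SOL carries each converged trajectory over as the initial guess for the next, slightly harder, value of ε.

      PROGRAM EPSWP
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (NSOL = 1000)
      DIMENSION SOL(NSOL)
C     Stage 0: pure minimum-time problem (EPS = 0), altitude
C     constraint inactive, solved from a crude initial guess
      EPS = 0.0D0
      CALL SOLVOC(EPS, SOL, NSOL)
C     Sweep EPS towards the minimum-altitude weighting, reusing
C     each converged solution as the next initial guess
      DO 10 K = 1, 20
         EPS = DBLE(K)/20.0D0
         CALL SOLVOC(EPS, SOL, NSOL)
   10 CONTINUE
      END
C
C     Placeholder only: in a real run this routine would set up and
C     solve the optimal control problem (6.1) for the given EPS,
C     overwriting SOL with the converged trajectory
      SUBROUTINE SOLVOC(EPS, SOL, NSOL)
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION SOL(NSOL)
      WRITE (*,'(A,F5.2)') ' solving blended problem for EPS = ', EPS
      RETURN
      END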
6.3 Pure State Constraint and Multi-objective Formulation

As explained in Chapter 1, a guidance strategy based on trajectory shaping lends itself to optimal control interpretation. In the context of a cruise missile required to perform the terminal bunt manoeuvre, the resulting optimal control problem leads naturally to pure state constraints. The presence of such constraints is a source of significant difficulties for the indirect method approach to solution, as explained in Section 2.2 and illustrated in Section 3.4. Indeed, even the direct method solver NUDOCCCS had convergence problems in that case, so it was not possible to use its approximate solution to compare with that of DIRCOL and to initialise BNDSCO. Theoretically, there are a few ways of dealing with the pure state constraint, see e.g. Section 3.4.2, but none of them offers a magic bullet and, besides, the theoretical results must be numerically practical, which is not always the case, see Maurer and Gillessen (1975, p. 111).
Thus, pure state constraints are best avoided, but can they be, given that they arise naturally for the terminal bunt problem? This leads to the issue of problem formulation. From the user’s (operational) point of view, it is natural to impose inequality constraints on state variables—it is intuitively clear and practically desirable to limit the missile’s altitude and speed. It is also transparent to have a simple performance index, e.g. penalise altitude or flight time only, for it is obvious what it means, and its value is meaningful for the user. Moreover, if a parametric study is conducted, say, by varying the terminal speed, the resulting changes in the performance index are easy to comprehend. From the analyst’s (computational) point of view, it is more desirable to have as few constraints as possible, particularly if they are path constraints and especially if a pure state constraint is involved. On the other hand, the performance index can be complex, because this can be handled easily.

How, then, can one practically reconcile the user’s preference for many constraints and a simple performance index with the analyst’s desire for the converse? The key observation is that, for a given problem, the user would benefit more from establishing bounds on the state variables rather than imposing them a priori. Indeed, for all the cases in this study the missile speed constraints are never activated, but must be included when the relevant TPBVP is formulated. Therefore, it would perhaps be more desirable to make finding bounds on the states a part of the optimisation process at the outset. How can this be done? Including states in the performance index in order to minimise them together with, say, flight time is not satisfactory, for the resulting combination of disparate variables produces a scalar measure of performance lacking transparency (what would be the units and meaning of such a performance index?). However, a vector performance measure preserves transparency by having separate entries for each variable of interest, thus allowing natural units and avoiding mixing of incommensurable variables.

The resulting multi-objective formulation (Levary (1986), Petrosyan (1994)) can be handled computationally and has at its core partial ordering of the vector performance index. The solution is expressed through a Pareto set which contains trade-off points only. For example, if there are two objectives to be minimised simultaneously, say, the flight time tf and the time-integrated square deviation from minimum altitude δ = ∫ (h − hmin)² dt, then the Pareto set will be composed of pairs (tf*, δ*) such that any further minimisation of tf* would increase (worsen) δ* and conversely. Analysing the pairs (tf*, δ*) would give the user insights into inherent trade-offs between the duration of the attack and the exposure to anti-air defences. Computationally, replacing the state constraint h ≥ hmin with the second objective will simplify problem solution considerably. If δ* solutions result in h < hmin subarcs, the integrand (h − hmin)² can be modified to include a suitable barrier function for penalising the h < hmin solutions more than the h ≥ hmin ones.
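The partial ordering at the core of the multi-objective formulation is easy to make concrete. The sketch below is our illustration only (the (tf, δ) values are made up, not computed from the bunt problem): it filters a batch of candidate solutions and prints the non-dominated ones, discarding a pair exactly when some other pair is at least as good in both objectives and strictly better in one.

      PROGRAM PARETO
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (NP = 6)
      DIMENSION TF(NP), DEL(NP)
      LOGICAL DOM
C     Hypothetical candidates: (flight time, altitude deviation)
      DATA TF  /38.0D0, 40.0D0, 42.0D0, 40.5D0, 45.0D0, 39.0D0/
      DATA DEL / 9.0D0,  5.0D0,  4.0D0,  6.0D0,  3.5D0,  7.0D0/
      DO 20 I = 1, NP
         DOM = .FALSE.
         DO 10 J = 1, NP
            IF (J .EQ. I) GO TO 10
C           J dominates I if it is no worse in both objectives
C           and strictly better in at least one
            IF (TF(J).LE.TF(I) .AND. DEL(J).LE.DEL(I) .AND.
     &          (TF(J).LT.TF(I) .OR. DEL(J).LT.DEL(I))) DOM = .TRUE.
   10    CONTINUE
         IF (.NOT. DOM) WRITE (*,'(A,F6.1,A,F6.1)')
     &      ' Pareto point: tf = ', TF(I), '   delta = ', DEL(I)
   20 CONTINUE
      END

In a real study each candidate pair would come from one converged run of a solver on a scalarised subproblem; the filter above then presents the user with the trade-off curve only.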
6.4 Final Remarks

Optimal control is a challenging subject and attacking a realistic optimal control problem is not something to be undertaken lightly. To have a reasonable chance of successfully solving such a problem—usually involving nonlinear dynamics and numerous constraints—a working knowledge of optimal control theory is needed, together with real appreciation of numerical implementation issues and, finally, some software experience, preferably with FORTRAN. Even if a suitably qualified person can be found, such a person will need ample time in order to produce results with confidence. We believe that for a problem of the kind considered in this book, an experienced professional may need several months of dedicated work, and a newcomer at least two years, if not more. It should be emphasised that it is not unfriendly software which puts such demands on time—it is the underlying complexity of the subject and the challenge of its practical application in a reliable way. Informed use of computational optimal control is emphatically not a plug-and-play exercise—considerable background knowledge and thoughtfully analysed experience must be acquired in order to have justified confidence in the results.

However, there is a large reward for studying computational optimal control, because realistic (and hence practically important) problems cannot otherwise be tackled in an optimal way. In real-life aircraft or spacecraft applications the best possible solutions are often essential either on cost grounds (e.g. fuel efficiency for an aircraft fleet) or mission criticality (typically in space or military applications). Moreover, the insights gained in deriving optimal solutions have independent value, because they may lead to design improvements for the platforms considered and their mission planning. Due to high payoffs in volume, or simply mission importance, aerospace applications provide good justification for investing the time and effort required for informed use of computational optimal control. The alternative of relying on sub-optimal approximations may mean losing competitive advantage or losing the mission itself. No pain, no gain.
Appendix
BNDSCO Benchmark Example

This BNDSCO benchmark example is taken from Bryson and Ho (1975, pp. 120–123) and is worked out in detail below as a complement to the implementation issues covered in Section 5.4. Consider the following state variable inequality constraint problem. Let x and v define the position and velocity of a particle in rectilinear motion. Find the control history of acceleration a(t) on 0 ≤ t ≤ 1 such that the objective function

    J = ∫₀¹ ½ a²(t) dt                                  (A.1)

is minimised subject to the following constraints:

• dynamic equation

    ẋ = v                                               (A.2)
    v̇ = a                                               (A.3)

• boundary conditions

    x(0) = x(1) = 0                                     (A.4)
    v(0) = −v(1) = 1                                    (A.5)

• state inequality constraint

    x(t) ≤ l    for 0 ≤ t ≤ 1.                          (A.6)

A.1 Analytic Solution
By introducing an additional state variable z, the problem can be transformed into a Mayer problem as follows:

    min J = z(1)                                        (A.7)

subject to the constraints
• dynamic equation

    ẋ = v                                               (A.8a)
    v̇ = a                                               (A.8b)
    ż = ½ a²                                            (A.8c)

• boundary conditions

    x(0) = x(1) = 0                                     (A.9)
    v(0) = −v(1) = 1                                    (A.10)
    z(0) = 0                                            (A.11)
and the state inequality constraint (A.6) remains the same. Following the BNDSCO manual notation, see Oberle and Grimm (1989), let y = (x, v, z), so that the Hamiltonian is

    H = λx v + λv a + ½ λz a².                          (A.12)

The co-state equations are given by

    λ̇ = −HyT = (0, −λx, 0)T.                            (A.13)

The stationary condition is given by

    ∂H/∂a = λv + λz a = 0   →   a = −λv/λz.             (A.14)

From the transversality condition we obtain

    λz(T) = ∂Φ/∂z(T) = 1,                               (A.15)

where Φ = z(1) is the Mayer cost (A.7).
Substituting equation (A.15) into equation (A.14) gives a = −λv. Consider now the case when the state inequality constraint S = x − l is active. The first and second total time derivatives of S yield

    S = x − l                                           (A.16)
    Ṡ = ẋ = v                                           (A.17)
    S̈ = v̇ = a.                                          (A.18)

Thus, the state inequality constraint is a second-order constraint. The solution can be a touch point or a constrained arc depending on the value of l. In this example, the constrained arc problem will be discussed. Based on the second-order state constraint (A.16), the Hamiltonian is given by

    H^svic = λx v + λv a + ½ λz a² + µa.                (A.19)

The stationary condition yields:
• unconstrained:

    { H^svic_a = 0,  µ = 0 }   ⇒   a = −λv/λz,   µ = 0              (A.20)

• constrained:

    { H^svic_a = 0,  S̈ = 0 }   ⇒   a = 0,   µ = −λv.                (A.21)
At the beginning of the constrained arc, we obtain

    N(x(t1)) = (x − l, v)T = 0.                         (A.22)

Based on equation (20) on page 21 of Oberle and Grimm (1989), the jump condition for the co-state at the entry point t1 is given by

    λ(t1+)T = λ(t1−)T − σT Ny = λ(t1−)T − [σ1, σ2, 0].  (A.23)

The Hamiltonian is continuous at both switching points:

    H(ti+) = H(ti−),   i = 1, 2.                        (A.24)
A.1.1 Unconstrained or Free Arc (l ≥ 1/4)

In this case the constraint x ≤ l is not active along the optimal trajectory. Based on equations (A.8) and (A.13) the differential equations can be given by

    ẋ = v          →  DY(1) = Y(2)                      (A.25)
    v̇ = a = −λv    →  DY(2) = −Y(4)                     (A.26)
    λ̇x = 0         →  DY(3) = 0.0D0                     (A.27)
    λ̇v = −λx       →  DY(4) = −Y(3).                    (A.28)
The above equations (A.25)–(A.28) correspond to equation (2.112a) and must be implemented in subroutine F. The initial and final conditions are given by

    x(0) = 0     →  W(1) = YA(1) − X0   →  NYA(1) = 1   (A.29)
    x(1) = 0     →  W(3) = YB(1) − XF                   (A.30)
    v(0) = 1     →  W(2) = YA(2) − V0   →  NYA(2) = 2   (A.31)
    v(1) = −1    →  W(4) = YB(2) − VF.                  (A.32)

Equations (A.29)–(A.32) correspond to equation (2.112c) and must be implemented in subroutine R, and the corresponding initial conditions have to be marked by defining NYA (see page 51 of Oberle and Grimm (1989) for more details).
The following is the notation based on equation (2.112). The state and co-state variables are (N = 4)

    y = (y1, y2, y3, y4),   x → y1,  v → y2,  λx → y3,  λv → y4.    (A.33)

The boundary conditions can be defined as

    r(y(0), y(1)) = ( x(0) − 0 → r1 → W(1),
                      v(0) − 1 → r2 → W(2),
                      x(1) − 0 → r3 → W(3),
                      v(1) + 1 → r4 → W(4) )                        (A.34)

where r1, r2, r3 and r4 are initial and final conditions.
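For reference, and as a check on the numerical output, this free-arc TPBVP can be solved by hand; the following back-substitution is our addition, not part of the original benchmark write-up. Since λx is constant and λ̇v = −λx, the control a = −λv is at most linear in time. Imposing the boundary conditions (A.29)–(A.32) on the integrated dynamics gives

    λx = 0,   λv = 2,   a(t) = −2,   v(t) = 1 − 2t,   x(t) = t(1 − t),

so the maximum of x is x(1/2) = 1/4, consistent with the constraint x ≤ l being inactive precisely for l ≥ 1/4, and the optimal cost is J = ∫₀¹ ½(−2)² dt = 2.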
A.1.2 Touch Point Case (1/6 ≤ l ≤ 1/4)

In this case the optimal trajectory for x touches the constraint at only one point. The differential equations are

    ẋ = v          →  DY(1) = Y(2)                      (A.35)
    v̇ = a = −λv    →  DY(2) = −Y(4)                     (A.36)
    λ̇x = 0         →  DY(3) = 0.0D0                     (A.37)
    λ̇v = −λx       →  DY(4) = −Y(3)                     (A.38)
    l̇0 = 0         →  DY(5) = 0.0D0.                    (A.39)
The boundary conditions are

    x(0) = 0     →  W(1) = YA(1) − X0   →  NYA(1) = 1   (A.40)
    x(1) = 0     →  W(3) = YB(1) − XF                   (A.41)
    v(0) = 1     →  W(2) = YA(2) − V0   →  NYA(2) = 2   (A.42)
    v(1) = −1    →  W(4) = YB(2) − VF.                  (A.43)
The switching and jump conditions at the touch point tb are

    x(tb) = l                →  W(5) = ZZ(1) − l   →  NSK(5) = 1    (A.44)
    v(tb) = 0                →  W(6) = ZZ(2)       →  NSK(6) = 1    (A.45)
    λx(tb+) = λx(tb−) − l0   →  ZZ(3) = ZZ(3) − ZZ(5).              (A.46)
Equations (A.35)–(A.39) correspond to equation (2.112a) and must be placed in subroutine F, while equations (A.40)–(A.43) and (A.44)–(A.46) are in subroutine R. Equations (A.40)–(A.43) correspond to equation (2.112c), while equations (A.44)–(A.45) correspond to equation (2.112d). The jump condition in equation (A.46) corresponds to equation (2.112b). In this case the user must prescribe NYA and NSK accordingly. The following is the notation based on equation (2.112). The state and co-state variables are (N = 5):
    y = (y1, ..., y5),   x → y1,  v → y2,  λx → y3,  λv → y4,  l0 → y5 → ξ1.   (A.47)

The boundary conditions can be defined as

    r(y(0), y(1)) = ( x(0) − 0  → r1 → W(1),
                      v(0) − 1  → r2 → W(2),
                      x(1) − 0  → r3 → W(3),
                      v(1) + 1  → r4 → W(4),
                      x(tb) − l → r5 → W(5),
                      v(tb) − 0 → r6 → W(6) )                                  (A.48)

where r1, r2, r3 and r4 are initial and final conditions while r5 and r6 are for the switching conditions at the touch point tb.
A.1.3 Constrained Arc Case (0 ≤ l ≤ 1/6)

The constraint x ≤ l is active in the optimal trajectory. The constrained arc is active on the state variable x for a finite time t1 ≤ t ≤ t2. The differential equations can be written as

    ẋ = v          →  DY(1) = Y(2)                      (A.49)
    v̇ = a = −λv    →  DY(2) = −Y(4)                     (A.50)
    λ̇x = 0         →  DY(3) = 0.0D0                     (A.51)
    λ̇v = −λx       →  DY(4) = −Y(3)                     (A.52)
    λ̇z = 0         →  DY(5) = 0.0D0                     (A.53)
    ż = ½ a²       →  DY(6) = 0.0D0.                    (A.54)
The above equations (A.49)–(A.53) correspond to equation (2.112a) and must be put into subroutine F. The initial and final conditions, based on equation (2.112c), can be given by

    x(0) = 0     →  W(1) = YA(1) − X0   →  NYA(1) = 1   (A.55)
    x(1) = 0     →  W(3) = YB(1) − XF                   (A.56)
    v(0) = 1     →  W(2) = YA(2) − V0   →  NYA(2) = 2   (A.57)
    v(1) = −1    →  W(4) = YB(2) − VF.                  (A.58)
The switching and jump conditions at t1 are

    x(t1) = l                 →  W(5) = ZZ(1) − l   →  NSK(5) = 1   (A.59)
    v(t1) = 0                 →  W(6) = ZZ(2)       →  NSK(6) = 1   (A.60)
    λv(t1−) = 0               →  W(7) = ZZ(4)       →  NSK(7) = 1   (A.61)
    λx(t1+) = λx(t1−) − σ1    →  ZZ(3) = ZZ(3) − ZZ(5)              (A.62)
    λv(t1+) = λv(t1−) − σ2    →  ZZ(4) = ZZ(4) − ZZ(6)              (A.63)

and the switching condition at t2 is

    λv(t2) = 0                →  W(8) = ZZ(4)       →  NSK(8) = 2.  (A.64)
Equations (A.59)–(A.61) and (A.64) correspond to equation (2.112d), while equations (A.62)–(A.63) correspond to equation (2.112b). Equations (A.55)–(A.64) must be placed in subroutine R. The initial conditions, switching point and jump conditions must be prescribed in NYA and NSK. The following is the notation based on equation (2.112). The state and co-state variables are (N = 6)

    y = (y1, ..., y6),   x → y1,  v → y2,  λx → y3,  λv → y4,  λz → y5 → ξ1,  z → y6 → ξ2.   (A.65)

The boundary conditions can be defined as

    r(y(0), y(1)) = ( x(0) − 0    → r1 → W(1),
                      v(0) − 1    → r2 → W(2),
                      x(1) − 0    → r3 → W(3),
                      v(1) + 1    → r4 → W(4),
                      x(t1) − l   → r5 → W(5),
                      v(t1) − 0   → r6 → W(6),
                      λv(t1−) − 0 → r7 → W(7),
                      λv(t2) − 0  → r8 → W(8) )                     (A.66)

where r1, r2, r3 and r4 are initial and final conditions while r5, r6, r7 and r8 are for the switching conditions at t1 and t2.
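Before turning to the code, it is worth recording the closed-form solution of this case, useful for checking the BNDSCO output; the formulas below are our summary of the classical result for this benchmark (cf. Bryson and Ho 1975) and can be verified by direct substitution into (A.49)–(A.64). For 0 < l ≤ 1/6 the optimal trajectory consists of three arcs, with entry and exit points t1 = 3l and t2 = 1 − 3l:

    0 ≤ t ≤ 3l:        a = −(2/(3l))(1 − t/(3l)),   v = (1 − t/(3l))²,   x = l[1 − (1 − t/(3l))³]
    3l ≤ t ≤ 1 − 3l:   a = 0,   v = 0,   x = l
    1 − 3l ≤ t ≤ 1:    the mirror image of the first arc under t → 1 − t, with the sign of v reversed,

and the optimal cost is J = 4/(9l).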
C This code can be downloaded from
C http://www.math.uni-hamburg.de/home/oberle
C
C******************************************
      SUBROUTINE F(X,Y,DY,J,L,JS,MMS)
C******************************************
C     Cubic Spline with a State Constraint
C
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION Y(*),DY(*)
      INTEGER JS(MMS,*)
      COMMON /ITEIL/ ITEIL
      COMMON /PARA/ X0, XF, V0, VF, XMAX, U, HAM
C**************************************************
C     see unconstrained case
C**************************************************
C
      U = -Y(4)
C
      DY(1) = Y(2)
      DY(2) = U
      DY(3) = 0.D0
      DY(4) = -Y(3)
C
      HAM = 0.5D0*U*U + Y(3)*Y(2) + Y(4)*U
C
      IF(ITEIL.EQ.1) RETURN
      IF(ITEIL.EQ.2) THEN
C**************************************************
C     see touch point case
C**************************************************
C
         DY(5) = 0.5D0*U*U
         RETURN
      ENDIF
C
      DY(5) = 0.D0
      IF(ITEIL.EQ.3) RETURN
      IF(ITEIL.EQ.4) THEN
C**************************************************
C     see constrained case
C**************************************************
         DY(6) = 0.5D0*U*U
         RETURN
      ENDIF
C
      IF(JS(J,1)*JS(J,2).LT.0) THEN
         DY(1) = 0.D0
         DY(2) = 0.D0
         DY(4) = 0.D0
         U = 0.D0
      ENDIF
C
      IF(ITEIL.EQ.6) DY(6) = 0.5D0*U*U
C -------------------------------------------------------------------
      RETURN
      END

      SUBROUTINE R(YA,YB,ZZ,W,NYA,NSK,J,L,LS,JS,MMS)
C
C     Cubic Spline with a State Constraint
C
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      DIMENSION YA(*),YB(*),ZZ(*),W(*)
      INTEGER NYA(*),NSK(*)
      INTEGER JS(MMS,*)
      COMMON /ITEIL/ ITEIL
      COMMON /PARA/ X0, XF, V0, VF, XMAX, U, HAM
C
C**************************************************
C     see unconstrained case
C**************************************************
      W(1) = YA(1) - X0
      W(2) = YA(2) - V0
      W(3) = YB(1) - XF
      W(4) = YB(2) - VF
C
      NYA(1) = 1
      NYA(2) = 2
C
      IF(ITEIL.EQ.1) RETURN
C
      IF(ITEIL.EQ.3) THEN
C**************************************************
C     see touch point case
C**************************************************
C
         W(5) = ZZ(1) - XMAX
         W(6) = ZZ(2)
         NSK(5) = 1
         NSK(6) = 1
         IF(LS.EQ.1) ZZ(3) = ZZ(3) - ZZ(5)
         RETURN
      ENDIF
C
C**************************************************
C     see constrained case
C**************************************************
      IF(ITEIL.EQ.5) THEN
         W(5) = ZZ(1) - XMAX
         W(6) = ZZ(2)
         W(7) = ZZ(4)
         NSK(7) = 1
         NSK(5) = 1
         NSK(6) = 1
         IF(LS.EQ.1) ZZ(3) = 0.D0
         IF(LS.EQ.2) ZZ(3) = -ZZ(5)
      ENDIF
C
C -------------------------------------------------------------------
      RETURN
      END
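A remark on the listing: the integer ITEIL, passed through COMMON, selects which of the boundary value problems the pair F/R realises. From the branch structure it appears that ITEIL = 1, 3 and 5 select the unconstrained, touch point and constrained arc formulations of Sections A.1.1–A.1.3 respectively, while the even values appear to augment the preceding case with the cost integral ż = ½a² as a trailing state; the JS(J,1)*JS(J,2) < 0 test near the end of F freezes x, v and λv and zeroes the control on the interior constrained subarc, in line with a = 0 of equation (A.21). This mapping is inferred from the code rather than stated explicitly in the listing.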
Bibliography

Allgower EL and Georg K 1990 Numerical Continuation Methods: An Introduction. Springer, New York.
Ascher UM, Mattheij RMM and Russell RD 1995 Numerical Solution of Boundary Value Problems for Ordinary Differential Equations. SIAM, Philadelphia.
Athans M and Falb PL 1966 Optimal Control: An Introduction to the Theory and Its Applications. McGraw-Hill, New York.
Bazaraa MS, Sherali HD and Shetty CM 1993 Nonlinear Programming: Theory and Algorithms 2nd edn. Wiley-Interscience, New York.
Bellman R 1957 Dynamic Programming. Princeton University Press, Princeton, NJ.
Benson D 2005 A Gauss Pseudospectral Transcription for Optimal Control. PhD thesis, Massachusetts Institute of Technology.
Berkmann P and Pesch HJ 1995 Abort landing in windshear: Optimal control problem with third-order state constraint and varied switching structure. Journal of Optimization Theory and Applications 85(1), 21–57.
Berkovitz LD 1974 Optimal Control Theory. Springer, New York.
Betts JT 1994a Issues in the direct transcription of optimal control problems to sparse nonlinear programs. In Computational Optimal Control (ed. Bulirsch R and Kraft D), vol. 115 of International Series of Numerical Mathematics. Birkhäuser, Basel, pp. 3–17.
Betts JT 1994b Trajectory optimization using sparse sequential quadratic programming. In Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods (ed. Bulirsch R, Kraft D, Stoer J and Well KH), vol. 111 of International Series of Numerical Mathematics. Birkhäuser, Basel, pp. 115–128.
Betts JT 1998 Survey of numerical methods for trajectory optimization. Journal of Guidance, Control, and Dynamics 21(2), 193–207.
Betts JT 2001 Practical Methods for Optimal Control Using Nonlinear Programming. SIAM, Philadelphia.
Betts JT and Huffman WP 1991 Trajectory optimization on a parallel processor. Journal of Guidance, Control, and Dynamics 14(2), 431–439.
Betts JT and Huffman WP 1992 Application of sparse nonlinear programming to trajectory optimization. Journal of Guidance, Control, and Dynamics 15(1), 198–206.
Betts JT and Huffman WP 1993 Path-constrained trajectory optimization using sparse sequential quadratic programming. Journal of Guidance, Control, and Dynamics 16(1), 59–68.
Betts JT and Huffman WP 2003 Large scale parameter estimation using sparse nonlinear programming methods. SIAM Journal on Optimization 14(1), 223–244.
Bock HG and Plitt KJ 1984 A multiple-shooting algorithm for direct solution of optimal control problems. Proceedings of the 9th IFAC World Congress, vol. IX, Budapest.
Breakwell JV 1959 The optimization of trajectories. Journal of the Society for Industrial and Applied Mathematics 7(2), 215–247.
Breakwell JV and Dixon JF 1975 Minimum-fuel rocket trajectories involving intermediate-thrust arcs. Journal of Optimization Theory and Applications 17, 465–479.
Breakwell JV, Speyer JL and Bryson AE 1963 Optimization and control of nonlinear systems using the second variation. SIAM Journal on Control 1(2), 193–223.
Brusch R 1974 A nonlinear programming approach to space shuttle trajectory optimization. Journal of Optimization Theory and Applications 13, 94–118.
Bryson AE 1996 Optimal control 1950 to 1985. IEEE Control Systems Magazine 16(3), 26–33.
Bryson AE and Denham WF 1962 A steepest ascent method for solving optimum programming problems. ASME Journal of Applied Mechanics 29, 247–257.
Bryson AE and Ho YC 1975 Applied Optimal Control: Optimization, Estimation, and Control revised printing. Hemisphere, New York.
Bryson AE, Denham WF and Dreyfus SE 1963 Optimal programming problems with inequality constraints 1: Necessary conditions for extremal solutions. AIAA Journal 1(11), 2544–2550.
Bulirsch R, Montrone F and Pesch HJ 1991a Abort landing in the presence of windshear as a minimax optimal control problem, Part 1: Necessary conditions. Journal of Optimization Theory and Applications 70(1), 1–23.
Bulirsch R, Montrone F and Pesch HJ 1991b Abort landing in the presence of windshear as a minimax optimal control problem, Part 2: Multiple shooting and homotopy. Journal of Optimization Theory and Applications 70(2), 223–254.
Büskens C 1996 Lösung optimaler Steuerprozesse. Lösung adjungierter Variablen. Automatische Gitterpunktsanpassung, NUDOCCCS. Technical report, Westfälische Wilhelms-Universität Münster.
Büskens C and Maurer H 2000 SQP-methods for solving optimal control problems with control and state constraints: Adjoint variables, sensitivity analysis and real-time control. Journal of Computational and Applied Mathematics 120(1–2), 85–108.
Chudej K, Büskens C and Graf T 2001 Solution of a hard flight path optimization problem by different optimization codes. In High Performance Scientific and Engineering Computing (ed. Breuer M, Durst F and Zenger C). Springer, Berlin, pp. 289–296.
Darling D 2002 The Complete Book of Spaceflight: From Apollo 1 to Zero Gravity. John Wiley & Sons, Inc., New York.
Deuflhard P and Bader G 1983 Multiple shooting techniques revisited. In Numerical Treatment of Inverse Problems in Differential and Integral Equations (ed. Deuflhard P and Hairer E), vol. 2 of Progress in Scientific Computing. Birkhäuser, Boston, pp. 74–94.
Deuflhard P, Pesch HJ and Rentrop P 1976 A modified continuation method for the numerical solution of nonlinear two-point boundary value problems by shooting techniques. Numerische Mathematik 26, 327–343.
Dickmanns ED and Well KH 1974 Approximate solution of optimal control problems using third order Hermite polynomial functions. Proceedings of the IFIP Technical Conference. Springer, Berlin, pp. 158–166.
Ehtamo H, Raivio T and Hämäläinen RP 2000 A continuation method for minimum time problems. Technical report, Systems Analysis Laboratory, Helsinki University of Technology.
Elnagar GN 1997 State-control spectral Chebyshev parameterization for linearly constrained quadratic optimal control problems. Journal of Computational and Applied Mathematics 79(1), 19–40.
Elnagar GN and Kazemi MA 1998 Pseudospectral Chebyshev optimal control of constrained nonlinear dynamical systems. Computational Optimization and Applications 11(2), 195–217.
Elnagar GN, Kazemi MA and Razzaghi M 1995 The pseudospectral Legendre method for discretizing optimal control problems. IEEE Transactions on Automatic Control 40(10), 1793–1796.
Elnagar GN and Razzaghi M 1997 A collocation-type method for linear quadratic optimal control problems. Optimal Control Applications & Methods 18(3), 227–235.
Elsgolc LE 1962 Calculus of Variations. Pergamon Press, London.
Enright PJ and Conway BA 1991a Discrete approximations to optimal trajectories using direct transcription and nonlinear programming. Journal of Guidance, Control, and Dynamics 15(4), 994–1002.
Enright PJ and Conway BA 1991b Optimal finite-thrust spacecraft trajectories using collocation and nonlinear programming. Journal of Guidance, Control, and Dynamics 14(5), 981–985.
Fahroo F and Ross IM 2001 Costate estimation by a Legendre pseudospectral method. Journal of Guidance, Control, and Dynamics 24(2), 270–277.
Fahroo F and Ross IM 2002 Direct trajectory optimization by a Chebyshev pseudospectral method. Journal of Guidance, Control, and Dynamics 25(1), 160–166.
Gamkrelidze RV 1978 Principles of Optimal Control Theory. Plenum Press, New York.
Garfinkel B 1951 Minimal problems in airplane performance. Quarterly of Applied Mathematics 9(2).
Garfinkel B 1963 A solution of the Goddard problem. SIAM Journal on Control 1(3), 349–368.
Gerdts M 2003 Optimal control and real-time optimization of mechanical multi-body systems. Zeitschrift für Angewandte Mathematik und Mechanik 83(10), 705–719.
GESOP – Software User Manual 2004 Institute of Flight Mechanics and Control, University of Stuttgart, Germany.
Gill PE, Murray W and Saunders MA 1993 Large-scale SQP methods and their application in trajectory optimization. In Computational Optimal Control (ed. Bulirsch R and Kraft D), vol. 115 of International Series of Numerical Mathematics. Birkhäuser, Basel, pp. 29–42.
Gill PE, Murray W and Wright MH 1981 Practical Optimization. Academic Press, London.
Gill PE, Murray W and Saunders MA 2002 SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal on Optimization 12(4), 979–1006.
Goh CJ and Teo KL 1988 Control parametrization: A unified approach to optimal control problems with general constraints. Automatica 24(1), 3–18.
Grimm W and Markl A 1997 Adjoint estimation from a direct multiple shooting method. Journal of Optimization Theory and Applications 92(2), 263–283.
Hargraves CR and Paris SW 1987 Direct trajectory optimization using nonlinear programming and collocation. Journal of Guidance, Control, and Dynamics 10(4), 338–342.
Hartl RF, Sethi SP and Vickson RG 1995 A survey of maximum principles for optimal control problems with state constraints. SIAM Review 37(2), 181–218.
Herman AL and Conway BA 1996 Direct optimization using collocation based on higher-order Gauss–Lobatto quadrature rules. Journal of Guidance, Control, and Dynamics 19(3), 592–599.
Hestenes MR 1966 Calculus of Variations and Optimal Control Theory. John Wiley & Sons, Inc., New York.
Hicks GH and Ray WH 1971 Approximation methods for optimal control synthesis. Canadian Journal of Chemical Engineering 49, 522–528.
Jacobson DH, Lele MM and Speyer JL 1971 New necessary conditions of optimality for control problems with state-variable inequality constraints. Journal of Mathematical Analysis and Applications 35, 255–284.
Jaddu H and Shimemura E 1997 Solution of nonlinear optimal control problem using Chebyshev polynomials. Proceedings of the 2nd Asian Control Conference, Seoul, Korea, vol. 1, pp. 417–420.
Jaddu H and Shimemura E 1999a Computational methods based on the state parameterization for solving constrained optimal control problems. International Journal of Systems Science 30(3), 275–282.
Jaddu H and Shimemura E 1999b Computation of optimal control trajectories using Chebyshev polynomials: Parametrization, and quadratic programming. Optimal Control Applications & Methods 20(1), 21–42.
Keller HB 1992 Numerical Methods for Two-Point Boundary Value Problems. Dover, New York.
Kelley HJ 1962 Methods of gradients. In Optimization Techniques with Applications to Aerospace Systems (ed. Leitmann G), vol. 5 of Mathematics in Science and Engineering. Academic Press, New York, pp. 206–254.
Kirk DE 1970 Optimal Control Theory. Prentice Hall, Englewood Cliffs, NJ.
Kraft D 1985 On converting optimal control problems into nonlinear programming problems. In Computational Mathematical Programming (ed. Schittkowski K), vol. 15 of NATO ASI Series F: Computer and System Science. Springer, Berlin, pp. 261–280.
Kraft D 1994 Algorithm 733: TOMP – Fortran modules for optimal control calculations. ACM Transactions on Mathematical Software 20(3), 262–281.
Krantz SG and Parks HR 2002 The Implicit Function Theorem: History, Theory, and Applications. Birkhäuser, Boston.
Kreindler E 1982 Additional necessary conditions for optimal control with state-variable inequality constraints. Journal of Optimization Theory and Applications 38(2), 241–250.
Kumar RR and Seywald H 1995 Fuel-optimal station keeping via differential inclusion. Journal of Guidance, Control, and Dynamics 18(5), 1156–1162.
Lawden DF 1963 Optimal Trajectories for Space Navigation. Butterworths, London.
Leitmann G 1962 Optimization Techniques. Academic Press, London.
Leitmann G 1981 The Calculus of Variations and Optimal Control. Plenum Press, New York.
Levary RR 1986 Optimal control problems with multiple goal objective. Optimal Control Applications & Methods 7(2), 201–207.
Lewis FL and Syrmos VL 1995 Optimal Control 2nd edn. John Wiley & Sons, Inc., New York.
Macki J and Strauss A 1982 Introduction to Optimal Control Theory. Springer, New York.
Maurer H and Gillessen W 1975 Application of multiple shooting to the numerical solution of optimal control problems with bounded state variables. Computing 15, 105–126.
Naidu DS 2002 Optimal Control Systems. CRC Press, Boca Raton, FL.
Oberle HJ and Grimm W 1989 BNDSCO – a program for the numerical solution of optimal control problems. Technical report DLR IB 515-89-22, Institute for Flight Systems Dynamics, DLR, Oberpfaffenhofen, Germany.
Osborne MR 1969 On shooting methods for boundary value problems. Journal of Mathematical Analysis and Applications 27, 417–433.
Pesch HJ 1989a Real-time computation of feedback controls for constrained optimal control problems, Part 1: Neighboring extremals. Optimal Control Applications & Methods 10(2), 129–145.
Pesch HJ 1989b Real-time computation of feedback controls for constrained optimal control problems, Part 2: A correction method based on multiple shooting. Optimal Control Applications & Methods 10(2), 147–171.
Pesch HJ 1991 Offline and online computation of optimal trajectories in the aerospace field. In Applied Mathematics in Aerospace Science and Engineering (ed. Miele A and Salvetti A), Proceedings of a Meeting on Applied Mathematics in the Aerospace Field. Plenum Press, New York, pp. 165–220.
Pesch HJ 1994 A practical guide to the solution of real-life optimal control problems. Control and Cybernetics 23(1–2), 7–60.
Pesch HJ and Bulirsch R 1994 The maximum principle, Bellman’s equation, and Carathéodory’s work. Journal of Optimization Theory and Applications 80(2), 199–225.
Petrosyan LA 1994 Strongly dynamically stable optimality principles in multicriteria optimal control problems. Journal of Computer and Systems Sciences International 32(2), 146–150.
Pontryagin LS, Boltyanskii VG, Gamkrelidze RV and Mishchenko EF 1962 The Mathematical Theory of Optimal Processes. John Wiley & Sons, Inc., New York.
Pytlak R 1998 Runge–Kutta based procedure for the optimal control of differential-algebraic equations. Journal of Optimization Theory and Applications 97(3), 675–705.
Pytlak R 1999 Numerical Methods for Optimal Control Problems with State Constraints, vol. 1707 of Lecture Notes in Mathematics. Springer, Berlin.
Rosen JB, Mangasarian OL and Ritter K 1970 A new algorithm for unconstrained optimization. In Nonlinear Programming. Academic Press, New York.
Rosenbrock H and Storey C 1966 Computational Techniques for Chemical Engineers. Pergamon Press, Oxford.
Ross IM and Fahroo F 2002 User’s manual for DIDO 2002: A MATLAB application package for solving optimal control problems. Technical report AA-02-002, Department of Aerospace and Astronautics, Naval Postgraduate School, Monterey, California.
Ross IM and Fahroo F 2004a Legendre pseudospectral approximations of optimal control problems. In New Trends in Nonlinear Dynamics and Control and their Applications (ed. Kang W, Xiao M and Borges C), vol. 295 of Lecture Notes in Control and Information Sciences. Springer, Heidelberg, pp. 327–343.
Ross IM and Fahroo F 2004b Pseudospectral knotting methods for solving optimal control problems. Journal of Guidance, Control, and Dynamics 27(3), 397–405.
Sargent RWH 2000 Optimal control. Journal of Computational and Applied Mathematics 124(1–2), 361–371.
Seywald H 1994 Trajectory optimization based on differential inclusion. Journal of Guidance, Control, and Dynamics 17(3), 480–487.
Seywald H and Kumar RR 1997 Some recent developments in computational optimal control. IMA Volumes in Mathematics and its Applications 93, 203–234.
Seywald H, Cliff EM and Well KH 1994 Range optimal trajectories for an aircraft flying in the vertical plane. Journal of Guidance, Control, and Dynamics 17(2), 389–398.
Sirisena HR and Chou FS 1981 State parameterization approach to the solution of optimal control problems. Optimal Control Applications & Methods 2, 289–298.
Steindl A and Troger H 2003 Optimal control of deployment of a tethered subsatellite. Nonlinear Dynamics 31(3), 257–274.
Stoer J and Bulirsch R 2002 Introduction to Numerical Analysis 3rd edn. Springer, New York.
Subchan S and Żbikowski R 2007a Computational optimal control of the terminal bunt manoeuvre—Part 1: Minimum altitude case. Optimal Control Applications & Methods 28(5), 311–353.
Subchan S and Żbikowski R 2007b Computational optimal control of the terminal bunt manoeuvre—Part 2: Minimum-time case. Optimal Control Applications & Methods 28(5), 355–379.
Subchan S, Żbikowski R and Cleminson JR 2003 Optimal trajectory for the terminal bunt problem: An analysis by the indirect method. Proceedings of the AIAA Guidance, Navigation and Control Conference and Exhibition, Austin, Texas, vol. 4, pp. 3055–3065.
Sussmann HJ and Willems JC 1997 300 years of optimal control: From the brachystochrone to the maximum principle. IEEE Control Systems Magazine 17(3), 32–44.
Sutherland WA 1975 Introduction to Metric and Topological Spaces. Clarendon Press, Oxford.
Tang S and Conway BA 1995 Optimization of low-thrust interplanetary trajectories using collocation and nonlinear programming. Journal of Guidance, Control, and Dynamics 18(3), 599–604.
Teo KL and Wong KH 1992 Nonlinearly constrained optimal control problems. Journal of the Australian Mathematical Society Series B 33(4), 517–530.
Teo KL, Goh C and Wong K 1991 A Unified Computational Approach to Optimal Control Problems. Longman Scientific and Technical, Harlow.
Teo KL, Jennings LS, Lee HWJ and Rehbock V 1999 The control parameterization enhancing transform for constrained optimal control problems. Journal of the Australian Mathematical Society Series B 40(3), 314–335.
Vinh NX 1981 Optimal Trajectories in Atmospheric Flight. Elsevier Scientific, Amsterdam.
von Stryk O 1993 Numerical solution of optimal control problems by direct collocation. In Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods (ed. Bulirsch R, Miele A, Stoer J and Well K), vol. 111 of International Series of Numerical Mathematics. Birkhäuser, Basel, pp. 129–143.
von Stryk O 1999 User’s guide for DIRCOL – a direct collocation method for the numerical solution of optimal control problems. Technische Universität Darmstadt.
von Stryk O and Bulirsch R 1992 Direct and indirect methods for trajectory optimization. Annals of Operations Research 37(1–4), 357–373.
von Stryk O and Schlemmer M 1994 Optimal control of the industrial robot Manutec r3. In Computational Optimal Control (ed. Bulirsch R and Kraft D), vol. 115 of International Series of Numerical Mathematics. Birkhäuser, Basel, pp. 367–382.
Index

air density, 5
approximation
    co-state variable, 39, 40, 50, 71
axial aerodynamic force, 5
BNDSCO, 46, 71, 79, 107, 146
    source of errors, 146
        analytical, 146
        coding, 148
        numerical, 147
    structure, 147
Bolza form, see performance criterion
boundary conditions, see bunt manoeuvre (problem formulation)
boundary value problem (BVP), 67
Bryson’s formulation, see constraint (pure state)
bunt manoeuvre
    definition, x
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
    problem formulation, 4
        boundary conditions, 5, 120
        boundary conditions and constraints (table), 7
        equations of motion, 5
        performance criterion, 4, 119
        physical modelling parameters (table), 6
calculus of variations, 27
CAMTOS, see computational optimal control (software)
climbing arc
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
computational optimal control
    definition, ix, 3
    software, 119, 157
        BNDSCO, 46, 71, 79, 107, 146
        CAMTOS, 139
        DIDO, 41
        DIRCOL, 39, 50, 123
        GESOP, 139
        NUDOCCCS, 39, 132
        PROMIS, 49, 139
        SNOPT, 39, 50, 96
        SOCS, 49, 139
        TROPIC, 139
constraint
    control, 6, 32, 120
    mixed, 6, 33, 69, 120
    path, 66
    pure state, 6, 67, 120, 161
        Bryson’s formulation, 33, 72, 74
        Jacobson’s formulation, 72
    state inequality, 33
    state–control, 6
continuation method, 160
continuity of a function, 25
control, 1
control constraint, see constraint
DIDO, see computational optimal control (software)
differentiable function, 25
differential inclusion, see direct method
DIRCOL, 39, 50
    flowchart, 121
    grid refinement, 132
    input, 129
    structure, 122
direct collocation, see direct method
direct method, 36, 47
    differential inclusion, 37, 51
    direct collocation, 36, 38, 50
    direct multiple shooting, 41, 43
    gradient algorithms, 36
    Legendre pseudospectral, 40
direct multiple shooting, see direct method
diving arc
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
dynamics equation, see equations of motion
entry condition, 34
entry point, see switching condition, see jump condition
equations of motion, see bunt manoeuvre (problem formulation)
Euler–Lagrange equation, 32, 33
exit point, see switching condition, see jump condition
extremum, 10
    maximum, 10, 25
    minimum, 10, 25
first variation, see variation
functional, 26
GESOP, 139
    flowchart, 140
    software architecture, 141
guidance, 1
Hamiltonian, 32, 65
Hessian, 15
homotopy method, 160
hybrid approach, 159
image of a function, 11
implicit function theorem, 21, 32
indirect approach, see indirect method
indirect method, 29, 34, 36, 43, 47, 71
    indirect multiple shooting, 43
indirect multiple shooting, see indirect method
initial value problem (IVP), 43, 67
inverse image of a function, 11
Jacobson’s formulation, see constraint (pure state), 72
jump condition, 74, 77
    entry point, 74, 78
    exit point, 79
junction condition, 67
Karush–Kuhn–Tucker (KKT) conditions, 24, 35, 72
Kreindler’s remarks, 78
    jump condition
        entry point, 78
        exit point, 78
Lagrange form, see performance criterion
Lagrange multiplier, 29
Legendre–Clebsch condition, 32
Legendre–Gauss–Lobatto point, 40
level flight
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
level set, 20
mathematical analysis
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
mathematical model, see bunt manoeuvre (problem formulation)
maximum principle, see minimum principle
Mayer form, see performance criterion
minimum altitude flight
    minimum altitude problem, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
minimum altitude problem formulation
    indirect method, 71
    mathematical analysis, 64
        climbing arc, 69
        diving arc, 70
        minimum altitude flight, 67
    performance criterion, 49
    qualitative analysis, 50
        climbing arc, 63
        diving arc, 64
        level flight, 50
minimum principle, 32, 33, 47, 64, 65, 68, 101
minimum time problem formulation, 96
    indirect method, 107
    mathematical analysis, 101
        climbing arc, 105
        diving arc, 106
        minimum altitude flight, 104
    qualitative analysis, 96
        climbing arc, 100
        diving arc, 100
        level flight, 96
mixed constraint, see constraint
multi-objective formulation, 161
multi-point boundary value problem (MPBVP), 71
navigation, 1
necessary condition
    derivation, 32
    first order, 29, 35
nonlinear programming (NLP), 34
normal aerodynamic force, 5
NUDOCCCS, see computational optimal control (software)
    main program, 133
objective function, see performance criterion
optimality condition, 69, 70, 105, 107
optimisation, 10
    finite dimension
        single variable, 10
        two or more variables, 15
    infinite dimension, 24
path constraint, see constraint
performance criterion, 4, 27
    Bolza form, 29
    Lagrange form, 27
    Mayer form, 27
performance index, see performance criterion
Pontryagin’s minimum principle, see minimum principle
positive definite, 32
positive semi-definite, 32
PROMIS, see computational optimal control (software)
pure state constraint, see constraint
qualitative analysis
    minimum altitude, see minimum altitude problem formulation
    minimum time problem, see minimum time problem formulation
second variation, see variation
sequential quadratic programming (SQP), 39
shooting method
    flowchart, 45
    procedure, 44
singular arc, 33, 66, 103
singular control, 66
singular point, 15
SNOPT, see computational optimal control (software)
SOCS, see computational optimal control (software)
spectral collocation, 40
state constraint, see pure state constraint
state inequality constraint, see constraint
state–control constraint, see mixed constraint
sufficient condition, 32
switching condition, 77
    entry point, 77
    exit point, 77
switching function, 33, 66, 90
switching point, 67
switching structure
    minimum altitude problem, 71, 85, 91–93
    minimum time problem, 113, 115–118
terminal bunt manoeuvre (TBM), see bunt manoeuvre
TPBVP, see two-point boundary value problem
trajectory shaping, 1, 4
transversality condition, 32
TROPIC, see computational optimal control (software)
two-point boundary value problem (TPBVP), 29, 43
user experience, 154
    software packages, 156
variation
    first, 26, 29
    second, 27
variational approach, 29