OPTIMIZATION
Springer Optimization and Its Applications
VOLUME 32

Managing Editor: Panos M. Pardalos (University of Florida)
Editor – Combinatorial Optimization: Ding-Zhu Du (University of Texas at Dallas)

Advisory Board: J. Birge (University of Chicago), C.A. Floudas (Princeton University), F. Giannessi (University of Pisa), H.D. Sherali (Virginia Polytechnic Institute and State University), T. Terlaky (McMaster University), Y. Ye (Stanford University)
Aims and Scope Optimization has been expanding in all directions at an astonishing rate during the last few decades. New algorithmic and theoretical techniques have been developed, the diffusion into other disciplines has proceeded at a rapid pace, and our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in all areas of applied mathematics, engineering, medicine, economics and other sciences. The series Springer Optimization and Its Applications publishes undergraduate and graduate textbooks, monographs and state-of-the-art expository works that focus on algorithms for solving optimization problems and also study applications involving such problems. Some of the topics covered include nonlinear optimization (convex and nonconvex), network flow problems, stochastic optimization, optimal control, discrete optimization, multiobjective programming, description of software packages, approximation techniques and heuristic approaches.
OPTIMIZATION: Structure and Applications

Edited by

CHARLES PEARCE
School of Mathematical Sciences, The University of Adelaide, Adelaide, Australia

EMMA HUNT
School of Economics & School of Mathematical Sciences, The University of Adelaide, Adelaide, Australia
Editors Charles Pearce Department of Applied Mathematics University of Adelaide 70 North Terrace Adelaide SA 5005 Australia
[email protected]
Emma Hunt Department of Applied Mathematics University of Adelaide 70 North Terrace Adelaide SA 5005 Australia
[email protected]
ISSN 1931-6828
ISBN 978-0-387-98095-9
e-ISBN 978-0-387-98096-6
DOI 10.1007/978-0-387-98096-6
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009927130
Mathematics Subject Classification (2000): 49-06, 65Kxx, 65K10, 76D55, 78M50

© Springer Science+Business Media, LLC 2009
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Cover illustration: Picture provided by Elias Tyligadas
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
This volume is dedicated with great affection to the late Alex Rubinov, who was invited plenary speaker at the mini-conference. He is sorely missed. ♥
Contents
List of Figures
List of Tables
Preface

Part I  Optimization: Structure
1  On the nondifferentiability of cone-monotone functions in Banach spaces
   Jonathan Borwein and Rafal Goebel
   1.1  Introduction
   1.2  Examples
   References
2  Duality and a Farkas lemma for integer programs
   Jean B. Lasserre
   2.1  Introduction
        2.1.1  Preliminaries
        2.1.2  Summary of content
   2.2  Duality for the continuous problems P and I
        2.2.1  Duality for P
        2.2.2  Duality for integration
        2.2.3  Comparing P, P∗ and I, I∗
        2.2.4  The continuous Brion and Vergne formula
        2.2.5  The logarithmic barrier function
        2.2.6  Summary
   2.3  Duality for the discrete problems Id and Pd
        2.3.1  The Z-transform
        2.3.2  The dual problem I∗d
        2.3.3  Comparing I∗ and I∗d
        2.3.4  The “discrete” Brion and Vergne formula
        2.3.5  The discrete optimization problem Pd
        2.3.6  A dual comparison of P and Pd
   2.4  A discrete Farkas lemma
        2.4.1  The case when A ∈ N^{m×n}
        2.4.2  The general case
   2.5  Conclusion
   2.6  Proofs
        2.6.1  Proof of Theorem 1
        2.6.2  Proof of Corollary 1
        2.6.3  Proof of Proposition 3.1
        2.6.4  Proof of Theorem 2
   References

3  Some nonlinear Lagrange and penalty functions for problems with a single constraint
   J. S. Giri and A. M. Rubinov
   3.1  Introduction
   3.2  Preliminaries
   3.3  The relationship between extended penalty functions and extended Lagrange functions
   3.4  Generalized Lagrange functions
   3.5  Example
        3.5.1  The Lagrange function approach
        3.5.2  Penalty function approach
   References

4  Convergence of truncates in l1 optimal feedback control
   Robert Wenczel, Andrew Eberhard and Robin Hill
   4.1  Introduction
   4.2  Mathematical preliminaries
   4.3  System-theoretic preliminaries
        4.3.1  Basic system concepts
        4.3.2  Feedback stabilization of linear systems
   4.4  Formulation of the optimization problem in l1
   4.5  Convergence tools
   4.6  Verification of the constraint qualification
        4.6.1  Limitations on the truncation scheme
   4.7  Convergence of approximates
        4.7.1  Some extensions
   4.8  Appendix
   References
5  Asymptotical stability of optimal paths in nonconvex problems
   Musa A. Mamedov
   5.1  Introduction and background
   5.2  The main conditions of the turnpike theorem
   5.3  Definition of the set D and some of its properties
   5.4  Transformation of Condition H3
   5.5  Sets of 1st and 2nd type: Some integral inequalities
        5.5.1 – 5.5.5
   5.6  Transformation of the functional (5.2)
        5.6.1 – 5.6.2
   5.7  The proof of Theorem 13.6
        5.7.1 – 5.7.2
   References

6  Pontryagin principle with a PDE: a unified approach
   B. D. Craven
   6.1  Introduction
   6.2  Pontryagin for an ODE
   6.3  Pontryagin for an elliptic PDE
   6.4  Pontryagin for a parabolic PDE
   6.5  Appendix
   References

7  A turnpike property for discrete-time control systems in metric spaces
   Alexander J. Zaslavski
   7.1  Introduction
   7.2  Stability of the turnpike phenomenon
   7.3  A turnpike is a solution of the problem (P)
   7.4  A turnpike result
   References

8  Mond–Weir Duality
   B. Mond
   8.1  Preliminaries
   8.2  Convexity and Wolfe duality
   8.3  Fractional programming and some extensions of convexity
   8.4  Mond–Weir dual
   8.5  Applications
   8.6  Second order duality
   8.7  Symmetric duality
   References

9  Computing the fundamental matrix of an M/G/1-type Markov chain
   Emma Hunt
   9.1  Introduction
   9.2  Algorithm H: Preliminaries
   9.3  Probabilistic construction
   9.4  Algorithm H
   9.5  Algorithm H: Preliminaries
   9.6  H, G and convergence rates
   9.7  A special case: The QBD
   9.8  Algorithms CR and H
   References

10  A comparison of probabilistic and invariant subspace methods for the block M/G/1 Markov chain
    Emma Hunt
    10.1  Introduction
    10.2  Error measures
    10.3  Numerical experiments
          10.3.1  Experiment G1
          10.3.2  Experiment G2
          10.3.3  The Daigle and Lucantoni teletraffic problem
          10.3.4  Experiment G6
          10.3.5  Experiment G7
    References

11  Interpolating maps, the modulus map and Hadamard’s inequality
    S. S. Dragomir, Emma Hunt and C. E. M. Pearce
    11.1  Introduction
    11.2  A refinement of the basic inequality
    11.3  Inequalities for Gf and Hf
    11.4  More on the identric mean
    11.5  The mapping Lf
    References
Part II  Optimization: Applications

12  Estimating the size of correcting codes using extremal graph problems
    Sergiy Butenko, Panos Pardalos, Ivan Sergienko, Vladimir Shylo and Petro Stetsyuk
    12.1  Introduction
    12.2  Finding lower bounds and exact solutions for the largest code sizes using a maximum independent set problem
          12.2.1  Finding the largest correcting codes
    12.3  Lower Bounds for Codes Correcting One Error on the Z-Channel
          12.3.1  The partitioning method
          12.3.2  The partitioning algorithm
          12.3.3  Improved lower bounds for code sizes
    12.4  Conclusions
    References

13  New perspectives on optimal transforms of random vectors
    P. G. Howlett, C. E. M. Pearce and A. P. Torokhti
    13.1  Introduction and statement of the problem
    13.2  Motivation of the statement of the problem
    13.3  Preliminaries
    13.4  Main results
    13.5  Comparison of the transform T0 and the GKLT
    13.6  Solution of the unconstrained minimization problem (13.3)
    13.7  Applications and further modifications and extensions
    13.8  Simulations
    13.9  Conclusion
    References

14  Optimal capacity assignment in general queueing networks
    P. K. Pollett
    14.1  Introduction
    14.2  The model
    14.3  The residual-life approximation
    14.4  Optimal allocation of effort
    14.5  Extensions
    14.6  Data networks
    14.7  Conclusions
    References
15  Analysis of a simple control policy for stormwater management in two connected dams
    Julia Piantadosi and Phil Howlett
    15.1  Introduction
    15.2  A discrete-state model
          15.2.1  Problem description
          15.2.2  The transition matrix for a specific control policy
          15.2.3  Calculating the steady state when 1 < m < k
          15.2.4  Calculating the steady state for m = 1
          15.2.5  Calculating the steady state for m = k
    15.3  Solution of the matrix eigenvalue problem using Gaussian elimination for 1 < m < k
          15.3.1  Stage 0
          15.3.2  The general rules for stages 2 to m − 2
          15.3.3  Stage m − 1
          15.3.4  The general rules for stages m to k − 2m
          15.3.5  Stage k − 2m + 1
          15.3.6  The general rule for stages k − 2m + 2 to k − m − 2
          15.3.7  The final stage k − m − 1
    15.4  The solution process using back substitution for 1 < m < k
    15.5  The solution process for m = 1
    15.6  The solution process for m = k
    15.7  A numerical example
    15.8  Justification of inverses
          15.8.1  Existence of the matrix W0
          15.8.2  Existence of the matrix Wp for 1 ≤ p ≤ m − 1
          15.8.3  Existence of the matrix Wq for m ≤ q ≤ k − m − 1
    15.9  Summary
    References

16  Optimal design of linear consecutive–k–out–of–n systems
    Malgorzata O’Reilly
    16.1  Introduction
          16.1.1  Mathematical model
          16.1.2  Applications and generalizations of linear consecutive–k–out–of–n systems
          16.1.3  Studies of consecutive–k–out–of–n systems
          16.1.4  Summary of the results
    16.2  Propositions for R and M
    16.3  Preliminaries to the main proposition
    16.4  The main proposition
    16.5  Theorems
    16.6  Procedures to improve designs not satisfying necessary conditions for the optimal design
    References

17  The (k+1)-th component of linear consecutive–k–out–of–n systems
    Malgorzata O’Reilly
    17.1  Introduction
    17.2  Summary of the results
    17.3  General result for n > 2k, k ≥ 2
    17.4  Results for n = 2k + 1, k > 2
    17.5  Results for n = 2k + 2, k > 2
    17.6  Procedures to improve designs not satisfying the necessary conditions for the optimal design
    References

18  Optimizing properties of polypropylene and elastomer compounds containing wood flour
    Pavel Spiridonov, Jan Budin, Stephen Clarke and Jani Matisons
    18.1  Introduction
    18.2  Methodology
          18.2.1  Materials
          18.2.2  Sample preparation and tests
    18.3  Results and discussions
          18.3.1  Density of compounds
          18.3.2  Comparison of compounds obtained in a Brabender mixer and an injection-molding machine
          18.3.3  Compatibilization of the polymer matrix and wood flour
          18.3.4  Optimization of the compositions
    18.4  Conclusions
    References

19  Constrained spanning, Steiner trees and the triangle inequality
    Prabhu Manyem
    19.1  Introduction
    19.2  Upper bounds for approximation
          19.2.1  The most expensive edge is at most a minimum spanning tree
          19.2.2  MaxST is at most (n − 1)MinST
    19.3  Lower bound for a CSP approximation
          19.3.1  E-Reductions: Definition
          19.3.2  SET COVER
          19.3.3  Reduction from SET COVER
          19.3.4  Feasible Solutions
          19.3.5  Proof of E-Reduction
    19.4  Conclusions
    References

20  Parallel line search
    T. C. Peachey, D. Abramson and A. Lewis
    20.1  Line searches
    20.2  Nimrod/O
    20.3  Execution time
          20.3.1  A model for execution time
          20.3.2  Evaluation time a Bernoulli variate
          20.3.3  Simulations of evaluation time
          20.3.4  Conclusions
    20.4  Accelerating convergence by incomplete iterations
          20.4.1  Strategies for aborting jobs
          20.4.2  Experimental results
          20.4.3  Conclusions
    References

21  Alternative Mathematical Programming Models: A Case for a Coal Blending Decision Process
    Ruhul A. Sarker
    21.1  Introduction
    21.2  Mathematical programming models
          21.2.1  Single period model (SPM)
          21.2.2  Multiperiod nonlinear model (MNM)
          21.2.3  Upper bound linear model (ULM)
          21.2.4  Multiperiod linear model (MLM)
    21.3  Model flexibility
          21.3.1  Case-1
          21.3.2  Case-2
          21.3.3  Case-3
    21.4  Problem size and computation time
    21.5  Objective function values and fluctuating situation
    21.6  Selection criteria
    21.7  Conclusions
    References
About the Editors
List of Figures

3.1   P(f0, f1)
3.2   L(x; 5/2)
3.3   L+s1(x; 1)
4.1   A closed-loop control system
12.1  A scheme of the Z-channel
12.2  Algorithm for finding independent set partitions
13.1  Illustration of the performance of our method
13.2  Typical examples of a column reconstruction in the matrix X (image “Lena”) after filtering and compression of the observed noisy image (Figure 13.1b) by transforms H (line with circles) and T0 (solid line) of the same rank. In both subfigures, the plot of the column (solid line) virtually coincides with the plot of the estimate by the transform T0
18.1  Density of (a) polypropylene and (b) SBS in elastomer compounds for different blending methods
18.2  Comparison of tensile strength of the compounds obtained in an injection-molding machine and in a Brabender mixer
18.3  Influence of wood flour fractions and the modifier on the tensile strength of injection-molded specimens of the (a) PP and (b) SBS compounds
18.4  Relative cost of the (a) PP and (b) SBS compounds depending on the content of wood flour and maleated polymers
18.5  Photographs of the PP compounds containing 40% wood flour of different fractions
19.1  A Constrained Steiner Tree and some of its special cases
19.2  A CSPI instance reduced from SET COVER (not all edges shown)
19.3  Feasible solution for our instance of CSPI (not all edges shown)
20.1  A sample configuration file
20.2  Architecture of Nimrod/O
20.3  Performance with Bernoulli job times
20.4  Test function g(x)
20.5  Results of simulations
20.6  Incomplete evaluation points
20.7  Strategy 1 with exponential distribution of job times
20.8  Strategy 1 with rectangular distribution of job times
20.9  Results for Strategy 2
20.10 Results for Strategy 3
21.1  Simple case problem
List of Tables

9.1   The interlacing property
10.1  Experiment G1
10.2  Experiment G2
10.3  Experiment G2 continued
10.4  Iterations required with various traffic levels: Experiment G3
10.5  Iterations required with various traffic levels: Experiment G3 continued
10.6  Experiment G4
10.7  Experiment G4 continued
10.8  Experiment G5
10.9  Experiment G6
10.10 Experiment G7: a transient process
12.1  Lower bounds obtained
12.2  Exact algorithm: Computational results
12.3  Exact solutions found
12.4  Lower bounds obtained in: a [27]; b [6]; c [7]; d [12]; e (this chapter)
12.5  Partitions of asymmetric codes found
12.6  Partitions of constant weight codes obtained in: a (this chapter); b [4]; c [12]
12.7  New lower bounds. Previous lower bounds were found in: a [11]; b [12]
13.1  Ratios ρij of the error associated with the GKLT H to that of the transform T0 with the same compression ratios
15.1  Overflow lost from the system for m = 1, 2, 3, 4
16.1  Invariant optimal designs of linear consecutive–k–out–of–n systems
18.1  Physical characteristics of the wood flour fractions
19.1  Constrained Steiner Tree and special cases: References
19.2  E-Reduction of a SET COVER to a CSPI: Costs and delays of edges in G
21.1  Relative problem size of ULM, MLM and MNM
21.2  Objective function values of ULM, MLM and MNM
Preface
This volume comprises a selection of material based on presentations at the Eighth Australian Optimization Day, held in McLaren Vale, South Australia, in September 2001, and some additional invited contributions by distinguished colleagues, here and overseas. Optimization Day is an annual miniconference in Australia which dates from 1994. It has been successful in bringing together Australian researchers in optimization and related areas for the sharing of ideas and the facilitation of collaborative work. These meetings have also attracted some collaborative researchers from overseas.

This particular meeting was remarkable in the efforts made by some of the participants to ensure they could be present. It took place within days of the September 11 tragedy in New York and the financial collapse of a major Australian airline. These events left a number of us without air tickets on the eve of the conference. Some participants arrived in South Australia by car, having driven up to several thousand kilometers to join the meeting.

This volume has two parts, one concerning mathematical structure and the other applications. The first part begins with a treatment of nondifferentiability of cone-monotone functions in Banach spaces, showing that whereas several regularity properties of cone-monotone functions in finite-dimensional spaces carry over to a separable Banach space provided the cone has an interior, further generalizations are not readily possible. The following chapter concerns a comparison between linear and integer programming, particularly from a duality perspective. A discrete Farkas lemma is provided and it is shown that the existence of a nonnegative integer solution to a linear equation can be tested via a linear program. Next, there is a study of connections between generalized Lagrangians and generalized penalty functions for problems with a single constraint. This is followed by a detailed theoretical analysis of convergence of truncates in l1 optimal feedback control. The treatment permits consideration of the frequently occurring case of an objective function lacking interiority of domain. The optimal control theme continues with a study of asymptotic stability of optimal paths in nonconvex problems. The purpose of the chapter is to avoid the convexity conditions usually
assumed in turnpike theory. The succeeding chapter proposes a unified approach to Pontryagin’s principle for optimal control problems with dynamics described by a partial differential equation. This is followed by a study of a turnpike property for discrete-time control systems in metric spaces. A treatment of duality theory for nonlinear programming includes comparisons of alternative approaches and discussion of how Mond–Weir duality and Wolfe duality may be combined. There are two linked chapters centered on the use of probabilistic structure for designing an improved algorithm for the determination of the fundamental matrix of a block-structured M/G/1 Markov chain. The approach via probabilistic structure makes clear in particular the nature of the relationship between the cyclic reduction algorithms and the Latouche–Ramaswami algorithm in the QBD case. Part I concludes with a chapter developing systematic classes of refinements of Hadamard’s inequality, a cornerstone of convex analysis.

Although Part II of this volume is concerned with applications, a number of the chapters also possess appreciable theoretical content. Part II opens with the estimation of the sizes of correcting codes via formulation in terms of extremal graph problems. Previously developed algorithms are used to generate new exact solutions and estimates. The second chapter addresses the issue of optimal transforms of random vectors. A new transform is presented which has advantages over the Karhunen–Loève transform. Theory is developed and applied to an image reconstruction problem. The following chapter considers how to assign service capacity in a queueing network to minimize expected delay under a cost constraint. Next there is analysis of a control policy for stormwater management in a pair of connected tandem dams, where a developed mathematical technology is proposed and exhibited. Questions relating to the optimal design of linear consecutive-k-out-of-n systems are treated in two related chapters. There is a study of optimizing properties of plastics containing wood flour; an analysis of the approximation characteristics of constrained spanning and Steiner tree problems in weighted undirected graphs where edge costs and delays satisfy the triangle inequality; heuristics for speeding convergence in line search; and the use of alternative mathematical programming formulations for a real-world coal-blending problem under different scenarios.

All the contributions to this volume had the benefit of expert refereeing. We are grateful to the following reviewers for their help: Mirta Inés Aranguren (Universidad Nacional de Mar del Plata, Argentina), Eduardo Casas (Universidad de Cantabria, Spain), Aurelian Cernea (University of Bucharest), Pauline Coolen–Schrijner (University of Durham), Bruce Craven (University of Melbourne), Yu-Hong Dai (Chinese Academy of Sciences, Beijing), Sever Dragomir (Victoria University of Technology, Melbourne), Elfadl Khalifa Elsheikh (Cairo University), Christopher Frey (N. Carolina State University), Frank Kelly (University of Cambridge), Peter Kloeden (Johann Wolfgang Goethe-Universität, Frankfurt), Denis Lander (RMIT University), Roy Leipnik (UCSB), Musa Mamedov (University of
Ballarat), Jie Mi (Florida International University), Marco Muselli (Institute of Electronics, Computer and Telecommunication Engineering, Genova), Malgorzata O’Reilly (University of Tasmania), Stavros Papastavridis (University of Athens), Serpil Pehlivan (Süleyman Demirel University, Turkey), Danny Ralph (Cambridge University), Sekharipuram S. Ravi (University at Albany, New York), Dan Rosenkrantz (University at Albany, New York), Alexander Rubinov (University of Ballarat), Hanif D. Sherali (Virginia Polytechnic Institute), Moshe Sniedovich (University of Melbourne), Nicole Stark (USDA Forest Products Laboratory, Madison), Nasser Hassan Sweilam (Cairo University), Fredi Tröltzsch (Technische Universität, Berlin), Erik van Doorn (University of Twente, Netherlands), Frank K. Wang (National Chiao University, Taiwan, R.O.C.), Jianxing Yin (SuDa University, China), Alexander Zaslavski (Technion-Israel Institute of Technology), and Ming Zuo (University of Alberta).

We wish to thank John Martindale for his involvement in bringing about a firm arrangement for publication of this volume with Kluwer; Elizabeth Loew for shepherding the book through to publication with Springer following the Kluwer–Springer merger; and Panos Pardalos for his support throughout. A special thank you is due to Jason Whyte, through whose technical skill, effort and positive morale many difficulties were overcome.

Charles Pearce & Emma Hunt
Chapter 1
On the nondifferentiability of cone-monotone functions in Banach spaces Jonathan Borwein and Rafal Goebel
Abstract In finite-dimensional spaces, cone-monotone functions – a special case of which are coordinate-wise nondecreasing functions – possess several regularity properties like almost everywhere continuity and differentiability. Such facts carry over to a separable Banach space, provided that the cone has interior. This chapter shows that further generalizations are not readily possible. We display several examples of cone–monotone functions on various Banach spaces, lacking the regularity expected from their finite-dimensional counterparts. Key words: Monotone functions, ordered Banach spaces, generating cones, differentiability
Jonathan Borwein
Centre for Experimental and Constructive Mathematics, Simon Fraser University, Burnaby, BC, Canada V5A 1S6
e-mail: [email protected]

Rafal Goebel
Center for Control Engineering and Computation ECE, University of California, Santa Barbara, CA 93106-9650, U.S.A.
e-mail: [email protected]

1.1 Introduction

Functions for which f(y) ≥ f(x) whenever y − x is an element of a given convex cone K are called cone monotone with respect to K (or, simply, K-monotone). The simplest examples are provided by nondecreasing functions on the real line. These have several immediate regularity properties, the most intuitive of which may be the at most countable number of discontinuities. Regularity properties of coordinate-wise nondecreasing functions on IRn, that is, functions f for which f(y) ≥ f(x) whenever yi ≥ xi for i = 1, 2, . . . , n,
were first collected by Chabrillac and Crouzeix [5] and include measurability, almost everywhere continuity, and almost everywhere Fréchet differentiability. Note that nondecreasing functions, whether on the real line or IRn, are cone monotone with respect to the nonnegative cone, either [0, +∞) or [0, +∞)n. Recently, Borwein, Burke and Lewis [2] showed that functions on a separable Banach space, monotone with respect to a convex cone with nonempty interior, are differentiable except at points of an appropriately understood null set. The main goal of the current chapter is to demonstrate how possible extensions of this result, or other generalizations of finite-dimensional results on regularity of cone-monotone functions, fail in a general Banach space.
Motivation for studying coordinate-wise nondecreasing functions in Chabrillac and Crouzeix, and cone-monotone functions by Borwein, Burke and Lewis, comes in part from the connections of such functions with Lipschitz, and more generally, directionally Lipschitz functions. Interest in the latter stems from the work of Burke, Lewis and Overton [4] on approximation of the Clarke subdifferential using gradients, an important idea in practical optimization. It turns out that such approximations, like in the Lipschitz case, are possible in the more general directionally Lipschitz setting.
Before summarizing the properties of nondecreasing functions in finite dimensions, we illustrate their connection with Lipschitz functions. Consider a Lipschitz function l : IRn → IR and a K > 0 satisfying

|l(x) − l(y)| ≤ K ‖x − y‖∞   for all x, y ∈ IRn.
Let z ∈ IRn be given by zi = K, i = 1, 2, . . . , n, and define a function f : IRn → IR by f(x) = l(x) + ⟨z, x⟩. Then for x and y such that yi ≥ xi, i = 1, 2, . . . , n, we have

f(y) − f(x) = ⟨z, y − x⟩ + l(y) − l(x) ≥ K ( Σ_{i=1}^{n} (yi − xi) − max_{i=1,2,...,n} (yi − xi) ) ≥ 0.
In effect, f is coordinate-wise nondecreasing. Thus a Lipschitz function decomposes into a sum of a linear and a nondecreasing function, and inherits the regularity properties of the latter. Borwein, Burke and Lewis show that directionally Lipschitz functions on more general spaces admit a similar (local) decomposition, into a sum of a linear function and a cone-monotone one (with respect to a convex cone with interior).

Theorem 1 (Monotone functions in finite dimensions). Suppose that f : IRn → IR satisfies f(x) ≤ f(y) whenever xi ≤ yi, i = 1, 2, . . . , n. Then:
(a) f is measurable.
(b) If, for some d with di > 0 for i = 1, 2, . . . , n, the function t → f(x0 + td) is lower semicontinuous at t = 0, then f is lower semicontinuous at x0. Similarly for upper semicontinuity.
(c) f is almost everywhere continuous.
(d) If f is Gâteaux differentiable at x0, then it is Fréchet differentiable at x0.
(e) Let f̄ be the lower semicontinuous hull of f. Then f̄ is continuous at x0 if and only if f is. Similarly, f̄ is Gâteaux differentiable at x0 if and only if f is, and if these functions are Gâteaux differentiable, their derivatives agree.
(f) f is almost everywhere Fréchet differentiable.

For details and proofs consult Chabrillac and Crouzeix [5]. Statements (c) and (f) generalize the Lebesgue monotone differentiability theorem and in fact can be deduced from the two-dimensional version given by S. Saks [9]; for details consult Borwein, Burke and Lewis [2].
The Banach space version of (c) and (f) of Theorem 1, proved by Borwein, Burke and Lewis, requires a notion of a null set in a Banach space. We recall that a Banach space does not admit a Haar measure unless it is finite-dimensional, and proceed to make the following definitions – for details on these, and other measure-related notions we use in this chapter, we refer the reader to Benyamini and Lindenstrauss [1].
Let X be a separable Banach space. A probability measure μ on X is called Gaussian if for every x∗ ∈ X∗, the measure μx∗ on the real line, defined by μx∗(A) = μ{y | ⟨x∗, y⟩ ∈ A}, has a Gaussian distribution. It is additionally called nondegenerate if for every x∗ ≠ 0 the distribution μx∗ is nondegenerate. A Borel set C ⊂ X is called Gauss null if μ(C) = 0 for every nondegenerate Gaussian measure on X. It is known that the set of points where a given Lipschitz function f : X → IR is not Gâteaux differentiable is Gauss null. This in fact holds for functions with values in a space with the Radon–Nikodym property (Benyamini and Lindenstrauss [1], Theorem 6.42), whereas it fails completely for the stronger notion of Fréchet differentiability.

Theorem 2 (Borwein, Burke and Lewis). Let X be a separable space and let K ⊂ X be a convex cone with non-empty interior. If f : X → IR is K-monotone, then it is continuous and Hadamard (so also Gâteaux) differentiable except at the points of a Gauss null set.

In what follows, we show that all the assumptions in the above theorem are necessary, and, more generally, demonstrate how the properties of cone-monotone functions described in Theorem 1 fail to extend to a general Banach space. Note that the results of Theorems 1 and 2 hold if the functions are allowed to take on infinite values, as long as appropriate meaning is given to the continuity or differentiability of such functions. Indeed, composing a possibly infinite-valued function with, for example, an inverse tangent does not change its monotonicity properties, while leading to finite values. We do not address this further, and work with finite-valued functions. Moreover, we only work with monotone functions which are in fact nondecreasing (homotone) and note that nonincreasing (antitone) functions can be treated in a symmetric fashion.
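The finite-dimensional decomposition f = l + ⟨z, ·⟩ described above is easy to test numerically. The sketch below is only an illustration: the Lipschitz function l, its sup-norm constant K and the random sampling are assumed choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
c = np.array([1.0, -2.0, 0.5, 3.0, -1.5])

def l(x):
    # A Lipschitz function on R^n; w.r.t. the sup-norm, |l(x) - l(y)| <= sum(|c_i|) * ||x - y||_inf.
    return abs(c @ x)

K = float(np.sum(np.abs(c)))   # a valid sup-norm Lipschitz constant for l (here 8.0)
z = np.full(n, K)              # z_i = K for every coordinate

def f(x):
    return l(x) + z @ x        # f = l + <z, .>

# Sample pairs with y >= x coordinate-wise and confirm f(y) >= f(x).
for _ in range(1000):
    x = rng.normal(size=n)
    y = x + rng.uniform(0.0, 1.0, size=n)
    assert f(y) >= f(x) - 1e-12
print("f = l + <z, .> was coordinate-wise nondecreasing on all sampled pairs")
```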
1.2 Examples

We begin the section by showing that K-monotonicity of a function f : X → IR, with X finite- or infinite-dimensional, carries little information about the function if the cone K is not generating: K − K ≠ X. (Recall that K − K is always a linear subspace of X.) An interesting example of a nongenerating cone K is provided by nonnegative and nondecreasing functions in X = C[0, 1]: K − K is the non-closed set of all functions of bounded variation, dense in (but not equal to) X. Note that the example below, as well as Example 2, could be considered in a general vector space.

Example 1 (a nongenerating cone). Suppose that K − K ≠ X. Let L ⊃ K − K be a hyperplane in X, not necessarily closed, and let l be the (one-dimensional) algebraic complement of L in X, that is, L + l = X, L ∩ l = 0. Such L and l can be constructed using one of the versions of the Separation Principle: for any point l0 not in the intrinsic core (relative algebraic interior) of the convex set K − K (as K − K is linear, for any point not in K − K) there exists a linear functional (not necessarily continuous) φ on X such that
⟨φ, l0⟩ > ⟨φ, x⟩ for every x ∈ K − K; see Holmes [6], page 21. Now let L = ker φ, l = IR l0. Define Pl(x) to be the projection of x onto l – the unique point of l such that x ∈ Pl(x) + L. Given any function g : l → IR, let f(x) = g(Pl(x)). The function f : X → IR is K-monotone: if y ≥K x, that is, y − x ∈ K, then Pl(x) = Pl(y), and in effect, f(y) = f(x). Now note that at any point x ∈ X, the function f has the properties of g “in the direction of l.” Consequently, in this direction, f may display any desired irregular behavior.

In light of the example above, in what follows we only discuss generating cones. A cone K is certainly generating when int K ≠ ∅, as then the linear subspace K − K has interior. More generally, K is generating when the core (algebraic interior) of K is not empty. These conditions are met by nonnegative cones in C[a, b] or l∞ but not by all generating cones — consider for example nonnegative cones in c0 or lp, 1 ≤ p < ∞. A condition equivalent to K − K = X is that for any x, y ∈ X there exists an upper bound of x and y: an element z ∈ X such that z ≥K x, z ≥K y. Consequently, nonnegative cones are generating in Banach lattices, as in such spaces, the maximum of two elements (with respect to the nonnegative cone) is defined. Banach lattices include Lp, lp, C[a, b] and c0 spaces. When the subspace K − K is dense in X, K is generating whenever K − K is additionally closed; equivalently, when the difference of polar cones to K is closed. Finally, in some cases, measure-theoretic arguments lead to conclusions that int(K − K) is nonempty, under some assumptions on K. We take advantage of such arguments in Example 4.
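The role of a nongenerating cone in Example 1 can already be mimicked in IR2, where K = IR+ × {0} makes K − K a line: any function of the complementary coordinate, however irregular, is K-monotone. A minimal numerical sketch follows; the particular g below is an arbitrary, deliberately irregular choice introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def g(t):
    # Deliberately irregular; K-monotonicity places no restriction on g.
    return float(round(1000.0 * float(t)) % 2)

def f(x):
    # Depends only on the second coordinate, i.e. on the projection onto
    # the complement of K - K = span{e1}.
    return g(x[1])

# K = {(t, 0) : t >= 0} is not generating in R^2; y >=_K x means y - x = (t, 0), t >= 0.
for _ in range(1000):
    x = rng.normal(size=2)
    y = x + np.array([rng.uniform(0.0, 5.0), 0.0])
    assert f(y) >= f(x)      # in fact f(y) == f(x)
print("f(x) = g(x_2) is K-monotone for the nongenerating cone K = R_+ x {0}")
```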
In what follows, we assume that the spaces in question are infinite-dimensional. The next example involves a nonnegative cone defined through a Hamel basis.

Example 2 (lack of continuity, general Banach space). Consider an ordered Hamel basis in X (every element of X is a finite linear combination of elements of such a basis, the latter is necessarily uncountable). Let K be a cone of elements, all of which have nonnegative coordinates. This cone is closed, convex, generating and has empty interior. Define a function f : X → IR by setting f(x) = 1 if the last nonzero coordinate of x is positive, and f(x) = 0 otherwise. Then f is K-monotone: if y − x ∈ K, then the last coordinate α where x and y differ must satisfy yα > xα. If the α-th coordinate is the last nonzero one for either x or y, then f(y) ≥ f(x). In the opposite case, the last nonzero coordinates for x and y agree, and f(x) = f(y). In any neighborhood of any point in X, there are points y with f(y) = 0 as well as points z with f(z) = 1. Indeed, if xα is the last nonzero coordinate of x, we can perturb x by assigning a small negative or positive value to some xβ, with β > α. In effect, f is neither lower nor upper semicontinuous. However, for any nonzero x ∈ X, the function t → f(x + tx) is continuous at t = 0 (multiplying the last nonzero coordinate by 1 + t does not change its sign). Moreover, the lower semicontinuous hull of f is the constant function l(x) = 0, and the upper semicontinuous hull of f is the constant function u(x) = 1. Both of these are smooth.

The preceding example demonstrated the failure of (c) and (e) of Theorem 1 in a general Banach space. We note that the cone K in this example is a variation on one given by Klee [7], of a pointed convex cone dense in X (pointed means that K ∩ −K = 0). Such a set is obtained by considering all elements of X for which the last nonzero coordinate in a given Hamel basis is positive. In the example below, we use a Schauder basis to construct a cone-monotone function violating (c) and (e) of Theorem 1 similarly to Example 2, however continuous at every point in a dense set of directions.

Example 3 (lack of continuity but continuity on a dense set of directions). Let {xi}∞i=1 be a Schauder basis of X, and let {x∗j}∞j=1 be the associated projections, so that x = Σ_{i=1}^{∞} ⟨x, x∗i⟩ xi for any x ∈ X. We assume that the basis is unconditional, that is, for any x the sum Σ_{i=1}^{∞} εi ⟨x, x∗i⟩ xi converges for any combination of εi = ±1, and consequently Σ_{j=1}^{∞} ⟨x, x∗ij⟩ xij converges for any subsequence {ij}∞j=1. The standard bases in c0 and lp with p < +∞ satisfy this condition. Define a cone K ⊂ X by K = co(cone{xi}∞i=1), the closed convex hull of the cone generated by the xi – equivalently, K = {x | ⟨x, x∗i⟩ ≥ 0 for all i = 1, 2, . . .}. As the basis is unconditional, any x can be decomposed into a sum of an element with positive coordinates and an element with negative coordinates, and thus the cone K is generating. Let f : X → IR be given by
f(x) = lim sup_{j→∞} ( sign ⟨x, x∗j⟩ )+ ,
where a+ = max{0, a}. Then f is K-monotone. Indeed, if x ≤K y, that is, y − x ∈ K, then ⟨y − x, x∗j⟩ ≥ 0 – equivalently ⟨y, x∗j⟩ ≥ ⟨x, x∗j⟩ – for all x∗j. This implies that f(x) ≤ f(y).
Note that the sets {x | f(x) = 0} and {x | f(x) = 1} are dense in X: we have f(x) = 0 for any x ∈ span({xi}∞i=1), whereas f(x) = 1 for any x ∈ span({xi}∞i=1) + Σ_{i=1}^{∞} 2^{−i} xi/‖xi‖. As a result, f(x) is nowhere continuous, whereas for any x ∈ X there exists d ∈ X such that f(x + td) is continuous at t = 0. In fact, f(x + txi) is continuous in t, with f(x + txi) = f(x) for any t and any xi. In greater generality, f(x + y) = f(x) for any y ∈ span({xi}∞i=1), as for large enough j, we have ⟨x + y, x∗j⟩ = ⟨x, x∗j⟩. Thus there exists a set D dense in X such that, for every x ∈ X and every d ∈ D, the function t → f(x + td) is continuous. As in Example 2, the lower and upper semicontinuous hulls of f are the constant functions l(x) = 0 and u(x) = 1, respectively, and in particular, they are smooth.

We now need to introduce another notion of a null set in a separable Banach space, more general than that of a Gauss null set, described in the comments preceding Theorem 2. A Borel set C ⊂ X is called Haar null if there is a Borel probability measure μ on X such that μ(C + x) = 0 for all x ∈ X. Haar null sets include all Gauss null sets. The nonnegative cone in lp, 1 ≤ p < ∞, is Haar null but not Gauss null (in fact it is not σ-directionally null, a notion weaker than Gauss null), whereas the nonnegative cone in c0 is not Haar null. In the example below, we use a fact that follows from Theorem 6.4 in Benyamini and Lindenstrauss [1]: if a set S is not Haar null, then S − S contains a neighborhood of 0.

Example 4 (continuity, but only on a dense subset of a separable and nonreflexive Banach space). Let X be a separable, nonreflexive space, and Y ⊂ X a hyperplane not containing 0. Let C ⊂ Y be a closed convex set, with empty interior and not Haar null with respect to Y. In X, consider K = IR+C = {rC | r ∈ [0, +∞)}, and note that this set is a closed convex cone: any description of C as {x ∈ X | ⟨x, aγ⟩ ≥ bγ, γ ∈ Γ} for aγ ∈ X∗, bγ ∈ IR leads to K = {x ∈ X | ⟨x, aγ⟩ ≥ bγ for those γ ∈ Γ with bγ = 0}. Moreover, K has empty interior and is not Haar null. Indeed, suppose that μ(K) = 0 for some Borel probability measure on X. Then μ′(C) = 0, where μ′ is a Borel probability measure on Y defined by μ′(A) = μ(IR+A), and this contradicts C being non Haar null. Also note that K − K = X, as K − K is a cone, and, since K is not Haar null, K − K is a neighborhood of 0. Define a function f : X → IR by

f(x) = 0 if x ∈ −K,   f(x) = 1 if x ∉ −K.
We check that f is K-monotone. The only way this could fail is if for some y ≥K x, f (y) = 0 and f (x) = 1. But f (y) = 0 states that y ∈ −K, y ≥K x implies x = y − k for some k ∈ K, and since −K is a convex cone, x = y + (−k) ∈ −K + (−K) = −K. Thus x ∈ −K and f (x) cannot equal 1. Thus f is K-monotone. Additionally, as K is closed and convex, the function f is, respectively, lower semicontinuous and quasiconvex (level sets {x | f (x) ≤ r} are closed and convex). Moreover, f is generically continuous by Fort’s theorem (see Borwein, Fitzpatrick and Kenderov [3]). However, for every x ∈ −K, there exists d ∈ X such that t → f (x + td) is not continuous at t = 0. Indeed, suppose that this failed for some x0 ∈ −K. Then for every d ∈ X there exists (d) > 0 so that |t| < (d) implies f (x0 + td) = 0, and so x0 + td ∈ −K. Thus x0 is an absorbing point of a closed convex set −K, and, as X is barelled, x0 ∈ int(−K). But the latter set is empty. To sum up, f is continuous on a dense set but it is not continuous (and not differentiable) at any point of the given non Haar null set. The closed convex set C ⊂ Y in the example above was chosen to be not Haar null and has no interior. Such a set exists in any nonreflexive space, and in fact can be chosen to contain a translate of every compact subset of the space – see Benyamini and Lindenstrauss [1]. In c0 , the nonnegative cone is such a set (this requires the cone to be not Haar null), whereas the Haar null nonnegative cone in l1 is not. Still, in l1 , and in fact in any nonreflexive space, a cone satisfying the mentioned conditions can be found. Indeed, suppose the set C of Example 4 contains translates of all compact subsets of Y . We show that the constructed cone K contains a translate of every compact subset of X. Pick any compact D ⊂ X. Let g ∈ X ∗ be such that Y = g −1 (1). Shift D by z1 so that mind∈D+z1 g(d) = 1, and moreover, so that (D + z1 ) ∩ C = ∅. Pick any v ∈ (D + z1 ) ∩ C, and let E ⊂ Y be the projection of D onto Y in the direction v. Then E is a compact subset of Y , and thus for some z2 ∈ ker g, E + z2 ⊂ C. Now note that E + z2 is exactly the projection in the direction v onto Y of the set D + z1 + z2 , which implies that the latter set is a subset of C + IR+ v. Now C + IR+ v ⊂ K, as C ⊂ K and v ∈ C. In effect, K contains D + z1 + z2 . We now address another question on regularity of cone-monotone functions. Weak and strong notions of lower semicontinuity for convex functions agree. One cannot expect the same to hold for monotone functions, as the following example demonstrates. Example 5 (Lipschitz continuity, but no weak continuity). Let X = c0 with the supremum norm. The nonnegative cone K is closed, convex, has empty interior but is not Haar null. Fix any a ∈ X with a > 0 (a has positive coordinates) and define f : X → IR by f (x) =
‖x⁺‖ / ( ‖x⁺‖ + ‖(a − x)⁺‖ ).
The denominator is never 0, as at least one of the summands is always positive, and thus f is continuous. In fact, ‖x⁺‖ + ‖(a − x)⁺‖ ≥ ‖a‖, and since both the numerator and the denominator are Lipschitz, so is f. Note also that f(X) = [0, 1], with f(x) = 0 if and only if x ≤ 0, and f(x) = 1 if and only if x ≥ a. We check that f is monotone. For any y ≥ x we have y⁺ ≥ x⁺, and (a − x)⁺ ≥ (a − y)⁺, since a − x ≥ a − y. Then

  ‖y⁺‖ / ( ‖y⁺‖ + ‖(a − y)⁺‖ ) ≥ ‖x⁺‖ / ( ‖x⁺‖ + ‖(a − y)⁺‖ ) ≥ ‖x⁺‖ / ( ‖x⁺‖ + ‖(a − x)⁺‖ ),
where the first inequality stems from the fact that for a fixed β ≥ 0, the function α → α/(α + β) is nondecreasing. Thus f is monotone.
Let {e_n}_{n=1}^∞ be the standard unit vectors in X. Notice that for any fixed α > 0 and large enough n, we have ‖(x − αe_n)⁺‖ = ‖x⁺‖ and ‖(a − x + αe_n)⁺‖ = max{‖(a − x)⁺‖, α}. In effect,

  f(x − αe_n) = ‖(x − αe_n)⁺‖ / ( ‖(x − αe_n)⁺‖ + ‖(a − x + αe_n)⁺‖ ) → ‖x⁺‖ / ( ‖x⁺‖ + max{‖(a − x)⁺‖, α} )

as n → ∞. Note that the last expression is less than f(x) whenever ‖x⁺‖ > 0 and ‖(a − x)⁺‖ < α. Similar analysis leads to

  f(x + αe_n) → max{‖x⁺‖, α} / ( max{‖x⁺‖, α} + ‖(a − x)⁺‖ ),

with the limit greater than f(x) when ‖(a − x)⁺‖ > 0 and ‖x⁺‖ < α. For a given α, the vectors αe_n converge weakly to 0. The constant α can be fixed arbitrarily large, and thus the function f is not weakly lower semicontinuous at any x with ‖x⁺‖ > 0 (equivalently x ∉ −K), and not weakly upper semicontinuous at any x with ‖(a − x)⁺‖ > 0 (equivalently x ∉ a + K).
Consider any x with x_n < 0 for all n. It is easy to verify that

  lim_{t→0} ( f(x + th) − f(x) ) / t = 0

for all h ∈ c_00, that is, for sequences h with finitely many nonzero entries (in fact, the difference quotient is then 0 for all small enough t). As c_00 is dense in X, and f is Lipschitz continuous, f is Gâteaux differentiable at x, with the derivative equal to 0 ∈ X*. Similarly, f has Gâteaux derivative 0 ∈ X* at every x such that x_n > a_n for all n.
Theorem 2 of Borwein, Burke and Lewis states that functions monotone with respect to a cone with interior are Gâteaux differentiable outside a Gauss null set. In the example below, we show a failure of that conclusion for a cone with empty interior, even in a Hilbert space.
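Before turning to that Hilbert-space example, the behaviour just described for Example 5 is easy to observe numerically. The following sketch is ours, not the authors': it truncates c_0 to the first N coordinates (so norms are only approximated) and exhibits the drop of f(x − αe_n) below f(x) for a late coordinate n, in line with the failure of weak lower semicontinuity at points with ‖x⁺‖ > 0.

```python
# Numerical illustration of Example 5 on a finite truncation of c0 (illustration only:
# only the first N coordinates are kept, which slightly perturbs the sup-norms).
N = 2000
a = [1.0 / (k + 1) for k in range(N)]            # a > 0 coordinatewise
x = [0.5 if k < 10 else -1.0 / (k + 1) for k in range(N)]

pos = lambda v: [max(t, 0.0) for t in v]
sup = lambda v: max(abs(t) for t in v)

def f(v):
    num = sup(pos(v))
    return num / (num + sup(pos([a[k] - v[k] for k in range(N)])))

alpha = 5.0
n = N - 1                                        # a "late" coordinate: alpha*e_n is nearly weakly null
x_shift = list(x)
x_shift[n] -= alpha

print(f(x), f(x_shift))   # f drops from 0.5 to about 0.09: no weak lower semicontinuity at x
```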
We first recall that the nonnegative cone in c0 is not Haar null, and so is not Gaussian null, whereas those in lp , 1 < p < ∞ are Haar null but not Gauss null. To see that the nonnegative cone in l2 is not Gauss null, observe for example that it contains the interval J = {x ∈ l2 | 0 ≤ xn ≤ 1/8n , n = 1, 2, . . .} and apply the fact that the closed convex hull of any norm-convergent to 0 sequence with dense span is not Gauss null (Theorem 6.24 in Benyamini and Lindenstrauss). The non Gauss null interval J will be used in the example below. Example 6 (Holder continuity, lack of Gˆ ateaux differentiability). We show that the Holder continuous function f (x) = x+ , monotone with respect to the nonnegative cone K, fails to be Gˆ ateaux differentiable at any point of −K, both in c0 with the supremum norm and in the Hilbert space l2 with the standard l2 –norm. We discuss the c0 case first. Pick any x ∈ −K. If xn = 0 for some n, ateaux differentiable (the then considering xn + ten shows that f is not Gˆ directional derivative in the direction of en is infinite). Suppose that xn < 0 for all n. Let h be given by hn = (−xn )1/3 , and consider tk = 2(−xk )2/3 converging to 0 as k → ∞. We have √ f (x + tk h) xk + tk hk f (x + tk h) − f (x) = ≥ tk tk tk 1/2 (−xk ) 1 = = , 2(−xk )2/3 2(−xk )1/6 and the last expression diverges to +∞ as k → ∞. Thus f is not differentiable at x. ateaux differentiable We now turn to l2 , and first show that f fails to be Gˆ on the non Gauss null interval −J = {x ∈ l2 | − 1/8n ≤ xn ≤ 0, n = 1, 2, . . .}. Indeed, for any x ∈ −J, consider h with hn = 1/2n and tk = 1/2k . Then √ −1/8k + 1/4k xk + tk hk f (x + tk h) − f (x) ≥ ≥ = 1 − 1/2k , tk tk 1/2k and, if the function was Gˆ ateaux differentiable at x, the directional derivative in the direction h would be at least 1. But this is impossible, as x provides a global minimum for f .
To see that f fails to be Gˆ ateaux differentiable at any point x ∈ −K, note that for some sequence ni , we have −1/8i ≤ xni ≤ 0. An argument as above, with hni = 1/2i and 0 otherwise, leads to the desired conclusion. A slight variation on Example 6 leads to a continuous function on c0 , monotone with respect to the nonnegative cone K, but not Gˆ ateaux differentiable at any point of a dense set c00 − K (any point with finite number of positive coefficients). Let {ki }∞ i=1 be dense in K, and define ∞ (x − ki )+ . f (x) = 2i i=1 Monotonicity is easy to check, and f is continuous as (x − ki )+ ≤ x+ . If ateaux at x. Indeed, in such a case we for some i, x ≤K ki , then f is not Gˆ have, for h ≥K 0 and t > 0, (x − ki + th)+ − (x − ki )+ f (x + th) − f (x) ≥ . t 2i t Picking t and h as in Example 6 (for x − Ki ) leads to the desired conclusion. The close relationship of cone-monotone and Lipschitz functions suggests that badly behaved cone-monotone functions will exist in spaces where irregular Lipschitz functions do. For example, p(x) = lim sup |xn | n→∞
is a nowhere Gâteaux differentiable continuous seminorm on l_∞; see Phelps [8]. Arguments similar to those given by Phelps show that the function f: l_∞ → IR, given by

  f(x) = lim sup_{n→∞} x_n⁺ − lim sup_{n→∞} x_n⁻,

though monotone with respect to the nonnegative cone, is not Gâteaux differentiable outside c_0, that is, at any x for which at least one of lim sup x_n⁺ and lim sup x_n⁻ is positive. (Recall α⁻ = −α when α < 0 and α⁻ = 0 otherwise.) Indeed, suppose that lim sup x_n⁺ = α > 0, and choose a subsequence n_k so that lim x_{n_k} = α. Define h by h_{n_{2i}} = 1, h_{n_{2i+1}} = −1, i = 1, 2, . . ., and h_n = 0 for n ≠ n_k, k = 1, 2, . . .. Notice that for t close enough to 0, lim sup (x + th)_n⁺ = α + |t| (if t > 0 then (x + th)_{n_{2i}} = x_{n_{2i}} + t, i = 1, 2, . . ., and these terms provide the lim sup (x + th)_n⁺; if t < 0 one should consider (x + th)_{n_{2i+1}} = x_{n_{2i+1}} − t). On the other hand, lim sup (x + th)_n⁻ = lim sup x_n⁻, and in effect, the limit of (f(x + th) − f(x))/t as t → 0 does not exist. The case of lim sup x_n⁻ = β > 0 can be treated in a symmetric fashion.
Borwein, Burke and Lewis [2] show that any directionally Lipschitz function decomposes (locally) into a linear function and a cone-monotone one (with respect to a cone with interior). Consequently, nondifferentiable
Lipschitz functions lead to local examples of nonregular cone-monotone functions. On spaces where there exist nowhere differentiable globally Lipschitz functions, like l_∞ or l_1(Γ) with Γ uncountable, one can in fact construct nowhere Gâteaux differentiable cone-monotone functions; we carry this out explicitly in our final example. We note that the technique of Example 7 can be used to construct cone-monotone functions (with respect to cones with nonempty interiors) from any given Lipschitz function, on spaces like c_0 and l_p. Also note that spaces which admit a nowhere Fréchet differentiable convex function (spaces which are not Asplund spaces) also admit a nowhere Fréchet differentiable renorm (and so a nowhere Fréchet differentiable globally Lipschitz function); the situation is not well understood for Gâteaux differentiability.

Example 7 (a nowhere Gâteaux differentiable function on l_∞). As discussed above, p(x) = lim sup_{n→∞} |x_n| is nowhere Gâteaux differentiable on l_∞. We use this fact to construct a nowhere Gâteaux differentiable function, monotone with respect to a cone with interior.
Let e_1 be the first of the standard unit vectors in l_∞, and consider the function f(x) = p(x) + ⟨e_1, x⟩ = p(x) + x_1 and the cone K = IR_+ B_{1/2}(e_1) (the cone generated by the closed ball of radius 1/2 centered at e_1). Then K has interior and f is K-monotone. Indeed, as for any x ∈ B_{1/2}(e_1) we have x_1 ≥ 1/2 while |x_r| ≤ 1/2, r = 2, 3, . . ., it follows that ‖k‖ = k_1 (for the supremum norm) for any k ∈ K. As p(x) is Lipschitz continuous, with constant 1, we obtain, for any x ∈ X, k ∈ K,

  p(x + k) − p(x) ≥ −‖k‖ ≥ −⟨e_1, k⟩,

which translates to p(x + k) + ⟨e_1, x + k⟩ ≥ p(x) + ⟨e_1, x⟩, and this means that f is K-monotone. As p is nowhere Gâteaux differentiable, so is f.

Acknowledgments The first author's research was partially supported by NSERC and by the Canada Research Chair Programme. The second author performed this research at the Centre for Experimental and Constructive Mathematics at Simon Fraser University and at the Department of Mathematics at the University of British Columbia.
References 1. Y. Benyamini and J. Lindenstrauss,Geometric Nonlinear Functional Analysis, Vol. 1 (American Mathematical Society, Providence, RI, 2000). 2. J. M. Borwein, J. Burke and A. S. Lewis, Differentiability of cone–monotone functions on separable Banach space, Proc. Amer. Math. Soc., 132 (2004), 1067–1076. 3. J. M. Borwein, S. Fitzpatrick and P. Kenderov, Minimal convex uscos and monotone operators on small sets, Canad. J. Math. 43 (1991), 461–477.
4. J. Burke, A. S. Lewis and M. L. Overton, Approximating subdifferentials by random sampling of gradients, Math. Oper. Res. 27 (2002), 567–584. 5. Y. Chabrillac and J.-P. Crouzeix, Continuity and differentiability properties of monotone real functions of several real variables, Nonlinear Analysis and Optimization (Louvain-la-Neuve, 1983); Math. Programming Stud. 30 (1987), 1–16. 6. R. B. Holmes, Geometric Functional Analysis and its Applications (Springer-Verlag, New York, 1975). 7. V. Klee, Convex sets in linear spaces, Duke Math. J. 8 (1951), 433–466. 8. R. Phelps, Convex Functions, Monotone Operators and Differentiability (SpringerVerlag, New York, 1993). 9. S. Saks, Theory of the Integral, English translation, second edition (Stechert, New York, 1937).
Chapter 2
Duality and a Farkas lemma for integer programs Jean B. Lasserre
Abstract We consider the integer program max{c x | Ax = b, x ∈ Nn }. A formal parallel between linear programming and continuous integration, and discrete summation, shows that a natural duality for integer programs can be derived from the Z-transform and Brion and Vergne’s counting formula. Along the same lines, we also provide a discrete Farkas lemma and show that the existence of a nonnegative integral solution x ∈ Nn to Ax = b can be tested via a linear program. Key words: Integer programming, counting problems, duality
Jean B. Lasserre, LAAS-CNRS, 7 Avenue du Colonel Roche, 31077 Toulouse Cédex 4, FRANCE. e-mail: [email protected]

2.1 Introduction

In this paper we are interested in a comparison between linear and integer programming, and particularly in a duality perspective. So far, and to the best of our knowledge, the duality results available for integer programs are obtained via the use of subadditive functions as in Wolsey [21], for example, and the smaller class of Chvátal and Gomory functions as in Blair and Jeroslow [6], for example (see also Schrijver [19, pp. 346–353]). For more details the interested reader is referred to [1, 6, 19, 21] and the many references therein. However, as subadditive, Chvátal and Gomory functions are only defined implicitly from their properties, the resulting dual problems defined in [6] or [21] are conceptual in nature and Gomory functions are used to generate valid inequalities for the primal problem.
We claim that another natural duality for integer programs can be derived from the Z-transform (or generating function) associated with the counting
version (defined below) of the integer program. Results for counting problems, notably by Barvinok [4], Barvinok and Pommersheim [5], Khovanskii and Pukhlikov [12], and in particular, Brion and Vergne's counting formula [7], will prove especially useful.
For this purpose, we will consider the four related problems P, Pd, I and Id displayed below, in which the integer program Pd appears in the upper right corner.

  Continuous optimization:  P:  f(b, c) := max { c'x | Ax = b, x ∈ R^n_+ }
  Discrete optimization:    Pd: f_d(b, c) := max { c'x | Ax = b, x ∈ N^n }
  Integration:              I:  f̂(b, c) := ∫_{Ω(b)} e^{c'x} ds,  with Ω(b) := { x ∈ R^n_+ | Ax = b }
  Summation:                Id: f̂_d(b, c) := Σ_{x∈Ω(b)} e^{c'x},  with Ω(b) := { x ∈ N^n | Ax = b }

Problem I (in which ds denotes the Lebesgue measure on the affine variety {x ∈ R^n | Ax = b} that contains the convex polyhedron Ω(b)) is the integration version of the linear program P, whereas Problem Id is the counting version of the (discrete) integer program Pd. Why should these four problems help in analyzing Pd? Because first, P and I, as well as Pd and Id, are simply related, and in the same manner. Next, as we will see, the nice and complete duality results available for P, I and Id extend in a natural way to Pd.
2.1.1 Preliminaries

In fact, I and Id are the respective formal analogues in the algebra (+, ×) of P and Pd in the algebra (⊕, ×), where in the latter the addition a ⊕ b stands for max(a, b); indeed, the "max" in P and Pd can be seen as an idempotent integral (or Maslov integral) in this algebra (see, for example, Litvinov et al. [17]). For a nice parallel between results in probability ((+, ×) algebra) and optimization ((max, +) algebra), the reader is referred to Bacelli et al. [3, Section 9]. Moreover, P and I, as well as Pd and Id, are simply related via

  e^{f(b,c)} = lim_{r→∞} f̂(b, rc)^{1/r};   e^{f_d(b,c)} = lim_{r→∞} f̂_d(b, rc)^{1/r}.   (2.1)
Equivalently, by continuity of the logarithm,

  f(b, c) = lim_{r→∞} (1/r) ln f̂(b, rc);   f_d(b, c) = lim_{r→∞} (1/r) ln f̂_d(b, rc),   (2.2)
a relationship that will be useful later. Next, concerning duality, the standard Legendre-Fenchel transform which yields the usual dual LP of P,

  P* →  min_{λ∈R^m} { b'λ | A'λ ≥ c },   (2.3)
has a natural analogue for integration, the Laplace transform, and thus the inverse Laplace transform problem (that we call I∗ ) is the formal analogue of P∗ and provides a nice duality for integration (although not usually presented in these terms). Finally, the Z-transform is the obvious analogue for summation of the Laplace transform for integration. We will see that in the light of recent results in counting problems, it is possible to establish a nice duality for Id in the same vein as the duality for (continuous) integration and by (2.2), it also provides a powerful tool for analyzing the integer program Pd .
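Relationship (2.2) is easy to verify numerically on a toy instance. The sketch below is ours, purely for illustration (the data A = [1 2], b = 7 and c are not taken from the chapter): it enumerates the feasible integer points, computes the counting value f̂_d(b, rc) by brute force, and shows (1/r) ln f̂_d(b, rc) approaching f_d(b, c) as r grows.

```python
# Numerical illustration of (2.2): f_d(b, c) = lim_{r -> oo} (1/r) ln \hat f_d(b, rc).
# Toy instance (our own choice): A = [1 2], b = 7, c = (1.0, 1.7).
import math
from itertools import product

A = [1, 2]          # single constraint  x1 + 2*x2 = 7
b = 7
c = [1.0, 1.7]

# Feasible integer points of Omega(b), found by brute force.
points = [(x1, x2) for x1, x2 in product(range(b + 1), repeat=2)
          if A[0] * x1 + A[1] * x2 == b]

f_d = max(c[0] * x1 + c[1] * x2 for x1, x2 in points)   # integer program value

for r in (1, 4, 16, 64):
    # counting value \hat f_d(b, rc) = sum over feasible x of e^{r c'x}
    hat_f_d = sum(math.exp(r * (c[0] * x1 + c[1] * x2)) for x1, x2 in points)
    print(r, math.log(hat_f_d) / r)   # tends to f_d = 7.0 as r grows

print("f_d(b, c) =", f_d)
```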
2.1.2 Summary of content (a) We first review the duality principles that are available for P, I and Id and underline the parallels and connections between them. In particular, a fundamental difference between the continuous and discrete cases is that in the former, the data appear as coefficients of the dual variables whereas in the latter, the same data appear as exponents of the dual variables. Consequently, the (discrete) Z-transform has many more poles than the Laplace transform. Whereas the Laplace transform has only real poles, the Z-transform has additional complex poles associated with each real pole, which induces some periodic behavior, a well-known phenomenon in number theory where the Z-transform (or generating function) is a standard tool (see, for example, Iosevich [11], Mitrinov´ıc et al. [18]). So, if the procedure of inverting the Laplace transform or the Z-transform (that is, solving the dual problems I∗ and I∗d ) is basically of the same nature, that is, a complex integral, it is significantly more complicated in the discrete case, due to the presence of these additional complex poles. (b) Then we use results from (a) to analyze the discrete optimization problem Pd . Central to the analysis is Brion and Vergne’s inverse formula [7] for counting problems. In particular, we provide a closed-form expression for the optimal value fd (b, c) which highlights the special role played by the so-called reduced costs of the linear program P and the complex poles of the Z-transform associated with each basis of the linear program P. We also show that each basis B of the linear program P provides exactly det(B)
complex dual vectors in Cm , the complex (periodic) analogues for Pd of the unique dual vector in Rm for P, associated with the basis B. As in linear programming (but in a more complicated way), the optimal value fd (b, c) of Pd can be found by inspection of (certain sums of) reduced costs associated with each vertex of Ω(b). (c) We also provide a discrete Farkas lemma for the existence of nonnegative integral solutions x ∈ Nn to Ax = b. Its form also confirms the special role of the Z-transform described earlier. Moreover, it allows us to check the existence of a nonnegative integral solution by solving a related linear program.
2.2 Duality for the continuous problems P and I

With A ∈ R^{m×n} and b ∈ R^m, let Ω(b) ⊂ R^n be the convex polyhedron

  Ω(b) := { x ∈ R^n | Ax = b; x ≥ 0 },   (2.4)

and consider the standard linear program (LP)

  P:  f(b, c) := max { c'x | Ax = b; x ≥ 0 }   (2.5)

with c ∈ R^n, and its associated integration version

  I:  f̂(b, c) := ∫_{Ω(b)} e^{c'x} ds,   (2.6)
where ds is the Lebesgue measure on the affine variety {x ∈ Rn | Ax = b} that contains the convex polyhedron Ω(b). For a vector c and a matrix A we denote by c and A their respective transposes. We also use both the notation c x and c, x for the usual scalar product of two vectors c and x. We assume that both A ∈ Rm×n and b ∈ Rm have rational entries.
2.2.1 Duality for P

It is well known that the standard duality for (2.5) is obtained from the Legendre-Fenchel transform F(·, c): R^m → R of the value function f(b, c) with respect to b, that is, here (as y → f(y, c) is concave)

  λ → F(λ, c) := inf_{y∈R^m} ⟨λ, y⟩ − f(y, c),   (2.7)

which yields the usual dual LP problem

  P* →  inf_{λ∈R^m} ⟨λ, b⟩ − F(λ, c) = min_{λ∈R^m} { b'λ | A'λ ≥ c }.   (2.8)
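For completeness, the absence of a duality gap in (2.5)/(2.8) can be checked by hand on a one-constraint instance; the data below are our own illustrative choice, and the primal maximum over the two vertices of Ω(b) coincides with the dual minimum.

```python
# Strong LP duality for (2.5)/(2.8) on a toy instance (ours): A = [1 2], b = 6, c = (1.0, 1.7).
A = [1.0, 2.0]
b = 6.0
c = [1.0, 1.7]

# Primal max{c'x | x1 + 2*x2 = b, x >= 0}: evaluate c'x at the two vertices of Omega(b).
vertices = [(b / A[0], 0.0), (0.0, b / A[1])]
primal = max(c[0] * v[0] + c[1] * v[1] for v in vertices)

# Dual (2.8): min{b*lam | A'lam >= c} with a single dual variable lam.
lam = max(c[0] / A[0], c[1] / A[1])      # smallest feasible lam
dual = b * lam

print(primal, dual)                      # both equal 6.0: no duality gap
```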
2.2.2 Duality for integration

Similarly, the analogue for integration of the Fenchel transform is the two-sided Laplace transform F̂(·, c): C^m → C of f̂(b, c), given by

  λ → F̂(λ, c) := ∫_{R^m} e^{−⟨λ,y⟩} f̂(y, c) dy.   (2.9)

It turns out that developing (2.9) yields

  F̂(λ, c) = Π_{k=1}^n 1/(A'λ − c)_k   whenever Re(A'λ − c) > 0,   (2.10)

(see for example [7, p. 798] or [13]). Thus F̂(λ, c) is well-defined provided

  Re(A'λ − c) > 0,   (2.11)

and f̂(b, c) can be computed by solving the inverse Laplace transform problem, which we call the (integration) dual problem I* of I, that is,

  I* →  f̂(b, c) := (1/(2iπ)^m) ∫_{γ−i∞}^{γ+i∞} e^{⟨b,λ⟩} F̂(λ, c) dλ
                 = (1/(2iπ)^m) ∫_{γ−i∞}^{γ+i∞} e^{⟨b,λ⟩} / Π_{k=1}^n (A'λ − c)_k dλ,   (2.12)
where γ ∈ Rm is fixed and satisfies A γ −c > 0. Incidentally, observe that the domain of definition (2.11) of F (., c) is precisely the interior of the feasible set of the dual problem P∗ in (2.8). We will comment more on this and the link with the logarithmic barrier function for linear programming (see Section 2.2.5 below). We may indeed call I∗ a dual problem of I as it is defined on the space Cm of variables {λk } associated with the nontrivial constraints Ax = b; notice that we also retrieve the standard “ingredients” of the dual optimization problem P∗ , namely b λ and A λ − c.
2.2.3 Comparing P, P∗ and I, I∗ One may compute f (b, c) directly using Cauchy residue techniques. That is, one may compute the integral (2.12) by successive one-dimensional complex integrals with respect to one variable λk at a time (for example starting with λ1 , λ2 , . . .) and by repeated application of Cauchy’s Residue Theorem
[8]. This is possible because the integrand is a rational fraction, and after application of Cauchy’s Residue Theorem at step k with respect to λk , the ouput is still a rational fraction of the remaining variables λk+1 , . . . , λm . For more details the reader is referred to Lasserre and Zeron [13]. It is not difficult to see that the whole procedure is a summation of partial results, each of them ∈ Rm that annihilates m terms of corresponding to a (multi-pole) vector λ n products in the denominator of the integrand. This is formalized in the nice formula of Brion and Vergne [7, Proposition 3.3 p. 820] that we describe below. For the interested reader, there are several other nice closed-form formulae for f (b, c), notably by Barvinok [4], Barvinok and Pommersheim [5], and Khovanskii and Pukhlikov [12].
2.2.4 The continuous Brion and Vergne formula The material in this section is taken from [7]. To explain the closed-form formula of Brion and Vergne we need some notation. Write the matrix A ∈ Rm×n as A = [A1 | . . . |An ] where Aj ∈ Rm denotes the j-th column of A for all j = 1, . . . , n. With Δ := (A1 , . . . , An ) let C(Δ) ⊂ Rm be the closed convex cone generated by Δ. Let Λ ⊆ Zm be a lattice. A subset σ of {1, . . . , n} is called a basis of Δ if the sequence {Aj }j∈σ is a basis of Rm , and the set of bases of Δ is denoted by B(Δ). For σ ∈ B(Δ) let C(σ) be the cone generated by {Aj }j∈σ . With any y ∈ C(Δ) associate the intersection of all cones C(σ) which contain y. This defines a subdivision of C(Δ) into polyhedral cones. The interiors of the maximal cones in this subdivision are called chambers in Brion and Vergne [7]. For every y ∈ γ, the convex polyhedron Ω(y) in (2.4) is simple. Next, for a chamber γ (whose closure is denoted by γ), let B(Δ, γ) be the set of bases σ such that γ is contained in C(σ), and let μ(σ) denote the volume of the convex polytope m { j∈σ tj Aj | 0 ≤ tj ≤ 1} (normalized so that vol(R /Λ) = 1). Observe that for b ∈ γ and σ ∈ B(Δ, γ) we have b = j∈σ xj (σ)Aj for some xj (σ) ≥ 0. Therefore the vector x(σ) ∈ Rn+ , with xj (σ) = 0 whenever j ∈ σ, is a vertex of the polytope Ω(b). In linear programming terminology, the bases σ ∈ B(Δ, γ) correspond to the feasible bases of the linear program P. Denote by V the subspace {x ∈ Rn | Ax = 0}. Finally, given σ ∈ B(Δ), let π σ ∈ Rm be the row vector that solves π σ Aj = cj for all j ∈ σ. A vector c ∈ Rn is said to be regular if cj − π σ Aj = 0 for all σ ∈ B(Δ) and all j ∈ σ. Let c ∈ Rn be regular with −c in the interior of the dual cone (Rn+ ∩ V )∗ (which is the case if A u > c for some u ∈ Rm ). Then, with Λ = Zm , Brion and Vergne’s formula [7, Proposition 3.3, p. 820] states that f (b, c) =
Σ_{σ∈B(Δ,γ)} μ(σ) e^{⟨c,x(σ)⟩} / Π_{k∉σ} ( −c_k + π^σ A_k )   for all b ∈ γ̄.   (2.13)
Notice that in linear programming terminology, ck − π σ Ak is simply the socalled reduced cost of the variable xk , with respect to the basis {Aj }j∈σ . Equivalently, we can rewrite (2.13) as
f̂(b, c) = Σ_{x(σ): vertex of Ω(b)} μ(σ) e^{⟨c,x(σ)⟩} / Π_{k∉σ} ( −c_k + π^σ A_k ).   (2.14)
Thus f̂(b, c) is a weighted summation over the vertices of Ω(b), whereas f(b, c) is a maximization over the vertices (or a summation with ⊕ ≡ max). So, if c is replaced by rc and x(σ*) denotes the vertex of Ω(b) at which c'x is maximized, we obtain

  f̂(b, rc)^{1/r} = e^{⟨c,x(σ*)⟩} [ Σ_{x(σ): vertex of Ω(b)} μ(σ) e^{r⟨c, x(σ)−x(σ*)⟩} / ( r^{n−m} Π_{k∉σ} (−c_k + π^σ A_k) ) ]^{1/r},

from which it easily follows that

  lim_{r→∞} ln f̂(b, rc)^{1/r} = ⟨c, x(σ*)⟩ = max_{x∈Ω(b)} ⟨c, x⟩ = f(b, c),

as indicated in (2.2).
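For the simplest possible data, formula (2.14) can be checked directly against numerical integration. The sketch below uses a hand-built instance with m = 1 and n = 2 (our own choice), and the normalization of ds used is our reading of the lattice convention vol(R^m/Λ) = 1; with it, the two-vertex closed form agrees with a midpoint-rule quadrature over the segment Ω(b).

```python
# Check of the continuous Brion-Vergne formula (2.13)/(2.14) on the instance A = [1 1]:
# Omega(b) is the segment from (b, 0) to (0, b), both bases are feasible, mu(sigma) = 1,
# pi^sigma = c_sigma, and (2.14) reads  e^{c1 b}/(c1 - c2) + e^{c2 b}/(c2 - c1).
import math

b, c1, c2 = 3.0, 0.7, 0.2          # c regular: c1 != c2

closed_form = math.exp(c1 * b) / (c1 - c2) + math.exp(c2 * b) / (c2 - c1)

# Numerical integration of e^{c'x} over Omega(b).  With the normalization vol(R^m/Z^m) = 1
# (our reading of the text), the measure on {x1 + x2 = b} reduces to dt for x = (t, b - t).
N = 200000
h = b / N
quad = sum(math.exp(c1 * (i + 0.5) * h + c2 * (b - (i + 0.5) * h)) for i in range(N)) * h

print(closed_form, quad)           # the two values agree to several digits
```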
2.2.5 The logarithmic barrier function

It is also worth noticing that

  f̂(b, rc) = (1/(2iπ)^m) ∫_{γ_r−i∞}^{γ_r+i∞} e^{⟨b,λ⟩} / Π_{k=1}^n (A'λ − rc)_k dλ
            = (1/(2iπ)^m) ∫_{γ−i∞}^{γ+i∞} r^{m−n} e^{r⟨b,λ⟩} / Π_{k=1}^n (A'λ − c)_k dλ

with γ_r = rγ, and we can see that (up to the constant (m − n) ln r) the logarithm of the integrand is simply the well-known logarithmic barrier function

  λ → φ_μ(λ, b) = μ^{−1} ⟨b, λ⟩ − Σ_{j=1}^n ln (A'λ − c)_j,
with parameter μ := 1/r, of the dual problem P∗ (see for example den Hertog [9]). This should not come as a surprise as a self-concordant barrier function of a cone K ⊂ Rn is given by the logarithm of the Laplace transform φK (x)−x,s ε ds of its dual cone K ∗ (see for example G¨ uler [10], Truong and K∗ Tun¸cel [20]). Thus, when r → ∞, minimizing the exponential logarithmic barrier function on its domain in Rm yields the same result as taking its residues.
2.2.6 Summary

The parallel between P, P* and I, I* is summarized below.

Fenchel duality:
  f(b, c) := max { c'x | Ax = b; x ≥ 0 }
  F(λ, c) := inf_{y∈R^m} { λ'y − f(y, c) },   with A'λ − c ≥ 0
  f(b, c) = min_{λ∈R^m} { λ'b − F(λ, c) } = min_{λ∈R^m} { b'λ | A'λ ≥ c }
  Simplex algorithm → vertices of Ω(b) → max c'x over vertices.

Laplace duality:
  f̂(b, c) := ∫_{Ax=b; x≥0} e^{c'x} ds
  F̂(λ, c) := ∫_{R^m} e^{−λ'y} f̂(y, c) dy = Π_{k=1}^n 1/(A'λ − c)_k,   with Re(A'λ − c) > 0
  f̂(b, c) = (1/(2iπ)^m) ∫_Γ e^{λ'b} F̂(λ, c) dλ = (1/(2iπ)^m) ∫_Γ e^{λ'b} / Π_{k=1}^n (A'λ − c)_k dλ
  Cauchy's Residue → poles of F̂(λ, c) → e^{c'x} over vertices.
2.3 Duality for the discrete problems Id and Pd

In the respective discrete analogues Pd and Id of (2.5) and (2.6) one replaces the positive cone R^n_+ by N^n (or R^n_+ ∩ Z^n), that is, (2.5) becomes the integer program

  Pd:  f_d(b, c) := max { c'x | Ax = b; x ∈ N^n }   (2.15)
whereas (2.6) becomes a summation over N^n ∩ Ω(b), that is,

  Id:  f̂_d(b, c) := Σ { e^{c'x} | Ax = b; x ∈ N^n }.   (2.16)
We here assume that A ∈ Zm×n and b ∈ Zm , which implies in particular that the lattice Λ := A(Zn ) is a sublattice of Zm (Λ ⊆ Zm ). Note that b in (2.15) and (2.16) is necessarily in Λ. In this section we are concerned with what we call the “dual” problem I∗d of Id , the discrete analogue of the dual I∗ of I, and its link with the discrete optimization problem Pd .
2.3.1 The Z-transform

The natural discrete analogue of the Laplace transform is the so-called Z-transform. Therefore with f̂_d(b, c) we associate its (two-sided) Z-transform F̂_d(·, c): C^m → C defined by

  z → F̂_d(z, c) := Σ_{y∈Z^m} z^{−y} f̂_d(y, c),   (2.17)

where the notation z^y with y ∈ Z^m stands for z_1^{y_1} · · · z_m^{y_m}. Applying this definition yields

  F̂_d(z, c) = Σ_{y∈Z^m} z^{−y} f̂_d(y, c)
            = Σ_{y∈Z^m} z^{−y} [ Σ_{x∈N^n; Ax=y} e^{c'x} ]
            = Σ_{x∈N^n} e^{c'x} [ z_1^{−y_1} · · · z_m^{−y_m} ]_{y=Ax}
            = Σ_{x∈N^n} e^{c'x} z_1^{−(Ax)_1} · · · z_m^{−(Ax)_m}
            = Π_{k=1}^n 1/(1 − e^{c_k} z_1^{−A_{1k}} z_2^{−A_{2k}} · · · z_m^{−A_{mk}})
            = Π_{k=1}^n 1/(1 − e^{c_k} z^{−A_k}),   (2.18)

which is well-defined provided

  |z^{A_k}| ( = |z_1^{A_{1k}} · · · z_m^{A_{mk}}| ) > e^{c_k}   for all k = 1, . . . , n.   (2.19)
Observe that the domain of definition (2.19) of F d (., c) is the exponential version of (2.11) for F (., c). Indeed, taking the real part of the logarithm in (2.19) yields (2.11).
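A concrete way to read (2.18) is that, for c = 0, expanding the product as a formal power series reproduces the lattice-point counting function. The sketch below is our own illustration for the special case m = 1 with the one-row matrix A = [2 3]: it compares the series coefficients of Π_k 1/(1 − t^{A_k}), where t = 1/z, with a brute-force count of representations.

```python
# Illustration of (2.18) for m = 1 and c = 0 (toy instance, ours): A = [2 3].  Writing t = 1/z,
# the product  prod_k 1/(1 - t^{A_k})  is the generating function whose t^b coefficient is
# \hat f_d(b, 0), i.e. the number of x in N^2 with 2*x1 + 3*x2 = b.
B = 20                                   # truncation order of the series

def inv_one_minus_t_power(a, order):
    """Series of 1/(1 - t^a) up to degree 'order' (list of coefficients)."""
    return [1 if k % a == 0 else 0 for k in range(order + 1)]

def mul(p, q, order):
    out = [0] * (order + 1)
    for i, pi in enumerate(p):
        if pi:
            for j, qj in enumerate(q):
                if i + j <= order:
                    out[i + j] += pi * qj
    return out

series = mul(inv_one_minus_t_power(2, B), inv_one_minus_t_power(3, B), B)

brute = [sum(1 for x1 in range(b // 2 + 1) if (b - 2 * x1) % 3 == 0) for b in range(B + 1)]

print(series == brute)                   # True: the product formula counts lattice points
```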
2.3.2 The dual problem I*_d

Therefore the value f̂_d(b, c) is obtained by solving the inverse Z-transform problem I*_d (that we call the dual of Id):

  f̂_d(b, c) = (1/(2iπ)^m) ∮_{|z_1|=γ_1} · · · ∮_{|z_m|=γ_m} F̂_d(z, c) z^{b−e_m} dz_m · · · dz_1,   (2.20)

where e_m := (1, . . . , 1) ∈ R^m and γ ∈ R^m is a (fixed) vector that satisfies γ_1^{A_{1k}} γ_2^{A_{2k}} · · · γ_m^{A_{mk}} > e^{c_k} for all k = 1, . . . , n. We may indeed call I*_d the dual problem of Id as it is defined on the space C^m of dual variables z_k associated with the nontrivial constraints Ax = b of the primal problem Id. We also have the following parallel.

Continuous (Laplace duality):
  f̂(b, c) := ∫_{Ax=b; x∈R^n_+} e^{c'x} ds
  F̂(λ, c) := ∫_{R^m} e^{−λ'y} f̂(y, c) dy = Π_{k=1}^n 1/(A'λ − c)_k,   with Re(A'λ − c) > 0.

Discrete (Z duality):
  f̂_d(b, c) := Σ_{Ax=b; x∈N^n} e^{c'x}
  F̂_d(z, c) := Σ_{y∈Z^m} z^{−y} f̂_d(y, c) = Π_{k=1}^n 1/(1 − e^{c_k} z^{−A_k}),   with |z^{A_k}| > e^{c_k}, k = 1, . . . , n.
k = 1, . . . , n.
2.3.3 Comparing I∗ and I∗d Observe that the dual problem I∗d in (2.20) is of the same nature as I∗ in (2.12) because both reduce to computing a complex integral whose integrand is a rational function. In particular, as I∗ , the problem I∗d can be solved by Cauchy residue techniques (see for example [14]). However, there is an important difference between I∗ and I∗d . Whereas the data {Ajk } appears in I∗ as coefficients of the dual variables λk in F (λ, c), it now appears as exponents of the dual variables zk in F d (z, c). And an immediate consequence of this fact is that the rational function F d (., c) has many more poles than F (., c) (by considering one variable at a time), and in
2
Duality and a Farkas lemma for integer programs
25
particular, many of them are complex, whereas F (., c) has only real poles. As a result, the integration of F d (z, c) is more complicated than that of F (λ, c), which is reflected in the discrete (or periodic) Brion and Vergne formula described below. However, we will see that the poles of F d (z, c) are simply related to those of F (λ, c).
2.3.4 The “discrete” Brion and Vergne formula Brion and Vergne [7] consider the generating function H : Cm → C defined by λ → H(λ, c) := f d (y, c)ε−λ,y , y∈Zm
which, after the change of variable zi = ελi for all i = 1, . . . , m, reduces to F d (z, c) in (2.20). They obtain the nice formula (2.21) below. Namely, and with the same notation used in Section 2.2.4, let c ∈ Rn be regular with −c in the interior of (Rn+ ∩V )∗ , and let γ be a chamber. Then for all b ∈ Λ ∩ γ (recall Λ = A(Zn )), f d (b, c) =
σ∈B(Δ,γ)
εc x(σ) Uσ (b, c) μ(σ)
(2.21)
for some coefficients Uσ (b, c) ∈ R, a detailed expression for which can be found in [7, Theorem 3.4, p. 821]. In particular, due to the occurrence of complex the term Uσ (b, c) in (2.21) is the periodic analogue poles in F (z, c), of ( k ∈σ (ck − πσ Ak ))−1 in (2.14). Again, as for f (b, c), (2.21) can be re-written as
f d (b, c) = x(σ):
vertex of
εc x(σ) Uσ (b, c), μ(σ)
(2.22)
Ω(b)
to compare with (2.14). To be more precise, by inspection of Brion and Vergne’s formula in [7, p. 821] in our current context, one may see that Uσ (b, c) =
ε2iπb (g) , Vσ (g, c)
(2.23)
g∈G(σ)
where G(σ) := (⊕j∈σ ZAj )∗ /Λ∗ (where ∗ denotes the dual lattice); it is a 2iπb finite abelian group of order μ(σ) and with (finitely many) characters ε for all b ∈ Λ. In particular, writing Ak = j∈σ ujk Aj for all k ∈ σ, ε2iπAk (g) = ε2iπ
j∈σ
ujk gj
k ∈ σ.
26
J.B. Lasserre
Moreover, Vσ (g, c) =
1 − ε−2iπAk (g)εck −π
σ
Ak
,
(2.24)
k ∈σ
with Ak , π σ as in (2.13) (and π σ rational). Again note the importance of the reduced costs ck − π σ Ak in the expression for F d (z, c).
2.3.5 The discrete optimization problem Pd We are now in a position to see how I∗d provides some nice information about the optimal value fd (b, c) of the discrete optimization problem Pd . Theorem 1. Let A ∈ Zm×n , b ∈ Zm and let c ∈ Zn be regular with −c in the interior of (Rn+ ∩ V )∗ . Let b ∈ γ ∩ A(Zn ) and let q ∈ N be the least common multiple (l.c.m.) of {μ(σ)}σ∈B(Δ,γ) . If Ax = b has no solution x ∈ Nn then fd (b, c) = −∞, else assume that 1 c x(σ) + lim ln Uσ (b, rc) , max r→∞ r x(σ): vertex of Ω(b) is attained at a unique vertex x(σ) of Ω(b). Then 1 c x(σ) + lim ln Uσ (b, rc) max fd (b, c) = r→∞ r x(σ): vertex of Ω(b) 1 c x(σ) + (deg(Pσb ) − deg(Qσb )) = max q x(σ): vertex of Ω(b) (2.25) for some real-valued univariate polynomials Pσb and Qσb . Moreover, the term limr→∞ ln Uσ (b, rc)/r or (deg(Pσb ) − deg(Qσb ))/q in (2.25) is a sum of certain reduced costs ck − π σ Ak (with k ∈ σ). For a proof see Section 2.6.1. Remark 1. Of course, (2.25) is not easy to obtain but it shows that the optimal value fd (b, c) of Pd is strongly related to the various complex poles of F d (z, c). It is also interesting to note the crucial role played by the reduced costs ck − π σ Ak in linear programming. Indeed, from the proof of Theorem 1 the optimal value fd (b, c) is the value of c x at some vertex x(σ) plus a sum of certain reduced costs (see (2.50) and the form of the coefficients αj (σ, c)). Thus, as for the LP problem P, the optimal value fd (b, c) of Pd can be found by inspection of (certain sums of) reduced costs associated with each vertex of Ω(b).
2
Duality and a Farkas lemma for integer programs
27
We next derive an asymptotic result that relates the respective optimal values fd (b, c) and f (b, c) of Pd and P. Corollary 1. Let A ∈ Zm×n , b ∈ Zm and let c ∈ Rn be regular with −c in the interior of (Rn+ ∩ V )∗ . Let b ∈ γ ∩ Λ and let x∗ ∈ Ω(b) be an optimal vertex of P, that is, f (b, c) = c x∗ = c x(σ ∗ ) for σ ∗ ∈ B(Δ, γ), the unique optimal basis of P. Then for t ∈ N sufficiently large, 1 ln Uσ∗ (tb, rc) . (2.26) fd (tb, c) − f (tb, c) = lim r→∞ r In particular, for t ∈ N sufficently large, the function t → f (tb, c) − fd (tb, c) is periodic (constant) with period μ(σ ∗ ). For a proof see Section 2.6.2. Thus, when b ∈ γ ∩ Λ is sufficiently large, say b = tb0 with b0 ∈ Λ and t ∈ N, the “max” in (2.25) is attained at the unique optimal basis σ ∗ of the LP (2.5) (see details in Section 2.6.2). From Remark 1 it also follows that for sufficiently large t ∈ N, the optimal ∗ value fd (tb, c) is equal to f (tb, c) plus a certain sum of reduced costs ck −π σ Ak ∗ ∗ (with k ∈ σ ) with respect to the optimal basis σ .
2.3.6 A dual comparison of P and Pd We now provide an alternative formulation of Brion and Vergne’s discrete formula (2.22), which explicitly relates dual variables of P and Pd . Recall that a feasible basis of the linear program P is a basis σ ∈ B(Δ) for which A−1 σ b ≥ 0. Thus let σ ∈ B(Δ) be a feasible basis of the linear program P and consider the system of m equations in Cm : A
Amj z1 1j · · · zm = ε cj ,
j ∈ σ.
(2.27)
Recall that Aσ is the nonsingular matrix [Aj1 | · · · |Ajm ], with jk ∈ σ for all k = 1, . . . , m. The above system (2.27) has ρ(σ) (= det(Aσ )) solutions ρ(σ) {z(k)}k=1 , written as z(k) = ελ ε2iπθ(k) ,
k = 1, . . . , ρ(σ)
(2.28)
for ρ(σ) vectors {θ(k)} in Cm . m Indeed, writing z = ελ ε2iπθ (that is, the vector {eλj ε2iπθj }m j=1 in C ) and passing to the logarithm in (2.27) yields Aσ λ + 2iπ Aσ θ = cσ ,
(2.29)
where cσ ∈ Rm is the vector {cj }j∈σ . Thus λ ∈ Rm is the unique solution of Aσ λ = cσ and θ satisfies
Aσ θ ∈ Zm .
(2.30)
Equivalently, θ belongs to (⊕j∈σ Aj Z)∗ , the dual lattice of ⊕j∈σ Aj Z. Thus there is a one-to-one correspondence between the ρ(σ) solutions {θ(k)} and the finite group G (σ) = (⊕j∈σ Aj Z)∗ /Zm , where G(σ) is a subgroup of G (σ). Thus, with G(σ) = {g1 , . . . , gs } and s := μ(σ), we can write (Aσ )−1 gk = θgk = θ(k), so that for every character ε2iπy of G(σ), y ∈ Λ, we have
y ∈ Λ, g ∈ G(σ)
ε2iπy (g) = ε2iπy θg ,
(2.31)
and
ε2iπAj (g) = ε2iπAj θg = 1,
j ∈ σ.
(2.32)
So, for every σ ∈ B(Δ), denote by {zg }g∈G(σ) these μ(σ) solutions of (2.28), that is, g ∈ G(σ), (2.33) zg = ελ ε2iπθg ∈ Cm , with λ = (Aσ )−1 cσ , and where ελ ∈ Rm is the vector {ελi }m i=1 . So, in the linear program P we have a dual vector λ ∈ Rm associated with each basis σ. In the integer program P, with each (same) basis σ there are now associated μ(σ) “dual” (complex) vectors λ + 2iπθg , g ∈ G(σ). Hence, with a basis σ in linear programming, the “dual variables” in integer programming are obtained from (a), the corrresponding dual variables λ ∈ Rm in linear programming, and (b), a periodic correction term 2iπθg ∈ Cm , g ∈ G(σ). We next introduce what we call the vertex residue function. Definition 1. Let b ∈ Λ and let c ∈ Rn be regular. Let σ ∈ B(Δ) be a feasible basis of the linear program P and for every r ∈ N, let {zgr }g∈G(σ) be as in (2.33), with rc in lieu of c, that is, zgr = εrλ ε2iπθg ∈ Cm ;
g ∈ G(σ),
with λ = (Aσ )−1 cσ .
The vertex residue function associated with a basis σ of the linear program P is the function Rσ (zg , .) : N → R defined by r → Rσ (zg , r) :=
b zgr 1 , −Ak rck μ(σ) (1 − zgr ε ) g∈G(σ)
(2.34)
j ∈σ
which is well defined because when c is regular, |zgr |Ak = εrck for all k ∈ σ. The name vertex residue is now clear because in the integration (2.20), Rσ (zg , r) is to be interpreted as a generalized Cauchy residue, with respect to the μ(σ) “poles” {zgr } of the generating function F d (z, rc). Recall from Corollary 1 that when b ∈ γ ∩Λ is sufficiently large, say b = tb0 with b0 ∈ Λ and some large t ∈ N, the “max” in (2.25) is attained at the unique optimal basis σ ∗ of the linear program P.
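The μ(σ) complex dual vectors of (2.27)-(2.28) can be computed explicitly in the simplest case. The sketch below uses a one-dimensional basis of our own choosing, A_σ = [3] and c_σ = 1.2, and exhibits the det(A_σ) = 3 roots z_g = e^λ e^{2iπθ_g}, all sharing the modulus e^λ determined by the LP dual variable λ.

```python
# Illustration of (2.27)-(2.28): the det(A_sigma) complex "dual vectors" attached to one basis.
# Toy basis (ours): m = 1, A_sigma = [3], c_sigma = 1.2.  The equation z^3 = e^{1.2} has
# rho(sigma) = 3 roots z_g = e^{lambda} e^{2 i pi theta_g}, lambda = 1.2/3, theta_g in {0, 1/3, 2/3}.
import cmath, math

d, c_sigma = 3, 1.2
lam = c_sigma / d                       # unique real dual vector of the linear program

roots = [cmath.exp(lam) * cmath.exp(2j * math.pi * k / d) for k in range(d)]
for z in roots:
    print(z, abs(z ** d - math.exp(c_sigma)) < 1e-12)            # each root solves z^d = e^{c}

print(all(abs(abs(z) - math.exp(lam)) < 1e-12 for z in roots))   # common modulus e^{lambda}
```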
Proposition 1. Let c be regular with −c ∈ (Rn+ ∩ V )∗ and let b ∈ γ ∩ Λ be sufficiently large so that the max in (2.25) is attained at the unique optimal basis σ ∗ of the linear program P. Let {zg }g∈G(σ∗ ) be as in (2.33) with σ = σ ∗ . Then the optimal value of Pd satisfies ⎤ ⎡ b zgr 1 ⎣ 1 ⎦ fd (b, c) = lim ln −Ak rck r→∞ r μ(σ ∗ ) ) ∗ k ∈σ ∗ (1 − zgr ε g∈G(σ )
1 = lim ln Rσ∗ (zg , r) r→∞ r
(2.35)
and the optimal value of P satisfies ⎡ ⎤ b 1 1 |zgr | ⎦ f (b, c) = lim ln ⎣ r→∞ r μ(σ ∗ ) (1 − |zgr |−Ak εrck ) ∗ k ∈σ ∗ g∈G(σ )
= lim
r→∞
1 ln Rσ∗ (|zg |, r). r
(2.36)
For a proof see Section 2.6.3. Proposition 1 shows that there is indeed a strong relationship between the integer program Pd and its continuous analogue, the linear program P. Both optimal values obey exactly the same formula (2.35), but for the continuous version, the complex vector zg ∈ Cm is replaced by the vector ∗ |zg | = ελ ∈ Rm of its component moduli, where λ∗ ∈ Rm is the optimal solution of the LP dual of P. In summary, when c ∈ Rn is regular and b ∈ γ ∩ Λ is sufficiently large, we have the following correspondence. Linear program P
Integer program Pd
unique optimal basis σ ∗
unique optimal basis σ ∗
1 optimal dual vector λ∗ ∈ Rm
μ(σ ∗ ) dual vectors zg ∈ Cm , g ∈ G(σ ∗ ) ln zg = λ∗ + 2iπθg
f (b, c) = lim
1
r→∞ r
ln Rσ∗ (|zg |, r)
fd (b, c) = lim
1
r→∞ r
ln Rσ∗ (zg , r)
2.4 A discrete Farkas lemma In this section we are interested in a discrete analogue of the continuous Farkas lemma. That is, with A ∈ Zm×n and b ∈ Zm , consider the issue of the existence of a nonnegative integral solution x ∈ Nn to the system of linear equations Ax = b .
The (continuous) Farkas lemma, which states that given A ∈ R^{m×n} and b ∈ R^m,

  { x ∈ R^n | Ax = b, x ≥ 0 } ≠ ∅   ⇔   [ A'λ ≥ 0 ⇒ b'λ ≥ 0 ],   (2.37)
has no discrete analogue in an explicit form. For instance, the Gomory functions used in Blair and Jeroslow [6] (see also Schrijver [19, Corollary 23.4b]) are implicitly and iteratively defined, and are not directly defined in terms of the data A, b. On the other hand, for various characterizations of feasibility of the linear diophantine equations Ax = b, where x ∈ Zn , the interested reader is referred to Schrijver [19, Section 4]. Before proceeding to the general case when A ∈ Zm×n , we first consider the case A ∈ Nm×n , where A (and thus b) has only nonnegative entries.
2.4.1 The case when A ∈ N^{m×n}

In this section we assume that A ∈ N^{m×n} and thus necessarily b ∈ N^m, since otherwise {x ∈ N^n | Ax = b} = ∅.

Theorem 2. Let A ∈ N^{m×n} and b ∈ N^m. Then the following two propositions (i) and (ii) are equivalent:
(i) The linear system Ax = b has a solution x ∈ N^n.
(ii) The real-valued polynomial z → z^b − 1 := z_1^{b_1} · · · z_m^{b_m} − 1 can be written

  z^b − 1 = Σ_{j=1}^n Q_j(z) ( z^{A_j} − 1 ),   (2.38)

for some real-valued polynomials Q_j ∈ R[z_1, . . . , z_m], j = 1, . . . , n, all of which have nonnegative coefficients. In addition, the degree of the Q_j in (2.38) is bounded by

  b* := Σ_{j=1}^m b_j − min_k Σ_{j=1}^m A_{jk}.   (2.39)
For a proof see Section 2.6.4. Hence Theorem 2 reduces the issue of existence of a solution x ∈ Nn to a particular ideal membership problem, that is, Ax = b has a solution x ∈ Nn if and only if the polynomial z b − 1 belongs to the binomial ideal I = z Aj − 1j=1,...,n ⊂ R[z1 , . . . , zm ] for some weights Qj with nonnegative coefficients. Interestingly, consider the ideal J ⊂ R[z1 , . . . , zm , y1 , . . . , yn ] generated by the binomials z Aj − yj , j = 1, . . . , n, and let G be a Gr¨obner basis of J. Using the algebraic approach described in Adams and Loustaunau [2, Section 2.8], it is known that Ax = b has a solution x ∈ Nn if and only if the monomial
z^b can be reduced (with respect to G) to some monomial y^α, in which case α ∈ N^n is a feasible solution. Observe that in this case we do not know α ∈ N^n in advance (we look for it!) to test whether z^b − y^α ∈ J. One has to apply Buchberger's algorithm to (i) find a reduced Gröbner basis G of J, and (ii) reduce z^b with respect to G and check whether the final result is a monomial y^α. Moreover, in the latter approach one uses polynomials in n + m variables, the (primal) variables y and the (dual) variables z, in contrast with the (only) m dual variables z in Theorem 2.

Remark 2. (a) With b* as in (2.39), denote by s(b*) := (m + b* choose b*) the dimension of the vector space of polynomials of degree b* in m variables. In view of Theorem 2, and given b ∈ N^m, checking the existence of a solution x ∈ N^n to Ax = b reduces to checking whether or not there exists a nonnegative solution y to a system of linear equations with:
• n × s(b*) variables, the nonnegative coefficients of the Q_j;
• s(b* + max_k Σ_{j=1}^m A_{jk}) equations to identify the terms of the same powers on both sides of (2.38).
This in turn reduces to solving an LP problem with n s(b*) variables and s(b* + max_k Σ_j A_{jk}) equality constraints. Observe that in view of (2.38), this LP has a matrix of constraints with coefficients made up only of 0's and ±1's.
(b) From the proof of Theorem 2 in Section 2.6.4, it easily follows that one may even constrain the weights Q_j in (2.38) to be polynomials in Z[z_1, . . . , z_m] (instead of R[z_1, . . . , z_m]) with nonnegative coefficients. However, (a) shows that the strength of Theorem 2 is precisely allowing Q_j ∈ R[z_1, . . . , z_m] while enabling us to check feasibility by solving a (continuous) linear program. By enforcing Q_j ∈ Z[z_1, . . . , z_m] one would end up with an integer linear system whose size was larger than that of the original problem.
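The direction (i) ⇒ (ii) of Theorem 2 is constructive, and the resulting certificate can be verified mechanically. In the sketch below the matrix and the solution x are our own toy data, and the construction of the Q_j follows the recipe used in the proof in Section 2.6.4; identity (2.38) is checked by exact polynomial arithmetic over exponent tuples. Testing feasibility without a known x is exactly the linear program of Remark 2(a).

```python
# Constructive check of the certificate (2.38): given a solution x of Ax = b (instance ours),
# build  Q_j(z) = z^{A_1 x_1 + ... + A_{j-1} x_{j-1}} (1 + z^{A_j} + ... + z^{A_j (x_j - 1)})
# and verify that  sum_j Q_j(z) (z^{A_j} - 1)  equals  z^b - 1.
A = [[1, 0, 2],        # columns A_j are the exponent vectors (m = 2, n = 3)
     [0, 1, 1]]
x = [3, 1, 2]          # a nonnegative integer solution, so b = A x = (7, 3)
m, n = len(A), len(x)
b = tuple(sum(A[i][j] * x[j] for j in range(n)) for i in range(m))
col = lambda j: tuple(A[i][j] for i in range(m))

def add(poly, expo, coeff):
    poly[expo] = poly.get(expo, 0) + coeff
    if poly[expo] == 0:
        del poly[expo]

lhs = {}                                # accumulates sum_j Q_j(z) (z^{A_j} - 1)
prefix = tuple([0] * m)                 # exponent A_1 x_1 + ... + A_{j-1} x_{j-1}
for j in range(n):
    Aj = col(j)
    for p in range(x[j]):               # the monomials z^{prefix + p A_j} of Q_j
        q_exp = tuple(prefix[i] + p * Aj[i] for i in range(m))
        add(lhs, tuple(q_exp[i] + Aj[i] for i in range(m)), 1)   # times  z^{A_j}
        add(lhs, q_exp, -1)                                      # times  -1
    prefix = tuple(prefix[i] + x[j] * Aj[i] for i in range(m))

rhs = {b: 1, tuple([0] * m): -1}        # z^b - 1
print(lhs == rhs)                       # True: identity (2.38) holds with these Q_j
```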
2.4.2 The general case In this section we consider the general case where A ∈ Zm×n so that A may have negative entries, and we assume that the convex polyhedron Ω := {x ∈ Rn+ | Ax = b} is compact. The above arguments cannot be repeated because of the occurrence of negative powers. However, let α ∈ Nn and β ∈ N be such that jk := Ajk + αk ≥ 0, A
k = 1, . . . , n;
bj := bj + β ≥ 0,
for all j = 1, . . . , m. Moreover, as Ω is compact, we have that
(2.40)
max
x∈Nn
⎧ n ⎨ ⎩
⎫ ⎬
αj xj | Ax = b
⎭
j=1
≤ maxn
x∈R+
⎧ n ⎨ ⎩
αj xj | Ax = b
j=1
⎫ ⎬ ⎭
=: ρ∗ (α) < ∞. (2.41)
∗
Observe that given α ∈ N , the scalar ρ (α) is easily calculated by solving ∈ Nm×n and b ∈ Nm be an LP problem. Choose N β ≥ ρ∗ (α), and let A as in (2.40). Then the existence of solutions x ∈ Nn to Ax = b is equivalent to the existence of solutions (x, u) ∈ Nn ×N to the system of linear equations ⎧ ⎪ ⎨ n Ax + uem = b Q (2.42) αj xj + u = β. ⎪ ⎩ n
j=1
Indeed, if Ax = b with x ∈ Nn then Ax + em
n
αj xj − em
j=1
or equivalently,
n
αj xj = b + em β − em β,
j=1
⎛ + ⎝β − Ax
n
⎞ αj xj ⎠ em = b,
j=1
n
and thus, as β ≥ ρ∗ (α) ≥ j=1 αj xj (see, for example, (2.41)), we see that n (x, u) with β − j=1 αj xj =: u ∈ N is a solution of (2.42). Conversely, let and b, it (x, u) ∈ Nn × N be a solution of (2.42). Using the definitions of A then follows immediately that Ax + em
n j=1
αj xj + uem = b + βem ;
n
αj xj + u = β,
j=1
so that Ax = b. The system of linear equations (2.42) can be cast in the form ⎤ ⎡ | em A b x (2.43) B = with B := ⎣ − − ⎦ , u β α | 1 and as B only has entries in N, we are back to the case analyzed in Section 2.4.1. Corollary 2. Let A ∈ Zm×n and b ∈ Zm and assume that Ω := {x ∈ Rn+ | Ax = b} is compact. Let α ∈ Nn and β ∈ N be as in (2.40) with β ≥ ρ∗ (α) (see, for example, (2.41)). Then the following two propositions (i) and (ii) are equivalent: (i) The system of linear equations Ax = b has a solution x ∈ Nn ;
(ii) The real-valued polynomial z → z b (zy)β − 1 ∈ R[z1 , . . . , zm , y] can be written z b (zy)β − 1 = Q0 (z, y)(zy − 1) +
n
Qj (z, y)(z Aj (zy)αj − 1)
(2.44)
j=1
for some real-valued polynomials {Qj }nj=0 in R[z1 , . . . , zm , y], all of which have nonnegative coefficients. The degree of the Qj in (2.44) is bounded by ⎡ ⎤⎤ ⎡ m m bj − min ⎣m + 1, min ⎣(m + 1)αk + Ajk ⎦⎦ . (m + 1)β + k=1,...,n
j=1
j=1
∈ Nm×n , b ∈ Nm , α ∈ Nn and β ∈ N be as in (2.40) with Proof. Let A ∗ β ≥ ρ (α). Then apply Theorem 2 to the equivalent form (2.43) of the system Q in (2.42), where B and ( b, β) only have entries in N, and use the definitions and b. of A Indeed Theorem 2 and Corollary 2 have the flavor of a Farkas lemma as it is stated with the transpose A of A and involving the dual variables zk associated with the constraints Ax = b. In addition, and as expected, it implies the continuous Farkas lemma because if {x ∈ Nn | Ax = b} = ∅, then from (2.44), and with z := ελ and y := (z1 · · · zm )−1 ,
εb λ − 1 =
m
Qj (eλ1 , . . . eλm , e−
i
λi
)(ε(A λ)j − 1).
(2.45)
j=1
Therefore A λ ≥ 0 ⇒ ε(A λ)j − 1 ≥ 0 for all j = 1, . . . , n, and as the Qj have nonnegative coefficients, we have eb λ − 1 ≥ 0, which in turn implies b λ ≥ 0. Equivalently, evaluating the partial derivatives n of both sides of (2.45) with respect to λj , at the point λ = 0, yields bj = k=1 Ajk xk for all j = 1, . . . , n, with xk := Qk (1, . . . , 1) ≥ 0. Thus Ax = b for some x ∈ Rn+ .
2.5 Conclusion We have proposed what we think is a natural duality framework for the integer program Pd . It essentially relies on the Z-transform of the associated counting problem Id , for which the important Brion and Vergne inverse formula appears to be an important tool for analyzing Pd . In particular, it shows that the usual reduced costs in linear programming, combined with the periodicities phenomena associated with the complex poles of F d (z, c), also play an essential role for analyzing Pd . Moreover, for the standard dual
34
J.B. Lasserre
vector λ ∈ Rm associated with each basis B of the linear program P, there are det(B) corresponding dual vectors z ∈ Cm for the discrete problem Pd . Moreover, for b sufficiently large, the optimal value of Pd is a function of these dual vectors associated with the optimal basis of the linear program P. A topic of further research is to establish an explicit dual optimization problem P∗d in these dual variables. We hope that the above results will stimulate further research in this direction.
2.6 Proofs A proof in French of Theorem 1 can be found in Lasserre [15]. The English proof in [16] is reproduced below.
2.6.1 Proof of Theorem 1 Proof. Use (2.1) and (2.22) to obtain ⎡
εfd (b,c) = lim ⎣ r→∞
x(σ):
⎡
vertex of
x(σ):
⎡
rc x(σ)
μ(σ) Ω(b)
x(σ):
⎤1/r Uσ (b, rc)⎦
Ω(b)
ε
vertex of
= lim ⎣ r→∞
μ(σ)
= lim ⎣ r→∞
ε
rc x(σ)
2iπb
g∈G(σ)
ε (g) ⎦ Vσ (g, rc)
⎤1/r Hσ (b, rc)⎦
vertex of
⎤1/r
.
(2.46)
Ω(b)
Next, from the expression of Vσ (b, c) in (2.24), and with rc in lieu of c, we see that Vσ (g, rc) is a function of y := er , which in turn implies that Hσ (b, rc) is also a function of εr , of the form
Hσ (b, rc) = (εr )c x(σ)
g∈G(σ)
j
ε2iπb (g)
, δj (σ, g, A) × (er )αj (σ,c)
(2.47)
for finitely many coefficients {δj (σ, g, A), αj (σ, c)}. Note that the coefficients αj (σ, c) are sums of some reduced costs ck − π σ Ak (with k ∈ σ). In addition, the (complex) coefficients {δj (σ, g, A)} do not depend on b. Let y := εr/q , where q is the l.c.m. of {μ(σ)}σ∈B(Δ,γ) . As q(ck −π σ Ak ) ∈ Z for all k ∈ σ, Pσb (y) (2.48) Hσ (b, rc) = y qc x(σ) × Qσb (y)
2
Duality and a Farkas lemma for integer programs
35
for some polynomials Pσb , Qσb ∈ R[y]. In view of (2.47), the degree of Pσb and Qσb , which depends on b but not on the magnitude of b, is uniformly bounded in b. Therefore, as r → ∞,
Hσ (b, rc) ≈ (εr/q )qc x(σ)+deg(Pσb )−deg(Qσb ) ,
(2.49)
so that the limit in (2.46), which is given by max εc x(σ) lim Uσ (b, rc)1/r (as σ
r→∞
we have assumed unicity of the maximizer σ), is also max x(σ): vertex of
εc x(σ)+(deg(Pσb )−deg(Qσb ))/q . Ω(b)
Therefore fd (b, c) = −∞ if Ax = b has no solution x ∈ Nn , else 1 c x(σ) + (deg(Pσb ) − deg(Qσb )) , (2.50) max fd (b, c) = q x(σ): vertex of Ω(b) from which (2.25) follows easily.
2.6.2 Proof of Corollary 1 Proof. Let t ∈ N and note that f (tb, rc) = trf (b, c) = trc x∗ = trc x(σ ∗ ). As in the proof of Theorem 1, and with tb in lieu of b, we have ⎡ U (tb, rc) + f d (tb, rc) = εtc x ⎣ μ(σ ∗ ) 1 r
%
σ∗
∗
vertex x(σ) =x∗
rc x(σ)
ε εrc x(σ∗ )
&t
⎤ r1 Uσ (tb, rc) ⎦ μ(σ)
and from (2.47)–(2.48), setting δσ := c x∗ − c x(σ) > 0 and y := εr/q , ⎡ f d (tb, rc)1/r = εtc x
∗
∗ ⎣ Uσ (tb, rc) + μ(σ ∗ )
⎤1/r P (y) σtb ⎦ . y −tqδσ Q σtb (y) ∗
vertex x(σ) =x
Observe that c x(σ ∗ )−c x(σ) > 0 whenever σ = σ ∗ because Ω(y) is simple if y ∈ γ, and c is regular. Indeed, as x∗ is an optimal vertex of the LP problem ∗ P, the reduced costs ck − π σ Ak (k ∈ σ ∗ ) with respect to the optimal basis σ ∗ are all nonpositive, and in fact, strictly negative because c is regular (see Section 2.2.4). Therefore the term vertex x(σ) =x∗
y −tqδσ
Pσtb (y) Qσtb (y)
36
J.B. Lasserre
is negligible for t sufficiently large, when compared with Uσ∗ (tb, rc). This is because the degrees of Pσtb and Qσtb depend on tb but not on the magnitude of tb (see (2.47)–(2.48)), and they are uniformly bounded in tb. Hence taking the limit as r → ∞ yields % fd (tb,c)
ε
= lim
r→∞
&1/r ∗ ∗ εrtc x(σ ) Uσ∗ (tb, rc) = εtc x(σ ) lim Uσ∗ (tb, rc)1/r , r→∞ μ(σ ∗ )
from which (2.26) follows easily. Finally, the periodicity comes from the term ε2iπtb (g) in (2.23) for g ∈ G(σ ∗ ). The period is then, of the order G(σ ∗ ).
2.6.3 Proof of Proposition 3.1 ∗
Proof. Let Uσ∗ (b, c) be as in (2.23)–(2.24). It follows immediately that π σ = (λ∗ ) and so ε−π
σ∗
Ak −2iπAk
ε
∗
(g) = ε−Ak λ ε−2iπAk θg = zg−Ak ,
g ∈ G(σ ∗ ).
Next, using c x(σ ∗ ) = b λ∗ ,
∗
∗
εc x(σ ) ε2iπb (g) = εb λ ε2iπb θg = zgb ,
g ∈ G(σ ∗ ).
Therefore 1 1 εc x(σ) Uσ∗ (b, c) = ∗ μ(σ ) μ(σ ∗ )
zgb
g∈G(σ ∗ )
(1 − zg−Ak εck )
= Rσ∗ (zg , 1), and (2.35) follows from (2.25) because, with rc in lieu of c, zg becomes zgr = ∗ εrλ ε2iπθg (only the modulus changes). ∗ Next, as only the modulus of zg is involved in (2.36), we have |zgr | = εrλ for all g ∈ G(σ ∗ ), so that 1 μ(σ ∗ )
g∈G(σ ∗ )
∗
εrb λ |zgr |b = , −A rc r(ck −Ak λ∗ ) ) kε k) k ∈σ ∗ (1 − |zgr | k ∈σ ∗ (1 − ε
and, as r → ∞ ,
∗
∗ εrb λ ≈ εrb λ , r(ck −Ak λ∗ ) ) (1 − ε k ∈σ ∗
because (ck − Ak λ∗ ) < 0 for all k ∈ σ ∗ . Therefore
2
Duality and a Farkas lemma for integer programs
%
∗
1 εrb λ ln r(ck −Ak λ∗ ) r→∞ r ) k ∈σ ∗ (1 − ε
37
&
lim
= b λ∗ = f (b, c),
the desired result.
2.6.4 Proof of Theorem 2 Proof. (ii) ⇒ (i). Assume that z b − 1 can be written as in (2.38) for some coefficients {Qjα }, that is, {Qj(z) } = polynomials {Qj } with nonnegative α1 α αm Q z = Q z · · · zm , for finitely many nonzero (and m m jα jα 1 α∈N α∈N nonnegative) coefficients Qjα . Using the notation of Section 2.3, the function f d (b, 0), which (as c = 0) counts the nonnegative integral solutions x ∈ Nn to the equation Ax = b, is given by
1 z b−em n dz, · · · f d (b, 0) = m −Ak ) (2πi) |z1 |=γ1 |zm |=γm j=1 (1 − z where γ ∈ Rm satisfies A γ > 0 (see (2.18) and (2.20)). Writing z b−em as z −em (z b − 1 + 1) we obtain f d (b, 0) = B1 + B2 , with B1 =
1 (2πi)m
|z1 |=γ1
···
|zm |=γm
z −em dz −Ak ) j=1 (1 − z
n
and 1 B2 := (2πi)m =
=
n j=1 n
|z1 |=γ1
1 (2πi)m
j=1 α∈Nm
···
|zm |=γm
|z1 |=γ1
Qjα (2πi)m
···
z −em (z b − 1) n dz −Ak ) j=1 (1 − z
|zm |=γm
z Aj −em Qj (z) dz −Ak ) k =j (1 − z
|z1 |=γ1
···
|zm |=γm
z Aj +α−em dz. −Ak ) k =j (1 − z
From (2.20) (with b := 0) we recognize in B1 the number of solutions x ∈ Nn to the linear system Ax = 0, so that B1 = 1. Next, again from (2.20) (now with b := Aj + α), each term Cjα
Qjα := (2πi)m
|z1 |=γ1
···
|zm |=γm
z Aj +α−em dz −Ak ) k =j (1 − z
38
J.B. Lasserre
is equal to Qjα × the number of integral solutions x ∈ Nn−1 (j) x = Aj + α, where A (j) is the matrix in Nm×(n−1) of the linear system A obtained from A by deleting its j-th column Aj . As by hypothesis each Qjα is nonnegative, it follows that n
B2 =
Cjα ≥ 0,
j=1 α∈Nm
so that f d (b, 0) = B1 + B2 ≥ 1. In other words, the sytem Ax = b has at least one solution x ∈ Nn . (i) ⇒ (ii). Let x ∈ Nn be a solution of Ax = b, and write z b − 1 = z A1 x1 − 1 + z A1 x1 (z A2 x2 − 1) + · · · + z and
n−1 j=1
Aj xj
( ' z Aj xj − 1 = (z Aj − 1) 1 + z Aj + · · · + z Aj (xj −1) ,
(z An xn − 1)
j = 1, . . . , n,
to obtain (2.38) with z → Qj (z) := z
j−1 k=1
Ak xk
'
( 1 + z Aj + · · · + z Aj (xj −1) ,
j = 1, . . . , n.
We immediately see that each Qj has all its coefficients nonnegative (and even in {0, 1}). Finally, the bound on the degree follows immediately from the proof for (i) ⇒ (ii).
References 1. K. Aardal, R. Weismantel and L. A. Wolsey, Non-standard approaches to integer programming, Discrete Appl. Math. 123 (2002), 5–74. 2. W. W. Adams and P. Loustaunau, An Introduction to Gr¨ obner Bases (American Mathematical Society, Providence, RI, 1994). 3. F. Bacelli, G. Cohen, G. J. Olsder and J.-P. Quadrat, Synchronization and Linearity (John Wiley & Sons, Chichester, 1992). 4. A. I. Barvinok, Computing the volume, counting integral points and exponential sums, Discrete Comp. Geom. 10 (1993), 123–141. 5. A. I. Barvinok and J. E. Pommersheim, An algorithmic theory of lattice points in polyhedra, in New Perspectives in Algebraic Combinatorics, MSRI Publications 38 (1999), 91–147. 6. C. E. Blair and R. G. Jeroslow, The value function of an integer program, Math. Programming 23 (1982), 237–273.
7. M. Brion and M. Vergne, Residue formulae, vector partition functions and lattice points in rational polytopes, J. Amer. Math. Soc. 10 (1997), 797–833. 8. J. B. Conway, Functions of a Complex Variable I, 2nd ed. (Springer, New York, 1978). 9. D. den Hertog, Interior Point Approach to Linear, Quadratic and Convex Programming (Kluwer Academic Publishers, Dordrecht, 1994). 10. O. G¨ uler, Barrier functions in interior point methods, Math. Oper. Res. 21 (1996), 860–885. 11. A. Iosevich, Curvature, combinatorics, and the Fourier transform, Notices Amer. Math. Soc. 48 (2001), 577–583. 12. A. Khovanskii and A. Pukhlikov, A Riemann-Roch theorem for integrals and sums of quasipolynomials over virtual polytopes, St. Petersburg Math. J. 4 (1993), 789–812. 13. J. B. Lasserre and E. S. Zeron, A Laplace transform algorithm for the volume of a convex polytope, JACM 48 (2001), 1126–1140. 14. J. B. Lasserre and E. S. Zeron, An alternative algorithm for counting integral points in a convex polytope, Math. Oper. Res. 30 (2005), 597–614. 15. J. B. Lasserre, La valeur optimale des programmes entiers, C. R. Acad. Sci. Paris Ser. I Math. 335 (2002), 863–866. 16. J. B. Lasserre, Generating functions and duality for integer programs, Discrete Optim. 1 (2004), 167–187. 17. G. L. Litvinov, V. P. Maslov and G. B. Shpiz, Linear functionals on idempotent spaces: An algebraic approach, Dokl. Akad. Nauk. 58 (1998), 389–391. 18. D. S. Mitrinovi´c, J. S´ andor and B. Crstici, Handbook of Number Theory (Kluwer Academic Publishers, Dordrecht, 1996). 19. A. Schrijver, Theory of Linear and Integer Programming (John Wiley & Sons, Chichester, 1986). 20. V. A. Truong and L. Tun¸cel, Geometry of homogeneous convex cones, duality mapping, and optimal self-concordant barriers, Research report COOR #2002-15 (2002), University of Waterloo, Waterloo, Canada. 21. L. A. Wolsey, Integer programming duality: Price functions and sensitivity analysis, Math. Programming 20 (1981), 173–195.
Chapter 3
Some nonlinear Lagrange and penalty functions for problems with a single constraint J. S. Giri and A. M. Rubinov†
Abstract We study connections between generalized Lagrangians and generalized penalty functions, which are formed by increasing positively homogeneous functions. In particular we show that the least exact penalty parameter is equal to the least Lagrange multiplier. We also prove, under some natural assumptions, that the natural generalization of a Lagrangian cannot improve it. Key words: Generalized Lagrangians, generalized penalty functions, single constraint, IPR convolutions, IPH functions
3.1 Introduction Consider the following constrained optimization problem P (f0 , f1 ): min f0 (x) subject to x ∈ X, f1 (x) ≤ 0,
(3.1)
where X ⊆ IRn and f0 (x), f1 (x) are real-valued, continuous functions. (We shall assume that these functions are directionally differentiable in Section 3.4.) Note that a general mathematical programming problem: min f0 (x) subject to x ∈ X, gi (x) ≤ 0, (i ∈ I),
hj (x) = 0 (j ∈ J),
J. S. Giri School of Information Technology and Mathematical Sciences, University of Ballarat, Victoria, AUSTRALIA e-mail:
[email protected] A. M. Rubinov† School of Information Technology and Mathematical Sciences, University of Ballarat, Victoria, AUSTRALIA e-mail:
[email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 3, c Springer Science+Business Media, LLC 2009
41
42
J.S. Giri and A.M. Rubinov
where I and J are finite sets, can be reformulated as (3.1) with f1 (x) = max(max gi (x), max |hj (x)|). i∈I
j∈J
(3.2)
Note that the function f1 defined by (3.2) is directionally differentiable if the functions gi (i ∈ I) and hj (j ∈ J) possess this property. The traditional approach to problems of this type has been to employ a Lagrange function of the form L(x; λ) = f0 (x) + λf1 (x). The function q(λ) = inf x∈X (f0 (x) + λf1 (x)) is called the dual function and the problem max q(λ) subject to λ > 0 is called the dual problem. The equality sup inf L(§, λ) = inf{{ (§) : § ∈ X , {∞ (§) ≤ } λ>0 x∈X
¯ > 0 such that is called the zero duality gap property. The number λ ¯ = inf{f0 (x) : x ∈ X, f1 (x) ≤ 0} inf L(x, λ)
x∈X
is called the Lagrange multiplier. Let f1+ (x) = max(f1 (x), 0). Then the Lagrange function for the problem P (f0 , f1+ ) is called the penalty function for the initial problem P (f0 , f1 ). The traditional Lagrange function may be considered to be a linear convolution of the objective and constraint functions. That is, L(x; λ) ≡ p(f0 (x), λf1 (x)), where p(u, v) = u + v. It has been shown in [4, 5] that for penalty functions, increasing positively homogeneous (IPH) convolutions provide exact penalization for a large class of objective functions. The question thus arises “are there nonlinear convolution functions for which Lagrange multipliers exist?” The most interesting example of a nonlinear IPH convolution func 1/k . These convolutions also oftion is the function sk (u, v) = uk + v k ten provide a smaller exact penalty parameter than does the traditional linear convolution. (See Section 3.3 for the definition of an exact penalty parameter.) We will show in this chapter that for problems where a Lagrange multiplier exists, an exact penalty parameter also exists, and the smallest exact penalty parameter is equal to the smallest Lagrange multiplier. We also show that whereas a generalized penalty function can often improve the classical situation (for example, provide exact penalization with a smaller parameter than that of the traditional function), this is not true
3
Some nonlinear Lagrange and penalty functions
43
for generalized Lagrange functions. Namely, we prove, under some natural assumptions, that among all functions sk the Lagrange multiplier may exist only for the k = 1 case. So generalized Lagrangians cannot improve the classical situation.
3.2 Preliminaries Let us present some results and definitions which we will make use of later in this chapter. We will refer to the solution of the general problem, P (f0 , f1 ), as M (f0 , f1 ). We will also make use of the sets, X0 = {x ∈ X : f1 (x) ≤ 0} and X1 = {x ∈ X : f1 (x) > 0}. It will be convenient to talk about Increasing Positively Homogeneous (IPH) functions. These are defined as functions which are increasing, that is, if (δ, γ) ≥ (δ , γ ) then p(δ, γ) ≥ p(δ , γ ), and positively homogeneous of the first degree, that is, p(α(δ, γ)) = αp(δ, γ), α > 0 . We shall consider only continuous IPH functions defined on either the halfplane {(u, v) : u ≥ 0} or on the quadrant IR2+ = {(u, v) ∈ IR2 : u ≥ 0, v ≥ 0}. In the latter case we consider only IPH functions p : IR2++ → IR, which possess the following properties: p(1, 0) = 1,
lim p(1, v) = +∞.
v→+∞
We shall denote by P1 the class of all such functions. The simplest example of functions from P1 is the following function sk (0 < k < +∞), defined on IR2+ :
1 sk (u, v) = uk + v k k . (3.3) 2l + 1 with k, l ∈ N then the function sk is well defined and IPH 2m + 1 on the half-plane {(u, v) : u ≥ 0}. (Here N is the set of positive integers.) A perturbation function plays an important part in the study of extended penalty functions and is defined on IR+ = {y ∈ IR : y ≥ 0} by If k =
β(y) = inf{f0 (x) : x ∈ X, f1 (x) ≤ y}. We denote by CX the set of all problems (f0 , f1 ) such that: 1. inf x∈X f0 (x) > 0; 2. there exists a sequence xk ∈ X1 such that f1 (xk ) → 0 and f0 (xk ) → M (f0 , f1 ); 3. there exists a point x ∈ X such that f1 (x) ≤ 0; 4. the perturbation function of the problem (f0 , f1 ) is l.s.c. at the point y = 0.
44
J.S. Giri and A.M. Rubinov
An important result which follows from the study of perturbation functions is as follows. Theorem 1. Let P (f0 , f1 ) ∈ CX . Let k > 0 and let p = pk be defined as 1 ¯ = pk (δ, γ) = (δ k + γ k ) k . There exists a number d¯ > 0 such that qp+ (d) M (f0 , f1 ) if and only if β is calm of degree k at the origin. That is, lim inf
y→+0
β(y) − β(0) > −∞. yk
A proof of this is presented in [4] and [5].
3.3 The relationship between extended penalty functions and extended Lagrange functions Let (f0 , f1 ) ∈ CX and let p be an IPH function defined on the half-plane IR2∗ = {(u, v) : u ≥ 0}. Recall the following definitions. The Lagrange-type function with respect to p is defined by Lp (x, d) = p(f0 (x), df1 (x)). (Here d is a real number and df does not mean the differential of f .) The dual function qp (d) with respect to p is defined by qp (d) = inf p(f0 (x), df1 (x)), x∈X
d > 0.
Let p+ be the restriction of p to IR2+ . Consider the penalty function L+ p and the dual function qp+ corresponding to p+ : + + L+ p (x, d) = p (f0 (x), df1 (x)), (x ∈ X, d ≥ 0),
qp+ (d) = inf L+ p (x, d), x∈X
(d ≥ 0).
Note that if f1 (x) = 0 for x ∈ X0 then qp = qp+ . Let tp (d) = inf p(f0 (x), df1 (x)). x∈X0
(3.4)
Then qp (d) = min(tp (d), rp+ (1, d)),
(3.5)
rp+ (d0 , d) = inf p+ (d0 f0 (x), df1 (x)).
(3.6)
where rp+ is defined by x∈X1
3
Some nonlinear Lagrange and penalty functions
45
(The function rp+ was introduced and studied in [4] and [5].) If the restriction p+ of p on IR2+ belongs to P1 then rp+ (1, d) = qp+ (d) (see ([4, 5]), so qp (d) = min(tp (d), qp+ (d)). Note that the function tp is decreasing and tp (d) ≤ tp (0) = M (f0 , f1 ),
(d > 0).
The function qp+ (d) = rp+ (1, d) is increasing. It is known (see [4, 5]) that qp+ (d) ≡ rp+ (1, d) ≤ lim rp (1, u) = M (f0 , f1 ). u→+∞
(3.7)
Recall that a positive number d¯ is called a Lagrange multiplier of P (f0 , f1 ) ¯ = M (f0 , f1 ). A positive number d¯ is called an exact with respect to p if qp (d) ¯ = M (f0 , f1 ). We penalty parameter of P (f0 , f1 ) with respect to p+ , if qp+ (d) will now show that the following statement holds. Theorem 2. Consider (f0 , f1 ) ∈ CX and an IPH function p defined on IR2∗ . Assume that the restriction p+ of p to IR2+ belongs to P1 . Then the following assertions are equivalent: 1) there exists a Lagrange multiplier d¯ of P (f0 , f1 ) with respect to p. 2) there exists an exact penalty parameter d¯ of P (f0 , f1 ) with respect to p+ and max(tp (d), rp+ (d)) = M (f0 , f1 ) for all d ≥ 0.
(3.8)
Proof. 1) =⇒ 2). Let d¯ be a Lagrange multiplier of P (f0 , f1 ). Then ¯ 1 (x)) = M (f0 , f1 ). inf p(f0 (x), df
x∈X
¯ 1 (x) for all x ∈ X, we have ¯ + (x) ≥ df Since p is an increasing function and df 1 ¯ = inf p+ (f0 (x), df ¯ + (x)) qp+ (d) 1 x∈X
¯ + (x)) = inf p(f0 (x), df 1 x∈X
¯ 1 (x)) = M (f0 , f1 ). ≥ inf p(f0 (x), df x∈X
On the other hand, due to (3.7) we have qp+ (d) ≤ M (f0 , f1 ) for all d. Thus ¯ = M (f0 , f1 ), that is, d¯ is an exact penalty parameter of P (f0 , f1 ) with qp+ (d) respect to p+ . Due to (3.5) we have ¯ rp+ (1, d)) ¯ = M (f0 , f1 ). min(tp (d), Since tp (d) ≤ M (f0 , f1 ) and rp+ (1, d) ≤ M (f0 , f1 ), it follows that ¯ = M (f0 , f1 ) and rp+ (1, d) ¯ = M (f0 , f1 ). tp (d)
(3.9)
46
J.S. Giri and A.M. Rubinov
Since tp (d) is decreasing and rp+ (1, d) is increasing, (3.9) implies the equalities tp (d) = M (f0 , f1 ), rp+ (1, d) = M (f0 , f1 ),
¯ (0 ≤ d ≤ d), (d¯ ≤ d < +∞),
which, in turn, implies (3.8). 2) Assume now that (3.8) holds. Let Ds = {d : tp (d) = M (f0 , f1 )},
Dr = {d : rp+ (1, d) = M (f0 , f1 )}.
Since p is a continuous function it follows that tp is upper semicontinuous. Also M (f0 , f1 ) is the greatest value of tp and this function is decreasing, therefore it follows that the set Ds is a closed segment with the left end-point equal to zero. It should also be noted that the set Dr is nonempty. Indeed, since p+ ∈ P1 it follows that Dr contains a penalty parameter of P (f0 , f1 ) with respect to p+ . It is observed that the function rp+ (1, ·) is increasing and upper semicontinuous and since M (f0 , f1 ) is the greatest value of this function, it follows that Dr is a closed segment. Due to (3.8) we can say that Ds ∪ Dr = [0, +∞). Since both Ds and Dr are closed segments, we conclude ¯ = M (f0 , f1 ) that the set Dl := Ds ∩ Dr = ∅. Let d¯ ∈ Dl and therefore tp (d) ¯ = M (f0 , f1 ). Due to (3.5) we have qp (d) ¯ = M (f0 , f1 ). and rp+ (1, d) Remark 1. Assume that p+ ∈ P1 and an exact penalty parameter exists. It easily follows from the second part of the proof of Proposition 2 that the set of Lagrange multipliers coincides with the closed segment Dl = Ds ∩ Dr . Note that for penalty parameters the following assertion (Apen ) holds: a number which is greater than an exact penalty parameter is also an exact penalty parameter. The corresponding assertion, Alag : a number, which is greater than a Lagrange multiplier is also a Lagrange multiplier, does not hold in general. Assume that a Lagrange multiplier exists. Then according to Proposition 2 an exact penalty parameter also exists. It follows from Remark 1 that (Alag ) holds if and only if Ds = [0, +∞), that is, inf p(f0 (x), df1 (x)) = M (f0 , f1 ) for all d ≥ 0.
x∈X0
(3.10)
We now point out two cases where (3.10) holds. One of them is closely related to penalization. Let p be an arbitrary IPH function, such that p(1, 0) = 1 and f1 (x) = 0 for all x ∈ X0 (in other words, f1+ = f1 ), then (3.10) holds.
3
Some nonlinear Lagrange and penalty functions
47
We now remove condition f1+ = f1 and consider very special IPH functions, for which (3.10) holds without this condition. Namely, we consider a class P∗ of IPH functions defined on the half-plane IR2∗ = {(u, v) : u ≥ 0} such that (3.10) holds for each problem (f0 , f1 ) ∈ CX . The class P∗ consists of functions p : IR2∗ → IR, such that the restriction of p on the cone IR2+ belongs to P1 and p(u, v) = u for (u, v) ∈ IR2∗ with v ≤ 0. It is clear that each p ∈ P∗ is positively homogeneous of the first degree. Let us now describe some further properties of p. Let (u, v) ≥ (u , v ). Assuming without loss of generality that v ≥ 0, v ≤ 0 we have p(u, v) ≥ p(u , 0) ≥ u = p(u , v ) so p is increasing. Since p(u, 0) = u, it follows that p is continuous. Thus P∗ consists of IPH continuous functions. The simplest example of a function p ∈ P∗ is p(u, v) = max(u, av) with a > 0. Clearly the function 1
pk (u, v) = max((uk + av k ) k , u) 2l+1 , l, m ∈ N belongs to P∗ as well. with k = 2m+1 Let us check that (3.10) holds for each (f0 , f1 ) ∈ CX . Indeed, since f0 (x) > 0 for all x ∈ X, we have
inf p(f0 (x), df1 (x)) = inf f0 (x) = M (f0 , f1 ) for all d ≥ 0.
x∈X0
x∈X0
3.4 Generalized Lagrange functions In this section we consider problems P (f0 , f1 ) such that both f0 and f1 are directionally differentiable functions defined on a set X ⊆ IRn . Recall that a function f defined on X is called directionally differentiable at a point x ∈ intX if for each z ∈ IRn there exists the derivative f (x, z) at the point x in the direction z: f (x, z) = lim
α→+0
1 (f (x + αz) − f (x)). α
Usually only directionally differentiable functions with a finite derivative are considered. We also accept functions whose directional derivative can attain the values ±∞. It is well known (see, for example, [1]) that the maximum of two directionally differentiable functions is also directionally differentiable. In particular the function f + is directionally differentiable, if f is directionally differentiable. Let f (x) = 0. Then (f + ) (x, z) = max(f (x, z), 0) = (f (x, z))+ .
48
J.S. Giri and A.M. Rubinov
Let sk , k > 0 be a function defined on IR2+ by (3.3). Assume that there exists an exact penalty parameter for a problem P (f0 , f1 ) with (f0 , f1 ) ∈ CX . It easily follows from results in [5, 6] that an exact penalty parameter with respect to k < k also exists and that the smallest exact penalty parameter d¯k with respect to sk is smaller than the smallest exact penalty parameter d¯k with respect to sk . The question then arises, does this property hold for 2l + 1 with Lagrange multipliers? (This question makes sense only for k = 2m + 1 k, l ∈ N and functions sk defined by (3.3) on the half-plane IR2∗ .) We provide a proof that the answer to this question is, in general, negative. Let f be a directionally differentiable function defined on a set X and let x ∈ intX. We say that x is a min-stationary point of f on X if for each direction z either f (x, z) = 0 or f (x, z) = +∞. We now present a simple example. Example 1. Let X = IR, )√ x if x > 0, f1 (x) = −x if x ≤ 0, ) √ − x if x > 0, f3 (x) = −x if x ≤ 0.
)√ x if x > 0, f2 (x) = x if x ≤ 0,
Then the point x = 0 is a min-stationary point for f1 and f2 , but this point is not min-stationary for f3 . Proposition 1. (Necessary condition for a local minimum). Let x ∈ intX be a local minimizer of a directionally differentiable function f . Then x is a min-stationary point of f . Proof. Indeed, for all z ∈ IRn and sufficiently small α > 0 we have (1/α)(f (x + αu) − f (x)) ≥ 0. Thus the result follows. Consider a problem P (f0 , f1 ) where (f0 , f1 ) ∈ CX are functions with finite directional derivatives. Consider the IPH function sk defined by (3.3). Let us define the corresponding Lagrange-type function Lsk : Lsk (x, λ) = f0 (x)k + λf1 (x)k .
(3.11)
We have for x ∈ X such that f1 (x) = 0 that Lsk (x, z; λ) = kf0 (x)k−1 (f0 ) (x, z) + λkf1 (x)k−1 (f1 ) (x, z).
(3.12)
Assume now that f1 (x) = 0. We consider the following cases separately. 1) k > 1. Then Lsk (x, z; λ) = kf0 (x)k−1 (f0 ) (x, z).
(3.13)
3
Some nonlinear Lagrange and penalty functions
2) k = 1. Then
49
Lsk (x, z; λ) = (f0 ) (x, z).
(3.14)
3) k < 1. First we calculate the limit 1 (f1 (x + αz))k α 1 (f1 (x) + αf1 (x, z) + o(α))k = lim α→+0 α 1 (αf1 (x, z) + o(α))k . = lim α→+0 α
A(z) := lim
α→+0
We have
⎧ ⎨ +∞ if f1 (x, z) > 0, 0 if f1 (x, z) = 0, A(z) = ⎩ −∞ if f1 (x, z) < 0.
Hence Lsk (x, z; λ)
⎧ ⎨
+∞ if f1 (x, z) > 0, (f0 ) (x, z) if f1 (x, z) = 0, = kf0 (x) ⎩ −∞ if f1 (x, z) < 0. k−1
(3.15)
Note that for problems P (f0 , f1 ) with (f0 , f1 ) ∈ CX a minimizer is located on the boundary of the the set of feasible elements {x : f1 (x) ≤ 0}. Proposition 2. Let k > 1. Let (f0 , f1 ) ∈ CX . Assume that the functions ¯ ∈ intX, which is a f0 and f1 have finite directional derivatives at a point x minimizer of the problem P (f0 , f1 ). Assume that x, u) < 0, there exists u ∈ IRn such that (f0 ) (¯
(3.16)
(that is, x ¯ is a not a min-stationary point for the function f0 over X). Then the point x ¯ is not a min-stationary point of the function Lk for each λ > 0. Proof. Assume that x ¯ is a min-stationary point of the function Lsk (x; λ) over X. Then combining Proposition 1 and (3.13) we have x)k−1 (f0 ) (¯ x, z) ≥ 0, f0 (¯
z ∈ IRn .
x) > 0 it follows that (f0 ) (¯ x, z) ≥ 0 for all z, which contradicts Since f0 (¯ (3.16). It follows from this proposition that the Lagrange multiplier with respect to Lsk (k > 1) does not exist for a problem P (f0 , f1 ) if (3.16) holds. Condition (3.16) means that the constraint f1 (x) ≤ 0 is essential, that is, a minimum under this constraint does not remain a minimum without it. Remark 2. Consider a problem P (f0 , f1 ) with (f0 , f1 ) ∈ CX . Then under some mild assumptions there exists a number k > 1 such that the zero duality
50
J.S. Giri and A.M. Rubinov
gap property holds for the problem P (f0k , f1k ) with respect to the classical Lagrange function (see [2]). This means that sup inf (f0k (x) + λf1k (x)) = λ>0 x∈X
inf
f0k (x).
inf
f0 (x),
x∈X:f1 (x)≤0
Clearly this is equivalent to sup inf sk (f0 (x), λf1 (x)) = λ>0 x∈X
x∈X:f1 (x)≤0
that is, the zero duality gap property with respect to sk holds. It follows from Proposition 2 that a Lagrange multiplier with respect to sk does not exist. Hence there is no a Lagrange multiplier for P (f0 , f1 ) with respect to the classical Lagrange function. Remark 3. Let g(x) = f1+ (x). Then the penalty-type function for P (f0 , f1 ) with respect to sk coincides with the Lagrange-type function for P (f0 , g) with respect to sk . Hence an exact penalty parameter with respect to this penalty function does not exist if (3.16) holds. Proposition 3. Let k < 1 and let (f0 , f1 ) ∈ CX . Assume that the functions ¯ ∈ intX, which is a f0 and f1 have finite directional derivatives at a point x minimizer for the problem P (f0 , f1 ). Assume that x, u) < 0, there exists u ∈ IRn such that (f1 ) (¯
(3.17)
(that is, x ¯ is not a min-stationary point for the function f0 over X). Then the point x ¯ is not a min-stationary point of the function Lsk for each λ > 0. Proof. Assume that a min-stationary point exists. Then combining Proposition 1, (3.15) and (3.17) we get a contradiction. It follows from this proposition that a Lagrange multiplier with respect to Lsk , k < 1, does not exist if condition (3.17) holds. We now give the simplest example, when (3.17) is valid. Let f1 be a differentiable function and ∇f (¯ x) = 0. Then (3.17) holds. Consider now a more complicated and interesting example. Let f1 (x) = maxi∈I gi (x), where gi are differentiable functions. Then f1 is a directionally x, u) = maxi∈I(¯x) [∇gi (x), u], where I(¯ x) = {i ∈ differentiable function and f (¯ I : gi (x) = f1 (x)} and [x, y] stands for the inner product of vectors x and y. Thus (3.17) holds in this case if and only if there exists a vector u such that x), u] < 0 for all i ∈ I(¯ x). To understand the essence of this result, let us [∇gi (¯ consider the following mathematical programming problem with m inequality constraints: min f0 (x) subject to gi (x) ≤ 0, i ∈ I = {1, . . . , m}.
(3.18)
We can present (3.18) as the problem P (f0 , f1 ) with f1 (x) = maxi∈I gi (x). Recall the well-known Mangasarian–Fromovitz (MF) constraint qualification
3
Some nonlinear Lagrange and penalty functions
51
for (3.18) (see, for example, [3]): (MF) holds at a point x ¯ if there exists a x), u] < 0 for all i ∈ I such that gi (¯ x) = 0. Thus vector u ∈ IRn such that [∇gi (¯ (3.17) for P (f, f1 ) is equivalent to (MF) constraint qualification for (3.18). In other words, if (MF) constraint qualification holds then a Lagrange multiplier for Lsk with k < 1 does not exist. (It is known that (MF) implies the existence of a Lagrange multiplier with k = 1.) Let (f, f1 ) ∈ CX , where f0 , f1 are functions with finite directional derivatives. Let g = f1+ and x be a point such that f1 (x) = 0. Then g (x, z) = max(f (x, z), 0) ≥ 0 for all z, hence (3.17) does not hold for the problem P (f0 , g). This means that Proposition 3 could not be applied to a penalty function for P (f, f1 ) with respect to sk . Simple examples show that exact penalty parameters with respect to sk with k < 1 can exist. We now present an example from [5]. We do not provide any details. (These can be found in [5], Example 4.6.) Example 2. Let 0 < b < c < a be real numbers and X = [0, c]. Let f (x) = (a − x)2 , f1 (x) = x − b, so P (f, f1 ) coincides with the following problem: minimize (a − x)2 subject to x ≤ b, x ∈ X. Let k = 1. Then an exact penalty parameter exists and the least exact penalty d¯1 is equal to 2(a − b). Let k = 1/2. Then an exact penalty parameter also exists and the least exact penalty parameter d¯1/2 coincides with c − b. We indicate the following two points: 1) d¯1 does not depend on the set X; d¯1/2 depends on this set. 2) d¯1 depends on the parameter a, that is on the turning point of the parabola; d¯1/2 does not depend on this parameter.
3.5 Example Consider the following one-dimensional optimization problem: 7x 9x2 + + 5, 2 2 subject to f1 (x) = x − 2 ≤ 0, x ∈ X = [0, 4].
min f0 (x) = x3 −
(3.19)
A graphical representation of this problem is given in Figure 3.1 where the shaded area represents the product of the feasible region and the axis {(0, y) : y ∈ R}. ¯ = 2. It can easily be shown that for this problem M (f0 , f1 ) = 2 at x
3.5.1 The Lagrange function approach The corresponding extended Lagrangian for (3.20) is Lsk (x, λ) = sk (f0 (x), λf1 (x)) = ((x3 −
1 7x 9x2 + + 5)k + λk (x − 2)k ) k , 2 2
52
J.S. Giri and A.M. Rubinov y 10
8
objective function (f–0)
6
4
2
x
0 –6
–4
–2
0
2
4
6
8
10
12
Fig. 3.1 P (f0 , f1 ).
recalling that k = Now consider
2l + 1 . 2m + 1 ∂L ∂L dL = f0 (¯ x) + λ f (¯ x). d¯ x ∂f0 ∂f1 1
(3.20)
An easy calculation shows that ⎧ 5 ⎪ k > 1, ⎨−2, dL 5 (¯ x) = − + λ, k = 1, ⎪ dx ⎩ ∞,2 k < 1. ¯ = From this it is clear that an exact Lagrange multiplier λ only for the case k = 1.
(3.21)
5 2
may exist
¯ = 5 provides a Remark 4. In fact Figure 3.2 shows that in this example λ 2 local minimum at x ¯ for k = 1 but not a global minimum, therefore it follows that no exact Lagrange multiplier exists for this problem.
3.5.2 Penalty function approach The corresponding penalty function for (3.20) is 1
+ k k k L+ sk (x; λ) = (f0 + (λf1 ) ) 2 1 k k k k ((x3 − 9x2 + 7x 2 + 5) + λ (x − 2) ) , for x ≥ 2, = 2 x3 − 9x2 + 7x for x ≤ 2. 2 + 5,
3
Some nonlinear Lagrange and penalty functions
53
y 10
8
6
4
2 x
0 –8
–6
–4
–2
0
2
4
6
8
10
12
Fig. 3.2 L(x; 52 ).
By Theorem 13.6 it can easily be shown that an exact penalty parameter exists when k < 1. This is shown in Figure 3.3 where an exact penalty parameter, d¯ = 1, is used.
Fig. 3.3 L+ s 1 (x; 1). 3
From these results we have shown that whereas the adoption of extended penalty functions of the form sk yields an improvement to the traditional penalty function approach, this cannot be generalized to improve the Lagrange approach.
References 1. V. F. Demyanov and A. M. Rubinov, Constructive Nonsmooth Analysis (Peter Lang, Frankfurt on Main, 1995). 2. D. Li, Zero duality gap for a class of nonconvex optimization problems, J. Optim. Theory Appl., 85 (1995), 309–324.
54
J.S. Giri and A.M. Rubinov
3. Z. Q. Luo, J. S. Pang and D. Ralph, Mathematical Programming with Equilibrium Constraints (Cambridge University Press, Cambridge, 1996). 4. A. M. Rubinov, B. M. Glover and X. Q. Yang, Decreasing functions with applications to penalization, SIAM J. Optim., 10(1) (1999), 289–313. 5. Rubinov A. M., Abstract Convexity and Global Optimization (Kluwer Academic Publishers, Dordrecht, 2000). 6. A. M. Rubiniv, X. Q. Yang and A. M. Bagirov, Penalty functions with a small penalty parameter, Optim. Methods Softw. 17 (2002), 931–964.
Chapter 4
Convergence of truncates in l1 optimal feedback control Robert Wenczel, Andrew Eberhard and Robin Hill
Abstract Existing design methodologies based on infinite-dimensional linear programming generally require an iterative process often involving progressive increase of truncation length, in order to achieve a desired accuracy. In this chapter we consider the fundamental problem of determining a priori estimates of the truncation length sufficient for attainment of a given accuracy in the optimal objective value of certain infinite-dimensional linear programs arising in optimal feedback control. The treatment here also allows us to consider objective functions lacking interiority of domain, a problem which often arises in practice. Key words: l1 -feedback control, epi-distance convergence, truncated convex programs
4.1 Introduction In the literature on feedback control there exist a number of papers addressing the problem of designing a controller to optimize the response of a system to a fixed input. In the discrete-time context there are many compelling Robert Wenczel Department of Mathematics, Royal Melbourne University of Technology, Melbourne 3001, AUSTRALIA Andrew Eberhard Department of Mathematics, Royal Melbourne University of Technology, Melbourne 3001, AUSTRALIA e-mail:
[email protected] Robin Hill Department of Mathematics, Royal Melbourne University of Technology, Melbourne 3001, AUSTRALIA C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 4, c Springer Science+Business Media, LLC 2009
55
56
R. Wenczel et al.
reasons for using error-sequences in the space l1 as the basic variable, and using various measures of performance (see, for example, [15]) which includes the l1 -norm (see, for example, [11]). The formulation of l1 -problems leads to the computation of inf M f (and determination of optimal elements, if any), where M is an affine subspace of a suitable product X of l1 with itself. This subspace is generated by the YJBK (or Youla) parameterization [24] for the set of all stabilizing feedback-controllers for a given linear, timeinvariant system. The objective f will typically be the l1 -norm, (see [6], [11]). As is now standard in the literature on convex optimization ( [20], [21]), we will use convex discontinuous extended-real-valued functions, of the form f = · 1 + δC , where δC is the indicator function (identically zero on C, identically +∞ elsewhere) of some closed convex set in l1 . This formalism is very flexible, encompassing many problem formats, including the case of timedomain-template constraints ([10, Chapter 14], [14]). These represent bounds on signal size, of the form Bi ≤ ei ≤ Ai (for all i), where e = {ei }∞ i=0 denotes the error signal. Often there are also similar bounds on the control signal u. As the Youla parameterization generates controllers having rational z-transform, the variables in M should also be taken to be rational in this sense, if the above infimum is to equal the performance limit [6] for physically realizable controllers. (If this condition is relaxed, then inf M f provides merely a lower bound for the physical performance limit.) The set M may be recharacterized by a set of linear constraints and thus forms an affine subspace of X. The approach in many of the above references is to evaluate inf M f by use of classical Lagrangian-type duality theory for such minimization problems. The common assumption is that the underlying space X is l1 , or a product thereof, and that M is closed, forcing it to contain elements with non-rational ztransform, counter to the “physical” model. Consequently, inf M f may not coincide with the performance limit for physically realizable (that is, rational) controllers. However, in the context of most of the works cited above, such an equality is actually easily established (as was first noted in [27]). Indeed, whenever it is assumed that C has nonempty interior, on combining this with the density in M of the subset consisting of its rational members [19], we may deduce (see Lemma 15 below) that the rational members of C ∩ M are l1 -dense in C ∩ M . This yields the claimed equality for any continuous objective function (such as the l1 -norm). Use of the more modern results of conjugate duality permits the extension of the above approach to a more general class of (−∞, +∞]-valued objective functions f . They are applicable even when int C may vanish. In this case, the question of whether inf M f equals the physical limit becomes nontrivial (in contrast to the case when C has interior). Indeed, if inf M f is strictly less than the physical limit, any result obtained by this use of duality is arguably of questionable engineering significance. It is therefore important to know when inf M f is precisely the performance limit for physically realizable controllers, to ensure that results obtained via the duality approaches described above are physically meaningful. Note that
4
Convergence of truncates in l1 optimal feedback control
57
this question is posed in the primal space, and may be analyzed purely in the primal space. This question will be the concern of this chapter. In this paper, we derive conditions on the system, and on the time-domain template set C, that ensure inf M (f0 + δC ) = inf C∩M f0 is indeed the performance limit for physically realizable controllers, for various convex lowersemicontinuous performance measures f0 . Here we only treat a class of 1input/2-output problems (referred to as “two-block” in the control literature), although the success of these methods for this case strongly suggests the possibility of a satisfactory extension to multivariable systems. Existing results on this question (see, for example, [14, Theorem 5.4]) rely on Lagrangian duality theory, and thereby demand that the time-domain template C has interior. Here, for the class of two-block problems treated, we remove this interiority requirement. Our result will be obtained by demonstrating the convergence of a sequence of truncated (primal) problems. Moreover, this procedure will allow the explicit calculation of convergence estimates, unlike all prior works with the exception of [23]. (This latter paper estimates bounds on the truncation length for a problem with H∞ -norm constraints and uses state-space techniques, whereas our techniques are quite distinct.) The approach followed in our chapter has two chief outcomes. First, it validates the duality-based approach in an extended context, by ensuring that the primal problem posed in the duality recipe truly represents the limit-of-performance for realizable controllers. Secondly, it provides a computational alternative to duality itself, by exhibiting a convergent sequence of finite-dimensional primal approximations with explicit error estimates along the sequence. (This contrasts with traditional “primal–dual” approximation schemes, which generally do not yield explicit convergence rates.) This will be achieved by the use of some recently developed tools in optimization theory (the relevant results of which are catalogued in Section 4.5)— namely, the notion of epi-distance (or Attouch–Wets) convergence for convex sets and functions [2, 3, 4, 5]. Epi-distance convergence has the feature that if fn converges to f in this sense, then subject to some mild conditions, * * * * *inf fn − inf f * ≤ d(f, fn ), X
X
where d(f, g) is a metric describing this mode of convergence. Since the l1 control problem is expressible as inf X (f + δM ), this leads naturally to the question of the Attouch–Wets convergence of sums fn + δMn of sequences of functions (where fn and Mn are approximations to f and M respectively). A result from [5], which estimates the “epi-distances” d(fn + gn , f + g) between sums of functions, in terms of sums of d(fn , f ) and d(gn , g), is restated and modified to suit our purpose. This result requires that the objective and constraints satisfy a so-called “constraint qualification” (CQ). In Section 4.6 some conditions on the template C and on M are derived that ensure that the CQ holds. Also, the truncated problems will be defined, and some fundamental limitations on the truncation scheme (in relation to
58
R. Wenczel et al.
satisfiability of the CQ) will be discussed. Specifically, the basic optimization will be formulated over two variables e and u in l1 . The truncated approximate problems will be formed by truncating in the e variable only, since the CQ will be seen to fail under any attempt to truncate in both variables. Thus in these approximate problems, the set of admissible (e, u) pairs will contain a u of infinite length. Despite this, it will be noted that the set of such (e, u)’s will be generated by a finite basis (implied by the e’s), so these approximates are truly finite-dimensional. Finally, in Section 4.7, the results of Section 4.5 will be applied to deduce convergence of a sequence of finite-dimensional approximating minimizations. This will follow since the appropriate distances d(Cn , C) can be almost trivially calculated. Also, we observe that for a sufficiently restricted class of systems, our truncation scheme is equivalent to a simultaneous truncation in both variables e and u. In fact, this equivalence is satisfied precisely when the system has no minimum-phase zeros. The motivation for this work arose from a deficiency in current computational practices which use simultaneous primal and dual truncations. These yield upper and lower bounds for the original primal problem, but have the disadvantage of not providing upper estimates on the order of truncation sufficient for attainment of a prescribed accuracy.
4.2 Mathematical preliminaries We let R stand for the extended reals [−∞, +∞]. For a Banach space X, balls in X centered at 0 will be written as B(0, ρ) = { x ∈ X | x < ρ } and ¯ ρ) = { x ∈ X | x ≤ ρ }. Corresponding balls in the dual space X ∗ will B(0, ¯ ∗ (0, ρ) respectively. The indicator function of a set be denoted B ∗ (0, ρ) and B A ⊆ X will be denoted δA . We will use u.s.c. to denote upper-semicontinuity and l.s.c. to denote lower-semicontinuity. Recall that a function f : X → R is called proper if never equal to −∞ and not identically +∞, and proper closed if it is also l.s.c. For a function f : X → R, the epigraph of f , denoted epi f , is the set {(x, α) ∈ X × R | f (x) ≤ α}. The domain, denoted dom f, is the set {x ∈ X | f (x) < +∞}. The (sub-)level set {x ∈ X | f (x) ≤ α} (where α > inf X f ) will be given the abbreviation {f ≤ α}. For > 0, and if inf X f is finite, -argmin f = {x ∈ X | f (x) ≤ inf X f + } is the set of approximate minimizers of f . Any product X×Y of normed spaces will always be understood to be endowed with the box norm (x, y) = max{ x , y }; any balls in such product spaces will always be with respect to the box norm. ∞ Here l1 (C) denotes ∞the Banach space of all complex sequences a = {an }n=0 such that a 1 := n=0 |an | is finite; l1 denotes the Banach space of all real sequences in l1 (C); and l∞ denotes the Banach space of all real sequences a = {an }∞ n=0 such that a ∞ := supn |an | is finite.
4
Convergence of truncates in l1 optimal feedback control
59
For two sequences a, b their convolution a ∗ b is the sequence (a ∗ b)i = i j=0 aj bi−j . The length of a sequence a, denoted l(a), is the smallest integer n such that ai = 0 for all i ≥ n. ¯ to be the closed unit disk {z ∈ C | |z| ≤ 1} in the complex We define D plane; D is the open unit disk {z ∈ C | |z| < 1}. ∞ ˆ(z) = n=0 an z n for The z-transform of a = {an }∞ n=0 is the function a complex z wherever it is defined. The inverse z-transform of a ˆ will be written a). as Z −1 (ˆ Also l 1 denotes the set of all z-transforms of sequences in l1 . It can be ¯ that regarded as a subset of the collection of all continuous functions on D are analytic on D. We use R∞ to denote the set of all rational functions of a complex variable, ¯ with no poles in D. Definition 1. Let A be a convex set in a topological vector space and x ∈ A. Then 1 cone A = ∪λ>0 λA (the smallest convex cone containing A); 2 The core or algebraic interior of A is characterized as x ∈ core A iff ∀y ∈ X, ∃ > 0 such that ∀λ ∈ [−, ] we have x + λy ∈ A. The following generalized interiority concepts were introduced in [8] and [17, 18] respectively in the context of Fenchel duality theory, to frame a sufficient condition for strong duality that is weaker than the classical Slater condition. Definition 2. Let A be a convex set in a topological vector space and x ∈ A. Then 1 The quasi relative interior of A (qri A) consists of all x in A for which cone (A − x) is a closed subspace of X; 2 The strong quasi relative interior of A (sqri A) consists of all x in X for which cone (A − x) is a closed subspace of X. Note that 0 ∈ core A if and only if cone A = X, and that in general, core A ⊆ sqri A ⊆ qri A. Nearly all modern results in conjugate duality theory use constraint qualifications based on one or other of these generalized interiors. Some applications of such duality results to (discrete-time) feedback optimal control may be found in [11] and [15]. An example of an application to a problem in deterministic optimal control (that is, without feedback) is outlined in [17]. From [16] and [20] we have the following. Recall that a set A in a topological linear space X is ideally convex if for any bounded sequence{xn } ⊆ A ∞ ∞ and {λn } of nonnegative numbers with n=1 λn = 1, the series n=1 λn xn either converges to an element of A, or else does not converge at all. Open or closed convex sets are ideally convex, as is any finite-dimensional convex
60
R. Wenczel et al.
set. In particular, if X is Banach, then such series ∞ always converge, and the definition of ideal convexity only requires that n=1 λn xn be in A. From [16, Section 17E] we have the following proposition. Proposition 1. For a Banach space X, 1 If C ⊆ X is closed convex, it is ideally convex. 2 For ideally convex C, core C = core C = int C = int C. 3 If A and B are ideally convex subsets of X, one of which is bounded, then A − B is ideally convex. Proof. We prove the last assertion only; the rest can be found in the cited reference. Let {a n − bn } ⊆ A − B be a bounded sequence and let λn ≥ ∞ 0 be such that n=1 λn = 1. Then, due to the assumed boundedness of } ⊆ A and {bn } ⊆ B are both bounded, yielding the one of A or B, {a ∞ ∞ n∞ λ a ∈ A and λ b ∈ B. Thus convergent sums n n n n n=1 n=1 n=1 λn (an − ∞ ∞ bn ) = n=1 λn an − n=1 λn bn ∈ A − B. Corollary 1. Let A and C be closed convex subsets of the Banach space X. Then 0 ∈ core (A − C) implies 0 ∈ int (A ∩ B(0, ρ) − C ∩ B(0, ρ)) for some ρ > 0. ¯ ∈ A ∩ C ∩ B(0, ρ). Then for x ∈ X, x = Proof. Let ρ > inf A∩C · and let x λ(a−c) for some λ > 0, a ∈ A, c ∈ C since by assumption, cone (A−C) = X. x < ρ, Then for any t ≥ 1 sufficiently large so that t−1 ( a + c ) + ¯ + , + , , + 1 1 1 1 a+ 1− x ¯ − c+ 1− x ¯ x = tλ t t t t ∈ tλ (A ∩ B(0, ρ) − C ∩ B(0, ρ)) ⊆ cone (A ∩ B(0, ρ) − C ∩ B(0, ρ)) . Hence 0 ∈ core (A ∩ B(0, ρ) − C ∩ B(0, ρ)) from the arbitrariness of x ∈ X. The result follows since by Proposition 1 the core and interior of A∩B(0, ρ)− C ∩ B(0, ρ) coincide.
4.3 System-theoretic preliminaries 4.3.1 Basic system concepts In its most abstract form, a system may be viewed as a map H : XI → XO between a space XI of inputs and a space XO of outputs. In the theory of feedback stabilization of systems, interconnections appear where the output of one system forms the input of another, and for this to make sense, XI
4
Convergence of truncates in l1 optimal feedback control
61
and XO will, for simplicity, be taken to be the same space X. The system H is said to be linear, if X is a linear space and H a linear operator thereon. Our focus will be on SISO (single-input/single-output) linear discrete-time systems. In this case, X will normally be contained in the space RZ of realvalued sequences. For n ∈ N define a time-shift τn on sequences in X by (τn φ)i := φi−n ,
φ∈X.
If each τn commutes with H, then H is called shift- (or time-) invariant. Our interest will be in linear time-invariant (or LTI) systems. It is well known [12] that H is LTI if and only if it takes the form of the convolution operator h∗ for some h ∈ X. This h is called the impulseresponse of H, since h = H(δ) = h∗δ, where δ is the (Dirac) delta-function in continuous time, or is the unit pulse sequence (1, 0, 0, . . . ), in discrete time. The discrete-time LTI system H = h∗ is causal if the support of h lies in the positive time-axis N = {0, 1, 2, . . .}. The significance of this notion is clarified after observing the action of H on an input u, which takes the form (Hu)n = k≤n hk un−k , so if h takes any nonzero values for negative time then evaluation of the output at time n would require foreknowledge of input behavior at later times. Note that for LTI systems, a natural (but by no means the only) choice for the “space of signals” X is the space of sequences for which the z-transform exists. From now on we identify a LTI system H with its impulse-response h := H(δ). Definition 3. The LTI system h is BIBO (bounded-input/bounded-output)stable if h ∗ u ∈ l∞ whenever u ∈ l∞ . From the proof of [12, Theorem 7.1.5], it is known that H = h∗ is BIBOstable if and only if h ∈ l1 . Any LTI system H can be characterized by its transfer function, defined ˆ of its impulse-response h. The input–output relation to be the z-transform h ˆ u for appropriate inputs u. Thus convolution equathen takes the form u ˆ → hˆ tions can be solved by attending to an algebraic relation, much simplifying the analysis of such systems. For systems H1 , H2 we shall often omit the convolution symbol ∗ from products H1 H2 , which will be understood to be the system (h1 ∗ h2 )∗. By the commutativity of ∗, H1 H2 = H2 H1 . This notation will be useful in that now formal manipulation of (LTI) systems is identical to that of their transforms (that is, transfer functions). Consequently, we may let the symbol H stand either for the system, or its impulse-response h, or its transfer ˆ function h. We now express stability in terms of rational transfer functions. The algebra R∞ to be defined below shall have a fundamental role in the theory of feedback-stabilization of systems. First, we recall some notation:
62
R. Wenczel et al.
R[z] — the space of polynomials with real coefficients, of the complex variable z; R(z) — the space of rational functions with real coefficients, of the complex variable z. Here R∞ := {h ∈ R(z) | h has no poles in the closed unit disk} . It is readily established that R∞ = l 1 ∩ R(z) , so R∞ forms the set of all rational stable transfer functions.
4.3.2 Feedback stabilization of linear systems Here we summarize the theory of stabilization of (rational) LTI systems by use of feedback connection with other (rational) LTI systems. All definitions and results given in this subsection may be found in [24]. Consider the feedback configuration of Figure 4.1. Here w is the “reference input,” e = w − y is the “(tracking–) error,” representing the gap between the closed-loop output y and the reference input w, and u denotes the control signal (or “input activity,” or “actuator output”).
Fig. 4.1 A closed-loop control system.
Definition 4. A (rational) LTI discrete-time system K is said to (BI-BO-) stabilize the (rational) LTI system P if the closed loop in Figure 4.1 is stable in the sense that for any bounded input w ∈ l∞ and any bounded additivelyapplied disturbance (such as Δ in Figure 4.1) at any point in the loop, all resulting signals in the loop are bounded. Such K is referred to as a stabilizing compensator or controller. It should be noted that the definition as stated applies to general non-LTI, even nonlinear systems, and amounts to the requirement of “internal,” as well as “external,” stability. This is of importance, since real physical systems have
4
Convergence of truncates in l1 optimal feedback control
63
a finite operating range, and it would not be desirable for, say, the control signal generated by K to become too large and “blow up” the plant P . Writing P = p∗ and K = k∗, with pˆ and kˆ in R(z) and noting that the transfer function between any two points of the loop can be constructed by addition or multiplication from the three transfer functions given in (4.1) below, we obtain the following well-known result. Proposition 2. K stabilizes P if and only if ˆ , k/(1 ˆ ˆ and pˆ/(1 + pˆk) ˆ ∈ R∞ . 1/(1 + pˆk) + pˆk)
(4.1)
We denote by S(P ) the set of rational stabilizing compensators for the plant P . The fundamental result of the theory of feedback stabilization (the YJBK factorization — see Proposition 3) states that S(P ) has the structure of an affine subset of R(z). Before we move to a precise statement of this result, we remind the reader that we only intend to deal with singleinput/single-output systems. In the multi-input/multi-output case, each system would be represented by a matrix of transfer functions, which greatly complicates the analysis. We note, however, that this factorization result does extend to this case (see [24, Chapter 5]). ˆ and dˆ are coprime if Definition 5. Let n ˆ and dˆ be in R∞ . We say that n there exist x ˆ and yˆ in R∞ such that x ˆn ˆ + yˆdˆ ≡ 1 in R(z) , ˆ = 1 except at singularities). This (that is, for all z ∈ C, x ˆ(z)ˆ n(z) + yˆ(z)d(z) can be easily shown to be identical to coprimeness in R∞ considered as an abstract ring. Let pˆ ∈ R(z). It has a coprime factorization pˆ = n ˆ /dˆ where n ˆ and dˆ are in r for polynomials qˆ and rˆ having no common R∞ . Indeed, we can write pˆ = qˆ/ˆ factors, which implies coprimeness in R[z] and hence in R∞ ⊇ R[z]. We can now state the fundamental theorem [24, Chapter 2], also referred to as the Youla (or YJBK) parameterization. Proposition 3. Let the plant P have rational transfer function pˆ ∈ R(z), let ˆ and yˆ in R∞ arise n ˆ and dˆ in R∞ form a coprime factorization and let x ˆ Then from the coprimeness of n ˆ and d. * ˆq * x ˆ + dˆ * qˆ ∈ R∞ , qˆ = yˆ/ˆ S(P ) = n . (4.2) yˆ − n ˆ qˆ * This result has the following consequence for the stabilized closed-loop mappings. Recall that 1/(1 + pˆcˆ) is the transfer function taking input w to e (that is, eˆ = w/(1 ˆ + pˆcˆ)) and cˆ/(1 + pˆcˆ) maps the input w to u. We now have (see [24]) the following result.
64
R. Wenczel et al.
Corollary 2. The set of all closed-loop maps Φ taking w to (e, u), achieved by some stabilizing compensator C ∈ S(P ), is . ) + , + ,* ˆ ** = dˆ yˆ − qˆdˆ n , q ˆ = y ˆ /ˆ n , (4.3) Φ q ˆ ∈ R ∞ x ˆ dˆ * and has the form of an affine set in R(z) × R(z).
4.4 Formulation of the optimization problem in l1 In the following, we consider SISO (single-input/single-output) rational, linear time-invariant (LTI) systems. For such a system (the ‘plant’) we characterize the set of error-signal and control-signal pairs achievable for some stabilizing compensator, in the one-degree-of-freedom feedback configuration (Figure 4.1). The derivation, based on the Youla parameterization, is standard. Assume that the reference input w = 0 has rational z-transform and the plant P is rational and causal, so it has no pole at 0. For a = eiθ on the unit circle, define a subspace Xa of l∞ by Xa = l1 + span {c, s}, where
∞ c := {cos kθ}∞ k=0 and s := {sin kθ}k=0 .
We shall assume that the error signals (usually denoted by e here) are in l1 , so we are considering only those controllers that make the closed-loop system track the input (in an l1 -sense), and we shall assume also that the associated control signal remains bounded (and in fact resides in Xa ). Definition 6. For any u ∈ Xa , let uc and us denote the (unique) real numbers such that u − uc c − us s is in l1 . Let the plant P have transfer function P (z) with the coprime factorization ˆ where n P (z) = n ˆ (z)/d(z) ˆ and dˆ are members of R[z] (the space of polynomis als, with real coefficients, in the complex variable z). Write n ˆ (z) = i=0 ni z i , ˆ = t di z i , n := (n0 , n1 , .., ns , 0, ..) and d := (d0 , .., dt , 0, ..). d(z) i=0 Let x and y be finite-length sequences such that x ˆn ˆ + yˆdˆ = 1 in R[z]. Their existence is a consequence of the coprimeness of n ˆ and dˆ in R[z] (see [24]). If we aim to perform an optimization over a subset of the set S(P ) of stabilizing controllers K for P , such that for each K in this subset, the corresponding error signal φ(K) and control output u(K) are in l1 and Xa respectively, the appropriate feasible set is
4
Convergence of truncates in l1 optimal feedback control
65
F0 := {(e, u) ∈ l1 × Xa | e = e(K) for some K ∈ S(P ), u = K ∗ e} = {(e, u) ∈ l1 × Xa | (e, u) = w ∗ d ∗ (y, x) − q ∗ w ∗ d ∗ (n, −d) for some q such that qˆ ∈ R∞ \{ˆ y /ˆ n}}, where the latter equality follows from the YJBK factorization for S(P ) (use Property 3 or Corollary 2). Henceforth we shall always assume that w ∈ Xa is rational (that is, has rational z-transform), the plant P is rational and has no zero at a. By the lemma to follow (whose proof is deferred to the Appendix) we observe that a single sinusoid suffices to characterize the asymptotic (or steadystate) behavior of u for all feasible signal pairs (e, u) ∈ F0 . Lemma 1. Let w ∈ Xa and P be rational and assume that P (a) = 0. Then for any (e, u) ∈ F0 , 1 1 − ws Im P (a) , uc = βc := wc Re P (a) 1 1 us = βs := ws Re P (a) + wc Im P (a) .
Using this lemma, we can translate F0 in the u variable to obtain the set F := F0 − (0, βc c + βs s), having the form (where we use the notation x ∗ (y, z) := (x ∗ y, x ∗ z) for sequences x, y, z) F =(e, { u) ∈ l1 × l1 | (e, u) = w ∗ d ∗ (y, x) − (0, βc c + βs s) − q ∗ w ∗ d ∗ (n, −d) y /ˆ n}}. for some q such that qˆ ∈ R∞ \{ˆ We need to recast F into a form to which the tools of optimization theory can be applied. As a first step, formally define the sets M := * ⎫ ⎧ * eˆ(¯ pi | ≤ 1) i = 1,.., m1⎪ pi ) = 0 ( p¯i pole of P : |¯ ⎪ * ⎪ ⎪ ⎬ ⎨ * eˆ(¯ z ) = w(¯ ˆ zj )(¯ zj zero of P : |¯ zj | ≤ 1) j = 1,.., m2 , (e, u) ∈ (l1 )2 ** j vk ) = 0 (¯ vk zero of w ˆ : |¯ vk | ≤ 1) k = 1,.., m3⎪ ⎪ ⎪ * eˆ(¯ ⎪ ⎭ ⎩ d ∗ e*+ n ∗ u = w ∗ d − n ∗ (βc c + βs s) ˆ rational, e = 0} and Mr := {(e, u) ∈ M | eˆ, u (0) M :=* ⎧ ⎫ * eˆ(¯ pi a pole of P with |¯ pi | ≤ 1) i = 1,.., m1 ⎬ pi ) = 0 (¯ ⎨ * zj ) = w(¯ ˆ zj )(¯ zj a zero of P with |¯ zj | ≤ 1) j = 1,.., m2 e ∈ l1 ** eˆ(¯ ⎩ ⎭ * eˆ(¯ vk ) = 0 (¯ vk a zero of w ˆ with |¯ vk | ≤ 1) k = 1,.., m3 with the understanding that in the above sets, whenever P and w ˆ have a common pole at a (and hence at a ¯), the constraint eˆ(a) = 0 is absent.
66
R. Wenczel et al.
Moreover, note that in the definition of M , the constraints on eˆ at the zeros z¯ (if z¯ = a, a ¯) are redundant, as follows from the closed-loop equation d ∗ e + n ∗ u = w ∗ d − n ∗ (βc c + βs s). The above constraint system can also be obtained from the general multivariable formalism of [9], [14] etc., on removal of the redundancies in the latter. The content of the following remark will be used repeatedly in our work. e, u ¯) be Remark 1. We note the following relation between M (0) and M . Let (¯ e, u ¯) consist of elements satisfying any element of M . Then M (0) − e¯ and M −(¯ the corresponding constraints with right-hand sides set to zero. Assuming that P (equivalently, n ˆ ) has no zeros on the unit circle, and that all its zeros ˆe/ˆ n) maps in D are simple, then the map T on M (0) − e¯ taking e to −Z −1 (dˆ 1 e, u ¯), into l (by Lemma 4 below) with T ≤ κ d 1 . Then (e, T e) ∈ M − (¯ since d ∗ e + n ∗ T e = 0, which follows on taking z-transforms. The next two lemmas give simple sufficient conditions for the feasible set F to be fully recharacterized as an affine subspace defined by a set of linear constraints, and for when elements of finite length exist. Since they are only minor modifications of standard arguments, we omit the proofs here, relegating them to the Appendix. Lemma 2. Assume that either ¯ the poles/zeros of P and w 1 in D ˆ are distinct and simple; or ¯ 2 in D\{a, a ¯} the poles/zeros of P and w ˆ are distinct and simple, and P and w ˆ have a common simple pole at z = a. Then Mr = F. Lemma 3. Let P and w ∈ Xa be rational with P (a) = 0. Assume also the following: 1 either ¯ the poles/zeros of P and w (a) in D ˆ are distinct and simple; or ¯ (b) in D\{a, a ¯} the poles/zeros of P and w ˆ are distinct and simple, and P and w ˆ have a common simple pole at z = a; 2 (w − wc c − ws s) ∗ d has finite length; 3 all zeros (in C) of P are distinct and simple. Then M contains elements (¯ e, u ¯) of finite length. Remark 2. In the above result, one can always choose that e¯ = 0, so that (¯ e, u ¯) ∈ Mr , which coincides with F by Lemma 2. Thus the latter is nonempty. Lemma 4. Let fˆ ∈ l 1 and let p(·) be a polynomial with real coefficients with no zeros on the unit circle. Assume also at each zero of p(·) in the open unit
4
Convergence of truncates in l1 optimal feedback control
67
disk, that fˆ has a zero also, of no lesser multiplicity. Then fˆ(·)/p(·) ∈ l 1 . Also, if q := Z −1 (fˆ/p), then q 1 ≤ κ f 1 where , + 1 , (1 − |a|) 1 − 1/κ = |a | a, a zeros of p: |a|<1, |a |>1 and where in the above product, the zeros appear according to multiplicity, and moreover, both members of any conjugate pair of zeros are included.
Proof. See the Appendix.
Remark 3. Such a result will not hold in general if p(·) has zeros on the unit circle; a counterexample for |a| = 1 is as follows. a−j for j ≥ 1. Then f ∈ l1 (C) and fˆ(a) = 0 since Let f0 = −1 and fj = j(j+1) ∞ fˆ(z) 1 k=1 k(k+1) = 1. If q is the inverse z-transform of [z → z−a ], then for each k, * * * 1 1 j* f a |qk | = * ak+1 j>k j * = k+1 so q ∈ / l1 (C). The hard bounds on the error and on input activity will be represented by the closed convex sets C (e) and C (u) respectively, given by (e)
C (e) := {e ∈ l1 | Bi (e)
where Bi
(e)
≤ 0 ≤ Ai
(e)
≤ ei ≤ Ai for all i},
(4.4)
eventually for large i, and (u)
(u)
(4.5)
C := C (e) × C (u) .
(4.6)
C (u) := {u ∈ l1 | Bi
≤ ui ≤ Ai for all i}, where
We shall also make use of the truncated sets (e)
Ck := {e ∈ C (e) | ei = 0 for i ≥ k}.
(4.7)
(u)
We shall not use the truncated sets Ck , for reasons that will be indicated in Section 4.6.1.
4.5 Convergence tools It is well known that there is no general correspondence between pointwise convergence of functions, and the convergence, if any, of their infima, or of optimal or near optimal elements. Hence approximation schemes for minimization of functions, based on this concept, are generally useless. In contrast,
68
R. Wenczel et al.
schemes based on monotone pointwise convergence, or on uniform convergence, have more desirable consequences in this regard. However, the cases above do not include all approximation schemes of interest. In response to this difficulty, a new class of convergence notions has gained prominence in optimization theory, collectively referred to as “variational convergence” or “epi-convergence.” These are generally characterized as suitable set-convergences of epigraphs. For an extensive account of the theory, the reader is referred to [1, 7]. Such convergences have the desirable property that epi-convergence of functions implies, under minimal assumption, the convergence of the infima to that of the limiting function, and further, for some cases, the convergence of the corresponding optimal, or near-optimal, elements. The convergence notion with the most powerful consequences for the convergence of infima is that of Attouch and Wets, introduced by them in [2, 3, 4]. Propositions 4 and 5 (see below) will form the basis for the calculation of explicit error estimates for our truncation scheme. Definition 7. For sets C and D in a normed space X, and ρ > 0, d(x, D) := inf x − d , d∈D
eρ (C, D) :=
sup
d(x, D),
x∈C∩B(0,ρ)
haus ρ (C, D) := max{eρ (C, D), eρ (D, C)}, (the ρ-Hausdorff distance) . For functions f and g mapping X to R, the “ρ-epi-distance” between f and g is defined by dρ (f, g) := haus ρ (epi f, epi g). It is easily shown that haus ρ (C, D) = dρ (δC , δD ). Definition 8. Let Cn and C be subsets of X. Then Cn Attouch–Wets converges to C iff for each ρ > 0, we have lim haus ρ (Cn , C) = 0.
n→∞
If fn and f are R-valued functions on X, then fn Attouch–Wets (or epidistance) converges to f if for each ρ > 0, lim dρ (fn , f ) = 0,
n→∞
(that is, if their epigraphs Attouch–Wets converge). This convergence concept is well suited to the treatment of approximation schemes in optimization, since the epi-distances dρ provide a measure of the
4
Convergence of truncates in l1 optimal feedback control
69
difference between the infima of two functions, as indicated by the following result of Attouch and Wets. For ϕ : X → R with inf X ϕ finite, we denote by -argmin ϕ the set {x ∈ X | ϕ(x) ≤ inf X ϕ + } of -approximate minimizers of ϕ on X. Proposition 4. [4, Theorem 4.3] Let X be normed and let ϕ and ψ be proper R-valued functions on X, such that 1 inf X ϕ and inf X ψ are both finite; 2 there exists ρ0 > 0 such that for all > 0, (-argmin ϕ) ∩ B(0, ρ0 ) = ∅ and (-argmin ψ) ∩ B(0, ρ0 ) = ∅. * * * * *inf ϕ − inf ψ * ≤ dα(ρ0 ) (ϕ, ψ),
Then
X
X
where α(ρ0 ) := max{ρ0 , 1 + | inf X ϕ|, 1 + | inf X ψ|}. Since our optimization problems are expressible as the infimum of a sum of two (convex) functions, we will find useful a result on the Attouch–Wets convergence of the sum of two Attouch–Wets convergent families of (l.s.c. convex) functions. The following result by Az´e and Penot may be found in full generality in [5, Corollary 2.9]. For simplicity, we only quote the form this result takes when the limit functions are both nonnegative on X. Proposition 5. Let X be a Banach space, let fn , f , gn and g be proper closed convex R-valued functions on X, with f ≥ 0 and g ≥ 0. Assume that we have the Attouch–Wets convergence fn → f and gn → g, and that for some s > 0, t ≥ 0 and r ≥ 0, B(0, s)2 ⊆ Δ(X) ∩ B(0, r)2 − {f ≤ r} × {g ≤ r} ∩ B(0, t)2 ,
(4.8)
where Δ(X) := {(x, x) | x ∈ X} and B(0, s)2 denotes a ball in the box norm in X × X (that is, B(0, s)2 = B(0, s) × B(0, s)). Then for each ρ ≥ 2r + t and all n ∈ N such that dρ+s (fn , f ) + dρ+s (gn , g) < s , we have dρ (fn + gn , f + g) ≤
2r + s + ρ [dρ+s (fn , f ) + dρ+s (gn , g)] . s
In particular, fn + gn Attouch–Wets converges to f + g. Corollary 3. Let Cn , C and M be closed convex subsets of a Banach space X and let fn and f be proper closed convex R-valued functions on X with f ≥ 0 and Attouch–Wets convergence fn → f and Cn → C. Suppose that for some s > 0, t ≥ 0 and r ≥ 0, B(0, s)2 ⊆ Δ(X) ∩ B(0, r)2 − (C × M ) ∩ B(0, t)2 ;
(4.9)
70
R. Wenczel et al.
B(0, s)2 ⊆ Δ(X) ∩ B(0, r)2 − [{f ≤ r} × (C ∩ M )] ∩ B(0, t)2 .
(4.10)
Assume further that fn and f satisfy ∃n0 ∈ N, α ∈ R such that max{ sup
inf fn , inf f } ≤ α and
n≥n0 Cn ∩M
C∩M
(4.11)
there exists ρ0 > 0 such that / B(0, ρ0 ) ⊇ {fn ≤ α + 1} ∩ Cn ∩ M and B(0, ρ0 ) ⊇ {f ≤ α + 1} ∩ C ∩ M . n≥n0
(4.12) Then for any fixed ρ ≥ max{2r + t, ρ0 , α + 1}, and all n ≥ n0 for which dρ+s (fn , f ) +
2r + 2s + ρ dρ+2s (Cn , C) < s , and dρ+2s (Cn , C) < s , (4.13) s
we have * * * * * inf fn − inf f * ≤ dρ (fn + δC ∩M , f + δC∩M ) n *Cn ∩M C∩M * 2r + 2s + ρ 2r + s + ρ dρ+s (fn , f ) + dρ+2s (Cn , C) . ≤ s s Proof. For n ≥ n0 , ϕ := f + δC∩M and ϕn := fn + δCn ∩M satisfy the hypotheses of Proposition 4 and so * * * * * inf fn − inf f * ≤ dα(ρ ) (ϕ, ϕn ) ≤ dρ (ϕ, ϕn ) . 0 * * Cn ∩M
C∩M
The estimates for dρ (ϕ, ϕn ) will now follow from two applications of Proposition 5. Indeed, from (4.9) and Proposition 5, whenever ρ ≥ 2r+t and n is such that dρ+2s (Cn , C) < s, then dρ+s (Cn ∩M, C ∩M ) ≤ 2r+s+(ρ+s) dρ+2s (Cn , C). s Taking n to be such that (4.13) holds, we find that dρ+s (fn , f ) + dρ+s (Cn ∩ M, C ∩ M ) < s, so from (4.10) and Proposition 5 again, 2r + s + ρ [dρ+s (fn , f ) + dρ+s (Cn ∩ M, C ∩ M )] dρ (ϕ, ϕn ) ≤ s 2r + 2s + ρ 2r + s + ρ dρ+s (fn , f ) + dρ+2s (Cn , C) . ≤ s s If we keep fn fixed in this process, then only one iteration of Proposition 5 is required, which will lead to a better coefficient for the rate of convergence than that obtained from taking dρ (fn , f ) ≡ 0 in Corollary 3. Corollary 4. Let Cn , C and M be closed convex subsets of a Banach space X and let f be a proper closed convex R-valued functions on X with f ≥ 0 and Attouch–Wets convergence Cn → C. Suppose that for some s > 0, t ≥ 0 and r ≥ 0,
4
Convergence of truncates in l1 optimal feedback control
B(0, s)2 ⊆ Δ(X) ∩ B(0, r)2 − [({f ≤ r} ∩ M ) × C] ∩ B(0, t)2 .
71
(4.14)
Assume further that for some n0 ∈ N, α ∈ R and ρ0 > 0, inf
Cn0 ∩M
f ≤ α and B(0, ρ0 ) ⊇ {f ≤ α + 1} ∩ C ∩ M .
(4.15)
Then for any fixed ρ ≥ max{2r + t, ρ0 , α + 1}, and all n ≥ n0 for which dρ+s (Cn , C) < s , we have
(4.16)
* * * * * inf f − inf f * ≤ dρ (f + δC ∩M , f + δC∩M ) n *Cn ∩M C∩M * ≤
2r + s + ρ dρ+s (Cn , C) . s
Proof. Similar to that of Corollary 3, but with only one appeal to Proposition 5.
4.6 Verification of the constraint qualification Our intention will be to apply the results of the preceding section with X = l1 ×l1 (with the box norm), so the norm on X ×X will be the four-fold product of the norm on l1 . From now on we will not notationally distinguish balls in the various product spaces, the dimensionality being clear from context. Before we can apply the convergence theory of Section 4.5, we need to check that the constraint qualifications (4.9) and (4.10) (or (4.14)) are satisfied. This will be the main concern in this section. First, we consider some more readily verifiable sufficient conditions for (4.9) and (4.10) to hold. Lemma 5. Suppose that for sets C and M we have B(0, s) ⊆ C ∩ B(0, σ) − M
(4.17)
for some s and σ. Then it follows that B(0, s/2)2 ⊆ Δ(X) ∩ B(0, σ + 3s/2)2 − (C × M ) ∩ B(0, σ + s)2
(4.18)
(of the form (4.9)). Next, suppose we have B(0, μ) ⊆ {f ≤ ν} ∩ B(0, λ) − (C ∩ M ) ∩ B(0, λ) for some λ, μ and ν. Then it follows that
(4.19)
72
R. Wenczel et al.
B(0, μ/2)2 ⊆ Δ(X) ∩ B(0, μ/2 + λ)2 − ({f ≤ ν} × (C ∩ M )) ∩ B(0, λ)2 (4.20) (of the form (4.10)). Also, B(0, s) ⊆ C ∩ B(0, σ) − {f ≤ r} ∩ M ∩ B(0, σ + s)
(4.21)
implies B(0, s/2)2 ⊆ Δ(X) ∩ B(0, σ + 3s/2)2 − (C × ({f ≤ r} ∩ M )) ∩ B(0, σ + s)2 (4.22) (which is of the form of (4.14)). Proof. Now (4.17) implies B(0, s) ⊆ C ∩ B(0, σ) − M ∩ B(0, σ + s). Place D := Δ(X)−(C ×M )∩B(0, σ +s)2 . Then, if P denotes the subtraction map taking (x, y) to y − x, we have P (D) = C ∩ B(0, σ + s) − M ∩ B(0, σ + s) ⊇ C ∩ B(0, σ) − M ∩ B(0, σ + s) ⊇ B(0, s), and hence B(0, s/2)2 ⊆ P −1 (B(0, s)) ⊆ P −1 P (D) = D + P −1 (0) = D + Δ(X) = D since Δ(X) + Δ(X) = Δ(X). Thus B(0, s/2)2 ⊆ Δ(X) − (C × M ) ∩ B(0, σ + s)2 , from which (4.18) clearly follows (and is of the form (4.9) for suitable r, s and t). Next, suppose we have (4.19). If we define D := Δ(X) − ({f ≤ ν} × (C ∩ M ))∩B(0, λ)2 , then similarly P (D) = {f ≤ ν}∩B(0, λ)−(C ∩M )∩B(0, λ) ⊇ B(0, μ) so, proceeding as above, we obtain (4.20) (which is of the form (4.10)). The last assertion follows from the first on substituting {f ≤ r} ∩ M for M therein. In this chapter, the objective functions f will always be chosen such that (4.17) will imply (4.19) or (4.21). Hence in this section we focus on verification of (4.17). The constraints of M (0) have the form Ae = b, where m = 2m1 +2m2 +2m3 and ˆ z1 ),.., Re w(¯ ˆ zm1 ), Im w(¯ ˆ zm1 ), 0,.., 0,.., 0)T ∈ Rm b = (Re w(¯ ˆ z1 ), Im w(¯ with A : l1 → Rm given by
4
Convergence of truncates in l1 optimal feedback control
73
Ae := (.., Re eˆ(¯ zi ), Im eˆ(¯ zi ),.., Re eˆ(¯ pj ), Im eˆ(¯ pj ),.., Re eˆ(¯ vk ), Im eˆ(¯ vk ),..)T , (4.23) where z¯i , p¯j and v¯k denote distinct elements of the unit disk, and i, j and k range over {1, . . . , m1 }, {1, . . . , m2 } and {1, . . . , m3 } respectively. Then A is expressible as a matrix operator of the form ⎛ ⎞ 1 Re z¯1 Re z¯12 · · · ⎜ 0 Im z¯1 Im z¯12 · · · ⎟ ⎜ ⎟ ⎜ .. ⎟ .. .. ⎜. ⎟ . . ⎜ ⎟ 2 ⎜ 1 Re z¯m1 Re z¯m ⎟ · · · 1 ⎜ ⎟ 2 ⎜ 0 Im z¯m1 Im z¯m ⎟ · · · 1 ⎜ ⎟ ⎜ 1 Re p¯1 Re p¯21 · · · ⎟ ⎜ ⎟ ⎜ 0 Im p¯1 Im p¯21 · · · ⎟ ⎜ ⎟ ⎜ ⎟ .. .. (aij )1≤i≤m; 0≤j<∞ = ⎜ ... ⎟, . . ⎜ ⎟ ⎜ 1 Re p¯m Re p¯2 · · · ⎟ m2 2 ⎜ ⎟ ⎜ 0 Im p¯m Im p¯2 · · · ⎟ m2 2 ⎜ ⎟ ⎜ 1 Re v¯1 Re v¯2 · · · ⎟ 1 ⎜ ⎟ ⎜ 0 Im v¯1 Im v¯2 · · · ⎟ 1 ⎜ ⎟ ⎜. ⎟ .. .. ⎜ .. ⎟ . . ⎜ ⎟ ⎝ 1 Re v¯m Re v¯2 · · · ⎠ m3 3 2 ··· 0 Im v¯m3 Im v¯m 3 where rows of imaginary parts of the matrix (aij ) and of b are omitted whenever the associated z¯i , p¯j or v¯k is real. For integer K, define A(K) to be the truncated operator taking RK into m R given by the matrix (aij )1≤i≤m,0≤j
β ∞ = 1 A(K) ξ = β
is finite. In particular, αK ≤ αm for all K ≥ m and 2' 2 2 (m) (−1 2 2, A αm = 2 2 2 the norm being taken relative to the 1-norm on the range of the inverse 3 (m) 4−1 A and the ∞-norm on its domain. Note that αK satisfies (∀β ∈ Rm )(∃ξ ∈ RK )(A(K) ξ = βand ξ 1 ≤ αK β ∞ ).
74
R. Wenczel et al.
Lemma 6. Let C (e) := {e ∈ l1 | Bi ≤ ei ≤ Ai ; i = 0, 1, 2, . . . } where the bounds Bi , Ai satisfy: ∃ξ ∈ RK such that Aξ = b and Bi < ξi < Ai for i = 0, 1, . . . , K − 1; and Bi ≤ 0 ≤ Ai for i ≥ K. Let K ≥ m, and let αK be as in (4.24). Also, let :=
min
0≤i≤K−1
{|Ai − ξi |, |Bi − ξi |}
(so > 0). Then −1 −1 B(0, αK ) ⊆ C (e) ∩ B(0, + ξ 1 ) − M (0) ∩ B(0, (1 + αK ) + ξ 1 ) .
Proof. Place s := /αK and let η ∈ B(0, s). Set e := {ξ, 0, 0, ..} ∈ C (e) ∩M (0) . As A(K) maps onto Rm , there exists p ∈ RK ⊆ l1 such that Ap = A(K) p = Aη with η (¯ zi )|, |ˆ η (¯ pj )|, |ˆ η (¯ vk )|} ≤ αK η 1 < . p 1 ≤ αK Aη ∞ ≤ αK max {|ˆ z¯i ,p¯j ,¯ vk
Consequently, p ∈ C (e) − e since for each i ≤ K − 1, |pi | < ≤ mini
κ d 1 (ρ + ¯ e 1 ), where κ 3 B(0, μ) ⊆ C (u) − u is given in Lemma 4. e 1 )}, we have Then, if s := min{τ, μ − κ d 1 (ρ + ¯ B(0, s) ⊆ C ∩ B(0, max{σ, μ + ¯ u 1 }) − M.
(4.25)
Proof. Let (ξ, η) ∈ B(0, s). From Assumption 7, ξ = v − e ∈ C (e) ∩ B(0, σ) − e 1 ) M (0) ∩ B(0, ρ) so ξ = v − e where v = v − e¯ ∈ (C (e) − e¯) ∩ B(0, σ + ¯ e 1 ). Place u = η + T e , then and e = e − e¯ ∈ (M (0) − e¯) ∩ B(0, ρ + ¯ u 1 ≤ η 1 + T e 1 ≤ η 1 + T ( ¯ e 1 + ρ) < s + T ( ¯ e 1 + ρ) ≤ μ,
4
Convergence of truncates in l1 optimal feedback control
75
where the last inequality follows by Remark 1, and so u ∈ C (u) − u ¯ from Asu 1 ). Also, v + e¯ = v ∈ C (e) ∩B(0, σ) sumption 7 and u+ u ¯ ∈ C (u) ∩B(0, μ+ ¯ and e, u ¯)) (ξ, η) = (v , u) − (e , T e ) ∈ (v , u) − (M − (¯ = (v + e¯, u + u ¯) − M = (v, u + u ¯) − M ⊆ C ∩ B(0, max{σ, μ + ¯ u 1 }) − M.
Remark 4. The existence of (¯ e, u ¯) in M of finite length is ensured, for in(e) (e) stance, under the conditions of Lemma 3. If the bounds Ai and Bi satisfy (e) (e) e) (where l(·) denotes length), then Condition 7 Bi < e¯i < Ai for i ≤ l(¯ of Lemma 7 follows, for suitable constants, from Lemma 6. (Note again, that (e) (e) e), so by making these bounds this holds for arbitrary Ai and Bi for i > l(¯ decay to zero sufficiently rapidly, as will be shown in Lemma 13 to come, we can enforce compactness of C (e) , which will be essential for the Attouch–Wets (e) convergence of Cn to C (e) ). (u) (u) ¯ (∀i, If, furthermore, the bounds Ai and Bi are chosen to envelop u (u) (u) ¯i < Ai ), and to be bounded away from zero by sufficient distance, Bi < u for all i, Condition 7 of Lemma 7 is also satisfied. Remark 5. Note that 0 ∈ int (C − M ) iff (4.9) holds for some r, s and t. Indeed, if 0 ∈ int (C−M ) then by Corollary 1 we obtain (4.17) for some s and σ, which implies (4.18), which is of the form (4.9) for suitable constants. Conversely, (4.9) implies (where P (x, y) := x−y) 0 ∈ int P (Δ(X) ∩ B(0, r)2 − (C × M ) ∩ B(0, t)2 ) ⊆ int (C ∩ B(0, t) − M ∩ B(0, t)) ⊆ int (C − M ). If we are interested only in knowing that Cn ∩ M Attouch–Wets converges to C ∩ M , and not in the actual rate of such convergence, then 0 ∈ int (C − M ) certainly suffices for the applicability of the results of the theory to this end. Note however that in this case, the error bounds obtained in Section 4.7 are now not “computable” since we do not have explicit values for the constants r, s and t appearing in the constraint qualification. A sufficient condition for 0 ∈ int (C − M ) may be obtained on modification of Lemma 7. Lemma 8. Let C (e) , C (u) , C and M be as above, and assume that: ¯); 1 there exists (¯ e, u ¯) ∈ C ∩ M such that 0 ∈ core (C (u) − u (e) 2 0 ∈ core (C − M (0) ); and 3n ˆ has no zeros on the unit circle. Then 0 ∈ int (C − M ).
76
R. Wenczel et al.
Proof. Note first that ¯) = cone (C (e) − e¯) × cone (C (u) − u ¯) cone (C (e) − e¯) × (C (u) − u ¯ are convex sets containing 0. Then (where T since C (e) − e¯ and C (u) − u denotes the mapping introduced in Remark 1) cone (C − M ) = cone (C − (¯ e, u ¯)) − (M − (¯ e, u ¯)) (e) (u) ¯) − (M − (¯ e, u ¯)) = cone (C − e¯) × cone (C − u (e) 1 = cone (C − e¯) × l −{(e, u) | e ∈ M (0) − e¯, d ∗ e + n ∗ u = 0} = cone (C (e) − e¯) × l1 − {(e, T e) | e ∈ M (0) − e¯} ⊆ cone (C (e) − M (0) ) × l1 , where the final inclusion is in fact an equality, as follows by noting that if e)− (ξ, η) ∈ cone (C (e) −M (0) )×l1 , then ξ ∈ cone (C (e) −M (0) ) = cone (C (e) −¯ (M (0) − e¯) so ξ = v − e for some v ∈ cone (C (e) − e¯) and e ∈ M (0) − e¯. Setting e)×l1 −{(e, T e) | u := η+T e ∈ l1 yields (ξ, η) = (v, u)−(e, T e) ∈ cone (C (e) −¯ e ∈ M (0) − e¯}. Thus cone (C − M ) = cone (C (e) − M (0) ) × l1 , which by the assumptions yields 0 ∈ core (C − M ) and the result then follows from Corollary 1.
4.6.1 Limitations on the truncation scheme In Section 4.7, we will apply Corollary 3 to deduce various convergence re(e) (u) sults. For this, it will be necessary that Ck := Ck × Ck Attouch–Wets (u) converges to C = C (e) × C (u) , where Ck denote the corresponding trun(u) cations of C . Recall that we need a condition of the form (4.17) for the application of the convergence theory of Section 4.5. This has an untoward consequence in relation to convergence of truncations of C (u) . From Lemma 9 below, we see that Attouch–Wets convergence of Ck to C is impossible unless (u) we keep Ck = C (u) for all k; indeed, if truncations of C (u) are included, then Attouch–Wets convergence will occur if and only if C, and hence C (u) , is locally compact (in the sense of having compact intersections with all closed balls), which is incompatible with the constraint qualification (4.17), as we shall observe in Lemma 10. Further, if instead we try to truncate the space M to form an expanding family of finite-dimensional subspaces Mn , then similarly, any Attouch–Wets convergence of Mn to M demands local compactness of M , which is an impossibility since the latter has infinite dimension. We therefore use truncations in the e-variable only, yielding the form Ck := (e) Ck ×C (u) . Thus our truncations will generally not consist purely of elements (e, u) of fixed finite length. It will be shown currently however (see the end
4
Convergence of truncates in l1 optimal feedback control
77
of this section) that each Cn ∩ M is in fact contained in a finite-dimensional subspace, but the basis thereof may consist of infinite-length members (in the u-variable). If we wish for these truncations to contain only (e, u) of some fixed finite length dependent only on n, then further assumptions on the plant will be required (see Lemma 16). Lemma 9. Let X be a Banach space, let Cn and C be closed convex subsets, with Cn ⊆ C for all n, and Cn Attouch–Wets convergent to C. Assume also ¯ ρ) is compact. Then C ∩ B(0, ¯ ρ) is for each n ∈ N and ρ > 0 that Cn ∩ B(0, compact whenever C ∩ B(0, ρ) is nonempty. ¯ ρ) is totally bounded. Now 0 ∈ int (C− Proof. It suffices to show that C∩B(0, ¯ ρ)) and hence B(0, s) ⊆ C∩B(0, ¯ ρ+s)−B(0, ¯ ρ) for some s > 0 (see (4.17) B(0, ¯ ρ)). On comparing (4.17) and (4.18) we find the indicator with M := B(0, satisfy a condition of the form functions fn = δCn , f = δC and g = δB(0,ρ) ¯ ¯ ρ) Attouch–Wets converges to C∩B(0, ¯ ρ). (4.8), so by Proposition 5, Cn ∩B(0, ¯ ρ) ⊆ Let > 0. By this convergence, there exists n such that C ∩ B(0, ¯ ρ) + B(0, /2). From the compactness of Cn ∩ B(0, ¯ ρ), there exist Cn ∩ B(0, 5N ¯ ¯ x1 , . . . , xN in Cn ∩ B(0, ρ) ⊆ C ∩ B(0, ρ) such that i=1 B(xi , /2) contains ¯ ρ). Hence C ∩ B(0, ¯ ρ) ⊆ 5N B(xi , ). Cn ∩ B(0, i=1 Lemma 10. Suppose that C = C (e) × C (u) is locally compact in the sense of Lemma 9 and n ˆ has no zeros on the unit circle. Then 0∈ / int (C − M ) . Proof. Supposing the contrary, Corollary 1 yields ρ > 0 satisfying cone (C ∩ B(0, ρ) − M ) = l1 × l1 , which in turn implies that (∀(ξ, η) ∈ l1 × l1 )(∃e ∈ M (0) − e¯) with ξ + e ∈ cone (C (e) ∩ B(0, ρ) − e¯) and
(4.26)
¯), η + T e ∈ cone (C (u) ∩ B(0, ρ) − u
where T is as in Remark 1, and (¯ e, u ¯) is a fixed member of C ∩ B(0, ρ) ∩ M . Let χ ∈ l1 . By the surjectivity of A : l1 → Rm given in (4.23), there exists ˆ z ) = χ(¯ ˆ z ). Place η := ξ ∈ l'1 such that(for each zero z¯ in D for n ˆ , ξ(¯ ˆ z )/d(¯ ˆ n . Since χ Z −1 (χ ˆ − dˆξ)/ˆ ˆ − dˆξˆ now must have a zero at each (simple) zero in D for n ˆ , and the latter has no zeros on the unit circle, we have η ∈ l1 by Lemma 4. Thus χ = d ∗ ξ + n ∗ η. With e ∈ M (0) − e¯ as in (4.26), it follows (on noting that d ∗ e + n ∗ T e = 0 from the definition of T ) that χ = d∗ξ+n∗η ¯) ∈ d ∗ cone (C (e) ∩ B(0, ρ) − e¯) + n ∗ cone (C (u) ∩ B(0, ρ) − u −(d ∗ e + n ∗ T e) = cone (d ∗ (C (e) ∩ B(0, ρ) − e¯)) + cone (n ∗ (C (u) ∩ B(0, ρ) − u ¯)) ' ( ⊆ cone d ∗ (C (e) ∩ B(0, ρ) − e¯) + n ∗ (C (u) ∩ B(0, ρ) − u ¯) ,
78
R. Wenczel et al.
where the latter inclusion follows since both C (e) ∩ B(0, ρ) − e¯ and C (u) ∩ B(0, ρ) − u ¯ are convex sets containing 0. Since χ ∈ l1 is arbitrary, ( ' ¯) = l 1 , cone d ∗ (C (e) ∩ B(0, ρ) − e¯) + n ∗ (C (u) ∩ B(0, ρ) − u ¯ ρ) − e¯) + n ∗ (C (u) ∩ B(0, ¯ ρ) − u ¯) so the compact convex set d ∗ (C (e) ∩ B(0, has a nonempty core (Definition 1) and hence by Proposition 1, a nonempty interior. However, this latter property is forbidden for any compact subset of an infinite-dimensional normed space. Thus we arrive at a contradiction. We end this section with the promised verification that each truncation (e) Cn ∩ M = (Cn × C (u) ) ∩ M is indeed finite-dimensional. Lemma 11. Under the assumptions of this section, Ck ∩ M is of finite dimension for each k. Proof. Assume that C ∩ M has a member (¯ e, u ¯) with e¯ of finite length. Now e)}. Since d∗(e− e¯)+n∗(u− u ¯) = 0 let (e, u) ∈ Ck ∩M , and let K := max{k, l(¯ ¯ = T (e − e¯). Since e − e¯ ∈ (M (0) − e¯) ∩ RK , and e − e¯ ∈ M (0) − e¯, then u − u K we can write e − e¯ = i=1 αi e(i) where the {e(i) }K i=0 is some spanning set for (M (0) − e¯) ∩ RK . Placing u(i) := T e(i) ∈ l1 , we obtain K
K αi u(i) = T ( αi e(i) ) = T (e − e¯) = u − u ¯
i=1
i=1
K so that u ∈ u ¯ +span {u(i) }K u +span {u(i) }K i=1 , and hence Ck ∩M ⊆ R ×(¯ i=1 ), a subspace of finite dimension. Note again that there is no guarantee that any of the u(i) has finite length.
4.7 Convergence of approximates As asserted in the opening paragraph of Section 4.6.1, if we wish to apply convergence theory, we cannot simultaneously truncate in both e and u. (e) Accordingly, our truncations will always be of the form Cn = Cn × C (u) . The following two lemmas show that compactness of C (e) is essential for the (e) Attouch–Wets convergence of the truncations Cn and Cn . Lemma 12. Let C, C (e) and C (u) be as usual, with also ∞ (e) (e) i=0 max{|Ai |, |Bi |} < +∞. Then dρ (Cn , C) =
(e)
(e)
max{|Ai |, |Bi |}for any ρ ≥
i≥n
and Cn Attouch–Wets converges to C.
∞ i=0
(e)
(e)
max{|Ai |, |Bi |}
4
Convergence of truncates in l1 optimal feedback control
79
(e)
Proof. Since Cn ⊆ C (e) for all n, we have dρ (Cn , C) = dρ (Cn(e) × C (u) , C (e) × C (u) ) = dρ (C (e) , Cn(e) ) = eρ (C (e) , Cn(e) ) , (e)
(e) (e) and we compute the latter. Let e ∈ C = C ∩ B(0, ρ). Then d(e, Cn ) = e − e(n) = i≥n |ei | where e(n) denotes the truncation to length n (that is, e(n) = (e0 , ..., en−1 , 0, 0, ..)), and hence as n → ∞ (e) (e) |ei | = max{|Ai |, |Bi |} → 0 . eρ (C (e) , Cn(e) ) = sup e∈C (e) i≥n
i≥n
Lemma 13. The set C (e) is compact in l1 if and only if the bounds satisfy ∞
(e)
(e)
max{|Ai |, |Bi |} < +∞ .
i=0 (e)
(e)
Proof. Let i0 be such that Bi ≤ 0 ≤ Ai for all i ≥ i0 . If C (e) is compact, (e) (e) then xn := (A0 , .., An , 0, 0, ..) ∈ C (e) (n ≥ i0 ) must have a convergent subsequence, along which we then have the uniform boundedness of the norms nk ∞ (e) (e) |Ai |, and since these increase with k, i=0 |Ai | is finite. xnk = i=0 ∞ Similarly, i=0 |Bi | is finite. ∞ (e) (e) (e) Conversely, if i=0 max{|Ai |, |Bi |} < +∞, the compactness of C (e) follows from Lemma 9 since its truncations Cn are all compact and Attouch– (e) Wets converges to C by Lemma 12. The next lemma shows that C ∩M is always bounded whenever the bounds on e define sequences in l1 . This ensures that condition (4.12) will always be satisfied for any objective f . Lemma 14. Suppose (as usual here) that n ˆ has no zeros on the unit circle and that all its zeros in the unit disk are simple. Then C ∩ M ⊆ B(0, ρ0 ), where ∞ 6 7 (e) (e) max |Ai |, |Bi | , ρ0 = max i=0
κ b 1 + d 1
∞
-
6 max
7
(e) (e) |Ai |, |Bi |
i=0
where κ is as in Lemma 4 and b := w ∗ d − n ∗ (βc c + βs s). ˆe Proof. Let (e, u) ∈ C ∩ M , then from the relation d ∗ e +n ∗ u = b, ˆb − dˆ ˆ ˆ ¯ of n has zeros at each zero in D ˆ , and since u = Z −1 b−nˆdˆe , Lemma 4 yields that u 1 ≤ κ b − d ∗ e 1 ≤ κ( b 1 + d 1 e 1 ). From this, and the relation ∞ (e) (e) e 1 ≤ i=0 max{|Ai |, |Bi |}, the result follows.
80
R. Wenczel et al.
Assembling all the parts we obtain our main result. Theorem 1. Let fn and f be proper closed convex R-valued functions, with fn Attouch–Wets convergent to f and f ≥ 0. Also, assume the following for C and M : 1n ˆ has no zeros on the unit circle, and all its zeros in the unit disk are simple; (e) (e) (e) form sequences in l1 , and 2 The bounds {Bi , Ai }∞ i=0 characterizing C also satisfy the requirement that for some K ≥ m, ∃ξ ∈ RK such that (e) (e) (e) (e) Aξ = b with Bi < ξi < Ai for i = 0, 1, . . . , K − 1; and Bi ≤ 0 ≤ Ai for i ≥ K; ¯ 3 There exists (¯ e, u ¯) ∈ C ∩ M of finite length such that B(0, μ) ⊆ C (u) − u −1 ) + ξ 1 + ¯ e 1 ], where κ is for some positive μ with μ > κ d 1 [(1 + αK given in Lemma 4 and αK in (4.24); 4 γ := max{supn≥n0 inf Cn ∩M fn , inf C∩M f } is finite; 5 B(0, μ) ⊆ {f ≤ ν} ∩ B(0, λ) − (C ∩ M ) ∩ B(0, λ) for some λ, μ and ν (that is, (4.19) holds). Define the constants :=
min
0≤i≤K−1
{|Ai − ξi |, |Bi − ξi |}
(4.27)
8 −1 9 1 −1 min αK , μ − κ d 1 [(1 + αK ) + ξ 1 + ¯ e 1 ] 2 s := min{s , μ} u 1 + 3s , μ/2 + λ, ν} r := max{ + ξ 1 + 3s , μ + ¯ u 1 + 2s , λ} . t := max{ + ξ 1 + 2s , μ + ¯ s :=
(4.28) (4.29) (4.30) (4.31)
Let ρ0 be as in Lemma 14. Then for any fixed ρ satisfying ∞ 6 7 (e) (e) ρ > max 2r + t, ρ0 , γ + 1, max |Ai |, |Bi | , i=0
and all n ≥ n0 for which dρ+s (fn , f ) +
(e)
i≥n
(e)
max{|Ai |, |Bi |} < s and
2r + 2s + ρ (e) (e) max{|Ai |, |Bi |} < s , s i≥n
it follows that * * 6 7 * * (e) (e) * inf fn − inf f * ≤ (2r + s + ρ)(2r + 2s + ρ) max |A |, |B | i i * Cn ∩M C∩M * s2 i≥n
2r + s + ρ dρ+s (fn , f ) . + s
4
Convergence of truncates in l1 optimal feedback control
81
Proof. By assumption 1 and Lemma 6, we obtain an inclusion of the form of (4.17): −1 −1 ) ⊆ C (e) ∩ B(0, + ξ 1 ) − M (0) ∩ B(0, (1 + αK ) + ξ 1 ) . B(0, αK
This, along with assumptions 1 and 1, may be inserted into Lemma 7 to yield u 1 }) − M . B(0, 2s ) ⊆ C ∩ B(0, max{ + ξ 1 , μ + ¯ u 1 }) Lemma 5 then gives (where r := max{ + ξ 1 , μ + ¯ B(0, s ) ⊆ Δ ∩ B(0, r + 3s ) − (C × M ) ∩ B(0, r + 2s ) , which, along with (4.20) (a consequence of assumption 1 via Lemma 5) yields (4.9) and (4.10) (in Corollary 3) for r, s and t as above. Further, assumption 1 gives (4.11), and (4.12) follows from Lemma 14 with the indicated value for ρ0 . Noting the explicit form for dρ (Cn , C) in ∞ (e) (e) Lemma 12 (for ρ ≥ i=0 max{|Ai |, |Bi |}), the result may now be read from Corollary 3. In particular, if fn = f for all n, we see that * * * * *7 6* * * * (e) * * (e) * * inf f − inf f * ≤ (2r + s + ρ)(2r + 2s + ρ) max *Ai * , *Bi * . *Cn ∩M * 2 C∩M s i≥n
However, in this case, we can obtain better constants by using Corollary 4 in place of Corollary 3, which will require the condition (4.21) (or (4.14)). To illustrate, if f is like a norm, say, f (e, u) := e 1 + ζ u 1 (ζ > 0), then for any r > 0, {f ≤ r} ⊇ B(0, r/(2 max{1, ζ})) so if (4.17) holds (for some s, σ) then taking r = 2(s + σ) max{1, ζ} gives C ∩ B(0, σ) − {f ≤ r} ∩ M ∩ B(0, σ + s) ⊇ C ∩ B(0, σ) − M ∩ B(0, σ + s) ⊇ B(0, s)
by (4.17)
so (4.21), and hence (4.14), holds for suitable constants. Accordingly, we arrive at the following result. Theorem 2. Let f : l1 × l1 → R have the form f (e, u) := e 1 + ζ u 1 for some ζ > 0. Assume Conditions 1, 1 and 1 of Theorem 1 hold, and Condition 1 is replaced by γ :=
inf
Cn0 ∩M
f <∞
for some n0 .
Define as in (4.27), and s := s by (4.28), with u 1 } + 2s) and r := 2 max{1, ζ} (max{ + ξ 1 , μ + ¯
(4.32)
82
R. Wenczel et al.
t := max{ + ξ 1 + 2s, μ + ¯ u 1 + 2s} . Then, for any fixed ρ satisfying
ρ > max 2r + t, ρ0 , γ + 1,
∞
(4.33) -
6 7 (e) (e) max |Ai |, |Bi |
i=0
(where ρ0 appears in Lemma 14) and all n ≥ n0 for which (e) (e) max{|Ai |, |Bi |} < s , i≥n
we have
* * 6 7 * * (e) (e) * inf f − inf f * ≤ 2r + s + ρ max |Ai |, |Bi | . * *Cn ∩M C∩M s i≥n
Proof. This follows along similar lines to that for Theorem 1, but uses the last displayed relation before Theorem 2 to obtain (4.21) from (4.17), so that (4.14) is obtained for the above r, s and t and we may apply Corollary 4. In summary: we have obtained • that inf C∩M f provides the exact lower bound for the performance of rational controllers for the l1 control problem, and • computable convergence estimates for the approximating truncated problems. Note further that these results are obtained for the case where the hardbound set C (or time-domain template) has no interior (since C (e) is assumed compact and hence has an empty interior). This then extends a result of [14] on such approximations (in the particular two-block control problem we consider) to cases where int C is empty. We note however that the cited result from [14] in fact has an alternate short proof by an elementary convexity argument (see Lemma 15 below) once the density in M of the subset of members of finite length is demonstrated. (This density is established, in the general “multi-block” case, in [19]. For the special two-block setup we consider, this property is proved in [26].) The above convergence results should be readily extendible to the multi-block formalism of [14]. Lemma 15. Let C, M and M0 be convex sets, with M = M0 and (int C) ∩ M = ∅. Then C ∩ M = C ∩ M0 . Proof. Let x ∈ C ∩ M and x0 ∈ (int C) ∩ M . Then for each 0 < λ < 1, xλ := λx0 + (1 − λ)x ∈ (int C) ∩ M , and xλ → x for λ → 0. For each λ, the density of M0 yields approximates to xλ from M0 which, if sufficiently close, must be in (int C) ∩ M0 .
4
Convergence of truncates in l1 optimal feedback control
83
This argument in fact leads to a very quick proof of [14, Theorem 5.4] (whose original proof contains a flaw, detailed in [27]) which asserts the equality inf C∩M · 1 = inf C∩Mr · 1 under a slightly stronger assumption, which in fact implies nonemptiness of (int C) ∩ M . To see the applicability of Lemma 15 to the cited result from [14], we temporarily adopt the notation thereof. Assign C and M as follows: C := {Φ ∈ l1nz ×nw | Atemp Φ ≤ btemp } and M := {Φ ∈ l1nz ×nw | Afeas Φ = bfeas } , where btemp ∈ l∞ , bfeas ∈ Rcz × l1nz ×nw , Atemp : l1nz ×nw → l∞ and Afeas : l1nz ×nw → Rcz × l1nz ×nw are bounded linear, and where the symbol ≤ stands for the partial order on l∞ induced by its standard positive cone P + . The assumption of [14, Theorem 5.4] is that btemp − Atemp Φ0 ∈ int P + for some Φ0 ∈ M . However, the continuity of Atemp implies that Φ0 ∈ int C and hence Φ0 ∈ (int C) ∩ M , which is the assumption of Lemma 15. This, coupled with the density of M0 := Mr in M , gives the result. As was discussed in Section 4.6.1, the approximating problems are constructed by truncating in the e-variable only, otherwise the method fails. However, the truncated constraint-sets Ck ∩ M satisfy dim span Ck ∩ M ≤ 2k
(for large k)
by Lemma 11, whereby the approximating minimizations are indeed over finite-dimensional sets. Since in general (e, u) ∈ Ck ∩ M does not imply that u has finite length, one may ask under what conditions it may be possible for there to exist, for each k, a uniform bound m(k) to the length of u whenever (e, u) ∈ Ck ∩ M . In this case, the truncated sets Ck ∩ M would resemble those obtained by the more natural method of simultaneously truncating both e and u (a strategy that fails to yield convergence, in general, by the results of Section 4.6.1). From the lemma to follow, such a property can hold only when the plant has no minimum-phase zeros (that is, no zeros outside the unit circle). Hence, except for some highly restrictive cases, Ck ∩ M will always contain some u of infinite length. Lemma 16. Suppose that the assumptions of Theorem 1 hold, and assume further that n ˆ has no zeros outside the open unit disk. (As usual, all zeros are assumed simple). Let (¯ e, u ¯) be in C ∩ M with both e¯ and u ¯ having finite length. Then for any k and any (e, u) ∈ Ck ∩ M , l(u) ≤ max{l(¯ u), l(d) − l(n) + max{l(¯ e), k}}, where l(·) denotes length. (e) Moreover, assuming that the conditions of Remark 4 hold (with Ai > (e) e)), then if for all k, each (e, u) ∈ Ck ∩ M has u of finite 0 > Bi for i > l(¯ length, it follows that n ˆ cannot have zeros outside the unit disk.
84
R. Wenczel et al.
Proof. Let (e, u) ∈ Ck ∩ M and let (e1 , u1 ) := (e, u) − (¯ e, u ¯). Then l(e1 ) ≤ e1 /ˆ n) from the max{l(e), l(¯ e)} and similarly for u1 . Also u1 = −d ∗ Z −1 (ˆ corresponding convolution relation defining M . Since eˆ1 is a polynomial having zeros at each zero of n ˆ in the unit disk and hence the whole plane (reˆ1 is a polynomial of degree at most call e1 ∈ M (0) − e¯), we have that u e)} − 1, and so, as u = u1 + u ¯, l(d) − l(n) + l(e1 ) − 1 ≤ l(d) − l(n) + max{l(e), l(¯ the result follows. e, u ¯) For the second assertion, suppose n ˆ (z0 ) = 0 for some |z0 | > 1. With (¯ as above, it follows from the closed–loop equation d∗¯ e +n∗¯ u = w∗d−n∗β =: ˜b ˆ 0 ) is finite and eˆ¯(z0 ) = w(z ˆ 0 ). Let k exceed (where β := βc c + βs s) that w(z (0) and the length of e¯, and let both the number of constraints defining M e ∈ M (0) ∩ Rk . If u := Z −1
ˆ ˜ ˆe b−dˆ n ˆ
, the interpolation constraints on e ensure
that u ˆ has no poles in the closed unit disk (so u ∈ l1 ) and hence (e, u) ∈ M . (e) ¯) = Also, since from the assumptions, cone (Ck − e¯) ⊇ Rk and cone (C (u) − u 1 l , we have (e, u) − (¯ e, u ¯) ∈ (M − (¯ e, u ¯)) ∩ (Rk × l1 ) (e) ¯) ⊆ (M − (¯ e, u ¯)) ∩ cone (Ck − e¯) × cone (C (u) − u e, u ¯)) = (M − (¯ e, u ¯)) ∩ cone (Ck − (¯ (e)
(recalling that Ck := Ck × C (u) ) e, u ¯)) , = cone (Ck ∩ M − (¯ e, u ¯) for some positive λ. Since now whence λ((e, u) − (¯ e, u ¯)) ∈ Ck ∩ M − (¯ λ(e, u) + (1 − λ)(¯ e, u ¯) ∈ Ck ∩ M , the hypothesized finiteness of length of λu + (1 − λ)¯ u implies, via the equation d ∗ (λe + (1 − λ)¯ e) + n ∗ (λu + (1 − λ)¯ u) = w ∗ d − n ∗ β , ˆ 0 ) and hence eˆ(z0 ) = w(z ˆ 0 ). Since k is arbithat (λe + (1 − λ)¯ e) (z0 ) = w(z trary, we have shown that every finite-length e ∈ M (0) satisfies an additional ˆ 0 ) at z0 , which yields a contradiction. interpolation constraint eˆ(z0 ) = w(z Remark 6. Under the assumptions of the preceding lemma, note that for k ≥ max{l(¯ e), l(¯ u) − l(d) + l(n)}, and (e, u) ∈ Ck ∩ M , we have l(u) ≤ k + l(d) − l(n) := k + l, so for such k, Ck ∩ M consists precisely of those elements (e, u) of C ∩ M with e of length k and u of length k + l. If l ≤ 0, then Cn ∩ M = {(e, u) ∈ C ∩ M | l(e) ≤ n, l(u) ≤ n} := Qn . If l ≥ 0, then for all n ≥ max{l(¯ e), l(¯ u) − l(d) + l(n)}, Cn ∩ M ⊆ Qn+l ⊆ Cn+l ∩ M,
4
Convergence of truncates in l1 optimal feedback control
85
and hence Qn Attouch–Wets converges to C ∩ M , so from Corollary 3, inf fn → inf f Qn
C∩M
as n → ∞. Observe that the sets Qn represent truncations of C ∩ M to the same length n for both e and u.
4.7.1 Some extensions So far we have considered CQ of the form of an interiority 0 ∈ int (C − M ), leading, via Proposition 5, to the determination of rates of convergence for our truncation scheme. If the CQ is weakened to the strong quasi relative interiority 0 ∈ sqri (C − M ) (meaning cone (C − M ) is a closed subspace), it is not immediately possible to apply Proposition 5. In this case, then, explicit convergence estimates may not be obtainable. We may still, however, derive limit statements of the form inf Cn ∩M f → inf C∩M f for reasonable f . To achieve this, we use an alternate result [13] on the Attouch–Wets convergence of sums of functions, based on a sqri-type CQ, but unfortunately, this result will not provide the estimates obtainable through Proposition 5. We proceed by first establishing the Attouch–Wets convergence of Cn ∩ M → C ∩ M using [13, Theorem 4.9]. In this context, the required CQ is: (1) 0 ∈ sqri (C − M ) and cone (C − M ) has closed algebraic complement Y ; and (2) that Y ∩ span (Cn − M ) = {0} for all n. Note that (1) implies (2) since Cn ⊆ C for all n, so we need only consider (1). The following two lemmas provide a sufficient condition for 0 ∈ sqri (C − M ). Lemma 17. Suppose that: 1 0 ∈ sqri (C (e) − M (0) ); ¯) for some (¯ e, u ¯) ∈ C ∩ M ; and 2 0 ∈ core (C (u) − u 3n ˆ has no zeros on the unit circle. Then 0 ∈ sqri (C − M ) and cone (C − M ) has a closed algebraic complement. Proof. By arguing as in Lemma 8, cone (C − M ) = cone (C (e) − M (0) ) × l1 . Thus it forms a closed subspace, so strong quasi relative interiority is established. Since M (0) − e¯ is a subspace of finite codimension in l1 , the complementary space Y0 to cone (C (e) − M (0) ) is finite dimensional and hence closed. Clearly then cone (C − M ) has the closed complement Y := Y0 × {0}. From Lemma 17 the problem is reduced to finding conditions under which 0 ∈ sqri (C (e) − M (0) ). (e)
Lemma 18. Let e¯ ∈ l1 and C (e) = {e ∈ l1 | Bi (e) (e) 0, 1, 2, . . . }, where the bounds Bi and Ai satisfy:
(e)
≤ ei ≤ Ai ; i =
86
R. Wenczel et al. (e)
(e)
1 Bi ≤ e¯i ≤ Ai for all i ∈ N; (e) (e) 2 i |Ai | and i |Bi | < ∞; (e) (e) 3 for a subsequence {ik }∞ ¯ik = Aik , and for all i not in k=0 we have Bik = e (e)
this subsequence, Bi
(e)
< e¯i < Ai .
Then 0 ∈ sqri (C (e) − M (0) ). Proof. After projecting into a suitable subspace of l1 , we shall follow the argument of Lemma 6. Let P denote the projection on l1 onto the subspace consisting of sequences vanishing on {ik }∞ k=0 . That is, (Pe)i := 0 for i ∈ and (Pe) := e otherwise. Evidently P is continuous and maps closed {ik }∞ i i k=0 sets to closed sets. Next, observe that (I − P)cone (C (e) − e¯) = cone (I − P)(C (e) − e¯) = {0}, since if e ∈ (C (e) − e¯) then for all k, we have eik ∈ (e) (e) [Bik − e¯ik , Aik − e¯ik ] = {0} so eik = 0, yielding (I − P)e = 0 by the definition of P. Thus we obtain cone (C (e) − M (0) ) = cone P(C (e) − M (0) ) + (I − P)(M (0) − e¯) .
(4.34)
If we can show that cone P(C (e) − M (0) ) = Pl1 then (4.34) would give cone (C (e) − M (0) ) = Pl1 + (I − P)(M (0) − e¯) = Pl1 + P(M (0) − e¯) + (M (0) − e¯) = Pl1 + (M (0) − e¯) which must be closed since Pl1 is closed and the linear subspace M (0) − e¯ has finite codimension. We now verify that cone P(C (e) −M (0) ) = Pl1 . (This part of the argument parallels that of Lemma 6.) Let ξ ∈ Pl1 , with −1 −1 := αK ξ 1 < αK
min
(e)
(e)
{|Ai − ξi |, |Bi
0≤i≤K−1; i∈{i / k}
− ξi |} > 0 .
There exists η ∈ RK ⊆ l1 such that η 1 ≤ αK ξ 1 < with Aη = Aξ. Then (e) (e) for i ∈ / {ik }, we have Bi < e¯i + ηi < Ai whenever i < K, and for i ≥ K, (e) (e) e¯i + ηi = e¯i ∈ [Bi , Ai ], whence P(η + e¯) ∈ PC (e) (or Pη ∈ P(C (e) − e¯)). (0) Also, η − ξ ∈ M − e¯ as A(η − ξ) = 0, implying Pη − ξ = Pη − Pξ ∈ P(M (0) − e¯). Thus ξ = Pη − (Pη − ξ) ∈ P(C (e) − e¯) − P(M (0) − e¯) = P(C (e) − M (0) ). −1 ) ∩ Pl1 ⊆ P(C (e) − M (0) ) whence This shows that B(0, αK (e) (0) cone P(C − M ) = Pl1 as required.
Corollary 5. Under the conditions of Lemmas 17 and 18, and for any convex closed real-valued f : l1 × l1 → R,
4
Convergence of truncates in l1 optimal feedback control
lim
87
inf f = inf f .
n→∞ Cn ∩M
C∩M
Proof. By the lemmas (and the cited result from [13]) the indicator functions δCn ∩M Attouch–Wets converge to δC∩M , and (since dom f has an interior) by any of the cited sum theorems, f + δCn ∩M → f + δC∩M also. The result then follows from Proposition 4. Our formulation of the control problem (see Section 4.4) was chosen to permit its re-expression as an optimization over l1 × l1 . This choice resulted from a wish to compare with the other methods we described in the introduction, which used duality theory. Accordingly, we sought to formulate minimizations over a space such as l1 , which has a nice dual (namely, l∞ ). However, this is not the only problem formulation that can be treated by the methods of this paper. Recall from Section 4.4 that we considered only stabilizing controllers K ∈ S(P ) for which the resulting control signal u was restricted to the subspace Xa ⊆ l∞ . This requirement will now be relaxed to u ∈ l∞ . This will be seen to entail only trivial changes to the main results of this section. (Note that in this case the resulting optimization is taken over l1 × l∞ , so duality methods would not be readily applicable, since the dual (l∞ )∗ has a complicated characterization.) The basic feasible set is now of the form F0 := {(e, u) ∈ l1 × l∞ | e = e(K)for some K ∈ S(P ) u = K ∗ e} where u is free to range through l∞ instead of its subspace Xa . If we define (where w ∈ l∞ and has rational z-transform) the altered sets ⎫ ⎧ (e, u) ∈ l1 × l∞ | ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ e ˆ (¯ p ) = 0 (¯ p pole of P : |¯ p | ≤ 1) i = 1,.., m ⎪ i i i 1 ⎪ ⎬ ⎨ M := eˆ(¯ zj ) = w(¯ ˆ zj ) (¯ zj zero of P : |¯ zj | ≤ 1) j = 1,.., m2 , ⎪ ⎪ ⎪ ⎪ ⎪ eˆ(¯ vk ) = 0 (¯ vk zero of w ˆ : |¯ vk | ≤ 1) k = 1,.., m3 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎭ ⎩ d∗e+n∗u=w∗d ˆ rational, e = 0}, Mr := {(e, u) ∈ M | eˆ, u the argument of Lemma 2 again yields Mr = F0 . Accordingly, we may now frame a minimization problem over the subset C ∩ M of l1 × l∞ where C = C (e) × C (u) but now C (u) = {u ∈ l∞ | Bi
(u)
(u)
≤ ui ≤ Ai
for all i ∈ N} ⊆ l∞ .
For simplicity, we consider only the required changes to Theorem 2, since this result has a simpler statement than Theorem 1. In fact, with the assumptions of Theorem 2, but with f : l1 ×l∞ → R given by f (e, u) = e 1 +ζ u ∞ , and u ∞ , the form of Theorem 2 is unaltered, all references to ¯ u 1 converted to ¯
88
R. Wenczel et al.
except that the ρ0 of Lemma 14 is not available, but another value may be used (namely, ρ0 = (γ + 1) max{1, ζ −1 }). Theorem 3. Let f : l1 × l∞ → R have the form f (e, u) := e 1 + ζ u ∞ for some ζ > 0. Assume Conditions 1, 1 and 1 of Theorem 1 hold (where the interiority in Condition 1 is relative to the l∞ -norm) and Condition 1 is replaced by γ := inf f < ∞ for some n0 . Cn0 ∩M
Define as in (4.27), with 8 −1 9 1 −1 min αK , μ − κ d 1 [(1 + αK ) + ξ 1 + ¯ e 1 ] , 2 u ∞ } + 2s) and r := 2 max{1, ζ} (max{ + ξ 1 , μ + ¯ u ∞ + 2s} . t := max{ + ξ 1 + 2s, μ + ¯ s :=
Then for any fixed ρ satisfying ρ > max 2r + t, (γ + 1) max{1, ζ −1 },
∞
6 max
7
-
(e) (e) |Ai |, |Bi |
i=0
and all n ≥ n0 for which
(e)
(e)
max{|Ai |, |Bi |} < s ,
i≥n
we obtain * * 6 7 * * (e) (e) * inf f − inf f * ≤ 2r + s + ρ max |Ai |, |Bi | . * *Cn ∩M C∩M s i≥n
Proof. Since the form of the convergence theory of Section 4.5 (and the associated CQs) is insensitive to the particular norm used, all we need to do to obtain a proof in this context is to reinterpret all statements in u-space as referring instead to the norm · ∞ . The only places where changes occur u ∞ ; and instead of are in Lemma 7, where all references to ¯ u 1 become ¯ using ρ0 from Lemma 14, we exploit the form of f to ensure the existence of some ρ0 for which Condition (4.15) of Proposition 4 is valid.
4.8 Appendix Remark 7. (A comment on z-transforms) ˆ On a small Let w = {wn }∞ n=0 be a sequence with a rational z-transform w. neighborhood of 0 in the complex plane, this is given by a power series
4
Convergence of truncates in l1 optimal feedback control
89
∞
wn z n for such z. However, by rationality, w ˆ has extension to the whole plane by its functional form, and so is extended beyond the domain of convergence of the power series. ∞ For example, if w = (1, 1, ...), then for |z| < 1, w(z) ˆ = n=0 z n = 1/(1−z), but the function taking z to 1/(1 − z) is defined on the whole plane except at 1 and so constitutes an extension in the above sense. We shall always assume that such transforms have been so extended. n=0
¯ + uc c + us s and Proof of Lemma 1: Let (e, u) ∈ F0 and write u = u 1 c+w s, with u ¯ and w ¯ in l . The ensuing relation d∗e+n∗u = d∗w w = w+w ¯ c s implies that ˆe + n ˆ¯ − w ˆ¯ dˆ ∈ lˆ1 ˆ )ˆ c + (ws dˆ − us n ˆ )ˆ s ≡ dˆ ˆu (wc dˆ − uc n ¯ Since and so cannot have any poles in the closed disk D. cˆ(z) =
z sin θ 1 − z cos θ and sˆ(z) = , (z − a)(z − a ¯) (z − a)(z − a ¯)
it follows that z →
ˆ − uc n ˆ − us n (wc d(z) ˆ (z))(1 − z cos θ) + (ws d(z) ˆ (z))z sin θ (z − a)(z − a ¯)
¯ and hence none at all in C, and must be a polynomial (so has no poles in D that d ∗ (wc c + ws s) − n ∗ (uc c + us s) has finite length). If now θ = 0, so s = 0, the above amounts to stating that z →
ˆ − uc n ˆ (z) wc d(z) 1−z
ˆ is polynomial, from which we obtain uc = wc d(1)/ˆ n(1) = wc /P (1). Simˆ cn wc d−u ˆ polynomial, so that ilarly, if θ = π (so again s = 0), we have 1+· ˆ uc = wc d(−1)/ˆ n(−1) = wc /P (−1). For other values of θ, similar reasoning implies ˆ − uc n ˆ − us n ˆ (a))(1 − a cos θ) + (ws d(a) ˆ (a))a sin θ = 0. (wc d(a) Rerranging terms and taking real and imaginary parts yields us cos θ sin θ + uc sin2 θ = Re us sin2 θ − uc sin θ cos θ
ws a sin θ+wc (1−a cos θ) P (a)
= Im
and
ws a sin θ+wc (1−a cos θ) P (a)
,
with the right-hand side vanishing if P has a pole at a. Solving this for uc and us gives the desired relation.
90
R. Wenczel et al.
Proof of Lemma 2: It suffices to show that Mr ⊆ F, since the reverse ˆ n − eˆ/(ˆ nw ˆ d). inclusion is easy to demonstrate. Let (e, u) ∈ Mr . Define R := yˆ/ˆ ˆ Then R = yˆ/ˆ n since e = 0. If we can prove that R ∈ R∞ then eˆ = w ˆ d(ˆ y −Rˆ n) ˆ x + Rd) ˆ and thus ˆ d(ˆ and from the convolution relation, u ˆ = βc cˆ + βs sˆ + w (e, u) ∈ F. ¯ the only candidates for poles of R are the Now, in the closed unit disk D, poles/zeros of P and the zeros of w. ˆ The proof now proceeds by successive elimination of these possibilities. ˆ p) = 0 and n If p¯ is such a pole (so d(¯ ˆ (¯ p) = 0), then, if p¯ is not a pole for w, ˆ w(¯ ˆ p) is finite and nonzero (from the assumptions on pole/zero positioning), ˆ which itself is implied so the nonsingularity of R at p¯ follows from that of eˆ/d, by the interpolation constraint eˆ(¯ p) = 0, whereby eˆ has a zero at p¯ cancelling ˆ If p¯ is also a pole of w the simple zero at p¯ for d. ˆ (so p¯ = a or p¯ = a ¯) then ˆ is finite and nonzero at a and a w ˆ d(·) ¯, so R has no pole there, regardless of the value of eˆ(a). ˆ z ) = 0 and n ¯ for P (so d(¯ ˆ (¯ z ) = 0), then again w ˆ is If z¯ is a zero in D finite and nonzero here. Now R is expressible as nˆ1dˆ(1 − weˆˆ ) − xdˆˆ at least in a punctured neighborhood of z¯, where we used the relation x ˆn ˆ + yˆdˆ = 1. The interpolation constraint eˆ(¯ z ) = w(¯ ˆ z ) = 0 means that 1 − weˆˆ has a zero at z¯, cancelling the simple zero of n ˆ dˆ there. Again the singularity of R here is removable. If v¯ is a zero of w, ˆ it is neither a pole nor a zero of P (by assumption) so ˆ v ) are both nonzero. The constraint eˆ(¯ n ˆ (¯ v ) and d(¯ v ) = 0 then implies that eˆ/w, ˆ and hence R, has a removable singularity there. Thus R has no poles in ¯ as claimed. D The proof of Lemma 4 will follow from the next two elementary lemmas. 1 (C) and let |a| < 1 be a zero of fˆ. Then fˆ(·)/(· − a) ∈ Lemma 19. Let fˆ ∈ l 1 (C) and is in l 1 if fˆ ∈ l 1 and a ∈ R. l
Proof. The indicated function is analytic on a neighborhood of 0 in the plane, so q := Z −1 [fˆ(·)/(· − a)] exists as a complex sequence. Assume now that a = 0 (the argument for a = 0 is trivial and hence omitted). Then since q = Z −1 [1/(· − a)] ∗ f = − a1 (1, a1 , a12 , . . . ) ∗ f , 1 qk = − ak+1
k j=0
fj aj =
1 ak+1
j>k
fj aj since fˆ(a) = 0.
Then, where an interchange of the order of summation occurs below, ∞ ∞ 1 1 j |f ||a| = |fj ||a|j q 1 ≤ j k+1 |a| |a|k+1 j=1 k=0 j>k k<j , + ∞ f 1 1 j ≤ . ≤ |fj ||a| j (1 − |a|) |a| 1 − |a| j=1
4
Convergence of truncates in l1 optimal feedback control
1 (C) and |a| > 1, then Lemma 20. If fˆ ∈ l and a ∈ R.
fˆ(·) ·−a
91
1 (C) and is in l 1 if fˆ ∈ l 1 ∈ l
Proof. As in the preceding lemma, let q be the inverse transform of the function in question. From the expression for qk given there, and an interchange of summations, ∞ ∞ 1 f 1 1 = |fj ||a|j = |fj ||a|j q 1 ≤ 1 . k+1 |a| 1 − |a| |a|j 1 − 1 j=0 j=0 k≥j |a|
Proof of Lemma 4: By Lemma 20, no generality is lostby assuming that ¯i ) p has no zeros outside the open unit disk. Write p(z) = C i (z − ai )(z − a ¯i are the zeros of p, with the understanding that for real ai where ai and a we only include the single factor z − ai in the above product. Also, we allow the possibility that the ai are nondistinct. Now, as fˆ(a1 ) = 0, Lemma 19 1 (C). If a is real, then the function is in l 1 —if implies that fˆ(·)/(· − a1 ) ∈ l 1 complex, then a1 = a ¯1 and since fˆ(¯ a1 ) = 0, so fˆ(·)/(· − a1 ) has a zero at a ¯1 fˆ(·) 1 (C) and hence is also in l 1 since it and by Lemma 19 again, ∈ l (·−a1 )(·−¯ a1 )
is a symmetric function of z. Continue inductively to complete the proof. Proof of Lemma 3: Let b1 , .., bp ∈ R and a1 , .., aq ∈ C\IR denote the ¯ the zeros of w ¯ along with the zeros collection of poles/zeros of P in D, ˆ in D, ¯ of P outside D. From the assumed distinctness of these, it follows that the corresponding square matrix ⎞ ⎛ 1 b1 · · · bp+2q−1 1 ⎟ ⎜. .. .. ⎟ ⎜ .. . . ⎟ ⎜ ⎜1 b p+2q−1 ⎟ · · · bp p ⎟ ⎜ ⎜ p+2q−1 ⎟ ⎟ ⎜ 1 Re a1 · · · Re a1 ⎟ ⎜. .. .. ⎟ ⎜. ⎟ ⎜. . . ⎜ p+2q−1 ⎟ ⎟ ⎜ 1 Re aq · · · Re aq ⎟ ⎜ ⎟ ⎜ 0 Im a1 · · · Im ap+2q−1 1 ⎟ ⎜ .. .. ⎟ ⎜ .. ⎠ ⎝. . . 0 Im aq · · · Im ap+2q−1 q has full rank over R. ¯ is a zero of P , then w(¯ Now, note that if z¯ ∈ / D ˆ z ) is finite. To see this, ˆ note that d(¯ z ) = 0, so by (2), (w − wc c − ws s)ˆ has no pole at z¯, and hence ˆ j) neither has w, ˆ since z¯ equals neither a nor its conjugate. Thus w(a ˆ i ) and w(b are all finite. By the surjectivity of the above matrix, and constructing an ˆ j ), we can find a nonzero appropriate real (p + 2q)-vector from w(a ˆ i ) and w(b e ∈ l1 of finite length such that all the interpolation constraints of M are satisfied, with, furthermore, ¯ of P . eˆ(¯ z ) = w(¯ ˆ z )at each zero z¯ ∈ /D
92
R. Wenczel et al.
We now seek u ∈ l1 of finite length such that u ˆ= =
ˆe w ˆ dˆ − n ˆ (βc cˆ + βs sˆ) − dˆ n ˆ 6 7 ˆ ˆe ˆ ¯ d + d(wc cˆ + ws sˆ) − n w ˆ (βc cˆ + βs sˆ) − dˆ n ˆ
where w ¯ := w − wc c − ws s. From the definition of βc and βs (see Lemma 1), ˆ c cˆ + ws sˆ) − n it follows that d(w ˆ (βc cˆ + βs sˆ) has no pole at a (nor at a ¯) and so must be polynomial, and hence the numerator in the above expression for ¯ dˆ and eˆ are. To show that u u ˆ is polynomial, since w ˆ is polynomial, we only need to see that the numerator has a zero at each zero of n ˆ . (Recall that we assumed all these rational transforms to be extended by their functional form to the whole complex plane, as in Remark 7). Indeed, let z¯ be a zero of n ˆ (that is, of P ). Then z¯ = a, a ¯ and eˆ(¯ z ) = w(¯ ˆ z ), and since cˆ(¯ z ) and sˆ(¯ z ) are both finite, the numerator evaluated at z¯ is ˆ z) ˆ z )−d(¯ ˆ z )(wc cˆ(¯ ¯ z )d(¯ z ) + ws sˆ(¯ z )) + n ˆ (¯ z )(βc cˆ(¯ z ) + βs sˆ(¯ z )) − eˆ(¯ z )d(¯ w(¯ ˆ z ) − d(¯ ˆ z )(wc cˆ(¯ ˆ z )d(¯ ˆ z) ¯ z )d(¯ = w(¯ z ) + ws sˆ(¯ z )) − φ(¯ ˆ z) = 0 = (w(¯ ˆ z ) − eˆ(¯ z ))d(¯ as claimed. Thus u has finite length.
References 1. H. Attouch, Variational Convergence for Functions and Operators, Applicable Mathematics Series (Pitman, London, 1984). 2. H. Attouch and R. J.–B. Wets, (1991), Quantitative stability of variational systems: I. The epigraphical distance, Trans. Amer. Math. Soc., 328 (1991), 695–729. 3. H. Attouch and R. J.–B. Wets, (1993), Quantitative stability of variational systems: II. A framework for nonlinear conditioning, SIAM J. Optim., 3, No. 2 (1993), 359–381. 4. H. Attouch and R. J.–B. Wets, Quantitative stability of variational systems: III. – approximate solutions, Math. Programming, 61 (1993), 197–214. 5. D. Az´ e and J.–P. Penot, Operations on convergent families of sets and functions, Optimization, 21, No. 4 (1990), 521–534. 6. S. P. Boyd and C. H. Barratt, Linear Controller Design: Limits of Performance (Prentice–Hall, Englewood Cliffs, NJ 1991). 7. G. Beer, Topologies on Closed and Closed Convex Sets, Mathematics and its Applications, 268 (Kluwer Academic Publishers, Dordrecht, 1993). 8. J. M. Borwein and A. S. Lewis, Partially–finite convex programming, Part I: Quasi relative interiors and duality theory, Math. Programming, 57 (1992), 15–48. 9. M. A. Dahleh and J. B. Pearson, l1 -Optimal feedback controllers for MIMO discrete– time systems, IEEE Trans. Automat. Control, AC-32 (1987), 314–322. 10. M. A. Dahleh and I. J. Diaz-Bobillo, Control of Uncertain Systems: A Linear Programming Approach (Prentice–Hall, Englewood Cliffs, NJ 1995). 11. G. Deodhare and M. Vidyasagar, Control system design by infinite linear programming, Internat. J. Control, 55, No. 6 (1992), 1351–1380.
4
Convergence of truncates in l1 optimal feedback control
93
12. C. A. Desoer and M. Vidyasagar, Feedback Systems: Input–Output Properties (Academic Press, New York, 1975). 13. A. C. Eberhard and R. B. Wenczel, Epi–distance convergence of parametrised sums of convex functions in non–reflexive spaces, J. Convex Anal., 7, No. 1 (2000), 47–71. 14. N. Elia and M. A. Dahleh, Controller design with multiple objectives, IEEE Trans. Automat. Control, AC-42, No. 5 (1997), 596–613. 15. R. D. Hill, A. C. Eberhard, R. B. Wenczel and M. E. Halpern, Fundamental limitations on the time–domain shaping of response to a fixed input, IEEE Trans. Automat. Control, AC-47, No. 7 (2002), 1078–1090. 16. R. B. Holmes, Geometric Functional Analysis and its Applications, Graduate Texts in Mathematics, 24 (Springer-Verlag, New York, 1975). 17. V. Jeyakumar, Duality and infinite–dimensional optimization, Nonlinear Analysis: Theory, Methods and Applications, 15, (1990), 1111–1122. 18. V. Jeyakumar and H. Wolkowicz, Generalizations of Slater’s constraint qualification for infinite convex programs, Math. Programming, 57 (1992), 85–102. 19. J. S. McDonald and J. B. Pearson, l1 –Optimal control of multivariable systems with output norm constraints, Automatica J. IFAC, 27, No. 2 (1991), 317–329. 20. R. T. Rockafellar, Conjugate Duality and Optimization (SIAM, Philadelphia, PA, 1974). 21. R. T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, NJ, 1970). 22. R. T. Rockafellar and R. J.–B. Wets, Variational systems, an introduction, Multifunctions and Integrands, G. Salinetti (Ed.), Springer-Verlag Lecture Notes in Mathematics No. 1091 (1984), 1–54. 23. H. Rotstein, Convergence of optimal control problems with an H∞–norm constraint, Automatica J. IFAC, 33, No. 3 (1997), 355–367. 24. M. Vidyasagar, Control System Synthesis (MIT Press, Cambridge MA, 1985). 25. M. Vidyasagar, Optimal rejection of persistent bounded disturbances, IEEE Trans. Automat. Control, AC-31 (1986), 527–534. 26. R. B. Wenczel, PhD Thesis, Department of Mathematics, RMIT, 1999. 27. R. B. Wenczel, A. C. Eberhard and R. D. Hill, Comments on “Controller design with multiple objectives,” IEEE Trans. Automat. Control, 45, No. 11 (2000), 2197–2198.
Chapter 5
Asymptotical stability of optimal paths in nonconvex problems Musa A. Mamedov
Abstract In this chapter we study the turnpike property for the nonconvex optimal control problems described by the differential inclusion x˙ ∈ a(x). We T study the infinite horizon problem of maximizing the functional 0 u(x(t)) dt as T grows to infinity. The purpose of this chapter is to avoid the convexity conditions usually assumed in turnpike theory. A turnpike theorem is proved in which the main conditions are imposed on the mapping a and the function u. It is shown that these conditions may hold for mappings a with nonconvex images and for nonconcave functions u. Key words: Turnpike property, differential inclusion, functional
5.1 Introduction and background Let x ∈ Rn and Ω ⊂ Rn be a given set. Denote by Πc (Rn ) the set of all compact subsets of Rn . We consider the following problem: ·
x ∈ a(x), x(0) = x0 ,
(5.1)
T u(x(t))dt → max .
JT (x(·)) =
(5.2)
0
Here x0 ∈ Ω ⊂ Rn is an assigned initial point. The multivalued mapping a : Ω → Πc (Rn ) has compact images and is continuous in the Hausdorff Musa A. Mamedov School of Information Technology and Mathematical Sciences, University of Ballarat, Victoria 3353, AUSTRALIA e-mail: musa [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 5, c Springer Science+Business Media, LLC 2009
95
96
M.A. Mamedov
metric. We assume that at every point x ∈ Ω the set a(x) is uniformly locally–connected (see [3]). The function u : Ω → R1 is a given continuous function. In this chapter we study the turnpike property for the problem given by (6.1) and (5.2). The term ‘turnpike property’ was first coined by Samuelson in 1958 [16] when he showed that an efficient expanding economy would spend most of its time in the vicinity of a balanced equilibrium path. This property was further investigated by Radner [13], McKenzie [11], Makarov and Rubinov [7] and others for optimal trajectories of a von Neuman–Gale model with discrete time. In all these studies the turnpike property was established under some convexity assumptions. In [10] and [12] the turnpike property was defined using the notion of statistical convergence (see [4]) and it was proved that all optimal trajectories have the same unique statistical cluster point (which is also a statistical limit point). In these works the turnpike property is proved when the graph of the mapping a does not need to be a convex set. The turnpike property for continuous-time control systems has been studied by Rockafellar [14], [15], Cass and Shell [1], Scheinkman [18], [17] and others who imposed additional conditions on the Hamiltonian. To prove a turnpike theorem without these kinds of additional conditions has become a very important problem. This problem has recently been further investigated by Zaslavsky [19], [21], Mamedov [8], [9] and others. The theorem proved in the current chapter was first given as a short note in [8]. In this work we give the proof of this theorem and explain the assumptions used in the examples.
Definition 1. An absolutely continuous function x(·) is called a trajectory (solution) to the system (6.1) in the interval [0, T ] if x(0) = x0 and almost · everywhere on the interval [0, T ] the inclusion x (t) ∈ a(x(t)) is satisfied. We denote the set of trajectories defined on the interval [0,T] by XT and let JT∗ =
sup JT (x(·)). x(·)∈XT
We assume that the trajectories of system (5.1) are uniformly bounded, that is, there exists a number L < +∞ such that |x(t)| ≤ L for all t ∈ [0, T ], x(·) ∈ XT , T > 0.
(5.3)
Note that in this work we focus our attention on the turnpike property of optimal trajectories. So we do not consider the existence of bounded trajectories defined on [0, ∞]. This issue has been studied for different control problems by Leizarowitz [5], [6], Zaslavsky [19], [20] and others. Definition 2. The trajectory x(·) is called optimal if J(x(·)) = JT∗ and is called ξ-optimal (ξ > 0) if
5
Asymptotical stability of optimal paths in nonconvex problems
97
J(x(·)) ≥ JT∗ − ξ. Definition 3. The point x is called a stationary point if 0 ∈ a(x). Stationary points play an important role in the study of the asymptotical behavior of optimal trajectories. We denote the set of stationary points by M: M = {x ∈ Ω : 0 ∈ a(x)}. We assume that the set M is bounded. This is not a hard restriction, because we consider uniformly bounded trajectories and so the set Ω can be taken as a bounded set. Since the mapping a(·) is continuous the set M is also closed. Then M is a compact set. Definition 4. The point x∗ ∈ M is called an optimal stationary point if u(x∗ ) = max u(x). x∈M
In turnpike theory it is usually assumed that the optimal stationary point x∗ is unique. We also assume that the point x∗ is unique, but the method suggested here can be applied in the case when we have several different optimal stationary points.
5.2 The main conditions of the turnpike theorem Turnpike theorems for the problem (6.1), (5.2) have been proved in [14], [18] and elsewhere, where it was assumed that the graph of the mapping a is a compact convex set and the function u is concave. The main conditions are imposed on the Hamiltonian. In this chapter a turnpike theorem is presented in which the main conditions are imposed on the mapping a and the function u. Here we present a relation between a and u which provides the turnpike property without needing to impose conditions such as convexity of the graph of a and of the function u. On the other hand this relation holds if the graph of a is a convex set and the function u is concave. Condition M There exists b < +∞ such that for every T > 0 there is a trajectory x(·) ∈ XT satisfying the inequality JT (x(·)) ≥ u∗ T − b. Note that satisfaction of this condition depends in an essential way on the initial point x0 , and in a certain sense it can be considered as a condition for the existence of trajectories converging to x∗ . Thus, for example, if there exists a trajectory that hits x∗ in a finite time, then Condition M is satisfied. Set B = {x ∈ Ω : u(x) ≥ u∗ }.
98
M.A. Mamedov
We fix p ∈ Rn , p = 0, and define a support function c(x) = max py. y∈a(x)
Here the notation py means the scalar product of the vectors p and y. By |c| we will denote the absolute value of c. We also define the function ϕ(x, y) =
u(x)−u∗ |c(x)|
+
u(y)−u∗ c(y) .
Condition H There exists a vector p ∈ Rn such that H1 c(x) < 0 for all x ∈ B, x = x∗ ; x) > 0; H2 there exists a point x ˜ ∈ Ω such that p˜ x = px∗ and c(˜ H3 for all points x, y, for which px = py, c(x) < 0, c(y) > 0, the inequality ϕ(x, y) < 0 is satisfied; and also if xk → x∗ , yk → y = x∗ , pxk = pyk , c(xk ) < 0 and c(yk ) > 0, then
lim supk→∞ ϕ(xk , yk ) < 0.
Note that if Condition H is satisfied for any vector p then it is also satisfied for all λp, (λ > 0). That is why we assume that ||p|| = 1. Condition H1 means that derivatives of the system (6.1) are directed to one side with respect to p, that is, if x ∈ B, x = x∗ , then py < 0 for all y ∈ a(x). It is also clear that py ≤ 0 for all y ∈ a(x∗ ) and c(x∗ ) = 0. Condition H2 means that there is a point x ˜ on the plan {x ∈ Rn : p(x − ∗ y > 0 for some y˜ ∈ a(˜ x). This is not a restrictive x ) = 0} such that p˜ assumption, but the turnpike property may be not true if this condition does not hold. The main condition here is H3. It can be considered as a relation between the mapping a and the function u which provides the turnpike property. Note that Conditions H1 and H3 hold if the graph of the mapping a is a convex set (in Rn × Rn ) and the function u is strictly concave. In the next example we show that Condition H can hold for mappings a without a convex graph and for functions u that are not strictly concave (in this example the function u is convex). Example 1 Let x = (x1 , x2 ) ∈ R2 and the system (6.1) have the form ·
x1 = λ[x21 + (x22 + 1)2 + w], ·
x2 = f (x1 , x2 , v),
− 1 ≤ w ≤ +1, v ∈ U ⊂ Rm .
Here λ > 0 is a positive number, the function f (x1 , x2 , v) is continuous and f (0, 0, v˜) = 0 for some v˜ ∈ U.
5
Asymptotical stability of optimal paths in nonconvex problems
99
The mapping a can be written as a(x) = { y = (y1 , y2 ) : y1 = λ[x21 + (x22 + 1)2 + w], y2 = f (x1 , x2 , v), x = (x1 , x2 ) ∈ R2 , − 1 ≤ w ≤ +1, v ∈ U }. The function u is given in the form u(x) = cx2 + dx2k 1 , where
d > 0, c ≥ 2d, k ∈ {1, 2, 3, ...}.
We show that Condition H holds. It is not difficult to see that the set of stationary points M contains the point (0, 0) and also M ⊂ B1 (0, −1), where B1 (0, −1) represents the sphere with center (0, −1) and with radius 1. We have u∗ = max u(x) = u(0, 0) = 0. x∈M
Therefore x∗ = (0, 0) is a unique optimal stationary point. We fix the vector p = (−1, 0) and calculate the support function c(x) = maxy∈a(x) py : c(x) = −λ(x21 + x22 + 2x2 ). Take any point x = (x1 , x2 ) ∈ B = {x : u(x) > 0} such that x = / B1 (0, −1) and therefore c(x) < 0. Then Condition x∗ = (0, 0). Clearly x ∈ H1 holds. Condition H2 also holds, because, for example, for the point x ˜= (0, −1) for which p˜ x = 0 we have c(˜ x) = λ > 0. Now we check Condition H3. Take any two points x = (x1 , x2 ), y = (y1 , y2 ) for which px = py, c(x) < 0, c(y) > 0. If u(x) < 0, from the expression of the function φ(x, y) we obtain that φ(x, y) < 0. Consider the case u(x) ≥ 0. From px = py we have x1 = y1 . Denote ξ = x1 = y1 . Since c(y) > 0 and λ > 0 we obtain ξ 2 + (y2 + 1)2 < 1. Therefore 0 < ξ < 1 and y2 + (1/2)ξ2 < 0. On the other hand u(y) − u∗ = cy2 + dξ 2k < cy2 + dξ 2 ≤ c[y2 + (1/2)ξ 2 ]. Since c(x) < 0 and u(x) ≥ 0 then |c(x)| = λ(ξ 2 +x22 +2x2 ), x2 +(1/2)ξ 2 + (1/2)x22 > 0, u(x) − u∗ = cx2 + dξ 2k < cx2 + dξ 2 ≤ c[x2 + (1/2)ξ 2 ]. Thus c(x2 + (1/2)ξ 2 ) c u(x) − u∗ < ≤ , |c(x)| λ(ξ 2 + x22 + 2x2 ) 2λ c(y2 + (1/2)ξ 2 ) c u(y) − u∗ < ≤− . c(y) −λ(ξ 2 + y22 + 2y2 ) 2λ
100
M.A. Mamedov
From these inequalities we have ϕ(x, y) < 0, that is, the first part of H3 holds. The second part of Condition H3 may also be obtained from these inequalities. Therefore Condition H holds. We now formulate the main result of the current chapter. Theorem 1. Suppose that Conditions M and H are satisfied and that the optimal stationary point x∗ is unique. Then: 1. there exists C < +∞ such that
T
(u(x(t)) − u∗ )dt ≤ C
0
for every T > 0 and every trajectory x(t) ∈ XT ; 2. for every ε > 0 there exists Kε,ξ < +∞ such that meas {t ∈ [0, T ] : ||x(t) − x∗ || ≥ ε} ≤ Kε,ξ for every T > 0 and every ξ-optimal trajectory x(t) ∈ XT ; 3. if x(t) is an optimal trajectory and x(t1 ) = x(t2 ) = x∗ , then x(t) ≡ x∗ for t ∈ [t1 , t2 ]. The proof of this theorem is given in Section 7, and in Sections 3 to 6 we give preliminary results.
5.3 Definition of the set D and some of its properties In this section a set D is introduced. This set will be used in all sections below. Denote M∗ = {x ∈ Ω : c(x) ≥ 0}. Clearly M ⊂ M∗ . We recall that B = {x ∈ Ω : u(x) ≥ u∗ }. Consider a compact set D ⊂ Ω for which the following conditions hold: a) x ∈ int D for all x ∈ B, x = x∗ ; b) c(x) < 0 for all x ∈ D, x = x∗ ; c) D ∩ M∗ = {x∗ } and B ⊂ D. It is not difficult to see that there exists a set D with properties a), b) and c). For example such a set can be constructed as follows. Let x ∈ B, x = x∗ . Then c(x) < 0. Since the mapping a is continuous in the Hausdorff metric the function c(x) is continuous too. Therefore there exists εx > 0 such that c(x ) < 0 for all x ∈ Vεx (x) ∩ Ω. Here Vε (x) represents the open ε-neighborhood of the point x. In this case for the set
5
Asymptotical stability of optimal paths in nonconvex problems
D = cl
⎧ ⎨
/
⎩
x∈B,x =x∗
101
⎫ ⎬ V 21 εx (x) ∩ Ω ⎭
Conditions a) to c) are satisfied. Lemma 1. For every ε > 0 there exists νε > 0 such that u(x) ≤ u∗ − νε for every x ∈ Ω, x ∈ / int D and ||x − x∗ || ≥ ε. Proof. Assume to the contrary that for any ε > 0 there exists a sequence xk such that xk ∈ / int D, ||xk − x∗ || ≥ ε and u(xk ) → u∗ as k → ∞. Since the / int D sequence xk is bounded it has a limit point, say x . Clearly x = x∗ , x ∈ and also u(x ) = u∗ , which implies x ∈ B. This contradicts Condition a) of the set D. Lemma 2. For every ε > 0 there exists ηε > 0 such that c(x) < −ηε and for all x ∈ D, ||x − x∗ || ≥ ε.
Proof. Assume to the contrary that for any ε > 0 there exists a sequence xk such that xk ∈ D, ||xk − x∗ || ≥ ε and c(xk ) → 0. Let x be a limit point of the sequence xk . Then x ∈ D, x = x∗ and c(x ) = 0. This contradicts Property b) of the set D.
5.4 Transformation of Condition H3 In this section we prove an inequality which can be considered as a transformation of Condition H3. Take any number ε > 0 and denote Xε = {x : ||x − x∗ || ≥ ε}. Consider the sets D ∩ Xε and M∗ ∩ Xε . Let x ∈ D ∩ Xε , y ∈ M∗ ∩ Xε and px = py, with c(x) < 0 and c(y) > 0. From Condition H3 it follows that ϕ(x, y) < 0.
(5.4)
We show that for every ε > 0 there exists δε > 0 such that ϕ(x, y) < −δε
(5.5)
for all x ∈ D (x = x∗ ), y ∈ M∗ ∩ Xε , for which px = py, c(x) < 0 and c(y) > 0.
102
M.A. Mamedov
First we consider the case where x ∈ D ∩ Xε . In this case if (5.5) is not true then there exist sequences (xn ) and (yn ), for which pxn = pyn , c(xn ) < 0, c(yn ) > 0, xn ∈ D ∩ Xε , yn ∈ M∗ ∩ Xε and ¯, yn → y¯, ϕ(xn , yn ) → 0. xn → x From Lemma 17.4.1 it follows that the sequence {(u(xn ) − u∗ )/|c(xn )|} is bounded. Since ϕ(xn , yn ) → 0, the sequence {(u(yn ) − u∗ )/c(yn )} is also bounded and therefore from Lemma 17.3.1 we obtain c(¯ y ) > 0. We also obtain c(¯ x) < 0 from the inclusion x ¯ ∈ D ∩ Xε . Thus the function ϕ(x, y) is continuous at the point (¯ x, y¯). Then from ϕ(xn , yn ) → 0 it follows that ϕ(¯ x, y¯) = 0, which contradicts (5.4). We now consider the case where x ∈ D, x = x∗ . Assume that (5.5) does not hold. Then there exist sequences (xn ) and (yn ), for which pxn = ¯, yn → y¯, pyn , c(xn ) < 0, c(yn ) > 0, xn ∈ D, yn ∈ M∗ ∩ Xε and xn → x ¯ = x∗ , we have a contradiction similar to the first case. If ϕ(xn , yn ) → 0. If x x ¯ = x∗ , taking the inequality y¯ = x∗ into account we obtain a contradiction to the second part of Condition H3. Thus we have shown that (5.5) is true. Define the function ϕ(x, y)δ1 ,δ2 =
u(x) − u∗ u(y) − u∗ + , |c(x)| + δ1 c(y) + δ2
(δ1 ≥ 0, δ2 ≥ 0).
Since the support function c(·) is continuous, from Conditions H2 and H3 it follows that there exists a number b ∈ (0, +∞) such that u(x) − u∗ ≤b |c(x)|
for all x ∈ D, x = x∗ .
(5.6)
For a given ε > 0 we choose a number γ(ε) > 0 such that b−
νε ≤ −δε ; γε
(5.7)
here the number νε is defined by Lemma (17.3.1). Clearly u(y) − u∗ ≤ −νε
for all y ∈ M∗ ∩ Xε .
(5.8)
By using γ(ε) we divide the set M∗ ∩ Xε ∩ {y : c(y) > 0} into two parts: 1 γ(ε)}, 2 1 Y2 = {y ∈ M∗ ∩ Xε ∩ {y : c(y) > 0} : c(y) < γ(ε)}. 2
Y1 = {y ∈ M∗ ∩ Xε ∩ {y : c(y) > 0} : c(y) ≥
5
Asymptotical stability of optimal paths in nonconvex problems
Consider the set Y2 . Denote δ¯2 = Then u(x) − u∗ ≤b |c(x)| + δ1
1 2
103
γ(ε) and take any number δ¯1 > 0.
for all x ∈ D, x = x∗ , δ1 ≤ δ¯1
(5.9)
and
1 1 c(y) + δ2 ≤ c(y) + δ¯2 < γ(ε) + γ(ε) = γ(ε) 2 2 for all 0 ≤ δ2 ≤ δ¯2 , y ∈ Y2 . Using (5.8) we obtain νε νε u(y) − u∗ ≤− ≤− c(y) + δ2 c(y) + δ2 γ(ε)
for all y ∈ Y2 .
(5.10)
Thus from (5.9) and (5.10) we have ϕ(x, y)δ1 ,δ2 ≤ b −
νε ≤ − δε γ(ε)
(5.11)
for all (x, y) and (δ1 , δ2 ) satisfying x ∈ D, x = x∗ , y ∈ Y2 , c(x) < 0, c(y) > 0, 1 0 ≤ δ1 ≤ δ¯1 , 0 ≤ δ2 ≤ δ¯2 = γ(ε). 2 Now consider the set Y1 . Since Y1 is a bounded closed set and c(y) ≥ γ(ε) > 0 for all y ∈ Y1 , then the function (u(y) − u∗ )/(c(y) + δ2 ) is uniformly continuous with respect to (y, δ2 ) on the closed set Y1 × [0, L], where L > 0. That is why for a given number δε there exists δˆ¯21 (ε) such that 1 2
1 u(y) − u∗ u(y) − u∗ ≤ δε − c(y) + δ2 c(y) 2
for all y ∈ Y1 , 0 ≤ δ2 ≤ δˆ¯21 (ε).
On the other hand u(x) − u∗ u(x) − u∗ ≤ |c(x)| + δ1 |c(x)|
for all x ∈ D, x = x∗ , u(x) ≥ u∗ , 0 ≤ δ1 ≤ δ¯1 .
Then if px = py we have ϕ(x, y)δ1 ,δ2 ≤ ϕ(x, y) +
1 1 1 δε ≤ − δ ε + δε = − δε , 2 2 2
for all x ∈ D, x = x∗ , u(x) ≥ u∗ , y ∈ Y1 , px = py, 0 ≤ δ1 ≤ δ¯1 and 0 ≤ δ2 ≤ δˆ¯21 (ε).
(5.12)
104
M.A. Mamedov
Now consider the case when x ∈ D and u(x) < u∗ . Since the function c(y) is bounded on the set Y1 , then for ε > 0 there exist δˆ¯22 (ε) > 0 and δˆε > 0 such that 1 u(y) − u∗ ≤ − δˆε c(y) + δ2 2
for all y ∈ Y1 , 0 ≤ δ2 ≤ δˆ¯22 (ε).
Then ϕ(x, y)δ1 ,δ2 ≤
u(y) − u∗ 1 ≤ − δˆε , c(y) + δ2 2
(5.13)
for all x ∈ D, u(x) < u∗ , y ∈ Y1 , px = py, c(x) < 0, 0 ≤ δ1 ≤ δ¯1 and 0 ≤ δ2 ≤ δˆ¯22 (ε). Therefore from (5.11) to (5.13) we have ϕ(x, y)δ1 ,δ2 ≤ − δ¯ε ,
(5.14)
for all x ∈ D, x = x∗ , y ∈ M∗ ∩ Xε , px = py, c(y) > 0, 0 ≤ δ1 ≤ δ¯1 and 0 ≤ δ2 ≤ δ2 (ε). Here δ¯ε = min{ 12 δε ,
1 2
δˆε } and δ2 (ε) = min{ 12 γ(ε), δ¯ˆ21 (ε), δ¯ˆ22 (ε)}.
Consider the function (u(x) − u∗ )/(|c(x)| + δ1 ) with respect to (x, δ1 ). From Lemma 17.4.1 we obtain that this function is continuous on the set (D ∩ Xεˆ) × [0, δ¯1 ] for all εˆ > 0. Then for a given number 12 δ¯ε > 0 there exists η = η(ε, εˆ) > 0 such that u(x ) − u∗ 1 u(x) − u∗ − ≤ δ¯ε |c(x)| + δ1 |c(x )| + δ1 2 for all x ∈ cl (Vη (x )) = {x : ||x − x || ≤ η}, x ∈ D ∩ Xε and 0 ≤ δ1 ≤ δ¯1 . If px = py then we have 1¯ 1 δε ≤ − δ¯ε , 2 2 ∗ for all x ∈ cl (Vη (x )), x ∈ D ∩ Xε , y ∈ M ∩ Xε , px = py, c(y) > 0, 0 ≤ δ1 ≤ δ¯1 and 0 ≤ δ2 ≤ δ2 (ε). Denote δ (ε) = min{ 12 δ¯ε , δ2 (ε)}. Obviously δ (ε) > 0 if ε > 0 and also for every ε > 0 there exists δ > 0 such that δ (ε) ≥ δ for all ε ≥ ε . Therefore there exists a continuous function δ (ε), with respect to ε, such that ϕ(x, y)δ1 ,δ2 ≤ − ϕ(x , y)δ1 ,δ2 +
• δ (ε) ≤ δ (ε) for all ε > 0 and • for every ε > 0 there exists δ > 0 such that δ (ε) < δ for all ε ≥ ε .
5
Asymptotical stability of optimal paths in nonconvex problems
105
Let x ∈ D and y ∈ M∗ . Taking εˆ = ||x − x∗ || and e = ||y − x∗ ||, we define the functions δ(y) = δ (ε) = δ (||y − x∗ ||) and η(x , y) = η(ε, εˆ) = η(||y − x∗ ||, ||x − x∗ ||). Clearly in this case the function δ(y) is continuous with respect to y. Thus the following lemma is proved. Lemma 3. Assume that at the point x ∈ D, y ∈ M∗ we have px = py, c(x) < 0 and c(y) > 0. Then for every point x and numbers δ1 , δ2 satisfying
x ∈ cl Vη(x ,y) (x ) , c(x) < 0, 0 ≤ δ1 ≤ δ¯1 and 0 ≤ δ2 ≤ δ2 (y), the following inequality holds: u(y) u(x) + ≤ u∗ |c(x)| + δ1 c(y) + δ2
+
1 1 + |c(x)| + δ1 c(y) + δ2
, − δ(y).
Here the functions η(x , y) and δ(y) are such that δ(y) is continuous and ∧
∼
∧
for every ε > 0, ε> 0 there exist δ ε > 0 and η ε,∼ε > 0 such that ∧
∧
δ(y) ≥δ ε and η(x , y) ≥η ε,∼ε ∼
for all (x , y) for which ||x − x∗ || ≥ ε, ||y − x∗ || ≥ ε.
5.5 Sets of 1st and 2nd type: Some integral inequalities In this section, for a given trajectory x(t) ∈ XT we divide the interval [0, T ] into two types of intervals and prove some integral inequalities. We assume that x(t) is a given continuously differentiable trajectory.
5.5.1 Consider a set {t ∈ [0, T ] : x(t) ∈ int D}. This set is an open set and therefore it can be presented as a union of a countable (or finite) number of open intervals τk = (tk1 , tk2 ), k = 1, 2, 3, ..., where τk ∩ τl = ∅ if k = l. We denote the set of intervals τk by G = {τk : k = 1, 2, ...}. From the definition of the set D we have d · px(t) = p x (t) ≤ c(x(t)) < 0 for all t ∈ τk , k = 1, 2, .... dt Then px(tk1 ) > px(tk2 ) for all k. We introduce the notation piτk = px(tki ), i = 1, 2, ..., Pτk = [p2τk , p1τk ] and 0 Pτk = (p2τk , p1τk ).
106
M.A. Mamedov
5.5.2 We divide the set G into the sets gm (G = ∪m gm ), such that for every set g = gm the following conditions hold: a) The set g consists of a countable (or finite) number of intervals τk , for which i - related intervals Pτ0k are disjoint; ii - if [t1τi , t2τi ] ≤ [t1τj , t2τj ] for some τi , τj ∈ g, then Pτi ≥ Pτj ; (here and henceforth the notation [t1τi , t2τi ] ≤ [t1τj , t2τj ] and Pτi ≥ Pτj means that t2τi ≤ t1τj and p2τi ≥ p1τj , respectively) iii - if [t1τ , t2τ ] ⊂ [t1g , t2g ] for τ ∈ G, then τ ∈ g; here t1g = inf τ ∈g t1τ and t2g = supτ ∈g t2τ . b) The set g is a maximal set satisfying Condition a); that is, the set g cannot be extended by taking other elements τ ∈ G such that a) holds. Sets gm (G = ∪m gm ) satisfying these conditions exist. For example we can construct a set g satisfying Conditions a) and b) in the following way. Take any interval τ1 = (t11 , t12 ). Denote by t2 a middle point of the interval [0, t11 ] : t2 = 12 t11 . If for all intervals τ ∈ G, for which τ ⊂ [t2 , t11 ], Conditions i and ii hold, then we take t3 = 12 t2 , otherwise we take t3 = 12 (t2 + t11 ). We repeat this process and obtain a convergent sequence tn . Let tn → t . In this case for all intervals τ ∈ G, for which τ ⊂ [t , t11 ], Conditions i and ii are satisfied. Similarly, in the interval [t12 , T ] we find the point t such that for all intervals τ ∈ G, for which τ ⊂ [t12 , t ], Conditions i and ii are satisfied. Therefore we obtain that for the set g which consists of all intervals τ ∈ G, for which τ ⊂ [t , t ], Condition a) holds. It is not difficult to see that for the set g Condition b) also holds. Thus we have constructed a set g1 = g satisfying Conditions a) and b). We can construct another set g2 taking G \ g1 in the same manner and so on. Therefore G = ∪m gm , where for all gm Conditions a) and b) hold. For every set g, we have intervals [t1g , t2g ] (see iii) and [p1g , p2g ], where p1g = supτ ∈g p1τ and p2g = inf τ ∈g p2τ . Definition 5. We say that g1 < g2 if: a) p2g1 < p1g2 and t2g1 < t1g2 ; b) there is no g ∈ G which belongs to the interval [t2g1 , t1g2 ]. Take some set g 1 ∈ G and consider all sets gm ∈ G, m = ±1, ±2, ±3, ... for which ... < g−2 < g−1 < g 1 < g1 < g2 < .... The number of sets gm may be at most countable. Denote G1 = {g 1 } ∪ {gm : m = ±1, ±2, ±3, ...}.
5
Asymptotical stability of optimal paths in nonconvex problems
107
We take another set g 2 ∈ G \ G1 and construct a set G2 similar to G1 , and so on. We continue this procedure and obtain sets Gi . The number of these sets is either finite or countable. Clearly G = ∪i Gi . Denote t1Gi = inf g∈Gi t1g and t2Gi = supg∈Gi t2g . Clearly ∪i [t1Gi , t2Gi ] ⊂ [0, T ]
(5.15)
and (t1Gi , t2Gi ) ∩ (t1Gj , t2Gj ) = ∅ for all i = j. Proposition 1. Let gk ∈ Gi , k = 1, 2, .... Then a) if g1 < g2 < g3 < .... then x(t2Gi ) = x∗ ; b) if g1 > g2 > g3 > .... then x(t1Gi ) = x∗ . Proof. Consider case a). Take some gk . By Definition 5 it is clear that there exists an interval τk ∈ gk and a point tk ∈ τk such that x(tk ) ∈ D.
(5.16)
Consider the interval [t2gk , t1gk+1 ]. Since gk < gk+1 by Definition 5 we have px(t2gk ) < px(t1gk+1 ). Therefore there exists a point sk ∈ (t2gk , t1gk+1 ) such that ·
p x (sk ) ≥ 0, which implies x(sk ) ∈ M∗ .
(5.17)
On the other hand, |t2gk − t1gk | → 0 and |t1gk+1 − t2gk | → 0, as k → ∞. −
−
−
Then x(t) →x as t → t2Gi . In this case x(tk ) →x and x(sk ) →x as k → ∞. Therefore from the definition of the set D and from (5.16) and (5.17) we −
−
obtain x∈ D ∩ M∗ = {x∗ }; that is, x= x∗ . We can prove case b) in the same manner.
5.5.3 Take the set Gi . We denote by t1i an exact upper bound of the points t2Gm satisfying t2Gm ≤ t1Gi and by t2i an exact lower bound of the points t1Gm satisfying t1Gm ≥ t2Gi . Proposition 2. There exist points ti ∈ [t1i , t1Gi ] and ti ∈ [t2Gi , t2i ] such that x(ti ) = x(ti ) = x∗ . Proof. First we consider the interval [t1i , t1Gi ]. Two cases should be studied. 1. Assume that the exact upper bound t1i is not reached. In this case there exists a sequence of intervals [t1Gm , t2Gm ] such that t2Gm → t1i and t1Gm → t1i . Since the intervals (t1Gm , t2Gm ) are disjoint we obtain that x(t1i ) = x∗ . Therefore ti = t1i .
108
M.A. Mamedov
2. Assume that t1i = t2Gm for any Gm . If there exists a sequence gk ∈ Gm , k = 1, 2, ..., such that g1 < g2 < ..., then from Proposition 1 it follows that x(t2Gm ) = x∗ . So we can take ti = t2Gm = t1i . Therefore we can assume that the set Gm consists of gmk , where gm1 > gm2 > .... Now consider the set Gi . If in this set there exists a sequence gil such that gi1 > gi2 > · · · then from Proposition 6.1 it follows that x(t1Gi ) = x∗ , so we can take ti = t1Gi . That is why we consider the case when the set Gi consists of gil , where gi1 < gi2 < .... Consider the elements gm1 and gi1 . We denote gm = gm1 and gi = gi1 . The elements gm and gi belong to the different sets Gm and Gi , so they cannot be compared by Definition 5. Since the second condition of this definition holds, the first condition is not satisfied, that is, p2Gm ≥ p2Gi . This means that the interval τ in gm and gi forms only one element of type g. This is a contradiction. Therefore we can take either ti = t1i or ti = t1Gi . Similarly we can prove the proposition for the interval [t2Gi , t2i ]. Note We take ti = 0 (or ti = T ) if for the chosen set Gi there does not exist Gm such that t2Gm ≤ t1Gi (or t1Gm ≥ t2Gi , respectively). Therefore the following lemma is proved. Lemma 4. The interval [0, T ] can be divided into an at most countable number of intervals [0, t1 ], [t1k , t2k ] and [t2 , T ], such that the interiors of these intervals are disjoint and a) [0, T ] = [0, t1 ] ∪ {∪k [t1k , t2k ]} ∪ [t2 , T ]; b) in each interval [0, t1 ], [t1k , t2k ] and [t2 , T ] there is only one set G0 , Gk and GT , respectively, and G = G0 ∪ {∪k Gk } ∪ GT ; c) x(t1 ) = x(t1k ) = x(t2k ) = x(t2 ) = x∗ ,
k = 1, 2, 3, ....
5.5.4 In this subsection we give two lemmas. Lemma 5. Assume that the function x(t) is continuously differentiable on the interval [t1 , t2 ] and p1 < p2 , where pi = px(ti ), i = 1, 2. Then there exists an at most countable number of intervals [tk1 , tk2 ] ⊂ [t1 , t2 ] such that · a) p x (t) > 0, t ∈ [tk1 , tk2 ], k = 1, 2, ...; b) [pk1 , pk2 ] ⊂ [p1 , p2 ] and pk1 < pk2 for all k = 1, 2, ..., where pki = px(tki ), i = 1, 2;
5
Asymptotical stability of optimal paths in nonconvex problems
109
c) the intervals (pk1 , pk2 ) are disjoint and p 2 − p1 = (pk2 − pk1 ). k
Proof: For the sake of definiteness, we assume that px(t) ∈ [p1 , p2 ], t ∈ [t1 , t2 ]. Otherwise we can consider an interval [t1 , t2 ] ⊂ [t1 , t2 ], for which px(t) ∈ [p1 , p2 ], t ∈ [t1 , t2 ] and pi = px(ti ), i = 1, 2. We set t(q) = min{t ∈ [t1 , t2 ] : px(t) = q} for all q ∈ [p1 , p2 ] and then define a set m = {t(q) : q ∈ [p1 , p2 ]}. Clearly m ⊂ [t1 , t2 ]. Consider a function a(t) = px(t) defined on the set m. It is not difficult to see that for every q ∈ [p1 , p2 ] there is only one number . t(q), for which a(t(q)) = q, and also a (t(q)) ≥ 0, ∀q ∈ [p1 , p2 ]. We divide the interval [p1 , p2 ] into two parts as follows: .
.
P1 = {q : a (t(q)) = 0} and P2 = {q : a (t(q)) > 0}. Define the sets m(P1 ) = {t(q) : q ∈ P1 } and m(P2 ) = {t(q) : q ∈ P2 }. We denote by mΛ a set of points t ∈ [t1 , t2 ] which cannot be seen from the left (see [2]). It is known that the set mΛ can be presented in the form mΛ = ∪n (αn , βn ). Then we can write [t1 , t2 ] = m(P1 ) ∪ m(P2 ) ∪ mΛ ∪ (∪n {βn }).
Let q ∈ P2 . Since the function x(t) is continuously differentiable, there exists a number ε > 0, such that Vε (q) ⊂ P2 , where Vε (q) stands for the open ε-neighborhood of the point q. Therefore the set m(P2 ) is an open set and that is why it can be presented as m(P2 ) = ∪k (tk1 , tk2 ). Thus we have
t2 p2 − p1 = px(t2 ) − px(t1 ) =
·
t1 tk 2
k
tk 1
βn
·
p x (t)dt +
n α n
·
·
p x (t)dt =
p x (t)dt + m(P1 )
p x (t)dt +
·
p x (t)dt. ∪n {βn }
110
M.A. Mamedov ·
.
It is not difficult to observe that p x (t) = a (t) = 0, ∀t ∈ m(P1 ), meas(∪n {βn }) = 0 and px(αn ) = px(βn ), n = 1, 2, ... (see [2]). Then we obtain (px(tk2 ) − px(tk1 )) = (pk2 − pk1 ). p2 − p1 = k
k
Therefore for the intervals [tk1 , tk2 ] all assertions of the lemma hold. Lemma 6. Assume that on the intervals [t1 , t2 ] and [s2 , s1 ] the following conditions hold:
1. px(ti ) = px(si ) = pi , i = 1, 2. 2. x(t) ∈ int D, ∀t ∈ (t1 , t2 ). In particular, from this condition it follows · that p x (t) < 0, ∀t ∈ (t1 , t2 ). · 3. p x (s) > 0, ∀s ∈ (s2 , s1 ). Then
t2
s1
u(x(s))ds ≤ u [(t2 − t1 ) + (s1 − s2 )] −
u(x(t))dt + t1
s1
∗
s2
δ 2 (x(s))ds s2
where the function δ(x) is as defined in Lemma 17.5.1. Proof: Consider two cases.
I. Let p∗ = px∗ = pi , i = 1, 2. We recall that pi = px(ti ), i = 1, 2. In this case from Conditions 2) and 3) we have ∼
ε= ρ (x∗ , {x(t) : t ∈ [t1 , t2 ]}) > 0 and ε = ρ (x∗ , {x(s) : s ∈ [s2 , s1 ]}) > 0. ∧
∧
Now we use Lemma 17.5.1. We define δ =δ ε > 0 and η =η ε,∼ε for the chosen ∼
numbers ε and ε. We take any number N > 0 and divide the interval [p2 , p1 ] into N equal parts [pk2 , pk1 ]. From Conditions 2 and 3 it follows that in this case the intervals [t1 , t2 ] and [s2 , s1 ] are also divided into N parts, say [tk1 , tk2 ] and [sk2 , sk1 ], respectively. Here px(tki ) = px(ski ) = pki , i = 1, 2, k = 1, ..., N. Clearly pk1 − pk2 =
p 1 − p2 → 0 as N → ∞. N ∼
Since x(t) ∈ D and ||x(t) − x∗ || ≥ ε > 0 for all t ∈ [t1 , t2 ] then from Lemma 17.4.1 it follows that ·
p x (t) ≤ c(x(t)) < −η∼ε < 0. That is why for every k we have tk2 − tk1 → 0 as N → ∞. Therefore for a given η > 0 there exists a number N such that
5
Asymptotical stability of optimal paths in nonconvex problems
max
k t,s∈[tk 1 ,t2 ]
111
||x(t) − x(s)|| < η for all k = 1, ..., N.
(5.18)
Now we show that for all k = 1, ..., N sk1 − sk2 → 0 as N → ∞.
(5.19)
Suppose that (5.19) is not true. In this case there exists a sequence of intervals [sk2N , sk1N ], such that ski N → si , i = 1, 2, and s2 < s1 . Since pk1N −pk2N → 0 as N → ∞ then px(s2 ) = px(s1 ) = p , and moreover px(s) = p for all s ∈ [s2 , s1 ]. This is a contradiction. So (5.19) is true. A. Now we take any number k and fix it. For the sake of simplicity we denote the intervals [tk1 , tk2 ] and [sk2 , sk1 ] by [t1 , t2 ] and [s2 , s1 ], respectively. Let pi = px(ti ) = px(si ), i = 1, 2. Take any s ∈ (s2 , s1 ) and denote by t the point in the interval (t1 , t2 ) for which px(t ) = px(s). From (6.8) it follows that x(t) ∈ Vη (x(t )) for all t ∈ [t1 , t2 ]. Therefore we can apply Lemma 17.5.1. We also note that the following conditions hold: ·
• |cx(t))| ≤ |p x (t)| for all t ∈ (t1 , t2 ); · • cx(s)) ≥ p x (s) for all s ∈ (s2 , s1 ); • u(x(s)) ≤ u∗ for all s ∈ [s2 , s1 ]. Then from Lemma 17.5.1 we obtain u(x(t)) ·
|p x (t)| + δ1
+
u(x(s)) ·
p x (s) + δ2
≤u
1
∗
·
|p x (t)| + δ1
+
1
·
p x (s) + δ2
− δ(x(s)),
(5.20)
for all t ∈ (t1 , t2 ), s ∈ (s2 , s1 ), δ1 ∈ [0, δ¯1 ] and δ2 ∈ [0, δ(x(s))]. Denote ξ = mins∈[s2 ,s1 ] δ(x(s)). Clearly ξ > 0. Since the function δ(x) is ∼ continuous there is a point s∈ [s2 , s1 ] such that ∼
ξ = δ(x( s)). We transform t → π and s → ω as follows: s2 − s1 (t1 − t), t ∈ [t1 , t2 ], t1 − t2 ω = px(s) + ξ(s − s1 ), s ∈ [s2 , s1 ].
π = px(t) + ξ
112
M.A. Mamedov
Clearly ∼
·
·
dπ = [p x (t)− ξ ]dt and dω = [p x (s) + ξ]ds, ∼
where ξ = ξ(s2 − s1 )/(t1 − t2 ). ∼
·
·
Since p x (t)− ξ < 0 and p x (s) + ξ > 0 then there exist inverse functions
t = t(π) and s = s(ω). We also note that π1 = px(t1 ) = px(s1 ) = ω1 and
π2 = px(t2 ) + ξ(s2 − s1 ) = ω2 . Therefore we have A
t2
=
u(x(t))dt + t1 π
2
=
=
s1 u(x(s))ds s2
u(x(t(π)))
ω1 ∼ dπ
+
u(x(s(ω)))
dω · x (s(ω)) + ξ p ω2 u(x(s(ω))) dω. ∼ + · · p x (s(ω)) + ξ |p x (t(ω))|+ ξ ·
π1 p x (t(π))− ξ
ω1 u(x(t(ω))) ω2
∼
Let δ¯1 > ξ . Since ξ ≤ δ(x(t)) = δ(x(t(ω))), t(ω) ∈ [t1 , t2 ], s(ω) ∈ [s2 , s1 ], then from (6.10) we obtain
ω1
ω1 1 1 ∗ dω − δ(x(s(ω)))dω A≤ u ∼ + · · x (s(ω)) + ξ p x (t(ω))|+ |p ξ ω2 ω2 ⎛t ⎞
2
s1
s1 · ∗⎝ ⎠ =u dt + ds − δ(x(s))[p x (s) + ξ]ds t1 ∗
s2
s2
s1
≤ u [(t2 − t1 ) + (s1 − s2 )] −
ξδ(x(s)ds. s2 ∼
On the other hand δ(x(s)) ≥ ξ = δ(x( s)). Thus ∼
A ≤ u∗ [(t2 − t1 ) + (s1 − s2 )] − (s1 − s2 )δ 2 (x( s)). B. Now we consider different numbers k. The last inequality shows that ∼ for every k = 1, ..., N there is a point sk ∈ [sk2 , sk1 ] such that
5
Asymptotical stability of optimal paths in nonconvex problems
t2
s1 u(x(t))dt +
t1
113 ∼
u(x(s))ds ≤ u∗ [(tk2 − tk1 ) + (sk1 − sk2 )] − (sk1 − sk2 )δ 2 (x(sk )).
s2
Summing over k we obtain
t2
s1 u(x(t))dt +
t1
u(x(s))ds ≤ u∗ [(t2 − t1 ) + (s1 − s2 )] −
N
∼
(sk1 − sk2 )δ 2 (x(sk )).
k=1
s2
Therefore the lemma is proved taking into account (5.19) and passing to the limit as N → ∞. II. Now consider the case when p∗ = pi for some i = 1, 2. For the sake of definiteness we assume that p∗ = p1 . Take any number α > 0 and consider the interval [p2 , p1 − α]. Denote by [t1 − t(α), t2 ] and [s2 , s1 − s(α)] the intervals which correspond to the interval [p2 , p1 − α]. Clearly t(α) → 0 and s(α) → 0 as α → 0. We apply the result proved in the first part of the lemma for the interval [p2 , p1 − α]. Then we pass to the limit as α → 0. Thus the lemma is proved.
5.5.5 We define two types of sets. Definition 6. The set π ⊂ [0, T ] is called a set of 1st type on the interval [p2 , p1 ] if the following conditions hold: a) The set π consists of two sets π1 and π2 , that is, π = π1 ∪ π2 , such that / int D, ∀t ∈ π2 . x(t) ∈ int D, ∀t ∈ π1 and x(t) ∈ b) The set π1 consists of an at most countable number of intervals dk , with end-points tk1 < tk2 and the intervals (px(tk2 ), px(tk1 )), k = 1, 2, ..., are disjoint. Clearly in this case the intervals d0k = (tk1 , tk2 ) are also disjoint. c) Both the inequalities p1 ≥ supk px(tk1 ) and p2 ≤ inf k px(tk2 ) hold. Definition 7. The set ω ⊂ [0, T ] is called a set of 2nd type on the interval [p2 , p1 ] if the following conditions hold: a) x(t) ∈ / int D, ∀t ∈ ω. b) The set ω consists of an at most countable number of intervals [sk2 , sk1 ], such that the intervals (px(sk2 ), px(sk1 )), k = 1, 2, ..., are nonempty and disjoint, and [px(sk1 ) − px(sk2 )]. p1 − p2 = k
114
M.A. Mamedov
Lemma 7. Assume that π and ω are sets of 1st and 2nd type on the interval [p2 , p1 ], respectively. Then
u(x(t))dt ≤ u∗ meas(π ∪ ω) − [u∗ − u(x(t))]dt − δ 2 (x(t))dt, π∪ω
Q
E
where / int D}; a) Q ∪ E = ω ∪ π2 = {t ∈ π ∪ ω : x(t) ∈ b) for every ε > 0 there exists a number δε > 0 such that δ 2 (x) ≥ δε for all x for which ||x − x∗ || ≥ ε; c) for every δ > 0 there exists a number K(δ) < ∞ such that meas[(π ∪ ω) ∩ Zδ ] ≤ K(δ)meas[(Q ∪ E) ∩ Zδ ], where Zδ = {t ∈ [0, T ] : |px(t) − p∗ | ≥ δ}. Proof. Let π = π1 ∪ π2 , π1 = ∪k dk , ∪n νn ⊂ ω and νn = [sn2 , sn1 ] (see Definitions 6 and 7). We denote π10 = ∪k d0k and d0k = (tk1 , tk2 ). Clearly meas π1 = meas π10 and that is why below we deal with d0k . Denote pni = px(sni ), i = 1, 2. Clearly pn2 < pn1 . Since the function x(t) is absolutely continuous, from Lemma 5 it follows that there exists an at most nm n n countable number of intervals [snm 2 , s1 ] ⊂ [s2 , s1 ], m = 1, 2, ..., such that · nm i - p x (s) > 0, for all s ∈ [snm 2 , s1 ], n, m = 1, 2, ...; nm nm n n nm for all n, m, ii - [p2 , p1 ] ⊂ [p2 , p1 ] and p2 < pnm 1 = px(snm here pnm i i ), i = 1, 2; nm iii - the intervals (pnm 2 , p1 ), n, m = 1, 2, ..., are disjoint and pn1 − pn2 =
(pnm − pnm 1 2 ).
m
Therefore the set ω contains an at most countable number of intervals νm = m (sm 2 , s1 ), such that: m m m 1. the intervals (pm 2 , p1 ), m = 1, 2, . . . , are disjoint (here pi = px(si ), i = 1, 2, m = 1, 2, . . .) ; 2. ·
m p x (t) > 0, for all t ∈ ∪m (sm 2 , s1 ).
(5.21)
Now we take some interval d0k = (tk1 , tk2 ) and let pki = px(tki ), i = 1, 2. Denote km k k m m [pkm 2 , p1 ] = [p2 , p1 ] ∩ [p2 , p1 ].
(5.22)
5
Asymptotical stability of optimal paths in nonconvex problems
115
·
Since p x (t) < 0, for all t ∈ d0k , from (5.21) it follows that there are two km km km intervals [tkm 1 , t2 ] and [s2 , s1 ] corresponding to the nonempty interval km , p ], and [pkm 2 1 km k k (tkm 2 − t1 ) = t2 − t1 . m
Applying Lemma 6 we obtain km
km
t2
s1
u(x(t))dt+ tkm 1
km
s1
km km km u(x(s))ds ≤ u∗ [(tkm 2 −t1 )+(s1 −s2 )]−
skm 2
δ 2 (x(s))ds.
skm 2
Summing over m and then over k we have tk 2
k
km
u(x(t))dt +
tk 1
s1
u(x(s))ds
k,m km s2
⎡
≤ u∗ ⎣
(tk2 − tk1 ) +
k
⎤ km ⎦ (skm − 1 − s2 )
k,m
km
s1
δ 2 (x(s))ds.
k,m km s2
km Denote ω = ∪k,m [skm 2 , s1 ]. Clearly ω ⊂ ω. Therefore
u(x(t))dt π∪ω
u(x(t))dt +
=
u(x(t))dt +
ω
π1
≤ u∗ (meas π1 + meas ω ) −
−
u(x(t))dt +
ω\ω
u(x(t))dt π2
δ 2 (x(s))ds ω
[u∗ − u(x(t))]dt + u∗ [meas π2 + meas (ω \ ω )]
π2 ∪(ω\ω )
= u∗ meas (π ∪ ω) −
Q
[u∗ − u(x(t))]dt −
δ 2 (x(t))dt, E
where Q = π2 ∪ (ω \ ω ) and E = ω . Now we check Conditions a), b) and c) of the lemma. Condition a) holds, because Q ∪ E = π2 ∪ (ω \ ω ) ∪ ω = π2 ∪ ω. Condition b) follows from Lemma 17.5.1. We now check Condition c). Take any number δ > 0 and denote Pδ = {l : |l − p∗ | ≥ δ}.
116
M.A. Mamedov
km k k m m Consider the intervals [pkm 2 , p1 ], [p2 , p1 ] and [p2 , p1 ] (see (6.11)) cor0 responding to the interval dk , where
pk1 − pk2 =
km (pkm 1 − p2 ).
(5.23)
m
From Lemma 17.4.1 we have ·
p x (t) ≤ c(x(t)) < −ηδ for all t ∈ π1 ∩ Zδ .
(5.24)
On the other hand there exists a number K < +∞, for which ·
p x (t) ≤ K for all t ∈ [0, T ]. Therefore meas
[pk2 , pk1 ]
·
[−p x (t)]dt ≥ ηδ meas (dk ∩ Zδ ).
∩ Pδ = dk ∩Zδ
Summing over k we have
I= meas [pk2 , pk1 ] ∩ Pδ ≤ ηδ meas (π1 ∩ Zδ ).
(5.25)
k
On the other hand, from (5.23) it follows that I=
km meas [pkm 2 , p1 ] ∩ Pδ =
k,m
≤ K
·
p x (t)dt
k,m km km [s2 ,s1 ]∩Zδ
km meas [skm 2 , s1 ] ∩ Zδ ≤ K meas (ω ∩ Zδ ) .
(5.26)
k,m
Thus from from (5.25) and (5.26) we obtain meas (π1 ∩ Zδ ) ≤
K K meas (E ∩ Zδ ) ≤ meas [(Q ∪ E) ∩ Zδ ]. ηδ ηδ
But Q ∪ E = π2 ∪ ω and therefore meas [(π ∪ ω) ∩ Zδ ] = meas (π1 ∩ Zδ ) + meas [(Q ∪ E) ∩ Zδ ] K ≤ meas [(Q ∪ E) ∩ Zδ ] + meas [(Q ∪ E) ∩ Zδ ]. ηδ Then Condition c) holds if we take Kδ = proved.
K ηδ
+ 1 and thus the lemma is
5
Asymptotical stability of optimal paths in nonconvex problems
117
5.6 Transformation of the functional (5.2) In this section we divide the sets G0 , Gk and GT (see Lemma 4) into sets of 1st and 2nd type such that Lemma 7 can be applied. Note that x(t) is a continuously differentiable trajectory.
5.6.1 Lemma 8. Assume that the set Gi consists of a finite number of elements gk , g1 < g2 < ... < gN . Then 2
tg N
u(x(t))dt =
u(x(t))dt +
k π ∪ω k k
t1g 1
u(x(t))dt. F
Here πk and ωk are the sets of 1st and 2nd type in the interval [pk2 , pk1 ] and the set F is either a set of 1st type in the interval [p2gN , p1g1 ], if p2gN ≤ p1g1 , or is a set of 2nd type in the interval [p1g1 , p2gN ], if p2gN > p1g1 . Proof. Take the set g1 and assume that [p2g1 , p1g1 ] and [t1g1 , t2g1 ] are the corresponding intervals. Note that we are using the notation introduced in Section 5.5. Take the set g2 . A. First we consider the case p1g2 < p1g1 . In this case there is a point t ∈ [t1g1 , t2g1 ] such that px(t1 )) = p1g2 . Denote π1 = [t1 , t2g1 ] and ω1 = [t2g1 , t1g2 ]. Note that by Definition 6 we have p2g1 < p1g2 . It is clear that π1 and ω1 are sets of 1st and 2nd type on the interval [p2g1 , p1g2 ], respectively. Therefore 1
2
tg2
u(x(t))dt =
t1g1
u(x(t))dt +
π1 ∪ω1
u(x(t))dt, π11
where π11 = [t1g1 , t1 ] ∪ [t1g2 , t2g2 ] is a set of 1st type on the interval [p2g2 , p1g1 ]. B. Now we assume that p1g2 ≥ p1g1 . In this case there is a point t1 ∈ [t2g1 , t1g2 ] such that px(t1 ) = p1g1 . Denote π1 = [t1g1 , t2g1 ] and ω1 = [t2g1 , t1 ]. Consider two cases. 1. Let p2g2 ≥ p1g1 . Then there is a point t2 ∈ [t1 , t1g2 ] such that px(t2 )) = p2g2 . In this case we denote
118
M.A. Mamedov
π2 = [t1g2 , t2g2 ], ω2 = [t2 , t1g2 ] ; and]; ω1 = [t1 , t2 ]. Therefore 2
tg2 u(x(t))dt =
u(x(t))dt +
i=1,2π ∪ω i i
t1g1
u(x(t))dt, ω1
where ω1 is a set of 2nd type on the interval [p1g1 , p2g2 ]. 2. Let p2g2 < p1g1 . Then there is a point t2 ∈ [t1 , t1g2 ] such that px(t2 )) = p1g1 . In this case we denote π1 = [t1g2 , t2 ], ω2 = [t1 , t1g2 ] and π1 = [t2 , t2g2 ]. Therefore 2
tg2 u(x(t))dt =
u(x(t))dt +
i=1,2π ∪ω i i
t1g1
u(x(t))dt, π1
where π1 is a set of 1st type on the interval [p2g2 , p1g1 ]. We repeat this procedure taking g3 , g4 , ...gN and thus the lemma is proved. Lemma 9. Assume that gn ∈ Gi , n = 1, 2, ..., g1 < g2 < ..., and t2 = limn→∞ t2gn . Then
t
2
u(x(t))dt =
u(x(t))dt +
n π ∪ω n n
t1g1
u(x(t))dt. F
Here πn and ωn are sets of 1st and 2nd type in the interval [p2n , p1n ] and the set F is either a set of 1st type in the interval [p∗ , p1g1 ], if p∗ ≤ p1g1 , or is a set of 2nd type in the interval [p1g1 , p∗ ], if p∗ > p1g1 . Proof. We apply Lemma 8 for every n. From Proposition 1 we obtain that x(t) → x∗ as t → t2 , and therefore p2gn → p∗ as n → ∞. This completes the proof. We can prove the following lemmas in a similar manner to that used for proving Lemmas 8 and 9. Lemma 10. Assume that the set Gi consists of a finite number of elements gk , g1 > g2 > ... > gN . Then 2
tg1 u(x(t))dt = t1g
N
k π ∪ω k k
u(x(t))dt +
u(x(t))dt. F
5
Asymptotical stability of optimal paths in nonconvex problems
119
Here πk and ωk are sets of 1st and 2nd type in the interval [p2k , p1k ] and the set F is either a set of 1st type in the interval [p2g1 , p1gN ], if p1gN ≥ p2g1 , or is a set of 2nd type in the interval [p1gN , p2g1 ], if p1gN < p2g1 . Lemma 11. Assume that gn ∈ Gi , n = 1, 2, ..., g1 > g2 > ..., and t1 = limn→∞ t1gn . Then 2
tg1
u(x(t))dt =
u(x(t))dt +
n π ∪ω n n
t1
u(x(t))dt. F
Here πn and ωn are sets of 1st and 2nd type in the interval [p2n , p1n ] and the set F is either a set of 1st type in the interval [p2g1 , p∗ ], if p∗ ≥ p2g1 , or is a set of 2nd type in the interval [p∗ , p2g1 ], if p∗ < p2g1 . In the next lemma we combine the results obtained by Lemmas 9 and 11. Lemma 12. Assume that the set Gi consists of elements gn , n = ±1, ±2, . . ., where · · · < g−2 < g−1 < g1 < g2 < · · · , and where t1 = limn→−∞ t1gn and t2 = limn→∞ t2gn . Then
t
2
u(x(t))dt =
u(x(t))dt.
n π ∪ω n n
t1
Here πn and ωn are sets of 1st and 2nd type in the interval [p2n , p1n ]. Proof. We apply Lemmas 9 and 11 and obtain
t
2
u(x(t))dt =
n
t1g1 t2g
−1 u(x(t))dt =
t1
u(x(t))dt +
∪ω πn n
n
∪ω πn n
u(x(t))dt,
(5.27)
F
u(x(t))dt +
u(x(t))dt.
(5.28)
F
We define π0 = F ∪ F and ω0 = [t2g−1 , t1g1 ]. Clearly they are sets of 1st and 2nd type in the interval [p2g−1 , p1g1 ] (note that p2g−1 < p1g1 by Definition 5). Therefore the lemma is proved if we sum (5.27) and (5.28).
5.6.2 Now we use Lemma 4. We take any interval [t1k , t2k ] and let
120
M.A. Mamedov
[t1k , t2k ] = [t1k , t1Gk ] ∪ [t1Gk , t2Gk ] ∪ [t2Gk , t2k ]. We show that 2
tk u(x(t))dt =
n
t1k
u(x(t))dt +
k ∪ω k πn n
u(x(t))dt,
(5.29)
Ek
where πnk and ωnk are sets of 1st and 2nd type in the interval [p2nk , p1nk ] and x(t) ∈ int D, ∀t ∈ E k . If the conditions of Lemma 12 hold then (5.29) is true if we take E k = [t1k , t1Gk ] ∪ [t2Gk , t2k ]. Otherwise we apply Lemmas 8–11 and obtain 2
tGk u(x(t))dt =
n
t1G k
u(x(t))dt +
k ∪ω k πn n
u(x(t))dt.
Fk
If F k is a set of 2nd type then (5.29) is true if we take E k = F k . Assume that F k is a set of 1st type on some interval [p2 , p1 ]. In this case we set π0k = F k and ω0k = [t1k , t1Gk ] ∪ [t2Gk , t2k ]. We have x(t1k ) = x(t2k ) = x∗ (see Lemma 4) and therefore π0k and ω0k are sets of 1st and 2nd type in the interval [p2 , p1 ]. Thus (5.29) is true. Now we apply Lemmas 8–12 to the intervals [0, t1 ] and [t2 , T ]. We have
t
1
u(x(t))dt =
n
0
T u(x(t))dt = t2
0 ∪ω 0 πn n
n
u(x(t))dt + F0
T ∪ω T πn n
u(x(t))dt +
u(x(t))dt + FT
u(x(t))dt,
(5.30)
u(x(t))dt.
(5.31)
E0
u(x(t))dt + ET
Here • F 0 and F T are sets of 1st type (they may be empty); • [0, t1G0 ] ∪ [t2G0 , t1 ] ⊂ E 0 and [t2 , t1GT ] ∪ [t2GT , T ] ⊂ E T ; • x(t) ∈ / int D for all t ∈ E 0 ∪ E T . Thus, applying Lemma 4 and taking into account (5.29)–(5.31), we can prove the following lemma. Lemma 13. The interval [0, T ] can be divided into subintervals such that
5
Asymptotical stability of optimal paths in nonconvex problems
121
[0, T ] = ∪n (πn ∪ ωn ) ∪ F1 ∪ F2 ∪ E,
T u(x(t)) dt = 0
(5.32)
u(x(t)) dt +
n π ∪ω n n
u(x(t)) dt +
F1 ∪F2
u(x(t)) dt. E
(5.33) Here 1. The sets πn and ωn are sets of 1st and 2nd type, respectively, in the intervals [p2n , p1n ], n = 1, 2, .... 2. The sets F1 and F2 are sets of 1st type in the intervals [p21 , p11 ] and [p22 , p12 ], respectively, and x(t) ∈ int D, f or all t ∈ F1 ∪ F2 , p1i
−
p2i
≤ C < +∞, i = 1, 2.
(5.34) (5.35)
3. Also x(t) ∈ / int D, f or all t ∈ E.
(5.36)
4. For every δ > 0 there is a number C(δ) such that meas [(F1 ∪ F2 ) ∩ Zδ ] ≤ C(δ),
(5.37)
where the number C(δ) < ∞ does not depend on the trajectory x(t), on T or on the intervals in (5.32). Proof: We define F1 = {t ∈ F 0 : x(t) ∈ int D}, F2 = {t ∈ F T : x(t) ∈ int D} and E = ∪k E k ∪ E 0 ∪ E T . Then we replace π10 to π10 ∪ (F 0 \ F1 ) and π1T to π1T ∪ (F T \ F2 ) in (5.30) and (5.31) (note that after these replacements the conditions of Definition 6 still hold). We obtain (5.33) summing (5.29)– (5.31). It is not difficult to see that all assertions of the lemma hold. Note that (5.35) follows from the fact that the trajectory x(t) is uniformly bounded (see (5.3)). The inequality (5.37) follows from Lemma 17.4.1, taking into account Definition 6, and thus the lemma is proved. Lemma 14. There is a number L < +∞ such that
[u(x(t)) − u∗ ] dt < L,
(5.38)
F1 ∪F2
where L does not depend on the trajectory x(t), on T or on the intervals in (5.32). Proof: From Condition H3 it follows that there exist a number ε > 0 and a ∼ ∼ trajectory x (·) to the system (6.1), defined on [ 0, Tε ], such that p x (0) = ∼ p∗ − ε, p x (Tε ) = p∗ + ε and
122
M.A. Mamedov . ∼
p x (t) > 0 for almost all t ∈ [0, Tε ].
Define
(5.39)
∼
u(x (t))dt.
Rε = [0,Tε ]
Consider the set F1 and corresponding interval [p21 , p11 ]. Define a set F1ε = {t ∈ F 1 : |px(t) − p∗ | < ε}. We consider the most common case, when [p∗ − ε, p∗ + ε] ⊂ [p21 , p11 ]. In this case the sets F1ε and [0, Tε ] are sets of 1st and 2nd type in the interval ∼ [p∗ − ε, p∗ + ε] for the trajectories x(·) and x (·), respectively. We have
u(x(t))dt + Rε = u(x(t)) dt + u(x(t)) dt + u(x(t))dt. F1ε
F1
[0,Tε ]
F1 \F1ε
(5.40) ∼
We use Lemma 7. Note that this lemma can be applied for the trajectory x (·) (which may not be continuously differentiable) due to the inequality (5.39). Taking into account δ(x) ≥ 0 and u(x(t)) ≤ u∗ , for t ∈ Q, from Lemma 7 we obtain
u(x(t)) dt + u(x(t)) dt ≤ u∗ (meas F1ε + Tε ). (5.41) F1ε
[0,Tε ]
From (5.37) it follows that meas (F1 \ F1ε ) = meas (F1 ∩ Zε ) ≤ C(ε). Thus
u(x(t)) dt ≤ Cε ,
(5.42)
F1 \F1ε
where the number Cε < +∞ does not depend on T or on the trajectory x(t). Denote C = Tε u∗ + Cε − Rε . Then from (5.40)–(5.42) we obtain
u(x(t)) dt ≤ u∗ meas F1ε + C ≤ u∗ meas F1 + C F1
and therefore
F1
[u(x(t)) − u∗ ] dt ≤ C .
5
Asymptotical stability of optimal paths in nonconvex problems
123
By analogy we can prove that
[u(x(t)) − u∗ ] dt ≤ C . F2
Thus the lemma is proved if we take L = C + C .
5.7 The proof of Theorem 13.6 From Condition M it follows that for every T > 0 there exists a trajectory xT (·) ∈ XT , for which
u(xT (t)) dt ≥ u∗ T − b.
(5.43)
[0,T ]
5.7.1 First we consider the case when x(t) is a continuously differentiable function. In this case we can apply Lemma 7. From Lemmas 4 and 14 we have
u(x(t)) dt ≤ u(x(t)) dt+ u(x(t)) dt+L+u∗ +meas (F1 ∪F2 ). n π ∪ω n n
[0,T ]
E
Then applying Lemma 7 we obtain
u(x(t)) dt [0,T ]
≤
⎛ ⎝u∗ meas (πn ∪ ωn ) −
n
+ E
= u∗
−
u(x(t)) dt −
Qn
⎞ δ 2 (x(t)) dt⎠
En
u(x(t)) dt + L + u∗ + meas (F1 ∪ F2 )
meas (πn ∪ ωn ) + meas (F1 ∪ F2 ) + meas E
n
[u∗ − u(x(t))] dt −
Q ∗
= u meas [0, T ] −
δ 2 (x(t)) dt + L A
∗
[u − u(x(t))] dt − Q
δ 2 (x(t)) dt + L. A
124
M.A. Mamedov
Here Q = (∪n Qn ) ∪ E and A = ∪n En . Taking (5.43) into account we have
u(x(t)) dt − u(xT (t)) dt ≤ − [u∗ − u(x(t))] dt [0,T ]
Q
[0,T ]
−
δ 2 (x(t)) dt + L + b, A
that is,
JT (x(·)) − JT (xT (·)) ≤ −
[u∗ − u(x(t))] dt −
Q
δ 2 (x(t)) dt + L + b. A
(5.44) Here Q = (∪n Qn ) ∪ E and A = ∪n En
(5.45)
and the following conditions hold: a) Q ∪ A = { t ∈ [0, T ] : x(t) ∈ / int D};
(5.46)
[0, T ] = ∪n (πn ∪ ωn ) ∪ (F1 ∪ F2 ) ∪ E;
(5.47)
b)
c) for every δ > 0 there exist K(δ) < +∞ and C(δ) < +∞ such that meas [(πn ∪ ωn ) ∩ Zδ ] ≤ K(δ) meas [(Qn ∪ En ) ∩ Zδ ] and meas [(F1 ∪ F2 ) ∩ Zδ ] ≤ C(δ);
(5.48) (5.49)
(recalling that Zδ = {t ∈ [0, T ] : |px(t) − p∗ | ≥ δ}) d) for every ε > 0 there exists δε > 0 such that δ 2 (x) ≥ δε for all x, ||x − x∗ || ≥ ε.
(5.50)
The first assertion of the theorem follows from (5.44), (5.46) and (5.50) for the case under consideration (that is, x(t) continuously differentiable). We now prove the second assertion. Let ε > 0 and δ > 0 be given numbers and x(·) a continuously differentiable ξ-optimal trajectory. We denote Xε = {t ∈ [0, T ] : ||x(t) − x∗ || ≥ ε}.
5
Asymptotical stability of optimal paths in nonconvex problems
125
∼
First we show that there exists a number K ε,ξ < +∞ which does not depend on T > 0 and meas [(Q ∪ A) ∩ Xε ] ≤
∼
K ε,ξ .
(5.51)
Assume that (5.51) is not true. In this case there exist sequences Tk → ∞, k → ∞ and sequences of trajectories {xk (·)} (every xk (·) is a ξ-optimal Kε,ξ trajectory in the interval [0, Tk ]) and {xTk (·)} (satisfying (5.43) for every T = Tk ) such that k meas [(Qk ∪ Ak ) ∩ Xεk ] ≥ Kε,ξ
as
k → ∞.
(5.52)
From Lemma 17.3.1 and (5.50) we have u∗ − u(xk (t)) ≥ νε δ 2 (xk (t)) ≥ δε2
if if
t ∈ Qk ∪ Xεk and t ∈ Ak ∩ Xεk .
Denote ν = min {νε , δε2 } > 0. From (5.44) it follows that JTk (xk (·)) − JTk (xTk (·)) ≤ L + b − ν meas [(Qk ∪ Ak ) ∩ Xεk ]. Therefore, for sufficiently large numbers k, we have JTk (xk (·)) ≤ JTk (xTk (·)) − 2 ξ ≤ JT∗k − 2 ξ, which means that xk (t) is not a ξ-optimal trajectory. This is a contradiction. Thus (5.51) is true. 1 < +∞ such that Now we show that for every δ > 0 there is a number Kδ,ξ 1 meas Zδ ≤ Kδ,ξ .
(5.53)
From (5.47)–(5.49) we have meas [(πn ∪ ωn ) ∩ Zδ ] + meas [(F1 ∪ F2 ) ∩ Zδ ] meas Zδ = n
+ meas (E ∩ Zδ ) K(δ) meas [(Qn ∪ En ) ∩ Zδ ] + C(δ) + meas (E ∩ Zδ ) ≤ n ∼
≤K (δ) meas [([∪n (Qn ∪ En )] ∩ Zδ ) ∪ (E ∩ Zδ )] + C(δ) ∼
=K (δ) meas [(Q ∪ A) ∩ Zδ ] + C(δ), ∼
where K (δ) = max{1, K(δ)}. Since Zδ ⊂ Xδ , taking (5.51) into account we obtain (5.53), where
126
M.A. Mamedov ∼
∼
1 Kδ,ξ =K (δ) K δ,ξ +C(δ). 0 0 = {t ∈ [0, T ] : ||x(t) − x∗ || > ε/2.}. Clearly Xε/2 is an We denote Xε/2 open set and therefore can be presented as a union of an at most countable ∼ 0 = ∪k τ k . Out of these intervals we number of open intervals, say Xε/2 choose further intervals, which have a nonempty intersection with Xε , say these are τk , k = 1, 2, .... Then we have 0 . Xε ⊂ ∪k τk ⊂ Xε/2
(5.54)
Since a derivative of the function x(t) is bounded, it is not difficult to see that there exists a number σε > 0 such that meas τk ≥ σε
for all k.
(5.55)
But the interval [0, T ] is bounded and therefore the number of intervals τk is finite too. Let k = 1, 2, 3, ..., NT (ε). We divide every interval τk into two parts: / int D}. τk1 = {t ∈ τk : x(t) ∈ int D} and τk2 = {t ∈ τk : x(t) ∈ From (5.46) and (5.54) we obtain 0 ∪k τk2 ⊂ (Q ∪ A) ∩ Xε/2
and therefore from (5.51) it follows that ∼
meas (∪k τk2 ) ≤ K ε/2,ξ .
(5.56)
Now we apply Lemma 17.4.1. We have ·
p x (t) ≤ − ηε/2 ,
t ∈ ∪k τk1 .
(5.57)
Define p1k = supt∈τk px(t) and p2k = inf t∈τk px(t). It is clear that ∼
p1k − p2k ≤ C ,
k = 1, 2, 3, ..., NT (ε),
(5.58)
and ·
|p x (t)| ≤ K,
for all t.
(5.59)
∼
Here the numbers C and K do not depend on T > 0, x(·), ε or ξ. We divide the interval τk into three parts: ·
τk− = {t ∈ τk : p x (t) < 0}, ·
τk+ = {t ∈ τk : p x (t) > 0}.
·
τk0 = {t ∈ τk : p x (t) = 0} and
5
Asymptotical stability of optimal paths in nonconvex problems
Then we have p1k − p2k
* * * * * * * *
* * * * · · · * * ≥ ** p x (t)dt ** = * p x (t)dt + p x (t)dt * . * * * * * * τ− τk τ+ k
We define
127
α=−
·
p x (t)dt and β =
τk−
k
·
p x (t)dt.
Clearly α > 0, β > 0
τk+
and
) p1k − p2k ≥
−α + β, α − β,
if α < β, if α ≥ β.
(5.60)
From (5.59) we obtain 0 < β ≤ K meas τk+ .
(5.61)
On the other hand, τk1 ⊂ τk− and therefore from (5.57) we have α ≥ ηε/2 meas τk− ≥ ηε/2 meas τk1 .
(5.62)
Consider two cases. a) α ≥ β. Then from (5.60)–(5.62) we obtain ∼
+ 1 2 1 C ≥ pk − pk ≥ α − β ≥ ηε/2 meas τk − K meas τk .
Since τk+ ⊂ τk2 , then from (5.56) it follows that Therefore from (5.63) we have , meas τk1 ≤ Cε,ξ
meas τk+ ≤ ∼
where Cε,ξ = (C + K· K ε/2,ξ )/ηε/2 .
(5.63) ∼
K ε/2,ξ .
(5.64)
b) α < β. Then from (5.61) and (5.62) we obtain ∼
ηε/2 meas τk1 < K meas τk+ ≤ K· K ε/2,ξ or , meas τk1 < Cε,ξ
∼
where Cε,ξ = K· K ε/2,ξ /ηε/2 .
(5.65)
Thus from (5.64) and (5.65) we obtain , Cε,ξ }, meas τk1 ≤ Cε,ξ = max{Cε,ξ
k = 1, 2, ..., NT (ε),
and then meas (∪k τk1 ) ≤ NT (ε) Cε,ξ .
(5.66)
128
M.A. Mamedov
Now we show that for every ε > 0 and ξ > 0 there exists a number < +∞ such that Kε,ξ . meas (∪k τk1 ) ≤ Kε,ξ
(5.67)
Assume that (5.67) is not true. Then from (5.66) it follows that NT (ε) → ∞ as T → ∞. Consider the intervals τk for which the following conditions hold: meas τk1 ≥
1 σε and meas τk2 ≤ λ meas τk1 , 2
(5.68)
where λ is any fixed number. Since NT (ε) → ∞, then from (5.55) and (5.56) it follows that the number of intervals τk satisfying (5.68) increases infinitely as T → ∞. On the other hand, the number of intervals τk , for which the conditions α < β, meas τk2 > λ meas τk1 and λ = ηε/2 /K hold, is finite. Therefore the number of of intervals τk for which the conditions α ≤ β and (5.68) hold infinitely increases as T → ∞. We denote the number of such intervals by NT and for the sake of definiteness assume that these are intervals τk , k = 1, 2, ..., NT . We set λ = ηε/2 /2K for every τk . Then from (5.63) and (5.68) we have p1k − p2k ≥ ηε/2 meas τk1 − K·
ηε/2 1 meas τk1 = ηε/2 meas τk1 . 2K 2
Taking (5.55) into account we obtain p1k − p2k ≥ eε ,
k = 1, 2, ..., NT ,
(5.69)
where eε =
1 ηε/2 σε > 0 2
1 8 eε . From (5.69) 1 2 [sk , sk ] ⊂ τk such that
Let δ = d
dk =
and NT → ∞ as T → ∞.
it follows that for every τk there exists an interval
|p x(t) − p∗ | ≥ δ, t ∈ dk ,
p x(s1k ) = sup p x(t), t∈dk
p x(s2k ) = inf p x(t) t∈dk
and
p x(s1k ) − p x(s2k ) = δ.
From (5.59) we have * * * *
* * · · · * * δ = * p x (t) dt* ≤ |p x (t)| dt ≤ |p x (t)| dt ≤ K·meas dk . * * * [s1 ,s2 ] * [s1 ,s2 ] dk k
k
k
k
5
Asymptotical stability of optimal paths in nonconvex problems
129
Then meas dk ≥ δ/K > 0. Clearly dk ⊂ Zδ and therefore
T meas Zδ ≥ meas ∪N k=1 dk =
NT
meas dk ≥ NT
k=1
δ . K
This means that meas Zδ → ∞ as T → ∞, which contradicts (5.53). Thus (5.67) is true. Then taking (5.56) into account we obtain meas ∪k τk =
∼
(meas τk1 + meas τk2 ) ≤ K ε/2,ξ +Kε,ξ .
k
Therefore from (5.54) it follows that meas Xε = meas ∪k τk ≤ Kε,ξ , ∼
. where Kε,ξ =K ε/2,ξ +Kε,ξ Thus we have proved that the second assertion of the theorem is true for the case when x(t) is a continuously differentiable function.
5.7.2 We now take any trajectory x(·) to the system (6.1). It is known (see, for example, [3]) that for a given number δ > 0 (we take δ < ε/2) there exists a ∼ continuously differentiable trajectory x (·) to the system (6.1) such that ∼
|| x(t)− x (t)|| ≤ δ for all t ∈ [0, T ]. Since the function u is continuous then there exists η(δ) > 0 such that ∼
u(x (t)) ≥ u(x(t)) − η(δ) for all t ∈ [0, T ]. Therefore
∼
u(x (t)) dt ≥ [0,T ]
u(x(t)) dt − T η(δ). [0,T ]
Let ξ > 0 be a given number. For every T > 0 we choose a number δ such that T η(δ) ≤ ξ. Then
∼ ∼ u(x(t)) dt ≤ u(x (t)) dt + T η(δ) ≤ u(x (t)) dt + ξ, (5.70) [0,T ]
[0,T ]
[0,T ]
130
M.A. Mamedov
that is,
∗
[ u(x(t)) − u ] dt ≤ [0,T ]
∼
[ u(x (t)) − u∗ ] dt + ξ.
[0,T ] ∼
Since the function x (·) is continuously differentiable then the second integral in this inequality is bounded (see the first part of the proof), and therefore the first assertion of the theorem is proved. Now we prove the second assertion of Theorem 13.6. We will use (5.70). Take a number ε > 0 and assume that x(·) is a ξ-optimal trajectory, that is, JT (x(·)) ≥ JT∗ − ξ. From (5.70) we have ∼
JT (x (·)) ≥ JT (x(·)) − ξ ≥ JT∗ − 2ξ. ∼
Thus x (·) is a continuously differentiable 2ξ-optimal trajectory. That is why (see the first part of the proof) for the numbers ε/2 > 0 and 2ξ > 0 there exists Kε,ξ < +∞ such that ∼
meas { t ∈ [0, T ] : || x (t) − x∗ || ≥ ε/2} ≤ Kε,ξ . If || x(t ) − x∗ || ≥ ε for any t then ∼
∼
|| x (t ) − x∗ || ≥ || x(t ) − x∗ || − || x(t )− x (t ) || ≥ ε − δ ≥
ε . 2
Therefore ∼
{ t ∈ [0, T ] : || x(t) − x∗ || ≥ ε} ⊂ { t ∈ [0, T ] : || x (t) − x∗ || ≥ ε/2}, which implies that the proof of the second assertion of the theorem is completed, that is, meas { t ∈ [0, T ] : || x(t) − x∗ || ≥ ε} ≤ Kε,ξ . Now we prove the third assertion of the theorem. Let x(·) be an optimal trajectory and x(t1 ) = x(t2 ) = x∗ . Consider a trajectory x∗ (·) defined by the formula ) x(t) if t ∈ [0, t1 ] ∪ [t2 , T ], x∗ (t) = t ∈ [t1 , t2 ]. x∗ if
5
Asymptotical stability of optimal paths in nonconvex problems
131
Assume that the third assertion of the theorem is not true, that is, there is a point t ∈ (t1 , t2 ) such that ||x(t ) − x∗ || = c > 0. Consider the function x(·). In [3] it is proved that there is a sequence of continuously differentiable trajectories xn (·), t ∈ [t1 , T ], which is uniformly convergent to x(·) on [t1 , T ] and for which xn (t1 ) = x(t1 ) = x∗ . That is, for every δ > 0 there exists a number Nδ such that max || xn (t) − x(t) || ≤ δ for all n ≥ Nδ .
t∈[t1 ,T ]
On the other hand, for every δ > 0 there exists a number η(δ) > 0 such that η(δ) → 0 as δ → 0 and | u(x(t)) − u(xn (t)) | ≤ η(δ) for all t ∈ [t1 , T ].
(5.71)
Then we have
u(x(t)) dt ≤
[t1 ,T ]
u(xn (t)) dt + T η(δ).
(5.72)
[t1 ,T ]
Take a sequence of points tn ∈ (t , t2 ) such that tn → t2 as n → ∞. Clearly in this case xn (tn ) → x∗ . We apply Lemma 13 for the interval [t1 , tn ] and obtain (see also (5.31))
u(xn (t)) dt [t1
,tn ]
=
u(xn (t)) dt +
k π n ∪ω n k k
Fn
u(xn (t)) dt +
u(xn (t)) dt.
(5.73)
En
Here x(t) ∈ int D ∀t ∈ F n and F n is a set of 1st type on the interval [pxn (tn ), p∗ ] if pxn (tn ) < p∗ . Since xn (tn ) → x∗ , pxn (tn ) → p∗ and thus for every t ∈ F n we have u(xn (t)) → u∗ as n → ∞. Therefore
αn =
[u(xn (t)) − u∗ ] dt → 0 as n → ∞.
Fn
We also note that from xn (t) ∈ / int D, t ∈ E n , it follows that
En
u(xn (t)) dt ≤ u∗ meas E n .
132
M.A. Mamedov
Now we use Lemma 7 and obtain u(xn (t)) dt k π n ∪ω n k k
=u
∗
meas [∪k (πkn
∪
ωkn )]
−
∗
[u − u(xn (t))] dt −
∪k Qn k
δ 2 (xn (t)) dt.
∪k Ekn ∼
We take a number δ < c/2. Then there exists a number β > 0 such that ∼
meas [∪k (Qnk ∪ Ekn )] ≥ β . Then there exists a number β > 0 for which
u(xn (t)) dt ≤ u∗ meas [∪k (πkn ∪ ωkn )] − β.
k π n ∪ω n k k
Therefore from (5.73) we have
u(xn (t)) dt ≤ u∗ {meas [∪k (πkn ∪ ωkn )] + meas F n + meas E n }
[t1 ,tn ]
+ αn − β or
u(xn (t)) dt ≤ u∗ (tn − t1 ) + αn − β.
(5.74)
[t1 ,tn ]
From (5.71) we obtain
u(xn (t)) dt [t2 ,T ]
≤
u(x(t)) dt + T η(δ) =
[t2 ,T ]
Thus from (5.72)–(5.75) we have
[t2 ,T ]
u(x∗ (t)) dt + T η(δ).
(5.75)
5
Asymptotical stability of optimal paths in nonconvex problems
133
u(x(t)) dt [t1 ,T ]
≤
u(xn (t)) dt + T η(δ) [t1 ,T ]
=
u(xn (t)) dt +
[t1 ,tn ]
u(xn (t)) dt +
[tn ,t2 ]
≤ u∗ (tn − t1 ) + u∗ (t2 − tn ) +
u(xn (t)) dt + T η(δ)
[t2 ,T ]
u(x∗ (t)) dt
[t2 ,T ]
+ αn − β + λn + 2T η(δ)
= u(x∗ (t)) dt + αn − β + λn + 2T η(δ). [t1 ,T ]
Here λn =
[ u(xn (t)) − u∗ ] dt → 0 as n → ∞,
[tn ,t2 ]
because tn → t2 . We choose the numbers δ > 0 and n such that the following inequality holds: αn + λn + 2T η(δ) < β. In this case we have
u(x∗ (t)) dt
u(x(t)) dt < [t1 ,T ]
and therefore
[t1 ,T ]
u(x(t)) dt <
[0,T ]
u(x∗ (t)) dt,
[0,T ]
which means that x(t) is not optimal. This is a contradiction. Thus the theorem is proved.
References 1. D. Cass and K. Shell, The structure and stability of competitive dynamical systems, J. Econom. Theory, 12 (1976), 31–70. 2. A. N. Kolmogorov and S. V. Fomin, Introductory Real Analysis (Moscow, Nauka, 1975).
134
M.A. Mamedov
3. A. F. Filippov, Differential Equations with Discontinuous Right–Hand Sides, Mathematics and its Applications (Soviet series) (Kluwer Academic Publishers, Dordrecht, 1988). 4. J. A. Fridy, Statistical limit points, Proc. Amer. Math. Soc., 118 (1993), 1187–1192. 5. A. Leizarowitz, Optimal trajectories on infinite horizon deterministic control systems, Appl. Math. Optim., 19 (1989), 11–32. 6. A. Leizarowitz (1985), Infinite horizon autonomous systems with unbounded cost, Appl. Math. Optim., 13 (1985), 19–43. 7. V. L. Makarov and A. M. Rubinov, Mathematical Theory of Economic Dynamics and Equilibria (Nauka, Moscow, 1973). English trans. Springer-Verlag, New York, 1977. 8. M. A. Mamedov, Turnpike theorems in continuous systems with integral functionals, Russian Acad. Sci. Dokl. Math. 45, No. 2 (1993), 432–435. 9. M. A. Mamedov, Turnpike theorems for integral functionals, Russian Acad. Sci. Dokl. Math. 46, No. 1 (1993), 174–177. 10. M. A. Mamedov and S. Pehlivan, Statistical cluster points and turnpike theorem in nonconvex problems, J. Math. Anal. Appl., 256 (2001), 686–693. 11. L. W. McKenzie, Turnpike theory, Econometrica, 44 (1976), 841–866. 12. S. Pehlivan and M. A. Mamedov, Statistical cluster points and turnpike, Optimization, 48 (2000), 93–106. 13. R. Radner, Paths of economic growth that are optimal with regard only to final states; a turnpike theorem, Rev. Econom. Stud., 28 (1961), 98–104. 14. R. T. Rockafellar, Saddle points of Hamiltonian systems in convex problems of Lagrange, J. Optimization Theory Appl., 12 (1973), 367–390. 15. R. T. Rockafellar, Saddle points of Hamiltonian systems in convex problems having a nonzero discount rate, J. Econom. Theory, 12 (1976), 71–113. 16. P. A. Samuelson, A catenary turnpike theorem involving consumption and the gold rule, Amer. Econom. Rev., 55 (1965), 486–496. 17. J. A. Scheinkman, On optimal steady states of n-sector growth models when utility is discounted, J. Econom. Theory, 12 (1976), 11–30. 18. J. A. Scheinkman, Stability of regular equilibra and the correspondence principle for symmetric variational problems, Internat. Econ. Rev. 20 (1979), 279–315. 19. A. Zaslavski, Existence and structure of optimal solutions of variational problems, Contemp. Math., 204 (1997), 247–278. 20. A. Zaslavski, Existence and uniform boundedness of optimal solutions of variational problems, Abstr. Appl. Anal., 3 (1998), 265–292. 21. A. Zaslavski, Turnpike theorem for nonautonomous infinite dimensional discrete–time control systems, Optimization, 48 (2000), 69–92.
Chapter 6
Pontryagin principle with a PDE: a unified approach B. D. Craven
Abstract A Pontryagin principle is obtained for a class of optimal control problems with dynamics described by a partial differential equation. The method, using Karush–Kuhn–Tucker necessary conditions for a mathematical program, is almost identical to that for ordinary differential equations. Key words: Optimal control, Pontryagin principle, partial differential equation, Karush–Kuhn–Tucker conditions
6.1 Introduction

Pontryagin's principle has been proved in at least four ways, for an optimal control problem in continuous time with dynamics described by an ordinary differential equation (ODE). One approach ([5], [6]) regards the control problem as a mathematical program, and uses the Karush–Kuhn–Tucker (KKT) necessary conditions as the starting point (though with some different hypotheses) for deriving the Pontryagin theory. There are various results for optimal control when the dynamics are described by a partial differential equation (PDE), often derived (as, for example, by Lions and Bensoussan) using variational inequalities, which are generally equivalent to mathematical programs in infinite dimensions. The results in [1]–[5], and others by the same authors, obtain some versions of Pontryagin's principle by quite different methods to those used for ODEs. However, the Pontryagin theory involving a PDE can also be derived from the mathematical programming approach, using the KKT conditions, and replacing the time variable t by a space variable z, say in R² or R³, or by (t, z) combined. Whatever approach is followed requires a good deal of detailed calculation, concerned with the choice of function spaces (suitable Sobolev spaces), and proofs of differentiability properties. These details are omitted here (they are adequately treated, for example, in [1], [3]), since the aim here is to show that a Pontryagin principle readily follows. The results depend indeed on certain differentiability properties, stated in what follows, but only indirectly on how these properties are achieved.
6.2 Pontryagin for an ODE

Consider first an optimal control problem with an ODE:

MIN J(u) := F(x, u) := \int_0^T f(x(t), u(t), t)\, dt   subject to   (6.1)
x(0) = x_0,   \dot{x}(t) = m(x(t), u(t), t),   u(t) ∈ Γ(t)   (0 ≤ t ≤ T).

Here x(.) is the state function, u(.) is the control function, the time interval [0, T] is fixed, and f and m are differentiable functions. Other details, such as a variable horizon T, an endpoint constraint on x(T), and state constraints, can readily be added to the problem. They are omitted here, since the purpose is to show the method. The steps are as follows.

(a) The problem (6.1) is expressed as a mathematical program:

MIN_{x ∈ X, u ∈ U} J(u) := F(x, u)   subject to   Dx = M(x, u),  u ∈ Γ,

over suitable function spaces X and U; X is chosen so that the differential operator D := d/dt is a continuous linear mapping (see Note 1 in the Appendix).

(b) Assume temporarily that F and M are differentiable with respect to (x, u). Then necessary KKT conditions for a minimum at (x, u) = (\bar{x}, \bar{u}) are

F_x(\bar{x}, \bar{u}) + \hat{λ}(−D + M_x(\bar{x}, \bar{u})) = 0,   (6.2)
(F_u(\bar{x}, \bar{u}) + \hat{λ} M_u(\bar{x}, \bar{u}))(Γ − \bar{u}) ≥ 0,   (6.3)

with a Lagrange multiplier \hat{λ}. Represent \hat{λ} by a function \bar{λ}(.), where

⟨\hat{λ}, w⟩ = \int_0^T \bar{λ}(t) w(t)\, dt   for all w ∈ C[0, T].

Define the Hamiltonian

h(x(t), u(t), t, λ(t)) := f(x(t), u(t), t) + λ(t) m(x(t), u(t), t)

and

H(x, u, \hat{λ}) := F(x, u) + \hat{λ} M(x, u) = \int_0^T h(x(t), u(t), t, \bar{λ}(t))\, dt.

In what follows, differentiability will be assumed only with respect to x, not u, so that (6.3) is not available. The multiplier \hat{λ} remains, satisfying (6.2), provided that the operator −D + M_x(\bar{x}, \bar{u}) is assumed surjective.

(c) Integrating the −\hat{λ}D term in (6.2) by parts leads to

−D\bar{λ} = (F + \hat{λ}M)_x(\bar{x}, \bar{u}),

if the integrated part vanishes. Choosing a boundary condition to do this, the adjoint differential equation is obtained:

−\dot{\bar{λ}}(t) = h_x(\bar{x}(t), \bar{u}(t), t, \bar{λ}(t)),   \bar{λ}(T) = 0.   (6.4)

(d) Assume that Dx = M(x, u) defines x as a Lipschitz function of u, and that (see Note 2 in the Appendix)

F(x, u) − F(\bar{x}, u) = F_x(\bar{x}, \bar{u})(x − \bar{x}) + O(‖x − \bar{x}‖ + ‖u − \bar{u}‖),   (6.5)

with a similar requirement for M. Then minimality of (\bar{x}, \bar{u}), namely that F(x, u) − F(\bar{x}, \bar{u}) ≥ 0, with (6.2), leads (see [7], Theorem 7.2.3) to

H(\bar{x}, u, \hat{λ}) − H(\bar{x}, \bar{u}, \hat{λ}) = F(x, u) − F(\bar{x}, \bar{u}) + O(‖u − \bar{u}‖) ≥ O(‖u − \bar{u}‖),   (6.6)

describing a quasimin (see [6]) of H(\bar{x}, ., \hat{λ}) over Γ(.) at \bar{u}. (Note that there is no requirement of convexity on Γ(.).)

(e) Assuming that \bar{u} is a minimum in terms of the L¹ norm, suppose if possible that

h(\bar{x}(t), u(t), t, λ(t)) < h(\bar{x}(t), \bar{u}(t), t, λ(t))

for t in a set of positive measure. Then (see Note 3 in the Appendix) a set of control functions {u_β(.) : β ≥ 0} ⊂ Γ is constructed (see [7], Theorem 7.2.6), for which

H(\bar{x}, u, \hat{λ}) − H(\bar{x}, \bar{u}, \hat{λ}) ≤ −c‖u − \bar{u}‖   for some constant c > 0,

thus contradicting (6.6). (A required chattering property holds automatically for the considered control constraint.)

This has proved Pontryagin's principle, in the following form.

Theorem 1. Let the control problem (6.1) reach a local minimum at (x, u) = (\bar{x}, \bar{u}) with respect to the L¹-norm for the control u. Assume that the differential equation Dx = M(x, u) determines x as a Lipschitz function of u, that the differentiability property (6.5) (with respect to x) holds, and that −D + M_x(\bar{x}, \bar{u}) is surjective. Then necessary conditions for the minimum are that the costate \bar{λ}(.) satisfies the adjoint equation (6.4), and that h(\bar{x}(t), ., t, \bar{λ}(t)) is minimized over Γ(t) at \bar{u}(t), for almost all t.
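To make steps (a)–(e) concrete, the following minimal numerical sketch applies the resulting conditions (integrate the state forward, integrate the adjoint equation (6.4) backward, and minimize the Hamiltonian pointwise over Γ(t)) to a simple scalar problem. The problem data (f = x² + u², m = u, Γ(t) = [−1, 1]) and the damped forward–backward sweep are illustrative assumptions, not part of the chapter.

```python
import numpy as np

# Illustrative data (assumed, not from the chapter):
#   minimize J(u) = \int_0^T (x^2 + u^2) dt,  x' = u,  x(0) = 1,  u(t) in [-1, 1].
T, n = 1.0, 200
t = np.linspace(0.0, T, n + 1)
dt = t[1] - t[0]

def sweep(iters=50):
    u = np.zeros(n + 1)                      # initial control guess
    for _ in range(iters):
        # forward pass: x' = m(x, u, t) = u, x(0) = 1 (explicit Euler)
        x = np.empty(n + 1); x[0] = 1.0
        for k in range(n):
            x[k + 1] = x[k] + dt * u[k]
        # backward pass: -lam' = h_x = 2x, lam(T) = 0, cf. (6.4)
        lam = np.empty(n + 1); lam[-1] = 0.0
        for k in range(n, 0, -1):
            lam[k - 1] = lam[k] + dt * 2.0 * x[k]
        # pointwise Hamiltonian minimization: h = x^2 + u^2 + lam*u,
        # unconstrained minimizer u = -lam/2, projected onto Gamma = [-1, 1]
        u_new = np.clip(-lam / 2.0, -1.0, 1.0)
        u = 0.5 * u + 0.5 * u_new            # damping for stability
    return x, u, lam

x, u, lam = sweep()
print("J(u) approx", dt * np.sum(x**2 + u**2))
```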
6.3 Pontryagin for an elliptic PDE

Denote by Ω a closed bounded region in R³ (or R²), with boundary ∂Ω, and disjoint sets A_i (i = 1, 2, 3, 4) whose union is ∂Ω. The optimal control problem considered is:

MIN_{x(.), u(.)} J(u) := \int_Ω f(x(z), u(z), z)\, dz   (6.7)
subject to
(∀z ∈ Ω)  Dx(z) = m(x(z), u(z), z),   (6.8)
(∀z ∈ A₁)  x(z) = x₀(z),
(∀z ∈ A₂)  (∇x(z)).n(z) = g₀(z),   (6.9)
(∀z ∈ Ω)  u(z) ∈ Γ(z).

Here D is an elliptic linear partial differential operator, such as the Laplacian ∇², and n ≡ n(z) denotes the outward-pointing unit normal vector to ∂Ω at z ∈ ∂Ω. The constraint on the control u(z) is specified in terms of a given set-valued function Γ(z). The precise way in which x(.) satisfies the PDE (6.8) need not be specified here; instead, some specific properties of the solution will be required. The function spaces must be chosen so that D is a continuous linear mapping. This holds, in particular, for D = ∇², with Sobolev spaces, if x ∈ W₀²(Ω) and u ∈ W₀¹(Ω). It is further required that (6.7) determines x(.) as a Lipschitz function of u(.). The boundary ∂Ω of the region need only be smooth enough that Green's theorem can be applied to it. The Hamiltonian is

h(x(z), u(z), z, λ(z)) := f(x(z), u(z), z) + λ(z) m(x(z), u(z), z).   (6.10)

The steps of Section 6.2 are now applied, but replacing t ∈ [0, T] by z ∈ Ω. It is observed that steps (a), (b), (d) and (e) remain valid; they do not depend on t ∈ R. Step (c) requires a replacement for integration by parts. If D = ∇², it is appropriate to use Green's theorem in the form

\int_Ω [λ∇²x − x∇²λ]\, dv = \int_{∂Ω} [λ(∂x/∂n) − x(∂λ/∂n)]\, ds,

in which dv and ds denote elements of volume and surface. The right side of (6.10) becomes the integrated part; the origin can be shifted, in the spaces of functions, to move x(z) = x₀(z) to x(z) = 0, with a similar replacement for the normal component (∇x(z)).n(z); so the contributions to the integrated part from A₁ and A₂ vanish already. The remaining contributions vanish if boundary conditions are imposed:

λ(z) = 0 on A₃;   ∂λ/∂n = 0 on A₄ (thus ∇λ(z).n(z) = 0 on A₄).   (6.11)

Then (6.2) leads to the adjoint PDE

D*λ(z) = ∂h(x(z), u(z), z; λ(z))/∂x(z),

with boundary conditions (6.11), where D* denotes the adjoint linear operator to D. Here, with D = ∇², (6.10) shows that D* = ∇² also. Then (e), with z ∈ Ω replacing t ∈ [0, T], gives Pontryagin's principle in the form: h(\bar{x}(z), ., z, \bar{λ}(z)) is minimized over Γ(z) at \bar{u}(z), possibly except for a set of z of zero measure.

If f and m happen to be linear in u, and if Γ(z) is a polyhedron with vertices p_i (or an interval if u(z) ∈ R), then Pontryagin's principle may lead to bang-bang control, namely u(z) = p_i when z ∈ E_i, for some disjoint sets E_i ⊂ Ω.
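The pointwise minimization that closes the argument is straightforward to carry out numerically once λ is known. The sketch below is an illustrative fragment only: it assumes a Hamiltonian that is affine in u on a grid over Ω, with Γ(z) a fixed interval [u_lo, u_hi], so the minimizer is the bang-bang selection just described.

```python
import numpy as np

# Minimal sketch of the pointwise Hamiltonian minimization of Section 6.3,
# assuming h(x, u, z, lam) = f0(x, z) + (c(z) + lam(z) * b(z)) * u is affine in u
# and Gamma(z) = [u_lo, u_hi] at every grid point z.  The switching rule is then
# bang-bang: pick the endpoint that makes the u-dependent term smallest.
def bang_bang_control(c, b, lam, u_lo=-1.0, u_hi=1.0):
    coeff = c + lam * b              # coefficient of u in the Hamiltonian
    return np.where(coeff > 0.0, u_lo, u_hi)

# toy grid data (illustrative only)
rng = np.random.default_rng(0)
c, b, lam = rng.normal(size=100), np.ones(100), rng.normal(size=100)
u = bang_bang_control(c, b, lam)
```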
6.4 Pontryagin for a parabolic PDE

Now consider a control problem with dynamics described by the PDE

∂x(z, t)/∂t = c²∇²_z x(z, t) + m(x(z, t), u(z, t), z, t),

where ∇²_z acts on the variable z. Here t (for the ODE) has been replaced by (t, z) ∈ [0, T] × Ω, for a closed bounded region Ω ⊂ R³, and where m(.) is a forcing function. Define the linear differential operator D := (∂/∂t) − c²∇²_z. The function spaces must be chosen so that D is a continuous linear mapping. Define A_i ⊂ ∂Ω as in Section 6.3. The optimal control problem now becomes (with a certain choice of boundary conditions)

MIN_{x(.), u(.)} J(u) := \int_0^T \int_Ω f(x(z, t), u(z, t), z, t)\, dz\, dt
subject to
(∀z ∈ Ω)  Dx(z, t) = m(x(z, t), u(z, t), t, z),
(∀z ∈ A₁)(∀t ∈ [0, T])  x(z, t) = x₀(z, t),
(∀z ∈ A₂)  (∇x(z, t)).n(z) = g₀(z, t),
(∀z ∈ ∂Ω)  x(z, 0) = b₀(z),
(∀z ∈ Ω)  u(z, t) ∈ Γ(z, t).

Then steps (a), (b), (d) and (e) proceed as in Section 6.3, for an elliptic PDE. The Hamiltonian is

h(x(z, t), u(z, t), z, t, λ(z, t)) := f(x(z, t), u(z, t), z, t) + λ(z, t) m(x(z, t), u(z, t), z, t).

Step (c) (integration by parts) is replaced by the following (where I := [0, T] and θ := ∂/∂t):

−\hat{λ}Dx = −\int_I \int_Ω λ(z, t) Dx(z, t)\, dz\, dt
          = −\int_I \int_Ω λ(z, t) [θ − c²∇²_z] x(z, t)\, dz\, dt
          = \int_Ω \int_I (θλ(z, t)) x(z, t)\, dt\, dz + \int_I \int_Ω c² (∇²_z λ(z, t)) x(z, t)\, dz\, dt,

applying integration by parts to θ and Green's theorem to ∇²_z, provided that the "integrated parts" vanish. Since x(z, t) is given for t = 0, and for z ∈ A₁ ∪ A₂ ⊂ ∂Ω, it suffices if (∀z) λ(z, T) = 0, so that

\int_Ω [λ(z, t) x(z, t)]_0^T = 0,

and if

(∀t ∈ [0, T])  λ(z, t) = 0 on A₃;   ∇λ(z, t).n(z) = 0 on A₄.

With these boundary conditions, the adjoint PDE becomes

−(∂/∂t)λ(z, t) = c²∇²_z λ(z, t) + ∂h(x(z, t), u(z, t), z, t; λ(z, t))/∂x.

Then (e), with (z, t) ∈ Ω × I replacing t ∈ [0, T], gives Pontryagin's principle in the form: h(\bar{x}(z, t), ., z, t, \bar{λ}(z, t)) is minimized over Γ(z, t) at \bar{u}(z, t), possibly except for a set of (z, t) of zero measure. Concerning bang-bang control, a similar remark to that in Section 6.3 applies here also.
6.5 Appendix

Note 1. The linear mapping D is continuous if x(.) is given a graph norm:

‖x‖ := ‖x‖_* + ‖Dx‖_*,

where ‖x‖_* denotes a given norm, such as ‖x‖_∞ or ‖x‖_2.

Note 2. It follows from Gronwall's inequality that the mapping from u (with L¹ norm) to x (with L^∞ or L² norm) is Lipschitz if m(.) satisfies a Lipschitz condition. The differentiability property (6.5) replaces the usual F_x(\bar{x}, u) by F_x(\bar{x}, \bar{u}). This holds (using the first mean value theorem) if f and m have bounded second derivatives. Similar results are conjectured for the case of partial differential equations.

Note 3. The construction depends on the (local) minimum being reached when u has the L¹-norm, and on the constraint (∀z) u(z) ∈ Γ(z) having the chattering property: if u and v are feasible controls, then w is a feasible control, where w(z) = u(z) for z ∈ Ω₁ ⊂ Ω and w(z) = v(z) for z ∈ Ω\Ω₁. For Section 6.4, substitute (z, t) for z here.

Acknowledgments The author thanks two referees for pointing out ambiguities and omissions.
References

1. E. Casas, Boundary control problems for quasi-linear elliptic equations: A Pontryagin's principle, Appl. Math. Optim. 33 (1996), 265–291.
2. E. Casas, Pontryagin's principle for state-constrained boundary control problems of semilinear parabolic equations, SIAM J. Control Optim. 35 (1997), 1297–1327.
3. E. Casas, F. Tröltsch and A. Unger, Second order sufficient optimality condition for a nonlinear elliptic boundary control problem, Z. Anal. Anwendungen 15 (1996), 687–707.
4. E. Casas and F. Tröltsch, Second order necessary optimality conditions for some state-constrained control problems of semilinear elliptic equations, SIAM J. Control Optim. 38 (2000), 1369–1391.
5. E. Casas, J.-P. Raymond and H. Zidani, Pontryagin's principle for local solutions of control problems with mixed control-state constraints, SIAM J. Control Optim. 39 (1998), 1182–1203.
6. B. D. Craven, Mathematical Programming and Control Theory (Chapman & Hall, London, 1978).
7. B. D. Craven, Control and Optimization (Chapman & Hall, London, 1995).
Chapter 7
A turnpike property for discrete-time control systems in metric spaces
Alexander J. Zaslavski
Department of Mathematics, The Technion–Israel Institute of Technology, Haifa, Israel
Abstract In this work we study the structure of “approximate” solutions for a nonautonomous infinite dimensional discrete-time control system determined by a sequence of continuous functions vi : X × X → R1 , i = 0, ±1, ±2, . . . where X is a metric space. Key words: Discrete-time control system, metric space, turnpike property
7.1 Introduction

Let X be a metric space and let ρ(·, ·) be the metric on X. For the set X × X we define a metric ρ₁(·, ·) by

ρ₁((x₁, x₂), (y₁, y₂)) = ρ(x₁, y₁) + ρ(x₂, y₂),   x₁, x₂, y₁, y₂ ∈ X.

Let Z be the set of all integers. Denote by M the set of all sequences of functions v = {v_i}_{i=−∞}^{∞} where v_i : X × X → R¹ is bounded from below for each i ∈ Z. Such a sequence of functions {v_i}_{i=−∞}^{∞} ∈ M will occasionally be denoted by a boldface v (similarly {u_i}_{i=−∞}^{∞} will be denoted by u, etc.). The set M is equipped with the metric d defined by

\tilde{d}(v, u) = sup{ |v_i(x, y) − u_i(x, y)| : (x, y) ∈ X × X, i ∈ Z },   (1.1)
d(v, u) = \tilde{d}(v, u)(1 + \tilde{d}(v, u))^{−1},   u, v ∈ M.

In this paper we investigate the structure of "approximate" solutions of the optimization problem
\sum_{i=k_1}^{k_2−1} v_i(x_i, x_{i+1}) → min,   {x_i}_{i=k_1}^{k_2} ⊂ X,   x_{k_1} = y,  x_{k_2} = z   (P)

where v = {v_i}_{i=−∞}^{∞} ∈ M, y, z ∈ X and k₂ > k₁ are integers.

The interest in these discrete-time optimal problems stems from the study of various optimization problems which can be reduced to this framework, for example, continuous-time control systems which are represented by ordinary differential equations whose cost integrand contains a discounting factor (see [Leizarowitz (1985)]), the infinite-horizon control problem of minimizing \int_0^T L(z, z')\, dt as T → ∞ (see [Leizarowitz (1989), Zaslavski (1996)]) and the analysis of a long slender bar of a polymeric material under tension in [Leizarowitz and Mizel (1989), Marcus and Zaslavski (1999)]. Similar optimization problems are also considered in mathematical economics (see [Dzalilov et al. (2001), Dzalilov et al. (1998), Makarov, Levin and Rubinov (1995), Makarov and Rubinov (1973), Mamedov and Pehlivan (2000), Mamedov and Pehlivan (2001), McKenzie (1976), Radner (1961), Rubinov (1980), Rubinov (1984)]). Note that the problem (P) was studied in [Zaslavski (1995)] when X was a compact metric space and v_i = v₀ for all integers i.

For each v ∈ M, each m₁, m₂ ∈ Z such that m₂ > m₁ and each z₁, z₂ ∈ X set

σ(v, m₁, m₂, z₁, z₂) = inf{ \sum_{i=m_1}^{m_2−1} v_i(x_i, x_{i+1}) : {x_i}_{i=m_1}^{m_2} ⊂ X, x_{m_1} = z₁, x_{m_2} = z₂ }.   (1.2)

If the space of states X is compact and v_i is continuous for all integers i, then the problem (P) has a solution for each y, z ∈ X and each pair of integers k₂ > k₁. For the noncompact space X the existence of solutions of the problem (P) is not guaranteed and in this situation we consider δ-approximate solutions.

Let v ∈ M, y, z ∈ X, let k₂ > k₁ be integers and let δ be a positive number. We say that a sequence {x_i}_{i=k_1}^{k_2} ⊂ X satisfying x_{k_1} = y, x_{k_2} = z is a δ-approximate solution of the problem (P) if

\sum_{i=k_1}^{k_2−1} v_i(x_i, x_{i+1}) ≤ σ(v, k₁, k₂, y, z) + δ.

In this chapter we study the structure of δ-approximate solutions of the problem (P).

Definition: Let v = {v_i}_{i=−∞}^{∞} ∈ M and {\bar{x}_i}_{i=−∞}^{∞} ⊂ X. We say that v has the turnpike property (TP) and {\bar{x}_i}_{i=−∞}^{∞} is the turnpike for v if for each ε > 0 there exist δ > 0 and a natural number N such that for each pair of integers m₁, m₂ satisfying m₂ ≥ m₁ + 2N and each sequence {x_i}_{i=m_1}^{m_2} ⊂ X satisfying

\sum_{i=m_1}^{m_2−1} v_i(x_i, x_{i+1}) ≤ σ(v, m₁, m₂, x_{m_1}, x_{m_2}) + δ

there exist τ₁ ∈ {m₁, . . . , m₁ + N} and τ₂ ∈ {m₂ − N, . . . , m₂} such that

ρ(x_i, \bar{x}_i) ≤ ε,   i = τ₁, . . . , τ₂.

Moreover, if ρ(x_{m_1}, \bar{x}_{m_1}) ≤ δ, then τ₁ = m₁, and if ρ(x_{m_2}, \bar{x}_{m_2}) ≤ δ, then τ₂ = m₂.

This property was studied in [Zaslavski (2000)] for sequences of functions v which satisfy certain uniform boundedness and uniform continuity assumptions. We showed that a generic v has the turnpike property.

The turnpike property is very important for applications. Suppose that a sequence of cost functions v ∈ M has the turnpike property and we know a finite number of approximate solutions of the problem (P). Then we know the turnpike {\bar{x}_i}_{i=−∞}^{∞}, or at least its approximation, and the constant N which is an estimate for the time period required to reach the turnpike. This information can be useful if we need to find an "approximate" solution of the problem (P) with a new time interval [k₁, k₂] and new values y, z ∈ X at the end points k₁ and k₂. Namely, instead of solving this new problem on the "large" interval [k₁, k₂] we can find an "approximate" solution of the problem (P) on the "small" interval [k₁, k₁ + N] with the values y, \bar{x}_{k_1+N} at the end points and an approximate solution of the problem (P) on the "small" interval [k₂ − N, k₂] with the values \bar{x}_{k_2−N}, z at the end points. Then the concatenation of the first solution, the sequence {\bar{x}_i}_{i=k_1+N}^{k_2−N} and the second solution is an approximate solution of the problem (P) on the interval [k₁, k₂] with the values y, z at the end points. Sometimes as an "approximate" solution of the problem (P) we can choose any sequence {x_i}_{i=k_1}^{k_2} satisfying x_{k_1} = y, x_{k_2} = z and x_i = \bar{x}_i for all i = k₁ + N, . . . , k₂ − N. This sequence is a δ-approximate solution where the constant δ does not depend on k₁, k₂ and y, z. The constant δ is not necessarily a "small" number but it may be sufficient for practical needs, especially if the length of the interval [k₁, k₂] is large.

The turnpike property is well known in mathematical economics. The term was first coined by Samuelson in 1948 (see [Samuelson (1965)]) where he showed that an efficient expanding economy would spend most of the time in the vicinity of a balanced equilibrium path (also called a von Neumann path). This property was further investigated in [Dzalilov et al. (2001), Dzalilov et al. (1998), Makarov, Levin and Rubinov (1995), Makarov and Rubinov (1973), Mamedov and Pehlivan (2000), Mamedov and Pehlivan (2001), McKenzie (1976), Radner (1961), Rubinov (1980), Rubinov (1984)] for optimal trajectories of models of economic dynamics.
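The concatenation procedure described above can be phrased as a small routine. The sketch below is illustrative only: the short-interval solver `solve_small` and the dictionary `turnpike` of turnpike states are placeholders assumed to be supplied by the user, and are not specified in the chapter.

```python
def concatenate_turnpike(solve_small, turnpike, k1, k2, y, z, N):
    """Build an approximate solution of (P) on [k1, k2] with end values y, z,
    using the turnpike and the estimate N of the time needed to reach it.

    `solve_small(i0, i1, a, b)` is assumed to return a list [x_{i0}, ..., x_{i1}]
    with x_{i0} = a and x_{i1} = b (an approximate solution on a short interval).
    `turnpike` is assumed to be a dict mapping the index i to the state x_bar_i.
    """
    head = solve_small(k1, k1 + N, y, turnpike[k1 + N])        # indices k1 .. k1+N
    middle = [turnpike[i] for i in range(k1 + N + 1, k2 - N)]  # strictly in between
    tail = solve_small(k2 - N, k2, turnpike[k2 - N], z)        # indices k2-N .. k2
    return head + middle + tail                                # covers k1 .. k2
```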
The chapter is organized as follows. In Section 2 we study the stability of the turnpike phenomenon. In Section 3 we show that if {\bar{x}_i}_{i=−∞}^{∞} is the turnpike for v = {v_i}_{i=−∞}^{∞} ∈ M and v_i is continuous for each integer i, then for each pair of integers k₂ > k₁ the sequence {\bar{x}_i}_{i=k_1}^{k_2} is a solution of the problem (P) with y = \bar{x}_{k_1} and z = \bar{x}_{k_2}. In Section 4 we show that under certain assumptions the turnpike property is equivalent to its weakened version.
7.2 Stability of the turnpike phenomenon In this section we prove the following result. Theorem 1. Assume that v = {vi }∞ i=−∞ ∈ M has the turnpike property and {¯ xi }∞ i=−∞ ⊂ X is the turnpike for v. Then the following property holds: For each > 0 there exist δ > 0, a natural number N and a neighborhood U of v in M such that for each u ∈ U, each pair of integers m1 , m2 satisfying 2 m2 ≥ m1 + 2N and each sequence {xi }m i=m1 ⊂ X satisfying m 2 −1
ui (xi , xi+1 ) ≤ σ(u, m1 , m2 , xm1 , xm2 ) + δ
(2.1)
i=m1
there exist τ1 ∈ {m1 , . . . , m1 + N } and τ2 ∈ {m2 − N, . . . , m2 } such that ¯i ) ≤ , i = τ1 , . . . , τ2 . ρ(xi , x
(2.2)
¯m1 ) ≤ δ, then τ1 = m1 , and if ρ(xm2 , x ¯m2 ) ≤ δ, then Moreover, if ρ(xm1 , x τ2 = m 2 . Proof. Let > 0. It follows from the property (TP) that there exist δ0 ∈ (0, /4)
(2.3)
and a natural number N0 such that the following property holds: (P1) for each pair of integers m1 , m2 ≥ m1 + 2N0 and each sequence 2 {xi }m i=m1 ⊂ X satisfying m 2 −1
vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , xm1 , xm2 ) + 4δ0
(2.4)
i=m1
there exist τ1 ∈ {m1 , . . . , m1 + N0 }, τ2 ∈ {m2 − N0 , m2 } such that (2.2) ¯m1 ) ≤ 4δ0 , then τ1 = m1 and if ρ(xm2 x ¯m2 ) ≤ 4δ0 holds. Moreover, if ρ(xm1 , x then τ2 = m2 . It follows from the property (TP) that there exist δ ∈ (0, δ0 /4)
(2.5)
7
A turnpike property
147
and a natural number N1 such that the follwing property holds: (P2) For each pair of integers m1 , m2 ≥ m1 + 2N1 and each sequence 2 {xi }m i=m1 ⊂ X which satisfies m 2 −1
vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , xm1 , xm2 ) + 4δ
i=m1
there exist τ1 ∈ {m1 , . . . , m1 + N1 }, τ2 ∈ {m2 − N1 , . . . , m2 } such that ¯i ) ≤ δ0 , i ∈ {τ1 , . . . , τ2 }. ρ(xi , x
(2.6)
N = 4(N1 + N0 )
(2.7)
Set and U = {u ∈ M : |ui (x, y) − vi (x, y)| ≤ (8N )−1 δ, (x, y) ∈ X × X}.
(2.8)
Assume that u ∈ U, m1 , m2 ∈ Z, m2 ≥ m1 + 2N , 2 {xi }m i=m1
⊂ X and
m 2 −1
ui (xi , xi+1 ) ≤ σ(u, m1 , m2 , xm1 , xm2 ) + δ.
(2.9)
i=m1
Let k ∈ {m1 , . . . , m2 }, m2 − k ≥ 2N.
(2.10)
(2.9) implies that k+2N −1
ui (xi , xi+1 ) ≤ σ(u, k, k + 2N, xk , xk+2N ) + δ.
(2.11)
i=k
By (2.11), (2.7) and (2.8), *k+2N −1 * k+2N * * −1 * * ui (xi , xi+1 ) − vi (xi , xi+1 )* ≤ δ/4, * * * i=k
i=k
|σ(u, k, k + 2N, xk , xk+2N ) − σ(v, k, k + 2N, xk , xk+2N )| < δ/4 and k+2N −1 i=k
vi (xi , xi+1 ) ≤
k+2N −1
ui (xi , xi+1 ) + δ/4
i=k
≤ δ/4 + σ(u, k, k + 2N, xk , xk+2N ) + δ ≤ σ(v, k, k + 2N, xk , xk+2N ) + δ + δ/4 + δ/4.
(2.12)
148
A.J. Zaslavski
We have that (2.12) holds for any k satisfying (2.10). This fact implies that k 2 −1 vi (xi , xi+1 ) ≤ σ(v, k1 , k2 , xk1 , xk2 ) + 2−1 · 3δ (2.13) i=k1
for each pair of integers k1 , k2 ∈ {m1 , . . . , m2 } such that k1 < k2 ≤ k1 + 2N. It follows from (2.13), (2.7) and the property (P2) that for any integer k ∈ {m1 , . . . , m2 } satisfying m2 − k ≥ 2N0 + 2N1 there exists an integer q such that q − k ∈ [2N0 , 2N0 + 2N1 ] and ¯ q ) ≤ δ0 . ρ(xq , x This fact implies that there exists a finite strictly increasing sequence of integers {τj }sj=0 such that ¯τj ) ≤ δ0 , j = 0, . . . , s, ρ(xτj , x ¯m1 ) ≤ δ0 , then τ0 = m1 , m1 ≤ τ0 ≤ 2N0 + 2N1 + m1 , if ρ(xm1 , x
(2.14) (2.15)
τj+1 − τj ∈ [2N0 , 2N0 + 2N1 ], j = 0, . . . , s − 1
(2.16)
m2 − 2N0 − 2N1 < τs ≤ m2 .
(2.17)
and It follows from (2.13), (2.7), (2.5), (2.14) and (2.16) that for j = 0, . . . , s − 1 ¯i ) ≤ , i ∈ {τj , . . . , τj+1 }. ρ(xi , x ¯i ) ≤ , i ∈ {τ0 , . . . , τs } with τ0 ≤ m1 + N , τs ≥ m2 − N . By Thus ρ(xi , x ¯m1 ) ≤ δ0 , then τ0 = m1 . (2.15) if ρ(xm1 , x Assume that ¯ m2 ) ≤ δ0 . (2.18) ρ(xm2 , x To complete the proof of the theorem it is sufficient to show that ρ(xi , x ¯i ) ≤ , i ∈ {τs , . . . , m2 }. By (2.17) and (2.16) m2 − τs−1 = m2 − τs + τs − τs−1 ∈ [2N0 , 4N0 + 4N1 ].
(2.19)
By (2.13), (2.19) and (2.7), m 2 −1 i=τs−1
vi (xi , xi+1 ) ≤ σ(v, τs−1 , m2 , xτs−1 , xm2 ) + 2−1 · 3δ.
(2.20)
7
A turnpike property
149
It follows from (2.19), (2.20), (2.18), (2.14) and the property (P1) that ¯i ) ≤ , i = τs−1 , . . . , m2 . ρ(xi , x Theorem 2.1 is proved.
7.3 A turnpike is a solution of the problem (P) ∞ In this section we show that if {¯ xi }∞ i=−∞ is the turnpike for v = {vi }i=−∞ ∈ M and vi is continuous for each integer i, then for each pair of integers 2 xi }ki=k is a solution of the problem (P) with y = x ¯ k1 k2 > k1 the sequence {¯ 1 and z = x ¯k2 . We prove the following result.
Theorem 2. Let v = {vi }∞ xi }∞ −∞ ⊂ X. Assume that vi i=−∞ ∈ M, and {¯ is continuous for all i ∈ Z, v has the turnpike property and {¯ xi }∞ −∞ is the turnpike for v. Then for each pair of integers m1 , m2 > m1 , m 2 −1
vi (¯ xi , x ¯i+1 ) = σ(v, m1 , m2 , x ¯ m1 , x ¯m2 ).
i=m1
¯2 > m ¯ 1, Proof. Assume the contrary. Then there exist a pair of integers m ¯ 1, m ¯2 a sequence {xi }m i=m ¯ 1 and a number Δ > 0 such that ¯m ¯m xm ¯1 = x ¯ 1 , xm ¯2 = x ¯ 2, m ¯ 2 −1
vi (xi , xi+1 ) <
i=m ¯1
m ¯ 2 −1
vi (¯ xi , x ¯i+1 ) − Δ.
(3.1)
i=m ¯1
There exists ∈ (0, Δ/4) such that the following property holds: ¯ 2 + 1}, z1 , z2 ∈ X and (P3) if i ∈ {m ¯ 1 − 1, . . . , m ¯i ), ρ(z2 , x ¯i+1 ) ≤ , ρ(z1 , x then
xi , x ¯i+1 ) − vi (z1 , z2 )| ≤ Δ[64(m ¯2 −m ¯ 1 + 4)]−1 . |vi (¯
(3.2)
By the property (TP) there exist δ ∈ (0, /4) and a natural number N such 2 ⊂ that for each pair of integers n1 , n2 ≥ n1 + 2N and each sequence {yi }ni=n 1 X satisfying n 2 −1 vi (yi , yi+1 ) ≤ σ(v, n1 , n2 , yn1 , yn2 ) + δ (3.3) i=n1
the following inequality holds: ρ(yi , x ¯i ) ≤ , i = n1 + N, . . . , n2 − N.
(3.4)
150
A.J. Zaslavski
m ¯ 2 +4N There exists {¯ yi }i= m ¯ 1 −4N ⊂ X such that
¯m ¯m ¯m y¯m ¯ 1 −4N = x ¯ 1 −4N , y ¯ 2 +4N = x ¯ 2 +4N ,
(3.5)
and m ¯ 2 +4N
vi (¯ yi , y¯i+1 ) ≤ σ(v, m ¯ 1 − 4N, m ¯ 2 + 4N, y¯m ¯m ¯ 1 −4N , y ¯ 2 +4N ) + δ/8.
i=m ¯ 1 −4N
(3.6) By (3.5), (3.6) and the definition of δ, N (see (3.3) and (3.4)) ¯i ) ≤ , i = m ¯ 1 − 3N, . . . , m ¯ 2 + 3N. ρ(¯ yi , x
(3.7)
m ¯ 2 +4N Define {yi }i= m ¯ 1 −4N ⊂ X by
¯ 1 − 4N, . . . , m ¯ 1 − 1} ∪ {m ¯ 2 + 1, . . . , m ¯ 2 + 4N }, yi = y¯i , i ∈ {m
(3.8)
¯ 1, . . . , m ¯ 2 }. yi = xi , i ∈ {m We will estimate m ¯ 2 +4N −1
vi (¯ yi , y¯i+1 ) −
i=m ¯ 1 −4N
m ¯ 2 +4N −1
vi (yi , yi+1 ).
i=m ¯ 1 −4N
By (3.8) and (3.1), m ¯ 2 +4N −1
vi (¯ yi , y¯i+1 ) −
i=m ¯ 1 −4N
vi (yi , yi+1 )
(3.9)
i=m ¯ 1 −4N m ¯2
=
m ¯ 2 +4N −1
[vi (¯ yi , y¯i+1 ) − vi (yi , yi+1 )]
i=m ¯ 1 −1
ym ¯m ym ym ¯m = vm ¯ 1 −1 (¯ ¯ 1 −1 , y ¯ 1 ) − vm ¯ 1 −1 (¯ ¯ 1 −1 , ym ¯ 1 ) + vm ¯ 2 (¯ ¯ 2, y ¯ 2 +1 ) − vm ¯ 2 (ym ¯ 2 , ym ¯ 2 +1 ) +
m ¯ 2 −1
[vi (¯ yi , y¯i+1 ) − vi (yi , yi+1 )]
i=m ¯1
ym ¯m ym ¯m ym ¯m = vm ¯ 1 −1 (¯ ¯ 1 −1 , y ¯ 1 ) − vm ¯ 1 −1 (¯ ¯ 1 −1 , x ¯ 1 ) + vm ¯ 2 (¯ ¯ 2, y ¯ 2 +1 ) xm ¯m − vm ¯ 2 (¯ ¯ 2, y ¯ 2 +1 ) +
m ¯ 2 −1
[vi (¯ yi , y¯i+1 ) − vi (¯ xi , x ¯i+1 )]
i=m ¯1
+
m ¯ 2 −1
[vi (¯ xi , x ¯i+1 ) − vi (xi , xi+1 )].
i=m ¯1
By (3.7) and the property (P3),
7
A turnpike property
151
|vm ym ¯m ym ¯m ¯2 −m ¯ 1 + 4)]−1 , (3.10) ¯ 1 −1 (¯ ¯ 1 −1 , y ¯ 1 ) − vm ¯ 1 −1 (¯ ¯ 1 −1 , x ¯ 1 )| ≤ 2Δ[64(m
|vm ym ¯m xm ¯m ¯2 −m ¯ 1 + 4)]−1 , ¯ 2 (¯ ¯ 2, y ¯ 2 +1 ) − vm ¯ 2 (¯ ¯ 2, y ¯ 2 +1 )| ≤ 2Δ[64(m
(3.11)
yi , y¯i+1 ) − vi (¯ xi , x ¯i+1 )| ≤ Δ[64(m ¯2 −m ¯ 1 + 4)]−1 , i = m ¯ 1, . . . , m ¯ 2 − 1. |vi (¯ (3.12) It follows from (3.9), (3.12), (3.11) and (3.1) that m ¯ 2 +4N −1
[vi (¯ yi , y¯i+1 ) − vi (yi , yi+1 )]
i=m ¯ 1 −4N
¯ 1 + 4)]−1 (m ¯2 −m ¯ 1 + 4) ≥ −Δ[64(m ¯2 −m +
m ¯ 2 −1
[vi (¯ xi , x ¯i+1 ) − vi (xi , xi+1 )]
i=m ¯1
≥ Δ − Δ/64 > 2δ. Combined with (3.8) this fact contradicts (3.6). The contradiction we have reached proves the theorem.
7.4 A turnpike result In this section we show that under certain assumptions the turnpike property is equivalent to its weakened version. Theorem 3. Let v = {vi }∞ xi }∞ −∞ ∈ M, {¯ i=−∞ ⊂ X and m 2 −1
vi (¯ xi , x ¯i+1 ) = σ(v, m1 , m2 , x ¯ m1 , x ¯ m2 )
(4.1)
i=m1
for each pair of integers m1 , m2 > m1 . Assume that the following two properties hold: (i) For any > 0 there exists δ > 0 such that for each i ∈ Z, each x1 , x2 , y1 , y2 ∈ X satisfying ρ(xj , yj ) ≤ δ, j = 1, 2, |vi (x1 , x2 ) − vi (y1 , y2 )| ≤ ; (ii) for each > 0 there exist δ > 0 and a natural number N such that for 2 each pair of integers m1 , m2 ≥ m1 + 2N and each sequence {xi }m i=m1 ⊂ X satisfying
152
A.J. Zaslavski m 2 −1
vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , x ¯ m1 , x ¯ m2 ) + δ
(4.2)
i=m1
the inequality ρ(xi , x ¯i ) ≤ , i = m1 + N, . . . , m2 − N holds. Then v has the property (TP) and {¯ xi }∞ i=−∞ is the turnpike for v. Proof. We show that the following property holds: (C) Let > 0. Then there exists δ > 0 such that for each integer m, each natural number k and each sequence {xi }k+m i=k ⊂ X satisfying ¯i ) ≤ δ, i = m, m + k ρ(xi , x m+k−1
(4.3)
vi (xi , xi+1 ) ≤ σ(v, m, m + k, xm , xm+k ) + δ
i=m
the inequality ¯i ) ≤ , i = m, . . . , m + k ρ(xi , x
(4.4)
holds. Let > 0. There exists 0 ∈ (0, /2) and a natural number N such that for 2 each pair of integers m1 , m2 ≥ m1 + 2N and each sequence {xi }m i=m1 ⊂ X satisfying m 2 −1 vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , xm1 , xm2 ) + 20 i=m1
the inequality ¯i ) ≤ , i = m1 + N, . . . , m2 − N ρ(xi , x
(4.5)
holds. By the property (i) there is δ ∈ (0, 0 /8) such that for each integer i and each z1 , z2 , y1 , y2 ∈ X satisfying ρ(zj , yj ) ≤ δ, j = 1, 2 the following inequality holds: |vi (z1 , z2 ) − vi (y1 , y2 )| ≤ 0 /16.
(4.6)
Assume that m is an integer, k is a natural number and a sequence {xi }m+k i=m ⊂ X satisfies (4.3). We will show that (4.4) is true. Clearly we may assume without loss of generality that k > 1. Define a sequene {zi }m+k i=m ⊂ X by ¯i , i ∈ {m, . . . , m + k} \ {m, m + k}. zi = xi , i = m, m + k, zi = x (4.3) and (4.7) imply that
(4.7)
7
A turnpike property m+k−1
153
vi (xi , xi+1 )
i=m
≤
m+k−1
vi (zi , zi+1 ) + δ
i=m
≤δ+
m+k−1
vi (¯ xi , x ¯i+1 ) + |vm (¯ xm , x ¯m+1 ) − vm (xm , x ¯m+1 )|
i=m
+ |vm+k−1 (¯ xm+k−1 , x ¯m+k ) − vm+k−1 (¯ xm+k−1 , xm+k )|. Combined with (4.3) and the definition of δ (see (4.6)) this inequality implies that m+k−1
vi (xi , xi+1 ) ≤ δ + 0 /8 +
i=m
m+k−1
vi (¯ xi , x ¯i+1 ).
(4.8)
i=m
+k Define {yi }m+2N i=m−2N ⊂ X by
¯i , i ∈ {m − 2N, . . . , m − 1} ∪ {m + k + 1, . . . , m + k + 2N }, yi = x
(4.9)
yi = xi , i ∈ {m, . . . , m + k}. It follows from (4.9), (4.3) and the definition of δ (see (4.6)) that xm−1 , x ¯m ) − vm−1 (ym−1 , ym )| ≤ 0 /16 |vm−1 (¯ and xm+k , x ¯m+k+1 ) − vm+k (ym+k , ym+k+1 )| ≤ 0 /16. |vm+k (¯ Combined with (4.9) and (4.8) these inequalities imply that m+k
vi (yi , yi+1 ) ≤ vm−1 (¯ xm−1 , x ¯m ) +
i=m−1
+
0 + vm+k (¯ xm+k , x ¯m+k+1 ) 16
m+k−1 0 + vi (xi , xi+1 ) 16 i=m
xm−1 , x ¯m ) + vm+k (¯ xm+k , x ¯m+k+1 ) + δ ≤ 0 /8 + vm−1 (¯ + 0 /8 +
m+k−1
vi (¯ xi , x ¯i+1 )
i=m
< 0 /2 +
m+k i=m−1
vi (¯ xi , x ¯i+1 ).
(4.10)
154
A.J. Zaslavski
By (4.9), (4.10) and (4.1) m+k+2N −1
vi (yi , yi+1 )
i=m−2N m−2
=
vi (yi , yi+1 ) +
≤
vi (yi , yi+1 ) +
i=m−1
i=m−2N m−2
m+k
vi (¯ xi , x ¯i+1 ) + 0 /2 +
vi (yi , yi+1 )
i=m+k+1 m+k
vi (¯ xi , x ¯i+1 )
i=m−1
i=m−2N
+
m+k+2N −1
m+k+2N −1
vi (¯ xi , x ¯i+1 )
i=m+k+1
= 0 /2 +
m+k+2N −1
vi (¯ xi , x ¯i+1 )
i=m−2N
= 0 /2 + σ(v, m − 2N, m + k + 2N, x ¯m−2N , x ¯m+k+2N ) = 0 /2 + σ(v, m − 2N, m + k + 2N, ym−2N , ym+k+2N ). Thus m+k+2N −1
vi (yi , yi+1 ) ≤ 0 /2 + σ(v, m − 2N, m + k + 2N, ym−2N , ym+k+2N ).
i=m−2N
By this inequality and the definition of 0 (see (4.5)) ¯i ) ≤ , i = m − N, . . . , m + k + N. ρ(yi , x Together with (4.9) this implies that ¯i ) ≤ , i = m, . . . , m + k. ρ(xi , x Thus we have shown that the property (C) holds. Now we are ready to complete the proof. Let > 0. By the property (C) there exists δ0 ∈ (0, ) such that for each 2 pair of integers m1 , m2 > m1 and each sequence {xi }m i=m1 ⊂ X satisfying ¯i ) ≤ δ0 , i = m1 , m2 ρ(xi , x m 2 −1 i=m1
vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , xm1 , xm2 ) + δ0
(4.11)
7
A turnpike property
155
the following inequality holds: ¯i ) ≤ , i = m1 , . . . , m2 ρ(xi , x
(4.12)
There exist a number δ ∈ (0, δ0 ) and a natural number N such that for each 2 pair of integers m1 , m2 ≥ m1 + 2N and each sequence {xi }m i=m1 ⊂ X which satisfies m 2 −1 vi (xi , xi+1 ) ≤ σ(v, m1 , m2 , xm1 , xm2 ) + δ (4.13) i=m1
the following inequality holds: ρ(xi , x ¯i ) ≤ δ0 , i = m1 + N, . . . , m2 − N.
(4.14)
2 Let m1 , m2 ≥ m1 + 2N be a pair of integers and {xi }m i=m1 ⊂ X satisfy ¯m1 ) ≤ δ. Then by (4.14) (4.13). Then (4.14) is valid. Assume that ρ(xm1 , x and (4.13), ¯m1 +N ) ≤ δ0 ρ(xm1 +N , x
and
m1 +N −1
vi (xi , xi+1 ) ≤ σ(v, m1 , m1 + N, xm1 , xm1 +N ) + δ.
i=m1
It follows from these relations and the definition of δ0 (see (4.11) and (4.12)) that ¯i ) ≤ , i = m1 , . . . , m1 + N. ρ(xi , x Analogously we can show that is ρ(xm2 , x ¯m2 ) ≤ δ, then ¯i ) ≤ , i = m2 − N, . . . , m2 . ρ(xi , x This completes the proof of the theorem.
References [Dzalilov et al. (2001)] Dzalilov, Z., Ivanov, A.F. and Rubinov, A.M. (2001) Difference inclusions with delay of economic growth, Dynam. Systems Appl., Vol. 10, pp. 283–293. [Dzalilov et al. (1998)] Dzalilov, Z., Rubinov, A.M. and Kloeden, P.E. (1998) Lyapunov sequences and a turnpike theorem without convexity, Set-Valued Analysis, Vol. 6, pp. 277–302. [Leizarowitz (1985)] Leizarowitz, A. (1985) Infinite horizon autonomous systems with unbounded cost, Appl. Math. and Opt., Vol. 13, pp. 19–43. [Leizarowitz (1989)] Leizarowitz, A. (1989) Optimal trajectories on infinite horizon deterministic control systems, Appl. Math. and Opt., Vol. 19, pp. 11–32. [Leizarowitz and Mizel (1989)] Leizarowitz, A. and Mizel, V.J. (1989) One dimensional infinite horizon variational problems arising in continuum mechanics, Arch. Rational Mech. Anal., Vol. 106, pp. 161–194.
156
A.J. Zaslavski
[Makarov, Levin and Rubinov (1995)] Makarov, V.L, Levin, M.J. and Rubinov, A.M. (1995) Mathematical economic theory: pure and mixed types of economic mechanisms, North-Holland, Amsterdam. [Makarov and Rubinov (1973)] Makarov, V.L. and Rubinov, A.M. (1973) Mathematical theory of economic dynamics and equilibria, Nauka, Moscow, English trans. (1977): Springer-Verlag, New York. [Mamedov and Pehlivan (2000)] Mamedov, M.A. and Pehlivan, S. (2000) Statistical convergence of optimal paths, Math. Japon., Vol. 52, pp. 51–55. [Mamedov and Pehlivan (2001)] Mamedov, M.A. and Pehlivan, S. (2001) Statistical cluster points and turnpike theorem in nonconvex problems, J. Math. Anal. Appl., Vol. 256, pp. 686–693. [Marcus and Zaslavski (1999)] Marcus, M. and Zaslavski, A.J. (1999) The structure of extremals of a class of second order variational problems, Ann. Inst. H. Poincare, Anal. non lineare, Vol. 16, pp. 593–629. [McKenzie (1976)] McKenzie, L.W. (1976) Turnpike theory, Econometrica, Vol. 44, pp. 841–866. [Radner (1961)] Radner, R. (1961) Path of economic growth that are optimal with regard only to final states; a turnpike theorem, Rev. Econom. Stud., Vol. 28, pp. 98–104. [Rubinov (1980)] Rubinov, A.M. (1980) Superlinear multivalued mappings and their applications to problems of mathematical economics, Nauka, Leningrad. [Rubinov (1984)] Rubinov, A.M. (1984) Economic dynamics, J. Soviet Math., Vol. 26, pp. 1975–2012. [Samuelson (1965)] Samuelson, P.A. (1965) A catenary turnpike theorem involving consumption and the golden rule, American Economic Review, Vol. 55, pp. 486–496. [Zaslavski (1995)] Zaslavski, A.J. (1995) Optimal programs on infinite horizon, 1 and 2, SIAM Journal on Control and Optimization, Vol. 33, pp. 1643–1686. [Zaslavski (1996)] Zaslavski, A.J. (1996) Dynamic properties of optimal solutions of variational problems, Nonlinear Analysis: Theory, Methods and Applications, Vol. 27, pp. 895–932. [Zaslavski (2000)] Zaslavski, A.J. (2000) Turnpike theorem for nonautonomous infinite dimensional discrete-time control systems, Optimization, Vol. 48, pp. 69–92.
Chapter 8
Mond–Weir Duality
B. Mond
Department of Mathematics and Statistical Sciences, La Trobe University, Victoria 3086, Australia; Department of Mathematics and Statistics, University of Melbourne, Victoria 3052, Australia
e-mail: [email protected] and [email protected]
Abstract Consider the nonlinear programming problem to minimize f (x) subject to g(x) ≤ 0. The initial dual to this problem given by Wolfe required that all the functions be convex. Since that time there have been many extensions that allowed the weakening of the convexity conditions. These generalizations include pseudo- and quasi-convexity, invexity, and second order convexity. Another approach is that of Mond and Weir who modified the dual problem so as to weaken the convexity requirements. Here we summarize and compare some of these different approaches. It will also be pointed out how the two different dual problems (those of Wolfe and Mond–Weir) can be combined. Some applications, particularly to fractional programming, will be discussed. Key words: Mond–Weir dual, linear programming, Wolfe dual
8.1 Preliminaries

One of the most interesting and useful aspects of linear programming is duality theory. Thus to the problem

minimize c^t x subject to Ax ≥ b, x ≥ 0

(when A is an m × n matrix) there corresponds the dual problem

maximize b^t y subject to A^t y ≤ c, y ≥ 0.

Duality theory says that for any feasible x and y, c^t x ≥ b^t y; and, if x_0 is optimal for the primal problem, there exists an optimal y_0 of the dual and c^t x_0 = b^t y_0.
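A small numerical check of this primal-dual relationship can be run with an off-the-shelf LP solver. The data below are made up purely for illustration; the equality of the two optimal values is the point.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data (assumed): minimize c^t x  s.t.  A x >= b,  x >= 0.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0])

# Primal in linprog's standard form (A_ub x <= b_ub):  -A x <= -b.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
# Dual: maximize b^t y  s.t.  A^t y <= c, y >= 0  (minimize -b^t y).
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)   # equal optimal values (strong duality)
```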
The first extension of this duality theory was to quadratic programming. In [4], Dorn considered the quadratic program with linear constraints

minimize ½ x^t C x + p^t x subject to Ax ≥ b, x ≥ 0,

where C is a positive semi-definite matrix. Dorn [4] showed that this problem was dual to the quadratic problem

maximize −½ u^t C u + b^t y subject to A^t y ≤ Cu + p, y ≥ 0.

Weak duality holds and if x_0 is optimal for the primal, there exists y_0 such that (u = x_0, y_0) is optimal for the dual with equality of objective functions. The requirement that C be positive semi-definite ensures that the objective function is convex.
8.2 Convexity and Wolfe duality

Let f be a function from R^n into R and 0 ≤ λ ≤ 1. f is said to be convex if for all x, y ∈ R^n,

f(λy + (1 − λ)x) ≤ λf(y) + (1 − λ)f(x).

If f is differentiable, this is equivalent to

f(y) − f(x) ≥ (y − x)^t ∇f(x).   (8.1)

Although there are other characterizations of convex functions, we shall assume that all functions have a continuous derivative and shall use the description (8.1) for convex functions.

The extension of duality theory to convex non-linear programming problems (with convex constraints) was first given by Wolfe [17]. He considered the problem

(P)  minimize f(x) subject to g(x) ≤ 0,

and proposed the dual

(WD)  maximize f(u) + y^t g(u) subject to ∇f(u) + ∇y^t g(u) = 0, y ≥ 0.

Assuming f and g are convex, weak duality holds since, for feasible x and (u, y),

f(x) − f(u) − y^t g(u) ≥ (x − u)^t ∇f(u) − y^t g(u) = −(x − u)^t ∇y^t g(u) − y^t g(u) ≥ −y^t g(x) ≥ 0.

He also showed that if x_0 is optimal for (P) and a constraint qualification is satisfied, then there exists y_0 such that (u = x_0, y_0) is optimal for (WD) and for these optimal vectors the objective functions are equal.
8.3 Fractional programming and some extensions of convexity

Simultaneous to the development of convex programming, there was consideration of the fractional programming problem

(FP)  minimize f(x)/g(x),  (g(x) > 0),  subject to h(x) ≤ 0,

and in particular, the linear fractional programming problem

(LFP)  minimize (c^t x + α)/(d^t x + β),  (d^t x + β > 0),  subject to Ax ≥ b, x ≥ 0.

It was noted (see, e.g., Martos [9]) that many of the features of linear programming, such as duality and the simplex method, are easily adapted to linear fractional programming, although the objective function is not convex. This led to consideration of weaker than convexity conditions. Mangasarian [6] defined pseudo-convex functions, which satisfy

(y − x)^t ∇f(x) ≥ 0 =⇒ f(y) − f(x) ≥ 0.

Also useful are quasi-convex functions, which satisfy

f(y) − f(x) ≤ 0 =⇒ (y − x)^t ∇f(x) ≤ 0.

It can be shown [6] that if f is convex, ≥ 0 and g concave, > 0 (or f convex, g linear > 0) then f/g is pseudo-convex. It follows from this that the objective function in the (LFP) is pseudo-convex.

Mangasarian [7] points out that whereas some results (such as sufficiency and converse duality) hold if, in (P), f is only pseudo-convex and g quasi-convex, Wolfe duality does not hold for such functions. One example is the following: minimize x³ + x subject to x ≥ 1, which has the optimal value 2 at x = 1. The Wolfe dual

maximize u³ + u + y(1 − u) subject to 3u² + 1 − y = 0, y ≥ 0,

can be shown to tend to +∞ as u → −∞. One of the reasons that, in Wolfe duality, convexity cannot be weakened to pseudo-convexity is that, unlike for convex functions, the sum of two pseudo-convex functions need not be pseudo-convex.

It is easy to see, however, that duality between (P) and the Wolfe dual (WD) does hold if the Lagrangian f + y^t g (y ≥ 0) is pseudo-convex. We show that weak duality holds. Since ∇[f(u) + y^t g(u)] = 0, we have

(x − u)^t ∇[f(u) + y^t g(u)] ≥ 0 =⇒ f(x) + y^t g(x) ≥ f(u) + y^t g(u).

Now since y^t g(x) ≤ 0, f(x) ≥ f(u) + y^t g(u).
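The failure of Wolfe duality in the example just given is easy to observe numerically. The sketch below simply evaluates the Wolfe dual objective along the feasible curve y = 3u² + 1 for increasingly negative u; the particular sample points are arbitrary.

```python
import numpy as np

# Wolfe dual of  min x^3 + x  s.t.  x >= 1  (i.e. g(x) = 1 - x <= 0):
#   maximize  u^3 + u + y(1 - u)   s.t.  3u^2 + 1 - y = 0,  y >= 0.
# Eliminating y = 3u^2 + 1 gives the dual objective below; it grows without
# bound as u -> -infinity, so Wolfe duality fails for this problem.
def wolfe_dual_objective(u):
    y = 3 * u**2 + 1
    return u**3 + u + y * (1 - u)

for u in [-1.0, -10.0, -100.0]:
    print(u, wolfe_dual_objective(u))   # values increase without bound
```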
8.4 Mond–Weir dual

In order to weaken the convexity requirements, Mond and Weir [12] proposed a different dual to (P):

(MWD)  maximize f(u) subject to ∇f(u) + ∇y^t g(u) = 0,  y^t g(u) ≥ 0,  y ≥ 0.

Theorem 1 (Weak duality). If f is pseudo-convex and y^t g is quasi-convex, then f(x) ≥ f(u).

Proof. y^t g(x) − y^t g(u) ≤ 0 =⇒ (x − u)^t ∇y^t g(u) ≤ 0, hence −(x − u)^t ∇f(u) ≤ 0, that is, (x − u)^t ∇f(u) ≥ 0 =⇒ f(x) ≥ f(u).

It is easy to see that if also x_0 is optimal for (P) and a constraint qualification is satisfied, then there exists a y_0 such that (u = x_0, y_0) is optimal for (MWD) with equality of objective functions.

Consider again the problem minimize x³ + x subject to x ≥ 1, to which Wolfe duality does not apply. The corresponding Mond–Weir dual

maximize u³ + u subject to 3u² + 1 − y = 0, y(1 − u) ≥ 0, y ≥ 0,

has an optimum at u = 1, y = 4 with optimum value equal to 2.

Although many variants of (MWD) are possible (see [12]), we give a dual that can be regarded as a combination of (WD) and (MWD). Let M = {1, 2, . . . , m} and I ⊆ M:

maximize f(u) + \sum_{i∈I} y_i g_i(u)
subject to ∇f(u) + ∇y^t g(u) = 0,  y ≥ 0,  \sum_{i∈M\I} y_i g_i(u) ≥ 0.

Weak duality holds if f + \sum_{i∈I} y_i g_i is pseudo-convex and \sum_{i∈M\I} y_i g_i is quasi-convex.
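A crude numerical check of the Mond–Weir dual for the same example confirms the value 2 reported above. The grid search below is purely illustrative; it is not how such duals would be solved in practice.

```python
import numpy as np

# Mond–Weir dual of  min x^3 + x  s.t.  x >= 1:
#   maximize u^3 + u  s.t.  3u^2 + 1 - y = 0,  y(1 - u) >= 0,  y >= 0.
# A coarse grid search over feasible (u, y) pairs illustrates that the dual
# optimum equals the primal optimum 2, attained at u = 1 (with y = 4).
best = -np.inf
for u in np.linspace(-3.0, 3.0, 6001):
    y = 3 * u**2 + 1                      # equality constraint
    if y >= 0 and y * (1 - u) >= -1e-12:  # remaining dual constraints
        best = max(best, u**3 + u)
print(best)   # approximately 2
```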
8.5 Applications

Since Mond–Weir duality holds when the objective function is pseudo-convex but not convex, it is natural to apply the duality results to the fractional programming problem (FP). Thus if f ≥ 0 is convex, g > 0 is concave and y^t h is quasi-convex, then weak duality holds between (FP) and the following problem:

maximize f(u)/g(u)
subject to ∇[f(u)/g(u) + y^t h(u)] = 0,  y^t h(u) ≥ 0,  y ≥ 0.

Other fractional programming duals can be found in Bector [1] and Schaible [15, 16]. Instead of the problem (FP), Bector [1] considered the equivalent problem

minimize f(x)/g(x) subject to h(x)/g(x) ≤ 0.

Here the Lagrangian is [f(x) + y^t h(x)]/g(x), and it is pseudo-convex if f and h are convex, g is concave > 0 and f + y^t h ≥ 0 (unless g is linear). Thus his dual to (FP) is

maximize [f(u) + y^t h(u)]/g(u)
subject to ∇[(f(u) + y^t h(u))/g(u)] = 0,  f(u) + y^t h(u) ≥ 0,  y ≥ 0.

Duality holds if f is convex, g is concave > 0, and h is convex. A dual that combines the fractional dual of Mond–Weir and that of Bector can be found in [13].

Schaible [15, 16] gave the following dual to (FP):

maximize λ
subject to ∇f(u) − λ∇g(u) + ∇y^t h(u) = 0,  f(u) − λg(u) + y^t h(u) ≥ 0,  y ≥ 0,  λ ≥ 0.

Duality holds if f is convex, ≥ 0, g concave, > 0 and h is convex. A Mond–Weir version of the Schaible dual is the following:

maximize λ
subject to ∇f(u) − λ∇g(u) + ∇y^t h(u) = 0,  f(u) − λg(u) ≥ 0,  y^t h(u) ≥ 0,  λ ≥ 0,  y ≥ 0.

Here duality holds if f is convex and nonnegative, g concave and strictly positive, and y^t h is quasiconvex.

A dual that is a combination of the last two is the following:

maximize λ
subject to ∇f(u) − λ∇g(u) + ∇y^t h(u) = 0,  f(u) − λg(u) + \sum_{i∈I} y_i h_i(u) ≥ 0,  \sum_{i∈M\I} y_i h_i(u) ≥ 0,  λ ≥ 0,  y ≥ 0.

Here \sum_{i∈M\I} y_i h_i(u) need only be quasi-convex for duality to hold.

A fractional programming problem where Bector and Schaible duality do not hold but the Mond–Weir fractional programming duals are applicable is the following:

minimize_{x>0} −1/x subject to x³ ≥ 1.

Here neither the Bector nor the Schaible dual is applicable. The Mond–Weir Bector-type dual is

maximize_{u>0} −1/u
subject to y = 1/(3u⁴),  y(1 − u³) ≥ 0,  y ≥ 0.

The maximum value −1 is attained at u = 1, y = 1/3. The Mond–Weir Schaible-type dual is

maximize λ
subject to −λ − 3yu² = 0,  −1 − λu ≥ 0,  y(1 − u³) ≥ 0,  y ≥ 0.

The maximum is attained at u = 1, y = 1/3, λ = −1.
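The Mond–Weir Bector-type dual in this example can be checked in the same spirit. The sketch below scans a grid of u values (an arbitrary choice) and recovers the optimal value −1 at u = 1.

```python
import numpy as np

# Numeric check of the Mond–Weir (Bector-type) fractional dual above:
#   maximize -1/u  s.t.  y = 1/(3u^4),  y(1 - u^3) >= 0,  y >= 0,  u > 0.
# Since y > 0 automatically, feasibility forces u <= 1, so the best value of
# -1/u over (0, 1] is -1, attained at u = 1 (matching the primal optimum).
best, arg = -np.inf, None
for u in np.linspace(0.1, 2.0, 1901):   # grid step 0.001, includes u = 1
    y = 1.0 / (3.0 * u**4)
    if y * (1.0 - u**3) >= -1e-12 and -1.0 / u > best:
        best, arg = -1.0 / u, u
print(best, arg)   # approximately -1 at u = 1
```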
8.6 Second order duality

Mangasarian [8] proposed the following second order dual to (P):

(MD)  maximize f(u) + y^t g(u) − ½ p^t [∇²f(u) + ∇²y^t g(u)] p
subject to ∇y^t g(u) + ∇²y^t g(u)p + ∇f(u) + ∇²f(u)p = 0,  y ≥ 0.

In [11] Mond established weak duality between (P) and (MD) under the following conditions: for all x, u, p,

f(x) − f(u) ≥ (x − u)^t ∇f(u) + (x − u)^t ∇²f(u)p − ½ p^t ∇²f(u)p

(subsequently called second order convex by Mahajan [5]) and

g_i(x) − g_i(u) ≥ (x − u)^t ∇g_i(u) + (x − u)^t ∇²g_i(u)p − ½ p^t ∇²g_i(u)p,   i = 1, . . . , m.

The second order convexity requirements can be weakened by suitably modifying the dual. The corresponding dual is

(MWSD)  maximize f(u) − ½ p^t ∇²f(u)p
subject to ∇y^t g(u) + ∇²y^t g(u)p + ∇f(u) + ∇²f(u)p = 0,
y^t g(u) − ½ p^t [∇²y^t g(u)] p ≥ 0,  y ≥ 0.

Weak duality holds between (P) and (MWSD) if f satisfies

(x − u)^t ∇f(u) + (x − u)^t ∇²f(u)p ≥ 0 =⇒ f(x) ≥ f(u) − ½ p^t ∇²f(u)p

(called second order pseudo-convex) and y^t g satisfies

y^t g(x) − y^t g(u) + ½ p^t ∇²y^t g(u)p ≤ 0 =⇒ (x − u)^t ∇y^t g(u) + (x − u)^t ∇²[y^t g(u)]p ≤ 0

(called second order quasi-convex). Other second order duals can be found in [13].
8.7 Symmetric duality

In [3], Dantzig, Eisenberg and Cottle formulated the following pair of symmetric dual problems:

(SP)  minimize K(x, y) − y^t ∇₂K(x, y)  subject to  −∇₂K(x, y) ≥ 0,  x ≥ 0;
(SD)  maximize K(u, v) − u^t ∇₁K(u, v)  subject to  −∇₁K(u, v) ≤ 0,  v ≥ 0.

Weak duality holds if K is convex in x for fixed y and concave in y for fixed x.

In [12] Mond and Weir considered the possibility of weakening the convexity and concavity requirements by modifying the symmetric dual problems. They proposed the following:

(MWSP)  minimize K(x, y)  subject to  −∇₂K(x, y) ≥ 0,  −y^t ∇₂K(x, y) ≤ 0,  x ≥ 0;
(MWSD)  maximize K(u, v)  subject to  −∇₁K(u, v) ≤ 0,  −u^t ∇₁K(u, v) ≥ 0,  v ≥ 0.

Weak duality holds if K is pseudo-convex in x for fixed y and pseudo-concave in y for fixed x.

Proof. (x − u)^t ∇₁K(u, v) ≥ 0 =⇒ K(x, v) ≥ K(u, v); (v − y)^t ∇₂K(x, y) ≤ 0 =⇒ K(x, v) ≤ K(x, y); therefore K(x, y) ≥ K(u, v).

Once symmetric duality was shown to hold with only pseudo-convex and pseudo-concave requirements, it was tempting to try to establish a pair of symmetric dual fractional problems. Such a pair is given in [2]:

minimize φ(x, y)/ψ(x, y)
subject to ψ(x, y)∇₂φ(x, y) − φ(x, y)∇₂ψ(x, y) ≤ 0,
y^t [ψ(x, y)∇₂φ(x, y) − φ(x, y)∇₂ψ(x, y)] ≥ 0,  x ≥ 0;

maximize φ(u, v)/ψ(u, v)
subject to ψ(u, v)∇₁φ(u, v) − φ(u, v)∇₁ψ(u, v) ≥ 0,
u^t [ψ(u, v)∇₁φ(u, v) − φ(u, v)∇₁ψ(u, v)] ≤ 0,  v ≥ 0.

Assuming that φ(·, y) and ψ(x, ·) are convex while φ(x, ·) and ψ(·, y) are concave, the objective function is pseudo-convex in x for fixed y and pseudo-concave in y for fixed x. In this case weak duality holds, i.e., for feasible (x, y) and (u, v),

φ(x, y)/ψ(x, y) ≥ φ(u, v)/ψ(u, v).

Finally we point out that Mond–Weir duality has been found to be useful and applicable in a great many different contexts. A recent check of Math Reviews showed 112 papers where the term Mond–Weir is used either in the title or in the abstract. Seventy-eight of these papers are listed in [10].
References

1. C.R. Bector, Duality in Nonlinear Programming, Z. Oper. Res., 59 (1973), 183–193.
2. S. Chandra, B. D. Craven and B. Mond, Symmetric Dual Fractional Programming, Z. Oper. Res., 29 (1985), 59–64.
3. G.G. Dantzig, E. Eisenberg and R.W. Cottle, Symmetric Dual Nonlinear Programs, Pacific J. Math., 15 (1965), 809–812.
4. W.S. Dorn, Duality in Quadratic Programming, Quart. Appl. Math., 18 (1960), 155–162.
5. D.G. Mahajan, Contributions to Optimality Conditions and Duality Theory in Nonlinear Programming, PhD Thesis, IIT Bombay, India, 1977.
6. O.L. Mangasarian, Pseudo-convex Functions, SIAM J. Control, 3 (1965), 281–290.
7. O.L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969.
8. O.L. Mangasarian, Second and Higher-order Duality in Nonlinear Programming, J. Math. Anal. Appl., 51 (1975), 607–620.
9. B. Martos, Nonlinear Programming; Theory and Methods, North Holland Pub. Co., Amsterdam, 1975.
10. B. Mond, What is Mond-Weir Duality?, in Recent Developments in Operational Research, Manja Lata Agarwal and Kanwar Sen, Editors, Narosa Publishing House, New Delhi, India, 2001, 297–303.
11. B. Mond, Second Order Duality for Non-linear Programs, Opsearch, 11 (1974), 90–99.
12. B. Mond and T. Weir, Generalized concavity and duality, in Generalized Concavity in Optimization and Economics, S. Schaible and W.T. Ziemba, Editors, Academic Press, New York, 1981, 263–279.
13. B. Mond and T. Weir, Duality for Fractional Programming with Generalized Convexity Conditions, J. Inf. Opt. Sci., 3 (1982), 105–124.
14. B. Mond and T. Weir, Generalized Convexity and Higher Order Duality, J. Math. Sci., 16–18 (1981–83), 74–94.
15. S. Schaible, Duality in Fractional Programming: A Unified Approach, Oper. Res., 24 (1976), 452–461.
16. S. Schaible, Fractional Programming I, Duality, Man. Sci., 22 (1976), 858–867.
17. P. Wolfe, A Duality Theorem for Nonlinear Programming, Quart. Appl. Math., 19 (1961), 239–244.
Chapter 9
Computing the fundamental matrix of an M/G/1-type Markov chain
Emma Hunt
School of Mathematical Sciences & School of Economics, The University of Adelaide, Adelaide SA 5005, Australia
e-mail: [email protected]
Abstract A treatment is given of a probabilistic approach, Algorithm H, to the determination of the fundamental matrix of a block-structured M/G/1– type Markov chain. Comparison is made with the cyclic reduction algorithm. Key words: Block Markov chain, fundamental matrix, Algorithm H, convergence rates, LR Algorithm, CR Algorithm
9.1 Introduction

By a partitioned or block-M/G/1 Markov chain we mean a Markov chain with transition matrix of block-partitioned form

P = \begin{bmatrix} B_1 & B_2 & B_3 & B_4 & \dots \\ A_0 & A_1 & A_2 & A_3 & \dots \\ 0 & A_0 & A_1 & A_2 & \dots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix},

where each block is k × k, say. We restrict attention to the case where the chain is irreducible but do not suppose positive recurrence. If the states are partitioned conformably with the blocks, then the states corresponding to block ℓ (ℓ ≥ 0) are said to make up level ℓ and to constitute the phases of level ℓ. The j-th phase of level ℓ will be denoted (ℓ, j).

In [18] Neuts noted a variety of special cases of the block-M/G/1 Markov chain which occur as models in various applications in the literature, such as Bailey's bulk queue (pp. 66–69) and the Odoom–Lloyd–Ali Khan–Gani dam (pp. 69–71 and 348–353).

For applications, the most basic problem concerning the block-M/G/1 Markov chain is finding the invariant probability measure in the positive recurrent case. We express this measure as π = (π_0, π_1, . . .), the components π_i being k-dimensional vectors so that π is partitioned conformably with the structure of P. An efficient and stable method of determining π has been devised by Ramaswami [20] based on a matrix version of Burke's formula. The key ingredient here is the fundamental matrix, G. This arises as follows.

Denote by G_{r,ℓ} (1 ≤ r, ℓ ≤ k) the probability, given the chain begins in state (i+1, r), that it subsequently reaches level i ≥ 0 and that it first does so by entering the state (i, ℓ). By the homogeneity of the transition probabilities in levels one and above, plus the fact that trajectories are skip-free downwards in levels, the probability G_{r,ℓ} is well defined and independent of i. The fundamental matrix G is defined by G = (G_{r,ℓ}). A central property of G, of which we shall make repeated use, is that it is the smallest nonnegative solution to the matrix equation

G = A(G) := \sum_{i=0}^{\infty} A_i G^i,   (9.1)

where we define G^0 = I [9, Sections 2.2, 2.3]. Now define

A_i^* = \sum_{j=i}^{\infty} A_j G^{j-i},   B_i^* = \sum_{j=i}^{\infty} B_{j+1} G^{j-i},   i ≥ 1.   (9.2)

The Ramaswami generalization of the Burke formula to block-M/G/1 Markov chains is as follows (Neuts [18, Theorem 3.2.5]).

Theorem A. For a positive recurrent block-M/G/1 chain P, the matrix I − A_1^* is invertible and the invariant measure of P satisfies

π_i = [ π_0 B_i^* + \sum_{j=1}^{i-1} π_j A_{i+1-j}^* ] (I − A_1^*)^{-1}   (i ≥ 1).

The determination of π_0 is discussed in Neuts [18, Section 3.3]. The theorem may then be used to derive the vectors π_i once G is known. Thus the availability of efficient numerical methods for computing the matrix G is crucial for the calculation of the invariant measure.

Different algorithms for computing the minimal nonnegative solution of (9.1) have been proposed and analyzed by several authors. Many of them arise from functional iteration techniques based on manipulations of (9.1). For instance, in Ramaswami [19] the iteration

X_{j+1} = \sum_{i=0}^{\infty} A_i X_j^i,   X_0 = 0,   (9.3)

was considered. Similar techniques, based on the recurrences

X_{j+1} = (I − A_1)^{-1} A_0 + \sum_{i=2}^{\infty} (I − A_1)^{-1} A_i X_j^i   (9.4)

or

X_{j+1} = ( I − \sum_{i=1}^{\infty} A_i X_j^{i-1} )^{-1} A_0,   (9.5)

were introduced in Neuts [18], Latouche [12] in order to speed up the convergence. However, the convergence of these numerical schemes still remains linear. In Latouche [13] a Newton iteration was introduced in order to arrive at a quadratic convergence, with an increase in the computational cost. In Latouche and Stewart [15] the approximation of G was reduced to solving nested finite systems of linear equations associated with the matrix P by means of a doubling technique. In this way the solution for the matrix P is approximated with the solution of the problem obtained by cutting the infinite block matrix P to a suitable finite block size n.

In this chapter we present a probabilistic algorithm, Algorithm H, for the determination of the fundamental matrix G in a structured M/G/1 Markov chain. An account of the basic idea is given by the author in [11]. Algorithm H is developed in the following three sections. In Section 5 we then consider an alternative approach to the calculation of G. This turns out to be rather more complicated than Algorithm H in general. We do not directly employ this algorithm, and we do not detail all its steps. However it serves several purposes. First, we show that it and Algorithm H possess an interlacing property. This enables us to use it to obtain (in Section 7) information about the convergence rate of Algorithm H. The alternative algorithm reduces to Algorithm LR, the logarithmic-reduction algorithm of Latouche and Ramaswami [14], in the quasi birth-death (QBD) case. Thus for a QBD the interlacing property holds for it and Algorithm LR. This we consider in Section 8.

In Section 9 we address the relation between Algorithm H and Bini and Meini's cyclic reduction algorithm, Algorithm CR. Algorithm CR was developed and refined in a chain of articles that provided a considerable improvement over earlier work. See in particular [3]–[10]. We show that Algorithm H becomes Algorithm CR under the conditions for which the latter has been established. Algorithm H is seen to hold under more general conditions than Algorithm CR. It follows from our discussion that, despite a statement to the contrary in [8], Algorithm CR is different from Algorithm LR in the QBD case.
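For orientation, the natural iteration (9.3) is simple to implement once the series in (9.1) is truncated to finitely many blocks. The sketch below makes that truncation as an assumption (the chapter itself does not truncate) and uses made-up 2 × 2 blocks; it is not Algorithm H.

```python
import numpy as np

def natural_iteration(A, tol=1e-12, max_iter=10_000):
    """Approximate the minimal nonnegative solution G of G = sum_i A_i G^i
    by the functional iteration (9.3) with X_0 = 0.

    `A` is a list [A_0, A_1, ..., A_K] of k x k nonnegative blocks; truncating
    the series at K is an assumption made for this sketch.  Convergence is
    monotone but only linear, as noted in the text.
    """
    k = A[0].shape[0]
    X = np.zeros((k, k))
    for _ in range(max_iter):
        power = np.eye(k)                 # X^0
        X_new = np.zeros((k, k))
        for A_i in A:
            X_new += A_i @ power
            power = power @ X
        if np.max(np.abs(X_new - X)) < tol:
            return X_new
        X = X_new
    return X

# toy QBD-like example (illustrative blocks only)
A0 = np.array([[0.3, 0.1], [0.2, 0.2]])
A1 = np.array([[0.2, 0.1], [0.1, 0.2]])
A2 = np.array([[0.2, 0.1], [0.1, 0.2]])
G = natural_iteration([A0, A1, A2])
print(np.allclose(G, A0 + A1 @ G + A2 @ G @ G))   # fixed-point check
```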
9.2 Algorithm H: Preliminaries

It proves convenient to label the levels of the chain A as −1, 0, 1, 2, . . ., so that A is homogeneous in the one-step transition probabilities out of all nonnegative levels. Thus the matrix G gives the probabilities relating to first transitions into level −1, given the process starts in level 0. Since we are concerned with the process only up to the first transition into level −1, we may without loss of generality change the transition probabilities out of level −1 and make each phase of level −1 absorbing. That is, we replace our chain A with a chain A with levels −1, 0, 1, 2, . . . and structured one-step transition matrix

P = \begin{bmatrix} I & 0 & 0 & \dots \\ A_0 & A_1 & A_2 & \dots \\ 0 & A_0 & A_1 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.

Most of our analysis will be in terms of the (substochastic) subchain A0 with levels 0, 1, 2, . . . and structured one-step transition matrix

P^{(0)} = \begin{bmatrix} A_1 & A_2 & A_3 & \dots \\ A_0 & A_1 & A_2 & \dots \\ 0 & A_0 & A_1 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.

The assumption that A is irreducible entails that every state in a nonnegative-labeled level of A has access to level −1. Hence all the states of A0 are transient or ephemeral.

For t ≥ 0, let X_t denote the state of A0 at time t and let the random variable Y_t represent the level of A0 at time t. For r, s ∈ K := {1, 2, . . . , k} we define

U_{r,s} := P( \bigcup_{t>0} { X_t = (0, s),  Y_u > 0  (0 < u < t) } | X_0 = (0, r) ).

Thus U_{r,s} is the probability that, starting in (0, r), the process A0 revisits level 0 at some subsequent time and does so with first entry into state (0, s). The matrix U := (U_{r,s}) may be regarded as the one-step transition matrix of a Markov chain U on the finite state space K. The chain U is a censoring of A0 in which the latter is observed only on visits to level zero. No state of U is recurrent, for if r ∈ K were recurrent then the state (0, r) in A0 would be recurrent, which is a contradiction. Since no state of U is recurrent, I − U is invertible and

\sum_{i=0}^{\infty} U^i = (I − U)^{-1}.

The matrix U is also strictly substochastic, that is, at least one row sum is strictly less than unity.

In A, any path whose probability contributes to G_{r,s} begins in (0, r), makes some number n ≥ 0 of revisits to level 0 with −1 as a taboo level, and then takes a final step to (−1, s). Suppose the final step to (−1, s) is taken from (0, m). Allowing for all possible choices of m, we derive

G_{r,s} = \sum_{m∈K} ( \sum_{i=0}^{\infty} U^i )_{r,m} (A_0)_{m,s},

so that

G = \sum_{i=0}^{\infty} U^i A_0 = (I − U)^{-1} A_0.

Our strategy for finding G is to proceed via the determination of U. For ℓ ≥ 0, we write U(ℓ) for the matrix whose entries are given by

(U(ℓ))_{r,s} := P( \bigcup_{t>0} { X_t = (0, s),  0 < Y_u < ℓ  (0 < u < t) } | X_0 = (0, r) )

for r, s ∈ K. Thus U(ℓ) corresponds to U when the trajectories in A0 are further restricted not to reach level ℓ or higher before a first return to level 0. We may argue as above that I − U(ℓ) is invertible and that

[I − U(ℓ)]^{-1} = \sum_{i=0}^{\infty} (U(ℓ))^i.

Further, since U is finite, U(ℓ) ↑ U as ℓ → ∞ and

[I − U(ℓ)]^{-1} ↑ [I − U]^{-1} as ℓ → ∞.

The probabilistic construction we are about to detail involves the exact algorithmic determination (to machine precision) of U(ℓ) for ℓ of the form 2^N with N a nonnegative integer. This leads to an approximation

T_N := [ I − U(2^N) ]^{-1} A_0

for G. We have T_N ↑ G as N → ∞. The matrix T_N may be interpreted as the contribution to G from those trajectories from level 0 to level −1 in A that are restricted to pass through only levels below 2^N.
172
E. Hunt
9.3 Probabilistic construction We construct a sequence (Aj )j≥0 of censored processes, each of which has the nonnegative integers as its levels. For j ≥ 1, the levels 0, 1, 2, . . . of Aj are respectively the levels 0, 2, 4, . . . of Aj−1 , that is, Aj is Aj−1 censored to be observed in even-labeled levels only. Thus Aj is a process that has been censored j times. By the homogeneity of one-step transitions out of level 1 and higher levels, Aj has a structured one-step transition matrix ⎤ ⎡ (j) (j) (j) B1 B2 B3 . . . ⎥ ⎢ (j) (j) (j) ⎢ A0 A1 A2 . . . ⎥ (j) ⎥, ⎢ P =⎢ (j) (j) ⎥ ⎣ 0 A0 A1 . . . ⎦ .. .. .. . . . . . . (0)
that is, each chain Aj is of structured M/G/1 type. We have Bi = Ai for (0) i ≥ 1 and Ai = Ai for i ≥ 0. In the previous section we saw that A0 contains no recurrent states, so the same must be true also for the censorings A1 , A2 , ... . The substochastic (j) (j) matrices B1 , A1 , formed by censoring Aj to be observed only in levels 0 (j) and 1 respectively, thus also contain no recurrent states. Hence I − B1 and (j) I − A1 are both invertible. We now consider the question of deriving the block entries in P (j+1) from (j) (j) those in P (j) . First we extend our earlier notation and write Xt , Yt respectively for the state and level of Aj at time t ∈ {0, 1, . . .}. For h a nonnegative (j) integer, define the event Ωs,t,h by 6 7 (j) (j) (j) Ωs,t,h = Xt = (h, s), Yu(j) − Y0 is even (0 < u < t) (j+1)
and for n ≥ 0, define the k × k matrix Ln by * % & * / (j) * (j) (j+1) Ln :=P Ωs,t,2+2n * X0 = (2 + 1, r) * r,s t>0
for r, s ∈ K. By the homogeneity in positive-labeled levels of the one-step transition probabilities in Aj , the left-hand side is well defined, that is, the right-hand side is independent of the value of ≥ 0. (j+1) )r,s as the probability, conditional on initial state We may interpret (Ln (2 + 1, r), that the first transition to an even-labeled level is to state (2 + 2n, s). We may express the transitions in Aj+1 in terms of those of Aj and the ma(j+1) trices Ln by an enumeration of possibilities. Suppose i > 0. A single-step transition from state (i, r) to state (i − 1 + n, s) (n ≥ 0) in Aj+1 corresponds
9
Computing the fundamental matrix of an M/G/1–type Markov chain
173
to a transition (possibly multistep) from state (2i, r) to state (2(i − 1 + n), s) in Aj that does not involve passage through any other state with even-labeled (j) level. When n > 0, this may occur in a single step with probability (A2n−1 )r,s . For a multistep transition, we may obtain the probability by conditioning on the first state lying in an odd-labeled level. This gives n
(j)
= A2n−1 + A(j+1) n
(j)
(j+1)
A2m Ln−m
(n ≥ 1).
(9.6)
m=0
For n = 0, there is no single-step transition in Aj producing a drop of two levels, so the leading term disappears to give (j+1)
A0
(j)
(j+1)
= A0 L0
.
(9.7)
A similar argument gives (j)
Bn(j+1) = B2n−1 +
n
(j)
(j+1)
B2m Ln−m
(n ≥ 1)
(9.8)
m=1
for transitions from level 0. (j+1) proceeds in two stages. For n ≥ The determination of the matrices Ln (j+1) by 0, define the k × k matrix Kn * % & * / (j) * (j) (j+1) Kn := P Ωs,t,2+1 * X0 = (2 + 1, r) * r,s t>0
for r, s ∈ K. Again the left-hand side is well defined. We may interpret this as follows. Suppose Aj is initially in state (2 + 1, r). The (r, s) entry in (j+1) Kn is the probability that, at some subsequent time point, Aj is in state (2 + 2n + 1, s) without in the meantime having been in any even–labelled level. (j+1) consists of a sequence of steps each Each path in Aj contributing to Ln of which involves even-sized changes of level, followed by a final step with an odd-sized change of level. Conditioning on the final step yields = L(j+1) n
n
(j)
(j+1) Km A2(n−m) (n ≥ 0).
(9.9)
m=0
To complete the specification of P (j+1) in terms of P (j) , we need to deter(j+1) . We have by definition that mine the matrices Kn
(j+1) K0
%
r,s
=
/ t>0
* *
* (j) (j) Ωs,t,2+1 * X0 *
& = (2 + 1, r) .
174
E. Hunt (j+1)
Since Aj is skip-free to the left, trajectories contributing to K0 change level and so (j+1)
K0
=
∞
(j)
i
A1
−1 (j) = I − A1 .
cannot
(9.10)
i=0 (j+1)
involves at least one step in Aj with an increase in level. For n > 0, Kn Conditioning on the last such step yields the recursive relation Kn(j+1) =
n−1
(j)
(j+1)
(j+1) Km A2(n−m)+1 K0
(n ≥ 1).
(9.11)
m=0
We may also develop a recursion by conditioning on the first such jump between levels. This gives the alternative recursive relation Kn(j+1) :=
n−1
(j+1)
K0
(j)
(j+1) A2(n−m)+1 Km
(n ≥ 1).
(9.12)
m=0
Since level 1 in AN corresponds to level 2N in A0 , paths in AN from (0, r) to (0, s) that stay within level 0 correspond to paths from (0, r) to (0, s) in A0 that do not reach level 2N or higher. Hence
(N ) B1 = U (2N ) r,s r,s
for r, s ∈ K, or
(N )
B1
= U (2N ).
Thus the recursive relations connecting the block entries in P (j+1) to those in P (j) for j = 0, 1, . . . , N − 1 provide the means to determine U (2N ) exactly and so approximate G.
9.4 Algorithm H In the last section we considered the sequence of censored processes (Aj )j≥0 , (N )
each with the nonnegative integers as its levels. The determination of B1 requires only a finite number of the matrix entries in each P (j) to be determined. For the purpose of calculating TN , the relevant parts of the construction of the previous section may be summarized as follows. The algorithm requires initial input of A0 , A1 , . . . , A2N −1 . First we specify Bn(0) = An A(0) n = An
(n = 1, . . . , 2N ), (n = 0, 1, . . . , 2N − 1).
9
Computing the fundamental matrix of an M/G/1–type Markov chain
175
We then determine (j)
(j)
(j)
(j)
B1 , B2 , . . . , B2N −j , A0 , A1 , . . . , A2N −j −1 recursively for j = 1, 2, . . . , N as follows. To obtain the block matrices in Aj+1 from those in Aj , first find the auxiliary quantities (−1 ' (j+1) (j) K0 = I − A1 , Kn(j+1) =
n−1
(j)
(j+1)
(j+1) Km A2(n−m)+1 K0
,
m=0
for n = 1, 2, . . . , 2N −j−1 − 1, and = L(j+1) n
n
(j)
(j+1) Km A2(n−m) ,
m=0
for n = 0, 1, . . . , 2N −j−1 − 1. Calculate (j+1) (j) (j+1) = A0 L0 A0 and (j)
n
(j)
m=1 n
Bn(j+1) = B2n−1 + A(j+1) = A2n−1 + n
(j)
(j+1)
(j)
(j+1)
B2m Ln−m , A2m Ln−m ,
m=0
for n = 1, 2, . . . , 2N −j−1 − 1. (N ) The above suffices for the evaluation of B1 . We then compute (−1 ' (N ) A0 , TN = I − B1 which is an approximation to G incorporating all contributing paths in A that do not involve level 2N (or higher). The algorithm may be specified as a short MATLAB program.
9.5 Algorithm H: Preliminaries We now consider an Algorithm H oriented toward a different way of calculating G. This involves two sequences (Mj )j≥1 , (Nj )j≥1 of censored processes, all with levels −1, 0, 1, 2, . . .. The process Mj has block-structured one-step transition matrix
176
E. Hunt
⎡
P
(j)
⎤
I 0 0 0 ... ⎢ (j) (j) (j) (j) ⎥ ⎢ A0 A1 A2 A3 . . . ⎥ ⎢ ⎥. (j) (j) (j) =⎢ ⎥ ⎣ 0 A0 A1 A2 . . . ⎦ .. .. .. .. . . . . . . .
The process Nj has block-structured one-step transition matrix ⎡ ⎤ I 0 0 0 ... (j) (j) ⎢ (j) ⎥ 0 B2 B3 . . . ⎥ ⎢ B0 (j) ⎥. (j) (j) Q =⎢ ⎢ 0 B0 0 B2 . . . ⎥ ⎣ ⎦ .. .. .. .. . . . . . . . that is, These are set up recursively, beginning with M1 = A, (1)
Ai
:= Ai
(i ≥ 0).
(9.13)
We construct Nj by censoring Mj , observing it only when it is either in level −1 or a change of level occurs. Thus for i = 1, (j)
Bi
( ' (j) −1 (j) = I − A1 Ai .
(9.14)
We form Mj+1 by censoring Nj , observing only the odd-labeled levels −1, 1, 3, 5, ... , and then relabeling these as −1, 0, 1, 2, ... . Thus level ≥ −1 of Mj+1 and Nj+1 corresponds to level 2( + 1) − 1 of Mj and Nj . It follows that level (≥ −1) of Mj and Nj corresponds to level ( + 1)2j−1 − 1 of M1 and N1 . (j+1) (j) (j) We derive the blocks of P from those of Q as follows. Let X t , (j)
Y t denote respectively the state and level of Nj at time t. Following the procedure involved in Algorithm H, define for h a nonnegative integer 6 (j) 7 (j) (j) (j) Ω s,t,h = X t = (k, s), Y u − Y 0 is even (0 < u < t) . (j+1)
The matrices Ln
(j+1) Ln
are then defined for n ≥ 0 by %
r,s
:= P
/ t>0
* *
(j) * (j) Ω s,t,2+2n−1 * X 0
*
& = (2, r)
for r, s ∈ K. As before the right-hand side is independent of > 0. The matrix (j+1) (j+1) Ln plays a similar role to that of Ln for Algorithm H, only here the trajectories utilize only even-labeled levels of Nj except for a final step to an odd-labeled level.
9
Computing the fundamental matrix of an M/G/1–type Markov chain
177
Arguing as before, we derive (j+1)
An
n
(j)
= B 2n−1 +
(j)
(j+1)
B 2m Ln−m
(n > 1)
(9.15)
m=0
with (j+1)
An
=
n
(j)
(j+1)
B 2m Ln−m
(n = 0, 1).
(9.16)
m=0
The derivation is identical to that leading to (9.7) and (9.6). For n = 1 there is no term corresponding to the first term on the right in (9.6) since the present (j)
censoring requires B 1 := 0. The present censoring out of even-labeled, as opposed to odd-labeled, levels means that no analogue to (9.8) is needed. (j+1)
As before we now determine the matrices Ln
(j+1) Kn .
We define
(j+1) Kn
%
:= P
r,s
/
* *
(j) * (j) Ω s,t,2+2n * X 0
*
t>0
in terms of matrices &
= (2, r)
for r, s ∈ K and n ≥ 1. As before the right-hand side is independent of ≥ 0. (j) By analogy with (9.10), we have since B 1 = 0 that ( ' (j) −1 (j+1) = I − B1 = I. K0 By analogy with (9.9), we have (j+1)
Ln
=
n
(j+1)
(j)
Km
B 2(n−m)
m=0 (j)
= B 2n +
n
(j+1)
Km
(j)
B 2(n−m)
(n ≥ 0).
(9.17)
m=1
For n = 0 we adopt the convention of an empty sum being interpreted as zero. Finally we have corresponding to (9.11) that (j+1)
Kn
=
n−1
(j+1)
Km
(j)
B 2(n−m)+1
m=0 (j)
= B 2n+1 +
n−1
(j+1)
Km
(j)
B 2(n−m)+1
(n ≥ 1),
(9.18)
m=1 (j+1)
, where again the empty sum for n = 1 is interpreted as zero. As with Ln the leading term on the right-hand side corresponds to a single-step transition in Nj while the sum incorporates paths involving more than one step in Nj .
178
E. Hunt (N )
As with the computations involved in Algorithm H, B 0 can be calculated in a finite number of steps. We may identify the relevant steps as follows. (1) We require initial input of A0 , A1 , ... , A2N −1 . First we specify An = An N for n = 0, 1, . . . , 2 − 1. We calculate (j)
(j)
(j)
A0 , A1 , . . . , A2N −j −1 , (j)
(j)
(j)
(j)
B 0 , B 2 , B 3 , . . . , B 2N −j −1 recursively for j = 2, . . . , N as follows. We have ( ' (j) (j) −1 (j) B n = I − A1 An n = 0, 2, . . . , 2N −j − 1.
(9.19)
To obtain the matrices (j+1)
A0
(j+1)
, A1
, . . . , A2N −j−1 −1 ,
first evaluate the auxiliary quantities (j+1)
Kn
(j)
= B 2n+1 +
n−1
(j+1)
Km
(j)
(j+1)
A2(n−m)+1 K0
(9.20)
m=1
for n = 1, 2, . . . , 2N −j−1 − 1 and n (j+1) (j) (j+1) (j) Ln = B 2n + K m B 2(n−m)
(9.21)
m=1
n = 0, 1, . . . , 2N −j−1 − 1 and then calculate (j+1)
An
=
n
(j)
(j+1)
B 2m Ln−m
(n = 0, 1),
(9.22)
m=0 (j+1)
An
(j)
= B 2n−1 +
n
(j)
(j+1)
B 2m Ln−m
(n = 2, . . . , 2N −j−1 − 1).
(9.23)
m=0
We shall make use of this construction in subsequent sections.
9.6 H, G and convergence rates For j ≥ 1, define M
(j)
M
by * ⎤ * * (j) := P ⎣ Φs,t ** X 0 = (0, r)⎦ * t≥0 ⎡
(j) r,s
/
(9.24)
9
Computing the fundamental matrix of an M/G/1–type Markov chain
179
for r, s ∈ K, where 8 9 (j) Φs,t = X t = (2j−1 , s), 0 ≤ Y u < 2j − 1, Y u = 2j−1 − 1 (0 < u < t) . We note that this gives M
(1)
= I.
(9.25)
Also for j ≥ 1 we put % & /8 9 ** j−1 j (Vj )r,s = P X t = (−1, s), 2 − 1 ≤ Yt < 2 − 1 *X 0 = (0, r) t>0
for r, s ∈ K. Here Y = max0≤u
For j = 1, we have also that % & /8 9 ** (V1 )r,s := P X t = (−1, s), Y u = 0 (0 ≤ u < t) * X 0 = (0, r)
t>0
= B0
r,s
,
so that
(1)
V1 = B 0 . Proposition 9.6.1 For j ≥ 1, the matrices Vj , M Vj = M
(j)
(j)
B0 .
(9.27) (j)
are related by (9.28)
Proof. By (9.25) and (9.27), the result is immediate for j = 1, so suppose j > 1. Since A is skip-free from above in levels, every trajectory contributing to Vj must at some time pass through level 2j−1 − 1. Conditioning on the first entry into level 2j−1 − 1, we have * % & (j) / (j) ** j−1 (Vj )r,s = P Ψs,t * X 0 = (2 − 1, m) M * r,m t>0 m∈K * ' (j) ( (j) * (j) P X 1 = (−1, s)*X 0 = (0, m) M = m∈K
(j) (j) = M B0
r,m
r,s
,
180
where
E. Hunt
8 9 (j) Ψs,t = X t = (−1, s), 0 ≤ Y u < 2j − 1 (0 < u < t) ,
giving the required result.
Once a convenient recursion is set up for the determination of the matrix (j) M , this may be used to set up Algorithm H for approximating G by use of (9.28). Iteration N of Algorithm H gives the estimate T N :=
N +1
Vj
(9.29)
j=1
for G. The contribution T N is the contribution to G (describing transitions from level 0 to level −1) by paths which which reach a level of at most 2N +1 −2. The estimate TN from the first N iterations of Algorithm H is the contribution from paths which reach a level of 2N +1 − 1 at most. Hence we have the interlacing property T 1 ≤ T1 ≤ T 2 ≤ T2 ≤ T 3 ≤ . . . connecting the successive approximations of G in Algorithms H and H. We have T N ↑ G and TN ↑ G as N → ∞. The interlacing property yields (G − T 1 )e ≥ (G − T1 )e ≥ (G − T 2 )e ≥ (G − T2 )e ≥ . . . or
e − T 1 e ≥ e − T1 e ≥ e − T 2 e ≥ e − T2 e ≥ . . .
in the case of stochastic G. The interlacing property need not carry over to other error measures such as goodness of fit to the equation G = A(G). This will be further discussed in the subsequent partner chapter in this volume. Theorem 9.6.1 If A is transient and irreducible, then T N converges to G quadratically as N → ∞. Proof. If A is transient and irreducible, then the maximal eigenvalue of G is numerically less than unity. We may choose a matrix norm such that 0 < G < ξ < 1. We have * % & (j) / (j) ** B0 =P Ψs,t * X 0 = (2j−1 − 1, r) * r,s t>0 * & % /8 9** j−1 − 1, r) X t = (−1, s) * X 0 = (2 ≤P * t>0 j−1 = G(2 ) . r,s
9
Computing the fundamental matrix of an M/G/1–type Markov chain
181
Choose K ≥ 1 to be an upper bound for the norm of all substochastic k × k matrices. (For some norms K equals 1 will suffice.) Then by (9.28) (j)
Vj ≤ K B0 ≤ K G(2 Hence
j−1
)
< Kξ (2
j−1
)
.
2 2 2 2 2 2 2 2 N ∞ 2 2 2 ∞ 2 j−1 2G − 2 2 2≤ V V Kξ (2 ) ≤ j j 2 2 2 2 2 2 2j+N +1 2 j=N +1 j=1 = Kξ (2
N
)
∞
ξ (2
)
=0
< Kξ
(2N )
/ 1 − ξ2 ,
whence the stated result. Corollary 9.6.1 By the convergence result for T N and the interlacing property, Algorithm H also converges to G quadratically when A is transient and irreducible. (j)
Remark 1. Similarly to the argument for B 0 , we have that * % & (j) / (j) ** j−1 B2 =P Λs,t * X 0 = (2 − 1, r) , * r,s
(9.30)
t>0
where 8 9 (j) Λs,t = X t = (2j − 1, s), 0 ≤ Y u < 3 · 2j−1 − 1 (0 ≤ u < t) . We shall make use of (9.30) in the next section. By the interlacing property, an implementation of Algorithm H would, for the same number of iterations, be no more accurate than Algorithm H. (j) appears to be in general The computation of the auxiliary matrices M quite complicated, so Algorithm H offers no special advantages. However it does have theoretical interest, as with the interlacing property shown above and the consequent information about the convergence rate of Algorithm H. We shall see in the next section that in the special case of a QBD, Algorithm H reduces to the logarithmic-reduction algorithm of Latouche and Ramaswami [14].
9.7 A special case: The QBD There is considerable simplification to Algorithm H in the special case of a QBD, arising from the fact that this is skip-free in levels both from above and from below. We now investigate this situation.
182
E. Hunt (j)
(j+1)
Because An = 0 for n > 2, (9.20) gives K n (j+1)
already seen that K 0 (j+1)
L0
= 0 for n > 0. We have
= I. Relation (9.21) now provides
(j)
(j+1)
= B0 ,
L1
(j)
(j+1)
= B 2 and Ln
= 0 for n > 1.
Equations (9.22) and (9.23) consequently yield (j+1)
An
(j+1)
A1
(j) 2 = Bn for n = 0, 2, (j)
(j)
(j)
(9.31)
(j)
= B0 B2 + B2 B0 .
(9.32)
The relations (9.19) give ( ' (j) (j) −1 (j) B n = I − A1 An for n = 0, 2.
(9.33)
Equations (9.31)–(9.33) are simply the familiar defining relations for Algorithm LR. We now turn our attention to the matrix M For a QBD with j > 1, * % & (j) / (j) ** M =P χs,t * X 0 = (0, r) * r,s
(j)
.
t>0
where 8 9 (j) χs,t = X t = (2j−1 − 1, s), 0 ≤ Y u < 2j−1 − 1, (0 < u < t) and so M
(j+1)
%
r,s
=P
/ t>0
* & * j+1 * Ψs,t * X 0 = (0, r) . *
Thus in particular % & (2) /8 9 ** M =P X t = (1, s), Y u = 0, (0 < u < t) *X 0 = (0, r) r,s
t>0
* 4 = P X 1 = (1, s)* X 0 = (0, r) (1) , = B2 3
r,s
so that M
(2)
(1)
= B2 .
(9.34)
For j ≥ 2, we derive by conditioning on the first passage of N1 to level 2j−1 − 1 that
9
Computing the fundamental matrix of an M/G/1–type Markov chain
M
183
(j+1) r,s
%
=
P
/
m∈K
t>0
(j)
=
M
m∈K
= M
(j)
* *
(j) * χm,t * X 0
*
r,m
%
/
= (0, r) × P
* *
(j) * Υs,t,v * X t
*
v>0
& j−1
= (2
− 1, m)
(j) B2 m,s
(j)
B2
&
r,s
,
where 8 9 (j) Υs,t,v = X t+v = (2j − 1, s), 0 ≤ Y u < 2j − 1 (t < u < t + v) .
Thus M
(j+1)
=M
(j)
(j)
B2
and so for j > 2 M
(j+1)
=M
(2)
(2)
(j)
B2 . . . B2
(1)
(2)
(j)
(j−1)
for j ≥ 2.
= B2 B2 . . . B2 . Taking this result with (9.34) yields M
(j)
(1)
= B2 . . . B2
Finally, Proposition 9.6.1 provides (1) B0 for j = 1 Vj = (1) (j−1) (j) B2 . . . B2 B 0 for j > 1. With this evaluation for Vj , (9.29) is the formula employed in Algorithm LR for calculating approximations to G. Thus Algorithm H reduces to Algorithm LR in the case of a QBD. In the case of a QBD, some simplifications occur also in Algorithm H. (j) (j) Since An = 0 for n > 2, we have Kn = 0 for n > 0 and so (j+1)
= K0
(j+1)
= K0
L0 L1
(j+1)
A0 ,
(j)
(j+1)
A2 ,
(j)
L(j+1) = 0 for n > 1. n (j)
Also Bn = 0 for n > 2.
184
E. Hunt
The relations linking Aj+1 and Aj are thus (j+1)
Ai (j+1)
A1
(j)
(j+1)
(j)
(j+1)
= Ai K0
(j)
= A1 + A0 K0 (j+1)
B1
(j)
(j)
Ai
(i = 0, 2),
(j)
(j)
(j)
(j+1)
= B 1 + B2 K 0
(j+1)
B2
(j+1)
A2 + A2 K0
(j)
(j+1)
= B2 K0
(j)
A0 ,
(j)
A0 ,
(j)
A2 .
The initialization is (0)
Ai
= Ai
(0)
(i = 0, 1, 2),
Bi
= Ai
(i = 1, 2).
As a result, Algorithm H can, in the QBD case, be run in a very similar way to Algorithm LR. The censorings and algebraic detail are, however, quite different. We programmed the LR Algorithm and ran it and Algorithm H on an example given in [7] and subsequently in [1] and [2]. Example 5. Latouche and Ramaswami’s pure-birth/pure-death process. This example is a QBD with 1−p 0 0 p 0 0 , A1 = , A2 = . A0 = 0 0 2p 0 0 1 − 2p We chose p equal to 0.1. In presenting results we employ GI as a generic notation for the approximation to G after I iterations with the algorithms involved, viz., T I and TI in the present case. The results in Table 9.1 have errors 0.5672 > 0.4800 > 0.3025 > 0.2619 > · · · > 4.9960e − 14 > 4.3854e − 14, illustrating well the interlacing property. Table 9.1 The interlacing property LR Iteration I 1 2 3 4 5 6 7
e − GI e∞ 0.5672 0.3025 0.1027 0.0145 3.3283e-04 1.7715e-07 4.9960e-14
H CPU Time (s) 0.001 0.002 0.003 0.007 0.009 0.011 0.012
e − GI e∞ 0.4800 0.2619 0.0905 0.0130 2.9585e-04 1.5747e-07 4.3854e-14
CPU Time (s) 0.000 0.000 0.004 0.007 0.009 0.010 0.010
9
Computing the fundamental matrix of an M/G/1–type Markov chain
185
9.8 Algorithms CR and H We now consider the relation between Algorithm H and Bini and Meini’s Cyclic Reduction Algorithm CR. The latter is carried out in terms of formal power series, so to make a connection we need to express Algorithm H in these terms, too. For j ≥ 0, we define ∞
ψ (j) (z) := φ(j) (z) :=
n A(j) n z ,
n=0 ∞
(j)
Bn+1 z n .
n=0
We remark that since Aj is substochastic, these series are absolutely convergent for |z| ≤ 1. We encapsulate the odd- and even-labeled coefficients in the further generating functions ψe(j) (z) := φ(j) e (z) :=
∞ n=0 ∞
(j)
A2n z n ,
ψo(j) (z) :=
(j)
B2(n+1) z n ,
φ(j) o (z) :=
n=0
∞ n=0 ∞
(j)
A2n+1 z n , (j)
B2n+1 z n .
n=0
Again, these power series are all absolutely convergent for |z| ≤ 1. We introduce ∞ L(j+1) zn, L(j+1) (z) := n n=0
K (j+1) (z) :=
∞
Kn(j+1) z n .
n=0
Multiplication of (9.6) by z , summing over n ≥ 1 and adding (9.7) provides n
ψ (j+1) (z) =
∞
(j) A2n−1 z n
+
n=1
=
zψo(j) (z)
∞
(j) A2m z m
m=0
+
∞
L(j+1) zn n
n=0
ψe(j) (z)L(j+1) (z).
(9.35)
Similarly (9.8) gives φ(j+1) (z) =
∞ n=1
(j) B2n−1 z n−1
+
∞
(j) B2m z m
m=0
(j) (j+1) = φ(j) (z). o (z) + zφe (z)L
∞
L(j+1) zn n
n=0
(9.36)
186
E. Hunt
Forming generating functions from (9.9) in the same way leads to L(j+1) (z) = K (j+1) (z)ψe(j) (z),
(9.37)
while from (9.10) and (9.11) we derive (j+1)
K (j+1) (z) = K0
(j+1)
= K0
+ +
(j+1)
+
(j+1)
+
= K0 = K0
∞
n−1
zn
n=1 ∞
(j)
(j+1)
(j+1) Km A2(n−m)+1 K0
m=0 (j+1) Km
∞
(j)
(j+1)
z n A2(n−m)+1 K0
m=0 ∞
n=m+1 ∞ (j) (j+1) (j+1) m Km z z A2+1 K0 m=0 =1 ∞ (j) (j+1) K (j+1) (z) z A2+1 K0 =1
( ' (j) (j+1) + K (j+1) (z) ψo(j) (z) − A1 K0 ' ( (j+1) (j+1) = K0 + K (j+1) (z) ψo(j) (z) − I K0 ( ' (j) (j+1) + K (j+1) (z) I − A1 K0 . (j+1)
= K0
By (9.10) the last term on the right simplifies to K (j+1) (z). Hence we have (j+1)
K0
' ( (j+1) = K (j+1) (z) I − ψo(j) (z) K0 . (j)
(j+1) −1
Postmultiplication by I − A1 = [K0
yields ' ( I = K (j+1) (z) I − ψo(j) (z) ,
so that
]
' (−1 . K (j+1) (z) = I − ψo(j) (z)
Hence we have from (9.37) that ' (−1 L(j+1) (z) = I − ψo(j) (z) ψe(j) (z). We now substitute for L(j+1) (z) in (9.35) and (9.36) to obtain ' (−1 (j) (j) ψe(j) (z) φ(j+1) (z) = φ(j) o (z) + zφe (z) I − ψo (z)
(9.38)
9
Computing the fundamental matrix of an M/G/1–type Markov chain
and
' (−1 (j) ψe(j) (z). ψ (j+1) (z) = zψo(j) (z) + φ(j) e (z) I − ψo (z)
We have also that ψ (0) (z) =
∞
187
(9.39)
An z n
(9.40)
An+1 z n .
(9.41)
n=0
and φ(0) (z) =
∞ n=0
The recursive relations (9.38)–(9.41) are precisely the generating functions used in CR (see [5]). Thus Algorithm H is equivalent to the cyclic reduction procedure whenever the latter is applicable. The formulation of Algorithm CR of Bini and Meini that we derived above is the simpler of two versions given in [5]. Bini and Meini have developed the theme of [5] in this and a number of associated works (see, for example, [3]–[10]). The proofs in [5] are more complicated than those we use to establish Algorithm H. Furthermore, their treatment employs a number of assumptions that we have not found necessary. The most notable of these are restrictions as to the M/G/1–type Markov chains to which the results apply. Like us, they assume A is irreducible. However they require also that A be positive recurrent and that the matrix G be irreducible and aperiodic. Further conditions imposed later in the proofs are less straightforward. Several alternative possibilities are proposed which can be used to lead to desired results. These are: ∞ (j) −1 is bounded above; (a) that the matrix I − i=1 Ai
(b) that the limit P = limj→∞ P (j) exists and the matrix P is is the onestep transition matrix of a positive recurrent Markov chain; (j) (c) that the matrix A1 is irreducible for some j; ∞ (j) (d) that the matrices i=1 Ai are irreducible for every j and do not converge to a reducible matrix.
References 1. N. Akar, N. C. Oˇ guz & K. Sohraby, “TELPACK: An advanced TELetraffic analysis PACKage,” IEEE Infocom ’97. http://www.cstp.umkc.edu/personal/akar/home.html 2. N. Akar, N. C. Oˇ guz & K. Sohraby, An overview of TELPACK IEEE Commun. Mag. 36 (8) (1998), 84–87. 3. D. Bini and B. Meini, On cyclic reduction applied to a class of Toeplitz–like matrices arising in queueing problems, in Proc. 2nd Intern. Workshop on Numerical Solution of Markov Chains, Raleigh, North Carolina (1995), 21–38.
188
E. Hunt
4. D. Bini and B. Meini, Exploiting the Toeplitz structure in certain queueing problems, Calcolo 33 (1996), 289–305. 5. D. Bini and B. Meini, On the solution of a non–linear matrix equation arising in queueing problems, SIAM J. Matrix Anal. Applic. 17 (1996), 906–926. 6. D. Bini and B. Meini, On cyclic reduction applied to a class of Toeplitz–like matrices arising in queueing problems, in Computations with Markov Chains, Ed. W. J. Stewart, Kluwer, Dordrecht (1996) 21–38. 7. D. Bini and B. Meini, Improved cyclic reduction for solving queueing problems, Numerical Algorithms 15 (1997), 57–74. 8. D. A. Bini and B. Meini, Using displacement structure for solving non–skip–free M/G/1 type Markov chains, in Advances in Matrix Analytic Methods for Stochastic Models, Eds A. S. Alfa and S. R. Chakravarthy, Notable Publications, Neshanic Station, NJ (1998), 17–37. 9. D. Bini and B. Meini, Solving certain queueing problems modelling by Toeplitz matrices, Calcolo 30 (1999), 395–420. 10. D. Bini and B. Meini, Fast algorithms for structured problems with applications to Markov chains and queueing models, Fast Reliable Methods for Matrices with Structure, Eds T. Kailath and A. Sayed, SIAM, Philadelphia (1999), 211–243. 11. E. Hunt, A probabilistic algorithm for determining the fundamental matrix of a block M/G/1 Markov chain, Math. & Comput. Modelling 38 (2003), 1203–1209. 12. G. Latouche, Algorithms for evaluating the matrix G in Markov chains of P H/G/1 type, Bellcore Tech. Report (1992) 13. G. Latouche, Newton’s iteration for non–linear equations in Markov chains, IMA J. Numer. Anal. 14 (1994), 583–598. 14. G. Latouche and V. Ramaswami, A logarithmic reduction algorithm for Quasi–Birth– Death processes, J. Appl. Prob. 30 (1993), 650–674. 15. G. Latouche and G. W. Stewart, Numerical methods for M/G/1 type queues, in Proc. Second Int. Workshop on Num. Solution of Markov Chains, Raleigh NC (1995), 571– 581. 16. B. Meini, Solving M/G/1 type Markov chains: recent advances and applications, Comm. Statist.– Stoch. Models 14 (1998), 479–496. 17. B. Meini, Solving QBD problems: the cyclic reduction algorithm versus the invariant subspace method, Adv. Performance Anal. 1 (1998), 215–225. 18. M. F. Neuts, Structured Stochastic Matrices of M/G/1 Type and Their Applications, Marcel Dekker, New York (1989). 19. V. Ramaswami, Nonlinear matrix equations in applied probability – solution techniques and open problems, SIAM Review 30 (1988), 256–263. 20. V. Ramaswami, A stable recursion for the steady state vector in Markov chains of M/G/1 type, Stoch. Models 4 (1988), 183–188.
Chapter 10
A comparison of probabilistic and invariant subspace methods for the block M /G/1 Markov chain Emma Hunt
Abstract A suite of numerical experiments is used to compare Algorithm H and other probability-based algorithms with invariant subspace methods for determining the fundamental matrix of an M/G/1–type Markov chain. Key words: Block M/G/1 Markov chain, fundamental matrix, invariant subspace methods, probabilistic algorithms, Algorithm H
10.1 Introduction In a preceding chapter in this volume, we discussed the structure of a new probabilistic Algorithm H for the determination of the fundamental matrix G of a block M/G/1 Markov chain. We assume familiarity with the ideas and notation of that chapter. In the current chapter we take a numerical standpoint and compare Algorithm H with other, earlier, probability-based algorithms and with an invariant subspace approach. The last-mentioned was proposed recently by Akar and Sohraby [4] for determining the fundamental matrix G of an M/G/1–type Markov chain or the rate matrix R of a GI/M/1–type Markov chain. Their approach applies only for special subclasses of chains. For the M/G/1 case this is where ∞ Ai z i A(z) = i=0
is irreducible for 0 < z ≤ 1 and is a rational function of z. The analysis can then be conducted in terms of solving a matrix polynomial equation Emma Hunt School of Mathematical Sciences & School of Economics, The University of Adelaide, Adelaide SA 5005, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 10, c Springer Science+Business Media, LLC 2009
189
190
E. Hunt
of finite degree. Following its originators, we shall refer to this technique as TELPACK. It is important to note that TELPACK applies only in the positive recurrent case. It is natural to have high hopes for such an approach, since it exploits special structure and circumvents the necessity for truncations being made to the sequence (Ak )k≥0 . The solution to the polynomial problem is effected via a so-called invariant subspace approach. The invariant subspace approach is one that has been extensively used over the past 20 years for attacking an important problem in control theory, that of solving the algebraic Riccati equation. This has been the object of intense study and many solution variants and refinements exist. A further treatment relating to the M/G/1–type chain has been given by Gail, Hantler and Taylor [6]. Akar, Oˇ guz and Sohraby [3] also treat the finite quasi-birth-and-death process by employing either Schur decomposition or matrix-sign-function iteration to find bases for left- and right-invariant subspaces. In connection with a demonstration of the strength of the invariant subspace method, Akar, Oˇ guz and Sohraby [1], [2] made available a suite of examples of structured M/G/1 and GI/M/1 Markov chains which may be regarded as standard benchmarks. These formed part of a downloadable package, including C code implementations of the invariant subspace approach, which was until very recently available from Khosrow Sohraby’s home page at http://www.cstp.umkc.edu/org/tn/telpack/home.html. This site no longer exists. Section 2 addresses some issues for error measures used for stopping rules for iterative algorithms, in preparation for numerical experiments. In Section 3 we perform numerical experiments, drawing on a suite of TELPACK M/G/1 examples and a benchmark problem of Daigle and Lucantoni. Our experiments illustrate a variety of points and provide some surprises. We could find no examples in the literature for which A(z) is not rational, so have supplied an original example.
10.2 Error measures In [8], Meini noted that, in the absence of an analysis of numerical stability, the common error measure e − GI e ∞
(10.1)
applied when G is stochastic may not be appropriate for TELPACK. She proposed instead the measure GI − A(GI ) ∞ , which is also appropriate in the case of substochastic G,
(10.2)
10
A comparison of probabilistic and invariant subspace methods
191
We now note that a closer approximation to G can on occasion give rise to a worse error as measured by (10.2). That is, it can happen that there are substochastic matrices G0 , G1 simultaneously satisfying G − G1 ∞ < G − G0 ∞ ,
(10.3)
and in fact 0 ≤ G0 ≤ G1 ≤ G, but with G1 − A(G1 ) ∞ > G0 − A(G0 ) ∞ .
(10.4)
We shall make use of the QBD given by A0 =
1−p 0 , 0 0
A1 =
0 p , rp 0
A2 =
0 0 0 1 − rp
(10.5)
with r ≥ 1 and 0 < p < 1/r.
(10.6)
This is an extension of the pure-birth/pure-death process of Latouche and Ramaswami [7]. With these parameter choices, the QBD is irreducible. It is null recurrent for r = 1 and positive recurrent for r > 1, with fundamental matrix 10 G= . 10 Also for any matrix
x0 GI = y0 we have
with 0 ≤ x, y ≤ 1,
1 − p + py 0 . A(GI ) = rpx + (1 − rp)xy 0
Take r = 1 and p = 1/2 and put 0.5 0 , G0 = 0.5 0
0.6 0 G1 = . 0.9 0
We have G − G1 ∞ = 0.4 < 0.5 = G − G0 ∞ and so (10.3) holds. Also
0.75 0 A(G0 ) = , 0.375 0
so that
G0 − A(G0 ) ∞ = 0.25,
(10.7)
192
and
E. Hunt
0.95 0 A(G1 ) = , 0.57 0
so that
G1 − A(G1 ) ∞ = 0.35.
We thus have (10.4) as desired. Further G − GI ∞ = e − GI e ∞ for GI of the form (10.7), so that we have also an example in which (10.4) and (10.8) e − G1 e ∞ < e − G0 e ∞ hold simultaneously. The inequalities (10.3) and (10.4) or (10.4) and (10.8) also occur simultaneously for the same choices of G0 and G1 when we take the QBD given by (10.5) with r = 2 and p = 0.4. In the above examples, the two nonzero entries in Gi − A(Gi ) (i = 0, 1), that is, those in the leading column, have opposite sign. This is a particular instance of a general phenomenon with QBDs given by (10.5) and (10.6) and GI of the form (10.7). The general result referred to is as follows. Suppose GI is of the form (10.7) with y < 1, x ≤ 1 and x < (1 − p)/(1 − rp). We have (1 − p)(1 − y) > x(1 − rp)(1 − y) or x − [1 − p + py] < [rpx + (1 − rp)xy] − y, so that Θ1 < −Θ2 , where Θi := [GI − A(GI )]i,1
(i = 1, 2).
If Θ2 ≥ 0, then Θ1 < 0. Conversely, if Θ1 ≥ 0, then Θ2 < 0. In particular, if Θ1 and Θ2 are both nonzero, then they are of opposite sign. This sort of behavior does not appear to have been reported previously. It is also worthy of note that, because of its use of rational functions, no truncations need be involved in the computation of the error measure on the left in (10.2) with the use of TELPACK.
10.3 Numerical experiments We now consider some numerical experiments testing Algorithm H. As previously noted, all outputs designated as TELPACK have been obtained running
10
A comparison of probabilistic and invariant subspace methods
193
the C program downloaded from Khosrow Sohraby’s website. All other code has been implemented by us in MATLAB. The following experiments illustrate a variety of issues.
10.3.1 Experiment G1 Our first experiment is drawn from the suite of TELPACK M/G/1 examples. We use ⎡ ⎤ 1 11i+1
2 11i+1
7 11i+1
⎢ ⎥ ⎢ 18 10 i 1 10 i 1 10 i ⎥ ⎢ ⎥ Ai = ⎢ 21 21 21 21 21 21 ⎥ ⎣ ⎦ 9 4 i+1 9 4 i+1 12 4 i+1 40
7
40
−
z −1 11 )
7
40
7
for i ≥ 0. This gives ⎡
1 11 (1
⎢ ⎢ 18 10z −1 A(z) = ⎢ ⎢ 21 (1 − 21 ) ⎣ 4z −1 9 70 (1 − 7 ) ⎡ ⎢ ⎢ =⎢ ⎢ ⎣
(1 −
−
z −1 11 )
7 11 (1
−
z −1 11 )
1 21 (1
−
10z −1 1 21 ) 21 (1
−
10z −1 ⎥ 21 ) ⎥
9 70 (1
−
4z −1 7 )
−
4z −1 7 )
z −1 11 )
0
0
10z −1 21 )
(1 −
0
0
12 70 (1
(1 −
= I − z diag
+
1 10 4 , , 11 21 7
1 2 7 11 11 11
⎥⎢ ⎥ ⎢ 18 ⎥⎢ ⎥ ⎢ 21 ⎦⎣
0 4z −1 7 )
1 2 7 11 11 11
,−1 ⎢ ⎢ 18 ⎢ ⎢ 21 ⎣
⎥ ⎥ ⎦
⎤⎡
0
⎡
⎤
2 11 (1
1 1 21 21
1 1 21 21
⎤ ⎥ ⎥ ⎥ ⎥ ⎦
9 9 12 70 70 70
⎤ ⎥ ⎥ ⎥. ⎥ ⎦
9 9 12 70 70 70
This example provides the simplest form of rational A(z) for which every Ai has nonzero elements and can be expected to favor TELPACK. Using the stopping criterion (GI − GI−1 )e <
(10.9)
with = 10−8 , TELPACK converges in 7 iterations, Algorithm H in 5. In Table 10.1 we include also, for comparison, details of the performance of Algorithm H for 6 iterations.
194
E. Hunt
Table 10.1 Experiment G1 TELPACK
H
I
e − GI e∞
CPU Time (s)
I
e − GI e∞
CPU Time (s)
7
9.9920e-16
0.070
5 6
1.0617e-08 5.5511e-16
0.001 0.001
The use of the stopping criterion (10.9) is fixed in TELPACK. We have therefore employed it wherever TELPACK is one of the algorithms being compared.
10.3.2 Experiment G2 We now consider a further numerical experiment reported by Akar and Sohraby in [4]. The example we take up has 1 0 0.0002 0.9998 . A(z) = 1−a 0 1−az 0.9800 0.0200 The parameter a is varied to obtain various values of the traffic intensity ρ, with the latter defined by ρ = xA (1)e, where x is the invariant probability measure of the stochastic matrix A(1). See Neuts [9, Theorem 2.3.1 and Equation (3.1.1)]. In Tables 10.2 and 10.3 we compare the numbers of iterations required to calculate the fundamental matrix G to a precision of 10−12 or better (using (10.9) as a stopping criterion) with several algorithms occurring in the literature. The Neuts Algorithm is drawn from [9] and is based on the iterative relations G(0) = 0 with % & ∞ −1 i A0 + for j ≥ 0. (10.10) Ai (G(j)) G(j + 1) = (I − A1 ) i=2
The Ramaswami Algorithm [10] is based similarly on G(0) = 0 with G(j + 1) =
I−
∞
−1 i−1
Ai (G(j))
A0
for j ≥ 0.
(10.11)
i=1
As noted previously, TELPACK is designed for sequences (Ai ) for which the generating function A(z) is, for |z| ≤ 1, a rational function of z. In this event
10
A comparison of probabilistic and invariant subspace methods
195
Table 10.2 Experiment G2 ρ
Method
I
e − GI e∞
CPU Time (s)
0.20
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
12 8 26 23 7 4
3.8658e-13 1.6036e-12 1.1143e-12 1.1696e-12 6.6613e-16 2.2204e-16
0.070 0.060 0.020 0.020 0.060 0.001
0.40
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
21 15 50 41 6 5
1.1224e-12 3.7181e-13 3.0221e-12 1.9054e-12 2.2204e-16 1.1102e-16
0.200 0.180 0.040 0.030 0.060 0.001
0.60
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
37 26 95 72 6 6
2.3257e-12 9.0550e-13 5.7625e-12 4.1225e-12 4.4409e-16 2.2204e-16
0.450 0.330 0.060 0.040 0.060 0.001
0.80
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
81 55 220 157 6 7
5.4587e-12 4.2666e-12 1.4281e-11 1.0090e-11 4.4409e-16 4.4409e-16
1.230 0.870 0.150 0.090 0.060 0.001
the fundamental matrix satisfies a reduced matrix polynomial equation F (G) :=
f
Fi Gi = 0.
(10.12)
i=0
The Extended Neuts and Extended Ramaswami Algorithms are respectively extensions of the Neuts and Ramaswami Algorithms based on Equation (10.12). This enables the infinite sums to be replaced by finite ones. Assuming the invertibility of F1 , the recursions (10.10), (10.11) become respectively f −1 i F0 + Fi (G(j)) G(0) = 0, G(j + 1) = −F1 i=2
and G(0) = 0,
G(j + 1) =
−
f i=1
−1 i−1
Fi (G(j))
F0 .
196
E. Hunt
Table 10.3 Experiment G2 continued ρ
Method
I
e − GI e∞
CPU Time (s)
0.90
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
163 111 451 314 6 8
1.0630e-11 7.2653e-12 3.2315e-11 2.1745e-11 6.6613e-16 1.1102e-16
2.710 1.900 0.310 0.180 0.060 0.001
0.95
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
315 214 881 606 7 9
2.2072e-11 1.5689e-11 6.6349e-11 4.4260e-11 0 1.1102e-15
5.430 3.810 0.590 0.350 0.060 0.001
0.99
Neuts Ramaswami Extended Neuts Extended Ramaswami TELPACK H
1368 933 3836 2618 8 11
1.1456e-10 7.7970e-11 3.3809e-10 2.2548e-10 1.9984e-15 1.0880e-14
24.770 17.440 2.880 1.720 0.060 0.003
TELPACK is designed to exploit situations in which A(z) is a rational function of z, so it is hardly surprising that it achieves a prescribed accuracy in markedly fewer iterations than needed for the Neuts and Ramaswami Algorithms. What is remarkable is that these benefits do not occur for the Extended Neuts or Ramaswami Algorithms, in fact they require almost three times as many iterations for a prescribed accuracy, although overall the extended algorithms take substantially less CPU time than do their counterparts despite the extra iterations needed. In contrast to the Extended Neuts and Ramaswami Algorithms and TELPACK, Algorithm H holds for all Markov chains of block–M/G/1 type. In view of this Algorithm H compares surprisingly well. We note that it achieves accuracy comparable with that of TELPACK, with much smaller CPU times, for all levels of traffic intensity.
10.3.3 The Daigle and Lucantoni teletraffic problem The most common choice of benchmark problem in the literature and the subject of our next three numerical experiments is a continuous-time teletraffic example of Daigle and Lucantoni [5]. This involves matrices expressed in terms of parameters K, ρd , a, r and M . The defining matrices Ai (i = 0, 1, 2)
10
A comparison of probabilistic and invariant subspace methods
197
are of size (K + 1) × (K + 1). The matrices A0 and A2 are diagonal and prescribed by (A0 )j,j = 192[1 − j/(K + 1)]
(0 ≤ j ≤ K),
A2 = 192ρd I.
The matrix A1 is tridiagonal with (A1 )j,j+1 = ar
M −j M
(0 ≤ j ≤ K − 1),
(A1 )j,j−1 = jr
(1 ≤ j ≤ K).
A physical interpretation of the problem (given in [5]) is as follows. A communication line handles both circuit-switched telephone calls and packet-switched data. There are a finite number M of telephone subscribers, each of whom has exponentially distributed on-hook and off-hook times, the latter having parameter r and the former being dependent upon the offered load a which is given in Erlangs. In particular, the rate for the onhook distribution is given by the quantity a/(M r). Data packets arrive according to a Poisson process and their lengths are assumed to be approximated well by an exponentially distributed random variable having mean 8000. The communication line has a transmission capacity of 1.544 megabits per second of which 8000 bits per second are used for synchronization. Thus, at full line capacity, the line can transmit 192 packets per second. Each active telephone call consumes 64 kilobits per second. A maximum of min(M ,23) active telephone subscribers are allowed to have calls in progress at any given time. The transmission capacity not used in servicing telephone calls is used to transmit data packets. Thus, if there are i active callers, then the service rate for the packets is (1 − i/24) × 192. The offered load for the voice traffic is fixed at 18.244 Erlangs. Following the original numerical experiments in [5], the above example has been used as a testbench by a number of authors including Latouche and Ramaswami [7] and Akar, Oˇ guz and Sohraby [1], [2]. This example is a fairly demanding comparative test for an algorithm designed for general M/G/1–type Markov chains, since it features a QBD. The logarithmic-reduction method in [7] is expressly designed for such processes. The matrix-sign-function method of [1] and [2] is designed for the more general, but still rather limited case, in which A(z) is a rational function of z. As our algorithm is designed for discrete-time rather than continuous-time processes, we use the embedded jump chain of the latter, for which the entries in G have to be the same, for our analysis. In Experiments G3 and G4 we employ the criterion (10.9) with = 10−12 used in the above stochastic-case references, together with the parameter choice K = 23, the latter indicating that the matrices An in the example are of size of size 24 × 24.
198
E. Hunt
10.3.3.1 Experiment G3 In this experiment the call holding rate is set at r = 100−1 s−1 and the calling population size M is fixed at 512. In Tables 10.4 and 10.5 we compare the number of iterations involved in estimating G with different algorithms for a range of traffic parameter values from ρd = 0.01 to ρd = 0.29568. The latter value was noted by Daigle and Lucantoni [5] to correspond to an instability limit. The algorithms considered are the logarithmic-reduction algorithm of Latouche and Ramaswami (LR), TELPACK and Algorithm H. We do not give iteration counts for the original experiments of Daigle and Lucantoni. These counts are not detailed in [5] but are mentioned as running to tens of thousands.
Table 10.4 Iterations required with various traffic levels: Experiment G3 ρd
Method
I
GI − A(GI )∞
CPU Time (s)
0.010
TELPACK LR H
10 4 4
1.5613e-16 1.4398e-16 2.4460e-16
0.450 0.010 0.010
0.025
TELPACK LR H
10 5 5
1.8388e-16 1.6306e-16 2.1164e-16
0.480 0.020 0.020
0.050
TELPACK LR H
10 8 8
2.6368e-16 1.5959e-16 1.5959e-16
0.460 0.040 0.040
0.075
TELPACK LR H
10 10 10
2.2204e-16 2.2204e-16 2.6368e-16
0.450 0.040 0.040
0.100
TELPACK LR H
10 11 11
1.5266e-16 3.4781e-16 3.4478e-16
0.048 0.040 0.040
0.120
TELPACK LR H
10 12 12
2.0817e-16 1.5266e-16 2.3592e-16
0.060 0.060 0.060
0.140
TELPACK LR H
10 13 13
3.6082e-16 2.6368e-16 1.5266e-16
0.560 0.060 0.060
0.160
TELPACK LR H
9 14 14
2.2204e-16 1.6653e-16 1.9429e-16
0.420 0.060 0.060
10
A comparison of probabilistic and invariant subspace methods
199
Table 10.5 Iterations required with various traffic levels: Experiment G3 continued ρd
Method
I
GI − A(GI )∞
CPU Time (s)
0.180
TELPACK LR H
9 14 14
4.9960e-16 1.6653e-16 1.9429e-16
0.470 0.060 0.060
0.200
TELPACK LR H
9 15 15
1.8822e-16 1.1102e-16 1.9429e-16
0.420 0.070 0.070
0.220
TELPACK LR H
9 15 15
3.0531e-16 3.6082e-16 2.2204e-16
0.410 0.070 0.070
0.240
TELPACK LR H
10 16 16
3.0531e-16 1.3878e-16 1.1796e-16
0.450 0.080 0.070
0.260
TELPACK LR H
10 17 17
3.7383e-16 2.4980e-16 2.2204e-16
0.470 0.080 0.080
0.280
TELPACK LR H
12 18 18
9.5659e-16 1.9429e-16 1.1102e-16
0.530 0.080 0.080
0.290
TELPACK LR H
13 20 20
7.5033e-15 2.2204e-16 1.3878e-16
0.560 0.080 0.080
0.29568
TELPACK LR H
20 29 29
1.5737e-09 2.2204e-16 1.6653e-16
0.830 0.100 0.100
It should be noted that in the references cited there is some slight variation between authors as to the number of iterations required with a given method, with larger differences at the instability limit. Akar et al. attribute this to differences in the computing platforms used [4]. All computational results given here are those obtained by us, either using our own MATLAB code or by running TELPACK.
10.3.3.2 Experiment G4 Our fourth numerical experiment fixed the offered data traffic at 15%, the call holding rate at r = 300−1 s−1 and then considered system behavior as a function of the calling population size M (see Tables 10.6 and 10.7).
200
E. Hunt
Table 10.6 Experiment G4 M
Method
I
GI − A(GI )∞
CPU Time (s)
64
TELPACK LR H
9 16 16
3.7323e-16 2.7756e-16 1.3878e-16
0.440 0.030 0.030
128
TELPACK LR H
10 18 18
3.0531e-16 1.3878e-16 1.3878e-16
0.470 0.060 0.060
256
TELPACK LR H
11 19 19
4.2340e-16 2.2204e-16 1.3878e-16
0.500 0.050 0.060
512
TELPACK LR H
12 20 20
6.6337e-16 2.2204e-16 1.6653e-16
0.530 0.070 0.070
1024
TELPACK LR H
13 21 21
3.1832e-15 2.4980e-16 1.9429e-16
0.550 0.080 0.070
2048
TELPACK LR H
13 22 22
3.8142e-14 2.2204e-16 1.9429e-16
0.550 0.080 0.080
Table 10.7 Experiment G4 continued M
Method
I
GI − A(GI )∞
CPU Time (s)
4096
TELPACK LR H
14 23 23
6.3620e-14 1.9429e-16 2.7756e-16
0.530 0.080 0.080
8192
TELPACK LR H
15 24 24
1.5971e-13 2.4980e-16 3.0531e-16
0.610 0.090 0.090
16384
TELPACK LR H
16 25 25
4.2425e-12 2.2204e-16 2.2204e-16
0.650 0.090 0.080
32768
TELPACK LR H
17 27 27
2.5773e-11 1.9429e-16 2.2204e-16
0.690 0.100 0.100
65536
TELPACK LR H
25 32 32
6.5647e-08 1.9429e-16 2.2204e-16
0.960 0.130 0.110
10
A comparison of probabilistic and invariant subspace methods
201
10.3.3.3 Overview of Experiments G3 and G4 In the light of its design versatility, Algorithm H compares quite well with the above-mentioned more specialist algorithms. Its performance with respect to CPU time and accuracy is comparable with that of the logarithmic-reduction (LR) algorithm. Both the logarithmic-reduction algorithm and Algorithm H require considerably less CPU time than does TELPACK (the difference in times sometimes being as much as an order of magnitude) for superior accuracy. In Experiments 3 and 4 we employ the alternative error measure GI − A(GI ) ∞ < suggested by Meini (see, for example, [8]). In terms of this measure, the performance of TELPACK deteriorates steadily with an increase in the size of M , whereas Algorithms H and LR are unaffected. The last two TELPACK entries in Tables 10.5 and 10.7 are in small typeface to indicate that TELPACK was unable to produce a result in these cases and crashed, generating the error message ‘segmentation fault.’ Reducing to 10−8 produced a result in both instances.
10.3.3.4 Experiment G5 We ran Algorithm H on the Daigle and Lucantoni problem with the call holding rate fixed at r = 300−1 s−1 , the offered data traffic at 28% and the calling population size M at 65,536, varying the size of the matrices from 24 × 24 to 500 × 500. In all cases we used (10.9) as a stopping criterion with = 10−8 . We found that although the iteration counts decreased as the size of the matrices increased, CPU times increased substantially (see Table 10.8). This held for all matrix sizes except for 24×24 (the first entry in Table 10.8) where the computation required for the extra iterations outweighed the speed gain due to smaller matrix size.
10.3.4 Experiment G6 We now turn our attention to the case of a null recurrent process where the defining transition matrices for the system are given by A0 =
0.4 0 , 0 0.4
A1 =
0 0.1 0.2 0.2
and
A2 =
0.5 0 . 0 0.2
Results for this experiment are given in Table 10.9. The stopping criterion used was (10.9) with = 10−8 . We note that this case is not covered by
202
E. Hunt
Table 10.8 Experiment G5 K
Iterations I
H e − GI e∞
CPU Time (s)
23 24 25 26 27 28 29 39 49 59 69 79 89 99 149 299 499
29 19 18 17 17 17 16 15 14 14 13 13 13 13 12 12 12
9.4832e-09 5.1710e-11 8.0358e-11 2.6813e-08 9.2302e-11 4.4409e-16 2.5738e-08 2.3319e-11 1.18140e-09 2.2204e-15 3.6872e-08 4.5749e-10 4.5552e-12 5.3213e-13 3.0490e-09 9.7700e-15 5.8509e-14
0.110 0.080 0.080 0.090 0.100 0.100 0.110 0.200 0.260 0.600 1.130 2.250 4.170 7.670 76.400 853.990 3146.600
Table 10.9 Experiment G6 Method
Iterations I
e − GI e∞
CPU Time (s)
Neuts LR H
11307 24 24
2.1360e-04 3.9612e-08 3.7778e-08
10.950 0.010 0.010
the Akar and Sohraby methodology and therefore that TELPACK cannot be used for this experiment. The results for the H and LR Algorithms are several orders more accurate than that for the Neuts Algorithm with significantly lower CPU times.
10.3.5 Experiment G7 The numerical experiments above all involve matrix functions A(z) of rational form. We could find no examples in the literature for which A(z) is not rational. The following is an original example showing how Algorithm H (and the Neuts Algorithm) perform when A(z) is not rational. We note that these are the only two algorithms which can be applied here. Suppose p, q are positive numbers with sum unity. We define two k × k matrices Ω0 , Ω1 with ‘binomial’ forms. Let
10
A comparison of probabilistic and invariant subspace methods
⎡
0 0 ⎢ .. .. ⎢ . Ω0 = ⎢ . ⎣ 0 0 k−2 q pk−1 k−1 1 p ⎡
p p2 .. .
⎢ ⎢ ⎢ Ω1 = ⎢ ⎢ ⎣ pk−1 0
203
⎤
... 0 0 .. .. ⎥ .. . . . ⎥ ⎥, ⎦ ... 0 0
k−2 k−1 k−1 . . . k−2 pq q
q 0 2pq q2 .. .. k−1 . k−2 k−1 . k−3 2 q 2 p q 1 p 0 0
⎤ 0 0 0 0 ⎥ ⎥ .. .. ⎥ . . ⎥ ⎥ k−1 . k−2 k−1 ⎦ . . . k−2 pq q ... 0 0 ... ... .. .
We now define A0 := Ω0 e−r , rn rn−1 + Ω1 e−r An := Ω0 e−r n! (n − 1)!
(n ≥ 1),
for r a positive number. We remark that A(z) :=
∞
Am z m
m=0
= (Ω0 + zΩ1 )e−r(1−z)
(|z| ≤ 1),
so that A(z) is irreducible for 0 < z ≤ 1 and stochastic for z = 1. Let ω := (ω1 , ω2 , . . . , ωk ) denote the invariant probability measure of Ω := Ω0 + Ω1 = A(1). Then the condition
ωA (1)e ≤ 1
for G to be stochastic (see [9, Theorem 2.3.1]) becomes ω [Ω1 + r(ω0 + Ω1 )] e ≤ 1, that is,
4 3 ω (r + 1)e − (0, 0, . . . , ωk )T ≤ 1
or r + 1 − ωk ≤ 1. We deduce that G is stochastic if and only if r ≤ ωk .
204
E. Hunt
The parameter choice r = 1 thus provides the new and interesting situation of a transient chain. Results are given in Table 10.10 (with the size of the matrices set to 5 × 5). Since G is not stochastic we again revert to the use of A(GI ) − GI ∞ < as an error measure. Table 10.10 Experiment G7: a transient process p
Method
I
GI − A(GI )∞
CPU Time (s)
0.05
Neuts H
78 6
5.2153e-13 1.1102e-16
1.950 0.006
0.1
Neuts H
38 5
6.5445e-13 1.1102e-16
0.960 0.003
0.2
Neuts H
18 4
3.3762e-13 8.3267e-17
0.480 0.002
0.3
Neuts H
11 3
3.5207e-13 2.5153e-17
0.310 0.002
0.4
Neuts H
8 2
5.8682e-14 2.3043e-13
0.210 0.001
0.5
Neuts H
6 2
2.1154e-14 5.5511e-17
0.150 0.001
0.6
Neuts H
4 2
1.5774e-13 1.7347e-18
0.130 0.001
0.7
Neuts H
3 1
1.0413e-13 2.7311e-15
0.100 0.001
0.8
Neuts H
2 1
6.4682e-13 2.7756e-17
0.080 0.001
References 1. N. Akar, N. C. Oˇ guz & K. Sohraby, “TELPACK: An advanced TELetraffic analysis PACKage,” IEEE Infocom ’97. http://www.cstp.umkc.edu/personal/akar/home.html 2. N. Akar, N. C. Oˇ guz & K. Sohraby, An overview of TELPACK IEEE Commun. Mag. 36 (8) (1998), 84–87. 3. N. Akar, N. C. Oˇ guz and K. Sohraby, A novel (computational?) method for solving finite QBD processes, preprint. Comm. Statist. Stoch. Models 16 (2000), 273–311. 4. N. Akar and K. Sohraby, An invariant subspace approach in M/G/1 and G/M/1 type Markov chains, Commun. Statist. Stoch. Models 13 (1997), 381–416.
10
A comparison of probabilistic and invariant subspace methods
205
5. J. N. Daigle and D. M. Lucantoni, Queueing systems having phase–dependent arrival and service rates, Numerical Solution of Markov Chains, Marcel Dekker, New York (1991), 223–238. 6. H. R. Gail, S. L. Hantler and B. A. Taylor, M/G/1 type Markov chains with rational generating functions, in Advances in Matrix Analytic Methods for Stochastic Models, Eds A. S. Alfa and S. R. Chakravarthy, Notable Publications, Neshanic Station, NJ (1998), 1–16. 7. G. Latouche and V. Ramaswami, A logarithmic reduction algorithm for Quasi–Birth– Death processes, J. Appl. Prob. 30 (1993), 650–674. 8. B. Meini, Solving QBD problems: the cyclic reduction algorithm versus the invariant subspace method, Adv. Performance Anal. 1 (1998), 215–225. 9. M. F. Neuts, Structured Stochastic Matrices of M/G/1 Type and Their Applications, Marcel Dekker, New York (1989). 10. V. Ramaswami, Nonlinear matrix equations in applied probability – solution techniques and open problems, SIAM Review 30 (1988), 256–263.
Chapter 11
Interpolating maps, the modulus map and Hadamard’s inequality S. S. Dragomir, Emma Hunt and C. E. M. Pearce
Abstract Refinements are derived for both parts of Hadamard’s inequality for a convex function. The main results deal with the properties of various mappings involved in the refinements. Key words: Convexity, Hadamard inequality, interpolation, modulus map
11.1 Introduction A cornerstone of convex analysis and optimization is Hadamard’s inequality, which in its basic form states that for a convex function f on a proper finite interval [a, b] + f
a+b 2
,
1 ≤ b−a
b
f (x) dx ≤ a
f (a) + f (b) , 2
whereas the reverse inequalities hold if f is concave. For simplicity we take f as convex on [a, b] throughout our discussion. The three successive terms in S. S. Dragomir School of Computer Science and Mathematics, Victoria University, Melbourne VIC 8001, AUSTRALIA e-mail: [email protected] Emma Hunt School of Mathematical Sciences & School of Economics, The University of Adelaide, Adelaide SA 5005, AUSTRALIA e-mail: [email protected] C. E. M. Pearce School of Mathematical Sciences, The University of Adelaide, Adelaide SA 5005, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 11, c Springer Science+Business Media, LLC 2009
207
208
S.S. Dragomir et al.
Hadamard’s inequality are all means of f over the interval [a, b]. We denote them respectively by mf (a, b), Mf (a, b), Mf (a, b) or simply by m, M, M when f , a, b are understood. The bounds m and M for M are both tight. More generally, the integral mean M is defined by b 1 f (x)dx if a = b b−a a . Mf (a, b) = f (a) if a = b The Hadamard inequality can then be written as mf (a, b) ≤ Mf (a, b) ≤ Mf (a, b)
(11.1)
without the restriction a = b. There is a huge literature treating various refinements, generalizations and extensions of this result. For an account of these, see the monograph [4]. Work on interpolations frequently involves use of the auxiliary function φt (p, q) := pt + q(1 − t)
t ∈ [0, 1]
for particular choices of p, q. Thus a continuous interpolation of the first part of Hadamard’s inequality is available via the map Hf : [0, 1] → R given by 1 Hf (t) := b−a
b
f (y) dx, a
where for x ∈ [a, b] we set yt (x) := φt (x, (a + b)/2). Theorem A. We have that: (a) Hf is convex; (b) Hf is nondecreasing with Hf (0) = m, Hf (1) = M. The first part of Hadamard’s inequality is associated with Jensen’s inequality and has proved much more amenable to analysis than the second, though this is the subject of a number of studies, see for example Dragomir, Milo´sevi´c and S´ andor [3] and Dragomir and Pearce [5]. In the former study a map Gf : [0, 1] → R was introduced, defined by 1 [f (u1 ) + f (u2 )] , 2 where u1 (t) := yt (a) and u2 (t) := yt (b) for t ∈ [0, 1]. Gf (t) :=
Theorem B. The map Gf enjoys the following properties: (i) Gf is convex on [0, 1]; (ii) Gf is nondecreasing on [0, 1] with Gf (0) = m, Gf (1) = M ; (iii) we have the inequalities
11
Interpolating maps, the modulus map and Hadamard’s inequality
0 ≤ Hf (t) − m ≤ Gf (t) − Hf (t) and
+ Mf
3a + b a + 3b , 4 4
,
∀t ∈ [0, 1]
209
(11.2)
+ , + , 1 3a + b a + 3b f +f 2 4 4
1 ≤ Gf (t) dt
≤
0
m+M . ≤ 2
(11.3)
Inequality (11.2) was proved for differentiable convex functions. As this class of functions is dense in the class of all convex functions defined on the same interval with respect to the topology induced by uniform convergence, (11.2) holds also for an arbitrary convex map. Dragomir, Miloˇsevi´c and S´ andor introduced a further map Lf : [0, 1] → R given by
b 1 [f (u) + f (v)] dx, (11.4) Lf (t) := 2(b − a) a where we define u(x) := φt (a, x)
and
v(x) := φt (b, x)
for x ∈ [a, b] and t ∈ [0, 1]. The following was shown. Theorem C. We have that (1) Lf is convex on [0, 1]; (2) for all t ∈ [0, 1] Gf (t) ≤ Lf (t) ≤ (1 − t)M + tM ≤ M,
sup Lf (t) = Lf (1) = M ; t∈[0,1]
(3) for all t ∈ [0, 1] Hf (1 − t) ≤ Lf (t)
and
Hf (t) + Hf (1 − t) ≤ Lf (t). 2
(11.5)
In this chapter we take these ideas further and introduce results involving the modulus map. With the notation of (11.1) in mind, it is convenient to employ σ(x) := |x|,
i(x) := x.
210
S.S. Dragomir et al.
This gives in particular the identity Mf (a, b) = Mi (f (a), f (b)) =
f (a) 1 f (b)−f (a)
f (b) f (a)
if f (a) = f (b) xdx otherwise .
In Section 2 we derive a refinement of the basic Hadamard inequality and in Section 3 introduce further interpolations for the outer inequality Gf (t) − Hf (t) ≥ 0 in (11.2) and for Hf (t) − m ≥ 0, all involving the modulus map. For notational convenience we shall employ also in the sequel w1 (t) := φt (a, b), We note that this gives , + a + w1 = f (u1 ), f 2 so that Gf (t) =
w2 (t) := φt (b, a). + f
b + w2 2
, = f (u2 ),
+ , + , a + w1 b + w2 1 f +f . 2 2 2
(11.6)
In Section 4 we derive some new results involving the identric mean I(a, b) and in Section 5 introduce the univariate map Mf : [0, 1] → R given by Mf (t) :=
1 [f (w1 ) + f (w2 )] 2
(11.7)
and derive further results involving Lf . We remark that by convexity f (w1 ) ≤ tf (a) + (1 − t)f (b)
and f (w2 ) ≤ (1 − t)f (a) + tf (b),
so that Mf (t) ≤ M.
(11.8)
11.2 A refinement of the basic inequality We shall make repeated use of the following easy lemma. Lemma 2.1 If f , g are integrable on some domain I and f (x) ≥ |g(x)| then
I
on
I,
* * * * * f (x)dx ≥ * g(x)dx** . I
11
Interpolating maps, the modulus map and Hadamard’s inequality
f (x)dx ≥
Proof. We have I
I
211
* * * * |g(x)|dx ≥ ** g(x)dx** .
I
We now proceed to a refinement of Hadamard’s inequality for convex functions. For this result, we introduce the symmetrization f s of f on [a,b], defined by 1 f s (x) = [f (x) + f (a + b − x)]. 2 This has the properties mf s (a, b) = mf (a, b),
Mf s (a, b) = Mf (a, b),
Mf s (a, b) = Mf (a, b).
Theorem 2.2 Let I ⊂ R be an interval and f : I → R. Suppose a, b ∈ I with a < b. Then Mf (a, b) − Mf (a, b) ≥ |Mσ (f (a), f (b)) − Mσ◦f (a, b)|
(11.9)
Mf (a, b) − mf (a, b) ≥ |Mσ◦f s (a, b) − |mf (a, b)| | .
(11.10)
and
Proof. From the convexity of f , we have for t ∈ [0, 1] that 0 ≤ tf (a) + (1 − t)f (b) − f (ta + (1 − t)b). By virtue of the general inequality |c − d| ≥ | |c| − |d| | ,
(11.11)
we thus have tf (a) + (1 − t)f (b) − f (ta + (1 − t)b) ≥ | |tf (a) + (1 − t)f (b)| − |f (ta + (1 − t)b)| | . Lemma 2.1 provides
1
1
1 tdt + f (b) (1 − t)dt − f (ta + (1 − t)b)dt f (a) 0 0 0 * 1 *
1 * * ≥ ** |tf (a) + (1 − t)f (b)| dt − |f (ta + (1 − t)b)| dt** . 0
(11.12)
(11.13)
0
Inequality (11.9) now follows from evaluation of the integrals and a change of variables. Similarly we have for any α, β ∈ [a, b] that * * + + , ** ,** * * f (α) + f (β) * * ** α+β f (α) + f (β) * − *f α + β * * . −f ≥ ** ** * * ** 2 2 2 2
212
S.S. Dragomir et al.
Set α = w1 , β = w2 . Then α+β =a+b and by Lemma 2.1 we have 1
1 1 f (w1 ) dt + f (w2 ) dt − m 2 0 0 * * 1* * * f (w1 ) + f (w2 ) * * dt − |m| * * ≥* * * 2 0
(11.14)
* * * . *
By (11.14) and a change of variables, the previous result reduces to (21.2). When does the theorem provide an improvement on Hadamard’s inequality? That is, when does strict inequality obtain in (11.9) or (21.2)? To answer this, we consider the derivation of (11.11). Since c = (c − d) + d, we have |c| ≤ |c − d| + |d| (11.15) or |c| − |d| ≤ |c − d|.
(11.16)
|d| − |c| ≤ |d − c|.
(11.17)
By symmetry we have also
Combining the last two inequalities yields (11.11). To have strict inequality in (11.11) we need strict inequality in both (11.16) and (11.17), or equivalently in both (11.15) and |d| ≤ |d − c| + |c|. Thus strict inequality occurs in (11.16) if and only d, c − d are of opposite sign, and in (11.17) if and only c, d − c are of opposite sign. These conditions are satisfied if and only if c and d are of opposite sign. It follows that strict inequality obtains in (11.12) if and only if tf (a) + (1 − t)f (b) and f (ta + (1 − t)b) are of opposite sign, that is, if and only if tf (a) + (1 − t)f (b) > 0 > f (ta + (1 − t)b).
(11.18)
Since an integrable convex function is continuous, (11.13) and so (11.9) applies with strict inequality if and only if there exists t ∈ [0, 1] for which (11.18) holds. Similarly strict inequality applies in (21.2) if and only if there exist α, β ∈ [a, b] such that
11
Interpolating maps, the modulus map and Hadamard’s inequality
f (α) + f (β) >0>f 2
+
α+β 2
213
, .
Changes of variable yield the following condition. Corollary 2.3 A necessary and sufficient condition for strict inequality in (11.9) is that there exists x ∈ [a, b] such that (b − x)f (a) + (x − a)f (b) > 0 > f (x). A necessary and sufficient condition for strict inequality in (21.2) is that there exists x ∈ [a, b] such that f s (x) > 0 > m. Corollary 2.4 If in the context of Theorem 2.1 f s = f , then f (a) − M ≥ | |f (a)| − Mσ◦f (a, b)|
(11.19)
M − m ≥ | Mσ◦f (a, b) − |m| | .
(11.20)
and A necessary and sufficient condition for strict inequality in (21.4) is that there exists x ∈ [a, b] such that f (x) < 0 < f (a). A necessary and sufficient condition for strict inequality in (21.5) is that there exists x ∈ [a, b] such that f (x) > 0 > m. These ideas have natural application to means. For an example, denote by A(a, b), G(a, b) and I(a, b) respectively the arithmetic, geometric and identric means of two positive numbers a, b, given by A(a, b) = and
a+b , 2
I(a, b) =
1 e
a
bb aa
G(a, b) =
1/(b−a)
√ ab
if a = b . if a = b
These satisfy the geometric–identric–arithmetic (GIA) inequality G(a, b) ≤ I(a, b) ≤ A(a, b). This follows from G(a, b) ≤ L(a, b) ≤ A(a, b)
214
S.S. Dragomir et al.
(where L(a, b) refers to the logarithmic mean), which was first proved by Ostle and Terwilliger [6] and Carlson [1], [2], and L(a, b) ≤ I(a, b) ≤ A(a, b), which was established by Stolarsky [7], [8]. The first part of the GIA inequality can be improved as follows. Corollary 2.5 If a ∈ (0, 1] and b ∈ [1, ∞) with a = b, then * ' (** * (ln b)2 + (ln a)2
I(a, b) b a 2−a−b 1/(b−a) * * ≥ exp * − ln b a e * G(a, b) ln((b/a)2 ) ≥ 1. (11.21) Proof. For the convex function f (x) = − ln x (x > 0), the left-hand side of (11.9) is −
Since
b 1 ln a + ln b + ln x dx 2 b−a a 1 [b ln b − a ln a − (b − a)] − ln G(a, b) = b−a I(a, b) . = ln G(a, b)
ln b
|x|dx = ln a
and
b
(ln b)2 + (ln a)2 2
4 3 | ln x|dx = ln aa bb e2−a−b ,
a
we have likewise for the same choice of f that the right-hand side of (11.9) is * ' (** * (ln b)2 + (ln a)2
b a 2−a−b 1/(b−a) * * − ln b a e *, * ln((b/a)2 ) whence the desired result. We note for reference the incidental result M− ln (a, b) = − ln I(a, b)
(11.22)
derived in the proof. For the first inequality in (21.6) to be strict, by Corollary 2.3 there needs to exist x with 1 < x < b for which
11
Interpolating maps, the modulus map and Hadamard’s inequality
215
(b − x) ln a + (x − a) ln b < 0. Since the left-hand side is strictly increasing in x, this condition can be satisfied if and only if the left-hand side is strictly negative for x = 1, that is, we require (b − 1) ln a + (1 − a) ln b < 0. (11.23) Because b > 1, we have b − 1 > ln b and so (b − 1)/a − ln b > 0, since 0 < a ≤ 1. Thus the left-hand side of (11.23) is strictly increasing in a. It tends to −∞ as a → 0 and is zero for a = 1. Accordingly (11.23) holds whenever 0 < a < 1 < b. The second part of the GIA inequality may also be improved. Corollary 2.6. If 0 < a < b < ∞, then A(a, b) ≥ exp I(a, b)
%* * + ,* * 1 b* * * a + b ** * * * * *ln x(a + b − x)* dx − **ln * *b − a a 2
≥ 1.
*& * * * * (11.24)
Proof. For the convex function f (x) = − ln x (x > 0) we have that , + A(a, b) a+b − ln I(a, b) = ln M − m = ln 2 I(a, b) and that the right-hand side of (21.2) is * * + ,* * * 1 b* * a + b ** * * * * x(a + b − x) dx − ln ln * * * * * *b − a a 2
* * * *. *
The stated result follows from (21.2).
By Corollary 2.3, a necessary and sufficient condition for the first inequality in (21.7) to be strict is that there should exist x ∈ [a, b] such that ln[x(a + b − x)] < 0 < ln
a+b , 2
that is, x(a + b − x) < 1 < (a + b)/2. The leftmost term is minimized for x = a and x = b, so the condition reduces to ab < 1 < (a + b)/2 or 2 − b < a < 1/b. Since 2 − b < 1/b for b = 1, there are always values of a for which this condition is satisfied.
216
S.S. Dragomir et al.
Similar analyses may be made for the refinements of inequalities derived in the remainder of this chapter.
11.3 Inequalities for Gf and Hf Our first result in this section provides minorants for the difference between the two sides of the first inequality and the difference between the outermost quantities in (11.2). Theorem 3.1. Suppose I is an interval of real numbers with a, b ∈ I and a < b. Then if f : I → R is convex, we have for t ∈ [0, 1] that Gf (t) − Hf (t) ≥ |Mσ (f (u1 ), f (u2 )) − Hσ◦f (t)|
(11.25)
Hf (t) − m ≥ |Mσ◦f s (a, b) − |m| | .
(11.26)
and
Proof. We have Gf (t) =
1 [f (u1 ) + f (u2 )] = Mf (u1 , u2 ), 2
Hf (t) = Mf ◦yt (a, b) = Mf (u1 , u2 ), mf (u1 , u2 ) = mf (a, b) = m, so for t ∈ [0, 1] application of Theorem 2.2 to f on (u1 , u2 ) provides Gf (t) − Hf (t) ≥ |Mσ (f (u1 ), f (u2 )) − Mσ◦f (u1 , u2 )| , Hf (t) − m ≥ |Mσ◦f s (u1 , u2 ) − |m| | .
(11.27) (11.28)
Since u2 − u1 = t(b − a), we have for t ∈ (0, 1] that
u2 1 dy Mσ◦f (u1 , u2 ) = |f (y)| b − a u1 t
b 1 = |f (yt (x))| dx b−a a = Hσ◦f (t), so (11.27) yields (11.25) for t ∈ (0, 1]. As (11.25) also holds for t = 0, we have the first part of the theorem. Using u1 + u2 = a + b, we derive the second part similarly from Mσ◦f s (u1 , u2 ) = Mσ◦f s ◦yt (a, b). Our next result provides a minorant for the difference between the two sides of the second inequality in (11.3) and a corresponding result for the third inequality.
11
Interpolating maps, the modulus map and Hadamard’s inequality
Theorem 3.2. Suppose the conditions of Theorem 3.1 hold. Then * *
1 * * m+M − M ≥ **Mσ (m, M ) − |Gf (t)| dt** 2 0
217
(11.29)
and
+ , + , 1 3a + b a + 3b M− f +f 2 4 4 * 1 * + , + ,* * 1 ** a + 3b ** 3a + b * ≥ * +f |Gf (t) + Gf (1 − t)| dt − *f * 2 4 4 0
* * *. * (11.30)
Proof. First, we observe that 1
1
1 1 Gf (t)dt = f (u1 )dt + f (u2 )dt 2 0 0 0 % &
(a+b)/2
b 2 1 2 f (x)dx + f (x)dx = 2 b−a a b − a (a+b)/2 = M.
(11.31)
Application of Theorem 2.2 to Gf on [0,1] provides Gf (0) + Gf (1) − 2 and
1
* * Gf (t)dt ≥ *Mσ (Gf (0), Gf (1)) − Mσ◦Gf (0, 1)*
0
* * + ,* * * 1 ** * s M − Gf (1/2) ≥ *Mσ◦Gf (0, 1) − **Gf 2 *
* * *. *
By (11.31) and the relation Gf (0) = m, Gf (1) = M , we have the stated results.
11.4 More on the identric mean For a, b > 0, define γa,b : [0, 1] → R by γa,b (t) = G(u1 , u2 ),
(11.32)
where, as before, G(x, y) denotes the geometric mean of the positive numbers x, y. Theorem 4.1 The mapping γa,b possesses the following properties: (a) γa,b is concave on [0,1];
218
S.S. Dragomir et al.
(b) γa,b is monotone nonincreasing on [0, 1], with γa,b (1) = G(a, b) (c) for t ∈ [0, 1]
and
γa,b (0) = A(a, b);
γa,b (t) ≤ I(u1 , u2 ) ≤ A(a + b);
(d) we have + I
3a + b a + 3b , 4 4
,
+
3a + b a + 3b ≥G , 4 4 ≥ I(a, b) ≥ G(A(a, b), G(a, b)) ≥ G(a, b);
,
(e) for t ∈ [0, 1] we have 1≤
I(u1 , u2 ) A(a, b) ≤ . I(u1 , u2 ) γa,b (t)
(11.33)
Proof. We have readily that for t ∈ [0, 1] γa,b (t) = exp [−G− ln (t)] , H− ln (t) = − ln I(u1 , u2 ). Since the map x :→ exp(−x) is order reversing, (b)–(e) follow from Theorem B(ii),(iii). It remains only to establish (a). Since dui /dt = (−1)i (b − a)/2 for i = 1, 2 and u2 − u1 = t(b − a), we have from (11.32) that b − a u 1 − u2 t(b − a)2 dγa,b = · = − dt 4 (u1 u2 )1/2 4(u1 u2 )1/2 and so ( (b − a)2 (b − a)2 ' 1/2 −3/2 dγa,b 1/2 −3/2 =− u2 u1 < 0. − + u1 u2 1/2 dt 16 8(u1 u2 ) This establishes (a).
We now apply Theorems 3.1 and 3.2 to obtain further information about the identric mean. For t ∈ [0, 1], put ηa,b (t) := I(u1 , u2 ).
11
Interpolating maps, the modulus map and Hadamard’s inequality
219
Because G− ln (t) = − ln γa,b (t),
H− ln (t) = − ln ηa,b (t),
(11.25) provides ln ηa,b (t) − ln γa,b (t) = G− ln (t) − H− ln (t) ≥ La,b (t), where La,b (t) = Mσ (ln u1 , ln u2 ). This yields ηa,b (t) ≥ exp [La,b (t)] ≥ 1 for γa,b (t)
t ∈ [0, 1].
From (11.26) we derive + ln
a+b 2
,
* * + ,* * 1 b √ * a + b ** * − ln ηa,b (t) ≥ * |ln yu2 | dx − **ln * *b − a a 2
* * * * ≥ 0, *
which gives A(a, b) ≥ exp ηa,b (t) ≥1
%* * + ,* * 1 b √ * a + b ** * |ln yu2 | dx − **ln * * *b − a a 2 for
*& * * * *
t ∈ [0, 1].
Also application of (11.29) to the convex function − ln yields ln I(a, b) − ln[G(A(a, b), G(a, b))] ≥ Ka,b , where Ka,b
*
* = **Mσ (ln A(a, b), ln G(a, b)) −
0
Hence
1
* * |ln γa,b (t)| dt** .
I(a, b) ≥ exp [Ka,b ] ≥ 1. G(A(a, b), G(a, b))
Finally, applying (11.30) to the convex mapping − ln provides , + a + 3b 3a + b , − ln I(a, b) ln G 4 4 * * + ,+ ,* * 1 ** 1 a + 3b ** 3a + b * ≥ * |ln [γa,b (t)γa,b (1 − t)]| dt − *ln * 2 0 4 4 = Ma,b ,
* * * *
where Ma,b
* * := **
1 0
* * + ,* * * 3a + b a + 3b ** * * , |ln G(γa,b (t), γa,b (1 − t))* dt − *ln G *. 4 4
220
S.S. Dragomir et al.
Hence G
, 3a+b 4 ≥ exp [Ma,b ] ≥ 1. I(a, b)
a+3b 4
11.5 The mapping Lf We now consider further the properties of the univariate mapping Lf defined in the introduction. First we introduce the useful auxiliaries
b 1 f (u) dx, Af (t) := b−a a Bf (t) :=
1 b−a
b
f (v) dx a
for t ∈ [0, 1]. We have immediately from (11.4) that Lf (t) =
1 [Af (t) + Bf (t)] . 2
The following property is closely connected with the second part of the Hadamard inequality. Proposition 5.1. Suppose a < b and f : [a, b] → R is convex. Then 1 M + Mf (t) + Gf (t) Lf (t) ≤ 2 2 M + Mf (t) ≤ 2 ≤M (11.34) for all t ∈ [0, 1]. Proof. For t ∈ [0, 1] we have 1 Af (t) = w1 − a
w1
f (u)du a
and
1 Bf (t) = b − w2
b
f (v)dv. w2
Substituting Af (t), Bf (t) for the leftmost terms in the known inequalities + ,
w1 a + w1 f (a) + f (w1 ) 1 f (a) + f (w1 ) 1 +f ≤ , f (u)du ≤ w1 − a a 2 2 2 2 1 b − w2
+ , b + w2 f (b) + f (w2 ) 1 f (b) + f (w2 ) +f ≤ f (v)dv ≤ 2 2 2 2 w2
b
11
Interpolating maps, the modulus map and Hadamard’s inequality
221
respectively and adding, gives by (11.4) that , + , + 1 b + w2 1 a + w1 M + {f (w1 ) + f (w2 )} + f +f Lf (t) ≤ 4 2 2 2 ≤
1 1 M + {f (w1 ) + f (w2 )} . 2 2
The first two inequalities in (11.34) follow from (11.6) and (11.7) and the final inequality from (11.8). The first inequality in (11.5) is improved by the following proposition. We introduce the auxiliary variable z = z(t, x) := (1 − t)x + t
a+b 2
for
x ∈ [a, b]
t ∈ [0, 1].
and
Proposition 5.2. Under the assumptions of Proposition 5.1 we have Lf (t) − Hf (1 − t) * * *
w1 * * * * f (u) + f (u + t(b − a)) * 1 * du − Hσ◦f (1 − t)* * ≥ ** * * * (1 − t)(b − a) a 2 ≥0 (11.35) for all t ∈ [0, 1). Proof. Put z = (u + v)/2 = φt ((a + b)/2, x)
for x ∈ [a, b] and t ∈ [0, 1].
By (11.11) and the convexity of f , * * * f (u) + f (v) * f (u) + f (v) * − f (z) = * − f (z)** 2 2 * ** * * f (u) + f (v) * * − |f (z)| ≥ ** ** * 2 ≥0
* * * *
for all x ∈ [a, b] and t ∈ [0, 1]. By Lemma 2.1, integration with respect to x over [a, b] provides * * * 1 b ** f (u) + f (v) ** * * * dx − Hσ◦f (1 − t)** ≥ 0. * Lf (t) − Hf (1 − t) ≥ * * * *b − a a * 2
222
S.S. Dragomir et al.
Since * * * f (u) + f (v) * * dx * * * 2 a *
w1 * * f (u) + f (u + t(b − a)) * 1 * du, * = * (1 − t)(b − a) a * 2
1 b−a
b
inequality (11.35) is proved. Remark 5.3. We can apply Theorem 2.2 to Lf to provide results similar to Theorems 3.1 and 3.2. In fact we may readily verify that the components Af and Bf are themselves convex and so subject to Theorem 2.2. We now apply the above to obtain results for the identric mean. We may compute Af , Bf for f = − ln to derive
w1 1 A− ln (t) = (− ln u)du = − ln I(w1 , a), w1 − a a B− ln (t) = Thus
1 b − w2
b
(− ln u)du = − ln I(b, w2 ). w2
1 [A− ln (t) + B− ln (t)] = − ln ζa,b (t), 2 : [0, 1] → R is defined by
L− ln (t) = where the map ζa,b
ζa,b (t) = G(I(a, w1 ), I(w2 , b)). Theorem 5.4. We have the following. (a) for all t ∈ [0, 1] γa,b (t) ≥ ζa,b (t) ≥ [I(a, b)]1−t [G(a, b)]t ≥ G(a, b); (b) for all t ∈ [0, 1] ηa,b (1 − t) ≥ ζa,b (t)
and
G(ηa,b (t), ηa,b (1 − t)) ≥ ζa,b (t).
Proof. Since ζa,b (t) = exp [−L− ln (t)]
for all t ∈ [0, 1]
and the map x :→ exp(−x) is order reversing, (a) and (b) follow from Theorem C, parts 2 and 3.
Remark 5.5. Similar results may be obtained from Propositions 5.1 and 5.2.
11
Interpolating maps, the modulus map and Hadamard’s inequality
223
References 1. B. C. Carlson, Some inequalities for hypergeometric functions, Proc. Amer. Math. Soc. 17 (1966), 32–39. 2. B. C. Carlson, The logarithmic mean, Amer. Math. Monthly 79 (1972), 615–618. 3. S. S. Dragomir, D. S. Miloˇsevi´ c and J. S´ andor, On some refinements of Hadamard’s inequalities and applications, Univ. Belgrad Publ. Elek. Fak. Sci. Math. 4 (1993), 21–24. 4. S. S. Dragomir and C. E. M. Pearce, Hermite–Hadamard Inequalities, RGMIA Monographs, Victoria University, Melbourne (2000), online: http://rgmia.vu.edu.au/monographs. 5. S. S. Dragomir and E. Pearce, A refinement of the second part of Hadamard’s inequality, with applications, in Sixth Symposium on Mathematics & its Applications, Technical University of Timisoara (1996), 1–9. 6. B. Ostle and H. L. Terwilliger, A comparison of two means, Proc. Montana Acad. Sci. 17 (1957), 69–70. 7. K. B. Stolarsky, Generalizations of the logarithmic mean, Math. Mag. 48 (1975), 87–92. 8. K. B. Stolarsky, The power and generalized of logarithmic means, Amer. Math. Monthly 87 (1980), 545–548.
Chapter 12
Estimating the size of correcting codes using extremal graph problems Sergiy Butenko, Panos Pardalos, Ivan Sergienko, Vladimir Shylo and Petro Stetsyuk
Abstract Some of the fundamental problems in coding theory can be formulated as extremal graph problems. Finding estimates of the size of correcting codes is important from both theoretical and practical perspectives. We solve the problem of finding the largest correcting codes using previously developed algorithms for optimization problems in graphs. We report new exact solutions and estimates. Key words: Maximum independent set, graph coloring, error-correcting codes, coding theory, combinatorial optimization
12.1 Introduction Let a positive integer l be given. For a binary vector u ∈ B l denote by Fe (u) the set of all vectors (not necessarily of dimension l) which can be obtained from u as a consequence of a certain error e, such as deletion or transposition : of bits. A subset C ⊆ B l is said to be an e-correcting code if Fe (u) Fe (v) = ∅ for all u, v ∈ C, u = v. In this chapter we consider the following cases for the error e. • Single deletion (e = 1d): F1d (u) ⊆ B l−1 and all elements of F1d (u) are obtained by deletion of one of the components of u. For example, if l = 4 and u = 0101 then F1d (u) = {101, 001, 011, 010}. See [25] for a survey of single-deletion-correcting codes. Sergiy Butenko, Panos Pardalos University of Florida, 303 Weil Hall, Gainesville, FL 32611, U. S. A. e-mail: butenko,pardalos@ufl.edu Ivan Sergienko, Vladimir Shylo and Petro Stetsyuk Institute of Cybernetics, NAS of Ukraine, Kiev, UKRAINE e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 12, c Springer Science+Business Media, LLC 2009
227
228
S. Butenko et al.
• Two-deletion (e = 2d): F2d (u) ⊆ B l−2 and all elements of F2d (u) are obtained by deletion of two of the components of u. For u = 0101 we have F2d (u) = {00, 01, 10, 11}. • Single transposition, excluding the end-around transposition (e = 1t): F1t (u) ⊆ B l and all elements of F1t (u) are obtained by transposition of a neighboring pair of components in u. For example, if l = 5 and u = 11100 then F1t (u) = {11100, 11010}. • Single transposition, including the end-around transposition (e = 1et): F1et (u) ⊆ B l and all elements of F1et (u) are obtained by transposition of a neighboring pair of components in u, where the first and the last components are also considered as neighbors. For l = 5 and u = 11100 we obtain F1et (u) = {11100, 11010, 01101}. • One error on the Z-channel (e = 1z): F1z (u) ⊆ B l and all elements of F1z (u) are obtained by possibly changing one of the nonzero components of u from 1 to 0. If l = 5 and u = 11100 then F1z (u) = {11100, 01100, 10100, 11000}. The codes correcting one error on the Zchannel represent the simplest case of asymmetric codes. Our problem of interest here is to find the largest correcting codes. It appears that this problem can be formulated in terms of extremal graph problems as follows [24]. Consider a simple undirected graph G = (V, E), where V = {1, . . . , n} is the set of vertices and E is the set of edges. The complement graph of G ¯ = (V, E), ¯ where E ¯ is the complement of E. Given a subset is the graph G W ⊆ V , we denote by G(W ) the subgraph induced by W on G. A subset I ⊆ V is called an independent set (stable set, vertex packing) if the edge set of the subgraph induced by I is empty. An independent set is maximal if it is not a subset of any larger independent set and maximum if there are no larger independent sets in the graph. The independence number α(G) (also called the stability number) is the cardinality of a maximum independent set in G. A subset C ⊆ V is called a clique if G(C) is a complete graph. Consider a graph Gl having a vertex for every vector u ∈ B l , with an edge joining the vertices corresponding to u, v ∈ B l , u = v if and only if : Fe (u) Fe (v) = ∅. Then a correcting code corresponds to an independent set in Gl . Hence the largest e-correcting code can be found by solving the maximum independent set problem in the considered graph. Note that this problem could be equivalently formulated as the maximum clique problem in the complement graph of G. Another discrete optimization problem which we will use to obtain lower bounds for asymmetric codes is the graph coloring problem, which is formulated as follows. A legal (proper) coloring of G is an assignment of colors to its vertices so that no pair of adjacent vertices has the same color. A coloring induces naturally a partition of the vertex set such that the elements of each set in the partition are pairwise nonadjacent; these sets are precisely the subsets of vertices being assigned the same color. If there exists a coloring of G that uses no more than k colors, we say that G admits a k-coloring
12
Estimating the size of correcting codes using extremal graph problems
229
(G is k-colorable). The minimal k for which G admits a k-coloring is called the chromatic number and is denoted by χ(G). The graph coloring problem is to find χ(G) as well as the partition of vertices induced by a χ(G)-coloring. The maximum independent set (clique) and the graph coloring problems are NP-hard [15]; moreover, they are associated with a series of recent results about hardness of approximations. Arora and Safra [2] proved that for some positive the approximation of the maximum clique within a factor of n is NP-hard. H˚ astad [16] has shown that in fact for any δ > 0 the maximum clique is hard to approximate in polynomial time within a factor n1−δ . Similar approximation complexity results hold for the graph coloring problem as well. Garey and Johnson [14] have shown that obtaining colorings using sχ(G) colors, where s < 2, is NP-hard. It has been shown by Lund and Yannakakis [18] that χ(G) is hard to approximate within n for some > 0, and Feige and Kilian [13] have shown that for any δ > 0 the chromatic number is hard to approximate within a factor of n1−δ , unless NP ⊆ ZPP. These results together with practical evidence [17] suggest that the maximum independent set (clique) and coloring problems are hard to solve even in graphs of moderate sizes. Therefore heuristics are used to solve practical instances of these problems. References [3] and [19] provide extensive reviews of the maximum clique and graph coloring problems, respectively. In this chapter, using efficient approaches for the maximum independent set and graph coloring problems, we have improved some of the previously known lower bounds for asymmetric codes and found the exact solutions for some of the instances. The remainder of this chapter is organized as follows. In Section 12.2 we find lower bounds and exact solutions for the largest codes using efficient algorithms for the maximum independent set problem. In Section 12.3 a graph coloring heuristic and the partitioning method are utilized in order to obtain better lower bounds for some asymmetric codes. Finally, concluding remarks are made in Section 12.4.
12.2 Finding lower bounds and exact solutions for the largest code sizes using a maximum independent set problem In this section we summarize the results obtained in [5, 21]. We start with the following global optimization formulation for the maximum independent set problem. Theorem 1 ([1]). The independence number of G satisfies the following equality: n xi (1 − xj ). (12.1) α(G) = max n x∈[0,1]
i=1
(i,j)∈E
230
S. Butenko et al.
This formulation is valid if instead of [0, 1]n we use {0, 1}n as the feasible region, thus obtaining an integer 0–1 programming problem. In problem (12.1), for each vertex i there is a corresponding Boolean expression:
i ←→ ri = xi
;
⎡ ⎣
;
⎤ xj ⎦ .
(i,j)∈E
Therefore the problem of finding a maximum independent set can be reduced to the problem of finding a Boolean vector x∗ which maximizes the number of “true” values among ri , i = 1, . . . , n: * ⎡ ⎤* * n * ; ; * * ∗ * ⎣ xj ⎦ ** . x = argmax *xi * i=1 * (i,j)∈E
(12.2)
To apply local search techniques to the above problem one needs to define a proper neighborhood. We define the neighborhood on the set of all maximal independent sets as follows. For each jq ∈ I, q = 1, . . . , |I|, ⎧ ⎨ Listjq =
⎡ i∈ / I : ri = ⎣xi
⎩
; (i,k)∈E
⎫ has exactly 2 literals ⎬ x ¯k ⎦ with value 0, namely . ⎭ ¯jq = 0 xi = 0 and x ⎤
If the set Listjq is not empty, let I(G(Listjq )) be an arbitrary maximal independent set in G(Listjq ). Then sets of the form (I − {jq }) ∪ I(G(Listjq )), q = 1, . . . , |I|, are maximal independent sets in G. Therefore the neighborhood of a maximal independent set I in G can be defined as follows: O(I) = (I − {jq }) ∪ I(G(Listjq )), jq ∈ I, q = 1, . . . |I|}. We have the following algorithm to find maximal independent sets: 1. Given a randomly generated Boolean vector x, find an appropriate initial maximal independent set I. 2. Find a maximal independent set from the neighborhood (defined for maximal independent sets) of I, which has the largest cardinality.
12
Estimating the size of correcting codes using extremal graph problems
231
We tested the proposed algorithm with the following graphs arising from coding theory. These graphs are constructed as discussed in Section 12.1 and can be downloaded from [24]: • Graphs From Single-Deletion-Correcting Codes (1dc); • Graphs From Two-Deletion-Correcting Codes (2dc); • Graphs From Codes For Correcting a Single Transposition, Excluding the End-Around Transposition (1tc); • Graphs From Codes For Correcting a Single Transposition, Including the End-Around Transposition (1et); • Graphs From Codes For Correcting One Error on the Z-Channel (1zc). The results of the experiments are summarized in Table 12.1. In this table, the columns “Graph,” “n” and “|E|” represent the name of the graph, the number of its vertices and its number of edges. This information is available from [24]. The column “Solution found” contains the size of the largest independent sets found by the algorithm over 10 runs. As one can see the results are very encouraging. In fact, for all of the considered instances they were at least as good as the best previously known estimates. Table 12.1 Lower bounds obtained Graph 1dc128 1dc256 1dc512 1dc1024 1dc2048 2dc128 2dc256 2dc512 2dc1024 1tc64 1tc128 1tc256 1tc512 1tc1024 1tc2048 1et64 1et128 1et256 1et512 1et1024 1et2048 1zc128 1zc256 1zc512 1zc1024 1zc2048 1zc4096
n
|E|
Solution found
128 256 512 1024 2048 128 256 512 1024 64 128 256 512 1024 2048 64 128 256 512 1024 2048 128 256 512 1024 2048 4096
1471 3839 9727 24063 58367 5173 17183 54895 169162 192 512 1312 3264 7936 18944 264 672 1664 4032 9600 220528 1120 2816 6912 16140 39424 92160
16 30 52 94 172 5 7 11 16 20 38 63 110 196 352 18 28 50 100 171 316 18 36 62 112 198 379
232
S. Butenko et al.
12.2.1 Finding the largest correcting codes The proposed exact algorithm consists of the following steps: • • • • •
Preprocessing: finding and removing the set of isolated cliques; Finding a partition which divides the graph into disjoint cliques; Finding an approximate solution; Finding an upper bound; A Branch-and-Bound algorithm.
Below we give more detail on each of these steps. 0. Preprocessing: finding and removing the set of isolated cliques We will call a clique C isolated if it contains a vertex i with the property |N (i)| = |C| − 1. Using the fact that if C is an isolated clique, then α(G) = α(G − G(C)) + 1, we iteratively find and remove all isolated cliques in the graph. After that, we consider each connected component of the obtained graph separately. 1. Finding a partition which divides the graph into disjoint cliques We partition the set of vertices V of G as follows: V =
k /
Ci ,
i=1
where Ci , i = 1, 2, . . . , k, are cliques such that Ci ∩ Cj = ∅, i = j. The cliques are found using a simple greedy algorithm. Starting with C1 = ∅, we pick the vertex j ∈ / C1 that has the maximal number of neighbors among those vertices outside of5C1 which are in the neighborhood of every vertex from C1 . Set C1 = C1 {j}, and repeat recursively, until there is no vertex to add. Then remove C1 from the graph, and repeat the above procedure to obtain C2 . Continue in this way until the vertex set in the graph is empty. 2. Finding an approximate solution An approximate solution is found using the approach described above. 3. Finding an upper bound To obtain an upper bound for α(G) we can solve the following linear program: OC (G) = max
n
xi ,
(12.3)
i=1
s. t.
xi ≤ 1,
j = 1, . . . , m,
(12.4)
i∈Cj
x ≥ 0,
(12.5)
12
Estimating the size of correcting codes using extremal graph problems
233
where Cj ∈ C is a maximal clique and C is a set of maximal cliques with |C| = m. For a general graph the last constraint should read 0 ≤ xi ≤ 1, i = 1, . . . , n. But since an isolated vertex is an isolated clique as well, after the preprocessing step our graph does not contain isolated vertices and the above inequalities are implied by the set of clique constraints (12.4) along with nonnegativity constraints (12.5). We call OC (G) the linear clique estimate. In order to find a tight bound OC (G) one normally needs to consider a large number of clique constraints. Therefore one deals with linear programs in which the number of constraints may be much larger than the number of variables. In this case it makes sense to consider the linear program which is dual to problem (12.3)–(12.5). The dual problem can be written as follows: m yj , (12.6) OC (G) = min j=1
s. t.
m
aij yj ≥ 1, i = 1, . . . , n,
(12.7)
j=1
y ≥ 0, where
(12.8)
1, if i ∈ Cj , aij = 0, otherwise.
The number of constraints in the last LP is always equal to the number of vertices in G. This gives us some advantages in comparison to problem (12.3)–(12.5). If m > n, the dual problem is more suitable for solving with the simplex method and interior point methods. Increasing the number of clique constraints in problem (12.3)–(12.5) only leads to an increase in the number of variables in problem (12.6)–(12.8). This provides a convenient “restart” scheme (start from an optimal solution to the previous problem) when additional clique constraints are generated. To solve problem (12.6)–(12.8) we used a variation of an interior point method proposed by Dikin [8, 9]. We will call this version of interior point method Dikin’s Interior Point Method, or DIPM. We present a computational scheme of DIPM for an LP problem in the following form: min m+n
y∈R
s. t.
m+n
ci yi ,
(12.9)
i=1
Ay = e,
(12.10)
yi ≥ 0, i = 1, . . . , m + n.
(12.11)
234
S. Butenko et al.
Here A is an (m+n)×n matrix in which the first m columns are determined by coefficients aij and columns am+i = −ei for i = 1, . . . , n, where ei is the i-th orth. The vector c ∈ Rm+n has its first m components equal to one and the other n components equal to zero; e ∈ Rn is the vector of all ones. Problem (12.6)–(12.8) can be reduced to this form if the inequality constraints in (12.7) are replaced by equality constraints. As the initial point for the DIPM method we choose y 0 such that ⎧ 2, for i = 1, . . . , m, ⎨ m yi0 = 2 aij − 1, for i = m + 1, . . . , m + n. ⎩ j=1
Now let y k be a feasible point for problem (12.9)–(12.11). In the DIPM method the next point y k+1 is obtained by the following scheme: • Determine Dk of dimension (m + n) × (m + n) as Dk = diag{y k }. • Compute vector cp = (I − (ADk )T (ADk2 AT )−1 ADk )Dk c. • Find ρk =
cpi . cp = yik 1 − α ρik , i = 1, . . . , m + n, where α = 0.9.
max
i=1,...,m+n
• Compute yik+1
As the stopping criterion we used the condition m+n j=1
cj yjk −
m+n
cj yjk+1 < ε, where ε = 10−3 .
j=1
The most labor-consuming operation of this method is the computation of the vector cp . This part was implemented using subroutines DPPFA and DPPSL of LINPACK [10] for solving the following system of linear equations: ADk2 AT uk = ADk2 c, uk ∈ Rn . In this implementation the time complexity of one iteration of DIPM can be estimated as O(n3 ). The values of the vector uk = (ADk2 AT )−1 ADk2 c found from the last system define the dual variables in problem (12.6)–(12.8) (Lagrange multipliers for constraints (12.7)). The optimal values of the dual variables were then used as weight coefficients for finding additional clique constraints, which help to reduce the linear clique estimate OC (G). The problem of finding weighted cliques was solved using an approximation algorithm; a maximum of 1000 cliques were added to the constraints. 4. A Branch-and-Bound algorithm (a) Branching: Based on the fact that the number of vertices from a clique that can be included in an independent set is always equal to 0 or 1.
12
Estimating the size of correcting codes using extremal graph problems
235
(b) Bounding: We use the approximate solution found as a lower bound and the linear clique estimate OC (G) as an upper bound. Tables 12.2 and 12.3 contain a summary of the numerical experiments with the exact algorithm. In Table 12.2 Column “#” contains a number assigned to
Table 12.2 Exact algorithm: Computational results Graph #
1
2
3
1tc128 1 2 3 4
5 5 5 5
4 5 5 4
4.0002 5.0002 5.0002 4.0001
1tc256 1 2 3 4 5
6 10 19 10 6
5 9 13 9 5
5.0002 9.2501 13.7501 9.5003 5.0003
1tc512 1 2 3 4 5 6
10 18 29 29 18 10
7 14 22 22 14 7
7.0003 14. 9221 23.6836 23.6811 14.9232 7.0002
1dc512 1
75
50
51.3167
2dc512 1
16
9
10.9674
1et128 1 2 3 4 5 6
3 6 9 9 6 3
2 4 7 7 4 2
3.0004 5.0003 7.0002 7.0002 5.0003 3.0004
1et256 1 2 3 4 5 6 7
3 8 14 22 14 8 3
2 6 10 12 10 6 2
3.0002 6.0006 12.0001 14.4002 12.0004 6.0005 3.0002
1et512 1 2 3 4 5 6 7 8
3 10 27 29 29 27 10 3
3 7 18 21 21 18 7 3
3.0000 8.2502 18.0006 23.0626 23.1029 18.0009 8.2501 3.0000
236
S. Butenko et al.
Table 12.3 Exact solutions found Graph
n
|E|
α(G)
Time (s)
1dc512 2dc512 1tc128 1tc256 1tc512 1et128 1et256 1et512
512 512 128 256 512 128 256 512
9727 54895 512 1312 3264 672 1664 4032
52 11 38 63 110 28 50 100
2118 2618 7 39 141 25 72 143
each connected component of a graph after the preprocessing. Columns “1,” “2” and “3” stand for the number of cliques in the partition, the solution found by the approximation algorithm and the value of the upper bound OC (G), respectively. In Table 12.3 Column “α(G)” contains the independence number of the corresponding instance found by the exact algorithm; Column “Time” summarizes the total time needed to find α(G). Among the exact solutions presented in Table 12.3 only two were previously known, for 2dc512 and 1et128. The rest were either unknown or were not proved to be exact.
12.3 Lower Bounds for Codes Correcting One Error on the Z-Channel The error-correcting codes for the Z-channel have very important practical applications. The Z-channel shown in Fig. 12.1 is an asymmetric binary channel, in which the probability of transformation of 1 into 0 is p, and the probability of transformation of 0 into 1 is 0. Fig. 12.1 A scheme of the Z-channel
0
1
0
p
1
1−p
1
The problem we are interested in is that of finding good estimates for the size of the largest codes correcting one error on the Z-channel. Let us introduce some background information related to asymmetric codes.
12
Estimating the size of correcting codes using extremal graph problems
237
The asymmetric distance dA (x, y) between vectors x, y ∈ B l is defined as follows [20]: (12.12) dA (x, y) = max{N (x, y), N (y, x)}, where N (x, y) = |{i : (xi = 0) ∧ (yi = 1)}|. It is related to the Hamming distance dH l (x, y) = i=1 |xi − yi | = N (x, y) + N (y, x) by the expression 2dA (x, y) = dH (x, y) + |w(x) − w(y)|,
(12.13)
l where w(x) = i=1 xi = |{i : xi = 1}| is the weight of x. Let us define the minimum asymmetric distance Δ for a code C ⊂ B l as Δ = min {dA (x, y)| x, y ∈ C, x = y}. It was shown in [20] that a code C with the minimum asymmetric distance Δ can correct at most (Δ − 1) asymmetric errors (transitions of 1 to 0). In this subsection we present new lower bounds for codes with the minimum asymmetric distance Δ = 2. Let us define the graph G = (V (l), E(l)), where the set of vertices V (l) = B l consists of all binary vectors of length l, and (vi , vj ) ∈ E(l) if and only if dA (vi , vj ) < Δ. Then the problem of finding the size of the code with minimal asymmetric distance Δ is reduced to the maximum independent set problem in this graph. Table 12.4 contains the lower bounds obtained using the algorithm presented above in this section (some of which were mentioned in Table 12.1). Table 12.4 Lower bounds obtained in: a [27]; b [6]; c [7]; d [12]; e (this chapter) l 4 5 6 7 8 9 10 11 12
Lower Bound
Upper Bound
4 6a 12b 18c 36c 62c 112d 198d 379e (378d)
4 6 12 18 36 62 117 210 410
12.3.1 The partitioning method The partitioning method [4, 12, 26] uses independent set partitions of the vertices of graph G in order to obtain a lower bound for the code size. An
238
S. Butenko et al.
independent set partition is a partition of vertices into independent sets such that each vertex belongs to exactly one independent set, that is, V (l) =
m /
Ii , Ii is an independent set, Ii
<
Ij = ∅, i = j.
(12.14)
i=1
Recall that the problem of finding the smallest m for which a partition of the vertices into m disjoint independent sets exists is the well-known graph coloring problem. The independent set partition (12.14) can be identified by the vector Π(l) = (I1 , I2 , . . . , Im ). We associate the vector π(l) = (|I1 |, |I2 |, . . . , |Im |), which is called the index vector of partition Π(n), with Π(l). Its norm is defined as m |Ii |2 . π(l) · π(l) = i=1
We will assume that |I1 | ≥ |I2 | ≥ . . . ≥ |Im |. In terms of the codes, the independent set partition is a partition of words (binary vectors) into a set of codes, where each code corresponds to an independent set in the graph. Similarly, for the set of all binary vectors of weight w we can
construct a graph G(l, w), in which the set of vertices is the set of the wl vectors, and two vertices are adjacent iff the Hamming distance between the corresponding vectors is less than 4. Then an independent set partition w ) Π(l, w) = (I1w , I2w , . . . , Im
can be considered in which each independent set will correspond to a subcode with minimum Hamming distance 4. The index vector and its norm are defined in the same way as for Π(n). By the direct product Π(l1 ) × Π(l2 , w) of a partition of asymmetric codes Π(l1 ) = (I1 , I2 , . . . , Im1 ) and a partition of constant weight codes Π(l2 , w) = w ) we will mean the set of vectors (I1w , I2w , . . . , Im 2 C = {(u, v) : u ∈ Ii , v ∈ Iiw , 1 ≤ i ≤ m}, where m = min{m1 , m2 }. It appears that C is a code of length l = l1 + l2 with minimum asymmetric distance 2, that is, a code correcting one error on the Z-channel of length l = l1 + l2 [12]. In order to find a code C of length l and minimum asymmetric distance 2 by the partitioning method, we can use the following construction procedure:
12
Estimating the size of correcting codes using extremal graph problems
1. Choose l1 and l2 such that l1 + l2 = n. 2. Choose = 0 or 1. 3. Set , l2 /2 + / Π(l1 ) × Π(l2 , 2i + ) . C=
239
(12.15)
i=0
12.3.2 The partitioning algorithm One of the popular heuristic approaches to the independent set partitioning (graph coloring) problem is the following. Suppose that a graph G = (V, E) is given. INPUT: G = (V, E); OUTPUT: I1 , I2 , . . . , Im . 0. i = 1; 1. while G = ∅ find a maximal independent set I; set Ii = I; i = i + 1; G = G − G(I), where G(I) is the subgraph induced by I; end In [22, 23] an improvement of this approach was proposed by finding at each step a specified number of maximal independent sets. Then a new graph G is constructed, in which a vertex corresponds to a maximal independent set, and two vertices are adjacent iff the corresponding independent sets have common vertices. In the graph G, a few maximal independent sets are found, and the best of them (say, the one with the least number of adjacent edges in the corresponding independent sets of G) is chosen. This approach is formally described in Figure 12.2.
12.3.3 Improved lower bounds for code sizes The partitions obtained using the described partition algorithm are given in Tables 12.5 and 12.6. These partitions, together with the facts that [11] Π(l, 0) consists of one (zero) codeword, Π(l, 1) consists of l codes of size 1, Π(l, 2) consists of l − 1 codes of size l/2 for even l, the index vectors of Π(l, w) and Π(l, l − w) are equal
240
S. Butenko et al.
Fig. 12.2 Algorithm for finding independent set partitions
Table 12.5 Partitions of asymmetric codes found l1
#
8 9
1 1 2 3 4 5 6 7 8 9 10 11 12 13 14
10
1 2 3 4 5 6 7
Partition Index Vector
62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62,
36,34, 34, 33, 30, 29, 26, 25, 9 62, 62, 61, 58, 56, 53, 46, 29, 18, 62, 62, 62, 58, 56, 53, 43, 32, 16, 62, 62, 61, 58, 56, 52, 46, 31, 17, 62, 62, 62, 58, 56, 52, 43, 33, 17, 62, 62, 62, 58, 56, 54, 42, 31, 15, 62, 62, 60, 57, 55, 52, 45, 31, 18, 62, 62, 60, 58, 55, 51, 45, 37, 16, 62, 62, 60, 58, 56, 53, 45, 32, 16, 62, 62, 62, 58, 56, 52, 43, 32, 17, 62, 62, 60, 58, 56, 53, 45, 31, 18, 62, 62, 62, 58, 56, 50, 45, 32, 18, 62, 62, 61, 58, 56, 51, 45, 30, 22, 62, 62, 62, 58, 56, 50, 44, 34, 16, 62, 62, 62, 58, 55, 51, 44, 32, 20,
5 6 5 5 8 8 4 6 6 5 5 3 6 4
112, 110, 110, 109, 105, 100, 99, 88, 75, 59, 37, 16, 4 112, 110, 110, 109, 105, 101, 96, 87, 77, 60, 38, 15, 4 112, 110, 110, 108, 106, 99, 95, 89, 76, 60, 43, 15, 1 112, 110, 110, 108, 105, 100, 96, 88, 74, 65, 38, 17, 1 112, 110, 110, 108, 106, 103, 95, 85, 76, 60, 40, 15, 4 112, 110, 110, 108, 106, 101, 95, 87, 75, 61, 40, 17, 2 112, 110, 109, 108, 105, 101, 96, 86, 78, 63, 36, 17, 3
Norm
m
7820 27868 27850 27848 27832 27806 27794 27788 27782 27778 27776 27774 27772 27760 27742
9 11 11 11 11 11 11 11 11 11 11 11 11 11 11
97942 97850 97842 97828 97720 97678 97674
13 13 13 13 13 13 13
12
Estimating the size of correcting codes using extremal graph problems
241
Table 12.6 Partitions of constant weight codes obtained in: a (this chapter); b [4]; c [12] l2
w
#
Partition Index-Vector
Norm
m
10 12 12 12 12 14 14 14
4 4 4 4 6 4 4 6
1a 1a 2a 3a 1a 1c 2c 1b
30, 30, 30, 30, 26, 25, 22, 15, 2 51, 51, 51, 51, 49, 48, 48, 42, 42, 37, 23, 2 51, 51, 51, 51, 49, 48, 48, 45, 39, 36, 22, 4 51, 51, 51, 51, 49, 48, 48, 45, 41, 32, 22, 6 132, 132, 120, 120, 110, 94, 90, 76, 36, 14 91, 91, 88, 87, 84, 82, 81, 79, 76, 73, 66, 54, 38, 11 91, 90, 88, 85, 84, 83, 81, 79, 76, 72, 67, 59, 34, 11, 1 278, 273, 265, 257, 250, 231, 229, 219, 211, 203, 184, 156, 127, 81, 35, 4
5614 22843 22755 22663 99952 78399 78305 672203
9 12 12 12 10 14 15 16
were used in (12.15), with = 0, to obtain new lower bounds for the asymmetric codes presented in Table 12.7. To illustrate how the lower bounds were computed, let us show how the code for l = 18 was constructed. We use l1 = 8 and l2 = 10: |Π(8) × Π(10, 0)| = 36 · 1 = 36; |Π(8) × Π(10, 2)| = 256 · 5 = 1280; |Π(8) × Π(10, 4)| = 36 · 30 + 34 · 30 + 34 · 30 + 33 · 30 + 30 · 26 + 29 · 25 + 26 · 22 + 25 · 15 + 9 · 2 = 6580; |Π(8) × Π(10, 6)| = |Π(8) × Π(10, 4)| = 6580; |Π(8) × Π(10, 8)| = |Π(8) × Π(10, 2)| = 1280; |Π(8) × Π(10, 10)| = |Π(8) × Π(10, 0)| = 36; The total is 2(36 + 1280 + 6580) = 15792 codewords. Table 12.7 New lower bounds. Previous lower bounds were found in: a [11]; b [12] l 18 19 20 21 22 24
Lower Bound New Previous 15792 29478 56196 107862 202130 678860
15762a 29334b 56144b 107648b 201508b 678098b
12.4 Conclusions In this chapter we have dealt with binary codes of given length correcting certain types of errors. For such codes, a graph can be constructed in which each vertex corresponds to a binary vector and the edges are built such
242
S. Butenko et al.
that each independent set corresponds to a correcting code. The problem of finding the largest code is thus reduced to the maximum independent set problem in the corresponding graph. For asymmetric codes, we also applied the partitioning method, which utilizes independent set partitions (or graph colorings) in order to obtain lower bounds for the maximum code sizes. We use efficient approaches to the maximum independent set and graph coloring problems to deal with the problem of estimating the largest code sizes. As a result, some improved lower bounds and exact solutions for the size of the largest error-correcting codes were obtained. Acknowledgments We would like to thank two anonymous referees for their valuable comments.
References 1. J. Abello, S. Butenko, P. Pardalos and M. Resende, Finding independent sets in a graph using continuous multivariable polynomial formulations, J. Global Optim. 21(4) (2001), 111–137. 2. S. Arora and S. Safra, Approximating clique is NP–complete, Proceedings of the 33rd IEEE Symposium on Foundations on Computer Science (1992) (IEEE Computer Society Press, Los Alamitos, California, 1992), 2–13. 3. I. M. Bomze, M. Budinich, P. M. Pardalos and M. Pelillo, The maximum clique problem, in D.-Z. Du and P. M. Pardalos, Eds, Handbook of Combinatorial Optimization (Kluwer Academic Publishers, Dordrecht, 1999), 1–74. 4. A. Brouwer, J. Shearer, N. Sloane and W. Smith, A new table of constant weight codes, IEEE Trans. Inform. Theory 36 (1990), 1334–1380. 5. S. Butenko, P. M. Pardalos, I. V. Sergienko, V. Shylo and P. Stetsyuk, Finding maximum independent sets in graphs arising from coding theory, Proceedings of the 17th ACM Symposium on Applied Computing (ACM Press, New York, 2002), 542–546. 6. S. D. Constantin and T. R. N. Rao, On the theory of binary asymmetric error correcting codes, Inform. Control 40 (1979), 20–36. 7. P. Delsarte and P. Piret, Bounds and constructions for binary asymmetric error correcting codes, IEEE Trans. Inform. Theory IT-27 (1981), 125–128. 8. I. I. Dikin, Iterative solution of linear and quadratic programming problems, Dokl. Akad. Nauk. SSSR 174 (1967), 747–748 (in Russian). 9. I. I. Dikin and V. I. Zorkal’tsev, Iterative Solution of Mathematical Programming Problems (Algorithms for the Method of Interior Points) (Nauka, Novosibirsk, 1980). 10. J. Dongarra, C. Moler, J. Bunch and G. Stewart, Linpack users’ guide, http://www.netlib.org/linpack/index.html, available from the ICTP Library, 1979. 11. T. Etzion, New lower bounds for asymmetric and undirectional codes, IEEE Trans. Inform. Theory 37 (1991), 1696–1704. 12. T. Etzion and P. R. J. Ostergard, Greedy and heuristic algorithms for codes and colorings, IEEE Trans. Inform. Theory 44 (1998), 382–388. 13. U. Feige and J. Kilian, Zero knowledge and the chromatic number, J. Comput. System Sci. 57 (1998), 187–199. 14. M. R. Garey and D. S. Johnson, The complexity of near–optimal coloring, JACM 23 (1976), 43–49. 15. M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP–completeness (Freeman, San Francisco, 1979).
12
Estimating the size of correcting codes using extremal graph problems
243
16. J. H˚ astad, Clique is hard to approximate within n1− , Acta Math. 182 (1999), 105–142. 17. D. S. Johnson and M. A. Trick (Eds), Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, Vol. 26 of DIMACS Series, (American Mathematical Society, Providence, RI, 1996). 18. C. Lund and M. Yannakakis, On the hardness of approximating minimization problems, JACM 41 (1994), 960–981. 19. P. M. Pardalos, T. Mavridou and J. Xue, The graph coloring problem: a bibliographic survey, in D.-Z. Du and P. M. Pardalos, Eds, Handbook of Combinatorial Optimization, Vol. 2 (Kluwer Academic Publishers, Dordrecht, 1999), 331–395. 20. T. R. N. Rao and A. S. Chawla, Asymmetric error codes for some lsi semiconductor memories, Proceedings of the 7th Southeastern Symposium on System Theory (1975) (IEEE Computer Society Press, Los Alamitos, California, 1975), 170–171. 21. I. V. Sergienko, V. P. Shylo and P. I. Stetsyuk, Approximate algorithm for solving the maximum independent set problem, in Computer Mathematics, (V.M. Glushkov Institute of Cybernetics NAS of Ukraine, Kiev, 2001), 4–20 (in Russian). 22. V. Shylo, New lower bounds of the size of error–correcting codes for the Z–channel, Cybernet. Systems Anal. 38 (2002), 13–16. 23. V. Shylo and D. Boyarchuk, An algorithm for construction of covering by independent sets, in Computer Mathematics (V.M. Glushkov Institute of Cybernetics NAS of Ukraine, Kiev, 2001), 151–157. 24. N. Sloane, Challenge problems: Independent sets in graphs, http://www.research. att.com/∼njas/doc/graphs.html, 2001. 25. N. Sloane, On single–deletion–correcting codes, in K. T. Arasu and A. Suress, Eds, Codes and Designs: Ray–Chaudhuri Festschrift (Walter de Gruyter, Berlin, 2002), 273–291. 26. C. L. M. van Pul and T. Etzion, New lower bounds for constant weight codes, IEEE Trans. Inform. Theory 35 (1989), 1324–1329. 27. R. R. Varshamov, A class of codes for asymmetric channels and a problem from the additive theory of numbers, IEEE Trans. Inform. Theory IT–19 (1973), 92–95.
Chapter 13
New perspectives on optimal transforms of random vectors P. G. Howlett, C. E. M. Pearce and A. P. Torokhti
Abstract We present a new transform which is optimal over the class of transforms generated by second-degree polynomial operators. The transform is based on the solution of the best constrained approximation problem with the approximant formed by a polynomial operator. It is shown that the new transform has advantages over the Karhunen–Lo`eve transform, arguably the most popular transform, which is optimal over the class of linear transforms of fixed rank. We provide a strict justification of the technique, demonstrate its advantages and describe useful extensions and applications. Key words: Optimal transforms, singular-value decomposition, filtering, compression, tensors, random signals
13.1 Introduction and statement of the problem Optimal transforms of random vectors have been applied succesfully to many problems in signal processing including, for example, the filtering and
P. G. Howlett Centre for Industrial and Applicable Mathematics, The University of South Australia, Mawson Lakes, SA 5095, AUSTRALIA e-mail: [email protected] C. E. M. Pearce School of Mathematical Sciences, The University of Adelaide, Adelaide SA 5005, AUSTRALIA e-mail: [email protected] A. P. Torokhti Centre for Industrial and Applicable Mathematics, The University of South Australia, Mawson Lakes, SA 5095, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 13, c Springer Science+Business Media, LLC 2009
245
246
P.G. Howlett et al.
compression of random signals and the classification and clustering of signals [4, 8, 18]. Known transforms are mainly based on linear models. The Karhunen– Lo`eve transform is perhaps the most popular linear transform and achieves the smallest associated error of all linear transforms of fixed rank. Recently Hua and Liu [8] generalized it to the case where no relationship is assumed between a stochastic signal and noise. Although the associated error cannot be reduced by the use of any other linear transform of the same rank, the performance of this transform is still unsatisfactory in many applications. See the simulations in Section 13.7 in this connection. In this chapter we present a new nonlinear transform with a substantially better performance than that of the generalized Karhunen– Lo`eve transform (GKLT) of [8]. In particular, we show that for the same rank our transform possesses a much smaller associated error. Our method is based on the best constrained approximation of a stochastic signal by an approximant generated by a second-degree polynomial operator. The technique is based on the primary concept presented in [14]–[16]. We begin with a rigorous statement of the problem. Let (Ω, Σ, μ) be a probability space, with Ω the set of outcomes, Σ the minimal σ-field of measurable subsets of Ω and μ : Σ → [0, 1] an associated probability measure on Σ. Suppose that x ∈ L2 (Ω, Rm ) and y ∈ L2 (Ω, Rn ) are random vectors with realizations x(ω) ∈ Rm and y(ω) ∈ Rn . We interpret x as a given “idealized” signal (without any distortion) and y as an observed signal. In particular, y can be interpreted as x contaminated with noise so that no specific relationships between signal and noise are assumed. For instance, noise can be additive, multiplicative or a combination of the two. Each operator F : Rm → Rn defines an associated operator FF : 2 L (Ω, Rm ) → L2 (Ω, Rn ) via FF [(x)](ω) = F [x(ω)]
for each
ω ∈ Ω.
(13.1)
It is customary to write F (x) rather than FF (x), since we have [F (x)](ω)= F [x(ω)] for each ω ∈ Ω. It is also convenient to write x for x(ω), y for y(ω), etc. Let T : Rn → Rm be the operator associated with the map TT : 2 L (Ω, Rn ) → L2 (Ω, Rm ) by an equation similar to (13.1). Suppose T is given by T (y) = A +
n
Bj zj ,
(13.2)
j=0
where A ∈ Rm , Bj ∈ Rm×n (j = 0, 1, . . . n), z0 = y, y = (y1 , . . . , yn )T ∈ Rn and zj = yj y for j = 1, . . . n. Then the operator T is completely defined by A and Bj for j = 0, 1, . . . n.
13
New perspectives on optimal transforms of random vectors
247
The problem is to find a vector A0 and matrices Bj0 such that J(A0 , B00 , . . . Bn0 ) =
min
A,B0 ,...,Bn
J(A, B0 , . . . , Bn ),
(13.3)
subject to rank[A B0 B1 . . . Bn ] = r
(13.4)
with r ≤ m. Here
⎡2 ⎞22 ⎤ ⎛ 2 2 n 2 ⎥ ⎢2 2 ⎠ ⎝A + B z J(A, B0 , . . . , Bn ) = E ⎣2 x − j j 2 ⎦, 2 2 2 j=0
(13.5)
with E the expectation operator, · the Frobenius norm, q = n2 + n + 1 and [A B0 B1 . . . Bn ] ∈ Rm×q .
13.2 Motivation of the statement of the problem Equations (13.3)–(13.5) represent the best constrained approximation problem. It is well known that a nonlinear approximant normally possesses a smaller associated error than that of a linear approximation. Therefore it is natural to seek a suitable nonlinear form of approximant. Let us consider the nonlinear operator T given by T (y) = A + B0 y + C(y, y),
(13.6)
where C : Rn × Rn → Rm is a bilinear operator, that is, (y, y) ∈ Rn × Rn . The operator C is a (m × n × n) − −tensor, C = {cijk } ∈ Rm×n×n . Therefore the vector C(y, y) can be presented as the product of a tensor C and vector y and also as a product of the matrix Cy with the vector y. As a result we have C(y, y) = (Cy)y = B1 y1 y + . . . + Bn yn y, where B1 = {ci1k }∈ Rm×n , . . . , Bn = {cink } ∈ Rm×n . Alternatively, B1 = {cij1 } ∈ Rm×n , . . . , Bn = {cijn } ∈ Rm×n . Hence (13.6) coincides with (13.2). In the following four sections we show that the transform T 0 produced by the nonlinear approximant T 0 (y) = A0 +
n
Bj0 zj
(13.7)
j=0
possesses a much smaller associated error than that of the GKLT. We then proceed to address applications and simulations.
248
P.G. Howlett et al.
13.3 Preliminaries For any u, w ∈ Rn , we write Euw = E[uwT ] and E uw = Euw − E[u]E[wT ]. The symbol M † denotes the Moore–Penrose pseudo-inverse of a matrix M (see [2]). Lemma 13.3.1 We have the relations E xy E †yy E yy = E xy ,
E zk y E †yy E yy = E zk y
and
E xzk E †zk zk E zk zk = E xzk . (13.8)
2
Lemma 13.3.2 Let z = [z1T · · · znT ]T ∈ Rn , D = E zz − E zy E †yy E yz Then
and
G = E xz − E xy E †yy E yz .
GD† D = G.
The proofs are similar to those of Lemmas 2 and 3 in [14]. For the following result it is convenient to write s = [1 y T z T ]T ∈ Rq . Lemma 13.3.3 Let P11 = 1 − P12 E[y] − P13 E[z],
T P12 = P21 ,
P13 = −E[y T ]P23 − E[z T ]P33 ,
P21 = −P22 E[y] − P23 E[z], P22 = E †yy − P23 E zy E †yy , P32 = −P33 E zy E †yy ,
P31 = −P33 E[z] − P32 E[y],
T P23 = P32 ,
P33 = D† .
⎤ P11 P12 P13 = ⎣ P21 P22 P23 ⎦ . P13 P32 P33 ⎡
Then † Ess
Proof. Let t=
1 , y
(13.9)
S11 = 1 − S12 E[y],
S12 = −E[y T ]S22 , First we show that † = Ett
T S21 = S12 ,
S11 S12 . S21 S22
S22 = E †yy .
(13.10)
13
New perspectives on optimal transforms of random vectors
We have Ett
249
S11 S12 Q11 Q12 Ett = , S21 S22 Q21 Q22
where Q11 = S11 + E[y T ]S21 + S12 E[y] + E[y T ]S22 E[y] = 1, Q12 = S11 E[y T ] + E[y T ]S21 E[y T ] + S12 Eyy + E[y T ]S22 Eyy = E[y T ], Q21 = E[y]S11 + Eyy S21 + E[y]S12 E[y] + Eyy S22 E[y] = E[y] and Q22 = E[y]S11 E[y T ] + Eyy S21 E[y T ] + E[y]S12 Eyy + Eyy S22 Eyy = Eyy . S11 S12 Ett = Ett , that is, the first condition for the Moore– Hence Ett S21 S22 Penrose inverse of Ett to be given by (13.10) is satisfied. The remain† ing Moore–Penrose conditions for Ett are also easily verified, and therefore (13.10) is established. Next, let † † T − R12 Ezt Ett , R12 = R21 , (13.11) R11 = Ett † R21 = −R22 Ezt Ett
and
† R22 = Dzt ,
(13.12)
† Etz = D. where Dzt = Ezz − Ezt Ett Arguing much as above, we have by Lemmas 13.3.1 and 13.3.2 that R11 R12 † . (13.13) = Ess R21 R22
Relation (13.9) follows from (13.13) by virtue of (13.10)–(13.12).
13.4 Main results We denote by U ΣV T the singular-value decomposition of Exs (Ess )† ,that is, 1/2
1/2 † ) , U ΣV T = Exs (Ess
(13.14)
where U = (u1 , . . . , uq ) ∈ Rm×q
and
V = (v1 , . . . , vq ) ∈ Rq×q
are orthogonal matrices and Σ = diag(σ1 , . . . , σq ) ∈ Rq×q is a diagonal matrix with σ1 ≥ · · · ≥ σk > 0 and σk+1 = · · · = σq = 0. Put Ur = (u1 , . . . , ur ),
Vr = (v1 , . . . , vr ),
Σr = diag(σ1 , . . . , σr )
250
P.G. Howlett et al.
and define Θr = Θr(x,s) = Ur Σr VrT .
(13.15)
Suppose Φ = [A B0 . . . Bn ] ∈ Rm×q and let Φ(:, η : τ ) be the matrix formed by the τ − η + 1 sequential columns of Φ beginning with column η. The optimal transform T 0 , introduced by (13.7), is defined by the following theorem. Theorem 13.4.1 The solution to problem (13.3) is given by A0 = Φ0 (:, 1 : 1), Bj0 = Φ0 (:, jn + 2 : jn + n + 1),
(13.16)
for j = 0, 1, . . . n, where 1/2 † 1/2 1/2 † ) + Mr [I − Ess (Ess ) ], Φ0 = Θr (Ess
with I the identity matrix and Mr ∈ Rm×q any matrix such that rank Φ 0 ≤ r < m. Proof. We have J(A, B0 , . . . , Bn ) = E[ x − Φs 2 ]. By Lemma 13.3.1, J(A, B0 , . . . , Bn ) 9 8 9 8 † † Esx + tr (Φ − Exs Ess )Ess (Φ − Exs Es† )T = tr Exx − Exs Ess 22 9 2 8 2 † † 1/2 2 = tr Exx − Exs Ess Esx + 2(Φ − Exs Ess )Ess (13.17) 2 . The minimum of this functional subject to constraint (13.3) is achieved if 1/2 = Θr ΦEss
(13.18)
(see [5]). Here we have used † 1/2 1/2 T 1/2 † 1/2 T 1/2 † † 1/2 Ess = ([Ess ] Ess ) [Ess ] = (Ess ) = (Ess ) . Ess
The necessary and sufficient condition (see [2]) for (13.18) to have a solution is readily verified to hold and provides the solution Φ = Φ0 . The theorem is proved. Remark 1. The proof above is based on Lemma 13.3.1. The first equation in (13.8) has been presented in [8] but without proof. Theorem 13.4.2 Let † Eyz )(D† )1/2 2 . Δ = (Exz − Exy Eyy
13
New perspectives on optimal transforms of random vectors
251
The error associated with the transform T 0 is E[ x − T 0 (y) 2 ] = tr{Exx } +
k
† 1/2 2 σi2 − Exy (Eyy ) − Δ.
i=r+1
Proof. By Lemma 13.3.3, it follows from (13.17) and (13.18) E[ x − T 0 (y) 2 ] = tr{Exx − Exy E †yy Eyx } + U ΣV T − Θr 2 − (Exz − Exy E †yy Eyz )(D† )1/2 2 , where U ΣV T − Θr 2 =
k
σj2
j=r+1
(see [5]). This proves the theorem.
13.5 Comparison of the transform T 0 and the GKLT The GKLT is a particular case of our transform T 0 with A0 = O and each Bj0 = O in (13.16), where O is the corresponding zero vector or zero matrix. To compare the transform T 0 with the GKLT, we put A0 = O in (13.16). Then the vector s in (13.14) can be written as s = s˜ = [y T z T ]T . We denote by σ ˜j the eigenvalues in (13.14) for s = s˜ and by T˜0 the transform which follows from (13.7) and (13.16) with A0 = O and s = s˜. We denote the GKLT by H. Theorem 13.5.1 Let ϑ1 , . . . , ϑl be the nonzero singular values of the matrix 1/2 Exy (Eyy )† , rank H = p ≤ l and D = Ezz − Ezy E †yy Eyz . If l 2 22 2 2 † σ ˜j2 < 2(Exz − Exy Eyy Eyz )(D† )1/2 2 + ϑ2i ,
k j=r+1
(13.19)
i=p+1
then the error associated with the transform T˜0 is less than that associated with H, that is, 2(2 '2 2 2 E 2x − T˜0 (y)2 < E[ x − Hy ]2 .
Proof. It is easily shown that † E[ x − Hy ]2 = tr{Exx − Exy Eyy Eyx } +
l i=p+1
ϑ2i .
252
P.G. Howlett et al.
Hence 2(2 '2 2 2 E[ x − Hy ]2 − E 2x − T˜0 (y)2 l k 22 2 2 † † 1/2 2 2 = 2(Exz − Exy Eyy Eyz )(D ) 2 + ϑi − σ ˜j2 , i=p+1
j=r+1
giving the desired result. Condition (13.19) is not restrictive and is normally satisfied in practice. In this connection, see the results of the simulations in Section 13.8.
13.6 Solution of the unconstrained minimization problem (13.3) We now address the solution of the minimization problem (13.3) without the constraint (13.4). This is important in its own right. The solution is a special form of the transform T 0 and represents a model of the optimal nonlinear filter with x an actual signal and y the observed data. Let ⎤ ⎡ D11 . . . D1n ⎢ D21 . . . D2n ⎥ ⎥ P=⎢ ⎣ . . . . . . . . . ⎦ and G = [G1 G2 . . . Gn ], Dn1 . . . Dnn where Dij = E zi zj − E zi y E †yy E yzj ∈ Rn×n
and
Gj = E xzj − E xy E †yy E yzj ∈ Rm×n
for i, j = 1, . . . , n. We denote a solution to the unconstrained problem (13.3) using the same symbols as before, that is, with A0 and Bj0 for j = 0, · · · , n. Theorem 13.6.1 The solution to the problem (13.3) is given by n A0 = E[x] − B00 E[y] − Bk0 E[zk ],
(13.20)
k=1 n
1/2 † Bk0 E zk y )E †yy + M1 [I − E 1/2 yy (E yy ) ],
(13.21)
[B10 B20 . . . Bn0 ] = GP † + M2 [I − P 1/2 (P 1/2 )† ],
(13.22)
B00 = (E xy −
k=1
13
New perspectives on optimal transforms of random vectors
253 2
where B10 , B20 , . . . , Bn0 ∈ Rm×n , M1 ∈ Rm×n and M2 ∈ Rm×n are arbitrary matrices. Theorem 13.6.2 Let
⎡
Q11 ⎢ Q21 † P =⎢ ⎣ ... Qn1
... ... ... ...
⎤ Q1n Q2n ⎥ ⎥, ... ⎦ Qnn
where Qij ∈ Rn×n for i, j = 1, . . . , n. The error associated with the transform T (1) defined by n Bj0 zj , T (1) (y) = A0 + j=0 0
with A and
Bj0
given by (13.20)–(13.22), is
E[ x − T (1) (y) 2 ] = tr{E xx } − E xy (E †yy )1/2 2 − −
n
1/2
Gi Qii 2
i=1
tr{Gj Qjk GTk }.
j,k=1,...,n
j =k
The proofs of both theorems are similar to those of Theorems 1 and 2 in [14]. It follows from Theorem 13.6.2 that the filter T (1) has a much smaller associated error than the error † 1/2 2 ) E[ x − H (1) (y) 2 ] = tr{Exx } − Exy (Eyy † associated with the optimal linear filter H (1)) = Exy Eyy in [8].
13.7 Applications and further modifications and extensions Applications of our technique are abundant and include, for example, simultaneous filtering and compression of noisy stochastic signals, feature selection in pattern recognition, blind channel equalization and the optimal rejection of colored noise in some neural systems. For the background to these applications see [1, 4, 8, 18, 19]. The efficiency of a fixed-rank transform is mainly characterized by two parameters; the compression ratio (see [8]) and the accuracy of signal restoration. The signal compression is realized through the following device. Let p be the rank of the transform H. Then H can be represented in the form
254
P.G. Howlett et al.
H = H1 H2 , where the matrix H2 ∈ Rp×n relates to compression of the signal and the matrix H1 ∈ Rm×p to its reconstruction. The compression ratio of the transform is given by cH = p/m. Similarly the transform T 0 can be represented in the form Φ0 = C1 C2 , where, for example, C1 = Ur ∈ Rm×r 1/2 † and C2 = Σr VrT (Ess ) ∈ Rr×q , so that the matrix C2 is associated with compression and the matrix C1 with signal reconstruction. The compression ratio of the transform T 0 is given by cT = r/m. Modifications of the method are motivated mostly by a desire to reduce the computation entailed in the estimation of the singular-value decomposition 1/2 of Exs (Ess )† . This can be done by exploiting the representation (13.20)– (13.22) in such a way that the matrices B10 , · · · , Bn0 in (13.22) are estimated by a scheme similar to the Gaussian elimination scheme in linear algebra. A rank restriction can then be imposed on the matrices B10 , · · · , Bn0 that will bring about reduction of the computational work in finding certain pseudoinverse matrices. Extensions of the technique can be made in the following directions. First, the method can be combined with a special iterative procedure to improve the associated accuracy of the signal estimation. Secondly, an attractive extension may be based on the representation of the operator T (13.6) in the form T (y) = A0 + A1 y + A2 (y, y) + · · · + Ak (y, . . . , y), where Ak : (Rn )k → Rm is a k-linear operator. Thirdly, a natural extension is to apply the technique to the optimal synthesis of nonlinear sytems. Background material can be found in papers by Sandberg (see, for example, [11, 12]) and also [6, 7, 13, 16].
13.8 Simulations The aim of our simulations is to demonstrate the advantages of T 0 over the GKLT H. To this end, we use the standard digitized image “Lena” presented by a 256 × 256 matrix X. To compare the transforms T 0 and H for different noisy signals, we partition the matrix X into 128 submatrices Xij ∈ R16×32 with i = 1, . . . , 16 and j = 1, . . . , 8 and treat each Xij as a set of 32 realizations of a random vector so that a column of Xij represents the vector realization. Observed data have been simulated in the form (1)
(2)
Yij = 10 ∗ Rij . ∗ Xij + 500 ∗ Rij , (1)
with i = 1, . . . , 16 and j = 1, . . . , 8, where each Rij is a matrix with entries (2)
uniformly distributed over the interval (0, 1) and each Rij is a matrix with normally distributed entries with mean 0 and variance 1. The symbol .∗ signifies Hadamard matrix multiplication.
13
New perspectives on optimal transforms of random vectors
255
The transforms T 0 and H have been applied to each pair Xij , Yij with the same rank r = 8, that is, with the same compression ratio. The corresponding covariance matrices have been estimated from the samples Xij and Yij . Special methods for their estimation can be found, for example, in [3, 9, 10] and [17]. Table 13.1 represents the values of ratios ρij = Xij − H(Yij ) 2 / Xij − T 0 (Yij ) 2 for each i = 1, . . . , 16 and j = 1, . . . , 8, where Xij − H(Yij ) 2 and Xij − T 0 (Yij ) 2 are the errors associated with the transforms H and T 0 , respectively. The value ρij is placed in the cell situated in row i and column j.
Table 13.1 Ratios ρij of the error associated with the GKLT H to that of the transform T 0 with the same compression ratios ↓i j → 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1
2
3
4
5
6
7
8
5268.3 2168.4 2269.3 1394.3 3352.4 1781.5 2077.4 3137.2 2313.2 1476.0 1836.7 1808.5 1849.1 2123.6 1295.1 2125.5
3880.6 995.1 803.5 716.2 1970.1 758.6 1526.0 901.2 117.0 31.5 35.3 74.5 17.6 54.9 136.3 114.9
1864.5 1499.7 158.4 173.7 98.9 93.6 67.4 27.1 18.0 35.7 36.4 38.2 30.3 38.6 31.8 31.5
1094.7 338.6 136.4 62.9 192.8 79.3 30.3 38.5 39.3 119.3 1015.5 419.0 492.4 302.0 711.1 732.3
2605.4 1015.1 66.7 451.4 390.0 59.8 172.5 475.3 180.6 859.3 460.6 428.0 1175.5 1310.5 2561.7 2258.2
2878.0 3324.0 2545.4 721.6 92.8 223.2 70.3 445.6 251.0 883.5 487.0 387.2 135.8 2193.8 5999.2 5999.2
4591.6 2440.5 1227.1 227.8 680.4 110.5 1024.4 1363.2 1500.4 2843.1 2843.1 2616.9 1441.9 2681.5 550.7 550.7
1052.7 336.1 326.6 691.6 3196.8 2580.8 4749.3 2917.5 2074.2 3270.6 8902.3 8895.3 1649.2 1347.9 996.0 427.1
Inspection of Table 13.1 shows that, for the same compression ratio, the transform T 0 has associated error varying from one part in 17.6 to one part in 8,895.3 to that of the transform H. We also applied our filter T (1) (constructed from Theorem 13.6.1) and the † to the same signals and data as above, optimal linear filter H (1) = Exy Eyy that is, to each pair Xij , Yij with i = 1, · · · , 16 and j = 1, · · · , 8. The errors associated with filters T (1) and H (1) are X − XT 2 = 1.4 × 10−12
and X − XH 2 = 3.9 × 107 ,
where the matrices XT and XH have been constructed from the submatrices XT ij ∈ R16×32 and XHij ∈ R16×32 correspondingly, that is, XT = {XT ij } ∈
256
P.G. Howlett et al.
R256×256 and XH = {XHij } ∈ R256×256 with XT ij = T (1) (Yij ) the estimate of Xij by the filter T (1) and XHij = H (1) Yij that of Xij by the filter H (1) . The error produced by the filter H (1) is 2.7 × 1019 times greater than that of the filter T (1) . Figures 13.1(c) and (d) represent images reconstructed after filtering and compression of the noisy image in Figure 13.1(b) by the transforms H and
50
50
100
100
150
150
200
200
250
250 50
100
150
200
250
50
100
150
200
250
200
250
(b) Observed signals.
(a) Given signals.
50
50
100
100
150
150
200
200
250
250 50
100
150
200
250
50
50
50
100
100
150
150
200
100
150
(d) Reconstruction after filtering and compression by our transform with the same rank as that of the GKLT.
(c) Reconstruction after filtering and compression by the GKLT.
200
250 50
100
150
200
(e) Estimates by the filter H(1).
250
250 50
100
150
200
(f) Estimates by the filter T(1).
Fig. 13.1 Illustration of the performance of our method.
250
13
New perspectives on optimal transforms of random vectors
257
200
180
160
140
120
100
80
0
50
100
150
200
250
(a) Estimates of the 18th column in the matrix X.
250
200
150
100
50
0
50
100
150
200
250
(b) Estimates of the 244th column in the matrix X. Fig. 13.2 Typical examples of a column reconstruction in the matrix X (image “Lena”) after filtering and compression of the observed noisy image (Figure 13.1b) by transforms H (line with circles) and T 0 (solid line) of the same rank. In both subfigures, the plot of the column (solid line) virtually coincides with the plot of the estimate by the transform T 0 .
258
P.G. Howlett et al.
T 0 (which have been applied to each of the subimages Xij , Yij with the same compression ratio. Figures 13.1(e) and (f) represent estimates of the noisy image in Figure 13.1(b) by filters H (1) and T (1) , respectively. To illustrate the simulation results in a different way, we present typical examples of the plots of a column estimate in matrix X by transforms H and T 0 . Note that differences of the estimate by T 0 from the column plot are almost invisible. Table 13.1 and Figures 13.1 and 13.2 demonstrate the advantages of our technique.
13.9 Conclusion The recently discovered generalization [8] of the Karhunen–Lo`eve transform (GKLT) is the best linear transform of fixed rank. In this chapter we have proposed and justified a new nonlinear transform which possesses substantially smaller associated error than that of the GKLT of the same rank. A number of potential applications, modifications and extensions have been described. Numerical simulations demonstrate the clear advantages of our technique.
References 1. K. Abed–Meraim, W. Qiu and Y. Hua, Blind system identification, Proc. IEEE 85 (1997), 1310–1322. 2. A. Ben–Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications (John Wiley & Sons, New York, 1974). 3. J.–P. Delmas, On eigenvalue decomposition estimators of centro–symmetric covariance matrices, Signal Proc. 78 (1999), 101–116. 4. K. Fukunaga, Introduction to Statistical Pattern Recognition (Academic Press, Boston, 1990). 5. G. H Golub and C. F. Van Loan, Matrix Computations (Johns Hopkins University Press, Baltimore, 1996). 6. P. G. Howlett and A. P. Torokhti, A methodology for the constructive approximation of nonlinear operators defined on noncompact sets, Numer. Funct. Anal. Optim. 18 (1997), 343–365. 7. P. G. Howlett and A. P. Torokhti, Weak interpolation and approximation of non–linear operators on the space C([0, 1]), Numer. Funct. Anal. Optim. 19 (1998), 1025–1043. 8. Y. Hua and W. Q. Liu, Generalized Karhunen-Lo`eve transform, IEEE Signal Proc. Lett. 5 (1998), 141–142. 9. M. Jansson and P. Stoica, Forward–only and forward–backward sample covariances – a comparative study, Signal Proc. 77 (1999), 235–245. 10. E. I. Lehmann, Testing Statistical Hypotheses (John Wiley, New York, 1986). 11. I. W. Sandberg, Separation conditions and approximation of continuous–time approximately finite memory systems, IEEE Trans. Circuit Syst.: Fund. Th. Appl., 46 (1999), 820–826.
13
New perspectives on optimal transforms of random vectors
259
12. I. W. Sandberg, Time–delay polynomial networks and quality of approximation, IEEE Trans. Circuit Syst.: Fund. Th. Appl. 47 (2000), 40–49. 13. A. P. Torokhti and P. G. Howlett, On the constructive approximation of non–linear operators in the modelling of dynamical systems, J. Austral. Math. Soc. Ser. B 39 (1997), 1–27. 14. A. P. Torokhti and P. G. Howlett, An optimal filter of the second order, IEEE Trans. Signal Proc. 49 (2001), 1044–1048. 15. A. P. Torokhti and P. G. Howlett, Optimal fixed rank transform of the second degree, IEEE Trans. Circuit Syst.: Analog Digital Signal Proc. 48 (2001), 309–315. 16. A. P. Torokhti and P. G. Howlett, On the best quadratic approximation of nonlinear systems, IEEE Trans. Circuit Syst.: Fund. Th. Appl., 48 (2001), 595–602. 17. V. N. Vapnik, Estimation of Dependences Based on Empirical Data (Springer-Verlag, New York, 1982). 18. Y. Yamashita and H. Ogawa, Relative Karhunen–Lo`eve transform, IEEE Trans. Signal Proc. 44 (1996), 371–378. 19. L.–H. Zou and J. Lu, Linear associative memories with optimal rejection to colored noise, IEEE Trans. Circuit Syst.: Analog Digital Signal Proc. 44 (1997), 990–1000.
Chapter 14
Optimal capacity assignment in general queueing networks P. K. Pollett
Abstract We consider the problem of how best to assign the service capacity in a queueing network in order to minimize the expected delay under a cost constraint. We study systems with several types of customers, general service time distributions, stochastic or deterministic routing, and a variety of service regimes. For such networks there are typically no analytical formulae for the waiting-time distributions. Thus we shall approach the optimal allocation problem using an approximation technique: specifically, the residual-life approximation for the distribution of queueing times. This work generalizes results of Kleinrock, who studied networks with exponentially distributed service times. We illustrate our results with reference to data networks. Key words: Capacity approximation
assignment,
queueing
network,
residual-life
14.1 Introduction Since their inception, queueing network models have been used to study a wide variety of complex stochastic systems involving the flow and interaction of individual items: for example, “job shops,” where manufactured items are fashioned by various machines in turn [7]; the provision of spare parts for collections of machines [17]; mining operations, where coal faces are worked in turn by a number of specialized machines [12]; and delay networks, where packets of data are stored and then transmitted along the communications links that make up the network [18, 1]. For some excellent recent expositions, which describe these and other instances where queueing networks have been applied, see [2, 6] and the important text by Serfozo [16]. P. K. Pollett Department of Mathematics, University of Queensland, Queensland 4072, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 14, c Springer Science+Business Media, LLC 2009
261
262
P.K. Pollett
In each of the above-mentioned systems it is important to be able to determine how best to assign service capacity so as to optimize various performance measures, such as the expected delay or the expected number of items (customers) in the network. We shall study this problem in greater generality than has previously been considered. We allow different types of customers, general service time distributions, stochastic or deterministic routing, and a variety of service regimes. The basic model is that of Kelly [8], but we do not assume that the network has the simplifying feature of quasi-reversibility [9].
14.2 The model We shall suppose that there are J queues, labeled j = 1, 2, . . . , J. Customers enter the network from external sources according to independent Poisson streams, with type u customers arriving at rate νu (customers per second). Service times at queue j are assumed to be mutually independent, with an arbitrary distribution Fj (x) that has mean 1/μj (units of service) and variance σj2 . For simplicity we shall assume that each queue operates under the usual first-come–first-served (FCFS) discipline and that a total effort (or capacity) of φj (units per second) is assigned to queue j. We shall explain later how our results can be extended to deal with other queueing disciplines. We shall allow for two possible routing procedures: fixed routing, where there is a unique route specified for each customer type, and random alternative routing, where one of a number of possible routes is chosen at random. (We do not allow for adaptive or dynamic routing, where routing decisions are made on the basis of the observed traffic flow.) For fixed routing we define R(u) to be the (unique) ordered list of queues visited by type u customers. In particular, let R(u) = {ru (1), . . . , ru (su )}, where su is the number of queues visited by a type u customer and ru (s) is the queue visited at stage s along its route (ru (s), s = 1, 2, . . . , su , are assumed to be distinct). It is perhaps surprising that random alternative routing can be accommodated within the framework of fixed routing (see Exercise 3.1.2 of [10]). If there are several alternative routes for a given type u, then one simply provides a finer type classification for customers using these routes. We label the alternative routes as (u, i), i = 1, 2, . . . , N (u), where N (u) is the number of alternative routes for type u customers, and we replace R(u) by R(u, i) = {rui (1), . . . , rui (sui )}, for i = 1, 2, . . . , N (u), where now rui (s) is the queue visited at stage s along alternative route i and sui is the number of stages. We then replace νu by νui = νu qui , where qui is the probability that alternative route i is choN (u) sen. Clearly νu = i=1 νui , and so the effect is to thin the Poisson stream of arrivals of type u into a collection of independent Poisson streams, one for each type (u, i). We should think of customers as being identified by their type, whether this be simply u for fixed routing, or the finer classi-
14
Optimal capacity assignment in general queueing networks
263
fication (u, i) for alternative routing. For convenience, let us denote by T the set of all types, and suppose that, for each t in T , customers of type t arrive according to a Poisson stream with rate νt and traverse the route R(t) = {rt (1), . . . , rt (st )}, a collection of st distinct queues. This is the network of queues with customers of different types described in [8]. If all service times have a common exponential distribution with mean 1/μ (and hence μj = μ), the model is analytically tractable. In equilibrium the queues behave independently: indeed, as if they were isolated , each with independent Poisson arrival streams (independent among types). For example, if we let = 0 otherwise, so that the arrival αj (t, s) = νt when rt (s) = j, and α j (t, s) st αj (t, s), and the demand (in rate at queue j is given by αj = t∈T s=1 units per second) by aj = αj /μ, then, provided the system is stable (aj < φj for each j), the expected number of customers at queue j is n ¯ j = aj /(φj − aj ) ¯ j /αj = 1/(μφj − αj ); for further details, and the expected delay is W j = n see Section 3.1 of [10].
14.3 The residual-life approximation Under our assumption that service times have arbitrary distributions, the model is rendered intractable. In particular, there are no analytical formulae for the delay distributions. We shall therefore adopt one of the many approximation techniques. Consider a particular queue j and let Qj (x) be the distribution function of the queueing time, that is, the period of time a customer spends at queue j before its service begins. The residual-life approximation, developed by the author [14], provides an accurate approximation for Qj (x): ∞ Qj (x) $ Pr(nj = n)Gjn (x) , (14.1) n=0
φj x
where Gj (x) = μj 0 (1 − Fj (y)) dy and Gjn (x) denotes the n-fold convolution of Gj (x). The distribution of the number of customers nj at queue j, which appears in (14.1), is that of a corresponding quasi-reversible network [10]: specifically, a network of symmetric queues obtained by imposing a symmetry condition at each queue j. The term residual-life approximation comes from renewal theory; Gj (x) is the residual-life distribution corresponding to the (lifetime) distribution Fj (x/φj ). One immediate consequence of (14.1) is that the expected queueing time Qj is approximated by Qj $ n ¯ j (1 + μ2j σj2 )/(2μj φj ), where n ¯ j is the expected number of customers at queue j in the corresponding quasi-reversible network. Hence the expected delay at queue j is approximated as follows: Wj $
1 + μ2j σj2 1 + n ¯j . μj φj 2μj φj
264
P.K. Pollett
Under the residual-life approximation, it is only n ¯ j which changes when the service discipline is altered. In the current context, the FCFS discipline, which is assumed to be in operation everywhere in the network, is replaced by a preemptive-resume last-come–first-served discipline, giving n ¯ j = aj /(φj − aj ) with aj = αj /μj , for each j, and hence 1 + μ2j σj2 1 Wj $ + μj φj 2μj φj
+
αj μj φj − αj
, .
(14.2)
Simulation results presented in [14] justify the approximation by assessing its accuracy under a variety of conditions. Even for relatively small networks with generous mixing of traffic, it is accurate, and the accuracy improves as the size and complexity of the network increases. (The approximation is very accurate in the tails of the queueing time distributions and so it allows an accurate prediction to be made of the likelihood of extreme queueing times.) For moderately large networks the approximation becomes worse as the coefficient of variation μj σj of the service-time distribution at queue j deviates markedly from 1, the value obtained in the exponential case.
14.4 Optimal allocation of effort We now turn our attention to the problem of how best to apportion resources so that the expected network delay, or equivalently (by Little’s theorem) the expected number of customers in the network, is minimized. We shall suppose that there is some overall network budget F (dollars) which cannot be exceeded, and that the cost of operating queue j is a function fj of its capacity. Suppose that the cost of operating queue j is proportional to φj , that is, fj (φj ) = fj φj (the units of fj are dollars per unit of capacity, or dollar–seconds per unit of service). Thus we should choose the capacities subject to the cost constraint J
fj φj = F .
(14.3)
j=1
We shall suppose that the average delay of customers at queue j is adequately approximated by (14.2). Using Little’s theorem, we obtain an approximate expression for the mean number m ¯ of customers in the network. This is & % J J αj (1 + μ2j σj2 ) 1 aj (1 + cj ) 1 = , αj + aj + m ¯ $ μj φj 2μj φj (μj φj − αj ) φj 2φj (φj − aj ) j=1 j=1
14
Optimal capacity assignment in general queueing networks
265
where cj = μ2j σj2 is the squared coefficient of variation of the service time ¯ over φ1 , . . . , φJ subject to (14.3). distribution Fj (x). We seek to minimize m To this end, we introduce a Lagrange multiplier 1/λ2 ; our problem then becomes one of minimizing ⎛ ⎞ J 1 ¯ + 2⎝ fj φj − F ⎠ . L(φ1 , . . . , φJ ; λ−2 ) = m λ j=1
Setting ∂L/∂φj = 0 for fixed j yields a quartic polynomial equation in φj : 2fj φ4j − 4aj fj φ3j + 2aj (aj fj − λ2 )φ2j − 2j a2j λ2 φj + j a3j λ2 = 0 , where j = cj − 1, and our immediate task is to find solutions such that φj > aj (recall that this latter condition is required for stability). The task is simplified by observing that the transformation φj fj /F → φj , aj fj /F → aj , λ2 /F → λ2 , reduces the problem to one with unit costs fj = F = 1, whence the above polynomial equation becomes 2φ4j − 4aj φ3j + 2aj (aj − λ2 )φ2j − 2j a2j λ2 φj + j a3j λ2 = 0 ,
(14.4)
and the constraint becomes φ1 + φ2 + · · · + φJ = 1 .
(14.5)
It is easy to verify that, if service times are exponentially distributed (j = 0 for each j), there is a unique solution to (14.4) on (aj , ∞), given by φj = √ aj +|λ| aj . Upon application of the constraint (14.5) we arrive at the optimal J J √ √ capacity assignment φj = aj + aj (1− k=1 ak )/( k=1 ak ), for unit costs. In the case of general costs this becomes J fj aj 1 F− fk ak J √ , φj = aj + fj k=1 fk ak k=1
after applying the transformation. This is a result obtained by Kleinrock [11] (see also [10]): the allocation proceeds by first assigning enough capacity to meet the demand aj , at each queue j, and then allocating a proportion of the J affordable excess capacity, (F − k=1 fk ak )/fj (that which could be afforded to queue j), in proportion to the square root of the cost fj aj of meeting that demand. In the case where some or all of the j , j = 1, 2, . . . , J, deviate from zero, (14.4) is difficult to solve analytically. We shall adopt a perturbation technique, assuming that the Lagrange multiplier and the optimal allocation take the following forms:
266
P.K. Pollett
λ = λ0 +
J
λ1k k + O(2 ),
(14.6)
k=1
φj = φ0j +
J
φ1jk k + O(2 ) ,
j = 1, . . . , J,
(14.7)
k=1
where O(2 ) denotes terms of order i k . The zero-th order terms come from √ Kleinrock’s solution: specifically, φ0j = aj + λ0 aj , j = 1, . . . , J, where λ0 = J J √ (1 − k=1 ak )/( k=1 ak ). On substituting (14.6) and (14.7) into (14.4) we obtain an expression for φ1jk in terms of λ1k , which in turn is calculated using the constraint (14.5) and by setting k = δkj (the Kronecker delta). We find that the optimal allocation, to first order, is √ √ aj aj √ bj j , (14.8) φj = aj + λ0 aj − J √ bk k + 1 − J √ ak k=1 ak k =j k=1 √ √ 3/2 where bk = 14 λ0 ak (ak + 2λ0 ak )/(ak + λ0 ak )2 . For most practical applications, higher-order solutions are required. To achieve this we can simplify matters by using a single perturbation = max1≤j≤J |j |. For each j we define a quantity βj = j / and write φj and λ as power series in : λ=
∞ n=0
λn n ,
φj =
∞
φnj n ,
j = 1, . . . , J.
(14.9)
n=0
Substituting as before into (14.4), and using (14.5), gives rise to an iterative scheme, details of which can be found in [13]. The first-order approximation is useful, nonetheless, in dealing with networks whose service-time distributions are all ‘close’ to exponential in the sense that their coefficients of variation do not differ significantly from 1. It is also useful in providing some insight into how the allocation varies as j , for fixed j, varies. Let φi , i = 1, 2, . . . , J, be the new optimal allocation obtained after incrementing j by a small quantity δ > 0. We find that to first order in δ √ a j bj δ > 0, φj − φj = 1 − J √ ak k=1 √ ai φi − φi = − J √ (φj − φj ) < 0 , k=1 ak
i = j.
Thus, if the coefficient of variation of the service-time distribution at a given queue j is increased (respectively decreased) by a small quantity δ, then there is an increase (respectively decrease) in the optimal allocation at queue j which is proportional to δ. All other queues experience a complementary decrease (respectively increase) in their allocations and the resulting deficit is reallocated in proportion to the square root of the demand.
14
Optimal capacity assignment in general queueing networks
267
In [13] empirical estimates were obtained for the radii of convergence of the power series (14.9) for the optimal allocation. In all cases considered there, the closest pole to the origin was on the negative real axis outside the physical limits for i , which are of course −1 ≤ j < ∞. The perturbation technique is therefore useful for networks whose service-time distributions are, for example, Erlang (gamma) (−1 < j < 0) or mixtures of exponential distributions (0 < j < ∞) with not too large a coefficient of variation.
14.5 Extensions So far we have assumed that the capacity does not depend on the state of the queue (as a consequence of the FCFS discipline) and that the cost of operating a queue is a linear function of its capacity. Let us briefly consider some other possibilities. Let φj (n) be the effort assigned to queue j when there are n customers present. If, for example, φj (n) = nφj /(n + η − 1), where η is a positive constant, the zero-th order allocation, optimal under (14.3), is precisely the same as before (the case η = 1). For values of η greater than 1 the capacity increases as the number of customers at queue j increases and levels off at a constant value φj as the number becomes large. If we allow η to depend on j we get a similar allocation but with the factor fj aj fj ηj aj replaced by J √ J √ fk ak k=1 k=1 fk ηk ak (see Exercise 4.1.6 of [10]). The higher-order analysis is very nearly the same as before. The factor 1 + cj is replaced by ηj (1 + cj ); for the sake of brevity, we shall omit the details. As another example, suppose that the capacity function is linear, that is, φj (n) = φj n, and that service times are exponentially distributed. In this case, the total number of customers in the system has a Poisson distribuJ tion with mean j=1 (aj /φj ) and it is elementary to show that the optimal allocation subject to (14.3) is given by fj aj F, j = 1, . . . , J. φj = J √ fj k=1 fk ak It is interesting to note that we get a proportional allocation, φj /φk = aj /ak , J in this case if (14.3) is replaced by j=1 log φj = 1 (see Exercise 4.1.7 of [10]). More generally, we might use the constraint J j=1
fj log(gj φj ) = F
268
P.K. Pollett
to account for ‘decreasing costs’: costs become less with each increase in capacity. Under this constraint, the optimal allocation is φj = λaj /fj , where log λ =
F−
J k=1
= fk log(gk ak /fk )
J
fk
.
k=1
14.6 Data networks One of the most interesting and useful applications of queueing networks is in the area of telecommunications, where they are used to model (among other things) data networks. In contrast to circuit-switched networks (see for example [15]), where one or more circuits are held simultaneously on several links connecting a source and destination node, only one link is used at any time by a given transmission in a data network (message- or packet-switched network); a transmission is received in its entirety at a given node before being transmitted along the next link in its path through the network. If the link is at full capacity, packets are stored in a buffer until the link becomes available for use. Thus the network can be modeled as a queueing network: the queues are the communications links and the customers are the messages. The most important measure of performance of a data network is the total delay, the time it takes for a message to reach its destination. Using the results presented above, we can optimally assign the link capacities (service rates) in order to minimize the expected total delay. We shall first explain in detail how the data network can be described by a queueing network. Suppose that there are N switching nodes, labeled n = 1, 2, . . . , N , and J communications links, labeled j = 1, 2, . . . , J. We assume that all the links are perfectly reliable and not subject to noise, so that transmission times are determined by message length. We shall also suppose that the time taken to switch, buffer, and (if necessary) re-assemble and acknowledge, is negligible compared with the transmission times. Each message is therefore assumed to have the same transmission time on all links visited. Transmission times are assumed to be mutually independent with a common (arbitrary) distribution having mean 1/μ (bits, say) and variance σ 2 . Traffic entering the network from external sources is assumed to be Poisson and that which originates from node m and is destined for node n is offered at rate νmn ; the origin– destination pair determines the message type. We shall assume that each link operates under a FCFS discipline and that a total capacity of φj (bits per second) is assigned to link j. In order to apply the above results, we shall need to make a further assumption. It is similar to the celebrated independence assumption of Kleinrock [11]. As remarked earlier, each message has the same transmission time on all links visited. However, numerous simulation results (see for example [11]) suggest
14
Optimal capacity assignment in general queueing networks
269
that, even so, the network behaves as if successive transmission times at any given link are independent. We shall therefore suppose that transmission times at any given link are independent and that transmission times at different links are independent. This phenomenon can be explained by observing that the arrival process at a given link is the result of the superposition of a generally large number of streams, which are themselves the result of thinning the output from other links. The approximation can therefore be justified on the basis of limit theorems concerning the thinning and superposition of marked point processes; see [3, 4, 5], and the references therein. Kleinrock’s assumption differs from ours only in that he assumes the transmission-time distribution at a given link j is exponential with common mean 1/μ, a natural consequence of the usual teletraffic modeling assumption that messages emanating from outside the network are independent and identically distributed exponential random variables. However, although the exponential assumption is usually valid in circuit-switched networks, we should not expect it to be appropriate in the current context of message/packet switching, since packets are of similar length. Thus it is more realistic to assume, as we do here, that message lengths have an arbitrary distribution. For each origin–destination (ordered) pair (m, n), let R(m, n) = {rmn (1), rmn (2), . . . , rmn (smn )} be the ordered sequence of links used by messages on that route; smn is the number of links and rmn (s) is the link used at stage s. Let αj (m, n, s) = νmn if rmn (s) = otherwise, so that the arrival rate at link j is given by j, and0smn αj (m, n, s), and the demand (in bits per second) by αj = m n =m s=1 aj = αj /μ. Assume that the system is stable (αj < μφj for each j). The optimal capacity allocation (φj , j = 1, 2, . . . , J) can now be obtained using the results of Section allocation of capacity 14.4. For unit costs, the optimal √ (constrained by j φj = 1) satisfies μφj = αj + λ αj , j = 1, . . . , J, where J √ J λ = (μ − k=1 αk )/( k=1 αk ), in the case of exponential transmission times. More generally, in the case where the transmission times have an arbitrary distribution with mean 1/μ and variance σ 2 , the optimal allocation satisfies (to first order in ) √ J αj √ (14.10) ck , μφj = αj + λ αj + cj − J √ αk k=1 k=1 √ √ 3/2 where ck = 14 λαk (αk + 2λ αk )/(αk + λ αk )2 and = μ2 σ 2 − 1. To illustrate this, consider a symmetric star network , in which a collection of identical outer nodes communicate via a single central node. Suppose that there are J outer nodes and thus J communications links. The corresponding queueing network, where the nodes represent the communications links, is a fully connected symmetric network. Clearly there are J(J −1) routes, a typical one being R(m, n) = {m, n}, where m = n. Suppose that transmission times
270
P.K. Pollett
have a common mean 1/μ and variance σ 2 (for simplicity, set μ = 1), and, to begin with, suppose that transmission times are exponentially distributed and that all traffic is offered at the same rate ν. Clearly the optimal allocation will be φj = 1/J, owing to the symmetry of the network. What happens to the optimal allocation if we alter the traffic offered on one particular route by a small quantity? Suppose that we alter ν12 by setting ν12 = ν + e. The arrival rates at links 1 and 2 will then be altered by the same amount e. Since μ = 1 we will have a1 = a2 = ν + e and aj = ν for j = 3, . . . , J. The optimal allocation is easy to evaluate. We find that, for j = 1, 2, √ 1 1 (Jν + 1) (1 − Jν − 2e) ν + e √ = + (J − 2) e + O(e2 ), φj = ν + e + √ J 2 J 2ν (J − 2) ν + 2 ν + e and for j = 3, . . . , J, √ 1 Jν + 1 (1 − Jν − 2e) ν √ = − e + O(e2 ). φj = ν + √ J J 2ν (J − 2) ν + 2 ν + e Thus, to first order in e, there is an O(1/J) decrease in the capacity at all links in the network, except at links 1 and 2, where there is an O(1) increase in capacity. When the transmission times are not exponentially distributed, similar results can be obtained. For example, suppose that the transmission times have a distribution whose squared coefficient of variation is 2 (such as a mixture of exponential distributions). Then it can be shown that the optimal allocation is given for j = 1, 2 by φj =
1 (J 2 ν 2 − Jν + 2)(J 2 ν 2 − 2Jν − 1) 1 + e + O(e2 ) J 2 J 2ν
and for 3 ≤ j ≤ J by φj =
(J − 2)(J 2 ν 2 − Jν + 2)(J 2 ν 2 − 2Jν − 1) 1 − e + O(e2 ). J 4J 2 ν
Thus, to first order in e, there is an O(J 3 ) decrease in the capacity at all links in the network, except at links 1 and 2, where there is an O(J 2 ) increase in capacity. Indeed, the latter is true whenever the squared coefficient of variation c is not equal to 1, for it is easily checked that φj = 1/J + gJ (c)e + O(e2 ), j = 1, 2, and φj = 1/J − (J/2 − 1)gJ (c)e + O(e2 ), j = 3, . . . , J, where gJ (c) =
Jν(Jν − 1)3 c − (J 4 ν 4 − 3J 3 ν 3 + 3J 2 ν 2 + Jν + 2) . 2J 2 ν
Clearly gJ (c) is O(J 2 ). It is also an increasing function of c, and so this accords with our previous general results on varying the coefficient of variation of the service-time distribution.
14
Optimal capacity assignment in general queueing networks
271
14.7 Conclusions We have considered the problem of how best to assign service capacity in a queueing network so as to minimize the expected number of customers in the network subject to a cost constraint. We have allowed for different types of customers, general service-time distributions, stochastic or deterministic routing, and a variety of service regimes. Using an accurate approximation for the distribution of queueing times, we derived an explicit expression for the optimal allocation to first order in the squared coefficient of variation of the service-time distribution. This can easily be extended to arbitrary order in a straightforward way using a standard perturbation expansion. We have illustrated our results with reference to data networks, giving particular attention to the symmetric star network. In this context we considered how best to assign the link capacities in order to minimize the expected total delay of messages in the system. We studied the effect on the optimal allocation of varying the offered traffic and the distribution of transmission times. We showed that for the symmetric star network, the effect of varying the offered traffic is far greater in cases where the distribution of transmission times deviates from exponential, and that more allocation is needed at nodes where the variation in the transmission times is greatest. Acknowledgments I am grateful to Tony Roberts for suggesting that I adopt the perturbation approach described in Section 14.4. I am also grateful to Erhan Kozan for helpful comments on an earlier draft of this chapter and to the three referees, whose comments and suggestions did much to improve the presentation of my results. The support of the Australian Research Council is gratefully acknowledged.
References 1. H. Akimaru and K. Kawashima, Teletraffic: Theory and Applications, 2nd edition (Springer-Verlag, London, 1999). 2. G. Bloch, S. Greiner, H. de Meer and K. Trivedi, Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications (Wiley, New York, 1998). 3. T. Brown, Some Distributional Approximations for Random Measures, PhD thesis, University of Cambridge, 1979. 4. T. Brown and P. Pollett, Some distributional approximations in Markovian networks, Adv. Appl. Probab. 14 (1982), 654–671. 5. T. Brown and P. Pollett, Poisson approximations for telecommunications networks, J. Austral. Math. Soc., 32 (1991), 348–364. 6. X. Chao, M. Miyazawa and M. Pinedo, Queueing Networks: Customers, Signals and Product Form Solutions (Wiley, New York, 1999). 7. J. Jackson, Jobshop-like queueing systems, Mgmt. Sci. 10 (1963), 131–142. 8. F. Kelly, Networks of queues with customers of different types, J. Appl. Probab. 12 (1975), 542–554. 9. F. Kelly, Networks of queues, Adv. Appl. Probab. 8 (1976), 416–432.
272
P.K. Pollett
10. 11. 12. 13.
F. Kelly, Reversibility and Stochastic Networks (Wiley, Chichester, 1979). L. Kleinrock, Communication Nets (McGraw-Hill, New York, 1964). E. Koenigsberg, Cyclic queues, Operat. Res. Quart. 9 (1958), 22–35. P. Pollett, Distributional Approximations for Networks of Queues, PhD thesis, University of Cambridge, 1982. P. Pollett, Residual life approximations in general queueing networks, Elektron. Informationsverarb. Kybernet. 20 (1984), 41–54. K. Ross, Multiservice Loss Models for Broadband Telecommunication Networks (Springer-Verlag, London, 1995). R. Serfozo, Introduction to Stochastic Networks (Springer-Verlag, New York, 1999). J. Taylor and R. Jackson, An application of the birth and death process to the provision of spare machines, Operat. Res. Quart. 5 (1954), 95–108. W. Turin, Digital Transmission Systems: Performance Analysis and Modelling, 2nd edition (McGraw-Hill, New York, 1998).
14. 15. 16. 17. 18.
Chapter 15
Analysis of a simple control policy for stormwater management in two connected dams Julia Piantadosi and Phil Howlett
Abstract We will consider the management of stormwater storage in a system of two connected dams. It is assumed that we have stochastic input of stormwater to the first dam and that there is regular demand from the second dam. We wish to choose a control policy from a simple class of control policies that releases an optimal flow of water from the first dam to the second dam. The cost of each policy is determined by the expected volume of water lost through overflow. Key words: Stormwater management, storage dams, eigenvalues, steadystate probabilities
15.1 Introduction We will analyze the management of stormwater storage in interconnected dams. Classic works by Moran [4, 5, 6] and Yeo [7, 8] have considered a single storage system with independent and identically distributed inputs, occurring as a Poisson process. Simple rules were used to determine the instantaneous release rates and the expected average behavior. These models provide a useful background for our analysis of more complicated systems with a sequence of interdependent storage systems. In this chapter we have developed a discrete-time, discrete-state Markov chain model that consists of two connected dams. It is assumed that the input of stormwater into the first dam is stochastic and that there is Julia Piantadosi C.I.A.M., University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected] Phil Howlett C.I.A.M., University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 15, c Springer Science+Business Media, LLC 2009
273
274
J. Piantadosi and P. Howlett
regular release from the second dam that reflects the known demand for water. We wish to find a control policy that releases an optimal flow of water from the first dam to the second dam. In the first instance we have restricted our attention to a very simple class of control policies. To calculate the cost of a particular policy it is necessary to find an invariant measure. This measure is found as the eigenvector of a large transposed transition matrix. A key finding is that for our simple class of control policies the eigenvector of the large matrix can be found from the corresponding eigenvector of a small block matrix. The cost of a particular policy will depend on the expected volume of water that is wasted and on the pumping costs. An appropriate cost function will assist in determining an optimal pumping policy for our system. This work will be used to analyze water cycle management in a suburban housing development at Mawson Lakes in South Australia. The intention is to capture and treat all stormwater entering the estate. The reclaimed water will be supplied to all residential and commercial sites for watering of parks and gardens and other non-potable usage. Since this is a preliminary investigation we have been mainly concerned with the calculation of steady-state solutions for different levels of control in a class of practical management policies. The cost of each policy is determined by the expected volume of water lost through overflow. We have ignored pumping costs. A numerical example is used to illustrate the theoretical solution presented in the chapter. For underlying methodology see [1, 2].
15.2 A discrete-state model 15.2.1 Problem description Consider a system with two connected dams, D1 and D2 , each of finite capacity. The content of the first dam is denoted by Z1 ∈ {0, 1, . . . , h} and the content of the second dam by Z2 ∈ {0, 1, . . . , k}. We assume a stochastic supply of untreated stormwater to the first dam and a regular demand for treated stormwater from the second dam. The system is controlled by pumping water from the first dam into the second dam. The input to the first dam is denoted by X1 and the input to the second dam by X2 . We have formulated a discrete-state model in which the state of the system, at time t, is an ordered pair (Z1,t , Z2,t ) specifying the content of the two dams before pumping. We will consider a class of simple control policies. If the content of the first dam is greater than or equal to a specified level U1 = m, then we will pump precisely m units of water from the first dam to the second dam. If the content of the first dam is below this level we do not pump any water into the second dam. The parameter m is the control parameter for the class of policies we wish to study. We assume a constant demand for treated water from the second dam and pump a constant volume U2 = 1 unit from the second dam provided the dam is not empty. The units of measurement are chosen to be the daily level of demand.
15
Control policy for stormwater management in two connected dams
275
15.2.2 The transition matrix for a specific control policy We need to begin by describing the transitions. We consider the following cases: • for the state (z1 , 0) where z1 < m we do not pump from either dam. If n units of stormwater enter the first dam then (z1 , 0) → (min([z1 + n], h), 0); • for the state (z1 , z2 ) where z1 < m and 0 < z2 we do not pump water from the first dam but we do pump from the second dam. If n units of stormwater enter the first dam then (z1 , z2 ) → (min([z1 + n], h), z2 − 1); • for the state (z1 , 0) where z1 ≥ m we pump m units from the first dam into the second dam. If n units of stormwater enter the system then (z1 , 0) → (min([z1 − m + n], h), min(m, k));
and
• for the state (z1 , z2 ) where z1 ≥ m and 0 < z2 we pump m units from the first dam into the second dam and pump one unit from the second dam to meet the regular demand. If n units of stormwater enter the system then (z1 , z2 ) → (min([z1 − m + n], h), min(z2 + m − 1, k)). If we order the states (z1 , z2 ) by the rules that (z1 , z2 ) ≺ (ζ1 , ζ2 ) if z2 < ζ2 and (z1 , z2 ) ≺ (ζ1 , z2 ) if z1 < ζ1 then the transition matrix can be written in the form ⎡
A 0 ··· 0 B 0 ··· 0 0 0
⎢ ⎢A ⎢ ⎢0 ⎢ ⎢ ⎢. ⎢ .. ⎢ ⎢ ⎢0 H(A, B) = ⎢ ⎢ ⎢0 ⎢ ⎢0 ⎢ ⎢ ⎢ .. ⎢. ⎢ ⎢ ⎣0 0
⎤
⎥ 0 ··· 0 B 0 ··· 0 0 0 ⎥ ⎥ A ··· 0 0 B ··· 0 0 0 ⎥ ⎥ ⎥ .. .. .. .. .. .. .. ⎥ . ··· . . . ··· . . . ⎥ ⎥ ⎥ 0 ··· A 0 0 ··· 0 B 0 ⎥ ⎥. ⎥ 0 ··· 0 A 0 ··· 0 0 B⎥ ⎥ 0 ··· 0 0 A ··· 0 0 B⎥ ⎥ ⎥ .. .. .. ⎥ .. .. .. .. . ··· . . . ··· . . . ⎥ ⎥ ⎥ 0 ··· 0 0 0 ··· A 0 B⎦ 0 ··· 0 0 0 ··· 0 A B
276
J. Piantadosi and P. Howlett
The block matrices A = [ai,j ] and B = [bi,j ] for i, j ∈ {0, 1, . . . , h} are defined by ⎧ 0 for 1 ≤ i ≤ m − 1, j < i ⎪ ⎪ ⎪ ⎪ ⎪ pj for i = 0, 0 ≤ j ≤ h − 1 ⎪ ⎨ ai,j = pj−i for 1 ≤ i ≤ m − 1, 1 ≤ j ≤ h − 1 and i ≤ j ⎪ ⎪ ⎪ ph−i + for j = h, 0 ≤ i ≤ m − 1 ⎪ ⎪ ⎪ ⎩ 0 for m ≤ i ≤ h, 0 ≤ j ≤ h and
bi,j
⎧ 0 ⎪ ⎪ ⎪ ⎪ ⎪ p ⎪ ⎨ j = pj−i+m ⎪ ⎪ ⎪ ph−i+m + ⎪ ⎪ ⎪ ⎩ 0
for 0 ≤ i ≤ m − 1, 0 ≤ j ≤ h for i = m, 0 ≤ j ≤ h − 1 for m + 1 ≤ i ≤ h, 0 ≤ j ≤ h − 1 and i − m ≤ j for j = h, m ≤ i ≤ h for m + 1 ≤ i ≤ h, j < i − m,
that r units of stormwater will flow into the first where pr is the probability ∞ dam, and ps + = r=s pr . Note that A, B ∈ IR(h+1)×(h+1) and H ∈ IRn×n , where n = (h + 1)(k + 1).
15.2.3 Calculating the steady state when 1 < m < k We suppose that the level of control m is fixed and write Hm = H ∈ IRn×n . The steady state x[m] = x ∈ IRn is the vector of state probabilities determined by the non-negative eigenvector of the transposed transition matrix K = H T corresponding to the unit eigenvalue. Thus we find x by solving the equation Kx = x subject to the conditions
x≥0
and
1T x = 1.
(15.1)
If we define C = AT and D = B T then the matrix K can be written in block form as K = F (C) + Gm (D) = F + Gm , where 0 1 . F = .. k−1 k
⎡
C ⎢0 ⎢ ⎢ ⎢ .. ⎢ . ⎢ ⎢0 ⎣ 0
C 0 .. . 0 0
0 C .. . 0 0
⎤ 0 0⎥ ⎥ ⎥ .. ⎥ . ⎥ ⎥ ··· C ⎥ ⎦ ··· 0 ··· ··· .. .
15
Control policy for stormwater management in two connected dams
and
Gm
0 .. . m−1 = m m+1 .. . k
⎡
0 ⎢ . ⎢ .. ⎢ ⎢ ⎢0 ⎢ ⎢D ⎢ ⎢ ⎢0 ⎢ ⎢ . ⎢ .. ⎣ 0
0 .. . 0 D 0 .. . 0
0 .. . 0 0 D .. . 0
··· 0 . · · · .. ··· 0 ··· 0 ··· 0 . . .. . . ··· D
277
⎤ ··· 0 . ⎥ · · · .. ⎥ ⎥ ⎥ ··· 0 ⎥ ⎥ ··· 0 ⎥ ⎥. ⎥ ··· 0 ⎥ ⎥ .. ⎥ ··· . ⎥ ⎦ ··· D
Therefore Equation (15.1) can be rewritten as [F + Gm ]x = x and by substituting y = [I − F ]x and rearranging, this becomes Gm [I − F ]−1 y = y.
(15.2)
To solve this equation we make some preliminary calculations. We will show later that the inverse matrices used in the sequel are well defined. From the Neumann expansion (I − F )−1 = I + F + F 2 + · · · , we deduce that ⎡
(I − F )−1
P P C P C 2 · · · P C k−1 P C k
⎢0 I ⎢ ⎢ ⎢0 0 ⎢ =⎢ . . ⎢ .. .. ⎢ ⎢ ⎣0 0 0 0
⎤
C · · · C k−2 C k−1 ⎥ ⎥ ⎥ I · · · C k−3 C k−2 ⎥ ⎥ , .. . . .. .. ⎥ . . . . ⎥ ⎥ ⎥ 0 ··· I C ⎦ 0
···
0
I
where we have written P = (I − C)−1 . It follows that 0 0 , Gm (I − F )−1 = RS where S = [S0 , S1 , . . . , Sk−m ] is a block matrix with columns consisting of k − m + 1 blocks given by
278
J. Piantadosi and P. Howlett
⎤ DP C m−1 ⎢ DC m−2 ⎥ ⎥ ⎢ ⎥ ⎢ .. ⎥ ⎢ ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ D ⎥ ⎢ S0 = ⎢ ⎥, ⎥ ⎢ 0 ⎥ ⎢ ⎥ ⎢ . ⎥ ⎢ .. ⎥ ⎢ ⎥ ⎢ ⎦ ⎣ 0 ⎡
0 ⎡
· · · Sk−2m+1
DP C k−m
⎤ DP C m ⎢ DC m−1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ ⎢ DC ⎥ ⎥,··· ⎢ S1 = ⎢ ⎥ ⎢ D ⎥ ⎥ ⎢ .. ⎥ ⎢ ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ 0 ⎦ ⎣ 0 ⎡
⎤
⎢ DC k−m−1 ⎥ ⎥ ⎢ ⎥ ⎢ .. ⎥ ⎢ ⎥ ⎢ . ⎥ ⎢ ⎢ DC k−2m+1 ⎥ ⎥ ⎢ =⎢ ⎥, ⎢ DC k−2m ⎥ ⎥ ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ ⎦ ⎣ DC
⎡
Sk−2m+2
DP C k−m+1
⎢ DC k−m ⎢ ⎢ .. ⎢ ⎢ . ⎢ ⎢ DC k−2m+2 ⎢ =⎢ ⎢ DC k−2m+1 ⎢ ⎢ .. ⎢ . ⎢ ⎢ ⎣ DC 2
D and finally
⎡
Sk−m
⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ =⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥,··· ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
D(I + C) DP C k−1 DC k−2 .. . DC k−m DC k−m−1 .. . DC m D(I + C + · · · + C m−1 )
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥. ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
By writing the matrix equation (15.2) in partitioned form % &% & % & I 0 u 0 = −R (I − S) v 0 it can be seen that u = 0 and that our original problem has reduced to solving the matrix equation (I − S)v = 0.
(15.3)
15
Control policy for stormwater management in two connected dams
279
Thus we must find the eigenvector for S corresponding to the unit eigenvalue. To properly describe the elimination process we need to establish some suitable notation. We will write S = [Si,j ] where i, j ∈ {0, 1, . . . , (k − m)} and ⎧ DP C m−1+j ⎪ ⎪ ⎪ ⎪ DC m−1−i+j ⎪ ⎪ ⎪ ⎪ ⎨ Si,j =
for i = 0 for 1 ≤ i ≤ k − m − 1 and i − m + 1 ≤ j
⎪ D(I + C + · · · + C j−k+2m−1 ) for i = k − m ⎪ ⎪ ⎪ ⎪ ⎪ and k − 2m + 1 ≤ j ⎪ ⎪ ⎩ 0 for m + j ≤ i.
We note that ⎡
⎤
S0,j S1,j .. .
⎢ ⎢ Sj = ⎢ ⎢ ⎣
⎥ ⎥ ⎥ ⎥ ⎦
Sk−m,j and that 1T Sj = 1T for each j = 0, 1, . . . , k − m. Hence S is a stochastic matrix. One of our key findings is that we can use Gaussian elimination to further reduce the problem from one of finding an eigenvector for the large matrix S ∈ IR(h+1)(k−m+1)×(h+1)(k−m+1) to one of finding the corresponding eigenvector for a small block matrix in IR(h+1)×(h+1) .
15.2.4 Calculating the steady state for m = 1 For the special case when m = 1 we have the block matrix G1 with the following structure: ⎡ ⎤ 0 0 0 0 ··· 0 ⎥ 1⎢ ⎢D D 0 ··· 0 ⎥ ⎢ ⎥ ⎢ ⎥ G1 = 2 ⎢ 0 0 D · · · 0 ⎥ . ⎥ .. ⎢ . . . . . . . .. . ⎥ . ⎢ ⎣ . . . . . ⎦ k 0 0 0 ··· D In this case we have
% −1
G1 (I − F )
=
& 0 0 , RS
280
J. Piantadosi and P. Howlett
where S ∈ IR(h+1)k×(h+1)k is given by ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ S=⎢ ⎢ ⎢ ⎢ ⎣
DP DP C DP C 2 · · · DP C k−2 DP C k−1 0
D
0 .. . 0 0
0 .. . 0 0
DC · · · DC k−3 · · · DC k−4 .. .. . .
D .. . 0 0
··· ···
D 0
⎤
DC k−2 ⎥ ⎥ ⎥ DC k−3 ⎥ ⎥ ⎥. .. ⎥ . ⎥ ⎥ DC ⎦ D
We now wish to solve (I − S)v = 0.
15.2.5 Calculating the steady state for m = k For the case when m = k we have ⎡ 0 0 ⎢ . . ⎢ .. .. ⎢ ⎢ 0 0 Gk = ⎢ ⎢ ⎢ . . ⎢ .. .. ⎣
0 .. . 0 .. .
··· 0 . · · · .. ··· 0 . · · · ..
··· 0 . · · · .. ··· 0 . · · · ..
⎤ ⎥ ⎥ ⎥ ⎥ ⎥. ⎥ ⎥ ⎥ ⎦
D D D ··· D ··· D %
Therefore −1
Gk (I − F )
=
& 0 0 , RS
where S = DP ∈ IR(h+1)×(h+1) , and so we wish to solve (I − DP )v = 0.
15.3 Solution of the matrix eigenvalue problem using Gaussian elimination for 1 < m < k We wish to find the eigenvector corresponding to the unit eigenvalue for the matrix S. We use Gaussian elimination in a block matrix format. During the elimination we will make repeated use of the following elementary formulae. Lemma 1. If W = (I − V )−1 then W = (I + W V ) and W V r = V r W for all non-negative integers r.
15
Control policy for stormwater management in two connected dams
281
15.3.1 Stage 0 Before beginning the elimination we write T (0) = [I −S](0) = I −S (0) = I −S. We consider a sub-matrix M from the matrix T (0) consisting of the (0, 0), (q, 0), (0, s) and (q, s)th elements where 1 ≤ q ≤ m − 1 and 1 ≤ s. We have % & −DP C m−1+s I − DP C m−1 M= . −DC m−1−q Iδq,s − DC m−1−q+s If we write W0 = [I − DP C m−1 ]−1 then the standard elimination gives % M →
&
−W0 DP C m−1+s
0 Iδq,s − DC m−1−q+s − DC m−1−q W0 DP C m−1+s %
→
I
&
−DP C m−1 W0 C s
0 Iδq,s − DC m−1−q [I + W0 DP C m−1 ]C s %
→
I
I
&
−DP C m−1 (W0 C)C s−1
0 Iδq,s − DC m−1−q (W0 C)C s−1
.
After stage 0 of the elimination we have a new matrix T (1) = I − S (1) where
(1)
Si,j
⎧ 0 ⎪ ⎪ ⎪ ⎪ ⎪ DP C m−1 (W0 C)C j−1 ⎪ ⎪ ⎪ ⎪ ⎪ DC m−1−i (W0 C)C j−1 ⎪ ⎪ ⎪ ⎨ DC m−1−i+j = ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ D(I + C + · · · + C j−k+2m−1 ) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 0
for j = 0 for i = 0, 1 ≤ j for 1 ≤ i ≤ m − 1, 1 ≤ j for m ≤ i ≤ k − m − 1 and i − m + 1 ≤ j for i = k − m and k − 2m + 1 ≤ j for m + j ≤ i.
Note that column 0 is reduced to a zero column and that row 0 is fixed for all subsequent stages. We therefore modify T (1) by dropping both column and row 0.
15.3.2 The general rules for stages 2 to m − 2 After stage p−1 of the elimination, for 1 ≤ p ≤ m−2, we have T (p) = I −S (p) where
282
J. Piantadosi and P. Howlett
(p)
Si,j
⎧ 0 ⎪ ⎪ p−1 ⎪ ⎪ ⎪ DC m−p t=0 (Wt C)C j−p ⎪ ⎪ p−1 ⎪ ⎪ DC m−1−i t=0 (Wt C)C j−p ⎪ ⎪ ⎨ p−1 D t=i−m+1 (Wt C)C j−p = ⎪ DC m−1−i+j ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ j−k+2m−1 t ⎪ ⎪ C D t=0 ⎪ ⎪ ⎩0
for for for for for and for for
j =p−1 i = p − 1, p ≤ j p ≤ i ≤ m − 1, p ≤ j m ≤ i ≤ m + p − 2, p ≤ j m+p−1≤i≤k−m−1 i−m+1≤j i = k − m, k − 2m + 1 ≤ j m + j ≤ i.
Column p − 1 is reduced to a zero column and row p − 1 is fixed for all subsequent stages. We modify T (p) = (I − S (p) ) by dropping both column and row p − 1 and consider a sub-matrix M consisting of the (p, p), (q, p), (r, p), (m + p − 1, p), (p, s), (q, s), (r, s) and (m + p − 1, s)th elements, where p + 1 ≤ q ≤ m − 1, m ≤ r ≤ m + p − 2 and p + 1 ≤ s. The sub-matrix M is given by ⎤ ⎡ p−1 p−1 −DC m−1−p t=0 (Wt C)C s−p I − DC m−1−p t=0 (Wt C) ⎥ ⎢ p−1 ⎢ − DC m−1−q p−1 (Wt C) Iδq,s − DC m−1−q t=0 (Wt C)C s−p ⎥ t=0 ⎥ ⎢ ⎥ ⎢ p−1 s−p ⎥ ⎢ − D p−1 Iδr,s − D t=r−m+1 (Wt C)C t=r−m+1 (Wt C) ⎦ ⎣ s−p −D Iδm+p−1,s − DC and if Wp = [I − DC m−1−p (W0 C) · · · (Wp−1 C)]−1 elimination gives ⎡ ⎤ p (Wt C)C s−p−1 I −DC m−1−p t=0 ⎢ 0 Iδq,s − DC m−1−q p (Wt C)C s−p−1 ⎥ t=0 ⎥. p M →⎢ ⎣0 Iδr,s − D t=r−m+1 (Wt C)C s−p−1 ⎦ 0 Iδm+p−1,s − D(Wp C)C s−p−1 After stage p of the elimination we have T (p+1) = I − S (p+1) where ⎧ 0 for j = p ⎪ ⎪ p ⎪ m−1−p j−p−1 ⎪ (W C)C for i = p, p + 1 ≤ j DC ⎪ t ⎪ pt=0 ⎪ m−1−i j−p−1 ⎪ (W C)C for p + 1 ≤ i ≤ m − 1 DC ⎪ t t=0 ⎪ ⎪ ⎪ and p+1≤j ⎪ ⎪ ⎪ ⎪ D pt=i−m+1 (Wt C)C j−p−1 for m ≤ i ≤ m + p − 1 ⎨ (p+1) and p + 1 ≤ j = Si,j ⎪ m−1−i+j ⎪ for m + p ≤ i ≤ k − m − 1 DC ⎪ ⎪ ⎪ ⎪ and i−m+1≤j ⎪ ⎪ j−k+2m−1 t ⎪ ⎪ ⎪ C for i =k−m D ⎪ t=0 ⎪ ⎪ ⎪ and k − 2m + 1 ≤ j ⎪ ⎩ 0 for m + j ≤ i. Since column p is reduced to a zero column and row p is fixed for all subsequent stages we modify T (p+1) by dropping both column and row p.
15
Control policy for stormwater management in two connected dams
283
15.3.3 Stage m − 1 After stage m − 2 we have T (m−1) = I − S (m−1) ⎧ 0 for ⎪ ⎪ ⎪ m−2 ⎪ j−m+1 ⎪ for DC t=0 (Wt C)C ⎪ ⎪ ⎪ ⎪ ⎪ D m−2 (W C)C j−m+1 ⎪ for t ⎪ t=0 ⎪ ⎪ m−2 ⎪ j−m+1 ⎪ D t=i−m+1 (Wt C)C for ⎪ ⎪ ⎪ ⎨ and (m−1) Si,j = ⎪ for DC m−1−i+j ⎪ ⎪ ⎪ ⎪ ⎪ and ⎪ ⎪ ⎪ j−k+2m−1 t ⎪ ⎪ ⎪ D t=0 C for ⎪ ⎪ ⎪ ⎪ ⎪ and ⎪ ⎪ ⎩ 0 for
where j =m−2 i = m − 2, m − 1 ≤ j i = m − 1, m − 1 ≤ j m ≤ i ≤ 2m − 3 m−1≤j 2m − 2 ≤ i ≤ k − m − 1 i−m+1≤j i=k−m k − 2m + 1 ≤ j m + j ≤ i.
We modify T (m−1) by dropping both column and row m − 2 and consider a sub-matrix M consisting of the (m − 1, m − 1), (r, m − 1), (2m − 2, m − 1), (m − 1, s), (r, s) and (2m − 2, s)th elements, where m ≤ r ≤ 2m − 3 and m ≤ s. We have ⎡ ⎤ m−2 m−2 −D t=0 (Wt C)C s−m+1 I − D t=0 (Wt C) ⎢ ⎥ m−2 ⎢ s−m+1 ⎥ . M = ⎢ − D m−2 (W C) Iδ − D (W C)C ⎥ t r,s t t=r−m+1 t=r−m+1 ⎣ ⎦ s−m+1 −D Iδ2m−2,s − DC If we write Wm−1 = [I − D(W0 C) · · · (Wm−2 C)]−1 then the standard elimination gives ⎤ ⎡ I −D(W0 C) · · · (Wm−1 C)C s−m ⎥ ⎢ M → ⎣ 0 Iδr,s − D(Wr−m+1 C) · · · (Wm−1 C)C s−m ⎦ . 0 Iδ2m−2,s − D(Wm−1 C)C s−m After stage m − 1 of the elimination we ⎧ 0 ⎪ ⎪ m−1 ⎪ ⎪ ⎪ (Wt C)C j−m D ⎪ ⎪ t=0 ⎪ m−1 ⎪ j−m ⎪ ⎨ D t=i−m+1 (Wt C)C (m) Si,j = DC m−1−i+j ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ j−k+2m−1 t ⎪ ⎪ D t=0 C ⎪ ⎪ ⎩ 0
have T (m) = I − S (m) where for
j =m−1
for i = m − 1, m ≤ j for m ≤ i ≤ 2m − 2, m ≤ j for 2m − 1 ≤ i ≤ k − m − 1 and i − m + 1 ≤ j for i = k − m, k − 2m + 1 ≤ j for m + j ≤ i.
284
J. Piantadosi and P. Howlett
Column m − 1 is reduced to a zero column and row m − 1 is fixed for all subsequent stages. We modify T (m) by dropping both column and row m − 1.
15.3.4 The general rules for stages m to k − 2m After stage p − 1 for m ≤ p ≤ k − 2m we have T (p) = I − S (p) where ⎧ 0 for j = p − 1 ⎪ ⎪ ⎪ p−1 ⎪ j−p ⎪ (W C)C for i = p − 1, p ≤ j D t ⎪ ⎪ t=p−m ⎪ p−1 ⎪ j−p ⎪ for p ≤ i ≤ m + p − 2, p ≤ j ⎨ D t=i−m+1 (Wt C)C (p) Si,j = DC m−1−i+j for m + p − 1 ≤ i ≤ k − m − 1 ⎪ ⎪ ⎪ and i−m+1≤j ⎪ ⎪ ⎪ j−k+2m−1 t ⎪ ⎪ D t=0 C for i = k − m, k − 2m + 1 ≤ j ⎪ ⎪ ⎩ 0 for m + j ≤ i. We modify T (p) by dropping both column and row p − 1 and consider a submatrix M using the (p, p), (r, p), (m+p−1, p), (p, s), (r, s) and (m+p−1, s)th elements, where p + 1 ≤ r ≤ m + p − 2 and p + 1 ≤ s. We have ⎤ ⎡ p−1 p−1 −D t=p−m+1 (Wt C)C s−p I − D t=p−m+1 (Wt C) p−1 ⎥ ⎢ M = ⎣ −D p−1 Iδr,s − D t=r−m+1 (Wt C)C s−p ⎦ t=r−m+1 (Wt C) −D
Iδm+p−1,s − DC s−p
and if we write Wp = [I − D(Wp−m+1 C) · · · (Wp−1 C)]−1 then the standard elimination gives ⎤ ⎡ I −D(Wp−m+1 C) · · · (Wp C)C s−p−1 M → ⎣ 0 Iδr,s − D(Wr−m+1 C) · · · (Wp C)C s−p−1 ⎦ . 0 Iδm+p−1,s − D(Wp C)C s−p−1 After stage p of the elimination we have T (p+1) = I − S (p+1) where ⎧ 0 for j = p ⎪ ⎪ ⎪ ⎪ D pt=i−m+1 (Wt C)C j−p−1 for p + 1 ≤ i ≤ m + p − 1 ⎪ ⎪ ⎪ ⎪ and p + 1 ≤ j ⎪ ⎪ ⎨ DC m−1−i+j for m + p ≤ i ≤ k − m − 1 (p+1) Si,j = and i−m+1≤j ⎪ ⎪ ⎪ ⎪ D j−k+2m−1 C t ⎪ for i = k − m ⎪ t=0 ⎪ ⎪ ⎪ and k − 2m + 1 ≤ j ⎪ ⎩ 0 for m + j ≤ i. We now modify T (p+1) by dropping both column and row p.
15
Control policy for stormwater management in two connected dams
285
15.3.5 Stage k − 2m + 1 To reduce the cumbersome notation it is convenient to write p = k − 2m + 1. After stage k − 2m we have T (p) = I − S (p) where
(p)
Si,j
⎧ 0 ⎪ ⎪ k−2m ⎪ ⎪ D t=k−3m+1 (Wt C)C j−k+2m−1 ⎪ ⎪ ⎪ ⎪ ⎨ j−k+2m−1 = D k−2m t=i−m+1 (Wt C)C ⎪ ⎪ ⎪ ⎪ ⎪ j−p t ⎪ ⎪ ⎪ ⎩ D t=0 C 0
for for and for and for for
j = k − 2m i = k − 2m p≤j p≤i≤k−m−1 p≤j i = k − m, p ≤ j m + j ≤ i.
We modify T (p) by dropping both column and row k − 2m. We consider a sub-matrix M consisting of the (p, p), (r, p), (k − m, p), (p, s), (r, s) and (k − m, s)th elements, where p + 1 ≤ r ≤ k − m − 1 and p + 1 ≤ s. We have ⎤ ⎡ p−1 p−1 −D t=p−m+1 (Wt C)C s−p I − D t=p−m+1 (Wt C) p−1 ⎥ ⎢ M = ⎣ −D p−1 Iδr,s − D t=r−m+1 (Wt C)C s−p ⎦ . t=r−m+1 (Wt C) s−p −D Iδk−m,s − D t=0 C t We write Wp = [I − D(Wp−m+1 C) · · · (Wp−1 C)]−1 . In order to describe the final stages of the elimination more easily we define Xk−2m = I
and set
Xk−2m+r+1 = I + Xk−2m+r (Wk−2m+r C)
(15.4)
for each r = 0, . . . , m − 1. With this notation the standard elimination gives p ⎤ ⎡ I −D t=p−m+1 (Wt C)C s−p−1 p ⎥ ⎢ Iδr,s − D t=r−m+1 (Wt C)C s−p−1 M → ⎣0 7⎦. 6 s−p−1 t s−p−1 0 Iδk−m,s − D C + (Xp+1 C)C t=0 After stage p = k − 2m + 1 of the elimination we have T (p+1) = I − S (p+1) where ⎧ 0 for j = p ⎪ ⎪ ⎪ p j−p−1 ⎪ ⎪ (W C)C for p + 1 ≤ i ≤ k − m − 1 D t ⎪ t=i−m+1 ⎪ ⎪ ⎨ and p + 1 ≤ j (p+1) 6 Si,j = j−p−1 t ⎪ D C ⎪ t=0 ⎪ ⎪ ⎪ +(X C)C j−p−1 9 ⎪ for i = k − m, p + 1 ≤ j p+1 ⎪ ⎪ ⎩ 0 for m + j ≤ i. We modify T (k−2m+2) by dropping both column and row k − 2m + 1.
286
J. Piantadosi and P. Howlett
15.3.6 The general rule for stages k − 2m + 2 to k − m − 2 After stage p − 1 for k − 2m + 2 ≤ p ≤ where ⎧ 0 ⎪ ⎪ ⎪ D p−1 (W C)C j−p ⎪ ⎪ t ⎨ t=p−m p−1 j−p (p) D (W t C)C Si,j = t=i−m+1 7 6 ⎪ j−p t ⎪ j−p ⎪ D C + X C p ⎪ t=0 ⎪ ⎩ 0
k − m − 2 we have T (p) = I − S (p) for j = p − 1 for i = p − 1, p ≤ j for p ≤ i ≤ k − m − 1, p ≤ j for i = k − m, p ≤ j for m + j ≤ i.
We modify T (p) by dropping both column and row p − 1 and consider a submatrix M using the (p, p), (r, p), (k − m, p), (p, s), (r, s) and (k − m, s)th elements, where p + 1 ≤ r ≤ k − m and p + 1 ≤ s. We have M given by ⎡
p−1 I − D t=p−m+1 (Wt C) ⎢ −D p−1 ⎣ t=r−m+1 (Wt C) −DXp−1
p−1 ⎤ −D t=p−m+1 (Wt C)C s−p p−1 Iδr,s − D 6t=r−m+1 (Wt C)C s−p 7 ⎥ ⎦ s−p t s−p Iδk−m,s − D C + X C p t=0
and if we write Wp = [I − D(Wp−m+1 C) · · · (Wp−1 C)]−1 then the standard elimination gives ⎡
I ⎢0 M →⎣ 0
p ⎤ −D t=p−m+1 (Wt C)C s−p−1 p Iδr,s −6D t=r−m+1 (Wt C)C s−p−1 7 ⎥ ⎦. s−p−1 Iδk−m,s − D t=0 C t + Xp+1 C s−p−1
After stage p of the elimination we have T (p+1) = I − S (p+1) where
(p+1)
Si,j
⎧ 0 ⎪ ⎪ ⎪ p ⎪ D t=i−m+1 (Wt C)C j−p−1 ⎪ ⎪ ⎪ ⎨ 6 = j−p−1 t D C ⎪ t=0 ⎪ ⎪ 9 ⎪ j−p−1 ⎪ +Xp+1 C ⎪ ⎪ ⎩ 0
for j = p for p + 1 ≤ i ≤ k − m − 1 and p + 1 ≤ j for i = k − m, p + 1 ≤ j for m + j ≤ i.
We again modify T (p+1) by dropping both column and row p.
15
Control policy for stormwater management in two connected dams
287
15.3.7 The final stage k − m − 1 The matrix S (k−m−1) is given by k−m−2 D t=k−2m (Wt C) S (k−m−1) = DXk−m−1
k−m−2 D t=k−2m (Wt C)C . D[I + (Xk−m−1 C)]
Hence k−m−2 −D t=k−2m (Wt C)C . I − D[I + (Xk−m−1 C)]
T
(k−m−1)
k−m−2 I − D t=k−2m (Wt C) = −DXk−m−1
If we write Wk−m−1 = [I − D(Wk−2m C) · · · (Wk−m−2 C)]−1 then elimination gives I −D(Wk−2m C) · · · (Wk−m−1 C) . M→ 0 I − DXk−m Since the original system is singular and since we show later in this chapter that the matrices W0 , . . . , Wk−m−1 are well defined, we know that the final pivot element I −DXk−m must also be singular. Therefore the original system can be solved by finding the eigenvector corresponding to the unit eigenvalue for the matrix DXk−m and then using back substitution.
15.4 The solution process using back substitution for 1 < m < k After the Gaussian elimination has been completed the system reduces to an equation for v0 , k−m m−1 v0 = DP C (W0 C) C j−1 vj , j=1
a set of equations for vp when p = 1, 2, . . . , m − 2, % vp = DC
m−1−p
p
& (Wt C)
t=0
k−m
C j−p−1 vj ,
j=p+1
a set of equations for vq when q = m − 1, m, . . . , k − m − 1, % vq = D
q
& (Wt C)
t=q−m+1
k−m
C j−q−1 vj ,
j=q+1
and finally an equation for vk−m (I − DXk−m )vk−m = 0.
288
J. Piantadosi and P. Howlett
We begin by solving the final equation to find vk−m . The penultimate equation now shows us that % k−m−1 & (Wt C) vk−m . vk−m−1 = D t=k−2m
We proceed by induction. We suppose that m ≤ q and that for all s with q ≤ s ≤ k − m we have % k−m−1 & (Wt C) vk−m vs = D t=s−m+1
and
k−m
C
j−s−1
%k−m−1 & vj = (Wt C) vk−m . t=s+1
j=s+1
The hypothesis is clearly true for q = k − m − 1. Now we have k−m
C j−q−1 vj
j=q
= vq + C
k−m
C j−q−2 vj
j=q+1
=
%
k−m−1
D
&
%k−m−1 & (Wt C) + C (Wt C) vk−m
t=q−m+1
=
%
q−1
D
t=q+1
&
-
(Wt C) Wq + I
%k−m−1 & ×C (Wt C) vk−m
t=q−m+1
t=q+1
%k−m−1 & = (Wt C) vk−m t=q
and hence % vq−1 = D
q−1
& k−m (Wt C) C j−q vj
t=q−m
j=q
%k−m−1 & =D (Wt C) vk−m . t=q−m
Thus the hypothesis is also true for m − 1 ≤ q − 1 ≤ s ≤ k − m. To complete the solution we note that the pattern changes at this point. We still have
15
Control policy for stormwater management in two connected dams k−m
289
C j−m+1 vj
j=m−1
= vm−1 + C
k−m
C j−m vj
j=m
=
%k−m−1 & %k−m−1 & D (Wt C) + C (Wt C) vk−m t=m
t=0
=
%m−2 & %k−m−1 & D (Wt C) Wm−1 + I × C (Wt C) vk−m t=m
t=0
%k−m−1 & = (Wt C) vk−m t=m−1
but now we have vm−2 = DC
%m−2
&
k−m
(Wt C)
t=0
C j−m+1 vj
j=m−1
%k−m−1 & = DC (Wt C) vk−m . t=0
We use induction once more. Let 1 ≤ p ≤ m − 2 and for p ≤ s ≤ m − 2 we suppose that %k−m−1 & m−s−1 vs = DC (Wt C) vk−m t=0 k−m
%k−m−1 & C j−s−1 vj = (Wt C) vk−m .
j=s+1
t=s+1
and
The hypothesis is true for p = m − 2. Now we have k−m
C j−p−1 vj
j=p
= vp + C
C j−p−2 vj
j=p+1
=
k−m
DC m−p−1
%p−1 t=0
& (Wt C) Wp + I
%k−m−1 & = (Wt C) vk−m t=p
-
%k−m−1 & ×C (Wt C) vk−m t=p+1
290
J. Piantadosi and P. Howlett
and hence vp−1 = DC m−p
%p−1
& k−m (Wt C) C j−q vj
t=0
j=p
%k−m−1 & = DC m−p (Wt C) vk−m . t=0
Thus the hypothesis is also true for 0 ≤ p − 1 ≤ s ≤ k − m. In summary we have the solution to Equation (15.3) given by %k−m−1 & m−p−1 vp = DC (Wt C) vk−m (15.5) t=0
for p = 0, 1, . . . , m − 2 and % vq = D
k−m−1
& (Wt C) vk−m
(15.6)
t=q−m+1
for q = m − 1, m, . . . , k − m − 1. The original solution can now be recovered through u 0 y= = and x = (I − F )−1 y, v v where x = x[m] is the steady-state vector for the original system using the control policy with level m. The steady-state vector can be used to calculate the expected amount of water lost from the system when this policy is implemented. The cost of a particular policy will depend on the expected volume of water that is wasted and on the pumping costs. This cost will assist in determining an optimal pumping policy for the system.
15.5 The solution process for m = 1 For the case when m = 1 the final equation is given by (I − D)vk = 0. We will show that (I − D)−1 is well defined and hence deduce that vk = 0. Since 0 L2 D= 0 M2 it follows that (I − D)−1 is well defined if and only if (I − M2 )−1 is well defined. We have the following result.
15
Control policy for stormwater management in two connected dams
291
Lemma 2. The matrix M2 k+1 is strictly sub-stochastic and (I − M2 )−1 is well defined by the formula (I − M2 )−1 = (I + M2 + M2 2 + · · · + M2 k )(I − M2 k+1 )−1 . Proof. We observe that 1T M2 = [p1 + , 1, 1, . . . , 1] and suppose that 1T M2 r = [α0 , α1 , . . . , αr−1 , 1, 1, . . . , 1] for each r = 1, 2, . . . , q, where αj = αj (r) ∈ (0, 1). Now it follows that 1T M2 q+1 = [α0 , α1 , . . . , αq−1 , 1, 1, . . . , 1]M2 = α0 [p1 , p0 , 0, . . . , 0, 0, 0, . . . , 0] + α1 [p2 , p1 , p0 , . . . , 0, 0, 0, . . . , 0] + · · · ··· + αq−2 [pq−1 , pq−2 , pq−3 , . . . , p0 , 0, 0, . . . , 0] + αq−1 [pq , pq−1 , pq−2 , . . . , p1 , p0 , 0, . . . , 0] + [pq+1 , pq , pq−1 , . . . , p2 , p1 , p0 , . . . , 0] + · · · ··· + [pk , pk−1 , pk−2 , . . . , pk−q+1 , pk−q , pk−q−1 , . . . , p0 ] + [pk+1 + , pk + , pk−1 + , . . . , pk−q+2 + , pk−q+1 + , pk−q + , . . . , p1 + ]. The first element in the resultant row matrix is α0 p1 + α1 p2 + · · · + αq−1 pq + pq+1 + = β0 < 1, and the second element is α0 p0 + α1 p1 + · · · + αq−1 pq−1 + pq + = β1 < 1. A similar argument shows that the j th element is less than 1 for all j ≤ q and indeed, for the critical case j = q, the j th element is given by αq−1 p0 + p1 + = βq < 1. The remaining elements for q < j ≤ k are easily seen to be equal to 1. Hence the hypothesis is also true for r = q + 1. By induction it follows that 1T M2 k+1 < 1T .
292
J. Piantadosi and P. Howlett
Hence (I − M2 k+1 )−1 is well defined. The matrix (I − M2 )−1 is now defined by the identity above. This completes the proof. By back substitution into the equation (I −S)v = 0 we can see that vp = 0 for p = 1, . . . , k − 1 and finally that (I − DP )v0 = 0. Since DP is a stochastic matrix, the eigenvector v0 corresponding to the unit eigenvalue can be found. We know that u 0 y= = v v and hence we can calculate the steady-state vector x = (I − F )−1 y.
15.6 The solution process for m = k In this particular case we need to solve the equation (I − DP )v0 = 0. Hence we can find the eigenvector v0 corresponding to the unit eigenvalue of the matrix DP and the original steady-state solution x can be recovered through u 0 y= = v v and x = (I − F )−1 y.
15.7 A numerical example We will consider a system of two connected dams with discrete states z1 ∈ {0, 1, 2, 3, 4, 5, 6} and z2 ∈ {0, 1, 2, 3, 4}. Assume that the inflow to the first dam is defined by pr = (0.5)r+1 for r = 0, 1, . . . and consider the control policy with m = 2. The transition probability matrix has the block matrix form ⎡ ⎤ A 0 B 0 0 ⎢ ⎥ 0 0⎥ ⎢A 0 B ⎢ ⎥ 0⎥ H=⎢ ⎢0 A 0 B ⎥, ⎢ ⎥ 0 A 0 B⎦ ⎣0 0 0 0 A B
15
Control policy for stormwater management in two connected dams
where
⎡1 2
⎢ ⎢0 ⎢ ⎢ ⎢0 ⎢ ⎢ A=⎢ ⎢0 ⎢ ⎢0 ⎢ ⎢ ⎢0 ⎣ 0 and
⎡
0
⎢ ⎢0 ⎢ ⎢1 ⎢2 ⎢ ⎢ B =⎢0 ⎢ ⎢ ⎢0 ⎢ ⎢ ⎣0 0
1 4 1 2
1 8 1 4
1 16 1 8
1 32 1 16
1 64 1 32
1 64 1 32
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1 4 1 2
0
1 8 1 4 1 2
0
0
1 16 1 8 1 4 1 2
0
0
0
1 32 1 16 1 8 1 4 1 2
1 64 1 32 1 16 1 8 1 4
293
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤
⎥ 0 ⎥ ⎥ 1 ⎥ 64 ⎥ ⎥ 1 ⎥ . 32 ⎥ ⎥ 1 ⎥ 16 ⎥ ⎥ 1 ⎥ 8 ⎦ 1 4
As explained in Subsection 15.2.3 we solve the equation (I − S)v = 0, where S = [Si,j ] for i, j = {0, 1, 2}. Using the elimination process described in Section 15.3 we find the reduced coefficient matrix ⎤ ⎡ I −W0 D(I − C)−1 C 2 −W0 D(I − C)−1 C 3 ⎦, I −W1 DW0 C 2 (I − S) → ⎣ 0 0 0 I − D(I + W1 C) where ⎡
447 2296
491 2296
1 2
0
0
0
1065 4592
1 4
1 2
0
0
⎢ ⎢ 1021 ⎢ 4592 ⎢ ⎢ 1849 ⎢ 9184 ⎢ ⎢ 2677 D(I + W1 C) = ⎢ ⎢ 18368 ⎢ ⎢ 619 ⎢ 5248 ⎢ ⎢ 619 ⎢ 10496 ⎣
1805 9184
1 8
1 4
1 2
0
2545 18368
1 16
1 8
1 4
1 2
575 5248
1 32
1 16
1 8
1 4
575 10496
1 64
1 32
1 16
1 8
619 10496
575 10496
1 64
1 32
1 16
1 8
0
⎤
⎥ 0⎥ ⎥ ⎥ ⎥ 0⎥ ⎥ ⎥ 0⎥ ⎥. ⎥ 1⎥ 2⎥ ⎥ 1⎥ 4⎥ ⎦ 1 4
294
J. Piantadosi and P. Howlett
We solve the matrix equation (I − D(I + W1 C))v2 = 0 to find the vector ' 2787 821 12337 9053 7411 7411 7411 (T , , , , , , v2 = 15124 3781 60496 60496 60496 120992 120992 and, by the back substitution process of Section 15.4, the vectors v1 =
' 2845 6197 10563 14929 23661 23661 23661 (T , , , , , , 13408 26816 53632 107264 214528 429056 429056
and v0 =
'1 4
,
1 3 1 3 3 3 (T , , , , , . 4 16 8 32 64 64
The probability measure is the steady-state vector x given by ' 1718 983 983 983 983 983 983 62375 , 12475 , 24950 , 49900 , 99800 , 199600 , 199600 , 1718 3197 3197 3197 3197 3197 62375 , 62375 , 124750 , 249500 , 499000 , 998000 ,
3197 998000 ,
3436 4676 569 62375 , 62375 , 12475 ,
1676 2183 2183 62375 , 124750 , 249500 ,
2183 249500 ,
2816 3888 9959 6071 4127 4127 62375 , 62375 , 249500 , 249500 , 249500 , 499000 ,
4127 499000 ,
2787 3284 12337 9053 7411 7411 62375 , 62375 , 249500 , 249500 , 249500 , 499000 ,
7411 499000
(T .
Using the steady-state vector x we can calculate the expected overflow of water from the system. Let z = z(s) = (z1 (s), z2 (s)) for s = 1, 2, . . . , n denote the collection of all possible states. The expected overflow is calculated by %∞ & n f [z(s)|r]pr xs , J= s=1
r=0
where f [z(s)|r] is the overflow from state z(s) when r units of stormwater enter the first dam. We will consider the same pumping policy for four different values m = 1, 2, 3, 4 of the control parameter. We obtain the steady-state vector x = x[m] for each particular value of the control parameter and determine the expected total overflow in each case. Table 15.1 compares the four parameter values by considering the overflow Ji = Ji [m] from the first and second dams. From the table it is clear that the first pumping policy results in less overflow from the system. If pumping costs are ignored then it is clear that the policy m = 1 is the best. Of course, in a real system there are likely to
15
Control policy for stormwater management in two connected dams
295
Table 15.1 Overflow lost from the system for m = 1, 2, 3, 4 m=1
m=2
m=3
m=4
J1
1 7
1 25
1 25
1 17
J2
0
9053 62375
263 1350
57 272
Total
1 7
11548 62375
317 1350
73 272
be other cost factors to consider. It is possible that less frequent pumping of larger volumes may be more economical.
15.8 Justification of inverses To justify the solution procedure described earlier we will show that Wr is well defined for r = 0, 1, . . . , k − m − 1. From the definition of the transition matrix H = H(A, B) in Subsection 15.2.2 and the subsequent definition of C = AT and D = B T we can see that C and D can be written in block form as 0 L2 L1 0 , and D = C= 0 M2 M1 0 where L1 ∈ IRm×m and M1 ∈ IR(h−m+1)×m are given by ⎡ ⎤ ⎡ ⎤ p1 pm pm−1 · · · p0 0 ··· 0 ⎢ pm+1 pm · · · p2 ⎥ ⎢ p1 ⎢ ⎥ p0 · · · 0 ⎥ ⎢ ⎢ .. ⎥ ⎥ .. .. L1 = ⎢ . , M1 = ⎢ . ⎥ ⎥ . . . . · · · . .. . . .. ⎦ ⎢ ⎥ ⎣ .. ⎣ ph−1 ph−2 · · · ph−m ⎦ pm−1 pm−2 · · · p0 + p+ p+ h h−1 · · · ph−m+1 and where L2 ∈ IRm×(h−m+1) and M2 ∈ IR(h−m+1)×(h−m+1) are given by ⎤ ⎡ p0 · · · 0 0 · · · 0 ⎢ p1 · · · 0 0 · · · 0 ⎥ ⎥ ⎢ L2 = ⎢ . . . .⎥ ⎣ .. · · · .. .. · · · .. ⎦ pm−1 · · · p0 0 · · · 0
and
296
J. Piantadosi and P. Howlett
⎡
p1 0 pm · · · ⎢ pm+1 · · · p p 2 1 ⎢ ⎢ .. . . . . M2 = ⎢ . · · · . . ⎢ ⎣ ph−1 · · · ph−m ph−m−1 + p+ · · · p+ h h−m+1 ph−m
··· ···
0 0 .. .
⎤
⎥ ⎥ ⎥ ⎥. ··· ⎥ · · · pm−1 ⎦ · · · p+ m
15.8.1 Existence of the matrix W0 Provided pj > 0 for all j ≤ h the matrix L1 is strictly sub-stochastic [3] with 1T L1 < 1T . It follows that (I − L1 )−1 is well defined and hence (I − L1 )−1 0 P = (I − C)−1 = M1 (I − L1 )−1 I is also well defined. Note also that 1T C = [1T , 0T ] and 1T D = [0T , 1T ]. We begin with an elementary but important result. Lemma 3. With the above definitions, 1T DP = 1T
and
1T DP C m−1 ≤ 1T C m−1
(15.1)
and the matrix W0 = [I − DP C m−1 ]−1 is well defined. Proof. 1T D + 1T C = 1T
⇒
1T D = 1T (I − C)
⇒
1T D(I − C)−1 = 1T .
Hence L1 m−1 0 1T DP C m−1 = 1T C m−1 = [1T , 1T ] M1 L1 m−2 0 4 3 T m−1 = 1 L1 + 1T M1 Lm−2 , 0T 1 4 3 , 0T = (1T L1 + 1T M1 )Lm−2 1 4 3 , 0T < 1T = 1T Lm−2 1 for m ≥ 3. Hence W0 = [I − DP C m−1 ]−1 is well defined.
15.8.2 Existence of the matrix Wp for 1 ≤ p ≤ m − 1 We will consider the matrix W1 = [I − DC m−2 (W0 C)]−1 .
15
Control policy for stormwater management in two connected dams
297
We have 1T DC m−2 + 1T DP C m−1 = 1T D[I + P C]C m−2 = 1T DP C m−2 = 1T C m−2 . Since 1T C m−2 ≤ 1T we deduce that 1T DC m−2 + (1T C m−2 )DP C m−1 ≤ 1T C m−2 , from which it follows that 1T DC m−2 ≤ 1T C m−2 [I − DP C m−1 ] and hence that 1T DC m−2 (W0 C) ≤ 1T C m−1 < 1T . Since the column sums of the matrix DC m−2 (W0 C) are all non-negative and less than one, it follows that the inverse matrix W1 = [I − DC m−2 (W0 C)]−1 is well defined. Lemma 4. With the above definitions the matrix Wp is well defined for each p = 0, 1, . . . , m − 1 and for each such p we have ⎤ ⎡ m−1−p p−1 T j⎦ ⎣ C (Wt C) ≤ 1T C p . 1 D t=0
j=0
In the special case when p = m − 1 the inequality becomes 1T D
m−2
(Wt C) ≤ 1T C m−1 .
t=0
Proof. The proof is by induction. We note that ⎤ ⎡ m−2 1T D ⎣ C j + P C m−1 ⎦ = 1T DP = 1T j=0
and hence
⎡
m−2
1T D ⎣
j=0
from which it follows that
⎤ C j ⎦ = 1T [I − DP C m−1 ],
298
J. Piantadosi and P. Howlett
⎤
⎡
m−2
1T D ⎣
C j ⎦ (W0 C) = 1T C.
j=0
Thus the result is true for p = 1. Suppose the result is true for 1 ≤ s ≤ p − 1. Then p−2 (Wt C) ≤ 1T C p−1 < 1T 1T DC m−p t=0
and hence
Wp−1 = [I − DC m−p (W0 C) · · · (Wp−2 C)]−1
is well defined. Now we have ⎤ ⎡ m−1−p p−2 p−2 1T D ⎣ Cj⎦ (Wt C) + (1T C p−1 )DC m−p (Wt C) j=0
t=0
⎡
m−p
≤ 1T D ⎣
⎤
Cj⎦
t=0 p−2
(Wt C)
t=0
j=0
≤ 1T C p−1 and hence
⎡
⎤
1T D ⎣
Cj⎦
m−1−p j=0
p−2
(Wt C) ≤ (1T C p−1 )[I − DC m−p
t=0
p−2
(Wt C)],
t=0
from which it follows that ⎤ % ⎡ & m−1−p p−2 1T D ⎣ Cj⎦ × (Wt C) Wp−1 ≤ 1T C p−1 . t=0
j=0
If we multiply on the right by C we obtain the desired result ⎤ ⎡ m−1−p p−1 1T D ⎣ Cj⎦ (Wt C) ≤ 1T C p . j=0
t=0
Thus the result is also true for s = p. This completes the proof.
15.8.3 Existence of the matrix Wq for m ≤ q ≤ k − m − 1 We need to establish some important identities.
15
Control policy for stormwater management in two connected dams
Lemma 5. The JP identities of the first kind ⎧ ⎫ ⎡ ⎤ ⎤ ⎡ p−1 p−1 m−p−1 p−1 ⎨ ⎬ ⎣ 1T D I + (Wt C)⎦ + ⎣ Cj⎦ (Wt C) = 1T ⎩ ⎭ j=1
t=p−j
299
(15.2)
t=0
j=0
are valid for each p = 1, 2, . . . , m − 1. Proof. From the identity m−2
C j + C m−1 P = P
j=0
and Lemma 3 we deduce that ⎡ ⎤ m−2 C j + C m−1 P ⎦ = 1T . 1T D ⎣ j=0
By rearranging this identity we have ⎡ ⎤ m−2 1T D ⎣ C j ⎦ = 1T [I − DC m−1 P ] j=0
⎤
⎡
and hence
m−2
1T D ⎣
C j ⎦ (W0 C) = 1T C,
j=0
from which it follows that ⎧ ⎫ ⎤ ⎡ m−2 ⎨ ⎬ 1T D I + ⎣ C j ⎦ (W0 C) = 1T C + 1T D = 1T . ⎩ ⎭ j=0
Therefore the JP identity of the first kind is valid for p = 1. We will use induction to establish the general identity. Let p > 1 and suppose the result is true for s = p < m − 1. From ⎧ ⎫ ⎡ ⎤ ⎤ ⎡ p−1 p−1 m−p−1 p−1 ⎨ ⎬ ⎣ (Wt C)⎦ + ⎣ Cj⎦ (Wt C) = 1T 1T D I + ⎩ ⎭ j=1
t=p−j
j=0
t=0
we deduce that ⎧ ⎫ ⎡ ⎤ ⎤ ⎡ p−1 p−1 m−p−2 p−1 ⎨ ⎬ ⎣ 1T D I + (Wt C)⎦ + ⎣ Cj⎦ (Wt C) ⎩ ⎭ j=1
t=p−j
= 1 [I − DC T
m−p−1
j=0
(W0 C) · · · (Wp−1 C)]
t=0
300
J. Piantadosi and P. Howlett
and hence that ⎧ ⎫ ⎡ ⎤ ⎤ ⎡ p−1 p−1 m−p−2 p−1 ⎨ ⎬ ⎣ 1T D I + (Wt C)⎦ + ⎣ Cj⎦ (Wt C) (Wp C) = 1T C. ⎩ ⎭ j=1
t=p−j
t=0
j=0
If we rewrite this in the form ⎧ ⎫ ⎡ ⎤ ⎤ ⎡ p p m−p−2 p ⎨ ⎬ ⎣ 1T D (Wt C)⎦ + ⎣ C j ⎦ (Wt C) = 1T C ⎩ ⎭ j=1
t=p+1−j
then it is clear that ⎧ ⎡ p ⎨ ⎣ 1T D I + ⎩ j=1
t=0
j=0
⎡
⎤
(Wt C)⎦ + ⎣
Cj⎦
⎤
p
t=p+1−j
m−p−2 j=0
p t=0
⎫ ⎬ (Wt C) ⎭
= 1T C + 1T D = 1T . Hence the result is also true for s = p + 1. This completes the proof. Lemma 6. The matrix Wq exists for q = m, m + 1, . . . , k − m − 1 and for each such q the JP identities of the second kind ⎧ ⎡ ⎤⎫ m−1 ⎨ ⎬ q−1 ⎣ (Wt C)⎦ = 1T (15.3) 1T D I + ⎩ ⎭ j=1
t=q−j
are also valid. Proof. The JP identities of the second kind are established in the same way that we established the JP identities of the first kind but care is needed because it is necessary to establish that each Wq is well defined. From Lemma 5 the JP identity of the first kind for p = 1 is ⎧ ⎫ ⎤ ⎡ m−3 ⎨ ⎬ C j ⎦ (W0 C) = 1T . 1T D I + ⎣ ⎩ ⎭ j=0
Therefore
⎧ ⎫ ⎤ ⎡ m−3 ⎨ ⎬ 1T D I + ⎣ C j ⎦ (W0 C) + 1T DC m−2 (W0 C) = 1T ⎩ ⎭ j=0
and hence 1T D
⎧ ⎨
⎡
m−3
I +⎣
⎩
j=0
⎫ ⎬ C j ⎦ (W0 C) = 1T [I − DC m−2 (W0 C)], ⎭ ⎤
15
Control policy for stormwater management in two connected dams
301
from which we obtain ⎧ ⎫ ⎤ ⎡ m−3 ⎨ ⎬ 1T D I + ⎣ C j ⎦ (W0 C) (W1 C) = 1T C. ⎩ ⎭ j=0
In general if we suppose that ⎧ ⎫ ⎤ ⎡ m−p ⎨ ⎬ p−2 1T D I + ⎣ C j ⎦ (W0 C) (Wt C) ≤ 1T C p−2 ⎩ ⎭ t=1
j=0
then we have ⎧ ⎫ ⎤ ⎡ m−p−1 p−2 ⎨ ⎬ p−2 1T D I + ⎣ C j ⎦ (W0 C) (Wt C) + (1T C p−2 )DC m−p (Wt C) ⎩ ⎭ t=1
j=0
≤1 C T
t=0
p−2
and hence 1T D
⎧ ⎨
⎡
m−p−1
I +⎣
⎩
j=0
⎫ ⎬ p−2 C j ⎦ (W0 C) (Wt C) ⎭ ⎤
t=1
≤ (1T C p−2 )[I − DC m−p
p−2
(Wt C)],
t=0
from which we obtain ⎧ ⎫ ⎤ ⎡ m−p−1 ⎨ ⎬ p−1 1T D I + ⎣ C j ⎦ (W0 C) (Wt C) ≤ 1T C p−1 . ⎩ ⎭ t=1
j=0
By continuing this process until W0 is eliminated we obtain the inequality 1T D
m−1
(Wt C) ≤ 1T C m−1 .
t=1
Therefore the matrix Wm = [I − D(W1 C) · · · (Wm−1 C)]−1 is well defined. The JP identity of the first kind with p = m − 1 gives ⎧ ⎫ ⎡ ⎤ m−2 m−2 m−2 ⎨ ⎬ ⎣ 1T D I + (Wr C)⎦ + (Wt C) = 1T ⎩ ⎭ j=1
t=m−1−j
t=0
302
J. Piantadosi and P. Howlett
which, after rearrangement, becomes ⎧ ⎡ ⎤⎫ m−2 m−2 m−2 ⎨ ⎬ T T ⎣ ⎦ 1 D I+ (Wt C) (Wt C)] = 1 [I − D ⎩ ⎭ j=1
t=0
t=m−1−j
and allows us to deduce ⎧ ⎡ m−2 ⎨ ⎣ 1T D I + ⎩ j=1
m−2
t=m−1−j
⎤⎫ ⎬ (Wt C)⎦ (Wm−1 C) = 1T C. ⎭
If we rewrite this in the form ⎧ ⎫ ⎡ ⎤ m−1 ⎨m−1 ⎬ m−1 ⎣ 1T D (Wt C)⎦ + (Wt C) = 1T C ⎩ ⎭ j=1
then it follows that 1T D
t=0
t=m−j
⎧ ⎨ I+
⎩
m−1 j=1
⎡ ⎣
m−1
t=m−j
⎤⎫ ⎬ (Wt C)⎦ = 1T . ⎭
Thus the JP identity of the second kind is true for q = m. We proceed by induction. We suppose that the matrix Ws is well defined for each s with m ≤ s ≤ q < k − m − 1 and that the JP identity of the second kind ⎧ ⎡ ⎤⎫ m−1 ⎨ ⎬ s−1 ⎣ 1T D I + (Wt C)⎦ = 1T ⎩ ⎭ j=1
t=s−j
is valid for these values of s. Therefore ⎧ ⎡ ⎤⎫ m−2 ⎨ ⎬ q−1 ⎣ 1T D I + (Wt C)⎦ + 1T D ⎩ ⎭ j=1
t=q−j
q−1
(Wt C) = 1T
t=q−m+1
and hence ⎧ ⎡ ⎤⎫ m−2 ⎨ ⎬ q−1 ⎣ 1T D I + (Wt C)⎦ = 1T [I − D(Wq−m+1 C) · · · (Wq−1 C)]. ⎩ ⎭ j=1
t=q−j
Since Wq = [I − D(Wq−m+1 C) · · · (Wq−1 C)]−1 is well defined, we have ⎧ ⎡ ⎤⎫ m−2 ⎨ ⎬ q−1 ⎣ 1T D I + (Wt C)⎦ (Wq C) = 1T C ⎩ ⎭ j=1
t=q−j
15
Control policy for stormwater management in two connected dams
which we rewrite as
⎧ ⎡ ⎨m−1 ⎣ 1T D ⎩ j=1
q
t=q+1−j
and from which it follows that ⎧ ⎡ m−1 ⎨ ⎣ 1T D I + ⎩ j=1
303
⎤⎫ ⎬ (Wt C)⎦ = 1T C ⎭
q
t=q+1−j
⎤⎫ ⎬ (Wt C)⎦ = 1T . ⎭
Hence the JP identity of the second kind is valid for s = q + 1. To show that Wq+1 is well defined, we must consider two cases. When q ≤ 2m − 2 we set p = q − m + 2 in the JP identity of the first kind to give ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ q+1−m q+1−m 2m−q−1 q+1−m ⎣ (Wt C)⎦+⎣ C j⎦ (Wt C)⎦ = 1T . 1T D ⎣I + j=1
t=q−m+2−j
Therefore ⎧ ⎡ q+1−m ⎨ ⎣ 1T D I + ⎩ j=1
= 1 [I − DC and hence 1T D
⎧ ⎨ I+
⎡
⎤
(Wt C)⎦ + ⎣
Cj⎦
⎤
q+1−m
2m−q−2
t=q−m+2−j
T
2m−q−1
q+1−m
⎩
⎡
⎡ ⎣
j=1
+⎣
j=0
t=0
⎤
q+1−m
⎤
Cj⎦
(Wt C)⎦ ⎫ ⎬ (Wt C) (Wq−m+2 C) = 1T C. ⎭
q+1−m t=0
j=0
Since 1T C ≤ 1T it follows that ⎧ ⎡ ⎤ q+1−m q+1−m ⎨ ⎣ (Wt C)⎦ 1T D I + ⎩ j=1 t=q−m+2−j ⎫ ⎡ ⎤ 2m−q−3 q+1−m ⎬ + ⎣ Cj⎦ (Wt C) (Wq−m+2 C) ⎭ j=0
⎫ ⎬ (Wt C) ⎭
q+1−m
(W0 C) · · · (Wq+1−m C)]
t=q−m+2−j
2m−q−2
t=0
j=0
t=0
+ (1T C)DC 2m−q−2
q−m+2
(Wt C) ≤ 1T C
t=0
304
J. Piantadosi and P. Howlett
and hence that
⎧ ⎨
q+1−m
⎡
⎤
q+1−m
⎣ (Wt C)⎦ I+ ⎩ j=1 t=q−m+2−j ⎫ ⎤ ⎡ 2m−q−3 q+1−m ⎬ Cj⎦ (Wt C) (Wq−m+2 C) +⎣ ⎭
1T D
t=0
j=0
≤ (1T C)[I − DC 2m−q−2 (W0 C) · · · (Wq−m+2 C)]. Now we can deduce that ⎧ ⎡ ⎤ q+1−m q+1−m ⎨ ⎣ (Wt C)⎦ 1T D I + ⎩ j=1 t=q−m+2−j ⎫ ⎤ ⎡ 2m−q−3 q+1−m ⎬ Cj⎦ (Wt C) (Wq−m+2 C)(Wq−m+3 C) ≤ 1T C 2 . +⎣ ⎭ t=0
j=0
If we continue this process the terms of the second sum on the left-hand side will be eliminated after 2m − q steps, at which stage we have ⎧ ⎡ ⎤⎫ q+1−m q+1−m ⎨ ⎬ m−1 T ⎣ ⎦ 1 D I+ (Wt C) (Wt C) ≤ 1T C 2m−q . ⎩ ⎭ j=1
t=q−m+2
t=q−m+2−j
The details change slightly but we continue the elimination process. Since (1T C 2m−q ) < 1T we now have ⎧ ⎡ ⎤⎫ q−m q+1−m ⎨ ⎬ m−1 ⎣ 1T D I + (Wt C)⎦ (Wt C) ⎩ ⎭ j=1
t=q−m+2
t=q−m+2−j
+ (1T C 2m−q )D
m−1
(Wt C) ≤ 1T C 2m−q
t=1
and hence 1T D
⎧ ⎨ I+
⎩
q−m j=1
⎡ ⎣
q+1−m
t=q−m+2−j
⎤⎫ ⎬ (Wt C)⎦ ⎭
m−1
(Wt C)
t=q−m+2
≤ (1T C 2m−q )[I − D(W1 C) · · · (Wm−1 C)], from which we obtain
15
Control policy for stormwater management in two connected dams
1T D
⎧ ⎨ ⎩
I+
q−m
⎡ ⎣
j=1
q+1−m
t=q−m+2−j
⎤⎫ ⎬ (Wt C)⎦ ⎭
m
305
(Wt C) ≤ 1T C 2m−q+1 .
t=q−m+2
The elimination continues in the same way until we eventually conclude that T
1 D
q
(Wt C) ≤ 1T C m−1 < 1T
t=q−m+2
and hence establish that Wq+1 = [I − D(Wq−m+2 C) · · · (Wq C)]−1 is well defined. A similar but less complicated argument can be carried through using the appropriate JP identity of the second kind when q ≥ 2m − 1. Hence the matrix Ws is well defined for s = q + 1. This completes the proof.
15.9 Summary We have established a general method of analysis for a class of simple control policies in a system of two connected dams where we assume a stochastic supply and regular demand. We calculated steady-state probabilities for each particular policy within the class and hence determined the expected overflow from the system. A key finding is that calculation of the steady-state probability vector for a large system can be reduced to a much smaller calculation using the block matrix structure. We hope to extend our considerations to more complex control policies in which the decision to pump from the first dam requires that the content of the first dam exceeds a particular level m1 and also that the content of the second dam is less than the level m2 = k − m1 . We observe that for this class the transition matrix can be written in block matrix form using the matrices A and B described in this article in almost the same form but with the final rows containing only one non-zero block matrix R. Thus it seems likely that the methodology presented in this chapter could be adapted to provide a general analysis for this new class of pumping policies. Ultimately we would like to extend our considerations to include more complicated connections and the delays associated with treatment of stormwater. We also believe a similar analysis is possible for the policies considered in this chapter when a continuous state space is used for the first dam. The matrices must be replaced by linear integral operators but the overall block structure remains the same.
306
J. Piantadosi and P. Howlett
References 1. D. R. Cox and H. D. Miller, The Theory of Stochastic Processes (Methuen & Co., London, 1965). 2. F. R. Gantmacher, The Theory of Matrices (Chelsea Publishing Company, New York, 1960). 3. D. L. Isaacson and R. W. Madsen, Markov Chains: Theory and Applications (John Wiley & Sons, New York, 1976). 4. P. A. P. Moran, A probability theory of dams and storage systems, Austral. J. Appl. Sci. 5 (1954), 116–124. 5. P. A. P. Moran, An Introduction to Probability Theory (Clarendon Press, Oxford, 1968). 6. P. A. P. Moran, A Probability Theory of Dams and Storage Systems (McGraw-Hill, New York, 1974). 7. G. F. Yeo, A finite dam with exponential release, J. Appl. Probability 11 (1974), 122–133. 8. G. F. Yeo, A finite dam with variable release rate, J. Appl. Probability 12 (1975), 205–211.
Chapter 16
Optimal design of linear consecutive–k–out–of–n systems Malgorzata O’Reilly
Abstract A linear consecutive–k–out–of–n:F system is an ordered sequence of n components that fails if and only if at least k consecutive components fail. A linear consecutive–k–out–of–n:G system is an ordered sequence of n components that works if and only if at least k consecutive components work. This chapter establishes necessary conditions for the variant optimal design and procedures to improve designs not satisfying these conditions for linear consecutive systems with 2k ≤ n ≤ 3k. Key words: Linear consecutive–k–out–of–n:F system, linear consecutive– k–out–of–n:G system, variant optimal design, singular design, nonsingular design
16.1 Introduction 16.1.1 Mathematical model A linear consecutive–k–out–of–n:F system ([11], [20]–[23]) is a system of n components ordered in a line, such that the system fails if and only if at least k consecutive components fail. A linear consecutive–k–out–of–n:G system is a system of n components ordered in a line, such that the system works if and only if at least k consecutive components work. A particular arrangement of components in a system is referred to as a design and a design that maximizes system reliability is referred to as optimal. We assume the following:
Malgorzata O’Reilly School of Mathematics and Physics, University of Tasmania, Hobart TAS 7001, AUSTRALIA e-mail: [email protected] C. Pearce, E. Hunt (eds.), Structure and Applications, Springer Optimization and Its Applications 32, DOI 10.1007/978-0-387-98096-6 16, c Springer Science+Business Media, LLC 2009
307
308
1. 2. 3. 4.
M. O’Reilly
the system is either in a failing or a working state; each component is either in a failing or a working state; the failures of the components are independent; component reliabilities are distinct and within the interval (0, 1).
The fourth assumption is made for the clarity of presentation, without loss of generality. Cases that include reliabilities 0 and 1 can be viewed as limits of other cases. Some of the strict inequalities will become nonstrict when these cases are included. Note also that, in all procedures, an improved design X = (q1 , . . . , qn ) and its reverse X = (qn , . . . , q1 ) are considered to be equivalent.
16.1.2 Applications and generalizations of linear consecutive–k–out–of–n systems Two classic examples of consecutive–2–out–of–n:F systems were given by Chiang and Niu in [11]: • a telecommunication system with n relay stations (satellites or ground stations) which fails when at least 2 consecutive stations fail; and • an oil pipeline system with n pump stations which fails when at least 2 consecutive pump stations are down. Kuo, Zhang and Zuo [24] gave the following example of a linear consecutive–k–out–of–n:G system: • consider n parallel-parking spaces on a street, with each space being suitable for one car. The problem is to find the probability that a bus, which takes 2 consecutive spaces, can park on this street. More examples of these systems are in [2, 19, 34, 35, 38]. For a review of the literature on consecutive–k–out–of–n systems the reader is referred to [8]. Also see [5] by Chang, Cui and Hwang. Introducing more general assumptions and considering system topology has led to some generalizations of consecutive–k–out–of–n systems. These are listed below: • • • • • • • •
consecutively connected systems [32]; linearly connected systems [6, 7, 14]; consecutive–k–out–of–m–from–n: F systems [36]; consecutive–weighed–k–out–of–n: F systems [37]; m–consecutive–k–out–of–n: F systems [15]; 2–dimensional consecutive–k–out–of–n: F systems [31]; connected–X–out–of–(m, n): F lattice systems [3]; connected–(r, s)–out–of–(m, n): F lattice systems [27];
16
• • • •
Optimal design of linear consecutive–k–out–of–n systems
309
k–within–(r, s)–out–of–(m, n): F lattice systems [27]; consecutively connected systems with multi-state components [27]; generalized multi-state k–out–of–n: G systems [16]; combined k–out–of–n: F , consecutive–k–out–of–n: F and linear connected– (r, s)–out–of–(m, n): F system structures [39].
A number of related, more realistic systems have also been reported [1, 9, 30]. Linear consecutive–k–out–of–n: F systems have been used to model vacuum systems in accelerators [18], computer ring networks [17], systems from the field of integrated circuits [4], belt conveyors in open-cast mining [27] and the exploration of distant stars by spacecraft [10]. Applications of generalized consecutive systems include medical diagnosis [31], pattern detection [31], evaluation of furnace systems in the petro-chemical industry [39] and a shovel-truck system in an open mine [16].
16.1.3 Studies of consecutive–k–out–of–n systems Studies of the optimal designs of consecutive–k–out–of–n systems have resulted in establishing two types of optimal designs: invariant and variant designs. The optimality of invariant optimal designs is independent of the numerical values of components’ reliabilities and subject only to the ordering of the numerical values of component reliabilities. Conversely, the optimality of variant optimal designs is contingent on those numerical values. Malon [26] has noticed that, in practice, it may be sufficient to know the ages of the components to be able to order them according to their reliabilities. This has an important implication when an optimal design of the system is invariant, that is, independent of the component reliabilities. For such optimal designs, one does not need to know the exact component reliabilities to be able to order components in an optimal way. A linear consecutive–k–out–of– n:F system has an invariant optimal design only for k ∈ {1, 2, n − 2, n − 1, n} [26]. The invariant optimal design for linear consecutive–2–out–of–n:F systems has been given by Derman, Lieberman and Ross [12] and proven by Malon [25] and Du and Hwang [13]. The invariant optimal designs for linear consecutive–k–out–of–n:F systems with k ∈ {n − 2, n − 1} have been established by Malon [26]. A linear consecutive–k–out–of–n:G system has an invariant optimal design only for k ∈ {1, n − 2, n − 1, n} and for n/2 ≤ k < n − 2 [40]. The invariant optimal design for linear consecutive–k–out–of–n:G systems with n/2 ≤ k ≤ n−1 has been given by Kuo et al. [24]. Zuo and Kuo [40] have summarized the complete results on the invariant optimal designs of consecutive–k–out–of–n systems. Table 16.1 lists all invariant optimal designs of linear consecutive– k–out–of–n systems and has been reproduced from [40]. The assumed order
310
M. O’Reilly
Table 16.1 Invariant optimal designs of linear consecutive–k–out–of–n systems k
F System
G System
k=1 k=2
ω −
2 < k < n/2 n/2 ≤ k < n − 2
ω (1, n, 3, n − 2, . . . , n − 3, 4, n − 1, 2) − −
k =n−2
(1, 4, ω, 3, 2)
k =n−1
(1, ω, 2)
k=n
ω
− (1, 3, . . . , 2(n − k) − 1, ω, 2(n − k), . . . , 4, 2) (1, 3, . . . , 2(n − k) − 1, ω, 2(n − k), . . . , 4, 2) (1, 3, . . . , 2(n − k) − 1, ω, 2(n − k), . . . , 4, 2) ω
of component reliabilities is p1 < p2 < . . . < pn . The symbol ω represents any possible arrangement. In all cases where an invariant optimal design is not listed, only variant optimal designs exist. Linear consecutive–k–out–of–n systems have variant optimal designs for all F systems with 2 < k < n − 2 and all G systems with 2 ≤ k < n/2. For these systems, the information about the order of component reliabilities is not sufficient to find the optimal design. In fact, one needs to know the exact value of component reliabilities. This is because different sets of component reliabilities produce different optimal designs, so that for a given linear consecutive–k–out–of–n system there is more than one possible optimal design. Zuo and Kuo [40] have proposed methods for dealing with the variant optimal design problem which are based upon the following necessary conditions for optimal design, proved by Malon [26] for linear consecutive–k–out–of–n:F systems and extended by Kuo et al. [24] to linear consecutive–k–out–of–n:G systems: (i) components from positions 1 to min{k, n − k + 1} are arranged in nondecreasing order of component reliability; (ii) components from positions n to max{k, n − k + 1} are arranged in nondecreasing order of component reliability; (iii) the (2k − n) most reliable components are arranged from positions (n − k + 1) to k in any order if n < 2k. In the case when n ≥ 2k, a useful concept has been that of singularity, which has been also applied in invariant optimal designs [13]. A design X = (q1 , q2 , . . . , qn ) is singular if for symmetrical components qi and qn+1−i , 1 ≤ i ≤ [n/2], either qi > qn+1−i or qi < qn+1−i for all i; otherwise the design is nonsingular. According to Shen and Zuo [33] a necessary condition for the optimal design of a linear consecutive–k–out–of–n:G system with
16
Optimal design of linear consecutive–k–out–of–n systems
311
n ∈ {2k, 2k + 1} is for it to be singular. In [28] we have shown that a necessary condition for the optimal design of linear consecutive–k–out–of–n:F systems with 2k ≤ n ≤ (2k + 1) is for it to be nonsingular. Procedures to improve designs not satisfying necessary conditions for the optimal design of linear consecutive–k–out–of–n:F and linear consecutive–k–out–of–n:G were also given. The significance of these results was illustrated by an example showing that designs satisfying these necessary conditions can be better than designs satisfying other known necessary conditions.
16.1.4 Summary of the results In this chapter we treat the case 2k + 2 ≤ n ≤ 3k and explore whether the results of Shen and Zuo [33] and O’Reilly [28] can be extended to this case. The proofs included here are more complicated and the produced results do not exactly mirror those when 2k ≤ n ≤ 2k + 1. We find that, although the necessary conditions for the optimal design of linear consecutive–k–out–of–n:F systems in the cases 2k ≤ n ≤ 2k +1 and 2k +2 ≤ n ≤ 3k are similar, the procedures to improve designs not satisfying this necessary condition differ in the choice of interchanged components. Furthermore, the necessary conditions for linear consecutive–k–out–of–n:G systems in these two cases are significantly different. In the case when 2k + 2 ≤ n ≤ 3k, the requirement for the optimal design of a linear consecutive–k–out–of–n:G system to be singular holds only under certain limitations. Examples of nonsingular and singular optimal designs are given. The theorems are built on three subsidiary propositions, which are given in Sections 16.2 and 16.4. Proposition 16.4.1 itself requires some supporting lemmas which are the substance of Section 16.3. The main results for this case are presented in Section 16.5. The ideas are related to those in the existing literature, though the detail is somewhat complicated. The arguments are constructive and based on the following. Suppose X ≡ (q1 , . . . , q2k+m ) is a design and {qi1 , . . . , qir } is an arbitrary • proper subset of {q1 , . . . , qk } when m ≤ 1, or • nonempty subset of {qm , . . . , qk } when m > 1. ∗ ) the design obtained from X by interWe denote by X ∗ ≡ (q1∗ , . . . , q2k+m changing symmetrical components qij and q2k+m+1−ij for all 1 ≤ j ≤ r. We show that a number of inequalities exist between quantities defined from X and the corresponding quantities defined for a generic X ∗ . We use the notation X ∗ in this way throughout this chapter without further comment. Theorem 16.5.1 of Section 16.5 rules out only one type of design of consecutive–k–out–of–(2k + m): F systems: singular designs. However, we emphasize that the results for consecutive–k–out–of–(2k + m): G systems in Theorem 16.5.2 and Corollary 16.5.2, obtained by symmetry from Theorem 16.5.1,
312
M. O’Reilly
significantly reduce the number of designs to be considered in algorithms searching for an optimal design when m is small. For example, for m = 2, when we apply the necessary condition stated in Corollary 16.5.2, the number of designs to be considered reduces from (2k − 2)! to (2k − 2)!/2k if (q1 , qk+1 , qk+2 , q2k+2 ) is singular (which occurs with probability 0.5 when a design is chosen randomly). We note that, except for the necessary conditions mentioned in Section 16.1.3, little is known about the variant optimal designs. We establish more necessary conditions for the variant optimal design in [29], also appearing in this volume, which is an important step forward in studying this difficult problem.
16.2 Propositions for R and M Throughout this chapter we adopt the convention qs ≡ 1, ∅
a + b − ab ≡ (a ⊕ b), and make use of the following definitions. Definition 1. Let X ≡ (q1 , . . . , q2k+m ), 2 ≤ m ≤ k, l ≤ k and {qi1 , . . . , qir } be an arbitrary nonempty subset of {qm , . . . , qk }. We define A(X) ≡
qs ,
s∈{i1 ,...,ir }
A (X) ≡
q2k+m+1−s ,
s∈{i1 ,...,ir }
Bl (X) ≡
qs and
s∈{l,...,k}\{i1 ,...,ir }
Bl (X) ≡
q2k+m+1−s ,
s∈{l,...,k}\{i1 ,...,ir }
with similar definitions for X ∗ (obtained by replacing X with X ∗ and q with q ∗ ).
Note that Bl (X) = Bl (X ∗ ), Bl (X) = Bl (X ∗ ), A (X) = A(X ∗ ) and A (X ∗ ) = A(X).
Definition 2. Let X ≡ (q1 , . . . , q2k+m ), 2 ≤ m ≤ k. Thus we have either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T ≥ 0. We define
16
Optimal design of linear consecutive–k–out–of–n systems
313
W0 (X) ≡ F¯k2k+m (X), t
Wt (X) ≡
> ?@ A
F¯k2k+m (q1 , . . . , qk , 1, . . . , 1, qk+t+1 , . . . , qk+m−t , t
> ?@ A 1, . . . , 1, qk+m+1 , . . . , q2k+m ) for
1 ≤ t ≤ T, m > 2 and m
WT +1 (X) ≡
> ?@ A
F¯k2k+m (q1 , . . . , qk , 1, . . . , 1, qk+m+1 , . . . , q2k+m ).
Definition 3. Let X ≡ (q1 , . . . , q2k+m ), 2 ≤ m ≤ k with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T ≥ 0. We define k 2k+m−t qs ⊕ qs Mt (X) ≡ s=t+1
for
s=k+m+1
0 ≤ t ≤ T.
Definition 4. Let X ≡ (q1 , . . . , q2k+m ), 2 ≤ m ≤ k. If m = 2T + 2 for some T ≥ 0, we define k 2k+m−T −1 qs ⊕ qs RT (X) ≡ pk+T +1 qk+m−T s=T +1
+ qk+T +1 pk+m−T
2k+m−T
s=k+m+1
qs ⊕
s=k+m+1 k
qs
.
s=T +2
If m > 2 with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T > 0, then for 0 ≤ t ≤ T − 1 we define k qs ⊕ Rt (X) ≡ pk+t+1 qk+m−t s=t+1
m+k−3t−3 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 ) F¯k−t−1
+ qk+t+1 pk+m−t
2k+m−t
qs ⊕
s=k+m+1
m+k−3t−3 ¯ (qt+2 , . . . , qk , qk+t+2 , . . . , qk+m−t−1 ) . Fk−t−1 It will be convenient to make the following abbreviations: A = A(X), A∗ = A(X ∗ ), Bl = Bl (X), Bl = Bl (X), Wt (X) = Wt , Wt (X ∗ ) = Wt∗ , Mt (X) = Mt , Mt (X ∗ ) = Mt∗ , Rt (X) = Rt and Rt (X ∗ ) = Rt∗ .
314
M. O’Reilly
Propositions 16.2.1 and 16.2.2 below contain results for Mt and RT , which are later used to prove a result for W0 in Theorem 16.5.1 of Section 16.5. In the proofs of Propositions 16.2.1, 16.2.2 and 16.4.1 and Theorem 16.5.1 we assume q1 > q2k+m . Note that, by the symmetry of the formulas, reversing the order of components in X and X ∗ would not change the values of Wt , Mt and Rt for X and X ∗ . Therefore the assumption q1 > q2k+m can be made without loss of generality. Proposition 16.2.1 Let X ≡ (q1 , . . . , q2k+m ) be singular, 2 ≤ m ≤ k, with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T ≥ 0. Then Mt > Mt∗
for
0 ≤ t ≤ T,
for any X ∗ . Proof. Without loss of generality we can assume q1 > q2k+m . Note that m ≥ 2 and so t + 1 < m for all 0 ≤ t ≤ T . We have Mt = ABt+1 ⊕ A∗ Bt+1 , Mt∗ = A∗ Bt+1 ⊕ ABt+1 ,
Mt − Mt∗ = (A − A∗ )(Bt+1 − Bt+1 ),
where A − A∗ > 0 and Bt+1 − Bt+1 > 0 by the singularity of X, and so Mt − Mt∗ > 0,
proving the proposition.
Proposition 16.2.2 Let X ≡ (q1 , . . . , q2k+m ) be singular, 2 ≤ m ≤ k, with m = 2T + 2 for some T ≥ 0. Then RT > RT∗ for any X ∗ . Proof. Without loss of generality we can assume q1 > q2k+m . Note that m ≥ 2 and so T + 2 ≤ m. We have RT = pk+T +1 qk+m−T qT +1 ABT +2 ⊕ A∗ BT +2 + qk+T +1 pk+m−T q2k+m−T A∗ BT +2 ⊕ ABT +2 , RT∗ = pk+T +1 qk+m−T qT +1 A∗ BT +2 ⊕ ABT +2 + qk+T +1 pk+m−T q2k+m−T ABT +2 ⊕ A∗ BT +2
16
Optimal design of linear consecutive–k–out–of–n systems
315
and
RT − RT∗ = qk+T +1 pk+m−T (BT +2 − q2k+m−T BT +2 )(A − A∗ )
− qk+m−T pk+T +1 (BT +2 − qT +1 BT +2 )(A − A∗ ), where by the singularity of X
BT +2 − q2k+m−T BT +2 > 0, A − A∗ > 0, qk+T +1 pk+m−T > qk+m−T pk+T +1 ,
BT +2 − q2k+m−T BT +2 > BT +2 − qT +1 BT +2 , and so
RT − RT∗ > 0,
proving the proposition.
16.3 Preliminaries to the main proposition Lemmas 16.3.1–16.3.3 below contain some preliminary results which are used in the proof of a result for Rt in Proposition 16.4.1 of Section 16.4. Lemma 16.3.1 Let X ≡ (q1 , . . . , q2k+m ) be singular, 2 < m ≤ k, with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T > 0. Then for any X ∗ and all 0 ≤ t ≤ T − 1 we have k s=t+1
qs = ABm
m−1
qs ,
(16.1)
s=t+1
m+k−3t−3 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 ) F¯k−t−1 2(m−2t−2) = F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ) · 2k+t+1
qs
s=k+m+1 2(m−2t−2) = F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ) ·
A∗ Bm
2k+t+1 s=2k+2
qs ,
(16.2)
316
M. O’Reilly k
m+k−3t−3 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 ) F¯k−t−1
qs
s=t+1 2(m−2t−2) = F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ) · m−1 2k+t+1 ∗ qs qs , (16.3) ABm A Bm s=t+1 k
s=2k+2
qs∗
s=t+1
=
m−1
qs
A∗ Bm ,
(16.4)
s=t+1
2(m−2t−2) ∗ ∗ ∗ ∗ , . . . , qk+m−t−1 , qk+m−1 , . . . , q2k+m−t−1 ) F¯m−2t−2 (qk+t+2 2(m−2t−2)
= F¯m−2t−2
ABm
(qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ) ·
2k+t+1
qs ,
(16.5)
s=2k+2 k
m+k−3t−3 ∗ ∗ ∗ ∗ F¯k−t−1 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 )
qs∗
s=t+1
=
2(m−2t−2) F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 )
ABm A∗ Bm
m−1
s=t+1
qs
2k+t+1
·
qs
(16.6)
s=2k+2
and m+k−3t−3 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 ) F¯k−t−1
k
qs
s=t+1 m+k−3t−3 ∗ ∗ ∗ ∗ = F¯k−t−1 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 )
k
qs∗ .
s=t+1
(16.7) Proof. Without loss of generality we can assume q1 > q2k+m . We have m > 2 and so t + 1 < m for all 0 ≤ t ≤ T − 1. Therefore (16.1) is satisfied. Also, consider that in m+k−3t−3 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 ) F¯k−t−1
we have 2(k − t − 1) > m + k − 3t − 3. Therefore every event in which k − t − 1 consecutive components fail will include failure of the components qk+m+1 , . . . , q2k+t+1 . Hence (16.2) follows. From (16.1) and (16.2) we have (16.3).
16
Optimal design of linear consecutive–k–out–of–n systems
317
In a similar manner we show (16.4)–(16.6). From (16.3) and (16.6), we have (16.7) and the lemma follows.
Lemma 16.3.2 Let X ≡ (q1 , . . . , q2k+m ) be singular, 2 < m ≤ k, with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T > 0. Then for any X ∗ and all 0 ≤ t ≤ T − 1 we have 2k+m−t
qs = A∗ Bm
s=k+m+1
2k+m−t
qs ,
(16.8)
s=2k+2
m+k−3t−3 F¯k−t−1 (qt+2 , . . . , qk , qk+t+2 , . . . , qk+m−t−1 ) k
2(m−2t−2) = F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 )
qs
s=m−t 2(m−2t−2) = F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 )ABm
m−1
qs ,
s=m−t
(16.9)
m+k−3t−3 F¯k−t−1 (qt+2 , . . . , qk , qk+t+2 , . . . , qk+m−t−1 )
2k+m−t
qs
s=k+m+1
(qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 )ABm A∗ Bm · m−1 qs qs , (16.10)
2(m−2t−2)
= F¯m−2t−2 2k+m−t s=2k+2
s=m−t 2k+m−t
qs∗ = ABm
s=k+m+1
2k+m−t
qs ,
(16.11)
s=2k+2
m+k−3t−3 ∗ ∗ ∗ F¯k−t−1 (qt+2 , . . . , qk∗ , qk+t+2 , . . . , qk+m−t−1 )
=
2(m−2t−2) F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 )A∗ Bm
m−1
qs ,
s=m−t
(16.12) m+k−3t−3 ∗ ∗ ∗ F¯k−t−1 (qt+2 , . . . , qk∗ , qk+t+2 , . . . , qk+m−t−1 )
2k+m−t
qs∗
s=k+m+1
=
2(m−2t−2) F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 )ABm A∗ Bm
2k+m−t s=2k+2
qs
m−1 s=m−t
qs
·
(16.13)
318
M. O’Reilly
and m+k−3t−3 (qt+2 , . . . , qk , qk+t+2 , . . . , qk+m−t−1 ) F¯k−t−1
2k+m−t
qs
s=k+m+1 m+k−3t−3 ∗ ∗ ∗ = F¯k−t−1 (qt+2 , . . . , qk∗ , qk+t+2 , . . . , qk+m−t−1 )
2k+m−t
qs∗ .
s=k+m+1
(16.14) Proof. Note that Lemma 16.3.1 is also true for designs reversed to X and ∗ , . . . , q1∗ ). Therefore X ∗ , that is, for Xr = (q2k+m , . . . , q1 ) and Xr∗ = (q2k+m all equalities of Lemma 16.3.2 are satisfied. Lemma 16.3.3 Let Y ≡ (q2k , . . . , q1 ), k ≥ 2, let (qk , . . . , q1 ) be singular with q1 < qk , and let Y ≡ (q2k , . . . , qk+1 , q1 , . . . , qk ). Then F¯k2k (Y ) > F¯k2k (Y ).
Proof. For 1 ≤ i ≤ [k/2], let Y i be obtained from Y by interchanging components i and k + 1 − i. It is sufficient to show F¯k2k (Y ) > F¯k2k (Y i ), that is, that the system Y is improved by interchanging the components at positions i and k+1−i. Note that by the singularity of Y we have qi < qk+1−i . Malon in [26] (for F systems) and Kuo et al. in [24] (for G systems) have shown that if in a linear consecutive–k–out–of–n system we have pi > pj (or equivalently qi < qj ) for some 1 ≤ i ≤ j ≤ min{k, n − k + 1}, then the system is improved by interchanging components pi and pj (equivalently qi and qj ). Applying this result to systems Y and Y i , we have F¯k2k (Y ) > F¯k2k (Y i ), proving the lemma.
16.4 The main proposition Proposition 16.4.1 below contains a result for Rt which is later used in the proof of a result for W0 in Theorem 16.5.1 of Section 16.5. Proposition 16.4.1 Let X ≡ (q1 , . . . , q2k+m ) be singular, 2 < m ≤ k, with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T > 0. Then Rt > Rt∗
for
0≤t≤T −1
16
Optimal design of linear consecutive–k–out–of–n systems
319
for any X ∗ . Proof. Without loss of generality we can assume q1 > q2k+m . Define
k
˜ t (X) ≡ pk+t+1 qk+m−t R
qs
s=t+1
+
m+k−3t−3 F¯k−t−1 (qk+t+2 , . . . , qk+m−t−1 , qk+m+1 , . . . , q2k+m−t−1 )
+ qk+t+1 pk+m−t
2k+m−t
qs
s=k+m+1
m+k−3t−3 ¯ + Fk−t−1 (qt+2 , . . . , qk , qk+t+2 , . . . , qk+m−t−1 ) , ˜ t (X ∗ ). Put R ˜ t (X) = R ˜ t and R ˜ t (X ∗ ) = R ˜∗. with a similar formula for R t ∗ Note that in the formulas for Rt and Rt the following equalities are satisfied: ∗ ∗ , qk+m−t = qk+m−t , (16.7) of Lemma 16.3.1 and (16.14) of qk+t+1 = qk+t+1 ˜t > R ˜ t∗ . Lemma 16.3.2. Hence to prove that Rt > Rt∗ , it is sufficient to show R Define T ≡
m−1
qs ,
T ≡
s=t+1
2k+m−t
qs ,
s=2k+2
2(m−2t−2) U1 ≡ F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ) and 2(m−2t−2)
U2 ≡ F¯m−2t−2
(qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 ).
Applying results (16.1), (16.2), (16.4) and (16.5) of Lemma 16.3.1, and (16.8), (16.9), (16.11) and (16.13) of Lemma 16.3.2, we have 2k+t+1 ∗ ˜ qs Rt = pk+t+1 qk+m−t T ABm + A B U1 m
s=2k+2
A∗ Bm T + ABm U2
+ qk+t+1 pk+m−t
,
qs
s=m−t
˜ t∗ = pk+t+1 qk+m−t R
m−1
T A∗ Bm + ABm U1
qs
s=2k+2
+ qk+t+1 pk+m−t
2k+t+1
ABm T + A∗ Bm U2
m−1 s=m−t
qs
,
320
M. O’Reilly
and so ˜t − R
˜∗ R t
Bm U2
= qk+t+1 pk+m−t
2k+t+1
Bm U1
qs − T Bm
s=m−t
− qk+m−t pk+t+1
m−1
(A − A∗ )
qs − T Bm
(A − A∗ ).
(16.15)
s=2k+2
Note that by the singularity of X qk+t+1 pk+m−t > qk+m−t pk+t+1 , A − A∗ > 0,
(16.16) (16.17)
Bm ≥ Bm ,
(16.18)
T > T and m−1
qs ≥
s=m−t
2k+t+2
U2
qs ,
(16.20)
s=2k+2
where the equalities include the assumption m−1
(16.19)
m−1
qs >
s=m−t
qs
m−1
≡ 1 , and
m−t−1
s=m−t
=
∅ qs
qs
s=t+2
qs
s=t+2
>
2k+m−t−1
qs
s=2k+2
>T .
(16.21)
From (16.18) and (16.21) it follows that Bm U2
m−1
qs − T Bm > 0.
s=m−t
Next, by Lemma 16.3.3 2(m−2t−2) F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 ) 2(m−2t−2)
> F¯m−2t−2
(qt+2 , . . . , qm−t−1 , qk+m−t−1 , . . . , qk+t+2 ),
and since qt+2 > q2k+m−t−1 , . . . , qm−t−1 > q2k+t+2 ,
(16.22)
16
Optimal design of linear consecutive–k–out–of–n systems
321
we have 2(m−2t−2) F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+m−t−1 , . . . , qk+t+2 ) 2(m−2t−2)
> F¯m−2t−2
(q2k+m−t−1 , . . . , q2k+t+2 , qk+m−t−1 , . . . , qk+t+2 ). (16.23)
From (16.22) and (16.23) it follows that 2(m−2t−2) F¯m−2t−2 (qt+2 , . . . , qm−t−1 , qk+t+2 , . . . , qk+m−t−1 ) 2(m−2t−2) > F¯m−2t−2 (qk+t+2 , . . . , qk+m−t−1 , q2k+t+2 , . . . , q2k+m−t−1 ),
that is, U2 > U1 . From (16.18)–(16.20) and (16.24) we have m−1
Bm U2
qs − T Bm > Bm U1
s=m−t
2k+t+1
qs − T Bm .
s=2k+2
Considering (16.15), we conclude by (16.16), (16.17), (16.22) and (16.24) that ˜ t∗ , ˜t > R R
proving the proposition.
16.5 Theorems

Theorem 16.5.1 below states that if X is a singular design of a linear consecutive–k–out–of–(2k + m):F system with 2 ≤ m ≤ k, then for any nonsingular design X^* obtained from X by interchanging symmetrical components (as defined in Section 16.1.4), X^* is a better design.

Theorem 16.5.1 Let X ≡ (q_1, ..., q_{2k+m}) be singular, 2 ≤ m ≤ k, with either m = 2T + 1 for some T > 0 or m = 2T + 2 for some T ≥ 0. Then X^* is nonsingular and F̄_k^{2k+m}(X) > F̄_k^{2k+m}(X^*) for any X^*.

Proof. Clearly, X^* is nonsingular. Without loss of generality we can assume q_1 > q_{2k+m}. Proceeding by induction, we shall prove that W_0 > W_0^*.

STEP 1. For 2 ≤ m = k we have W_{T+1} = W_{T+1}^* = 1. For 2 ≤ m < k we have

W_{T+1} = F̄_{k−m}^{2(k−m)}(q_{m+1}, ..., q_k, q_{k+m+1}, ..., q_{2k}),

with a similar formula for W_{T+1}^*. By Theorem 16.5.1 (see also O'Reilly [28]) it follows that W_{T+1} ≥ W_{T+1}^*, with equality if and only if either {q_{m+1}, ..., q_k} is a subset of {q_{i_1}, ..., q_{i_r}} or the intersection of those sets is empty. Either way we have W_{T+1} ≥ W_{T+1}^*.

STEP 2. Note that if m = 2T + 1, then

W_T = p_{k+T+1} M_T + q_{k+T+1} W_{T+1},

with a similar formula for W_T^*, where q_{k+T+1} = q_{k+T+1}^*. We have M_T > M_T^* by Proposition 16.2.1 and W_{T+1} ≥ W_{T+1}^* by Step 1, and so it follows that W_T > W_T^*. If m = 2T + 2, then

W_T = p_{k+T+1} p_{k+m−T} M_T + q_{k+T+1} q_{k+m−T} W_{T+1} + R_T,

with a similar formula for W_T^*, where q_{k+T+1} = q_{k+T+1}^*, q_{k+m−T} = q_{k+m−T}^*, M_T > M_T^* by Proposition 16.2.1, W_{T+1} ≥ W_{T+1}^* by Step 1 and R_T > R_T^* by Proposition 16.2.2. Hence W_T > W_T^*. Either way we have W_T > W_T^*. If m = 2, then T = 0 and so W_0 > W_0^*, completing the proof for m = 2. Consider m > 2.

STEP 3. Suppose that W_{t+1} > W_{t+1}^* for some 0 ≤ t ≤ T − 1. We shall show that then W_t > W_t^*. We have

W_t = p_{k+t+1} p_{k+m−t} M_t + q_{k+t+1} q_{k+m−t} W_{t+1} + R_t,

with a similar formula for W_t^*, where q_{k+t+1} = q_{k+t+1}^*, q_{k+m−t} = q_{k+m−t}^*, M_t > M_t^* by Proposition 16.2.1, W_{t+1} > W_{t+1}^* by the inductive assumption and R_t > R_t^* by Proposition 16.4.1. It follows that W_t > W_t^*.

From Steps 2–3 and mathematical induction we have W_0 > W_0^*, proving the theorem.

The following corollary is a direct consequence of Theorem 16.5.1.
Corollary 16.5.1 A necessary condition for the optimal design of a linear consecutive–k–out–of–(2k + m):F system with 2 ≤ m ≤ k is for it to be nonsingular.

Theorem 16.5.2 below states that if X is a singular design of a linear consecutive–k–out–of–(2k + m):G system with q_1 > q_{2k+m}, 2 ≤ m ≤ k, then for any nonsingular design X^* obtained from X by interchanging symmetrical components (as defined in Section 16.1.4), X is a better design.

Theorem 16.5.2 Let X ≡ (q_1, ..., q_{2k+m}) be singular, 2 ≤ m ≤ k. Then X^* is nonsingular and G_k^{2k+m}(X) > G_k^{2k+m}(X^*) for any X^*.

Proof. This theorem for G systems can be proved in a manner similar to the proof of Theorem 16.5.1 for F systems. That is, by giving similar definitions for G systems, similar proofs for lemmas and propositions for G systems can be given. Alternatively, the theorem can be proved by applying only Theorem 16.5.1, as below. Clearly, X^* is nonsingular. Define p̄_i ≡ q_i for all 1 ≤ i ≤ 2k + m. Then we have

G_k^{2k+m}(X) = F̄_k^{2k+m}(p̄_1, ..., p̄_{2k+m}),
G_k^{2k+m}(X^*) = F̄_k^{2k+m}(p̄_1^*, ..., p̄_{2k+m}^*),

where (p̄_1, ..., p̄_{2k+m}) is singular, and so by Theorem 16.5.1

F̄_k^{2k+m}(p̄_1, ..., p̄_{2k+m}) > F̄_k^{2k+m}(p̄_1^*, ..., p̄_{2k+m}^*),

proving the theorem.

Corollary 16.5.2 Let Y = (q_1, ..., q_{2k+m}) be the optimal design of a linear consecutive–k–out–of–(2k + m):G system with 2 ≤ m ≤ k. If (q_1, ..., q_{m−1}, q_{k+1}, ..., q_{k+m}, q_{2k+2}, ..., q_{2k+m}) is singular, then Y must be singular too.

Proof. Suppose Y is not singular. Let Z be a singular design obtained from Y by an operation in which we allow the interchange of only those symmetrical components which are in places m, ..., k, k + m + 1, ..., 2k + 1. Then Z and Y satisfy the conditions of Theorem 16.5.2, and so G_k^{2k+m}(Z) > G_k^{2k+m}(Y). Hence Y is not optimal, and by contradiction the corollary is proved.
16.6 Procedures to improve designs not satisfying necessary conditions for the optimal design

We have shown that a necessary condition for the optimal design of a linear consecutive–k–out–of–n:F system with 2k + 2 ≤ n ≤ 3k is for it to be nonsingular (Corollary 16.5.1 of Section 16.5), which is similar to the case 2k ≤ n ≤ 2k + 1 treated in [28]. However, the procedures given in [28] cannot be implemented in this case. This is due to the restriction placed on the choice of interchanged symmetrical components ((3m − 2) components excluded from the interchange). The following procedure is a consequence of Theorem 16.5.1.

Procedure 16.6.1 In order to improve a singular design of a linear consecutive–k–out–of–(2k + m):F system with 2 ≤ m ≤ k, apply the following steps:
1. select an arbitrary nonempty set of pairs of symmetrical components so that the first component in each pair is in a position from m to k; and then
2. interchange the two components in each selected pair.

Note that the number of possible choices in Step 1 is 2^(k−m+1) − 1. Consequently, the best improvement can be chosen or, if the number of possible choices is too large to consider all options, the procedure can be repeated as required.

Because the result for systems with 2k + 2 ≤ n ≤ 3k excludes some components, it is not possible to derive from it, unlike the case when 2k ≤ n ≤ 2k + 1, that it is necessary for the optimal design of a linear consecutive–k–out–of–n:G system to be singular. However, as stated in Corollary 16.5.2 of Section 16.5, if a subsystem composed of those excluded components is singular, then the whole system has to be singular for it to be optimal. Consequently, the following procedure can be applied. Note that, for a given nonsingular design, the number of possible singular designs produced in this manner is 1.

Procedure 16.6.2 Suppose a design of a linear consecutive–k–out–of–(2k + m):G system is nonsingular, with 2 ≤ m ≤ k. Consider its subsystem composed of components in positions from 1 to (m − 1), from (k + 1) to (k + m), and from (2k + 2) to (2k + m), in order as in the design. If such a subsystem is singular, then in order to improve the design, interchange all required symmetrical components so that the design becomes singular.

The following examples, calculated using a program written in C++, are given in order to illustrate the fact that both nonsingular and singular optimal designs of linear consecutive–k–out–of–n:G systems exist.

Example 1. (q_1, q_5, q_7, q_9, q_8, q_6, q_4, q_3, q_2) is a nonsingular optimal design of a linear consecutive–3–out–of–9:G system. It is optimal for q_1 = 0.151860, q_2 = 0.212439, q_3 = 0.304657, q_4 = 0.337662, q_5 = 0.387477, q_6 = 0.600855, q_7 = 0.608716, q_8 = 0.643610 and q_9 = 0.885895.
Example 2. (q1 , q3 , q4 , q5 , q7 , q9 , q8 , q6 , q2 ) is a singular optimal design of a linear consecutive–3–out–of–9:G system. It is optimal for q1 = 0.0155828, q2 = 0.1593690, q3 = 0.3186930, q4 = 0.3533360, q5 = 0.3964650, q6 = 0.4465830, q7 = 0.5840900, q8 = 0.8404850 and q9 = 0.8864280.
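The designs in Examples 1 and 2 were obtained with the author's C++ program, which is not reproduced here. The comparison they rest on can nevertheless be spot-checked with a short state-enumeration routine; the following Python sketch is an illustration only (the function name and layout are mine, not the author's). It evaluates the reliability of a consecutive-k-out-of-n:G design and compares the design of Example 1 with the natural ordering (q_1, ..., q_9); if the stated design is indeed optimal, the first printed value is the larger of the two.

```python
from itertools import product

def g_reliability(q, k):
    """Working probability of a linear consecutive-k-out-of-n:G system:
    the system works when at least k consecutive components work, and
    component i works with probability 1 - q[i].  Plain enumeration of
    the 2^n component states (fine for n = 9)."""
    rel = 0.0
    for state in product((0, 1), repeat=len(q)):   # 1 = component works
        run = best = 0
        for s in state:
            run = run + 1 if s else 0
            best = max(best, run)
        if best >= k:
            prob = 1.0
            for s, qi in zip(state, q):
                prob *= (1.0 - qi) if s else qi
            rel += prob
    return rel

# Example 1: a 3-out-of-9:G system with the component unreliabilities listed
# there, arranged as (q1, q5, q7, q9, q8, q6, q4, q3, q2).
qv = {1: 0.151860, 2: 0.212439, 3: 0.304657, 4: 0.337662, 5: 0.387477,
      6: 0.600855, 7: 0.608716, 8: 0.643610, 9: 0.885895}
stated = [qv[i] for i in (1, 5, 7, 9, 8, 6, 4, 3, 2)]
natural = [qv[i] for i in range(1, 10)]
print(g_reliability(stated, 3), g_reliability(natural, 3))
```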
References 1. D. S. Bai, W. Y. Yun and S. W. Chung, Redundancy optimization of k–out–of–n systems with common–cause failures, IEEE Trans. Reliability 40(1) (1991), 56–59. 2. A. Behr and L. Camarinopoulos, Two formulas for computing the reliability of incomplete k–out–of–n:G systems, IEEE Trans. Reliability 46(3) (1997), 421–429. 3. T. Boehme, A. Kossow and W. Preuss, A generalization of consecutive–k–out–of–n:F systems, IEEE Trans. Reliability 41(3) (1992), 451–457. 4. R. C. Bollinger, A. A. Salvia, Consecutive–k–out–of–n:F networks, IEEE Trans. Reliability 31(1) (1982), 53–56. 5. G. J. Chang, L. Cui and F. K. Hwang, Reliabilities of Consecutive–k Systems, (Kluwer, Amsterdam, 2000). 6. M. T. Chao and J. C. Fu, A limit theorem of certain repairable systems, Ann. Inst. Statist. Math. 41 (1989), 809–818. 7. M. T. Chao and J. C. Fu, The reliability of large series systems under Markov structure, Adv. Applied Probability 23(4) (1991), 894–908. 8. M. T. Chao, J. C. Fu and M. V. Koutras, Survey of reliability studies of consecutive– k–out–of–n: F & related systems, IEEE Trans. Reliability 44(1) (1995), 120–127. 9. R. W. Chen, F. K. Hwang and Wen-Ching Winnie Li, Consecutive–2–out–of–n:F systems with node & link failures, IEEE Trans. Reliability 42(3) (1993), 497–502. 10. D. T. Chiang and R. F. Chiang, Relayed communication via consecutive–k–out–of–n:F system, IEEE Trans. Reliability 35(1) (1986), 65–67. 11. D. T. Chiang and S–C. Niu, Reliability of consecutive–k–out–of–n:F system, IEEE Trans. Reliability 30(1) (1981), 87–89. 12. C. Derman, G. J. Lieberman and S. M. Ross, On the consecutive–k–of–n:F system, IEEE Trans. Reliability 31(1) (1982), 57–63. 13. D. Z. Du and F. K. Hwang, Optimal consecutive–2–out–of–n systems, Mat. Oper. Research 11(1) (1986), 187–191. 14. J. C. Fu, Reliability of consecutive–k–out–of–n:F systems with (k − 1)–step Markov dependence, IEEE Trans. Reliability 35(5) (1986), 602–606. 15. W. S. Griffith, On consecutive–k–out–of–n failure systems and their generalizations, Reliability & Quality Control, Columbia, Mo. (1984), (North–Holland, Amsterdam, 1986), 157–165. 16. J. Huang and M. J. Zuo, Generalised multi–state k–out–of–n:G systems, IEEE Trans. Reliability 49(1) (2000), 105–111. 17. F. K. Hwang, Simplified reliabilities for consecutive–k–out–of–n systems, SIAM J. Alg. Disc. Meth. 7(2) (1986), 258–264. 18. S. C. Kao, Computing reliability from warranty, Proc. Amer. Statistical Assoc., Section on Statistical Computing (1982), 309–312. 19. K. C. Kapur and L. R. Lamberson, Reliability in Engineering Design (John Wiley & Sons, New York, 1977). 20. J. M. Kontoleon, Optimum allocation of components in a special 2–port network, IEEE Trans. Reliability 27(2) (1978), 112–115. 21. J. M. Kontoleon, Analysis of a dynamic redundant system, IEEE Trans. Reliability 27(2) (1978), 116–119.
22. J. M. Kontoleon, Reliability improvement of multiterminal networks, IEEE Trans. Reliability 29(1) (1980), 75–76. 23. J. M. Kontoleon, Reliability determination of a r–successive–out–of–n:F system, IEEE Trans. Reliability 29(5) (1980), 437. 24. W. Kuo, W. Zhang and M. Zuo, A consecutive–k–out–of–n:G system: the mirror image of a consecutive–k–out–of–n:F system, IEEE Trans. Reliability 39(2) (1990), 244–253. 25. D. M. Malon, Optimal consecutive–2–out–of–n:F component sequencing, IEEE Trans. Reliability 33(5) (1984), 414–418. 26. D. M. Malon, Optimal consecutive–k–out–of–n:F component sequencing, IEEE Trans. Reliability 34(1) (1985), 46–49. 27. J. Malinowski and W. Preuss, On the reliability of generalized consecutive systems – a survey, Intern. Journ. Reliab., Quality & Safety Engineering 2(2) (1995), 187–201. 28. M. M. O’Reilly, Variant optimal designs of linear consecutive–k–out–of–n systems, to appear in Mathematical Sciences Series: Industrial Mathematics and Statistics, Ed. J. C. Misra (Narosa Publishing House, New Delhi, 2003), 496–502. 29. M. M. O’Reilly, The (k + 1)–th component of linear consecutive–k–out–of–n systems,” chapter in this volume. 30. S. Papastavridis and M. Lambiris, Reliability of a consecutive–k–out–of–n:F system for Markov–dependent components, IEEE Trans. Reliability 36(1) (1987), 78–79. 31. A. A. Salvia and W. C. Lasher, 2–dimensional consecutive–k–out–of–n:F models, IEEE Trans. Reliability 39(3) (1990), 382–385. 32. J. G. Shanthikumar, Reliability of systems with consecutive minimal cutsets, IEEE Trans. Reliability 36(5) (1987), 546–550. 33. J. Shen and M. Zuo, A necessary condition for optimal consecutive–k–out–of–n:G system design, Microelectron. Reliab. 34(3) (1994), 485–493. 34. J. Shen and M. J. Zuo, Optimal design of series consecutive–k–out–of–n:G systems, Rel. Engin. & System Safety 45 (1994), 277–283. 35. M. L. Shooman, Probabilistic Reliability: An Engineering Approach, (McGraw-Hill, New York, 1968). 36. Y. L. Tong, A rearrangement inequality for the longest run, with an application to network reliability, J. App. Prob. 22 (1985), 386–393. 37. Jer–Shyan Wu and Rong–Jaya Chen, Efficient algorithms for k–out–of–n & consecutive–weighed–k–out–of-n:F system, IEEE Trans. Reliability 43(4) (1994), 650–655. 38. W. Zhang, C. Miller and W. Kuo, Application and analysis for a consecutive–k–out– of–n:G structure, Rel. Engin. & System Safety 33 (1991), 189–197. 39. M. J. Zuo, Reliability evaluation of combined k–out–of–n:F , consecutive–k–out–of– n:F , and linear connected–(r, s)–out–of–(m, n):F system structures, IEEE Trans. Reliability 49(1) (2000), 99–104. 40. M. Zuo and W. Kuo, Design and performance analysis of consecutive–k–out–of–n structure, Nav. Research Logistics 37 (1990), 203–230.
Chapter 17
The (k + 1)-th component of linear consecutive–k–out–of–n systems Malgorzata O’Reilly
Abstract A linear consecutive–k–out–of–n:F system is an ordered sequence of n components that fails if and only if at least k consecutive components fail. A linear consecutive–k–out–of–n:G system is an ordered sequence of n components that works if and only if at least k consecutive components work. The existing necessary conditions for the optimal design of systems with 2k ≤ n provide comparisons between reliabilities of components restricted to positions from 1 to k and positions from n to (n − k + 1). This chapter establishes necessary conditions for the variant optimal design that involve components at some other positions, including component (k+1). Procedures to improve designs not satisfying those conditions are also given. Key words: Linear consecutive–k–out–of–n: F system, linear consecutive– k–out–of–n: G system, variant optimal design, singular design, nonsingular design
17.1 Introduction For the description of the mathematical model of the system discussed here, including nomenclature, assumptions and notation, the reader is referred to [9], also appearing in this volume. Zuo and Kuo [16] have proposed three methods for dealing with the variant optimal design problem: a heuristic method, a randomization method and a binary search method. The heuristic and randomization methods produce
Malgorzata O’Reilly School of Mathematics and Physics, University of Tasmania, Hobart TAS 7001, AUSTRALIA e-mail: [email protected]
suboptimal designs of consecutive–k–out–of–n systems; the binary search method produces an exact optimal design. The heuristic method [16] is based on the concept of Birnbaum importance, which was introduced by Birnbaum in [1]. The Birnbaum reliability importance Ii of component i is defined by the following formula: Ii ≡ R(p1 , . . . , pi−1 , 1, pi+1 , . . . , pn ) − R(p1 , . . . , pi−1 , 0, pi+1 , . . . , pn ), where R stands for the reliability of a system. The heuristic method [16] implements the idea that a component with a higher reliability should be placed in a position with a higher Birnbaum importance. Based on Birnbaum’s definition, Papastavridis [10] and Kuo, Zhang and Zuo [5] defined component reliability functions for consecutive–k–out– of–n systems. Zakaria, David and Kuo [13], Zuo [14] and Chang, Cui and Hwang [3] have established some comparisons of Birnbaum importance in consecutive–k–out–of—n systems. Zakaria et al. [13] noted that more reliable components should be placed in positions with higher importance in a reasonable heuristic for maximizing system reliability. Zuo and Shen [17] developed a heuristic method which performs better than the heuristic method of Zuo and Kuo [16]. The randomization method [16] compares a limited number of randomly chosen designs and obtains the best amongst them. The binary search method [16] has been applied only to linear consecutive–k–out–of–n:F systems with n/2 ≤ k ≤ n. Both methods are based upon the following necessary conditions for optimal design, proved by Malon [7] for linear consecutive–k– out–of–n:F systems and extended by Kuo et al. [5] to linear consecutive–k– out–of–n:G systems: (i) components from positions 1 to min{k, n − k + 1} are arranged in nondecreasing order of component reliability; (ii) components from positions n to max{k, n − k + 1} are arranged in nondecreasing order of component reliability; (iii) the (2k − n) most reliable components are arranged from positions (n − k + 1) to k in any order if n < 2k. Pairwise rearrangement of components in a system has been suggested as another method to enhance designs [2, 4, 6]. Other necessary conditions have also been reported in the literature. Shen and Zuo [12] proved that a necessary condition for the optimal design of a linear consecutive– k–out–of–n:G system with n ∈ {2k, 2k + 1} is for it to be singular and O’Reilly proved that a necessary condition for the optimal design of a linear consecutive–k–out–of–n:F system with n ∈ {2k, 2k + 1} is for it to be nonsingular [8]. Those results have been extended to the case 2k ≤ n ≤ 3k by O’Reilly in [9]. Procedures to improve designs not satisfying those necessary conditions have been also provided ([8], Procedures 1–2; [9], Procedures 1–2).
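For small systems the Birnbaum importance defined above can be evaluated directly by pinning component i to the working and failed states in any system reliability function R. The following Python sketch does this for a linear consecutive-k-out-of-n:F system; it is purely an illustration of the definition, and the function names are mine rather than those of the cited papers.

```python
from itertools import product

def reliability_F(p, k):
    """R of a linear consecutive-k-out-of-n:F system (component i works with
    probability p[i]), by enumerating component states; fine for small n."""
    rel = 0.0
    for state in product((0, 1), repeat=len(p)):      # 1 = working, 0 = failed
        run, failed = 0, False
        for s in state:
            run = 0 if s else run + 1
            if run >= k:
                failed = True
                break
        if not failed:
            w = 1.0
            for s, pi in zip(state, p):
                w *= pi if s else 1.0 - pi
            rel += w
    return rel

def birnbaum(p, k, i):
    """I_i = R(p_1,...,1,...,p_n) - R(p_1,...,0,...,p_n), component i 1-based."""
    up = p[:i - 1] + [1.0] + p[i:]
    down = p[:i - 1] + [0.0] + p[i:]
    return reliability_F(up, k) - reliability_F(down, k)

p = [0.9] * 7                                         # common reliability, n = 7, k = 3
print([round(birnbaum(p, 3, i), 4) for i in range(1, 8)])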
17.2 Summary of the results In this chapter we focus on variant optimal designs of linear consecutive–k– out–of–n systems and establish necessary conditions for the optimal designs of those systems. As an application of these results, we construct procedures to enhance designs not satisfying these necessary conditions. An improved design and its reverse, that is, a design with components in reverse order, are regarded in these procedures as equivalent. Although variant optimal designs depend upon the particular choices of component reliabilities, the necessary conditions for the optimal design of linear consecutive systems established here rely only on the order of component reliabilities and not their exact values. Therefore they can be applied in the process of eliminating nonoptimal designs from the set of potential optimal designs when it is possible to compare component reliabilities, without necessarily knowing their exact values. We explore the case n ≥ 2k. The case n < 2k for G systems has been solved by Kuo et al. [5] and the case n ≤ 2k for F systems can be limited to n = 2k due to the result of Malon [7]. We can summarize this as follows. Theorem 17.2.1 A design X = (q1 , q2 , . . . , qn ) is optimal for a linear consecutive–k–out–of–n:F system with n < 2k if and only if 1. the (2k − n) best components are placed from positions (n − k + 1) to k, in any order; and 2. the design (q1 , . . . , qn−k , qk+1 , . . . , qn ) is optimal for a linear consecutive-(n − k)-out-of-2(n − k):F system. The existing necessary conditions for the optimal design of systems with n ≥ 2k [5, 7] provide comparisons between the reliabilities of components restricted to the positions from 1 to k and the positions from n to (n − k + 1). In this chapter we develop necessary conditions for the optimal design of systems with n ≥ 2k with comparisons that involve components at some other positions, including the (k + 1)-th component. The following conditions are established as necessary for the design of a linear consecutive system to be optimal (stated in Corollaries 17.3.1, 17.4.1 and 17.5.1 respectively): • q1 > qk+1 and qn > qn−k for linear consecutive–k–out–of–n:F and consecutive–k–out–of–n:G systems with n > 2k, k ≥ 2; • min{q1 , q2k } > qk+1 > max{qk , qk+2 } for linear consecutive–k–out–of– n:F systems with n = 2k + 1, k > 2; • (q1 , qk+1 , qk+2 , q2k+2 ) is singular and (q1 , . . . , qk , qk+3 , . . . , q2k+2 ) nonsingular for linear consecutive–k–out–of–n:F systems with n = 2k + 2, k > 2. Further, procedures to improve designs not satisfying these conditions are given. Whereas the first of the conditions is general, the other two conditions
compare components in places other than only the k left-hand or k right-hand places for systems with n > 2k, unlike what has been considered in the literature so far.

Zuo [15] proved that for linear consecutive–k–out–of–n systems with components with common choices of reliability p and n > 2k, k > 2, we have I_1 < I_{k+1}, where I_i stands for Birnbaum reliability importance. Lemmas 17.3.1–17.3.2 and Corollary 17.3.1 give a stronger result, which also allows the component unreliabilities to be distinct.

Suppose X is a design of a linear consecutive system with n > 2k. Let i and j be the intermediate components with k ≤ i < j ≤ n − k + 1. From the results of Koutras, Papadopoulos and Papastavridis [4] it follows that such components are incomparable in a sense that the information q_i > q_j is not sufficient for us to establish whether pairwise rearrangement of components i and j improves the design. However, as we show in this chapter, this does not necessarily mean that we cannot determine, as a necessary condition, which of the components i and j should be more reliable for the design to be optimal.

In the proofs of Propositions 17.4.2 and 17.5.2 we apply the following recursive formula of Shanthikumar [11]:

F̄_k^n(q_1, ..., q_n) = F̄_k^{n−1}(q_1, ..., q_{n−1}) + p_{n−k} q_{n−k+1} ... q_n [ 1 − F̄_k^{n−k−1}(q_1, ..., q_{n−k−1}) ].
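This recursion transcribes directly into code. The sketch below is an illustrative Python version (names are mine), using the usual conventions that F̄_k^m = 0 for m < k and that the factor p_{n−k} is read as 1 when n = k.

```python
def fbar(q, k):
    """Failure probability of a linear consecutive-k-out-of-n:F system with
    component unreliabilities q, via Shanthikumar's recursion:
    Fbar(m) = Fbar(m-1) + p_{m-k} * q_{m-k+1}...q_m * (1 - Fbar(m-k-1))."""
    n = len(q)
    fb = [0.0] * (n + 1)                      # fb[m] = Fbar_k^m(q_1,...,q_m)
    for m in range(k, n + 1):
        tail = 1.0
        for j in range(m - k, m):             # q_{m-k+1} ... q_m (0-based slice)
            tail *= q[j]
        p_front = 1.0 if m == k else 1.0 - q[m - k - 1]   # p_{m-k}
        prev = fb[m - k - 1] if m - k - 1 >= 0 else 0.0
        fb[m] = fb[m - 1] + p_front * tail * (1.0 - prev)
    return fb[n]

# e.g. a 2-out-of-4:F system with q = (0.1, 0.2, 0.3, 0.4)
print(fbar([0.1, 0.2, 0.3, 0.4], 2))
```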
17.3 General result for n > 2k, k ≥ 2

We shall make use of the following notation.

Definition 1. Let X ≡ (q_1, ..., q_n) (X ≡ (p_1, ..., p_n)). We define X^{i;j} to be a design obtained from X by interchanging components i and j.

Propositions 17.3.1 and 17.3.2 below contain preliminary results to Lemmas 17.3.1 and 17.3.2, followed by Corollary 17.3.1 which states a necessary condition for the optimal design of linear consecutive–k–out–of–n:F and linear consecutive–k–out–of–n:G systems with n > 2k, k ≥ 2.

Proposition 17.3.1 Let X ≡ (q_1, ..., q_n), n > 2k, k ≥ 2. Then

F̄_k^n(X) − F̄_k^n(X^{1;k+1}) = q_{k+2} (q_{k+1} − q_1) [ F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n) − F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n) ].

Proof. By the theorem of total probability, conditioning on the behavior of the items in positions 1 and k + 1, we have that
F̄_k^n(X) = q_1 q_{k+1} F̄_k^n(1, q_2, ..., q_k, 1, q_{k+2}, ..., q_n)
  + p_1 p_{k+1} F̄_k^n(0, q_2, ..., q_k, 0, q_{k+2}, ..., q_n)
  + (p_1 q_{k+1}) q_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n)
  + (p_1 q_{k+1}) p_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 0, q_{k+3}, ..., q_n)
  + (q_1 p_{k+1}) q_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n)
  + (q_1 p_{k+1}) p_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 0, q_{k+3}, ..., q_n),   (17.1)

and so also

F̄_k^n(X^{1;k+1}) = q_{k+1} q_1 F̄_k^n(1, q_2, ..., q_k, 1, q_{k+2}, ..., q_n)
  + p_{k+1} p_1 F̄_k^n(0, q_2, ..., q_k, 0, q_{k+2}, ..., q_n)
  + (p_{k+1} q_1) q_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n)
  + (p_{k+1} q_1) p_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 0, q_{k+3}, ..., q_n)
  + (q_{k+1} p_1) q_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n)
  + (q_{k+1} p_1) p_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 0, q_{k+3}, ..., q_n).   (17.2)

Note that

F̄_k^{n−1}(q_2, ..., q_k, 1, 0, q_{k+3}, ..., q_n) = F̄_k^n(1, q_2, ..., q_k, 0, 0, q_{k+3}, ..., q_n)
  = ∏_{s=2}^{k} q_s + F̄_k^{n−k−2}(q_{k+3}, ..., q_n) − ∏_{s=2}^{k} q_s F̄_k^{n−k−2}(q_{k+3}, ..., q_n),   (17.3)

and therefore

(p_1 q_{k+1}) p_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 0, q_{k+3}, ..., q_n) + (q_1 p_{k+1}) p_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 0, q_{k+3}, ..., q_n)
  = (p_{k+1} q_1) p_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 0, q_{k+3}, ..., q_n) + (q_{k+1} p_1) p_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 0, q_{k+3}, ..., q_n).   (17.4)

Consequently, from (17.1), (17.2) and (17.4) we have

F̄_k^n(X) − F̄_k^n(X^{1;k+1})
  = [ (p_1 q_{k+1}) q_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n) + (q_1 p_{k+1}) q_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n) ]
  − [ (p_{k+1} q_1) q_{k+2} F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n) + (q_{k+1} p_1) q_{k+2} F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n) ]
  = q_{k+2} (q_{k+1} − q_1) [ F̄_k^{n−1}(q_2, ..., q_k, 1, 1, q_{k+3}, ..., q_n) − F̄_k^n(1, q_2, ..., q_k, 0, 1, q_{k+3}, ..., q_n) ],   (17.5)

proving the proposition.
Proposition 17.3.2 Let X ≡ (p_1, ..., p_n), n > 2k, k ≥ 2. Then

G_k^n(X) − G_k^n(X^{1;k+1}) = p_{k+2} (p_{k+1} − p_1) [ G_k^{n−1}(p_2, ..., p_k, 1, 1, p_{k+3}, ..., p_n) − G_k^n(1, p_2, ..., p_k, 0, 1, p_{k+3}, ..., p_n) ].

Proof. This result follows from Proposition 17.3.1 and the fact that

F̄_k^n(q_1, ..., q_n) = G_k^n(p̄_1, ..., p̄_n),

where p̄_i ≡ q_i, 1 ≤ i ≤ n.

Lemma 17.3.1 Let X ≡ (q_1, ..., q_n) be a design for a linear consecutive–k–out–of–n:F system, n > 2k, k ≥ 2. If q_1 < q_{k+1}, then X^{1;k+1} is a better design.

Proof. We assume the notation that if T < R then ∑_{s=R}^{T} f_s ≡ 0.
Define W ≡ F¯kn−1 (q2 , . . . , qk , 1, 1, qk+3 , . . . , qn ) − F¯kn (1, q2 , . . . , qk , 0, 1, qk+3 , . . . , qn ).
(17.6)
Since q_{k+1} − q_1 > 0 by assumption, it is sufficient by Proposition 17.3.1 to show that W > 0. We shall show W > 0. Define W_i for 0 ≤ i ≤ k − 1 in the following way. If i = 0, then

W_i ≡ F̄_k^{n−k}(1, 1, q_{k+3}, ..., q_n) − F̄_k^{n−k}(0, 1, q_{k+3}, ..., q_n);

if 1 ≤ i ≤ k − 2, then

W_i = F̄_k^{n−k+i}(1, ..., 1, 1, 1, q_{k+3}, ..., q_n) − F̄_k^{n−k+i}(1, ..., 1, 0, 1, q_{k+3}, ..., q_n),

where each leading block 1, ..., 1 consists of i ones; and if i = k − 1, then

W_i = F̄_k^{n−k+i}(1, ..., 1, 1, 1, q_{k+3}, ..., q_n) − F̄_k^{n−k+i}(1, 1, ..., 1, 0, 1, q_{k+3}, ..., q_n),   (17.7)

where the indicated blocks of ones contain k − 1 ones. Since

p_k + ∑_{i=1}^{k−2} p_{k−i} ∏_{s=k−i+1}^{k} q_s + ∏_{s=2}^{k} q_s = 1,

by the theorem of total probability, conditioning on the behavior of the items in positions 2, ..., k, we have that

W = p_k W_0 + ∑_{i=1}^{k−2} ( p_{k−i} ∏_{s=k−i+1}^{k} q_s ) W_i + ( ∏_{s=2}^{k} q_s ) W_{k−1}.   (17.8)

Note that W_{k−1} = 1 − 1 = 0 and W_i > 0 for all 0 ≤ i ≤ k − 2. From this it follows that W > 0, proving the lemma.

Lemma 17.3.2 Let X ≡ (p_1, ..., p_n) be a design for a linear consecutive–k–out–of–n:G system, n > 2k, k ≥ 2. If q_1 < q_{k+1}, then X^{1;k+1} is a better design.

Proof. By reasoning similar to that in the proof of Lemma 17.3.1 it can be shown that W > 0, where W is defined by

W ≡ G_k^{n−1}(p_2, ..., p_k, 1, 1, p_{k+3}, ..., p_n) − G_k^n(1, p_2, ..., p_k, 0, 1, p_{k+3}, ..., p_n).   (17.9)
Since pk+1 − p1 < 0 by assumption, from Proposition 17.3.2 we have Gk (X) < Gk (X 1;k+1 ), proving that X 1;k+1 is a better design.
Corollary 17.3.1 Let X ≡ (q1 , . . . , qn ), n > 2k, k ≥ 2. If X is an optimal design for a linear consecutive–k–out–of–n:F or k–out–of–n:G system, then q1 > qk+1 and qn > qn−k . Proof. Let X be an optimal design for a linear consecutive–k–out–of–n:F system. Suppose that q1 < qk+1 or qn < qn−k . If q1 < qk+1 , then from Lemma 17.3.1 it follows that X 1;k+1 is a better design. Further, if qn < qn−k , then by Lemma 17.3.1 applied to the reversed design Xr ≡ (qn , . . . , q1 ), we have that X n;n−k is a better design. The proof for a linear consecutive–k–out–of–n:G system is similar and follows from Lemma 17.3.2. Note: In the optimal design of linear consecutive–k–out–of–n systems with n ∈ {2k + 1, 2k + 2}, the worst component must be placed in position 1 or n. This is due to the necessary condition for the optimal design stated in Corollary 17.3.1 and the necessary conditions of Malon [7] and Kuo et al. [5], as stated in Section 1. Considering that a design is equivalent to its reversed version in the sense that their reliabilities are equal, it can be assumed in algorithms that the worst component is placed in position 1.
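The necessary condition of Corollary 17.3.1 is trivial to test on a candidate design. A minimal Python sketch (the function name is mine) with the design stored as a 0-based list of unreliabilities:

```python
def meets_corollary_17_3_1(q, k):
    """Necessary condition of Corollary 17.3.1 for an optimal design
    (n > 2k, k >= 2): q1 > q_{k+1} and qn > q_{n-k}."""
    n = len(q)
    return q[0] > q[k] and q[n - 1] > q[n - k - 1]
```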
17.4 Results for n = 2k + 1, k > 2

We shall make use of the following notation.

Definition 2. Let X ≡ (q_1, ..., q_n) (X ≡ (p_1, ..., p_n)). We define X^{i_1,...,i_r; j_1,...,j_r} to be a design obtained from X by interchanging components i_s and j_s for all 1 ≤ s ≤ r.

Propositions 17.4.1 and 17.4.2 below contain preliminary results to Lemma 17.4.1, followed by Corollary 17.4.1 which states a necessary condition for the optimal design of a linear consecutive–k–out–of–(2k+1):F system with k > 2.

Proposition 17.4.1 Let X ≡ (q_1, ..., q_{2k+1}), k > 2. Then

F̄_k^{2k+1}(X) − F̄_k^{2k+1}(X^{k;k+1}) = (q_k − q_{k+1}) (q_1 ... q_{k−1} − q_{k+2} ... q_{2k} + q_{k+2} ... q_{2k+1} − q_1 ... q_{k−1} q_{k+2} ... q_{2k+1}).

Proof. From

F̄_k^{2k+1}(X) = q_k q_{k+1} F̄_k^{2k+1}(q_1, ..., q_{k−1}, 1, 1, q_{k+2}, ..., q_{2k+1})
  + p_k p_{k+1} F̄_k^{2k+1}(q_1, ..., q_{k−1}, 0, 0, q_{k+2}, ..., q_{2k+1})
  + p_k q_{k+1} F̄_k^{2k+1}(q_1, ..., q_{k−1}, 0, 1, q_{k+2}, ..., q_{2k+1})
  + q_k p_{k+1} F̄_k^{2k+1}(q_1, ..., q_{k−1}, 1, 0, q_{k+2}, ..., q_{2k+1}),   (17.10)

and consequently

F̄_k^{2k+1}(X^{k;k+1}) = q_{k+1} q_k F̄_k^{2k+1}(q_1, ..., q_{k−1}, 1, 1, q_{k+2}, ..., q_{2k+1})
  + p_{k+1} p_k F̄_k^{2k+1}(q_1, ..., q_{k−1}, 0, 0, q_{k+2}, ..., q_{2k+1})
  + p_{k+1} q_k F̄_k^{2k+1}(q_1, ..., q_{k−1}, 0, 1, q_{k+2}, ..., q_{2k+1})
  + q_{k+1} p_k F̄_k^{2k+1}(q_1, ..., q_{k−1}, 1, 0, q_{k+2}, ..., q_{2k+1}),   (17.11)

we have

F̄_k^{2k+1}(X) − F̄_k^{2k+1}(X^{k;k+1}) = (q_k − q_{k+1}) [ F̄_k^{2k+1}(q_1, ..., q_{k−1}, 1, 0, q_{k+2}, ..., q_{2k+1}) − F̄_k^{2k+1}(q_1, ..., q_{k−1}, 0, 1, q_{k+2}, ..., q_{2k+1}) ]
  = (q_k − q_{k+1}) (q_1 ... q_{k−1} − q_{k+2} ... q_{2k} + q_{k+2} ... q_{2k+1} − q_1 ... q_{k−1} q_{k+2} ... q_{2k+1}),   (17.12)

proving the proposition.
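The closed form of Proposition 17.4.1 is easy to spot-check numerically against a direct reliability evaluation. The following Python sketch (illustrative only; evaluator and variable names are mine) compares both sides of the identity for a randomly chosen design with k = 3, n = 7; the two printed values should agree up to floating-point error.

```python
import random
from math import prod
from itertools import product

def fbar(q, k):
    """Failure probability of a consecutive-k-out-of-n:F system with
    component unreliabilities q, by enumerating all component states."""
    total = 0.0
    for state in product((0, 1), repeat=len(q)):   # 1 = component fails
        run = best = 0
        for s in state:
            run = run + 1 if s else 0
            best = max(best, run)
        if best >= k:
            w = 1.0
            for s, qi in zip(state, q):
                w *= qi if s else 1.0 - qi
            total += w
    return total

random.seed(2)
k = 3
q = [random.uniform(0.05, 0.95) for _ in range(2 * k + 1)]   # n = 2k + 1 = 7
swapped = q[:]                                               # X^{k;k+1}
swapped[k - 1], swapped[k] = swapped[k], swapped[k - 1]
lhs = fbar(q, k) - fbar(swapped, k)
rhs = (q[k - 1] - q[k]) * (prod(q[:k - 1]) - prod(q[k + 1:2 * k])
                           + prod(q[k + 1:]) - prod(q[:k - 1]) * prod(q[k + 1:]))
print(lhs, rhs)
```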
Proposition 17.4.2 Let X ≡ (q1 , . . . q2k+1 ) and Y ≡ X 1,...,k−1;2k,...,k+2 . Then F¯k2k+1 (X) − F¯k2k+1 (Y ) = (qk+2 . . . q2k − q1 . . . qk−1 )· (qk+1 p2k+1 + q2k+1 − qk ). Proof. From Shanthikumar’s recursive algorithm [11], as stated in Section 17.2, it follows that F¯k2k+1 (X) = F¯k2k (q1 , . . . , q2k ) + pk+1 qk+2 . . . q2k q2k+1 (1 − q1 . . . qk−1 qk ) (17.13) and F¯k2k+1 (Y ) = F¯k2k (q2k , . . . , qk+2 , qk , qk+1 , qk−1 , . . . , q1 ) (17.14) + pk+1 qk−1 . . . q1 q2k+1 (1 − qk+2 . . . q2k qk ) . Note that pk+1 qk+2 . . . q2k q2k+1 (1 − q1 . . . qk−1 qk ) − pk+1 qk−1 . . . q1 q2k+1 (1 − qk+2 . . . q2k qk ) = pk+1 q2k+1 (qk+2 . . . q2k − q1 . . . qk−1 ) .
(17.15)
Also, we have F¯k2k (q1 , . . . , q2k ) = pk pk+1 · 0 2(k−2) + qk qk+1 F¯k−2 (q2 , . . . , qk−1 , qk+2 , . . . , q2k−1 )
+ pk qk+1 (qk+2 . . . q2k ) + qk pk+1 (q1 . . . qk−1 )
(17.16)
and F¯k2k (q2k , . . . , qk+2 , qk , qk+1 , qk−1 , . . . , q1 ) = pk pk+1 · 0 2(k−2) + qk qk+1 F¯k−2 (q2 , . . . , qk−1 , qk+2 , . . . , q2k−1 )
+ pk qk+1 (q1 . . . qk−1 ) + pk+1 qk (qk+2 . . . q2k ),
(17.17)
and so F¯k2k (q1 , . . . , q2k ) − F¯k2k (q2k , . . . , qk+2 , qk , qk+1 , qk−1 , . . . , q1 ) (17.18) = (qk+1 − qk ) (qk+2 . . . q2k − q1 . . . qk−1 ) . From (17.13)–(17.15) and (17.18) it follows that F¯ 2k+1 (X) − F¯ 2k+1 (Y ) k
k
= (qk+1 − qk ) (qk+2 . . . q2k − q1 . . . qk−1 ) + pk+1 q2k+1 (qk+2 . . . q2k − q1 . . . qk−1 ) = (qk+2 . . . q2k − q1 . . . qk−1 )(qk+1 p2k+1 + q2k+1 − qk ), and so the proposition follows.
(17.19)
Lemma 17.4.1 Let X ≡ (q1 , . . . q2k+1 ) be a design for a linear consecutive– k–out–of–(2k + 1) : F system, k > 2. Let X satisfy the necessary conditions for the optimal design given by Malon [7] and Kuo et al. [5], as stated in Section 1. Assume qk+1 < qk . If q1 . . . qk−1 ≥ qk+2 . . . q2k ,
(17.20)
then X k;k+1 is a better design, while if q1 . . . qk−1 < qk+2 . . . q2k ,
(17.21)
then X 1,...,k−1;2k,...,k+2 is a better design and (q2k+1 , q1 , . . . q2k ) is a better design. Proof. Suppose q1 . . . qk−1 ≥ qk+2 . . . q2k . Then, since qk − qk+1 > 0 and qk+2 . . . q2k+1 − q1 . . . qk−1 qk+2 . . . q2k+1 > 0, by Proposition 17.4.1 we have F¯k2k+1 (X) − F¯k2k+1 (X k;k+1 ) > 0,
(17.22)
and so X^{k;k+1} is a better design. Suppose q_1 ... q_{k−1} < q_{k+2} ... q_{2k}. We have assumed the values q_i are distinct, so q_{2k+1} ≠ q_k. If q_{2k+1} < q_k, then by the necessary conditions of Malon [7] and Kuo et al. [5] we have q_1 > q_2 > · · · > q_k > q_{2k+1} > q_{2k} > · · · > q_{k+2},
(17.23)
and then q1 . . . qk−1 > qk+2 . . . q2k , contrary to assumption. Hence q2k+1 > qk
(17.24)
and so by Proposition 17.4.2 we have F¯k2k+1 (X) − F¯k2k+1 (X 1,...,k−1;2k,...,k+2 ) > 0,
(17.25)
proving that X 1,...,k−1;2k,...,k+2 ≡ (q2k , . . . , qk+2 , qk , qk+1 , qk−1 , . . . , q1 , q2k+1 )
is a better design. Define X′ ≡ (q′_1, ..., q′_{2k+1}) ≡ X^{1,...,k−1; 2k,...,k+2}. Note that q′_{k+1} < q′_k and q′_1 · ... · q′_{k−1} > q′_{k+2} · ... · q′_{2k}, and so by Proposition 17.4.1, as we have shown in the earlier part of this proof, interchanging components q′_{k+1} and q′_k in X′ improves the design. Since the design

(q′_1, ..., q′_{k−1}, q′_{k+1}, q′_k, q′_{k+2}, ..., q′_{2k+1}) ≡ (q_{2k}, ..., q_1, q_{2k+1})

is better than X′, X′ is better than X, and

F̄_k^{2k+1}(q_{2k+1}, q_1, ..., q_{2k}) = F̄_k^{2k+1}(q_{2k}, ..., q_1, q_{2k+1}),   (17.26)
so (q2k+1 , q1 , . . . q2k ) is better than X. Note that the rearrangement X 1,...,k−1;2k,...,k+2 as given in Lemma 17.4.1 above is equivalent to: • taking the (2k + 1)-th component and putting it on the left-hand side of the system, next to the first component (in position 0); • interchanging components k and (k + 1); and then • reversing the order of components. Corollary 17.4.1 Let X ≡ (q1 , . . . q2k+1 ) be a design for a linear consecutive–k–out–of–(2k + 1) : F system, k > 2. If X is optimal, then min{q1 , q2k+1 } > qk+1 > max{qk , qk+2 }.
(17.27)
Proof. From Corollary 17.3.1 we have min{q1 , q2k+1 } > qk+1 . If X is optimal, then it satisfies the necessary conditions for the optimal design given by Malon [7] and Kuo et al. [5], as stated in Section 17.2. From Lemma 17.4.1 it follows that qk+1 > qk must be satisfied. Similarly, from Lemma 17.4.1 applied to the reversed design Xr ≡ (q2k+1 , . . . , q1 ), we have qk+1 > qk+2 .
17.5 Results for n = 2k + 2, k > 2 Propositions 17.5.1 and 17.5.2 below contain preliminary results for Lemma 17.5.1, followed by Corollary 17.5.1 which gives a necessary condition for the optimal design of a linear consecutive–k–out–of–(2k + 2): F system. Proposition 17.5.1 Let X ≡ (q1 , . . . , q2k+2 ), k > 2. Then F¯k2k+2 (X) − F¯k2k+2 (X k+1;k+2 ) = (qk+2 − qk+1 )· [(p2k+2 qk+3 . . . q2k+1 − p1 q2 . . . qk ) − (p2k+2 − p1 )q2 . . . qk qk+3 . . . q2k+1 ] . Proof. Since 2(k−2) F¯k2k+2 (X) = qk+1 qk+2 F¯k−2 (q3 , . . . , qk , qk+3 , . . . , q2k ) + pk+1 pk+2 F¯ 2k+2 (q1 , . . . , qk , 0, 0, qk+3 , . . . , q2k ) k
+ pk+1 qk+2 F¯k2k+2 (q1 , . . . , qk , 0, 1, qk+3 , . . . , q2k+2 ) + qk+1 pk+2 F¯ 2k+2 (q1 , . . . , qk , 1, 0, qk+3 , . . . , q2k+2 ) k
(17.28)
and 2(k−2) F¯k2k+2 (X k+1;k+2 ) = qk+1 qk+2 F¯k−2 (q3 , . . . , qk , qk+3 , . . . , q2k ) + pk+1 pk+2 F¯ 2k+2 (q1 , . . . , qk , 0, 0, qk+3 , . . . , q2k ) k
+ pk+2 qk+1 F¯k2k+2 (q1 , . . . , qk , 0, 1, qk+3 , . . . , q2k+2 ) + qk+2 pk+1 F¯ 2k+2 (q1 , . . . , qk , 1, 0, qk+3 , . . . , q2k+2 ), k
(17.29)
we have F¯k2k+2 (X) − F¯k2k+2 (X k+1;k+2 ) = (qk+2 − qk+1 )· 3 2k+2 (q1 , . . . , qk , 0, 1, qk+3 , . . . , q2k+2 ) F¯k 4 2k+2 ¯ (q1 , . . . , qk , 1, 0, qk+3 , . . . , q2k+2 ) −F k
= (qk+2 − qk+1 ) [q1 . . . qk + qk+3 . . . q2k+1 − q1 . . . qk qk+3 . . . q2k+1 − q2 . . . qk − qk+3 . . . q2k+2 + q2 . . . qk qk+3 . . . q2k+2 ] = (qk+2 − qk+1 ) [(p2k+2 qk+3 . . . q2k+1 − p1 q2 . . . qk ) − (p2k+2 − p1 )q2 . . . qk qk+3 . . . q2k+1 ] , (17.30)
proving the proposition. Proposition 17.5.2 Let X ≡ (q1 , . . . , q2k+2 ), k > 2. Then F¯k2k+2 (X) − F¯k2k+2 (X 1;2k+2 ) = (q1 − q2k+2 )· [(pk+1 q2 . . . qk − pk+2 qk+3 . . . q2k+1 ) − (pk+1 − pk+2 )q2 . . . qk qk+3 . . . q2k+1 ] . Proof. Since F¯k2k+2 (X) = q1 q2k+2 F¯k2k+2 (1, q2 , . . . , q2k+1 , 1) + p1 p2k+1 F¯ 2k+2 (0, q2 , . . . , q2k+1 , 0) k
+ p1 q2k+2 F¯k2k+1 (q2 , . . . , q2k+1 , 1) + p2k+2 q1 F¯ 2k+1 (1, q2 , . . . , q2k+1 ) k
(17.31)
and F¯k2k+2 (X 1;2k+2 ) = q1 q2k+2 F¯k2k+2 (1, q2 , . . . , q2k+1 , 1) + p1 p2k+1 F¯ 2k+2 (0, q2 , . . . , q2k+1 , 0) k
+ p2k+2 q1 F¯k2k+1 (q2 , . . . , q2k+1 , 1) + p1 q2k+2 F¯ 2k+1 (1, q2 , . . . , q2k+1 ), k
(17.32)
we have F¯k2k+2 (X) − F¯k2k+2 (X 1;2k+2 ) = (q1 − q2k+2 )· 3 2k+1 F¯k (1, q2 , . . . , q2k+1 ) 4 (17.33) − F¯k2k+1 (q2 , . . . , q2k+1 , 1) . From this, by Shanthikumar’s recursive algorithm [11], as stated in Section 17.2, it follows that F¯k2k+2 (X) − F¯k2k+2 (X 1;2k+2 ) = (q1 − q2k+2 )· 38 2k F¯k (q2 , . . . , q2k+1 ) + pk+1 q2 . . . qk (1 − qk+2 . . . q2k+1 )} 8 − F¯k2k (q2 , . . . , q2k+1 ) + pk+2 qk+3 . . . q2k+1 (1 − q2 . . . qk+1 )}] = (q1 − q2k+2 )· [pk+1 q2 . . . qk (1 − qk+3 . . . q2k+1 + pk+2 qk+3 . . . q2k+1 )] − pk+2 qk+3 . . . q2k+1 (1 − q2 . . . qk + pk+1 q2 . . . qk )] = (q1 − q2k+2 ) [(pk+1 q2 . . . qk − pk+2 qk+3 . . . q2k+1 ) − (pk+1 − pk+2 )q2 . . . qk qk+3 . . . q2k+1 ] ,
(17.34)
completing the proof.
Lemma 17.5.1 Let X ≡ (q1 , . . . , q2k+2 ) be a design for a linear consecutive– k–out–of–(2k + 2) : F system, q1 > q2k+2 . Assume qk+1 < qk+2 . If qk+3 . . . q2k+1 ≥ q2 . . . qk ,
(17.35)
then X k+1;k+2 is a better design, whereas if qk+3 . . . q2k+1 ≤ q2 . . . qk ,
(17.36)
then X 1;2k+2 is a better design. Proof. If qk+3 . . . q2k+1 ≥ q2 . . . qk , then p2k+2 qk+3 . . . q2k+1 − p1 q2 . . . qk ≥ (p2k+2 − p1 )q2 . . . qk > (p2k+2 − p1 )q2 . . . qk qk+3 . . . q2k+1 ,
(17.37)
and so by Proposition 17.5.1 we have F¯k2k+2 (X) > F¯k2k+2 (X k+1;k+2 ), proving that X k+1;k+2 is a better design.
If qk+3 . . . q2k+1 ≤ q2 . . . qk , then pk+1 q2 . . . qk − pk+2 qk+3 . . . q2k+1 ≥ (pk+1 − pk+2 )qk+3 . . . q2k+1 > (pk+1 − pk+2 )q2 . . . qk qk+3 . . . q2k+1 ,
(17.38)
and from Proposition 17.5.2 it follows that F¯k2k+2 (X) > F¯k2k+2 (X 1;2k+2 ), proving that X 1;2k+2 is a better design and completing the proof. Corollary 17.5.1 Let X ≡ (q1 , . . . , q2k+2 ) be a design for a linear consecutive–k–out–of–(2k + 2):F system, k > 2. If X is optimal, then (i)(q1 , qk+1 , qk+2 , q2k+2 ) is singular; and (ii)(q1 , . . . , qk , qk+3 , . . . , q2k+2 ) is nonsingular. Proof. Without loss of generality we may assume q1 > q2k+2 . For q1 < q2k+2 we apply the reasoning below to the reversed design Xr ≡ (q2k+2 , . . . , q1 ). Suppose (q1 , qk+1 , qk+2 , q2k+2 ) is nonsingular. Then qk+1 < qk+2 , and by Lemma 17.5.1 we have that either X k+1;k+2 or X 1;2k+2 must be a better design. Hence X is not optimal contrary to the assumption, and (i) follows. Suppose that (q1 , . . . , qk , qk+3 , . . . , q2k+2 ) is singular. Then, since from above (q1 , qk+1 , qk+2 , q2k+2 ) must be singular, we have that X is singular, contrary to the necessary condition of nonsingularity stated by O’Reilly in ([9], Corollary 1). Hence (ii) follows and this completes the proof.
17.6 Procedures to improve designs not satisfying the necessary conditions for the optimal design The procedures below follow directly from the results of Lemmas 17.3.1, 17.3.2, 17.4.1, 17.5.1 respectively. Procedure 17.6.3 also applies the necessary conditions for the optimal design given by Malon [7] and Kuo et al. [5], as stated in Section 17.1. Procedure 17.6.1 Let X be a design for a linear consecutive–k–out–of–n:F or a linear consecutive–k–out–of–n:G system, with n > 2k, k ≥ 2. In order to improve the design, if q1 < qk+1 , interchange components q1 and qk+1 . Next, if qn < qn−k+1 , interchange components qn and qn−k+1 . Procedure 17.6.2 Let X be a design for a linear consecutive–k–out–of– (2k + 1) : F system, k > 2. Rearrange the components in the positions from 1 to k, and then the components in the positions from (2k + 1) to (k + 2) in non-decreasing order of component reliability. In order to improve the design, proceed as follows:
• If q_{k+1} < q_k:
  1. Interchange components q_{k+1} and q_k, when q_1 ... q_{k−1} ≥ q_{k+2} ... q_{2k}; otherwise take the q_{2k+1} component and put it on the left-hand side of the system, next to the q_1 component (in position 0).
  2. In a design obtained in this way, rearrange the components in the positions from 1 to k, and then the components in the positions from (2k + 1) to (k + 2), in non-decreasing order of component reliability.
  3. If required, repeat steps 1–3 to further improve this new rearranged design or until the condition q_{k+1} > q_k is satisfied.
• If q_{k+1} < q_{k+2}, reverse the order of components and apply steps 1–3 to the rearranged design.

Procedure 17.6.3 Let X be a design for a linear consecutive–k–out–of–(2k + 2):F system, with k > 2. In order to improve the design:
• If q_1 > q_{2k+2} and q_{k+1} < q_{k+2}, interchange components
  1. q_{k+1} and q_{k+2}, when q_{k+3} ... q_{2k+1} ≥ q_2 ... q_k, or
  2. q_1 and q_{2k+2}, when q_{k+3} ... q_{2k+1} ≤ q_2 ... q_k.
• If q_1 < q_{2k+2} and q_{k+1} > q_{k+2}, interchange components
  1. q_{k+1} and q_{k+2}, when q_{k+3} ... q_{2k+1} ≤ q_2 ... q_k, or
  2. q_1 and q_{2k+2}, when q_{k+3} ... q_{2k+1} ≥ q_2 ... q_k.
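Procedures 17.6.1 and 17.6.3 translate directly into code; Procedure 17.6.2 can be implemented in the same style. The following Python sketch is illustrative only (function names are mine); designs are held as 0-based lists of unreliabilities, so q_1 is q[0], q_{k+1} is q[k], and so on.

```python
from math import prod

def procedure_17_6_1(q, k):
    """Sketch of Procedure 17.6.1 (F or G system, n > 2k, k >= 2):
    interchange components 1 and k+1 if q1 < q_{k+1}, then components
    n and n-k+1 if qn < q_{n-k+1}."""
    q = list(q)
    n = len(q)
    if q[0] < q[k]:                      # positions 1 and k+1
        q[0], q[k] = q[k], q[0]
    if q[-1] < q[n - k]:                 # positions n and n-k+1
        q[-1], q[n - k] = q[n - k], q[-1]
    return q

def procedure_17_6_3(q, k):
    """Sketch of Procedure 17.6.3 (F system, n = 2k + 2, k > 2)."""
    q = list(q)
    left = prod(q[1:k])                  # q2 ... qk
    right = prod(q[k + 2:2 * k + 1])     # q_{k+3} ... q_{2k+1}
    if q[0] > q[-1] and q[k] < q[k + 1]:
        if right >= left:
            q[k], q[k + 1] = q[k + 1], q[k]      # swap q_{k+1}, q_{k+2}
        else:
            q[0], q[-1] = q[-1], q[0]            # swap q_1, q_{2k+2}
    elif q[0] < q[-1] and q[k] > q[k + 1]:
        if right <= left:
            q[k], q[k + 1] = q[k + 1], q[k]
        else:
            q[0], q[-1] = q[-1], q[0]
    return q
```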
References 1. Z. W. Birnbaum, On the importance of different components in a multicomponent system, in P. R. Krishnaiah, editor, Multivariate Analysis, II, (Academic Press, New York, 1969), 581–592. 2. P. J. Boland, F. Proschan and Y. L. Tong, Optimal arrangement of components via pairwise rearrangements, Naval Res. Logist. 36 (1989), 807–815. 3. G. J. Chang, L. Cui and F. K. Hwang, New comparisons in Birnbaum importance for the consecutive–k–out–of–n system, Probab. Engrg. Inform. Sci. 13 (1999), 187–192. 4. M. V. Koutras, G. K. Papadopoulos and S. G. Papastavridis, Note: Pairwise rearrangements in reliability structures, Naval Res. Logist. 41 (1994), 683–687. 5. W. Kuo, W. Zhang and M. Zuo, A consecutive–k–out–of–n:G system: the mirror image of a consecutive–k–out–of–n : F system, IEEE Trans. Reliability, 39(2) (1990), 244–253. 6. J. Malinowski and W. Preuss, Reliability increase of consecutive–k–out–of–n : F and related systems through component rearrangement, Microelectron. Reliab. 36(10) (1996), 1417–1423. 7. D. M. Malon, Optimal consecutive–k–out–of–n : F component sequencing, IEEE Trans. Reliability, 34(1) (1985), 46–49. 8. M. M. O’Reilly, Variant optimal designs of linear consecutive–k–out–of–n systems, to appear in Mathematical Sciences Series: Industrial Mathematics and Statistics, Ed. J. C. Misra (Narosa Publishing House, New Delhi, 2003), 496–502.
9. M. M. O’Reilly, Optimal design of linear consecutive–k–out–of–n systems, chapter in this volume. 10. S. Papastavridis, The most important component in a consecutive–k–out–of–n: F system, IEEE Trans. Reliability 36(2) (1987), 266–268. 11. J. G. Shanthikumar, Recursive algorithm to evaluate the reliability of a consecutive– k–out–of–n: F system, IEEE Trans. Reliability 31(5) (1982), 442–443. 12. J. Shen and M. Zuo, A necessary condition for optimal consecutive–k–out–of–n: G system design, Microelectron. Reliab. 34(3) (1994), 485–493. 13. R. S. Zakaria, H. T. David and W. Kuo, A counter–intuitive aspect of component importance in linear consecutive–k–out–of–n systems, IIE Trans. 24(5) (1992), 147–154. 14. M. J. Zuo, Reliability of linear and circular consecutively–connected systems, IEEE Trans. Reliability 42(3) (1993), 484–487. 15. M. Zuo, Reliability and component importance of a consecutive–k–out–of–n system, Microelectron. Reliab. 33(2) (1993), 243–258. 16. M. Zuo and W. Kuo, Design and performance analysis of consecutive–k–out–of–n structure, Naval Res. Logist. 37 (1990), 203–230. 17. M. Zuo and J. Shen, System reliability enhancement through heuristic design, OMAE II, Safety and Reliability (ASME Press, 1992), 301–304.
Chapter 18
Optimizing properties of polypropylene and elastomer compounds containing wood flour Pavel Spiridonov, Jan Budin, Stephen Clarke and Jani Matisons
Abstract Despite the fact that wood flour has been known as an inexpensive filler in plastics compounds for many years, commercial wood-filled plastics are not widely used. One reason for this has been the poor mechanical properties of wood-filled compounds. Recent publications report advances in wood flour modification and compatibilization of polymer matrices, which has led to an improvement in processability and the mechanical properties of the blends. In most cases the compounds were obtained in Brabender-type mixers. In this work the authors present the results for direct feeding of mixtures of wood flour and thermoplastic materials (polypropylene and SBS elastomer) in injection molding. The obtained blends were compared with Brabender-mixed compounds from the point of view of physical and mechanical properties and aesthetics. It was shown that polymer blends with rough grades of wood flour (particle size >300 microns) possess a better decorative look and a lower density having, at the same time, poorer mechanical properties. Usage of compatibilizers allowed the authors to optimize the tensile strength of these compounds.

Key words: Tensile strength, wood-filled plastic, polypropylene elastomer, wood flour, optimal properties

Pavel Spiridonov Centre for Advanced Manufacturing Research, University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected]
Jan Budin Institute of Chemical Technology, Prague, CZECH REPUBLIC e-mail: [email protected]
Stephen Clarke Polymer Science Group, Ian Wark Research Institute, University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected]
Jani Matisons Polymer Science Group, Ian Wark Research Institute, University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected]
Grier Lin Centre for Advanced Manufacturing Research, University of South Australia, Mawson Lakes SA 5095, AUSTRALIA e-mail: [email protected]
18.1 Introduction Wood flour is referred to [1] as an extender – a type of filler that is added to polymer compounds as a partial substitute for expensive plastic or elastomeric material. The major advantages of organic fillers, including wood flour, are their relatively low cost and low density [2]. Despite the fact that wood flour has been known as a filler since the 1950s, its commercial application has been rather restricted. During the past 10–15 years, new organic materials such as rice husks, oil palm fibers, natural bast fibers and sisal strands have been studied as fillers [3]–[6]. Previous investigations in the area of traditional wood fibers have studied particular types of wood such as eucalyptus [7] or ponderosa pine [8, 9]. Most of the above organic fillers are used in composite materials [2]–[4],[6]–[8],[10]–[14]. The primary objective of this research was to find cost-effective ways to optimize properties of polypropylene and thermoplastic elastomers filled with wood flour. To eliminate a blending operation from the manufacturing process, direct feeding of an injection-molding machine with the polymer-filler mixtures was employed. Filling the polymer compounds with wood flour would allow not only a decrease in their cost, but would also reduce the environmental effect by utilizing wood wastes and making recyclable or bio-degradable products [2, 15, 16]. From this point of view, the authors did not select a particular type of wood; instead we used a mixture of unclassified sawdust.
18.2 Methodology 18.2.1 Materials As a polymer matrix two polymers were used. Polypropylene (unfilled, density 0.875 g/cm3 , melt index 10 cm3 /10 min at 230o C) was used. This is one of the most common industrial plastics. Styrene–butadiene tri-block (SBS) elastomer (unfilled, rigid-to-soft fraction ratio 30/70, density 0.921 g/cm3 , melt index 12 cm3 /10 min at 230o C) was selected because of the growing popularity of thermoplastic elastomers (TPE). This is due to their “soft touch feeling” properties and application in two-component injectionmolding applications.
To compatibilize the wood flour with polymers [9]–[13], maleic anhydride grafted polypropylene (PP–MA) and styrene–ethylene/butylene–styrene elastomer (SEBS–MA) were added to the mix. The content of maleic anhydride in PP–MA and SBS–MA was 1.5% and 1.7%, respectively. As a filler, a mixture of local unclassified sawdust was used. The mixture was separated into 4 fractions. The characteristics of the fractions are given in Table 18.1.

Table 18.1 Physical characteristics of the wood flour fractions

Fraction | Particle size, μm | Sieve mesh size | Density, g/cm3
1        | 600–850           | 20              | 0.92
2        | 300–600           | 30              | 1.17
3        | 150–300           | 60              | 1.37
4        | <150              | 100             | 1.56
18.2.2 Sample preparation and tests Wood flour samples were pre-dried for 6 hours at 50–60o C in the electric oven before blending. The polymer–wood flour blends were obtained in two ways. For injection molding, the polymer and filler were mixed just before molding. No additional pre-compounding was used. The specimens for tensile test (Australian Standard AS 1145) were molded in a 22 (metric) tonne injection-molding machine. These blends were also pre-mixed in a Brabender mixer at 40 rpm at 180o C to 190o C. The polymers were first introduced in the mixer; the wood flour was then added when the polymers melted (a constant torque was reached). The total mixing time was 6–8 min depending on the composition. Each blend weighed 65–70 grams. While warm, the blended materials were formed into a 2-mm sheet in the laboratory vulcanization press under 10 MPa pressure at 180o C. The tensile specimens were punched from the sheets using a standard cutting die. Tensile testing of the above specimens was conducted according to AS 1145 on a horizontal tensile test machine. Five test samples of each compound were tested. Densities of the wood flour and polymer compounds were determined by a volumetric method in either water or methylated spirits.
18.3 Results and discussions 18.3.1 Density of compounds Comparison of the densities of the polymer compounds provides information about the quality and interactions between the polymer matrix and filler.
From Table 18.1 it can be seen that the density of wood flour depends on the particle size in each fraction. The fractions consisting of smaller particles have a higher density. This is because wood is a cellulose material, which has a porous structure [4]. Larger wood particles retain this structure with considerably less displacement of voids by the floatation medium. On the other hand, for smaller particles a higher percentage of voids are filled by the flotation liquid [14] resulting in a higher density. The difference in density should influence the density of the polymer compounds that contain different wood flour fractions. When the compounds were molded in the injection-molding machine or pressed after mixing in the Brabender mixer, their density was both measured and calculated. The calculations were based on the density of the polymer matrix and the wood flour and their ratio in the compounds. The results are presented in Figure 18.1. A difference was observed between the two processing methods and the calculated values. The results indicated that the densities of un-coupled PP were lower when molded in the injection-molding machine. The most stable results were obtained when both PP and SBS compounds were mixed in the Brabender mixer and then were formed in the press. Stark et al. [8] have explained this observation by the compression of the compounds to the maximum density that the wood cell walls can sustain. This correlates with our results in regard to the Brabender method and the difference from the calculated values. However, the pressure created in an injection mold is comparable with the pressure developed in the press. Although an injection machine creates a bigger plasticizing effect, the total blending time is shorter (1–1.5 min) than for the Babender mixer (6–8 min). Therefore, in addition to the effect of compression, blending time is a very important parameter. The use of modifiers improved the quality of compounds without increasing the blending time. Thus maleated polypropylene allowed us to obtain compounds with close densities both in injection molding and the Brabender mixer (see Figure 18.1). This is because of the compatibilization impact of maleic anhydride, which is achieved by improving the polymer matrix impregnation, improving fiber dispersion, enhancing the interfacial adhesion and other effects [9]–[13].
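The chapter does not spell out the formula behind the calculated densities in Figure 18.1; a common choice for a two-component compound, shown here purely as an assumption, is the inverse rule of mixtures over weight fractions. The Python sketch below uses the PP density and the fraction 2 wood flour density quoted in the text, with a 40 wt% filler loading as an example.

```python
def compound_density(rho_polymer, rho_filler, w_filler):
    """Theoretical density of a polymer/wood flour compound, assuming an
    inverse rule of mixtures over weight fractions (this exact formula is
    an assumption; the chapter only says the calculation used the component
    densities and their ratio)."""
    w_polymer = 1.0 - w_filler
    return 1.0 / (w_polymer / rho_polymer + w_filler / rho_filler)

# PP (0.875 g/cm^3) with 40 wt% of wood flour fraction 2 (1.17 g/cm^3)
print(round(compound_density(0.875, 1.17, 0.40), 3))
```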
18.3.2 Comparison of compounds obtained in a Brabender mixer and an injection-molding machine The difference in mixing by injection molding and by the Brabender method influences not only the density of compounds but also their mechanical properties. Despite the fact that the tensile strength of the control compound (unmodified polypropylene) molded in the injection machine was
Fig. 18.1 Density of (a) polypropylene and (b) SBS in elastomer compounds for different blending methods.
higher than that of Brabender-mixed specimens, most of the other compounds were weaker (see Figure 18.2). The primary influential parameters have been discussed above. In addition to pressure and blending time, temperature is also an important technological parameter. The mixing temperature of the Brabender mixer was 185–190o C, which is below the temperature of wood degradation (200o C) [7]. The injection-molding temperature varied for polypropylene from 185o C in the center zone to 200o C in the nozzle.
Fig. 18.2 Comparison of tensile strength of the compounds obtained in an injectionmolding machine and in a Brabender mixer.
For the injection of SBS elastomer, the temperatures in the barrel were set 10–15o C higher. The observations of the injection molding of polymer–wood flour mixtures showed that stability of this process depends on the filler / polymer ratio and the temperature. When the content of wood flour was below 40%, the process and the quality of the molded specimens were both quite stable. When the content of wood flour exceeded 50%, its volume exceeded the polymer volume and it became hard to obtain a consistent quality. In addition, because wood flour is hard and does not melt during the process, the friction between metal parts of the machine (screw, barrel) and the wood flour particles is very high, which also prevents the polymer matrix from forming a continuous phase. Therefore it was visually detected that the distribution of wood flour in the polymer was not regular (for example, particle agglomerates were observed) when the wood filler content was greater than 50% weight. The maximum content of wood flour in the following experiments was maintained at 40%. During the injection-molding experiments it was noticed that higher temperatures led to the formation of vapors in the compounds. It was accounted for by decomposition of the wood flour, which is known to start at around 200o C [7]. In such cases it was difficult or even impossible to obtain good specimens, despite high mold pressure. Thus we were unable to mold mixtures of nylon with wood flour, because nylon requires higher injection temperatures (240–260o C). Therefore the direct feeding of polymer–wood flour mixtures into an injection-molding machine can be done only when the wood flour content is less than 40% and the polymer has a melting point below 200o C.
18.3.3 Compatibilization of the polymer matrix and wood flour The mechanical properties of the wood-filled compounds were found to depend not only on the wood flour content but also on the fraction (particle size). Thus a maximum reduction in tensile strength was observed in the case of blends containing wood flour fraction 1, which contained the largest particle size (600–850 microns). It was found when the PP and SBS blends contained 40% of wood flour fraction 1, tensile strength losses of 72% and 60% in respect to the control samples occurred (see Figure 18.3). The best strength (15.2 MPa) of SBS compounds was achieved when they contained fraction
Fig. 18.3 Influence of wood flour fractions and the modifier on the tensile strength of injection-molded specimens of the (a) PP and (b) SBS compounds.
4 (particles <150 microns). The PP compound containing 40% of fraction 2 had the best mechanical properties, although this result was practically within the statistical deviations (8%) for the strength of the compounds containing fractions 3 and 4. Similar results for PP compounds were described in [8]. The influence of the particle size on the mechanical properties of wood flour–polymer compounds can be explained by incompatibility between the polymer matrix and the filler, and the differences in their physical and mechanical properties. The wood particles differ from the polymers by their chemical nature [4], and good adhesion between them cannot be achieved. Therefore a large proportion of the filler in the compounds prevents the matrix from forming a continuous phase, which leads to a reduction of the mechanical properties of the blends [8, 15]. In the case of bigger particles, they play a role of concentration points where deformation and strain occur. Coupling agents and chemically modified polymers were introduced to improve interfacial adhesion between the polymers and fillers [9]–[13]. In this research, maleic anhydride grafted PP and SEBS were used as modifiers for PP and SBS blends respectively. As shown above (see Figures 18.1 and 18.2) PP–MA was able to improve the properties of the compounds. Figure 18.3 demonstrates that PP–MA led to the improvement and stabilization of mechanical properties [12, 13] for injection-molded compounds regardless of the wood fraction. The modified SBS compounds had similar properties to the control specimens. Mixing wood flour with a polymer base and modifiers in the Brabender mixer (see Figure 18.2) gave even better results and allowed for the loss of mechanical properties of the compounds containing a high content of wood flour [12, 13]. The improved compatibility between the polymer matrix and the wood flour particles led to homogeneous morphology of the compounds [9, 11, 14]. Thus maleated polymers provided a compatibilization effect in the filled compounds.
18.3.4 Optimization of the compositions The introduction of wood flour as a filler and maleated polymers as modifiers to the polymer base resulted in opposing technical and economic effects [12, 13]. The relative cost of the PP compounds decreased by 40–50% as the content of the wood filler increased. At the same time, the tensile strength of these compounds dropped to 40%. Increasing the PP–MA content up to a certain point allowed a 50% improvement in mechanical properties; however, the cost of the compounds was 2–3 times higher. This example demonstrates the necessity to optimize these wood-filled compositions. The relative cost of the PP and SBS compounds was calculated as a ratio of the actual cost of the compound to the cost of the control (pure) material.
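The relative cost defined above reduces to a weighted sum of raw-material prices. The following is a minimal illustrative sketch of such a calculation, not taken from the study; all unit costs and the example composition are hypothetical placeholders.

```python
# Illustrative sketch: relative cost of a compound, defined above as the
# raw-material cost of the blend divided by the cost of the control (pure) polymer.
# All unit costs and the example composition below are hypothetical.

def relative_cost(fractions, unit_costs, base_polymer):
    """fractions: mass fraction of each component (summing to 1);
    unit_costs: cost per unit mass of each component;
    base_polymer: key of the control (pure) material."""
    blend_cost = sum(fractions[c] * unit_costs[c] for c in fractions)
    return blend_cost / unit_costs[base_polymer]

unit_costs = {"PP": 1.0, "wood_flour": 0.1, "PP_MA": 3.0}   # assumed prices
blend = {"PP": 0.5, "wood_flour": 0.4, "PP_MA": 0.1}        # 40% wood flour, 10% PP-MA
print(relative_cost(blend, unit_costs, "PP"))               # < 1 means cheaper than pure PP
```

A candidate composition can then be screened by computing this ratio alongside its measured tensile strength, which is the trade-off examined in Figure 18.4.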
Fig. 18.4 Relative cost of the (a) PP and (b) SBS compounds depending on the content of wood flour and maleated polymers.
The calculations were based on the cost of raw materials and their content in the compounds. Figure 18.4 shows that PP compounds are cheaper than the control sample (with a relative cost of less than 1) where they contain a considerable amount of wood flour and less than 17% PP–MA. For SBS compounds, the equilibrium cost threshold was much higher. It follows from Figure 18.4 that it is possible to introduce 50% SEBS–MA to the compound containing 50% wood flour without increasing the cost of the modified compound. The difference between the optimum cost of the PP and SBS compounds is caused by the difference in the cost of raw materials. Thus the cost of the SBS elastomer is ∼ 3 times higher than the cost of virgin PP and the cost of maleated SEBS is much higher than the cost of PP–MA. Under these conditions, the application of a cheap filler such as wood flour provides an economic effect, allowing the manufacturers greater flexibility with the composition. It is necessary to say that in addition to financial savings, the use of wood flour can provide plastics companies with the benefit of an aesthetically pleasing, natural looking, wood-filled finish. Figure 18.5 provides an indication of the decorative properties possible for PP compounds containing 40% of different grades of wood flour. It can be noticed that rough grades (fraction 1 and 2) provide a more natural wood look to the plastics than do fractions 3 and 4. Therefore wood flour with particle sizes in the range from 300 to 850 microns can be recommended for use in decorative plastics. Despite the fact that those fractions decrease the mechanical properties of the compounds, these properties may not necessarily be an essential criterion for decorative parts and components. It should be possible to find an optimal balance between the properties and the cost of the compounds in the way described above.
Fig. 18.5 Photographs of the PP compounds containing 40% wood flour of different fractions (Fraction 1, Fraction 2, Fraction 3 and Fraction 4).
18.4 Conclusions The results of this research clearly demonstrate that it is possible to eliminate a mixing operation and make PP and SBS products with wood flour content up to 40% directly by injection molding. Due to its low degradation temperature, wood flour cannot be blended with polymers at temperatures higher than 200°C. This limitation has to be considered when selecting a polymer. Our research has also shown that in addition to conventional plastics, thermoplastic elastomers can be filled with wood flour. The combination of compatible wood flour–filled plastic and elastomer materials can be used in two-component injection-molding technology. Lower mechanical properties of the wood fiber–filled products can be compensated for using maleic anhydride grafted polymers. The properties of the wood flour–polymer compounds can be optimized from the point of view of their mechanical properties and cost. Due to the different influences of wood flour fractions on the properties of the polymer compounds, they can be applied in different ways. Thus wood flour grades with particle sizes less than 300 microns can be used as extenders to replace expensive plastic materials, which allows for economic savings. The
grades with particles in the range of 300 to 850 microns can be used in decorative plastics.
References 1. Modern Plastics Encyclopedia, Eds G. Graff, K. Kreiser (McGraw–Hill, New York, 1989). 2. D. N. Saheb and J. P. Jog, Natural fiber polymer composites: A review, Adv. Polymer Tech. 18(4) (1999), 351–363. 3. M. Y. Fuad Ahmad et al., Rice husk and oil palm wood flour as fillers in polypropylene composites: Materials characterization and mechanical properties evaluation, in Proceedings of the 4th International Conference on Composites Engineering, 5–11 July 1998, Ed. M. L. Scott (Woodhead Publishing, Cambridge U. K.), 337–338. 4. D. Ruys, A. Crosky and W. J. Evans, Natural bast fibre structure, in Proceedings of the 3rd Conference on Technology Convergence in Composites Applications (ACUN3), 6–9 February 2001, Eds S. Bandyopadhyay, N. Gowripalan, N. Drayton (University of New South Wales, Sydney, 2001) 468–472. 5. A. S. Blicblau, S. Laird and R. S. P. Couts, Air cured sisal strand reinforced cement sheet, in Proceedings of the 3rd Conference on Technology Convergence in Composites Applications (ACUN-3), 6–9 February 2001, Eds S. Bandyopadhyay, N. Gowripalan, N. Drayton (University of New South Wales, Sydney, 2001) 447–451. 6. M. S. Sreekala and S. Thomas, Accelerated effects in oil palm fibre reinforced phenol formaldehyde composites, in Proceedings of the 3rd Conference on Technology Convergence in Composites Applications (ACUN-3), 6–9 February 2001, Eds S. Bandyopadhyay, N. Gowripalan, N. Drayton (University of New South Wales, Sydney, 2001) 461–467. 7. N. E. Marcovich, M. M. Reboredo and M. I. Aranguren, Modified woodflour as thermoset fillers: II. Thermal degradation of woodflours and composites, Thermoch. Acta 372 N1–2 (2001), 45–57. 8. N. M. Stark and M. J. Berger, Effect of particle size on properties of wood–flour reinforced polypropylene composites, in Proceedings of the Fourth International Conference on Woodfibre–Plastic Composites, 12–14 May 1997, (Forest Product Society, Madison, Wisconsin, 1997) 134–143. 9. K. Oksman and C. Clemons, Effects of elastomers and coupling agent on impact performance of wood flour–filled polypropylene, in Proceedings of the Fourth International Conference on Woodfibre–Plastic Composites, 12–14 May 1997, (Forest Product Society, Madison, Wisconsin, 1997) 144–155. 10. M. N. Angles, J. Salvado and A. Dufresne, Steam–exploded residual softwood–filled polypropylene composites, J. Appl. Polymer Sci. 74(8) (1999), 1962–1977. 11. M. Kazayawoko, J. J. Balatinecz and L. M. Matuana, Surface modification and adhesion mechanisms in woodfiber-polypropylene composites, J. Mat. Sci. 34(24) (1999), 6189–6199. 12. J. Z. Lu, Q. L. Wu and H. S. McNabb, Chemical coupling in wood fiber and polymer composites: A review of coupling agents and treatments, Wood Fib. Sci. 32(1) (2000), 88–104. 13. R. Mahlberg et al. Effect of chemical modification of wood on the mechanical and adhesion properties of wood fiber/polypropylene fiber and polypropylene/veneer composites, Holz Als Roh-und Werkstoff 59(5) (2001), 319–326. 14. S. B. Elvy, G. R. Dennis and L. T. Ng, Effects of coupling agent on the physical properties of wood-polymer composites, J. Mat. Proc. Tech. 48(1–4) (1995), 365–371.
15. B. J. Lee, A. G. McDonald and B. James, Influence of fiber length on the mechanical properties of wood-fiber/polypropylene prepreg sheets, Mat. Res. Innov. 4(2–3) (2001), 97–103. 16. J. J. Balatinecz and M. M. Sain, The influence of recycling on the properties of wood fibre plastic composites, Macro. Symp. 135 (1998), 167–173.
Chapter 19
Constrained spanning, Steiner trees and the triangle inequality Prabhu Manyem
Abstract We consider the approximation characteristics of constrained spanning and Steiner tree problems in weighted undirected graphs where the edge costs and delays obey the triangle inequality. The constraint here is in the number of hops a message takes to reach other nodes in the network from a given source. A hop, for instance, can be a message transfer from one end of a link to the other. A weighted hop refers to the amount of delay experienced by a message packet in traversing the link. The main result of this chapter shows that no approximation algorithm for a delay-constrained spanning tree satisfying the triangle inequality can guarantee a worst case approximation ratio better than Θ(log n) unless NP ⊂ DTIME(n^{log log n}). This result extends to the corresponding problem for Steiner trees which satisfy the triangle inequality as well. Key words: Minimum spanning tree, maximum spanning tree, triangle inequality, Steiner tree, APX, approximation algorithm, asymptotic worst case ratio
19.1 Introduction Consider a network G = (V, E) where a certain node (the source or the speaker) broadcasts messages to all the other nodes (the destinations or the receivers) in the network. When a broadcast occurs, suppose the network links through which the message is relayed need to be leased for a given non-negative cost cij , where i and j are the end nodes of the link (i, j) ∈ E. Prabhu Manyem Centre for Industrial and Applied Mathematics, University of South Australia Mawson Lakes SA 5095, AUSTRALIA∗ e-mail: [email protected] ∗ Currently
at The University of Ballarat.
A feasible solution to this single source broadcast problem is a set of leased links so that the message from the specified source reaches all destinations, and the message passes through each (intermediate) node at most once – because a receiver hearing the same piece of message more than once could become confused. In other words, there should be no loops or cycles in the feasible solution. (Any solution with loops can be modified by removing a few edges to break the loops, with no increase in cost. Hence we consider only acyclic solutions.) A piece of message (or data) is known as a packet in telecommunications terminology. The cost of a solution is the sum of the costs of the leased links, and an optimal solution is one which minimizes this overall cost. This broadcast problem can be modeled as a MinST (minimum spanning tree) and the solution is a tree rooted at the pre-defined source s. As opposed to the broadcast problem, in the multicast problem, the message needs to be sent only to a select group of nodes in the network, known as the multicasting group. Just as the broadcast version lends itself to a spanning tree formulation, the multicast version lends itself to a Steiner tree formulation. Suppose we add the following constraint to the broadcast problem: that the number of hops taken by a message to reach any destination from a given source vertex s is bounded by a threshold value Δ. A hop can be defined as a message transfer from one end of a link to the other. We call this the hop-constrained spanning tree problem or HCSP. This problem has been shown to be NP-hard [12]. A variation of the HCSP is the DCSP, the diameter-constrained spanning tree problem, where the diameter (the number of edges in the longest path of the solution obtained) obeys an upper bound of Δ. The DCSP is also NP-hard [4]. The CSP, the delay-constrained spanning tree problem, is a generalization of the HCSP. Each edge in the network has two distinct parameters: (1) a cost cij and (2) a delay dij. (Here, delay refers to the amount of delay experienced by a message packet in traveling from one end of the link to the other. The total delay in a link can be broken down into transmission delay, switching delay and queueing delay, of which transmission delay is usually predominant.) The delay parameter can be considered to be a weighted hop. If in a CSP the delay is set to one for each edge, one obtains an HCSP. For a given minimization problem P, let A be an approximation algorithm and PI the set of instances in P. For a given instance I ∈ PI, let the cost of the solution obtained by A be AI. Let the cost of the optimal solution for I be OPTI. Then the approximation ratio of A for instance I is RA,I = AI/OPTI. Over all instances I ∈ PI, the absolute performance ratio is defined as [4]:

RA = inf {r ≥ 1 : RA,I ≤ r for all I ∈ PI}.   (19.1)
The lower the value of RA , the better the heuristic A. A constant value of RA is superior to a value that depends on the size of instances, for example,
RA ∈ Θ(n) or RA ∈ Θ(log n). Lund and Yannakakis [7] show that the SET COVER problem cannot be in the class APX, which is the class of problems for which it is possible to construct a polynomial time heuristic A that guarantees a constant value on RA . Feige [3] showed that unless NP ⊂ DTIME(nlog log n ), the SET COVER problem cannot be approximated to within Θ(log n). Manyem and Stallmann [9] have shown that an HCSP, and hence a CSP too, cannot be in the complexity class APX. Results from [2] indicate that a DCSP is unlikely to be in APX. Heuristics for the Steiner tree version of the problem with general costs and weighted hops appear in Manyem [8]. Marathe, Ravi et al. [10] consider networks with both cost and delay parameters on the edges. They provide an approximation algorithm that guarantees a diameter within O(log |V |) of the given threshold Δ and a total cost within O(log |V |) of the optimum. A vast compendium of results on approximability is provided in Ausiello et al. [1]. Figure 19.1 provides a road map of some of the optimization problems that arise in telecommunication networks. Here S is the set of terminal nodes for Steiner tree problems. In multicasting terminology, S is the set of conference nodes. Problem 1, the Constrained Steiner Tree (CST), is the hardest in the
Fig. 19.1 A Constrained Steiner Tree and some of its special cases.
figure – all other problems are special cases of CSTs. Given an instance of a CST, if we set the multicast group to be all nodes in the network, we obtain the Constrained Spanning Tree (Problem 3). Given an instance of a CST, if we set the multicast group to be just two nodes which need to communicate with each other, we obtain Problem 2, the Constrained Shortest Path. Problem 7 is a Hop-Constrained Spanning Tree, and Problem 4 is a Hop-Constrained Steiner Tree. All problems above the dotted line in Figure 19.1 are NP-complete, and the ones below can be solved in polynomial time. Positive results from one problem to another flow in the direction of the arrows, and negative results flow in the direction against that shown by the arrows. For example, if we can develop a heuristic for Problem 1 that guarantees an upper bound B on the approximation ratio over all instances, this will also hold true for all problems in the figure. On the other hand, if we can show (a negative result) that unless NP ⊂ DTIME(n^{log log n}), there can be no heuristic that guarantees an upper bound of B for Problem 7, then this will also be true for Problems 1, 3 and 4. See Table 19.1 for further details.

Table 19.1 Constrained Steiner Tree and special cases: References

Problem Number in Figure 19.1    Results and References
1, 3–5                           [8] and [9]
2                                [5]
4                                [2], [8] and [9]
6                                Shortest path problem
7                                [2], [12] and Problem ND4 in [4]
8                                [8] and [9]
9, 11                            Problem ND30 in [4]
10                               [12] and Problem ND4 in [4]
12                               Breadth First Search
The proof in [12] that Problem 10 in Figure 19.1 is NP-hard renders Problems 1, 3, 4 and 7 NP-hard as well. Similarly, the proofs in [9] show that unless NP ⊂ DTIME(n^{log log n}), Problems 7 and 8 cannot be approximated to better than Θ(log n). Hence this non-approximability result carries over to Problems 1, 3, 4 and 5. In this chapter, we consider special cases of CSPs and HCSPs where the edge costs and delays obey the triangle inequality (we call these problems CSPI and HCSPI, respectively). First, in Section 19.2, we show that the cost of spanning tree solutions for a CSPI and an HCSPI in a given network G = (V, E) is at most |V| − 1 times the cost of any other spanning tree solution for G. This implies that any solution is within a |V| − 1 factor of the optimal solution. Next, in Section 19.3, we prove that the lower bound for any approximation algorithm for a CSPI is Θ(log n). Unless NP ⊂ DTIME(n^{log log n}), no
approximation algorithm can guarantee an RA better than this. We show this by an E-Reduction (explained in Section 19.3.1) from the SET COVER problem.
19.2 Upper bounds for approximation We first show that for a given network G = (V, E) with non-negative costs cij on undirected edges (i, j) ∈ E, the value of a spanning tree is at most |V | − 1 times that of any other. We shall assume that the underlying graph is complete without loss of generality (if the network is not complete, we can add edges to the network with costs that obey the triangle inequality). We start with a well-known result for such graphs. Remark 1. For any two vertices i and j in G, where the edge costs of G obey the triangle inequality, the edge (i, j) is also a least expensive path in G between these two vertices.
19.2.1 The most expensive edge is at most a minimum spanning tree We show here that Lmax , the cost of the most expensive edge in E, is at most equal to Tmin , the cost of a MinSTG (minimum spanning tree for G). Let the endpoints of the most expensive edge be s and t. Let L1 be the cost of the s − t path using the edges in MinSTG . From Remark 1, it follows that cst = Lmax ≤ L1 . Since L1 ≤ Tmin , we conclude that Lmax ≤ Tmin . Remark 2. In a network G where the edge costs obey the triangle inequality, the cost of the most expensive edge in G is at most the cost of a minimum spanning tree of G.
19.2.2 MaxST is at most (n − 1)MinST Let Tmax be the value of a MaxSTG (maximum spanning tree of G) where |V | = n. Since Lmax is the cost of the most expensive edge in G, it follows that Tmax ≤ (n − 1)Lmax . From Remark 2, Lmax ≤ Tmin . Thus Tmax ≤ (n − 1)Tmin , which is what we set out to show in this section. Remark 3. For a given undirected network G = (V, E) which satisfies the triangle inequality, the ratio of the costs of MaxSTG to those of MinSTG has an upper bound of |V | − 1. Hence the ratio of the costs of any two spanning trees in G has this upper bound.
Remark 4. For undirected networks where the edge costs obey the triangle inequality, the performance ratio RA for any approximation algorithm A has an upper bound of |V | for any version of the spanning tree problem that has the objective of minimizing the sum of the edge costs in the feasible solution. In particular, the above remark is true for CSPI s and HCSPI s.
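The bound in Remark 3 is easy to check numerically. The following sketch is an illustration, not part of the chapter: it samples random Euclidean point sets, for which the triangle inequality holds automatically, and verifies that a maximum spanning tree never costs more than |V| − 1 times a minimum spanning tree.

```python
# Illustrative check of Remark 3: on complete graphs whose edge costs are
# Euclidean distances (so the triangle inequality holds), the cost ratio of a
# maximum to a minimum spanning tree never exceeds |V| - 1.
import math, random

def spanning_tree_cost(points, maximize=False):
    """Prim's algorithm on the complete Euclidean graph; maximize=True gives a MaxST."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    cost = 0.0
    best = {j: dist(0, j) for j in range(1, n)}   # best edge from the tree to each vertex
    while best:
        pick = (max if maximize else min)(best, key=best.get)
        cost += best.pop(pick)
        for j in best:
            d = dist(pick, j)
            if (d > best[j]) if maximize else (d < best[j]):
                best[j] = d
    return cost

random.seed(0)
for _ in range(100):
    n = random.randint(3, 20)
    pts = [(random.random(), random.random()) for _ in range(n)]
    t_min = spanning_tree_cost(pts)
    t_max = spanning_tree_cost(pts, maximize=True)
    assert t_max <= (n - 1) * t_min + 1e-9
print("Remark 3 held on all sampled instances")
```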
19.3 Lower bound for a CSP approximation An upper bound on the performance ratio RA for any approximation algorithm for a CSPI and an HCSPI is provided in Remark 4. Let us now turn to proving a lower bound for a CSPI. We show in this section that unless NP ⊂ DTIME(n^{log log n}), there can be no heuristic that can guarantee a performance ratio better than Θ(log n) for a CSPI. We show this by an E-Reduction from a SET COVER to a CSPI. Since this lower bound holds for a SET COVER [3, 7], it does so for a CSPI too, via E-Reduction. Recall that a CSPI is the version of a CSP where the edge costs obey the triangle inequality.
19.3.1 E-Reductions: Definition If problem A E-reduces to problem B, then B is as hard to approximate as A. The formal definition of E-Reduction is as follows. Definition 1 (E-reduction [6]). A problem A E-reduces to a problem B, or A ≤E B, if there exist polynomial time functions f and g and a constant β such that (1) f maps an instance I of A to an instance J of B; and (2) g maps solutions T of J to solutions S of I such that

ε(I, S) ≤ β ε(J, T),   (19.2)

where the error term ε(I, S) is defined below. Definition 2 (Error [6]). For minimization problems, a solution S to an instance I has error ε(I, S) if

V(I, S)/opt(I) = 1 + ε(I, S),   (19.3)

where V(I, S) is the value of a solution S to instance I and opt(I) is the value of an optimal solution to I.
Another type of reduction used in approximability theory is the L-Reduction introduced in [11].
19.3.2 SET COVER A SET COVER instance is defined by a ground set X = {xi | 1 ≤ i ≤ p} and a collection Y = {yj | 1 ≤ j ≤ q}, each yj being a subset of X. The goal is to find a cover Y′ of X such that (a) Y′ ⊆ Y, (b) |Y′|, the cardinality of Y′, is minimal, and (c) Y′ satisfies ∪yj∈Y′ yj = X.
19.3.3 Reduction from SET COVER For a CSPI , a spanning tree needs to be determined such that (1) its cost is minimal and (2) the sum of the edge delays in the path from a specified vertex s ∈ V (the source) to every other vertex in V is at most Δ, a non-negative integer. We create an instance of a CSPI as follows (see Figure 19.2). For each xi ∈ X and yj ∈ Y in SET COVER, create a vertex. Create an additional vertex s. Thus |V | = |X| + |Y | + 1. Since |V | = n, |X| = p and |Y | = q, we
Fig. 19.2 A CSPI instance reduced from SET COVER (not all edges shown).
have n = p + q + 1. The edges in E in the instance G = (V, E) of the CSPI are assigned as in Table 19.2. The costs (delays) assigned to the edges are given in Column 3 (Column 4) of the table respectively. The graph G is complete. However, in the interests of clarity, not all edges are shown in Figure 19.2. Only edge costs are shown in the figure, not edge delays.

Table 19.2 E-Reduction of a SET COVER to a CSPI: Costs and delays of edges in G

Edge Set   Definition                                       Cost   Delay
E1         {(s, yj) | 1 ≤ j ≤ q}                            n      1
E2         {(s, xi) | 1 ≤ i ≤ p}                            n+1    2
E3         {(yj, xi) | xi ∈ yj, 1 ≤ i ≤ p, 1 ≤ j ≤ q}       1      1
E4         {(yj, xi) | xi ∉ yj, 1 ≤ i ≤ p, 1 ≤ j ≤ q}       1      2
E5         {(yi, yj) | 1 ≤ i < j ≤ q}                       1      1
E6         {(xi, xj) | 1 ≤ i < j ≤ p}                       1      1
Let Δ = the delay constraint at all vertices = 2. Note that both the edge costs and the edge delays in G obey the triangle inequality. (In most cases, both the cost and the transmission delay of an edge directly relate to its length. Further, queueing and switching delays are usually minor. Hence the cost cij and delay dij of an edge are closely related (they could be directly proportional, for example). However, there may be instances where the increase in edge delay is significantly faster than that of edge cost. For instance, due to a high degree of congestion in the network, queueing and switching delays could be far higher than normal.) From Table 19.2, the total set of edges of the graph G is given by E = E1 ∪ E2 ∪ · · · ∪ E6.
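The construction is mechanical, and the following is a minimal sketch of it, not taken from the chapter: given a SET COVER instance it builds the complete graph with the costs and delays of Table 19.2. The data structures and names are illustrative choices only.

```python
# Sketch of the reduction of Section 19.3.3: build the CSP_I instance of Table 19.2
# from a SET COVER instance (ground set X, collection Y of subsets of X).
def build_cspi_instance(X, Y):
    """Returns (vertices, cost, delay); cost/delay map unordered vertex pairs to numbers."""
    n = len(X) + len(Y) + 1                          # |V| = p + q + 1
    vertices = ["s"] + [("y", j) for j in range(len(Y))] + [("x", i) for i in range(len(X))]
    cost, delay = {}, {}

    def put(u, v, c, d):
        cost[frozenset((u, v))] = c
        delay[frozenset((u, v))] = d

    for j, yj in enumerate(Y):
        put("s", ("y", j), n, 1)                     # E1: cost n, delay 1
        for i, xi in enumerate(X):
            if xi in yj:
                put(("y", j), ("x", i), 1, 1)        # E3: covering edges
            else:
                put(("y", j), ("x", i), 1, 2)        # E4: non-covering edges
    for i in range(len(X)):
        put("s", ("x", i), n + 1, 2)                 # E2
        for i2 in range(i + 1, len(X)):
            put(("x", i), ("x", i2), 1, 1)           # E6
    for j in range(len(Y)):
        for j2 in range(j + 1, len(Y)):
            put(("y", j), ("y", j2), 1, 1)           # E5
    return vertices, cost, delay

# Small example: X = {1, 2, 3}, Y = {{1, 2}, {3}}; the delay bound is Delta = 2.
V, c, d = build_cspi_instance([1, 2, 3], [{1, 2}, {3}])
```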
19.3.4 Feasible Solutions Recall that the delay constraint is equal to 2. It is possible for the (s, xi) edges to be utilized in a feasible solution – if they are, they can be replaced with no increase in cost, as shown below. Note that there can be no paths of the form s − xi − yj nor of the form s − yj − xi, where (xi, yj) ∈ E4; that is, when xi ∉ yj in the SET COVER problem. This is due to the high delay (2 units) of such edges. In either of the paths just mentioned, the leaf vertex would experience a delay of 3. Suppose for xi = x0, the edge (s, x0) is in the feasible solution S0 returned by a heuristic. This edge can be replaced as follows. For any yj ∈ Y, the edge (x0, yj) is not in the feasible solution S0, otherwise the delay at such a yj would be 3, violating the delay constraint. There are two possible cases here: • Suppose there exists a y0 such that x0 ∈ y0 in SET COVER, and edge (s, y0) ∈ S0. Then replace (s, x0) with (x0, y0) to obtain a new solution S1. Observe that cost[S0] − cost[S1] = n units (the cost decreases).
• Alternatively, such a y0 may not exist. In any case, there is at least one edge (x0 , yj ) in G for some yj ∈ Y , otherwise no feasible solution is ever possible for the CSPI . This is due to the fact that x0 ∈ yj for at least one yj ∈ Y in SET COVER. We name this yj , y1 . As per our assumption for this case, the edge (s, y1 ) is not in S0 . This implies that y1 is a leaf vertex in S0 , and has another yj (say y2 ) as its parent. To obtain a new solution S1 , we can delete the edges (y1 , y2 ) and (s, x0 ), and replace them with (s, y1 ) and (x0 , y1 ). In S1 , the vertex x0 remains a leaf, but y1 is no longer a leaf. The delay constraints are still obeyed at all vertices. We have cost[S0 ] − cost[S1 ] = cost[(s, x0 )] + cost[(y1 , y2 )] − cost[(s, y1 )] − cost[(x0 , y1 )] = (n + 1) + 1 − n − 1 = 1 (that is, the cost decreases). To obtain S1 from S0 , at most |X| edges of the form (s, xi ) need to be replaced, and each replacement takes a constant amount of time. Thus S1 can be obtained from S0 in time Θ(|X|) = O(n), a time polynomial in the number of elements in the ground set of SET COVER. Once all edges of the form (s, xi ) have been eliminated from S0 , the resulting feasible solution S1 will be as described below.
19.3.4.1 Structure of S1 The parents of the x’s in an FS have to be y’s — such y’s should in turn have s as their parent. Also due to the delay constraint, a path such as s−yj −yr −xi can also be ruled out for any 1 ≤ i ≤ p and 1 ≤ j < r ≤ q. Not all y’s need to have s as their parent – some of the y’s can have another y (say ya , for example) as their parent, as long as ya ’s parent is s (recall the delay constraint of 2). Suppose we call a y such as ya a covering y and the rest non-covering y’s. The covering y’s together form a cover to the x’s – these y’s may or may not be leaves in an FS. The non-covering y’s will be leaves. In other words, a yj is • in the cover if s is yj ’s parent, and • not in the cover otherwise. In such a case, the parent of yj would be a covering y. The delay constraint forbids a non-covering yj to be the parent of an xi in an FS. It is sufficient for all the non-covering y’s to have a common parent. Let the cover size (the number of covering y’s) be k. If in Figure 19.2, we move the cover to the left (the y’s can be renumbered in such a way that y1 through yk cover all elements in X), a feasible solution as described above will look like the one in Figure 19.3.
Fig. 19.3 Feasible solution for our instance of CSPI (not all edges shown).
Note that it is cheaper for the non-covering y’s to have one of the covering y’s as their parent, rather than s – cheaper by a factor of n. The spanning tree (the feasible solution in Figure 19.3) includes the following: • (s, yj ) edges: k in number, each with a cost of n, • edges of the form (yi , yj ), where (s, yi ) is part of the FS, and (s, yj ) is not (in other words, yi is in the cover and yj is not): q − k such edges, each with unit cost, • (xi , yj ) edges: p in number, each with unit cost. Thus the cost of the spanning tree of Figure 19.3 equals k(n) + (q − k)(1) + (p)(1) = kn + n − k − 1, since n = p + q + 1. The cost then, can be described as a function C(k), where C(k) = kn + n − k − 1, 1 ≤ k ≤ q. (19.4)
19.3.4.2 Correspondence Between Feasible Solutions Note that there is a one-to-one correspondence between feasible solutions in SET COVER and S1 for our instance of the CSPI . A set of covering y’s in our CSPI instance can also be used as a cover in SET COVER. In the other direction, a cover in SET COVER can be transformed to a set of covering y’s in our CSPI instance; these will have s as their parent in a FS. The other (non-covering) y’s will have one of the covering y’s as their parent, and the x’s will have the covering y’s as their parent(s).
For this reduction, I is a SET COVER instance, S is a solution to I, J is our instance of CSPI corresponding to I, and T is a solution to J. From the above argument, we have the following lemma. Lemma 1. A SET COVER instance I has a solution with cardinality k (1 ≤ k ≤ q) iff the corresponding CSPI instance J has a solution with a total cost C(k). For any approximation algorithm for CSPI to obtain the least possible cover size, the costs C(k) should monotonically increase from C(1) through C(q), and this is indeed the case with Equation (19.4). Further note that the reduction from SET COVER can be carried out in polynomial time. To complete the proof that this is an E-reduction, we only need to show that the error condition (19.2) is satisfied for some constant β.
19.3.5 Proof of E-Reduction Let k (≤ q) be the value of any feasible solution to SET COVER, and l be that of the optimal solution. Obviously, l ≤ k ≤ q. Therefore

ε(I, S) = k/l − 1 = (k − l)/l

and from (19.4),

ε(J, T) = C(k)/C(l) − 1 = (kn + n − k − 1)/(ln + n − l − 1) − 1 = (n − 1)(k − l)/(nl + n − l − 1).

We need to find a constant β such that βε(J, T) ≥ ε(I, S), or

β (n − 1)(k − l)/(nl + n − l − 1) ≥ (k − l)/l,

or

β ≥ 1 + 1/l.   (19.5)
The second term in (19.5), 1/l, is bounded by 0 < 1/l ≤ 1. Thus it is sufficient to find a β ≥ 2. Let us set β = 2. This completes the proof of E-Reduction. Thus we have shown that the following theorem holds. Theorem 1. SET COVER E-reduces to a CSPI. Corollary 1. A CSPI does not belong to APX. Further, CSPI cannot be approximated to within Θ(log n) unless NP ⊂ DTIME(n^{log log n}), where n is the number of nodes in the network.
19.4 Conclusions From Remark 4, it follows that certain versions of the minimum spanning tree problem that are of interest in data networking (which need not necessarily be single-source) have an approximation upper bound of |V |, the number of nodes in the network, when the edge costs obey the triangle inequality. In particular, • the hop-constrained version HCSPI , • the delay-constrained version CSPI , and • the diameter-constrained versions (weighted as well as unweighted) have an upper bound of |V | on the performance ratio of any approximation algorithm. The result from Section 19.3 extends to the case of constrained Steiner trees which satisfy the triangle inequality, since CSPI is a special case of such problems for Steiner trees. Specifically, we can conclude that the following theorem holds. Theorem 2. The following single-source problems with edge costs obeying the triangle inequality cannot have an approximation heuristic A that can guarantee a performance ratio RA better than Θ(log n) unless NP ⊂ DTIME(nlog log n ), and hence these problems cannot be in APX: • the delay-constrained spanning tree problem CSPI , and • the delay-constrained Steiner tree CSTI . The CSTI is the triangle-inequality version of Problem 1 in Figure 19.1. Both the edge costs and delays in the delay-constrained problem versions mentioned in this section need to obey the triangle inequality. Acknowledgments The author benefited from discussions with Matt Stallmann of North Carolina State University. Support from the Sir Ross and Sir Keith Smith Foundation is gratefully acknowledged. The comments from the referee were particularly helpful. Since the early 1990s, the online compendium of Crescenzi and Kann, and more recently, their book [1], has been a great help to the research community.
References 1. G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti–Spaccamela and M. Protasi, Combinatorial Optimization Problems and their Approximability Properties (Springer-Verlag, Berlin, 1999). 2. J. Bar–Ilan, G. Kortsarz and D. Peleg, Generalized submodular cover problems and applications, Theor. Comp. Sci. 250 (2001), 179–200. 3. U. Feige, A threshold of ln n for approximating set cover, JACM 45 (1998), 634–652. 4. M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP–Completeness (Freeman, New York, 1979).
5. R. Hassin, Approximation schemes for the restricted shortest path problem, Math. Oper. Res. 17 (1992), 36–42. 6. S. Khanna, R. Motwani, M. Sudan and U. Vazirani, On syntactic versus computational views of approximability, SIAM J. Comput. 28 (1998), 164–191. 7. C. Lund and M. Yannakakis, On the hardness of approximating minimization problems, JACM 41 (1994), 960–981. 8. P. Manyem, Routing Problems in Multicast Networks, PhD thesis, North Carolina State University, Raleigh, NC, USA, 1996. 9. P. Manyem and M.F.M. Stallmann, Approximation results in multicasting, Technical Report 312, Operations Research, NC State University, Raleigh, NC, 27695–7913, USA, 1996. 10. M. V. Marathe, R. Ravi, R. Sundaram, S. S. Ravi, D. J. Rosenkrantz and H. B. Hunt III, Bicriteria network design problems, J. Alg. 28 (1998), 142–171. 11. C.H. Papadimitriou, Computational Complexity (Addison-Wesley, Reading, MA, 1994). 12. H. F. Salama, Y. Viniotis and D. S. Reeves, The delay–constrained minimum spanning tree problem, in Second IEEE Symposium on Computers and Communications (ISCC’97), 1997.
Chapter 20
Parallel line search T. C. Peachey, D. Abramson and A. Lewis
Abstract We consider the well-known line search algorithm that iteratively refines the search interval by subdivision and bracketing the optimum. In our applications, evaluations of the objective function typically require minutes or hours, so it becomes attractive to use more than the standard three steps in the subdivision, performing the evaluations in parallel. A statistical model for this scenario is presented giving the total execution time T in terms of the number of steps k and the probability distribution for the individual evaluation times. Both the model and extensive simulations show that the expected value of T does not fall monotonically with k, in fact more steps may significantly increase the execution time. We propose heuristics for speeding convergence by continuing to the next iteration before all evaluations are complete. Simulations are used to estimate the speedup achieved. Key words: Line search, parallel computation
T. C. Peachey School of Computer Science and Software Engineering, Monash University, Clayton, VIC 3800, AUSTRALIA
D. Abramson School of Computer Science and Software Engineering, Monash University, Clayton, VIC 3800, AUSTRALIA
A. Lewis HPC Facility, Griffith University, Nathan, QLD 4111, AUSTRALIA
20.1 Line searches A line search involves finding the minimal value of a real function g of a single real variable x. We attempt to locate the minimizing argument to within a "tolerance." Formally, given an interval [a, b] ⊂ IR, a function g : [a, b] → IR
and a tolerance d, we require p, q such that x∗ ∈ [p, q] ⊂ [a, b] where g is minimal at x∗ and q − p ≤ d. We assume that the derivative, if it exists, is unknown. Apart from their use in one-dimensional optimization, line searches are used in optimization on domains of higher dimension. For example, the quasi-Newton search methods use repeated cycles of determining the search direction and then performing a line search in that direction. The line search algorithm is one of repeated subdivision of the interval and restriction to a subinterval. It can be summarized as follows:
1. Enter initial interval [a, b] and tolerance d.
2. Set p = a, q = b.
3. Subdivide [p, q] with points p = x0 < x1 < x2 < ... < xk = q, where k ≥ 3.
4. Compute gi = g(xi) for i = 0, 1, 2, . . . , k.
5. Select xm : gm = mini gi, the point where g is least.
6. If m = 0 replace p by x0 and q by x1, else if m = k replace p by xk−1 and q by xk, else replace p by xm−1 and q by xm+1.
7. If q − p ≤ d then return (p, q), else go to Step 3.
Clearly the algorithm will terminate if sup(xi − xi−1)/(q − p) < 1/2, where the supremum is taken over both steps in the line search and iterations of that search. The process yields an interval [p, q] which is guaranteed to contain the minimum if g is unimodal on [a, b]. Usually k is 3 as this is more efficient in terms of the number of function evaluations. It has long been known that the "Fibonacci search" [5] will minimize the number of function evaluations in the worst case. If g is approximately quadratic near the minimum then alternative methods such as Powell's [6] can be expected to be more efficient. We are concerned with applications where each function evaluation may take at least several minutes on a fast processor. For example, g may represent aerodynamic drag on an object where x is some shape parameter, so a flow simulation would be required for each function evaluation. Further, we assume that batches of evaluations may be performed concurrently, on a cluster of computers or using the resources of the global grid. Clearly in such cases the speed of convergence may be improved by using more than three steps in each subdivision. These "parallel line searches" are the subject of this chapter.
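For concreteness, a minimal Python sketch of this subdivision loop follows. It is an illustration only, not the Nimrod/O implementation; in the parallel setting the k + 1 evaluations of Step 4 would be dispatched concurrently rather than computed in a loop.

```python
# Illustrative sketch of the subdivision line search above (serial evaluation).
def line_search(g, a, b, d, k=3):
    p, q = a, b
    while q - p > d:
        xs = [p + i * (q - p) / k for i in range(k + 1)]   # equally spaced points
        gs = [g(x) for x in xs]                             # Step 4 (parallelizable)
        m = min(range(k + 1), key=gs.__getitem__)           # index of the least value
        if m == 0:
            p, q = xs[0], xs[1]
        elif m == k:
            p, q = xs[k - 1], xs[k]
        else:
            p, q = xs[m - 1], xs[m + 1]
    return p, q

import math
# The test function used later in the chapter: g(x) = exp(-x) sin(20x) on [0, 1].
print(line_search(lambda x: math.exp(-x) * math.sin(20 * x), 0.0, 1.0, 0.001, k=8))
```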
20.2 Nimrod/O Nimrod/O [1, 2] is an optimization package designed for the scenario described above, that is, long evaluation times employing multiple processors. The user prepares a “schedule file” such as the one in Figure 20.1. This specifies the problem parameters, any constraints linking them, how the objective
Fig. 20.1 A sample configuration file.

parameter alpha float range from 1 to 15
parameter tcmax float range from 0.5 to 1.5
parameter cmax float range from 0.5 to 1.0
constraint alpha >= tcmax + 2.0*cmax
task main
  copy * node:.
  node:substitute skeleton foil.inp
  node:execute run.all
  copy node:obj.dat output.$jobname
endtask
method simplex
  starts 5
  starting points random
  tolerance 0.01
  endstarts
endmethod
method bfgs
  starts 5
  starting points random
  tolerance 0.01
  line steps 8
  endstarts
endmethod
function is to be evaluated and the optimization algorithm to be used. This example uses two algorithms, the downhill simplex and the method of Broyden, Fletcher, Goldfarb and Shanno (BFGS), each run 5 times with different starting points. The architecture of Nimrod/O is shown in Figure 20.2. Rectangles represent separate processes. The Controller reads the schedule and launches a process for each optimization. When an optimization requires a set of objective evaluations, it first checks the Cache to determine which jobs have already been run. Jobs that are new are sent to the dispatcher which is either the “Nimrod” system [1] or its commercial version “enFuzion.” The dispatcher may run evaluations on the local machine, or on a cluster of machines or perhaps on the world grid. Note that this architecture allows separate optimizations to be run in parallel. Within each optimization we have endeavored to speed the algorithms by employing parallel evaluations where possible. For example our implementation of the BFGS algorithm uses a parallel line search and also concurrent evaluations in the determination of the search direction; we call this implementation “Parallel-BFGS.”
Fig. 20.2 Architecture of Nimrod/O (Controller, Cache, Optimizations 1–N, Dispatcher, Cluster).
Currently Nimrod/O is being applied in three areas:
• Design of an aerofoil. Here a two-dimensional aerofoil is specified in terms of three shape parameters. A FLUENT simulation is used to compute the flowfield around the aerofoil and compute lift and drag. The design problem is to determine the shape parameters that maximize the ratio of lift to drag.
• Optimal fatigue life. Finite element models are used to predict the life of mechanical components with pre-existing cracks under a cyclical stress regime. We require the component shape that maximizes this life.
• Image compression. We consider a compression method based on the mammalian vision system which involves up to 96 parameters. The parameters are to be selected to minimize the compression ratio.
It was noticed during the aerofoil study that the execution time for evaluations was bimodal. Most jobs took about 30 minutes but occasional ones required between 3 and 4 hours. Consequently some of the line searches had completed all but one of the evaluations in less than 40 minutes and then required about 3 more hours to finish the last one. (There was no obvious pattern to the values of the domain that gave rise to long execution times.) This raised two issues:
A: A smaller number of steps in the line search may achieve faster convergence as fewer jobs are less likely to include an exceptionally long one.
B: Faster completion may be provided by a mechanism for aborting longer jobs and proceeding to a subinterval identified by the completed jobs.
We consider Hypothesis A in Section 20.3 and B in Section 20.4.
20.3 Execution time 20.3.1 A model for execution time This section presents a model for the execution time for a line search, in terms of the number of steps used. Suppose that each iteration of the line search uses k ≥ 3 steps; we assume that the points are equally spaced. Let l be the length of the original search. Each iteration reduces the length of the current domain to a proportion 2/k of the previous (or 1/k if the minimum happens to fall at an end point). Let r iterations be the most required to reduce the length to the tolerance d, so r is the least integer such that l(2/k)^r ≤ d. Hence

r = ceil( log(l/d) / log(k/2) ),   (20.1)

where ceil(x) signifies the least integer that is not less than x. We write Ti for the evaluation time for the ith subdivision point and assume that all the Ti have the same probability density function f(t) and distribution function F(t). We write s for the number of evaluations required in an iteration. Note that, after the first iteration, subsequent ones will not require evaluations at the end points of the subinterval. Further, if k is even and the best point in the previous interval was internal, then the objective at the midpoint of the current interval will have been found in the previous iteration. So we approximate s by k − 2 if k is even and k − 1 if k is odd. As these evaluations are performed in parallel, the evaluation time for one iteration is B = maxi Ti. For the scenario discussed above these times are much larger than the times required for selection of the subdivision points and comparison of the values there. So we assume that the time for each iteration is just B. We assume also that the Ti are statistically independent. Under this condition, see for example [3], the distribution function for B is F(t)^s. Thus the mean time for completion of a batch is approximately

M = ∫_0^∞ t (d/dt)[F(t)^s] dt.   (20.2)

Hence the expected time for the complete optimization is

E = M r = ceil( log(l/d) / log(k/2) ) ∫_0^∞ t (d/dt)[F(t)^s] dt.   (20.3)
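As an aside, M (and hence E) can be estimated for any job-time distribution by simulating the maximum of s independent evaluation times, which is all that (20.2) expresses. The sketch below is illustrative only; it assumes exponentially distributed job times with mean 2, as used later in the chapter.

```python
# Illustrative sketch: estimate M of (20.2) by Monte Carlo as the expected maximum
# of s independent job times, and E of (20.3) as r * M.
import math, random

def expected_iteration_time(sample_job_time, s, trials=20_000):
    """Estimate M = E[max of s i.i.d. job times] by simulation."""
    return sum(max(sample_job_time() for _ in range(s)) for _ in range(trials)) / trials

def expected_search_time(sample_job_time, k, l_over_d):
    r = math.ceil(math.log(l_over_d) / math.log(k / 2))     # equation (20.1)
    s = k - 2 if k % 2 == 0 else k - 1                      # evaluations per iteration
    return r * expected_iteration_time(sample_job_time, s)  # equation (20.3)

random.seed(1)
exp_job = lambda: random.expovariate(0.5)   # exponential job times, mean 2 (assumed)
for k in (4, 8, 16, 32):
    print(k, round(expected_search_time(exp_job, k, 1000), 2))
```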
20.3.2 Evaluation time a Bernoulli variate As a model of the bimodal distribution encountered with the wing flow experiments, consider the case where the execution time for a single job has a discrete distribution
f(t) = a δ(t − x) + (1 − a) δ(t − y),   (20.4)

where δ is the Dirac delta and a, x and y are constants with 0 < a < 1 and x < y. Then (20.2) becomes

M = x a^s + y(1 − a^s)   (20.5)

and (20.3) becomes

E = [x a^s + y(1 − a^s)] ceil( log(l/d) / log(k/2) ).   (20.6)
Graphs of these functions are shown in Figure 20.3. Figure 20.3(a) shows how r decreases in a piecewise manner. Figure 20.3(b) gives M for the case x = 1, y = 8, l/d = 1000 and a = 0.9. Figure 20.3(c) shows E, the product of r and M . Since M increases and r is piecewise constant, E increases while r is constant.
Fig. 20.3 Performance with Bernoulli job times: (a) number of iterations, r; (b) expected time per iteration, M; (c) expected time for line search, E.
20.3.3 Simulations of evaluation time For other distributions of job times, computation of (20.2) becomes difficult so we have performed simulations instead. The line search was performed on the function g(x) = e^(−x) sin(20x) on the domain [0, 1], shown in Figure 20.4. This function has four local minima with a global minimum at x ≈ 0.2331. The tolerance used was 0.001. Job times were generated randomly from (a) an exponential distribution with parameter 2 and (b) a rectangular distribution on [0, 1].
–1
0
0.2
0.4
0.6
0.8
1
Figure 20.5(a) shows the mean total execution time, averaged over 10,000 runs, plotted against k for the exponential distribution. Each point is shown with error bars enclosing three standard errors. Figure 20.5(b) does the same for the rectangular distribution. Similar results were obtained for a wide variety of tolerance values. For some simulations the line search failed to locate the global minimum, converging on a local minimum instead. Figure 20.5(c) shows the “effectiveness,” the proportion of runs that achieved the global minimum. Here the algorithm is deterministic so effectiveness for a given k is either 0 or 1. In the next section the search will depend on the order of arrival of jobs and effectiveness will be fractional.
20.3.4 Conclusions The preceding results show that increasing the number of steps in a parallel line search may be counter-productive; increases in k may produce considerable increases in E. For this to occur there must of course be variability in the job times. Note that Figure 20.5(b) shows much less increase than does Figure 20.5(a), although the mean and variance of the job times are similar. The significant factor is the probability of job times considerably larger than the mean.
Fig. 20.5 Results of simulations: (a) E versus k, exponential distribution job times; (b) E versus k, rectangular distribution job times; (c) effectiveness versus k.
A typical user of the line search algorithm will not have information on the distribution of job times. However, the total time E(k) has local minima at points where r(k) decreases and these values can be predicted from knowledge of just the initial interval length l and the tolerance d. Consideration of (20.1) shows that r falls to a value ρ at k = ceil( 2 (l/d)^{1/ρ} ). This can be used to compute the number of steps k for a desired number of iterations ρ. Our analysis has assumed that evaluation times are independent. If these are dependent, one may expect positive autocorrelation on the parameter space. This would lead to reduced variation in the later iterations of the line search which in turn would reduce growth in E between jumps. When the objective function is continuous but not unimodal we expect a priori that increasing the value of k makes attaining the global minimum more likely. Figure 20.5(c) supports this.
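A small sketch of this calculation (illustrative only) follows; it returns the smallest k at which the iteration count first falls to a target value ρ.

```python
# Smallest k giving a desired iteration count rho, from k = ceil(2 * (l/d)**(1/rho)),
# which follows from equation (20.1).
import math

def steps_for_iterations(l_over_d, rho):
    return math.ceil(2 * l_over_d ** (1.0 / rho))

for rho in range(1, 8):
    print(rho, steps_for_iterations(1000, rho))   # l/d = 1000 as in Figure 20.3
```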
20.4 Accelerating convergence by incomplete iterations 20.4.1 Strategies for aborting jobs
We consider strategies for proceeding to the next iteration of a line search before the evaluations for all points in the current iteration are complete. Three heuristics are proposed. Figure 20.6(a) illustrates a situation where 5 of the 7 evaluations of a function g(x) are complete. The minimum so far occurs at x = 3 and the neighbors of that point have been evaluated. If g is unimodal then clearly the minimum is in the range [2, 4]. Thus the remaining evaluations may be aborted and the line search can proceed to the next stage. This leads to the following algorithm for one iteration of a line search:
Fig. 20.6 Incomplete evaluation points: (a) Strategy 1; (b) Strategy 2.
Strategy 1. Suppose an iteration involves determination of objective values g0, g1, . . . , gk. At any time suppose that S represents the set of the gi that have been completed by parallel evaluation. When each new value gj arrives:
  add it to the set S
  determine gm, the least value in S
  if 0 < m < k and gm−1, gm+1 ∈ S return [xm−1, xm+1]
  else if m = 0 and g1 ∈ S return [x0, x1]
  else if m = k and gk−1 ∈ S return [xk−1, xk]
  continue
This approach can be extended to returning a greater interval than that provided by the immediate neighbors of the minimum point. In Figure 20.6(b), if g is unimodal then the minimum is in the interval [2, 5]; it may be worthwhile terminating the iteration with this interval. Many variants of this idea are possible. We investigate only the following.
Strategy 2. Construct S as in Strategy 1. When each new value gj arrives:
  add it to the set S
  determine gm, the least value in S
  if 0 < m < k
    if gm−1, gm+1 ∈ S then return [xm−1, xm+1]
    else if gm−1, gm+2 ∈ S return [xm−1, xm+2]
    else if gm−2, gm+1 ∈ S return [xm−2, xm+1]
  else if m = 0
    if g1 ∈ S then return [x0, x1]
    else if g2 ∈ S return [x0, x2]
  else if m = k
    if gk−1 ∈ S then return [xk−1, xk]
    else if gk−2 ∈ S return [xk−2, xk]
  continue
If sufficient processors are available it may be advantageous to both continue an iteration and to explore a subinterval identified as likely to contain the minimum. This leads to our third heuristic.
Strategy 3. Use Strategy 2 to identify the subinterval and then start an iteration based on that interval, but also continue with the original iteration to completion. If later the original iteration finds a minimum better than any so far in the new iteration then the algorithm will "backtrack," abort the new iteration and start another iteration based on this improved minimum. This is essentially a form of speculative computing, see [4]. Recursion allows a simple implementation.
We also considered the effect of applying Strategies 1–3 only after the penultimate job has arrived, that is, when k of the k+1 evaluations have been completed. These heuristics will be denoted by 1p, 2p and 3p, respectively. A full search, completing each iteration before proceeding to the next, is denoted by F.
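As a concrete illustration, Strategy 1 can be phrased as an event handler invoked whenever an evaluation completes. The sketch below is not the Nimrod/O implementation; the data structures and the example arrival order are hypothetical.

```python
# Illustrative sketch of Strategy 1: a handler called each time an evaluation g_j
# finishes; it returns the next subinterval as soon as the current minimum is
# bracketed by completed neighbours, allowing the remaining jobs to be aborted.
def strategy1_on_arrival(S, xs, j, gj):
    """S: dict index -> completed value; xs: subdivision points x_0..x_k;
    returns (p, q) when the iteration can be cut short, else None."""
    k = len(xs) - 1
    S[j] = gj                              # add the new value to S
    m = min(S, key=S.get)                  # index of the least completed value
    if 0 < m < k and (m - 1) in S and (m + 1) in S:
        return xs[m - 1], xs[m + 1]
    if m == 0 and 1 in S:
        return xs[0], xs[1]
    if m == k and (k - 1) in S:
        return xs[k - 1], xs[k]
    return None                            # keep waiting for more evaluations

# Hypothetical usage: feed results in order of arrival until a bracket appears.
xs = [i / 7 for i in range(8)]
S = {}
for j, gj in [(3, -0.5), (6, 0.2), (2, 0.1), (4, 0.0)]:   # made-up arrival order
    interval = strategy1_on_arrival(S, xs, j, gj)
    if interval:
        print("abort remaining jobs, recurse on", interval)
        break
```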
20.4.2 Experimental results The strategies were implemented for line searches on the test function of Figure 20.4 with tolerance 0.001. For each k from 3 to 70, the search process
Fig. 20.7 Strategy 1 with exponential distribution of job times: (a) execution times; (b) effectiveness.
was simulated 10,000 times with execution times selected randomly from some probability distribution. Strategies 1 and 1p were applied using exponential evaluation times with a mean λ = 2. Figure 20.7(a) shows the mean execution times and Figure 20.7(b) the effectiveness. In each case the results for these strategies are compared with those for a full search. Figure 20.8 shows times for the same range of strategies but with evaluation times from a rectangular distribution over the interval [0, 1]. These experiments were repeated with the other strategies. Figure 20.9 shows results for the same method as Figure 20.7 but with Strategies 1 and 1p replaced by 2 and 2p. Similarly Figure 20.10 shows results for Strategies 3 and 3p.
Fig. 20.8 Strategy 1 with rectangular distribution of job times.
Fig. 20.9 Results for Strategy 2: (a) execution time; (b) effectiveness.
Fig. 20.10 Results for Strategy 3: (a) execution time; (b) effectiveness.
20.4.3 Conclusions For job times with an exponential distribution, Strategy 1 shows a speedup factor of between 2 and 3 for k ≥ 12, less for k < 12. This increased speed is at the expense of a deterioration in the effectiveness of the search. Strategy 1p is intermediate in performance between F and Strategy 1 for both execution times and effectiveness. The experiments with a rectangular distribution of job times showed less speedup, as there was less increase in M with k. Strategy 2 gave more speedup than that of Strategy 1 but with a further loss of effectiveness. Strategy 3 gave a speedup almost identical to that of Strategy 2 but with improved effectiveness. Hence this strategy is to be preferred when occasional long jobs are delaying execution. This advantage is at the expense of the need for extra processors when two iterations are running concurrently.
References 1. D. Abramson, I. Foster, J. Giddy, A. Lewis, R. Sosic, R. Sutherst and N. White, The Nimrod computational workbench: A case study in desktop metacomputing, in Computational Techniques and Applications Conference, Melbourne, July, 1995. 2. D. Abramson, A. Lewis and T. C. Peachey, An automatic design optimization tool and its application to computational fluid dynamics, in Supercomputing 2001, Denver, November, 2001. 3. A. O. Allen, Probability, Statistics and Queueing Theory (Academic Press, New York, 1978). 4. F. W. Burton, Speculative computation, parallelism and functional programming, IEEE Trans. Comput., C–34 (1985), 1190–1193. 5. J. Kiefer, Sequential minimax search for a maximum, Proc. AMS 4 (1953), 502–506. 6. M. J. D. Powell, An efficient method of finding the minimum of a function of several variables without calculating derivatives, Comput. J., 7 (1964), 155–162.
Chapter 21
Alternative Mathematical Programming Models: A Case for a Coal Blending Decision Process Ruhul A. Sarker
Abstract Real-world problems are complex. It is not always feasible to include all aspects of reality in the model of a problem. In most cases, we deal with a simplified version of the problem that contains only some aspects of reality. Thus a problem can be modeled in a number of different ways depending on the portion of reality to be included or excluded. In this chapter, we address the alternative mathematical programming formulation approaches for a real-world coal-blending problem under different scenarios. The complexity of formulation and solution approaches, quality of solutions, and solution implementation difficulties for these models are compared and analyzed. Choice of the most appropriate model is suggested. Key words: Coal blending, alternative modeling, mathematical programming, linear programming, nonlinear programming
Ruhul A. Sarker School of Information Technology and Electrical Engineering, UNSW@ADFA, Australian Defence Force Academy, Canberra, ACT 2600, AUSTRALIA e-mail: [email protected]
21.1 Introduction In this chapter, we consider a real-world coal-blending problem. Coals are extracted and upgraded for the customers. The raw coals are known as run of mine (ROM) in the coal mining industry. Each coal has its own typical quality specifications. Coal quality is measured in terms of percent of ash, sulfur and moisture, and BTU content per pound, as well as metallurgical properties. BTU content per pound expresses the heating value of coal. Higher ash content lowers the BTU content value. Sulfur in coal results in sulfur dioxide emission that pollutes the environment. Water particles in the coal
absorb heat to evaporate and then superheat. The customers specify the quality parameters (maximum percentage of ash and sulfur, and minimum BTU/pound) for their coals.

The Coal Company considered in the present research currently operates three mines. These mines differ greatly in their cost of production and coal quality. Mine-3 is a relatively low cost mine, but its coal contains high sulfur and does not have satisfactory metallurgical properties. On the other hand, it contains reasonably low ash. Mine-1 is the highest cost mine, and its coal contains relatively high ash (stone) and medium sulfur, but it has excellent metallurgical properties. Mine-2 is the largest and lowest cost mine. Its coal contains higher ash and sulfur than mine-1 coal, but it has good metallurgical properties. Because of the coal properties, only mine-1 and mine-2 coals are used in the preparation of metallurgical coal.

Preparation and blending are the two coal upgrading and processing facilities. Coal preparation (washing) is a process of removing physical impurities. The process involves several different operations, including crushing (to create a size distribution), screening (to separate sizes) and separators (mainly cyclones, to remove the physical impurities). The objective of running a coal preparation plant is to maximize the revenue from clean coal while removing the undesirable impurities. The processing of ROM coal from mine-3 in the preparation plant does not improve the quality of coal with a reasonable yield. Therefore, the involvement of the preparation plant with this low quality ROM coal means a lower financial performance for the company. The customers do not accept these high sulfur coals for their plant operations because of environmental pollution restrictions. The conversion of low quality ROM coals to a minimum acceptable quality level will mean a better financial performance for the company. A blending process provides an opportunity for quality improvement.

Blending is a common process in the coal industry. Blending allows upgrading the low quality run of mine coals by mixing them with good quality coals. Furthermore, supplying the good quality ROM coals to the customers, through blending, can reduce the cost of production, because it saves (i) the cost of washing and (ii) the BTU lost in the refuse of the preparation plant, and (iii) it also eliminates the need for capital investment in washing facilities. Most of the thermal coal customers accept blended products if they satisfy their quality requirements. In the blending process, the problem is to determine the quantity required from each run-of-mine coal and washed product that maximizes the revenue while satisfying the quality constraints of the customers.

A single period coal-blending problem can be formulated as a simple linear programming model ([Gershon, 1986], [Hooban and Camozzo, 1981], [Bott and Badiozamani, 1982], [Gunn, 1988], [Gunn et al., 1989], [Gunn and Chwialkowska, 1989], and [Gunn and Rutherford, 1990]). For the multiperiod case ([Sarker, 1990], [Sarker, 1991], [Sarker and Gunn, 1990], [Sarker, 1994], [Sarker and Gunn, 1991], [Sarker and Gunn, 1997], [Sarker and Gunn, 1995],
[Sarker and Gunn, 1994], and [Sarker, 2003]), the modeling process depends on the decision whether to carry inventory of run-of-mine (ROM) coal or of blended coal (final product). The multiperiod coal-blending problem with inventory of blended coal is a nonlinear program. On the other hand, the multiperiod blending problem with inventory of ROM can be formulated as a linear program. In this case, a number of alternative LP models can be developed allowing the use of ROM inventory in n future periods. A large-scale LP is solvable using any of the standard LP packages. However, a large-scale nonlinear program is complex and is not easy to solve. The current model is a specially structured nonlinear program, and is solved using a simple SLP (Successive Linear Programming) algorithm developed by [Sarker and Gunn, 1997]. The solutions of some multiperiod LP models are not practically feasible for several technical reasons. The quality of solutions, complexity of formulation and solution approaches, and solution implementation difficulties for these models are compared and analyzed. A choice of the most appropriate model is suggested.

The chapter is organized as follows. Following the introduction, we discuss four alternative models for coal blending and upgradation. The flexibility of these models is analyzed in Section 21.3. Section 21.4 discusses the problem sizes and computational time required. The objective function values and the nature of fluctuating situations for the test problems are presented in Section 21.5. The selection criteria for choosing the most appropriate model are discussed in Section 21.6, and the conclusions are drawn in Section 21.7.
21.2 Mathematical programming models

The single period LP model is formulated to determine an optimal strategy for coal blending, washing and customer allocation so as to transform the available run of mine coal into products within customer market specifications at maximum overall profit. The constraints considered in this model are the maximum and minimum allowable limits of ash, sulfur and BTU content, production limits, demand requirements, etc. In the multiperiod case, the objective function and constraint types are similar to the single period model. However, the inventory of ROM and/or blended product is used as the linking mechanism from one period to the next. The problem formulation when considering inventory of ROM becomes a linear program, whereas with inventory of blended product it becomes a nonlinear program. Any ROM extracted in period t can be used in the blending process in any or all future periods. This assumption controls the size of LP in the multiperiod formulation. The planning horizon considered is 12 months, in 1-month-long periods. We consider four different models in this chapter. These models are defined as follows:

• SPM: Single period model.
• MNM: Multiperiod nonlinear model.
• MLM: Multiperiod linear model.
• ULM: Upper bound linear model.

To give an idea about the mathematical models of coal blending and upgradation, we will discuss the above four models briefly in this section. All the models consider M mines, NS local customers, K washed coals customers, CT thermal coal customers (by (metric) tonne basis) and CB thermal coal customers (by BTU content basis). The company has both coal washing/upgrading and coal blending facilities. The blending process accepts both ROMs and washed coal to produce blended coals. The objectives of these models are to find appropriate production plans to satisfy the customers' demand for a given number of periods by satisfying the following constraints:

• Demand constraints for all types of customers (maximum and minimum requirements are known in advance)
• Mine production capacity (known maximum and minimum capacity)
• Wash plant capacity constraints
• Quality constraints such as allowable upper and lower limit of percentages of ash, sulfur and BTU content per pound, and
• Overall sulfur emission constraint for environmental control
21.2.1 Single period model (SPM)

The details of the SPM are presented below for the readers. SPM considers a period of 1 month long.

Variables

bp_jl: (metric) tonnes of blended product for customer j made at location l
c_mjl: (metric) tonnes of run-of-mine coal from mine m used for blended product j at location l
wc_k: (metric) tonnes of washed product k produced
wb_kjl: (metric) tonnes of washed product k used for blended product j at location l
mb_kjl: (metric) tonnes of middling product k used for blended product j at location l
wp_kc: (metric) tonnes of washed product k sent to customer c

Data

J: number of blended product customers
L(j): set of sites used for blended product for customer j
a^c_m, s^c_m, B^c_m: run-of-mine ash, sulfur and BTU/lb analysis for mine m
a^w_k, s^w_k, B^w_k: as received ash, sulfur and BTU/lb analysis for washed product k
a^m_k, s^m_k, B^m_k: as received ash, sulfur and BTU/lb analysis for middling product k
a^+_j, s^+_j: maximum allowable ash and sulfur analysis for customer j
a^-_j, s^-_j, B^-_j: minimum allowable ash, sulfur and BTU/lb analysis for customer j
BTU^+_j, BTU^-_j: maximum and minimum BTU requirements for customer j
S^+_NS: maximum allowable sulfur supplied to local customers
NS: set of blended product customers who correspond to local customers
ζ_j: amount of SO2 per (metric) tonne of sulfur supplied to customer j ∈ NS
I: number of mines
r^c_mk: amount of run-of-mine coal from mine m used per (metric) tonne of washed product k (this corresponds to the washed product recipe)
r^mid_k: ratio of middling in product k produced
MP^+_m, MP^-_m: maximum and minimum production from mine m
BCOST_jl: blending cost (dollar/(metric) tonne) for customer j at location l
MPRO_m: mining cost (dollar/(metric) tonne) for mine m
MB_m: BTU content (million BTU/(metric) tonne) for ROM coal from mine m
B^Wb_k: BTU content (million BTU/(metric) tonne) for washed product k
B^Mb_k: BTU content (million BTU/(metric) tonne) for middlings of washed product k
PB_j: price (dollar/million BTU) offered by the blended customer j
PP_kc: price (dollar/(metric) tonne) offered by customer c for washed product k
TCBC_jl: transportation cost (dollar/(metric) tonne) to blended product customer j from blending location l
TCML_ml: transportation cost (dollar/(metric) tonne) from mine m to blending location l
TCMW_m: transportation cost (dollar/(metric) tonne) from mine m to VJ plant
TCWL_l: transportation cost (dollar/(metric) tonne) from VJ plant to blending location l
TCWC_c: transportation cost (dollar/(metric) tonne) from VJ plant to washed customer c (may include banking and pier costs)
The objective function to be maximized, profit, is

   Z = Σ_j Σ_l [−(BCOST_jl + TCBC_jl)] × bp_jl
     + Σ_m Σ_j Σ_l [MB_m × PB_j − TCML_ml − MPRO_m] × c_mjl
     + Σ_k [−W_k − Σ_m r^c_mk × (TCMW_m + MPRO_m)] × wc_k
     + Σ_k Σ_j Σ_l [B^Wb_k × PB_j − TCWL_l] × wb_kjl
     + Σ_k Σ_j Σ_l [B^Mb_k × PB_j − TCWL_l] × mb_kjl
     + Σ_k Σ_c [PP_kc − TCWC_c] × wp_kc
Constraints

The constraints of SPM are presented below. Relations (21.1)–(21.8) all hold for j = 1, …, J and l ∈ L(j).

1. Mass balance for blended products:

   − bp_jl + Σ_m c_mjl + Σ_k wb_kjl + Σ_k mb_kjl = 0                                    (21.1)

2. Ash limits in blended products:

   − a^+_j bp_jl + Σ_m a^c_m c_mjl + Σ_k a^w_k wb_kjl + Σ_k a^m_k mb_kjl ≤ 0            (21.2)
   − a^-_j bp_jl + Σ_m a^c_m c_mjl + Σ_k a^w_k wb_kjl + Σ_k a^m_k mb_kjl ≥ 0            (21.3)

3. Sulfur limits in blended products:

   − s^+_j bp_jl + Σ_m s^c_m c_mjl + Σ_k s^w_k wb_kjl + Σ_k s^m_k mb_kjl ≤ 0            (21.4)
   − s^-_j bp_jl + Σ_m s^c_m c_mjl + Σ_k s^w_k wb_kjl + Σ_k s^m_k mb_kjl ≥ 0            (21.5)

4. Minimum BTU content in blended products:

   − B^-_j bp_jl + Σ_m B^c_m c_mjl + Σ_k B^w_k wb_kjl + Σ_k B^m_k mb_kjl ≥ 0            (21.6)

5. Overall BTU supply to customers:

   BTU^r_j ≤ Σ_l [Σ_m B^c_m c_mjl + Σ_k B^w_k wb_kjl + Σ_k B^m_k mb_kjl]   for r = +, −  (21.7)

6. Overall sulfur supplied to NSPC plants:

   Σ_{j∈NS} ζ_j [Σ_m s^c_m c_mjl + Σ_k s^w_k wb_kjl + Σ_k s^m_k mb_kjl] ≤ S^+_NS        (21.8)

7. Maximum and minimum mine production:

   MP^-_m ≤ Σ_j Σ_l c_mjl + Σ_k r^c_mk wc_k ≤ MP^+_m,   m = 1, …, I                     (21.9)

8. Mass balance in washplant:

   wc_k − Σ_j Σ_l wb_kjl − Σ_c wp_kc = 0,   k = 1, …, K                                 (21.10)

9. Middlings ratio for washed products:

   r^mid_k wc_k − Σ_j Σ_l mb_kjl = 0,   k = 1, …, K                                     (21.11)

10. Nonnegativity constraints.

We compare this model with the multiperiod model by considering a collection of 12 single period models.
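To make the structure of SPM concrete, the following sketch sets up a drastically reduced single-period blending LP (one blended product, one location, two mines, no washing or middlings) with the open-source PuLP modelling library. All names and numbers are invented for illustration and are not data from the case study; the full SPM simply repeats the same pattern over all mines, washed products, customers and locations.

```python
# Illustrative single-period blending LP (toy data, not the case-study figures).
# Requires the PuLP package (pip install pulp).
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpStatus

mines = ["mine1", "mine2"]
ash   = {"mine1": 0.18, "mine2": 0.22}     # ash fraction of ROM coal (assumed)
sul   = {"mine1": 0.010, "mine2": 0.025}   # sulfur fraction (assumed)
btu   = {"mine1": 24.0, "mine2": 22.0}     # million BTU per tonne (assumed)
cost  = {"mine1": 30.0, "mine2": 18.0}     # mining plus transport, $/tonne (assumed)
price = 2.0                                # $/million BTU offered by the customer (assumed)
ash_max, sul_max, btu_min = 0.20, 0.020, 22.5   # customer quality limits (assumed)
demand_max = 100_000.0                     # tonnes

prob = LpProblem("toy_SPM", LpMaximize)
c = {m: LpVariable(f"c_{m}", lowBound=0) for m in mines}   # tonnes of ROM m in the blend
bp = LpVariable("bp", lowBound=0, upBound=demand_max)      # tonnes of blended product sold

# Profit = BTU revenue minus production cost (cf. the objective Z above).
prob += lpSum((btu[m] * price - cost[m]) * c[m] for m in mines)

# Mass balance (21.1) and quality limits (21.2), (21.4), (21.6) for the single blend.
prob += bp == lpSum(c[m] for m in mines), "mass_balance"
prob += lpSum(ash[m] * c[m] for m in mines) <= ash_max * bp, "ash_limit"
prob += lpSum(sul[m] * c[m] for m in mines) <= sul_max * bp, "sulfur_limit"
prob += lpSum(btu[m] * c[m] for m in mines) >= btu_min * bp, "btu_floor"

prob.solve()
print(LpStatus[prob.status], {m: c[m].value() for m in mines}, bp.value())
```

With these toy figures the ash ceiling is the binding constraint and the optimal blend is an even split of the two coals at the demand ceiling.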
21.2.2 Multiperiod nonlinear model (MNM)

This model considers 12 periods, where each period is 1 month long. The variables of MNM are similar to those of SPM, with an additional subscript t to represent the time period. To differentiate them from SPM we use capital letters for variables. Although the constraints of this model in each period are similar to those of SPM, it has additional constraints to link one period to the next over the entire planning horizon. The model allows the inventory of blended product to be carried from one period to the next. The inventory variables and inventory balance constraints maintain the links in this multiperiod model. However, the quality parameters (percentage of ash, sulfur and BTU content per pound) of the blended coal inventories carried from one period to the next are unknown, which introduces nonlinearity into the model. Although the details of the mathematical model for MNM can be found in [Sarker and Gunn, 1997],
the mass balance constraint is presented below to give an idea about the nature of variables and constraints in MNM:

   − BP_jlt − I_jlt + I_jl(t−1) + Σ_m C_mjlt + Σ_k WB_kjlt + Σ_k MB_kjlt = 0,   ∀ j, l, t     (21.12)

where

BP_jlt: (metric) tonnes of blended product j, supplied to customer j, made at location l in period t (the jth product corresponds to the jth customer)
C_mjlt: (metric) tonnes of run-of-mine coal from mine m used for blended product j at location l in period t
WB_kjlt: (metric) tonnes of washed product k used for blended product j at location l in period t
MB_kjlt: (metric) tonnes of middling product k used for blended product j at location l in period t
I_jlt: inventory of blended product j at location l at the end of period t
The above constraint indicates that the total amount of blended product j (produced for customers and inventory) is equal to the sum of its constituents of ROM coals, washed coals, middling products and blended coals from inventories.
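The nonlinearity arises from multiplying the unknown carried quantity by its unknown quality. A common way to organise an SLP scheme for such a model, and the view taken in the sketch below, is to freeze the inventory qualities, solve the resulting LP, recompute the qualities implied by the blend actually carried, and iterate to a fixed point. This is only an outline of that idea, not the published algorithm of [Sarker and Gunn, 1997]; build_and_solve_lp and implied_inventory_quality are hypothetical callbacks standing in for a full model.

```python
# Schematic SLP loop for an MNM-type model (an outline only, not the algorithm of
# Sarker and Gunn, 1997). The two callbacks are hypothetical: the first solves the
# model with inventory qualities treated as fixed data, the second recomputes those
# qualities from the blend the solution actually carries forward.

def successive_lp(initial_quality, build_and_solve_lp, implied_inventory_quality,
                  tol=1e-4, max_iter=50):
    quality = dict(initial_quality)        # e.g. {(j, l, t): {"ash": 0.18, "sulfur": 0.02, "btu": 23.0}}
    for iteration in range(1, max_iter + 1):
        solution = build_and_solve_lp(quality)             # an ordinary LP at fixed qualities
        new_quality = implied_inventory_quality(solution)  # qualities of the inventory actually built
        gap = max(abs(new_quality[key][attr] - quality[key][attr])
                  for key in new_quality for attr in new_quality[key])
        quality = new_quality
        if gap < tol:                      # plan and assumed qualities are mutually consistent
            return solution, iteration
    return solution, max_iter
```

Because the qualities are constants inside each iteration, every subproblem is an ordinary linear program, which is consistent with the later remark that available LP codes can be used to solve this large nonlinear program.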
21.2.3 Upper bound linear model (ULM)

The ULM is similar to MNM except that it allows the transfer of the inventory of ROM from one period to any or all future periods within the planning horizon. This model forms the upper bound of the problem since it considers all possible savings from inventories and productions. The details of the model can be found in [Sarker, 2003]. This model allows carrying the most attractive input(s), in terms of quality and cost, for future periods. This is the upper bound of the planning problem because:
1. this model considers all possible alternatives of supplying coals to customers, and
2. the solution to this model will give an objective value larger than or equal to any feasible solution to the "true problem."

To give an idea about the nature of variables and constraints in ULM, we present the mass balance constraint below:

   − BP_jlt + Σ_m Σ_{τ≤t} C_mjlτt + Σ_k Σ_{τ≤t} WB_kjlτt + Σ_k Σ_{τ≤t} MB_kjlτt = 0     (21.13)
where

C_mjlτt: ROM coal of mine m produced in period τ, used in blended product j at location l in period t
WB_kjlτt: washed product k produced in period τ, used in blended product j at location l in period t
MB_kjlτt: middling product k produced in period τ, used in blended product j at location l in period t
The constraint represents that the total amount of blended product j (produced for customers only) is equal to the sum of its constituents of ROM coals, washed coals and middling products taken from current and previous periods.
21.2.4 Multiperiod linear model (MLM)

The MLM is similar to ULM except that it only permits the carrying of inventory of ROM coals from one period to the next, where the quality parameters of ROM coals are known. That is, the run-of-mine and washed coals produced in period t (= τ) will be carried for further use in period t + 1 only. The corresponding mass balance constraint is as follows:

   − BP_jlt + Σ_m C_mjl(t−1)t + Σ_k WB_kjl(t−1)t + Σ_k MB_kjl(t−1)t = 0     (21.14)

where

C_mjl(t−1)t: ROM coal of mine m produced in period (t − 1), used in blended product j at location l in period t
WB_kjl(t−1)t: washed product k produced in period (t − 1), used in blended product j at location l in period t
MB_kjl(t−1)t: middling product k produced in period (t − 1), used in blended product j at location l in period t
A number of new models can be formulated between MLM and ULM by varying n, the number of future periods for which the inventory of ROM and washed coals can be carried (the maximum value of n is 11 in our 12-period case); the small sketch below illustrates how the indexing grows with n. Please note that we intentionally omit the mathematical details of MNM, ULM and MLM in this chapter, as they are too long and the emphasis of the chapter is on comparisons; we refer interested readers to [Sarker and Gunn, 1997] and [Sarker, 2003]. Alternatively, they can be made available by the author upon request. These models differ in their capability of handling fluctuating situations, the computational time required, the size of the problem, optimal objective function values, number of coal banks required, etc. By a fluctuating situation we mean a variable planning environment. These aspects are discussed in the following sections.
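One way to see how quickly the formulation grows as the carry-forward window n widens is to count the (τ, t) index pairs that linking variables such as C_mjlτt range over. The count below is only an index-pair count for the 12-period horizon and ignores the multiplication by mines, products and locations, so it is illustrative rather than the variable totals reported later in Table 21.1; the exact index ranges of the book's models may also differ slightly near the start of the horizon.

```python
# Number of (tau, t) pairs with max(1, t - n) <= tau <= t over a T-period horizon.
def linking_pairs(T, n):
    return sum(min(t, n + 1) for t in range(1, T + 1))

T = 12
for n in (0, 1, 11):   # n = 0: no carryover, n = 1: MLM-style, n = 11: ULM-style
    print(f"n = {n:2d}: {linking_pairs(T, n)} index pairs")
# n = 0 gives 12, n = 1 gives 23 and n = 11 gives 78 pairs per (mine, product, location) combination.
```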
SPM, ULM and MLM have been solved using the XMP linear programming codes. The MNM is a specially structured nonlinear program that is solved using a simple SLP algorithm developed by [Sarker and Gunn, 1997].
21.3 Model flexibility

The SPM is the least flexible model and ULM is the most flexible model. The MNM is less flexible than MLM in choosing the inputs, but more flexible in handling fluctuating situations. In our computational experience, the objective function value of MNM is a little less than that of MLM when there is a stable demand and production pattern. This is due to the fact that the blended product inventory may not be an attractive input in the next period. In the following, we discuss how the models work under fluctuating demand and inputs. Consider the following simplified situations for a three-period problem. Let X1, X2 and X3 be the maximum levels of inputs available in periods 1, 2 and 3, and let Q1, Q2 and Q3 be the respective demands in periods 1, 2 and 3.

Case-1: X1 + X2 + X3 = Q1 + Q2 + Q3, Q2 = X2, Q1 < X1 and Q3 > X3
Case-2: X1 + X2 + X3 = Q1 + Q2 + Q3, Q3 = X3, Q1 < X1 and Q2 > X2
Case-3: X1 + X2 + X3 = Q1 + Q2 + Q3, Q1 < X1, Q2 > X2 and Q3 > X3

The simple line diagrams for these three cases are shown in parts (a)–(e) of Figure 21.1. The models treat each of the cases as follows:
21.3.1 Case-1

SPM (Figure 21.1a):
• Q1 and Q2 are satisfied, but Q3 is not satisfied
• Shortage in period 3 = Q3 − X3
• Unused capacity in period 1 = X1 − Q1
• This model does not provide a feasible solution

MNM (Figure 21.1b):
• Q1, Q2 and Q3 are satisfied
• IQ1 ≤ X1 − Q1, IQ2 = IQ1 and IQ2 = Q3 − X3
• Unused capacity in period 1 = (X1 + X3) − (Q1 + Q3)
• The model does provide a feasible solution
Fig. 21.1 Simple case problem. [Line diagrams of the inputs X1, X2, X3 and demands Q1, Q2, Q3 over the three periods for (a) SPM, (b) MNM, (c) MLM, (d) ULM and (e) a variant of MNM, showing the inventories carried between periods (IQ for blended product, IX for inputs).]
MLM (Figure 21.1c):
• Q1 and Q2 are satisfied, but Q3 is not satisfied
• IX1 = 0, IX2 = 0
• Shortage in period 3 = Q3 − X3
• Unused capacity in period 1 = X1 − Q1
• This model does not provide a feasible solution

ULM (Figure 21.1d):
• Q1, Q2 and Q3 are satisfied
• IX1 = 0, IX2 = 0
• IX13 ≤ X1 − Q1, IX13 = Q3 − X3
• Unused capacity in period 1 = (X1 + X3) − (Q1 + Q3)
• The model does provide a feasible solution

Variant of MNM (Figure 21.1e):
• Q1 and Q2 are satisfied, but Q3 is not satisfied
• IQ1 = 0, IQ2 = 0
• Shortage in period 3 = Q3 − X3
• Unused capacity in period 1 = X1 − Q1
• This model does not provide a feasible solution
21.3.2 Case-2

SPM (Figure 21.1a):
• Q1 and Q3 are satisfied, but Q2 is not satisfied
• Shortage in period 2 = Q2 − X2
• The model does not provide a feasible solution

MNM (Figure 21.1b):
• Q1, Q2 and Q3 are satisfied
• IQ1 ≤ X1 − Q1, IQ2 = 0 and IQ1 = Q2 − X2
• The model does provide a feasible solution

MLM (Figure 21.1c):
• Q1, Q2 and Q3 are satisfied
• IX1 ≤ X1 − Q1, IX2 = 0 and IX1 = Q2 − X2
• The model provides a feasible solution

ULM (Figure 21.1d):
• Q1, Q2 and Q3 are satisfied
• IX13 = 0, IX2 = 0
• IX1 ≤ X1 − Q1, IX1 = Q2 − X2
• The model provides a feasible solution

Variant of MNM (Figure 21.1e):
• Q1, Q2 and Q3 are satisfied
• IQ1 ≤ X1 − Q1, IQ2 = 0 and IQ1 = Q2 − X2
• The model provides a feasible solution
21.3.3 Case-3

Only MNM and ULM provide feasible solutions for Case-3. In summary, MNM and ULM provide feasible solutions for all three cases, MLM and the variant of MNM give feasible solutions for Case-2 only, and SPM does not provide a feasible solution for any of the three cases.
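For quantities alone, the case analysis above boils down to how far unused capacity may be carried forward. The toy check below (quality constraints ignored, a single aggregate input, invented numbers) mirrors the chapter's bookkeeping rather than solving the full LPs: the SPM-like rule requires every period to stand on its own, the MLM-like rule lets only a period's own surplus be carried one period ahead, and the ULM/MNM-like rule only requires cumulative supply to stay ahead of cumulative demand.

```python
# Quantity-only illustration of Cases 1-3 (toy numbers; quality constraints ignored).
from itertools import accumulate

def spm_like(X, Q):
    """No carryover: every period must cover its own demand."""
    return all(q <= x for x, q in zip(X, Q))

def mlm_like(X, Q):
    """One-period carry: only the surplus X_t - Q_t of a period's own production
    may be carried, and only into the very next period (no relaying onwards)."""
    carry = 0
    for x, q in zip(X, Q):
        if q > x + carry:
            return False
        carry = max(0, x - q)
    return True

def ulm_like(X, Q):
    """Unlimited carry-forward: cumulative supply must never fall behind cumulative demand."""
    return all(cq <= cx for cx, cq in zip(accumulate(X), accumulate(Q)))

cases = {                                   # (X, Q): capacities and demands per period
    "Case-1": ([10, 8, 6], [8, 8, 8]),      # Q2 = X2, slack of period 1 needed in period 3
    "Case-2": ([10, 6, 8], [8, 8, 8]),      # Q3 = X3, slack of period 1 needed in period 2
    "Case-3": ([12, 6, 6], [8, 8, 8]),      # slack of period 1 needed in periods 2 and 3
}
for name, (X, Q) in cases.items():
    print(name, "SPM:", spm_like(X, Q), "MLM:", mlm_like(X, Q), "ULM/MNM:", ulm_like(X, Q))
```

Running this reproduces the pattern above: only the full carry-forward rule passes all three cases, the one-period rule passes Case-2 only, and the no-carry rule fails all three.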
21.4 Problem size and computation time The ULM is a simple but large linear program. We can solve a reasonably large linear program without much difficulty. The MLM model is also a linear program. It is smaller than the upper bounding model. The MNM is smaller than the ULM and close to MLM, but it takes the largest computational time. In our study, we solved 36 test problems. In the test problems, we considered the number of blended products up to 2, blending locations up to 3, number of mines up to 3, coal washing facilities up to 3 and the time periods 3 to 12. The arbitrary demand, capacity and quality data were randomly generated for different test problems. The ranges for some monthly data are: blended product demand, 200,000 to 250,000 (metric) tonnes, washed coal demand, 290,000 to 550,000 (metric) tonnes, production capacity of mine-1, 85,000 to 270,000 (metric) tonnes, capacity of mine-2, 95,000 to 300,000 (metric) tonnes and capacity of mine-3, 70,000 to 240,000 (metric) tonnes. The relative problem sizes of the models are shown in Table 21.1. Table 21.1 Relative problem size of ULM, MLM and MNM
Model   Minimum problem size              Maximum problem size
        Constraints    Variables          Constraints    Variables
ULM     66             49                 576            4770
MLM     66             44                 576            1800
MNM     75             46+9               792            1494+216
For the largest problem, the number of variables in MNM is (1494+216) = 1710. Out of these 1710 variables, 216 variables are additional variables, which
are required to solve the model using an SLP algorithm [Sarker and Gunn, 1997]. The ULM and MLM have a similar number of constraints, and the MLM has many fewer variables. For the largest problem, the ULM model contains 576 constraints and 4770 variables, whereas the MLM contains 576 constraints and 1800 variables. With an increasing number of blended products, blending locations, washed products, inputs and customers, the ULM becomes a very large problem in comparison to the other two models.

The ULM is a possible candidate for the planning problem under consideration. This model could be a very large linear program with a large number of blended products, washed products, customers, mines and blending locations. If the problem size is too large, one could consider the following points to reduce the size of the problem without losing the characteristics of the model:
1. Reduce the number of periods in the planning horizon; consider 5 periods (3 of one month, 1 of three months and 1 of six months) instead of 12 periods.
2. Group the customers based on similar prices offered and transportation costs.
3. Reduce or group the number of products based on similar quality parameters.
4. Omit the quality constraints for blended product.

These considerations will provide a solution in a more aggregate form. The more aggregate the solution, the harder it is to disaggregate under an unstable planning environment. In such a case, the reformulation of the disaggregate model may be necessary to obtain a detailed and practically feasible solution. Though we ignore the physical inventory in the upper bounding model, the model does not require extensive inventory carry-through.

We have examined closely the blending processes considered in modeling MNM and MLM. The way of dealing with inventory is one of the major differences between these two models. We allow the inventory of blended product in MNM and the inventory of run-of-mine and washed coals in MLM. In both cases, the inventories were carried for use in the next period. However, the inventory of blended product (in MNM) of a period can be used in further future periods through blending in the intermediate periods.
21.5 Objective function values and fluctuating situation

The SPM considered in this chapter is a collection of 12 single period models without a linking mechanism from one period to the next. This means the model does not allow the carrying of any inventory. This model gives a lower bound of the planning problem for profit maximization and cannot be applied in fluctuating cases. The ULM is a Land algorithm or transportation type
model. This model considers all possible savings from carrying inventories of inputs for the blending process. This model gives an upper bound of the problem. The computational comparisons of these models are presented in Table 21.2.

Table 21.2 Objective function values of ULM, MLM and MNM

Model   Objective value of smallest problem   Objective value of largest problem
        (million dollars)                     (million dollars)
ULM     27.817                                117.852
MLM     27.778                                117.327
MNM     27.771                                116.011
SPM is infeasible in both cases. Normally the MLM shows a lower profit than the ULM and a higher profit than the SPM. However, this model shows infeasibility in highly fluctuating situations. The MNM also shows a lower profit than the ULM, a higher profit than the SPM, and a profit close to that of MLM in most cases. The MNM can handle a highly fluctuating situation as well as the ULM does.
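Since the multiperiod solutions are described as reasonably close, it is worth quantifying what Table 21.2 implies; the short calculation below uses only the table's own figures, with ULM taken as the upper bound.

```python
# Relative gaps to the ULM upper bound implied by Table 21.2 (million-dollar objective values).
values = {"ULM": (27.817, 117.852), "MLM": (27.778, 117.327), "MNM": (27.771, 116.011)}
for model in ("MLM", "MNM"):
    gaps = [100.0 * (u - v) / u for u, v in zip(values["ULM"], values[model])]
    print(model, [f"{g:.2f}%" for g in gaps])
# MLM is about 0.14% and 0.45% below the bound; MNM about 0.17% and 1.56%.
```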
21.6 Selection criteria

The above analysis is used to suggest an appropriate model for the planning problem. The ULM model can be proposed for use as a tactical model. The analysis shows that
1. the ULM ensures the highest profit,
2. it takes less computational time than does the nonlinear model,
3. it provides flexibility to choose inputs and to tackle fluctuating situations, and
4. for practicality, the number of banks (coal piles) suggested by the model is much lower than that predicted by theoretical calculations.

The selection of ULM as a planning model may raise a question as to the use of MNM. The MNM is an alternative formulation approach for the planning issues which follows the concept of traditional multiperiod planning formulations. This model also allows us to check the solution of ULM or MLM. The solutions of these models are reasonably close, as expected. One may prefer to carry an inventory of blended product instead of run-of-mine and washed coals, and take advantage of managing fewer banks by using the solution of MNM. The development of the solution method for MNM has given us that opportunity. The use of available LP codes in solving a large nonlinear program makes the algorithm very attractive for practical applications. The algorithm can also be used for solving other multiperiod
blending problems, such as food or crude oil blending. Since the algorithm developed for the MNM can be generalized to a class of nonlinear programs, its contribution to knowledge is justified.
21.7 Conclusions

We addressed a real-world coal-blending problem. The coal-blending problem can be modeled in a number of different ways depending on the portion of reality to be excluded. The alternative mathematical programming models under different scenarios are discussed and analyzed. A coal-blending model has been selected from a number of alternative models by comparing:
1. the computational complexity,
2. the model flexibility,
3. the number of banks required, and
4. the objective function value.

The upper bound linear programming model seems appropriate because
1. it shows the highest profit,
2. it takes less computational time than does the nonlinear model,
3. it is the most flexible model in tackling an unstable planning environment,
4. for practicality, the number of banks suggested by the model is much lower than that predicted by theoretical calculations, and
5. it is implementable.
References

[Bott and Badiozamani, 1982] Bott, D. L. and Badiozamani, K. (1982). Optimal blending of coal to meeting quality compliance standards. In Proc. XVII APCOM Symposium, pages 15–23. American Institute of Mining Engineers, New York.
[Gershon, 1986] Gershon, M. (1986). A blending-based approach to mine planning and production scheduling. In Proc. XIX APCOM Symposium, pages 120–126. American Institute of Mining Engineers, New York.
[Gunn, 1988] Gunn, E. A. (1988). Description of the coal blend and wash linear programming model. Research report for Cape Breton Development Corporation, Cape Breton, Canada, pages 1–16.
[Gunn et al., 1989] Gunn, E. A., Allen, G., Campbell, J. C., Cunningham, B., and Rutherford, R. (1989). One year of OR: Models for operational and production planning in the coal industry. Won CORS Practice Prize-89, TIMS/ORSA/CORS Joint Meeting, Vancouver, Canada.
[Gunn and Chwialkowska, 1989] Gunn, E. A. and Chwialkowska, E. (1989). Developments in production planning at a coal mining corporation. In Proc. Int. Ind. Eng. Conference, pages 319–324. Institute of Industrial Engineers, Norcross, Georgia, USA.
[Gunn and Rutherford, 1990] Gunn, E. A. and Rutherford, P. (1990). Integration of annual and operational planning in a coal mining enterprise. In Proc. XXII APCOM Int. Symposium, Berlin, pages 95–106. Technische Universität, Berlin.
[Hooban and Camozzo, 1981] Hooban, M. and Camozzo, R. (1981). Blending coal with a small computer. Coal Age, 86:102–104.
[Sarker, 1990] Sarker, R. A. (1990). SLP algorithm for solving a nonlinear multiperiod coal blending problem. Honourable Mention Award, CORS National Annual Conference, Ottawa, Canada, pages 1–27. Canadian Operational Research Society, Ottawa.
[Sarker, 1991] Sarker, R. A. (1991). A linear programming based algorithm for a specially structured nonlinear program. In CORS Annual Conference, Quebec City, Canada.
[Sarker, 1994] Sarker, R. A. (1994). Solving a class of nonlinear programs via a sequence of linear programs. In Krishna, G. Reddy, R. Nadarajan, editors, Stochastic Models, Optimization Techniques and Computer Applications, pages 269–278. Wiley Eastern Limited.
[Sarker, 2003] Sarker, R. A. (2003). Operations Research Applications in a Mining Company. Dissertation.de Verlag, Berlin.
[Sarker and Gunn, 1990] Sarker, R. A. and Gunn, E. A. (1990). Linear programming based tactical planning model for a coal industry. In CORS National Annual Conference, Ottawa, Canada.
[Sarker and Gunn, 1991] Sarker, R. A. and Gunn, E. A. (1991). A hierarchical production planning framework for a coal mining company. In CORS Annual Conference, Quebec City, Canada.
[Sarker and Gunn, 1994] Sarker, R. A. and Gunn, E. A. (1994). Coal bank scheduling using a mathematical programming model. Applied Mathematical Modelling, 18:672–678.
[Sarker and Gunn, 1995] Sarker, R. A. and Gunn, E. A. (1995). Determination of a coal preparation strategy using a computer based enumeration method. Indian Journal of Engineering and Material Sciences, 2:150–156.
[Sarker and Gunn, 1997] Sarker, R. A. and Gunn, E. A. (1997). A simple SLP algorithm for solving a class of nonlinear programs. European Journal of Operational Research, 101(1):140–154.
About the Editors
Emma Hunt undertook her undergraduate studies at the University of Adelaide, obtaining a B.A. with a double major in English and a B.Sc. with first class honors in Mathematics. She subsequently graduated with a Ph.D. in the area of Stochastic Processes. She worked as a Research Scientist with DSTO from 1999 to 2005. She is currently a Visiting Lecturer in the School of Economics at the University of Adelaide. She is Executive Editor of the ANZIAM Journal, Deputy Editor of the Bulletin of the Australian Society for Operations Research (ASOR) and Chair of the South Australian Branch of ASOR. Her research interests lie in optimization and stochastic processes.

Charles Pearce graduated in mathematics and physics from the University of New Zealand and has a Ph.D. from the Australian National University in the area of stochastic processes. He holds the Elder Chair of Mathematics at the University of Adelaide and is on the editorial boards of more than a dozen journals, including the Journal of Industrial and Management Optimization, the Journal of Innovative Computing Information and Control, Advances in Nonlinear Variational Inequalities, Nonlinear Functional Analysis and Applications, and the ANZIAM Journal (of which he is Editor-In-Chief). He has research interests in optimization, convex analysis, and probabilistic modeling and analysis, and has about 300 research publications. In 2001 he was awarded the ANZIAM medal of the Australian Mathematical Society for outstanding contributions to applied and industrial mathematics in Australia and New Zealand. In 2007 he was awarded the Ren Potts medal of the Australian Society for Operations Research for outstanding contributions to operations research in Australia.