LOGIC-BASED DECISION SUPPORT: Mixed Integer Model Formulation
ANNALS OF DISCRETE MATHEMATICS
General Editor: Peter L. HAMMER, Rutgers University, New Brunswick, NJ, U.S.A.

Advisory Editors
C. BERGE, Université de Paris, France
M. A. HARRISON, University of California, Berkeley, CA, U.S.A.
V. KLEE, University of Washington, Seattle, WA, U.S.A.
J.-H. VAN LINT, California Institute of Technology, Pasadena, CA, U.S.A.
G.-C. ROTA, Massachusetts Institute of Technology, Cambridge, MA, U.S.A.
This volume is based on lectures delivered at the First Advanced Research Institute on Discrete Applied Mathematics supported by the Air Force Office of Scientific Research and held at RUTCOR - Rutgers Center for Operations Research, May 1985.
40
LOGIC-BASED DECISION SUPPORT Mixed Integer Model Formulation
Robert G. JEROSLOW †
1989
NORTH-HOLLAND - AMSTERDAM, NEW YORK, OXFORD, TOKYO
© Elsevier Science Publishers B.V., 1989
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publishers, Elsevier Science Publishers B.V. (Physical Sciences and Engineering Division), P.O. Box 103, 1000 AC Amsterdam, The Netherlands.

Special regulations for readers in the USA - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the USA. All other copyright questions, including photocopying outside of the USA, should be referred to the copyright owner, Elsevier Science Publishers B.V., unless otherwise specified.

No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
ISBN: 0 444 87119 5

Published by:
ELSEVIER SCIENCE PUBLISHERS B.V.
P.O. Box 103
1000 AC Amsterdam
The Netherlands

Sole distributors for the U.S.A. and Canada:
ELSEVIER SCIENCE PUBLISHING COMPANY, INC.
52 Vanderbilt Avenue
New York, N.Y. 10017
U.S.A.
Library of Congress Cataloging-in-Publication Data

Jeroslow, Robert G., 1942-1988
Logic-based decision support.
(Annals of discrete mathematics ; 40)
Bibliography: p.
1. Decision support systems. 2. Decision-making--Mathematical models. 3. Logic, Symbolic and mathematical. I. Title. II. Series.
T58.62.J47 1989   658.4'03   88-31027
ISBN 0-444-87119-5

PRINTED IN THE NETHERLANDS
Robert G. Jeroslow 1942 - 1988
ROBERT G. JEROSLOW
1942 - 1988

On August 31 this year, towards the middle of the Mathematical Programming Symposium in Tokyo, Bob Jeroslow had a fatal heart attack. His early, sudden and completely unexpected death at the age of 46 came as lightning from a clear sky. It was a terrible shock to his friends and colleagues and in a way made all of us newly aware of our vulnerability and the frailty of the human condition. Our profession suffered a heavy loss indeed.

Bob started his graduate work in Operations Research (at Columbia first, then at Cornell), but soon switched to Mathematics and wrote his dissertation in Logic, under Anil Nerode. He was a fresh Ph.D. in the summer of ’69 on his way to a job in the Math. Department at the University of Minnesota, when he visited Pittsburgh and we met for the first time. We had several conversations that I found intellectually stimulating, and I showed him some of the problems in Discrete Optimization that I was working on and that I felt could benefit from his background in Logic. We continued our contacts through telephone and correspondence, and the upshot of our interaction was the paper “On the Structure of the Hypercube” (Management Science Research Report No. 198, CMU, August-December 1969). This joint piece of work, published later as “Canonical Cuts on the Unit Hypercube” in the SIAM Journal on Applied Mathematics, 23, 1972, seems to have played a role in luring Bob back to Operations Research, and so in 1972 he joined our group at CMU’s Graduate School of Industrial Administration as Assistant Professor.

At CMU Bob flung himself wholeheartedly into the ongoing effort of developing a cutting plane theory for integer and nonconvex programming based on Convex Analysis, that would use different tools and capture different aspects of the problem than the group theoretic approach that was prevalent at the time. Among the people outside CMU involved in this effort, Bob was in contact mainly with Fred Glover.
The result of our joint work in this direction became known as disjunctive programming, or the disjunctive method. It is essentially a theory of optimization over unions of convex polyhedra. Bob wrote several papers on the subject, some by himself, some with me and some with his former doctoral student, Charlie Blair. The topics of these papers range from the basic principles of disjunctive programming (Annals of Discrete Mathematics, 1, 1977; Journal of Optimization Theory and Its Applications, 30, 1980), to structural properties like the sequential convexifiability of facial disjunctive sets (SIAM Journal on Control and Optimization, 18, 1980; Discrete Applied Mathematics, 9, 1984), to methodological contributions like the monoidal cut strengthening procedure for mixed-integer programs that combines the disjunctive method with the algebraic approach (European Journal of Operational Research, 4, 1980).

While at CMU, Bob also made some interesting contributions to complexity theory. In one of them (Discrete Mathematics, 4, 1973) he extended the Klee-Minty result about the simplex method requiring exponentially many steps on certain problem classes, to a nonstandard variant of the simplex method which uses the pivot column choice rule of maximizing the improvement of the objective function value.

In 1978, already a full professor, Bob moved to Georgia Tech. During the late seventies and early eighties he wrote a number of important papers with Charlie Blair on the value function of an integer program (Discrete Mathematics, 19, 1977 and 25, 1979; Mathematical Programming, 23, 1982; Discrete Mathematics, 9, 1984 and 10, 1985). These papers deal with issues like subadditive duality and sensitivity analysis in integer programming.

Starting around 1982, Bob got interested in problems of integer programming representability (Mathematical Programming Study 22, 1984; Discrete Applied Mathematics, 17, 1987; European Journal of Operational Research, 12, 1988). Here he drew on earlier work by Bob Meyer, as well as on my 1974 technical report on the properties of the convex hull of feasible points of a disjunctive set. That report contained a linear representation of the convex hull of a union of polyhedra in a higher dimensional space. At the time, this representation did not seem important because its dimension grows exponentially with the number of polyhedra in the union. However, in 1982, Bob recognized the crucial fact that if applied selectively to a subset of the constraints instead of the full set, this representation provides one of the chief vehicles towards obtaining formulations whose linear programming relaxations are tight. Bob coined the term “sharp” for representations in which the inequalities by themselves define the integer hull, and obtained several important results concerning such representations. He became convinced that introducing appropriately chosen new variables is in many situations a more efficient way of sharpening a formulation than generating cutting planes in the original variables. We had many lively discussions on this matter and were planning on writing a joint paper on the subject.
Other people with whom he interacted on this topic include Charlie Blair, Kipp Martin, Ron Rardin, and his student Jim Lowe.

After a while, around 1985, Bob’s preoccupation with representability focused on the integer programming representation of logical inference; and, more generally, on the application of mathematical programming techniques to artificial intelligence, expert systems etc. (Computers and Operations Research, 19, 1986; Decision Support Systems, 1988; Annals of Discrete Mathematics, 1988). Here was finally an area upon which Bob could bring to bear the full arsenal of his training in Logic, combined with his knowledge of the polyhedral method. His pathbreaking work in this new and exciting area, much of which he presented in his Lecture Notes for the first ARIDAM at RUTCOR, published in the present volume, may ultimately prove to be the most influential part of his entire professional legacy.

Besides being an outstanding mathematician, Bob had exceptional pedagogical skills: his students used to rave about him. He was a very earnest person, scrupulously conscientious about his commitments and obligations, generous with his time for students and colleagues alike. He would sometimes worry without a good reason and get excited, or become suspicious; on those occasions he needed somebody, a friend, to calm him down. But if he needed friendship, he also offered it: he was loyal and reliable. Beyond personal relations, Bob was a warmhearted, sensitive human being, who cared about issues of fairness and justice, and was never indifferent to the plight of people he knew about. We will all badly miss him.

Egon Balas
for Richard J. Duffin
an applied mathematician in the grand style, a gentle man

"...Science as well as technology, will in the near and in the farther future increasingly turn from problems of intensity, substance, and energy, to problems of structure, organization, information, and control..."
J. von Neumann, 1949, in his attribution of the views of N. Wiener
Contents

INTRODUCTION

I MIXED-INTEGER MODEL FORMULATION

Lecture 1: Disjunctive Representations
1.1 Introduction
1.2 Some definitions and a basic result
1.3 Some small examples
1.3.1 Simple Fixed Charge
1.3.2 Simple Fixed Charge for Use in Equality Constraints
1.3.3 Simple Unbounded Fixed Charge
1.3.4 Simple Fixed Benefit
1.3.5 Simple Fixed Benefit With Minimum Usage Level
1.3.6 "Or" Logical Connective - Epigraph
1.4 Related Work
1.5 Exercises

Lecture 2: Further Illustrations
2.1 Some further examples
2.1.1 Graph of "or" logical connective
2.1.2 Graph of "exclusive or" logical connective
2.1.3 Separable programming with fixed charges and convex sections (epigraph)
2.1.4 Interactive fixed charges (epigraph)
2.1.5 Clique constraints in node packing
2.1.6 Distribution system design
2.2 A simplification in the disjunctive representation for some multiple rhs instances
2.2.1 Unions of nonempty rectangles
2.2.2 Translation of polyhedra
2.2.3 "Primitive" either/or constraints
2.3 'Separate' vs. 'joint' formulations
2.4 Exercises

Lecture 3: Constructions which Parallel Set Operations
3.1 Definitions and basic constructions
3.2 The union construction
3.3 Some other constructions
3.4 Some technical properties of the basic constructions
3.5 Composite constructions and 'structure' in MIP
3.6 Two central technical results
3.7 Hereditary sharpness

Lecture 4: Topics in Representability
4.1 Reformulation via distributive laws
4.2 Convex union representability
4.3 Using combinatorial principles in representability
4.4 Some experimental results
4.4.1 Either/or constraints
4.4.2 Multiple fixed charges
II LOGIC-BASED APPROACHES TO DECISION SUPPORT

Lecture 5: Propositional Logic and Mixed Integer Programming
5.1 Introduction
5.2 A "natural deduction" system for propositional logic
5.3 Propositional logic as done by integer programming
5.4 Clausal chaining: a subroutine
5.5 Some properties of frequently-used algorithms of expert systems
5.6 The Davis-Putnam Algorithm in Two Forms
5.7 Some recent developments (December 1987)
5.8 Exercises

Lecture 6: A Primer on Predicate Logic
6.1 Introduction
6.2 Predicate logic: basic concepts, notation
6.3 Applications for problem-solving

Lecture 7: Computational Complexity above NP: A Retrospective Overview
7.1 Introduction
7.2 The fundamental distinction: conceptions vs. their instances
7.3 Two fundamental results
7.4 What if we increase expressability "a little bit"?
7.5 The Polynomial Hierarchy, Probabilistic Models, and Games

Lecture 8: Theorem-Proving Techniques which Utilise Discrete Programming
8.1 Reduction of Predicate Logic to a Structured Propositional Logic
8.2 Preliminary discussion
8.3 The algorithm framework
8.4 Illustrations and comments
8.5 A generalisation: predicate logic together with linear constraints

Lecture 9: Spatial Embeddings for Linear and Logic Structures
9.1 Definition of an Embedding
9.2 Illustrations of embeddings
9.3 Results for predicate logic embeddings
9.4 Logic and pre-processing routines for MIP: an example via the DP/DPL algorithm

Lecture 10: Tasks Ahead
10.1 Three "top-down" Views of Mathematical Programming
10.1.1 The Intellectual Heritage
10.1.2 Academic settings for Mathematical Programming
10.1.3 Users' Perspectives
10.1.4 Some conclusions
10.2 Some research challenges related to these lectures
10.2.1 Research on MIP representability
10.2.2 Research on the AI/OR Interface
10.3 Some other research programs in the AI/OR Interface
10.4 Some programs and courses in the AI/OR Interface
10.4.1 Purdue University, MIRC
10.4.2 University of Texas at Austin, Ph.D. Programs in MIS and OR
10.4.3 Carnegie-Mellon University, GSIA and SUPA
10.4.4 University of Iowa, Management Sciences
10.4.5 University of Colorado at Boulder, MIS and OR
10.4.6 Northwestern University, MIS
10.4.7 Duke University, the Fuqua School
10.4.8 Massachusetts Institute of Technology, the Sloan School
10.4.9 Georgia Institute of Technology, Management Science
10.5 Guessing Ahead

Illustrative Examples

Solutions to Examples

Bibliography
List of Figures

1 Naive Paradigm of DSS
2 Simple Fixed Charge
3 Equality Fixed Charge
4 Unbounded Fixed Charge
5 Simple Fixed Benefit
6 Fixed Benefit with Minimum Usage
7 Separable Programming
8 Interactive Fixed Charges
9 Rectangular Domain
10 Clique Constraints
11 Need for the Hypotheses
12 Intersection Loses Sharpness
13 H is F + G
14 H is max{F, G}
15 A Lattice of Representations
16 Network on an Index Set
17 Complex Fixed Charges
18 An And/or Tree
19 “Or” Tree
20 “Or/and” Tree
21 A Complexity Chain
22 Algorithm Outline
23 Intellectual Heritage
24 Current Influences
25 Academic Settings
26 Piecewise Linear Function
27 Two Convex Sections
28 Concave Sum
29 Linear Sum
30 Minimum of Functions
31 Truth Valuation Search Tree
List of Tables

I MULTI-DIVISION PROBLEMS: R.H.S. MULTIPLIER (1.1)
II MULTI-DIVISION PROBLEMS: R.H.S. MULTIPLIER (1.3)
III MULTI-DIVISION PROBLEMS: R.H.S. MULTIPLIER (1.9)
IV
V
VI Seven problems with P = 0.3, NX1 = 5
VII Six problems with P = 0.3, NX1 = 6
VIII A harder problem at NX1 = 6, P = 0.3
I SATISFIABILITY TESTS USING BANDBX
II SATISFIABILITY TESTS USING LAND AND POWELL’S CODE
III TESTS USING APEX IV
IV CREATING A HARD PROBLEM
Introduction

This monograph is directly based on a series of ten lectures, of the same title, which I gave at Rutgers University as part of the Advanced Research Institute on Discrete Applied Mathematics (ARIDAM). The lectures divide naturally into two parts. In Lectures 1 through 4, we cover the theory of representations for those problems which are solvable by mixed-integer programming (MIP), with emphasis on disjunctive formulation techniques. In Lectures 5 through 9, we discuss logic-based approaches to decision support which help to create more "intelligent" systems. We try to show the huge potential for MIP techniques to assist in these approaches and, conversely, the potential for results in applied logic to be relevant in MIP research. Lecture 10 raises broader philosophical issues for speculation and discussion, and it attempts to put the work treated in the previous lectures in a broader perspective.

Those readers interested primarily in mixed-integer programming model formulation may wish to read only the first four lectures. However, in my view, some of the most interesting research challenges for MIP in the future will derive from tie-ins to other subject areas, notably those relating to decision support and "intelligent" software systems. The disjunctive formulation techniques are specifically designed to bridge the tie-in to logic-based methods for decision support. Seen more broadly, the technical issue of MIP problem formulation is part of knowledge representation [Win 1984]. Moreover, the distinction between MIP heuristic and branch-and-bound algorithms versus MIP formulation exactly parallels that between artificial intelligence search techniques [Nil 1971], [Pea 1984] versus knowledge representation techniques.
Recent years have seen significant practical successes for applications of certain artificial intelligence techniques, notably "expert systems" [Hay Wat Len 1983], as well as for certain mixed-integer programming techniques [Hof Pad 1984]. From the user's perspective, both use modeling conceptions and computer-based techniques to solve problems. My main thesis in these lectures is that, going beyond the simple "juxtaposition" of both approaches, there is also substantial practical benefit to be obtained from pursuing the intellectual connections between the approaches. In this development, against the backdrop of easily-accessed distributed processing, there is potential to more fully realize the aspiration of the 1940's and 1950's: of having widespread use of sophisticated modeling methodologies to aid in decision-making for large organizations and complex modern societies.
The lectures are intended for an audience familiar with mathematical programming. For example, we include introductory and expository material on the propositional calculus and predicate calculus from logic, although similar material is available at greater length in a number of excellent introductory logic texts, e.g. [Men 1964], [Shoen 1967]. With the earlier lectures, we also include some exercises with solutions.

I am grateful to Peter Hammer for inviting me to give these lectures and for encouraging their development, and to Rutgers University and the Air Force Office of Scientific Research for their sponsorship of the ARIDAM. I am particularly thankful to Professor Bernhard Korte, and the Alexander von Humboldt Foundation, for their support on a leave of absence in 1983 during which I could begin in-depth study of some of the new developments in Artificial Intelligence. Most of the ideas in these lectures were sketched out during that period, although it has taken several years of further research and writing to provide precise technical underpinnings for what we conjectured then. During this process, I have greatly benefitted from the continued support of the National Science Foundation, the Air Force Office of Scientific Research, and the Georgia Tech Foundation.

I wish to thank former Dean Charles Gearing for making me aware of developments in computer science and computer technology and their relevance to business problems. Both Anil Nerode and Richard Platek were very helpful in suggesting readings and literature. My secretary Tawanna Tookes has always provided invaluable assistance in producing the manuscript against demanding deadlines. I am grateful for her efforts. I also wish to extend my appreciation to Dr. Endre Boros, for his fine work in preparing camera-ready copy for this volume.
During the lectures, I very much appreciated comments from Egon Balas, Peter Hammer, Giorgio Gallo, Harvey Greenberg, Michel Minoux, Bruno Simeone and other ARIDAM participants. I believe they will find several of their suggestions reflected here.
Atlanta, Georgia July 1986
Part I
MIXED-INTEGER MODEL FORMULATION
LECTURE 1
DISJUNCTIVE REPRESENTATIONS: A FUNDAMENTAL RESULT AND SOME ILLUSTRATIONS

Summary: The basic formulation result; the essential need for auxiliary variables, and obtaining faces and facets by projection; modular focus and linkage issues; some illustrations.

1.1 Introduction
Starting in the late 1960's, when experience with solving mixed-integer programs (MIPs) began to accumulate, it was empirically observed that different algebraic representations of the same MIP constraint condition could behave very differently in computation. In one algebraic formulation, a given MIP could be intractable, while the same MIP might be easily solvable with another formulation. In addition, the easily-solved formulation might involve many more variables and constraints than the intractable one. This latter fact was not consistent with experience from linear programming, and suggested that some new features of MIP formulations could override representation size as a key to the computational tractability of a MIP.

To our knowledge, the earliest systematic studies of this nature are in [Geo Gra 1974] and [Wil 1974], although the phenomenon was part of the "folklore" of MIP earlier. Moreover, it is in [Geo Gra 1974, Section 5] that the size of the linear relaxation is specifically cited as a key feature of a MIP formulation, with a smaller linear relaxation being better. By the "linear relaxation" (LR) of a MIP formulation, we mean the linear program (LP) which results when every variable declaration "$z_j$ is binary" is replaced by "$0 \le z_j \le 1$." While [Geo Gra 1974] is generally cited as the first successful large-scale test of Benders' decomposition [Bend 1962], it is equally notable for its insights on MIP formulation, which contributed crucially to this success.

Let us put this phenomenon in a broader decision support perspective. A
naive paradigm for the solution of problems via models is in Figure 1 below. Realistic paradigms are given in [Bon Hol Whi 1982] and [Nan Bal 1983] but even the crude one in Figure 1 will allow us to make several points.

["REAL WORLD" PROBLEM -> REPRESENTATION -> ALGORITHM / INFERENCE ENGINE -> USER'S "SOLUTION"]

Figure 1: Naive Paradigm of DSS
In Figure 1, the "representation" includes:

• Choice of language for representation
• Determination of representability
• Choice of a "best" or "undominated" representation
• Data structures to implement the representation

Also, the "inference engine" includes:

• Pre-, during, and post-processing
• Precompilation of frequently used routines
• Algorithms for special structure/substructures
• General-purpose algorithms/algorithm frameworks
In examining Figure 1, each arrow is somehow mysterious. How do real-world problems achieve a formulation? How does a computer printout either lead to or guide a solution, which is often to be implemented by a large organization with multiple participants having different orientations, abilities and preferences? In these lectures, we will not address these two latter questions, but we do focus on the arrow from representation to computation. Clearly, the algorithm ought to be matched to the representation, at least at run time. In branch-and-bound (BB), which solves a MIP by examining a sequence of linear programs which are LR's of various partial solutions (see e.g. [Gar Nem 1972] for a discussion of branch-and-bound), the LR's should be as representative of the problem as is possible. For constraints, one has more information
on the MIP constraint set when its LR is smaller. The theoretically smallest LR would be the convex hull of the MIP constraints (see [Roc 1970] or [Stoe Wit 1970] for convexity terminology and results). A formulation with this convex hull property we call sharp.

It follows that, if we wish to solve MIPs by branch-and-bound (BB), we should use formulations with a small linear relaxation (LR), ideally a sharp formulation. There are, however, trade-offs in the size of the LR and the size of the MIP formulation, which we will discuss more fully later on. This dictum - of looking for a MIP formulation with a small LR - still holds if Lagrangean relaxation is imbedded in the BB scheme, or if Benders decomposition is used, or if an algorithm other than the Simplex Method is used to solve the LR, etc.

At run time the algorithm should be matched to the representation - i.e. formulation and solution are not independent subjects. There are some rather good representation techniques for substantial parts of human knowledge, as for example the first-order predicate logic to be discussed in Part Two. However, when computation is needed, the knowledge may have to be temporarily converted to a different form in order for efficient solution by specific algorithms.

Despite the early realization of the importance of formulation techniques for MIP, very little was done in this area for nearly a decade following [Geo Gra 1974]. The formulation work which was undertaken in the 1970’s was focused on cutting-planes, which is a more restrictive conception than current approaches. When interest returned in the early 1980’s to the solution of large-scale MIPs, the issue of problem representation was again addressed. As attention has gradually become focused on this topic, researchers have become aware of the relatively limited number of techniques for obtaining good MIP representations.
There is even the more fundamental question, as to what kinds of constraints yield to MIP formulations at all. Aside from R. R. Meyer and T. Ibaraki, for many years few researchers regarded this latter question as of interest. In many respects, the field of discrete programming had not progressed beyond Dantzig’s early model formulations [Dan 1957] that first motivated the importance of binary and integer variables in a program. We do have some knowledge of MIP formulation techniques, however, and our knowledge is growing very rapidly now. We begin to review some of that knowledge in this lecture, building on work of R. R. Meyer and on the disjunctive methods of cutting-plane theory.

We begin by illustrating that, already in linear programming, there are
representation issues which have not been systematically addressed. Consider, for example, the convex constraint n
j=1
This can be represented by 2" linear constraints, in which all combinations of signs occur for the terms f z j . For instance, with n = 2, 1211 1211 5 1 becomes:
+
Moreover, each of the inequalities in (1.2) is needed, since it is a facet (see e.g. [Roc 1970] or [Stoe Wit 1970] for this terminology). Generally, all $2^n$ inequalities are needed if we use a linear system in only the "original" variables $x_j$. Obviously, such a linear system is too large to use, except possibly if the inequalities are added as they are needed. However, this entire issue disappears if we are willing to use "auxiliary variables". We note that the points defined by (1.1) are the convex span of the positive and negative unit vectors $\pm e_j$, and we can represent $\sum_j |x_j| \le 1$ by the system (1.3), which uses nonnegative auxiliary variables. Yet a different representation with auxiliary variables is the system (1.4).
Both of the systems (1.3) and (1.4) are small in size. Under what circumstances does a polyhedron (i.e. the solution set to a system of linear inequalities), which may require a large number of inequalities to define, nevertheless have a compact formulation, possibly with auxiliary variables? This basic question has not been investigated, to our knowledge.
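The size gap between the two kinds of description can be seen in a small Python sketch. It compares the $2^n$ sign-pattern inequalities with one compact auxiliary-variable system, $-y_j \le x_j \le y_j$, $\sum_j y_j \le 1$. That compact system is an assumption chosen for illustration (the displays (1.3) and (1.4) are not legible here), and the function names are hypothetical.

```python
from itertools import product


def in_ball_by_facets(x, tol=1e-12):
    """Check all 2**n inequalities sum_j s_j * x_j <= 1 with s_j in {+1, -1}."""
    return all(
        sum(s * v for s, v in zip(signs, x)) <= 1 + tol
        for signs in product((1, -1), repeat=len(x))
    )


def in_ball_by_auxiliary(x, tol=1e-12):
    """Check the assumed compact system -y_j <= x_j <= y_j, sum_j y_j <= 1.

    The smallest admissible y_j is |x_j|, so feasibility in y reduces to
    testing that single choice.
    """
    y = [abs(v) for v in x]
    return sum(y) <= 1 + tol


def count_constraints(n):
    """Return (number of sign-pattern inequalities, size of the compact system)."""
    return 2 ** n, 2 * n + 1
```

For $n = 10$ the first description needs 1024 inequalities while the compact one needs 21 constraints in 20 variables, yet both accept exactly the same points $x$.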
1.2 Some definitions and a basic result
We return to R. R. Meyer's concern for a precise definition of those sets representable by MIP constraints. We adapt his approach in the following definition.
Definition: Let a set $S \subseteq R^n$ be given. We say that $S$ is bounded-MIP-representable (b-MIP.r) if there are matrices $A$, $B$ and a vector $h$, and a subset $K$ of the indices of auxiliary variables $y$, with

$$x \in S \iff \text{for some } y \text{ with } y_j \in \{0,1\} \text{ for } j \in K \text{ we have } Ax + By \ge h \tag{1.5}$$

In this setting, $\bar S = (A, B, h, K)$ is called a representation of $S$, and its relaxation is:

$$\mathrm{rel}(\bar S) = \{x \mid \text{for some } y \text{ with } 0 \le y_j \le 1 \text{ for } j \in K, \text{ we have } Ax + By \ge h\}$$

We note that:

• rel($\bar S$) depends on $\bar S$, not just $S$. Always rel($\bar S$) $\supseteq$ conv($S$).

• Auxiliary variables are an intrinsic feature. The $y_j$ for $j \in K$ are called control variables, as they "control" the part of $S$ in which $x$ lies.

• The set $S$ may be unbounded; the term "bounded" refers to the $y_j$, $j \in K$, which could (equivalently) be given any bounds.
In his work, Meyer has been concerned with several notions of representability, which are distinct, and only one of which is essentially the one above. The concept above is the only representability we shall use here. It is adequate to cover all cases in which sets have been represented using linear inequalities and bounded integer variables. It is thus adequate for practical applications to date. However, it is more restrictive than representability by general integer variables (e.g. [Mey 1975], [Jer Low 1984]) or other concepts of representability involving unions rather than integer variables (e.g. [Mey Tha Mal 1980]).
Meyer's original focus was on the representability of functions, while here our focus is on sets. The functions $f$ representable in Meyer's sense are those whose epigraphs epi($f$) are representable in the sense above. The concept of function representability used by Ibaraki is essentially that of the graph gph($f$) of $f$. In some contexts, representation of the hypograph hypo($f$) is also important. Here we have the definitions, where $x$ ranges over the domain of $f$:

$$\mathrm{epi}(f) = \{(z,x) \mid z \ge f(x)\}, \qquad \mathrm{hypo}(f) = \{(z,x) \mid z \le f(x)\}, \qquad \mathrm{gph}(f) = \{(z,x) \mid z = f(x)\}$$
Lemma: gph($f$) is b-MIP.r iff both epi($f$) and hypo($f$) are b-MIP.r.

Discontinuous functions cannot have a representable graph, hence the fixed-charge function (see section 1.3 below) does not. However, the epigraph of the fixed-charge function is representable. Thus, the concepts of epigraph, hypograph, and graph representability are all distinct. By the lemma, graph representability is the more restrictive concept for functions.

The different concepts of function representability are relevant in different modeling situations. For example, suppose that $f$ appears only positively in the minimizing objective and/or in $\le$ constraints:

$$\begin{aligned} \min\ & \cdots + f(x) + \cdots \\ \text{s.t.}\ & \cdots + f(x) + \cdots \le \cdots \end{aligned} \tag{1.8}$$

Then represent epi($f$). This procedure works because occurrences of "$f(x)$" can be replaced by $z$, and representations of "$z \ge f(x)$" can be added to the constraints. These modifications do not change the set of solutions $x$. A similar rule holds for the use of "hypo($f$)" in monotone situations. In general, a representation of gph($f$) is needed. [Jer 1984a] gives some fairly broad conditions under which representability of epi($f$) or hypo($f$) suffices, but this issue has not been systematically investigated.

We now discuss disjunctive formulations, which are a particular kind of representation. First, we need to define the concept of a "starred recession cone". For a polyhedron $P$ described using auxiliary variables
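$$P = \{x \mid \text{for some } y,\ Ax + By \ge h\} \tag{1.9}$$

define, for the representation $\bar P$ given by this system,

$$\mathrm{rec}^*(\bar P) = \{x \mid \text{for some } y,\ Ax + By \ge 0\} \tag{1.10}$$

Lemma: If $P \ne \emptyset$, rec*($\bar P$) is independent of the representation $\bar P$ of $P$, and in fact:

$$\begin{aligned} \mathrm{rec}^*(\bar P) &= \{x \mid \text{for some } x' \in P,\ x' + \lambda x \in P \text{ for all } \lambda \ge 0\} \\ &= \{x \mid \text{for all } x' \in P,\ x' + \lambda x \in P \text{ for all } \lambda \ge 0\} \\ &= \mathrm{rec}(P) \end{aligned} \tag{1.11}$$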
In the lemma, rec($P$) is the ordinary recession cone of the polyhedron as in convexity theory (e.g. [Roc 1970] or [Stoe and Wit 1970]). It is the cone of directions such that, starting from some point in $P$, one can "recede" in that direction indefinitely without leaving $P$. For $P$ bounded, rec($P$) = {0}. For $P$ empty, by definition rec($P$) = $R^n$.

When $P$ is empty, rec*($\bar P$) and rec($P$) can be different. Indeed if $P = \{x \mid x = y,\ 1 \le y \le -1\}$ then $P = \emptyset$, but using the bracketed representation for $P$, rec*($\bar P$) = {0}. As it turns out, the starred recession cone is more convenient to use in representation work, even though it depends on the representation $\bar P$ of $P$ when $P = \emptyset$. For starred recession cones, see [Jer 1984c] and [Jer 1986b].

Our next result is from [Jer 1984c], and it describes a representation which is obtained by adapting ideas from Balas' union construction [Bal 1974] and our co-propositions in [Jer 1974].

Disjunctive Representation Lemma: Suppose that $P_1, \ldots, P_t$ are polyhedra with representations $\bar P_1, \ldots, \bar P_t$. Then the following is a representation of the smallest representable set $S$ containing $P_1 \cup \cdots \cup P_t$:

$$\begin{aligned} x \in S \iff{} & \text{there are } x^{(1)}, \ldots, x^{(t)},\ y^{(1)}, \ldots, y^{(t)} \text{ and binary scalars } m_1, \ldots, m_t \text{ with} \\ & x = \sum_{i=1}^{t} x^{(i)}, \qquad 1 = \sum_{i=1}^{t} m_i, \\ & A^{(i)} x^{(i)} + B^{(i)} y^{(i)} \ge h^{(i)} m_i, \quad i = 1, \ldots, t \end{aligned} \tag{1.12}$$

When rec*($\bar P_i$) is independent of $i = 1, \ldots, t$ then $S = P_1 \cup \cdots \cup P_t$.
In the lemma, note that the final representation (1.12) depends on the given ones $\bar P_i$. We cannot therefore speak of "the" disjunctive representation of a set $S$, but rather of "a" disjunctive representation of $S$.

The representation (1.12) appears to be restricted to use on "either/or constraints" of the form: $x \in P_1$ or $x \in P_2$ or ... or $x \in P_t$. However, one consequence of our next result is that the only sets which can be represented are of the "either/or" form, so that (1.12) is entirely general. The result is essentially from [Jer Low 1984].

Theorem: $S \subseteq R^n$ is b-MIP.r iff $S = P_1 \cup \cdots \cup P_t$ is a (finite) union of polyhedra $P_i$ with rec($P_i$) independent of $i = 1, \ldots, t$.

When $S$ is b-MIP.r, let $\bar P_1, \ldots, \bar P_t$ be representations of $P_1, \ldots, P_t$ with rec*($\bar P_i$) independent of $i = 1, \ldots, t$ (as can always be assumed). Then the disjunctive formulation $\bar S$ represents $S$ and furthermore is sharp, i.e. rel($\bar S$) = conv($S$).
Corollary: Any representable set has a sharp formulation.
To my knowledge, the only two completely general formulation techniques are the disjunctive and the netforms of Glover, Klingman, and McMillan (see e.g. [Glo Kli McM 1977], [Glo Hul Kli Stu 1978]). Of the two, only the disjunctive technique always gives sharp formulations. The netform technique reduces a MIP to a network with logical conditions. While the latter formulation is not always sharp, it is very much in the spirit of meshing representation to the algorithm, where the algorithm used in netforms is one for networks.

While uniform representability with sharpness is a desirable aspect of the disjunctive formulation (1.12), its disadvantage is its size. For example, when the number of logical alternatives (i.e. number $t$ of $P_i$) is large, the representation can be huge. If simplifications do not occur, then the representation may be huge even with a smaller number of logical alternatives.

As we work exercises with disjunctive formulations in section 1.3, we will see that, frequently, substantial simplifications occur which greatly reduce the size. However, this happens on an ad hoc basis, and sometimes the simplifications are not substantial. Later on in these lectures, we will cite a theoretical principle which sometimes allows great simplifications, and we will cite a modification of the disjunctive construction which, when it applies, tends to reduce representation size dramatically. Work of this nature is current research. In order to get a "feel" for what the disjunctive formulations can do, when they simplify, etc., one must work numerous examples.

We now mention the issue of modularity of representations. For a "complex" set $S \subseteq R^n$, it is usually too difficult to get a sharp (i.e. rel($\bar S$) = conv($S$)) representation. So one works with subsets ("modules") within the constraints. This simple point is worth emphasizing, for it is crucial to understanding research in representation techniques. Given that one works with modules and not the entire constraint set $S \subseteq R^n$, the question arises, as to how the modules are to be 'linked' together. We will discuss this issue in Lectures 3 and 4.

Several other advantages of disjunctive approaches are that these:

• Allow modularization of representations, with focus on linkage

• Mesh with logic-based approaches to artificial intelligence

• Mesh with polyhedral combinatorics to improve specialized representations
A general conceptual approach to MIP modeling due to R. K. Martin is "variable redefinition", which has led to formulations for lot-sizing problems of various types, using auxiliary variables. It has a broad domain of applicability. It is a conceptual framework, rather than a technique, and sharpness of the representation is essentially an hypothesis of the framework. Conceptually it includes the disjunctive methods. We will discuss it further in Lecture 4.

Other formulation techniques use problem-specific ingenuity, and tend to focus on the travelling salesman problem and, more generally, networks with fixed charges. For certain lot sizing problems, Wolsey and Barany achieved a sharp formulation, and this has been recently extended in [Wol 1986]. More typically, the linear relaxation is tightened, without necessarily obtaining sharpness. Results of Wolsey, Barany, van Roy, Padberg, Claus, Hochbaum, Magnanti and Wong, Rardin and Choe, Leung, and others fall in this category. We mention also Kipp Martin's recent cubic size sharp formulation of (the incidence vectors of) spanning trees in graphs. Although the problem-specific techniques are for specialized settings, they have produced a number of new ideas which may prove useful more broadly, and are already of value in significant applications.
We end this section with a brief reference to the technique of projection from a representation, which can be used in contexts where a characterization of facets is needed, or a sharp linear system in "original" variables is desired. The technique was first used by Balas and Pulleyblank, who successfully obtained such a linear system in the context of bi-partite matching. This technique applies when a representation is available which is sharp, so that

$$x \in \mathrm{conv}(S) \iff \text{there is } y \text{ with } Ax + By \ge h \tag{1.13}$$

Here we include $0 \le y_j \le 1$, $j \in K$ among the representation constraints for $\bar S = (A, B, h, K)$. We observe that all facets of conv($S$) occur as $(pA)x \ge ph$ as $p$ varies among all basic feasible solutions to the system (1.14).

For any $x^* \in R^n$, one can minimize $p(Ax^* - h)$ subject to (1.14). A negative value is found iff $x^* \notin$ conv($S$). If $x^* \notin$ conv($S$), a basic feasible solution $p^*$ is found which "cuts off" $x^*$: $p^* A x^* < p^* h$, although $(p^* A)x \ge p^* h$ is valid for conv($S$). One can use the projection method to generate original variable cuts "as needed" from a sharp representation. When used in the setting of a theoretical characterization, as done by Balas and Pulleyblank, its success requires "spotting the pattern" of all basic solutions $p$ to (1.14). Here computational work can help to develop insights.
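The system (1.14) itself is not legible here. One system consistent with the surrounding description, offered only as an assumption and not as the author's display, is the normalized cone of multipliers that eliminate $y$:

```latex
\begin{align*}
  pB &= 0, \\
  \textstyle\sum_i p_i &= 1, \\
  p &\ge 0 .
\end{align*}
% For any such p, multiplying Ax + By >= h on the left by p gives
%   (pA)x + (pB)y >= ph,  i.e.  (pA)x >= ph,
% an inequality in the original variables x alone.
```

Under this reading, the normalization $\sum_i p_i = 1$ is what keeps the minimization of $p(Ax^* - h)$ bounded and makes "basic feasible solution" meaningful; a different normalization would serve the same purpose.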
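1.3 Some small examples

In this section, we begin to develop experience with disjunctive formulations, and with the theorem characterizing representability.

1.3.1 Simple Fixed Charge

$$F(x_1) = \begin{cases} 0, & x_1 = 0 \\ f, & 0 < x_1 \le M \end{cases}$$

Figure 2: Simple Fixed Charge

Here we have epi($F$) = $P_1 \cup P_2$ where

$$P_1 = \{(z, x_1) \mid x_1 = 0,\ z \ge 0\}, \qquad P_2 = \{(z, x_1) \mid 0 \le x_1 \le M,\ z \ge f\}$$

Here rec*($\bar P_1$) = rec*($\bar P_2$), so epi($F$) is representable. A disjunctive formulation is:

$$\begin{aligned} & x_1^{(1)} = 0\, m_1, \qquad z^{(1)} \ge 0\, m_1 \\ & 0\, m_2 \le x_1^{(2)} \le m_2 M, \qquad z^{(2)} \ge m_2 f \\ & m_1 + m_2 = 1, \qquad m_1, m_2 \text{ binary} \\ & x_1 = x_1^{(1)} + x_1^{(2)} \ (= 0 + x_1^{(2)}) \\ & z = z^{(1)} + z^{(2)} \ (\ge 0 + m_2 f = m_2 f) \end{aligned}$$

This simplifies to (use $m_2 = y$):

$$0 \le x_1 \le yM, \qquad y \text{ binary}$$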
where the term "$+yf$" is put into the objective (or used for "$+F(x_1)$" in constraints). This is the usual representation for a simple fixed charge. We note that the variable $y$ of the usual representation is an auxiliary variable of the disjunctive formulation. This raises the issue: what does "original" variable mean?

If the epigraph is taken as "given", the variable $y$ is "auxiliary." If the usual algebraic representation is taken as "given", the variable $y$ is "original." What do we view as being a "given" of the problem? I would suggest that the real world problem is the given, and the rest are concepts about it (including the concept of a fixed charge function). Since what is an "original" variable and what is an "auxiliary" variable depends on the conceptualization, these are relative, and hence somewhat artificial, categories. In some cases, the problem is already "given" in abstract terms as e.g. representing the incidence vectors of members of a matroid. For these problems, there can be a meaningful distinction between "original" and "auxiliary" variables.

Let us now consider the use of the function above in constraints. gph($F$) is not closed, hence cannot be a (finite) union of polyhedra. Therefore, gph($F$) is not representable.
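The usual fixed-charge representation of 1.3.1 can be checked numerically with a short Python sketch. It evaluates the smallest admissible $z$ once with $y$ binary and once in the linear relaxation. The function names are hypothetical, and the brute-force search over $y$ is only an illustrative device, not a solution method proposed in the text.

```python
def fixed_charge(x1, f, M):
    """F(x1) of 1.3.1: 0 at x1 = 0 and f for 0 < x1 <= M."""
    if not 0 <= x1 <= M:
        raise ValueError("x1 outside [0, M]")
    return 0 if x1 == 0 else f


def mip_value(x1, f, M):
    """Smallest z with z >= f*y, 0 <= x1 <= y*M and y binary (brute force over y)."""
    feasible = [f * y for y in (0, 1) if 0 <= x1 <= y * M]
    return min(feasible)


def lr_value(x1, f, M):
    """Same constraints with 0 <= y <= 1 and f >= 0: the smallest feasible y is x1/M."""
    return f * x1 / M
```

With $f = 10$ and $M = 4$, the binary version reproduces $F$ exactly, while the relaxation gives $10 x_1 / 4$, which agrees with $F$ only at $x_1 = 0$ and $x_1 = M$. This is the convex hull of the epigraph, as sharpness predicts.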
1.3.2 Simple Fixed Charge for Use in Equality Constraints

For use in cash balance equations, the following modification of the function in 1.3.1 can be useful (here $L > 0$ is a "minimum usage level").

Figure 3: Equality Fixed Charge

gph($F$) is $P_1 \cup P_2$ where

A disjunctive formulation is:

$$z = yf, \qquad y \text{ binary}$$

Instead of using the equation $z = yf$, one may rather put "$yf$" at each occurrence of $F(x_1)$. In the "real world", both minimum usage levels $L > 0$ and maximum levels $M$ typically occur. However, what one uses to achieve a representation depends on the representation task. Why have we used a maximum level $M$ above? Our next example shows that it is needed (see [Mey 1975]).
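The displays defining $F$, $P_1$ and $P_2$ in 1.3.2 are not legible here. A reconstruction consistent with the surviving fragments (the minimum usage level $L$, the maximum level $M$, and the equation $z = yf$) might read as follows; every line of it is an assumption.

```latex
\[
  F(x_1) = \begin{cases} 0, & x_1 = 0 \\ f, & L \le x_1 \le M \end{cases}
\]
\[
  P_1 = \{(z, x_1) \mid x_1 = 0,\ z = 0\}, \qquad
  P_2 = \{(z, x_1) \mid L \le x_1 \le M,\ z = f\}
\]
% Disjunctive formulation with m_1 + m_2 = 1, m_1, m_2 binary:
%   x_1^{(1)} = 0,  z^{(1)} = 0,
%   L m_2 <= x_1^{(2)} <= M m_2,  z^{(2)} = f m_2,
% which, writing y = m_2, simplifies to
\[
  L y \le x_1 \le M y, \qquad z = y f, \qquad y \ \text{binary}.
\]
```

Both $P_1$ and $P_2$ are bounded in this reading, so their recession cones are both $\{0\}$ and the recession condition of the theorem holds.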
1.3.3 Simple Unbounded Fixed Charge

$$F(x_1) = \begin{cases} 0, & \text{if } x_1 = 0 \\ f, & \text{if } 0 < x_1 \end{cases}$$

Figure 4: Unbounded Fixed Charge

Here we cannot get the recession condition to hold. Hence the epigraph is not representable. For example, we might try: epi($F$) = $P_1 \cup P_2$, where

$$P_1 = \{(z, x_1) \mid x_1 = 0,\ z \ge 0\}, \qquad P_2 = \{(z, x_1) \mid 0 \le x_1,\ z \ge f\}$$

Then rec($P_1$) = $\{(z, 0) \mid z \ge 0\}$, but rec($P_2$) = $\{(z, x_1) \mid z \ge 0,\ x_1 \ge 0\}$, so rec($P_1$) $\ne$ rec($P_2$). If we try the disjunctive construction anyway, we will get the representable hull. (Here it is the identically zero function.) This outcome is typical of common (sophisticated) errors when a representation is attempted for a nonrepresentable set.

In this example, we note that epi($F$) is still not representable if a minimum usage level is added, i.e. $F(x_1) = f$ if $L \le x_1$. Again the recession condition fails.

In our next example, we suppose that a benefit is obtained (rather than a cost incurred) from utilization of a resource as measured by $x_1$. The discontinuity of a fixed charge at the origin caused no difficulties for epigraph representation. However, here we shall see that the matter is different.
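One informal way to see why the unbounded fixed charge of 1.3.3 degenerates is to impose an artificial bound $M$ and let it grow. The Python sketch below does this with the LR value $f x_1 / M$ of the bounded formulation. It is an illustration under that assumption, not a proof of the representable-hull claim, and the function names are hypothetical.

```python
from fractions import Fraction


def lr_fixed_charge(x1, f, M):
    """LR value f*x1/M of the bounded fixed charge 0 <= x1 <= y*M, z >= f*y."""
    return Fraction(f) * Fraction(x1) / Fraction(M)


def lr_values_as_bound_grows(x1, f, bounds):
    """LR value at a fixed x1 for each artificial upper bound M in `bounds`."""
    return [lr_fixed_charge(x1, f, M) for M in bounds]
```

At any fixed $x_1$ the relaxed charge shrinks toward zero as $M$ increases, which matches the identically zero function obtained when the bound is dropped altogether.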
1.3.4 Simple Fixed Benefit

$$F(x_1) = \begin{cases} 0, & \text{if } x_1 = 0 \\ -B, & \text{if } 0 < x_1 \end{cases} \qquad (\text{here } B > 0)$$

Figure 5: Simple Fixed Benefit

Here epi($F$) is not closed, since the line segment from (0,0) to (0,-B) is not in epi($F$). Hence epi($F$) is not a union of polyhedra; and so it is not representable.
1.3.5 Simple Fixed Benefit With Minimum Usage Level

$$F(x_1) = \begin{cases} 0, & \text{if } 0 \le x_1 < L \\ -B, & \text{if } L \le x_1 \end{cases}$$

Figure 6: Fixed Benefit with Minimum Usage

Here epi($F$) = $P_1 \cup P_2$ where

$$P_1 = \{(z, x_1) \mid z \ge 0,\ x_1 \ge 0\}, \qquad P_2 = \{(z, x_1) \mid z \ge -B,\ x_1 \ge L\}$$

The disjunctive formulation simplifies to (exercise):

$$x_1 \ge Ly, \qquad y \text{ binary}$$

with "$-yB$" in the criterion for $F(x_1)$. In the fixed benefit context, the unboundedness of $x_1$ did not matter; it was the minimum usage level which was crucial.

The subtleties of modeling, which we have illustrated in the first five examples of functions of a single variable, give an indication of complexities of functions of two or three variables. We will see some of these issues in the next lecture.

We conclude these examples by representing the epigraph of a logical connective, in which the value zero is viewed as "false", while value one is viewed as "true".

1.3.6 "Or" Logical Connective - Epigraph
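First, let us represent a set which is simpler than the logical connective. Specifically, consider $S = P_1 \cup P_2$ where:

$$P_1 = \{(z, x_1, \ldots, x_n) \mid z \ge 0,\ x_i = 0 \text{ all } i = 1, \ldots, n\}$$

$$P_2 = \{(z, x_1, \ldots, x_n) \mid z \ge 1,\ 0 \le x_i \le 1 \text{ all } i = 1, \ldots, n\}$$

A disjunctive formulation is:

$$\begin{aligned} & z^{(1)} \ge 0\, m_1, \qquad z^{(2)} \ge 1\, m_2 \\ & x_i^{(1)} = 0\, m_1 \ \text{all } i, \qquad 0 \le x_i^{(2)} \le 1\, m_2 \ \text{all } i \\ & m_1 + m_2 = 1 \\ & z = z^{(1)} + z^{(2)} \ (\ge 0 + m_2 = m_2) \\ & x_i = x_i^{(1)} + x_i^{(2)} \ (= x_i^{(2)}) \end{aligned}$$

This simplifies to:

$$0 \le x_i \le y \le z \ \text{ all } i, \qquad y \text{ binary}$$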
In linear relaxation (LR), '$y$' can be omitted. We thus obtain conv($P_1 \cup P_2$) in LR. But $P_2$ = conv($F$) where $F = \{(z, x_1, \ldots, x_n) \mid z \ge 1, \text{ all } x_i \text{ binary}\}$. The set we actually wish to represent is $S' = P_1 \cup F$. We obtain conv($P_1 \cup F$) in the LR, i.e. the convex span of the epigraph of the "or" connective. To simultaneously obtain a representation, make all variables binary.

This last example has illustrated a very important principle, which we will discuss at more length in Lecture 3, when we treat "canonical constructions." The principle is as follows:

Suppose That:
1) A very large (possibly exponential) number of logical alternatives can be viewed as the union of a relatively "small" number of "groupings" of the alternatives
And
2) A relatively "small" sharp "partial representation" is available for each of these "groupings" (via any technique)
Then
Disjunctive techniques can be used to obtain a fairly "small" sharp representation of the entire set. With algebraic simplifications in addition, the representation can be made even more compact.
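The simplified "or" epigraph formulation of 1.3.6, $0 \le x_i \le y \le z$, can be checked by enumeration. The Python sketch below finds the smallest admissible $z$ when all variables are binary and compares it with the logical "or". The function names are hypothetical.

```python
from itertools import product


def min_z(x):
    """Smallest binary z with 0 <= x_i <= y <= z for some binary y."""
    best = None
    for y, z in product((0, 1), repeat=2):
        if all(0 <= xi <= y for xi in x) and y <= z:
            best = z if best is None else min(best, z)
    return best


def lr_min_z(x):
    """In the LR, y can be dropped and the smallest z is max_i x_i (0 for empty x)."""
    return max(x, default=0)
```

For binary $x$ the smallest $z$ equals "$x_1$ or ... or $x_n$", so minimizing a cost on $z$ prices the connective correctly. In the relaxation the bound becomes $\max_i x_i$.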
1.4 Related Work

As is evident from the previous sections, there is a close connection between the theory of cutting-planes and the theory of representations. In fact, from [Bal 1974] and [Jer 1974], these subjects are essentially in a duality relation (see particularly displays (107) and (108) of [Jer 1974]). Moreover, the polyhedral annexation approach of Glover [Glo 1975b] is closely related to the disjunctive methods, in that both approaches generate the same family of cutting planes (see [Jer 1977]), and both are systematic developments building on an observation of Owen [Owe 1973]. I have surveyed cutting-plane theory earlier in [Jer 1977] and [Jer 1978a] and consequently do not discuss it here. In retrospect, a relatively neglected result of cutting-plane theory is Balas' elegant characterization of the convex span of facial constraints [Bal 1974], [Bal 1979], which is related to Blair's result [Bla 1977] and has given rise to several consequences including [Bla Jer 1984], [Jer 1978b].

The boolean methods of discrete programming (e.g. [Ham Rud 1970], [Ham 1974], [Ham 1979]) utilize concepts and ideas from the propositional logic, as do the disjunctive methods which are our focus here. Boolean approaches often proceed via reductions of an integer program to logical inequalities (e.g. [Gran Ham 1974], [Ham Joh Pel 1974], [Ham Ngu 1979]) and are most effective in the difficult context of nonlinear formulations (e.g. [Ham Pel 1972], [Ham Han Sim 1984], [Ham Sim 1987a], [Ham Sim 1987b]). In contrast, disjunctive approaches seek propositional logic structures in the manner in which a large linear constraint set is naturally decomposed into unions, intersections, etc. (as will be clearer after Lectures 3 and 4). These two approaches use boolean logic in different ways.
1.5 Exercises
Problems 1-4 of the "Illustrative Examples" can be worked.
LECTURE 2
FURTHER ILLUSTRATIONS OF DISJUNCTIVE REPRESENTATIONS

Summary: The purpose of this lecture is to further familiarize the reader with disjunctive representations. To this end, it contains applications to two-dimensional fixed charges, to either/or constraints, and other common constraints. One example accounts for the "disaggregation" phenomenon of Graves and Geoffrion [Geo Gra 1974]. A result on a simplification for certain instances of multiple right-hand-side (r.h.s.) constraints is given, along with a practical rule-of-thumb for representing cost or revenue functions that are given by combining components of cost or revenue. Some of the examples are drawn from [Jer 1984b] and some others from an earlier version of [Jer Low 1984]. The result on function representability is from [Jer 1984a] and the simplification lemma is from [Jer 1986a].
2.1 Some further examples

We continue the list of examples from Lecture 1.

2.1.1 Graph of "or" logical connective

In 1.3.6 we treated the epigraph of the "or" connective. Here we treat the graph. Let
We have $S = P_0 \cup F_1 \cup \cdots \cup F_n$. We represent $S$ disjunctively:

DF:

$$\begin{aligned} & m_0 + m_1 + \cdots + m_n = 1, \qquad \text{all } m_i \text{ binary} \\ & z = \sum_{i=1}^{n} z^{(i)} \ \Big(= \sum_{i=1}^{n} m_i\Big) \\ & x_j = \sum_{i=1}^{n} x_j^{(i)} \ (\ge m_j) \end{aligned}$$

In the linear relaxation (LR) this clearly implies:

$$z \le \sum_{i=1}^{n} x_i, \qquad 0 \le x_j \le z \le 1 \ \text{ all } j \tag{*}$$

In fact, this is equivalent in the LR, as we indicate next.

Given: $z$, $x_i$ satisfying (*)
To find: $m_i$, $z^{(i)}$, $x_j^{(i)}$ satisfying DF.

Always $z^{(0)} = x_j^{(0)} = 0$. Rather than give a complete proof, we illustrate the ideas by an example. Suppose e.g. $n = 3$, $z = .9$, $x_1 = .7$, $x_2 = .5$, $x_3 = .2$. Here $x_1 < z < x_1 + x_2$.

$$\begin{array}{l|ccc} & i = 1 & i = 2 & i = 3 \\ \hline z^{(i)} & .7 \ (= x_1) & .2 & 0 \\ x_1^{(i)} & .7 & 0 & 0 \\ x_2^{(i)} & .3 & .2 & 0 \\ x_3^{(i)} & .2 & 0 & 0 \\ m_i & .7 & .2 & 0 \\ \text{note} & & \text{remainder to total to } z & \text{excess, not needed} \end{array}$$

with $m_0 = 1 - m_1 - m_2 - m_3 = .1$.
A complete proof is actually via an algorithm which finds $m_i$, $x_j^{(i)}$ and $z^{(i)}$ from $z$, $x_i$. Since $z \le \sum_i x_i$, an index $t$ can be found with $\sum_{i=1}^{t} x_i < z \le \sum_{i=1}^{t+1} x_i$; moreover, without loss of generality $x_1 \ge x_2 \ge \cdots \ge x_n$. Then one may set $z^{(i)} = x_i$, $x_i^{(i)} = x_i$, $m_i = x_i$ for $1 \le i \le t$. With $v = \sum_{i=1}^{t+1} x_i - z \ge 0$, we set $z^{(t+1)} = x_{t+1}^{(t+1)} = m_{t+1} = x_{t+1} - v$. All other quantities are set to zero, except that the excess is carried by the first component, $x_{t+1}^{(1)} = v$ and $x_j^{(1)} = x_j$ for $j > t + 1$, and $m_0 = 1 - \sum_{i=1}^{n} m_i$. The validity of this algorithm is easily checked.
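The algorithm just described can be sketched in Python with exact rational arithmetic. Several ingredients are assumptions, because the corresponding displays are not legible here: $P_0$ is taken to be the single point $z = 0$, $x = 0$; $F_i$ is taken to be $\{z = 1,\ x_i = 1,\ 0 \le x_j \le 1\}$; and (*) is read as $z \le \sum_i x_i$, $0 \le x_j \le z \le 1$. The function names are hypothetical.

```python
from fractions import Fraction


def decompose(z, x):
    """Split (z, x) satisfying (*) into weights m and components x^(i).

    Returns (m, xc): m[0] is the weight of P_0, m[i] the weight of F_i,
    and xc[i][j-1] is x_j^(i). The value z^(i) equals m[i] for i >= 1.
    """
    n = len(x)
    order = sorted(range(n), key=lambda j: x[j], reverse=True)  # WLOG x_1 >= x_2 >= ...
    m = [Fraction(0)] * (n + 1)
    xc = [[Fraction(0)] * n for _ in range(n + 1)]
    remaining = Fraction(z)
    for j in order:
        w = min(x[j], remaining)  # x_j for the first t indices, then x_{t+1} - v, then 0
        m[j + 1] = w
        xc[j + 1][j] = w          # in F_j the coordinate x_j is 1, so x_j^(j) = m_j
        remaining -= w
    m[0] = 1 - Fraction(z)
    top = order[0]                # the component of the largest x_j absorbs the excess
    for j in range(n):
        xc[top + 1][j] += x[j] - xc[j + 1][j]
    return m, xc


def satisfies_df(z, x, m, xc):
    """Check the relaxed disjunctive constraints under the assumptions above."""
    n = len(x)
    ok = sum(m) == 1 and all(w >= 0 for w in m)
    ok = ok and sum(m[1:]) == z               # z = sum_i z^(i) with z^(i) = m_i
    ok = ok and all(v == 0 for v in xc[0])    # P_0 contributes x = 0
    for j in range(n):
        ok = ok and sum(xc[i][j] for i in range(n + 1)) == x[j]
        ok = ok and xc[j + 1][j] == m[j + 1]
        ok = ok and all(0 <= xc[i][j] <= m[i] for i in range(1, n + 1))
    return ok
```

Applied to the example above ($z = .9$, $x = (.7, .5, .2)$), the sketch returns the same weights and components as the table.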
2.1.2 Graph of "exclusive or" logical connective

In some instances, exactly one of a set of $t$ possibilities is to be true. We now give a sharp formulation for the graph of this "exclusive or" function. One may verify the formulation by noting that it is a simplification of the disjunctive formulation.

The formulation is:

$$z = \sum_{j=1}^{t} x_j, \qquad 0 \le z \le 1, \quad 0 \le x_j \le 1, \qquad \text{and all variables binary}$$
Seperable programming with fixed charges and convex sections (epigraph)
This representation will illustrate the "extreme point formulation". We have
R. JEROSLOW
26
Figure 7: Separable Programming
A disjunctive formulation is:
LECTURE 2
27
The above simplifies to:
and put in the criterion function for
t:
In this manner, we can represent the epigraph of this function. Since the graph is not closed, it is not representable. The following result from [Jer 1984al precisely describes the difference between epigraph and graph representability, for functions on a bounded domain. Theorem: Suppose that F is a function with a bounded domain, and epi(F) is representable. Then:
gph(F) is representable if€ F is continuous on its domain
N.B. For unbounded domains, the condition is very technical and restrictive for gph(F) to be representable.
2.1.4 Interactive fixed charges (epigraph)

In addition to representing a fixed charge function of two variables, this example also illustrates non-rectangular domains. We draw the domain in Figure 8.

This example allows fixed charge functions in which the two activities $x_1$ and $x_2$ interact either to mutual benefit ($f_b < f_1 + f_2$) or to mutual detriment ($f_b > f_1 + f_2$). It allows the additive case $f_b = f_1 + f_2$. The reader may check that a necessary and sufficient condition for representability is:

Figure 8: Interactive Fixed Charges

Here we have epi($F$) = $P_0 \cup P_1 \cup P_2 \cup P_3$ where

The disjunctive formulation simplifies to (exercise):

with this in the objective function in place of $z$:

In the linear relaxation (LR), we obtain the function

Note that there is no effect from $f_b$! However, when $f_b \le f_1 + f_2$, in the LR along the line $x_1/M_1 = x_2/M_2$ of the rectangular domain $0 \le x_1 \le M_1$, $0 \le x_2 \le M_2$, we obtain the function [...]. The latter is inferior since, when [...]

Thus, when the domain is non-rectangular, it can be advantageous to make this fact explicit. Consider the effects of more complex domains even in the additive case when $f_b = f_1 + f_2$, e.g.:

Figure 9: Rectangular Domain

If formulated as two separate fixed charges, in LR we obtain $f_1 x_1/M_1 + f_2 x_2/M_2$. At $x_2 = 0$, $x_1 = A_1$, this gives $f_1 A_1/M_1 < f_1$. But an accurate representation would have LR value $f_1$ at this point.
2.1.5 Clique constraints in node packing

Figure 10: Clique Constraints

In the graph of Figure 10, nodes 1, 2, 3, 4, 5 form a clique $C$. Put:

$$x_i = \begin{cases} 1 & \text{if node } i \text{ is in the packing} \\ 0 & \text{otherwise} \end{cases}$$

If nodes $i$ and $j$ are adjacent, $x_i + x_j \le 1$ (i.e. $\neg P_i \vee \neg P_j$ holds), where $P_i$ means "$x_i = 1$". Using rules of propositional logic, where '$\wedge$' is 'and' and '$\vee$' is 'or':

We formulate the r.h.s. above disjunctively:

This simplifies to:

$$\sum_{i \in C} x_i \le 1 \ \text{ (clique cut)}, \qquad x_j \ge 0 \ \text{all } j, \qquad x_j \text{ binary all } j$$

This describes the convex span for the subgraph, and also provides a method for other subgraphs.
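The gain from the clique cut can be illustrated with a short Python sketch on the five-node clique. It checks that the pairwise constraints and the clique cut agree on binary points, while the fractional point with every $x_i = 1/2$ passes the pairwise constraints but violates the cut. The names below are hypothetical.

```python
from itertools import combinations, product

CLIQUE = (1, 2, 3, 4, 5)


def satisfies_edges(x):
    """Pairwise constraints x_i + x_j <= 1 for all adjacent i, j in the clique."""
    return all(x[i] + x[j] <= 1 for i, j in combinations(CLIQUE, 2))


def satisfies_clique_cut(x):
    """Single clique cut: the sum of x_i over the clique is at most 1."""
    return sum(x[i] for i in CLIQUE) <= 1


half = {i: 0.5 for i in CLIQUE}
```

Here `satisfies_edges(half)` holds although the coordinates of `half` sum to 2.5, so the edge formulation has a weaker linear relaxation than the clique cut.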
2.1.6 Distribution system design

The next illustration is drawn directly from [Geo Gra 1974]. Let $x_{ijkl}$ = amount of commodity $i$ to be made at factory $j$, routed via distribution center (DC) $k$, to arrive at customer $l$.

We seek to formulate:

"Each customer is to be serviced by exactly one DC for all of its commodities, and its demand is to be satisfied"

In order to improve readability, we fix $l$ and delete it from notation. We also use the abbreviation

$$x_{ik} = \sum_j x_{ijk}$$

We obtain this disjunction over $k$, where $d_i$ is the demand of customer $l$ for commodity $i$:

$$\text{for some } k: \qquad x_{ik} = d_i \ \text{all } i, \qquad x_{ip} = 0 \ \text{all } p \ne k, \text{ all } i$$

The disjunctive formulation is:

This easily simplifies to (Geoffrion and Graves):
2.2 A simplification in the disjunctive representation for some multiple rhs instances

As we mentioned in Lecture 1, the difficulty with using disjunctive formulations is their potential to be of large size. From the examples we have worked thus far, we see that significant simplifications and reductions in size occur in specific, practical applications. However, there is a need for broad principles stating when such simplifications arise, rather than handling simplifications entirely on an ad hoc basis. In this section, we state a specialized result for a simplification which occurs in some instances of multiple right-hand-side (r.h.s.) constraints, and then we illustrate this principle. The result below focuses on a reduction in matrix size; in Lecture 4, the focus is on a reduction in logical variables. All reduction results thus far require fairly technical hypotheses.

Let $P_h = \{x \mid Ax \ge b^{(h)},\ x = (x_1, \ldots, x_n)\}$ for $h = 1, \ldots, t$, with $A = [a_i]$ (rows), $b^{(h)} = (b_i^{(h)})$, $D$ a square submatrix of $A$, and $b_D^{(h)}$ the corresponding subvector of $b^{(h)}$.

Theorem (Jer [1986a])¹: Suppose that $A$ is of full rank $n$, and that:

(#) For every square nonsingular submatrix $D$ of $A$, every $i$ and every $h$, if $a_i D^{-1} b_D^{(h)} < b_i^{(h)}$ then there is $j$ with $a_j D^{-1} b_D^{(k)} < b_j^{(k)}$ for all $k = 1, \ldots, t$.

Then a sharp representation of $\bigcup_h P_h$ is given by

$$Ax \ge \sum_h m_h b^{(h)}, \qquad 1 = \sum_h m_h, \qquad \text{all } m_h \text{ binary}$$

n.b. The unique solution to $Dx = b_D^{(h)}$ is $D^{-1} b_D^{(h)}$.

¹ Added in proof: C. E. Blair has extended this result in a recent technical report.
We give a simple example to show the need for an hypothesis such as (#), even in the specific context of multiple r.h.s. Put

Figure 11: Need for the Hypotheses

Here we have

but the simplified formulation

has among its solutions

This solution corresponds to a point outside conv($P_1 \cup P_2$). For the invertible submatrix $D$ = [...] with $b_D^{(1)}$ = [...] and $b_D^{(2)}$ = [...], we have that one of the points $D^{-1} b_D^{(h)}$ is (0,0), which fails constraint [...], while the other is (1,1), which satisfies constraint [...]. Thus (#) fails.

Of course, the constraints of a general MIP provide an example of multiple r.h.s. constraints (as the combinations of binary variables enumerate the logical possibilities). Thus the specific context of r.h.s. is, in actuality, fairly general. We next give three applications of the simplification.
2.2.1 Unions of nonempty rectangles

We illustrate the general principle by a specific instance. Suppose:

The choice of $D$ in

$$A = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}$$

corresponds to a corner of the hyperrectangle - so it is always feasible. Thus (#) holds and the simplified representation below is sharp:

[...] $\le x_1 \le$ [...], [...] $\le x_2 \le$ [...], $m_1 + m_2 + m_3 = 1$, all $m_i$ binary

2.2.2 Translation of polyhedra

Let $P_h = \{x \mid A(x - d_h) \ge b\}$, $h = 1, \ldots, t$. Here a choice in $A$ gives a feasible vertex in one $P_h$ iff it does so in all $P_h$. Thus (#) holds and this simplified representation is sharp:

2.2.3 "Primitive" either/or constraints
Let $u^{(h)}$ be a vector with distinct components (in fact, no two components the same as $h$ varies). Suppose we are given that

$$u^{(h)} \ge g^{(h)} \ \text{all } h, \qquad u^{(h)} \ge f^{(h)} \ \text{for some } h$$

To obtain a sharp representation we define

$A$ = identity matrix for $u$

In this notation, the given information becomes: for some $h$, $Au \ge b^{(h)}$. One easily checks that (#) holds. Thus a sharp representation is

$$u \ge \sum_h m_h b^{(h)}, \qquad 1 = \sum_h m_h, \qquad \text{all } m_h \text{ binary}$$

which simplifies to

In the above, if we substitute $A^{(h)} x$ for $u^{(h)}$, we obtain a common 'textbook' treatment for either/or constraints. This will be sharp if all forms used are linearly independent, but it is not sharp in general, as our next example shows. Suppose we are given that:

$$0 \le x_1, x_2 \le 100, \qquad \text{either } x_1 - x_2 \ge 100 \ \text{ or } \ x_2 - x_1 \ge 100$$

Note that a lower bound is -100 for both $x_1 - x_2$ and $x_2 - x_1$, and that the only feasible points are (100,0) and (0,100). The textbook modelling is identical with our simplification here, and it is:

$$\begin{aligned} 0 \le x_1, x_2 &\le 100 \\ x_1 - x_2 &\ge -100 + 200 m_1 \\ -x_1 + x_2 &\ge -100 + 200 m_2 \\ m_1 + m_2 &= 1, \qquad m_1, m_2 \text{ binary} \end{aligned}$$

These points satisfy the LR:

(100,0): use $m_1 = 1$, $m_2 = 0$
(0,100): use $m_1 = 0$, $m_2 = 1$
(0,0): use $m_1 = m_2 = 1/2$
(100,100): use $m_1 = m_2 = 1/2$

Hence the entire cube is in LR. But a sharp representation has LR $x_1 + x_2 = 100$, $x_1 \ge 0$, $x_2 \ge 0$. In this context, it is significant that the textbook formulation has lost all information in the LR except nonnegativities and upper bounds. Our experience is that it is typically a very poor formulation (see [Jer Low 1985]).
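The claim about the textbook formulation can be verified directly. The Python sketch below tests membership in its linear relaxation, assuming the constraints $x_1 - x_2 \ge -100 + 200 m_1$, $x_2 - x_1 \ge -100 + 200 m_2$, $m_1 + m_2 = 1$ as read from the example. The explicit multiplier formula in `witness` is our own illustrative device, and all function names are hypothetical.

```python
from fractions import Fraction


def in_textbook_lr(x1, x2, m1, m2):
    """Membership of (x1, x2, m1, m2) in the LR of the textbook formulation."""
    return (0 <= x1 <= 100 and 0 <= x2 <= 100
            and x1 - x2 >= -100 + 200 * m1
            and x2 - x1 >= -100 + 200 * m2
            and m1 + m2 == 1 and 0 <= m1 <= 1 and 0 <= m2 <= 1)


def witness(x1, x2):
    """Multipliers (m1, m2) placing any point of the square in the LR."""
    m1 = (Fraction(x1) - Fraction(x2) + 100) / 200
    return m1, 1 - m1


def in_sharp_lr(x1, x2):
    """LR of a sharp representation: the segment x1 + x2 = 100, x1, x2 >= 0."""
    return x1 + x2 == 100 and x1 >= 0 and x2 >= 0
```

Every point of the square $0 \le x_1, x_2 \le 100$ has such a witness, whereas the sharp relaxation keeps only the segment joining (100,0) and (0,100).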
2.3 'Separate' vs. 'joint' formulations

In this section, we indicate some of the crucial issues which occur in 'linkage', which are caused by the intersection operation. In this context, we have obtained separate formulations for conditions "$x \in S_1$" and "$x \in S_2$" for two sets $S_1$, $S_2$. By simply juxtaposing (putting side-by-side) these, we obtain a formulation for "$x \in S_1 \cap S_2$", but typically that procedure loses sharpness. This occurs because, for sets $S_1$, $S_2$ in $R^n$

$$\mathrm{conv}(S_1 \cap S_2) \subseteq \mathrm{conv}(S_1) \cap \mathrm{conv}(S_2) \tag{*}$$

Moreover, the inclusion above is generally strict, as indicated in Figure 12.

Figure 12: Intersection Loses Sharpness

Here $S_1 \cap S_2 = \emptyset$ but conv($S_1$) $\cap$ conv($S_2$) is a large rectangle. Therefore the LR is generally better from an $S_1 \cap S_2$ 'joint' formulation, as opposed to 'separate' formulations of $S_1$ and $S_2$ simply juxtaposed. Practical trade-offs consider the fact that a joint formulation is 'typically' larger than two separate formulations juxtaposed. But there are important exceptions.

We illustrate this linkage issue in the context of functions of a single variable which arise by combining components either additively, or via maximum or minimum. For example, the cost of a product may involve materials cost, labor, machining costs, advertising costs, distribution charges, etc., which are to be added.
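The strictness of the inclusion (*) already shows up for finite sets on the real line. The Python sketch below uses two small sets of our own choosing (they are not the sets of Figure 12) and compares the hull of the intersection with the intersection of the hulls. The helper names are hypothetical.

```python
def hull(points):
    """Convex hull of a finite subset of the real line, as (lo, hi), or None if empty."""
    pts = list(points)
    return (min(pts), max(pts)) if pts else None


def intersect(a, b):
    """Intersection of two closed intervals given as (lo, hi), or None if empty."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


S1 = {0, 2}
S2 = {1, 3}
joint = hull(S1 & S2)                     # conv(S1 ∩ S2): empty
separate = intersect(hull(S1), hull(S2))  # conv(S1) ∩ conv(S2): the interval [1, 2]
```

The joint hull is empty while the juxtaposed relaxation still contains the whole interval from 1 to 2, which is the one-dimensional analogue of the rectangle in the figure.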
Figure 13: H is F + G

If $H(x) = F(x) + G(x)$, and we have representations for epi($F$) and epi($G$), we can obtain one for epi($H$) by juxtaposition:

$$\{(z, x) \mid z \ge H(x)\} = \{(z, x) \mid \text{for some } z_1, z_2 \text{ we have } z = z_1 + z_2 \text{ and } (z_1, x) \in \mathrm{epi}(F),\ (z_2, x) \in \mathrm{epi}(G)\}$$
By (*) above, we know that the LR of such a juxtaposition may not be sharp, i.e. may not be conv(epi(H)). We show below a concrete case where it in fact is not sharp. We also indicate how results on convex envelopes of functions can be used to obtain the same conclusion, as such results are a special case of (*).
First suppose that we have a linear function H which is F + G. Here the 'joint' formulation is trivial and it equals the function. For $x_1 = 1$, the LR of the joint formulation gives function value 1, while the LR of the separate formulation gives a smaller value. The LR of the separate formulation can never be better because
$\overline{(f+g)}(x_1) \ge \bar{f}(x_1) + \bar{g}(x_1)$
where $\bar{f}$ = convex envelope of f, etc.
Next, suppose that H arises by taking the maximum of F and G. The 'joint' formulation is $z = 1$, $0 \le x_1 \le 2$ and the LR equals the function. For $x_1 = 1$, the LR of juxtaposed separate formulations gives a value smaller than 1. The LR of the separate formulations never can be better because
$\overline{\max\{f, g\}}(x_1) \ge \max\{\bar{f}(x_1), \bar{g}(x_1)\}$
We leave it to the reader to explore the case of the minimum of two functions. The epigraph is obtainable as the union of epigraphs, hence separate formulations can be combined sharply by disjunctive formulations. But what if these are combined using the minimum function inside the MIP? What happens? Are there dominances?
Figure 14: H is max{F, G}
Here is one way to formulate the graph of the minimum function for values in range [0,2] (via disjunctive methods):
These examples have motivated the following very general "practitioner's rule" for representing multi-component functions: When functions in a MIP derive from several components (of cost, revenue, profit, etc.) which are combined by addition, maximum, minimum, etc., it is generally better to numerically combine the components and represent the function in the MIP, than to represent the separate components in the MIP and have the MIP combine them into the function.
This principle does not rule out separate storage of the data in component form. However, at run time, in order to mesh with branch-and-bound codes that use the LR, the overall function needs to be numerically assembled for proper MIP representation.
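The rule can be checked numerically on a small case. The following is a minimal sketch under stated assumptions: the components F and G are hypothetical piecewise-linear functions on [0, 2] with breakpoints 0, 1, 2 (they are not the functions of Figures 13 and 14), and the LR of a sharp epigraph formulation is taken to be the convex envelope, computed here as a lower convex hull of the breakpoints.

```python
def lower_hull(points):
    """Lower convex hull of points sorted by x (monotone chain)."""
    hull = []
    for p in points:
        # Pop the last hull point while it lies on or above the chord hull[-2] -> p.
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def envelope(xs, ys, x):
    """Value at x of the convex envelope of the piecewise-linear function through (xs, ys)."""
    hull = lower_hull(list(zip(xs, ys)))
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside the domain")


xs = [0, 1, 2]
F = [0, 1, 0]          # hypothetical component, not convex
G = [1, 0, 1]          # hypothetical component, convex
H_sum = [f + g for f, g in zip(F, G)]      # numerically combined: constant 1
H_max = [max(f, g) for f, g in zip(F, G)]  # numerically combined: constant 1

x = 1
joint_sum = envelope(xs, H_sum, x)
separate_sum = envelope(xs, F, x) + envelope(xs, G, x)
joint_max = envelope(xs, H_max, x)
separate_max = max(envelope(xs, F, x), envelope(xs, G, x))
print(joint_sum, separate_sum, joint_max, separate_max)
```

In this hypothetical instance both combined functions are identically 1, so the joint formulation is exact, whereas combining the envelopes of the components inside the model gives a weaker bound at $x = 1$.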
2.4 Exercises
Problems 5-11 can be worked.
LECTURE 3
CONSTRUCTIONS WHICH PARALLEL SET OPERATIONS AND A NEW CONCEPT OF STRUCTURE FOR MIP
Summary: We define the structure of a MIP constraint set to be the manner in which the constraints arise from juxtaposition (intersection) of constraints, taking alternatives of constraints (union), addition or projection of sets thus formed, etc., in an iterative manner. The individual constraint sets combined in this manner are viewed as modules, and our focus changes to the second-order representability, i.e. the linkage (combination) of modules. The traditional concept of structure, by contrast, focuses on the nature of the modules (e.g. network flow constraints, capital budgeting constraints, etc.).
In order to systematically discuss linkage, we are forced to make the distinction between semantics, i.e. the actual sets discussed and the actual set operations performed on them, and syntax, i.e. the constraint systems used to represent the sets, and the manipulations of these systems needed to represent the set operations. Actually, representations as discussed here are equivalence classes of constraint sets, where we permit interchange of rows and the order of constraints, etc. Due to the need for such distinctions, our discussion of second-order representability is more abstract than the preceding lectures.
3.1 Definitions and basic constructions
By an operation (denoted Op(.)) we shall mean a mapping of a specific type, i.e. one which takes vectors of sets in Euclidean space into other such sets. E.g. union, sum, intersection, projections, Cartesian product, etc. are operations. By a construction $\underline{Op}(\cdot)$ we shall mean a different type of mapping: one which takes vectors of representations to representations. To provide some examples, let us fix a representation of a Euclidean set S:
$x \in S \leftrightarrow$ there exists y with $y_j \in \{0,1\}$ for $j \in K$ and $Ax + By \ge h$
$\underline{S} = (A, B, h, K)$ is a representation of S. In writing $\underline{S}$ in this way, we allow for permuting of rows and columns, constraint sets, etc. Note how we systematically use underlining to distinguish semantics from syntax.
An example of a mapping is Rel(.), from representations to sets:
$\mathrm{Rel}(\underline{S}) = \{x \mid$ for some y, $0 \le y_j \le 1$ for $j \in K$, $Ax + By \ge h\}$
An example of a construction is RL(.): $RL(\underline{S})$ = representation of $\mathrm{Rel}(\underline{S})$, i.e. the system
$0 \le y_j \le 1$, $j \in K$
$Ax + By \ge h$
with no binary variables. Another mapping is spatial evaluation Ev(.), which takes representations to sets:
$\mathrm{Ev}(\underline{S}) = \{x \mid$ for some y with $y_j \in \{0,1\}$ for $j \in K$, $Ax + By \ge h\}$ (= S)
In what follows, we shall need some definitions.
Def: The construction $\underline{Op}(\cdot)$ parallels the set operation Op(.) on $(\underline{S}_1, \dots, \underline{S}_t)$ if:
$\mathrm{Ev}(\underline{Op}(\underline{S}_1, \dots, \underline{S}_t)) = Op(\mathrm{Ev}(\underline{S}_1), \dots, \mathrm{Ev}(\underline{S}_t))$
Example: Let the construction $\underline{S}_1 \wedge \dots \wedge \underline{S}_t$ consist of juxtaposing the representations $\underline{S}_1, \dots, \underline{S}_t$ while making auxiliary variables of different $\underline{S}_i$ disjoint. Then $\wedge$ parallels $\cap$ (intersection):
$\mathrm{Ev}(\underline{S}_1 \wedge \dots \wedge \underline{S}_t) = S_1 \cap \dots \cap S_t$
This holds for all representations, so we say "there is no domain restriction" for $\wedge$.
We note that spatial relaxation Rel(.) is related to syntactic relaxation RL(.) by:
$\mathrm{Ev}(RL(\underline{S})) = \mathrm{Rel}(\underline{S})$
We also note:
$\mathrm{Rel}(\underline{S}_1 \wedge \dots \wedge \underline{S}_t) = \mathrm{Rel}(\underline{S}_1) \cap \dots \cap \mathrm{Rel}(\underline{S}_t) \supseteq \mathrm{conv}(S_1) \cap \dots \cap \mathrm{conv}(S_t) \supseteq \mathrm{conv}(S_1 \cap \dots \cap S_t)$
In the above, the first inclusion $\supseteq$ is equality = for sharp representations. The second inclusion $\supseteq$ is typically strict.
3.2 The union construction
We need some preliminary results, and we utilize this definition of the "starred recession cone" of a representation $\underline{S} = (A, B, h, K)$ of a set S:
$\mathrm{rec}^*(\underline{S}) = \{x \mid$ for some y with $y_j = 0$, $j \in K$, $Ax + By \ge 0\}$
For polyhedral P, if $P \ne \emptyset$, $\mathrm{rec}^*(\underline{P}) = \mathrm{rec}(P)$.
Proposition ([Jer 1984c, 1986b]): For representable $S \ne \emptyset$ with representation $\underline{S} = (A, B, h, K)$:
$\mathrm{rec}^*(\underline{S}) = \{x \mid$ for all $x' \in S$ and $m \ge 0$, $x' + mx \in S\}$
$= \{x \mid$ for some $x' \in S$ and all $m \ge 0$, $x' + mx \in S\}$
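Suppose we are given representations $\underline{S}_i = (A^{(i)}, B^{(i)}, h^{(i)}, K^{(i)})$ for sets $S_i$:
Theorem: [Jer 1984c, 1986b] $\underline{S}_1 \vee \dots \vee \underline{S}_t$ represents the smallest representable set containing $S_1 \cup \dots \cup S_t$. If $\mathrm{rec}^*(\underline{S}_i)$ is independent of i = 1,...,t, then it represents $S_1 \cup \dots \cup S_t$. Furthermore:
$\mathrm{Rel}(\underline{S}_1 \vee \dots \vee \underline{S}_t) = \mathrm{conv}(\mathrm{Rel}(\underline{S}_1) \cup \dots \cup \mathrm{Rel}(\underline{S}_t))$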
Note: If $\mathrm{rec}^*(\underline{S}_i)$ is independent of i, $\mathrm{Ev}(\underline{S}_1 \vee \dots \vee \underline{S}_t) = S_1 \cup \dots \cup S_t$, so $\vee$ parallels $\cup$ on this restricted domain of $(\underline{S}_1, \dots, \underline{S}_t)$.
In our notation, we use the logical 'or' symbol '$\vee$' for the construction which parallels the union operation, just as we use the logical 'and' symbol '$\wedge$' for the construction which parallels the intersection operation. For other operations, we may use the same symbol for the paralleling construction, or sometimes we underline the symbol.
The domain of the union construction '$\vee$' is necessarily restricted. Even for union of polyhedra, without the recession condition the union is typically not representable, so there cannot be a parallel construction for the union in general.
The union construction very closely follows the disjunctive construction for union of polyhedra; it simply adds the constraints $y_j^{(i)} \le m_i$ for all $j \in K^{(i)}$, which are necessary. Note that if the $y_j^{(i)}$ occur in a set partitioning (SOS1) constraint $\sum_{j \in K^{(i)}} y_j^{(i)} = 1$ (which occurs e.g. when a disjunctive representation is used for $S_i$) that constraint is homogenized to become $\sum_{j \in K^{(i)}} y_j^{(i)} = m_i$. Therefore explicit constraints $y_j^{(i)} \le m_i$ would not be needed in such a context.
The formula above for $\mathrm{Rel}(\underline{S}_1 \vee \dots \vee \underline{S}_t)$ corresponds to what we earlier termed the sharpness of disjunctive constructions. Indeed, given the relaxations $\mathrm{Rel}(\underline{S}_i)$ for the individual $S_i$, and using these representations $\underline{S}_i$ for $S_i$ to represent $S = S_1 \cup \dots \cup S_t$, one cannot expect to have a better relaxation than $\mathrm{conv}(\mathrm{Rel}(\underline{S}_1) \cup \dots \cup \mathrm{Rel}(\underline{S}_t))$.
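As a small illustration of the sharpness of the disjunctive construction, here is a minimal sketch for the union of two bounded intervals. It is an assumption-laden special case, not the general union construction: the pieces are polyhedra with no binary variables of their own, so no constraints of the form $y_j^{(i)} \le m_i$ arise, and the function name and the grid over $m_1$ are ours.

```python
def relaxation_projection(p1, p2, steps=100):
    """Project the LR of the disjunctive formulation of p1 ∪ p2 onto x.

    p1 = (a1, b1) and p2 = (a2, b2) are closed intervals. The formulation is
        x = x1 + x2,  a1*m1 <= x1 <= b1*m1,  a2*m2 <= x2 <= b2*m2,
        m1 + m2 = 1,  m1, m2 binary (relaxed here to 0 <= m1, m2 <= 1).
    """
    (a1, b1), (a2, b2) = p1, p2
    lo, hi = float("inf"), float("-inf")
    for k in range(steps + 1):
        m1 = k / steps
        m2 = 1 - m1
        lo = min(lo, a1 * m1 + a2 * m2)   # smallest feasible x for this (m1, m2)
        hi = max(hi, b1 * m1 + b2 * m2)   # largest feasible x for this (m1, m2)
    # The feasible x-intervals vary continuously with m1, so their union is [lo, hi].
    return lo, hi


S1, S2 = (0, 1), (3, 4)            # hypothetical pieces of the union
lr = relaxation_projection(S1, S2)
print(lr)
```

With $m_1$ binary the formulation returns exactly $S_1$ or $S_2$; with $m_1$ relaxed, the projection is the interval from 0 to 4, which is the convex hull of the union and hence the best relaxation one can expect.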
3.3 Some other constructions
We use the notation of 3.1 above and give constructions for some other commonly occurring set operations. Our list of operations is not meant to be exhaustive in any way, but simply useful in common contexts.
$\underline{S} = \underline{S}_1 \times \dots \times \underline{S}_t$ represents a set S:
Clearly $S = S_1 \times \dots \times S_t$, so $\times$ parallels $\times$ with no domain restriction. Note that the variables $x^{(i)}$ are made disjoint here. Also note:
$\mathrm{Rel}(\underline{S}) = \mathrm{Rel}(\underline{S}_1) \times \dots \times \mathrm{Rel}(\underline{S}_t)$
We compare the formula above with a well-known formula for the convex span of the Cartesian product of sets (see e.g. [Roc 1970]):
$\mathrm{conv}(S_1 \times \dots \times S_t) = \mathrm{conv}(S_1) \times \dots \times \mathrm{conv}(S_t)$
The Cartesian product operation occurs in the settings of decentralization of operations (as in decomposition [Dan 1963]) and in the setting of hierarchical delegation of efforts.
Another important set operation arises from linear affine transformations $T(x) = L(x) + v$, where L is a linear transformation. The most common instances of such linear affine maps are sum and projection. The following is easily verified.
Proposition: $T(\mathrm{conv}(S)) = \mathrm{conv}(T(S))$
Suppose we have a representation $\underline{S} = (A, B, h, K)$ of a set S. This is then a representation $\underline{T}(\underline{S})$ of T(S):
$x \in T(S) \leftrightarrow$ there exist w, y with $y_j \in \{0,1\}$ all $j \in K$, $Aw + By \ge h$, $x = L(w) + v$
so $\underline{T}(\underline{S})$ parallels T(S). Also note
$\mathrm{Rel}(\underline{T}(\underline{S})) = T(\mathrm{Rel}(\underline{S}))$
We note that, of all the basic constructions (i.e. union, intersection, Cartesian product, linear affine transformation), only the union construction introduces new control variables.
3.4 Some technical properties of the basic constructions
All the basic (or: "canonical") constructions $\underline{Op}(\cdot)$ are (relaxation) commutative on their domain, i.e.
$\mathrm{Rel}(\underline{Op}(\underline{S}_1, \dots, \underline{S}_t)) = \mathrm{conv}(Op(\mathrm{Rel}(\underline{S}_1), \dots, \mathrm{Rel}(\underline{S}_t)))$
This can be verified by checking the formulas for Rel for each of the basic constructions. Commutativity is not an easy concept to motivate, but it is an essential technical concept for later work.
Def: $\underline{Op}(\cdot)$ is elementary on its domain if
All basic constructions are elementary, as one easily checks on a case-by-case basis.
Proposition: If $\underline{Op}(\cdot)$ is elementary then it is sub-commutative, i.e.
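Def: $\underline{Op}(\cdot)$ is sharp if, whenever $\underline{S}_1, \dots, \underline{S}_t$ are sharp representations of $S_1, \dots, S_t$ which lie in the domain of $\underline{Op}(\cdot)$, then $\underline{Op}(\underline{S}_1, \dots, \underline{S}_t)$ is a sharp representation of $Op(S_1, \dots, S_t)$, i.e.:
$\mathrm{Rel}(\underline{Op}(\underline{S}_1, \dots, \underline{S}_t)) = \mathrm{conv}(Op(S_1, \dots, S_t))$
We note the intersection construction is not sharp since, even for sharp $\underline{S}_i$,
$\mathrm{Rel}(\underline{S}_1 \wedge \dots \wedge \underline{S}_t) = \mathrm{conv}(S_1) \cap \dots \cap \mathrm{conv}(S_t) \supseteq \mathrm{conv}(S_1 \cap \dots \cap S_t)$
and typically the inclusion $\supseteq$ is strict. Hence "commutative" does not imply sharp. The other basic constructions are sharp, and in this way we have a precise sense in which intersection is the issue of MIP!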
3.5 Composite constructions and 'structure' in MIP
The composite constructions arise from composing (iterating) the basic constructions. These are needed to describe the structure of MIP constraint sets, and the degree of composition (i.e. maximum nesting of iteration) is often in the two to six range for practical MIPs. The degree depends on how the constraint sets are viewed, more specifically, which subsets of the constraints are viewed as the "modules" having representations that are not further decomposed into basic operations. To illustrate this concept of structure, we consider a hypothetical scenario.
Your firm has two divisions, which are completely separate except for some company-wide budget restrictions and some flow of goods to division dist (distributor) from division fact (a factory). These company-wide constraints form a set S; $x \in S$ is required of a feasible operating plan. These constraints may include e.g. overall budget and global resource constraints.
Here $x = (x^{(1)}, x^{(2)})$ and, except for the company-wide constraints, the variables $x^{(1)}$ are those of dist. Dist distributes goods $x^{(1)}$ to a number of customers. It obtains these from three basic sources, but the first source has a choice of three possible subsources. The fact that $x^{(1)}$ can be distributed through a network to meet all customer demand is represented as "$x^{(1)} \in S_1$". Thus the constraints on dist, other than $x \in S$, can be represented:
$\underline{S}_1 \wedge ((\underline{S}_2 \vee \underline{S}_3 \vee \underline{S}_4) + \underline{S}_5 + \underline{S}_6)$
where $S_2 \cup S_3 \cup S_4$ is the first source, $S_5$ the second source, and $S_6$ the third source.
Fact has variables $x^{(2)}$, representing two products which it manufactures. The first product can be manufactured in two ways, and the other in three ways. Thus the constraints on fact can be represented:
$(\underline{S}_7 \vee \underline{S}_8) \times (\underline{S}_9 \vee \underline{S}_{10} \vee \underline{S}_{11})$
The firm's situation can therefore be represented as the MIP:
$\underline{S} \wedge ((\underline{S}_1 \wedge ((\underline{S}_2 \vee \underline{S}_3 \vee \underline{S}_4) + \underline{S}_5 + \underline{S}_6)) \times (\underline{S}_7 \vee \underline{S}_8) \times (\underline{S}_9 \vee \underline{S}_{10} \vee \underline{S}_{11}))$
The degree of this composite construction is four. In addition, if our firm is actually part of a larger corporation, the degree of the composite construction would be larger. This scenario provides a typical example of how composite constructions can be used as a means of structuring many large-scale MIPs, through a focus on the manner in which parts of the model are linked together. Even a general MIP can, if one wishes, be viewed from the perspective of composite constructions. For example
can be viewed as
where S1 = ((vl,sr~)lsr~ I 10, YO I 8) and Sa = ((sr1,yl)l for some ~ 3 ~ 2 2 04 we have 11 = 23, yl = 523 - 721. However, this may or may not be an insightful reformulation. In general, the most appropriate composite construction corresponds to the intuitive perceptions of the model's users as to how the subparts are structured.
Composite constructions provide, first of all, a language for precisely describing the structure of a MIP problem. In addition, a study of their mathematical properties assists in the solution of the problem. Via a study of the relaxations of composite constructions ("Relaxation Calculus") we can obtain means of improving the linear relaxation (LR). For example,
$\underline{S} \wedge ((\underline{S}_1 \wedge ((\underline{S}_2 \vee \underline{S}_3 \vee \underline{S}_4) + \underline{S}_5 + \underline{S}_6)) \times (\underline{S}_7 \vee \underline{S}_8) \times (\underline{S}_9 \vee \underline{S}_{10} \vee \underline{S}_{11}))$   (1)
is inferior in LR generally to
$\underline{S} \wedge (((\underline{S}_1 \wedge (\underline{S}_2 + \underline{S}_5 + \underline{S}_6)) \vee (\underline{S}_1 \wedge (\underline{S}_3 + \underline{S}_5 + \underline{S}_6)) \vee (\underline{S}_1 \wedge (\underline{S}_4 + \underline{S}_5 + \underline{S}_6))) \times (\underline{S}_7 \vee \underline{S}_8) \times (\underline{S}_9 \vee \underline{S}_{10} \vee \underline{S}_{11}))$   (2)
even though it is equivalent to (1) (i.e. describes the same set). In turn (2) is inferior in LR to
$\bigvee_{i=2}^{4} (\underline{S} \wedge ((\underline{S}_1 \wedge (\underline{S}_i + \underline{S}_5 + \underline{S}_6)) \times (\underline{S}_7 \vee \underline{S}_8) \times (\underline{S}_9 \vee \underline{S}_{10} \vee \underline{S}_{11})))$   (3)
although equivalent. However, even the best formulation (3) of these three is not sharp in general. We will prove the relative LR dominances among (1), (2) and (3) when we discuss the distributive laws in Lecture 4.
Here we content ourselves with a summary of the four main goals of our concept of structure via composite constructions:
1. To reflect the structural features of MIPs, in terms of the way the parts are linked together, and to provide a language for discussing representations.
2. To develop guidelines for choosing among alternative representations of the same MIP constraints.
3. To modularize the work of representability. E.g., any desirable representation (not necessarily disjunctive) can be inserted for use for any given part of the MIP.
4. To relate MIP to logic-based decision support, firstly to propositional logic but also to fragments of predicate logic.
The logic-based approaches are one of the main directions in "artificial intelligence". There are several other tie-ins between MIP and AI as well. We will discuss these issues in Lectures 5, 6, and 10.
3.6 Two central technical results
In this section, we lay some groundwork for Lecture 4, by citing some technical results on the concepts of commutativity and sharpness. These are hard results to motivate by themselves; their importance is in their later applications. We also indicate the key ideas in the proofs of these technical results. Full details for most of these proofs are in [Jer 1984c]; however that reference does not do linear affine transformations in generality, although it does handle addition and projection. For further details on linear affine transformations, see [Jer 1986b]. The two main technical theorems are:
Theorem on Commutativity: If the composite construction $\underline{Op}(\cdot)$ does not contain occurrences of both union and intersection, then it is relaxation commutative.
Theorem on Sharpness: If the composite construction Op(.) does not contain occurrences of intersection, then it is sharp.
Here are the key lemmata used in the proofs:
Lemma 1: Each basic construction (union, intersection, linear affine transformation, Cartesian product) is relaxation commutative.
Lemma 2: If Op(.) has no occurrences of union and $K = (K_1, \dots, K_t)$ is a vector of convex sets, then Op(K) is convex.
Lemma 3: For T linear affine:
$T(\mathrm{conv}(S)) = \mathrm{conv}(T(S))$
$\mathrm{Rel}(\underline{T}(\underline{S})) = T(\mathrm{Rel}(\underline{S}))$
(In the first equation, T is an operation, while it is a construction in the second equation).
Lemma 4:
$\mathrm{Rel}(\underline{S}_1 \times \dots \times \underline{S}_t) = \mathrm{Rel}(\underline{S}_1) \times \dots \times \mathrm{Rel}(\underline{S}_t)$
$\mathrm{conv}(S_1 \times \dots \times S_t) = \mathrm{conv}(S_1) \times \dots \times \mathrm{conv}(S_t)$
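Idea of proof for commutativity: By induction on the formation of $\underline{Op}$. Let
$\underline{Op}(\underline{S}) = \underline{Op}'(\underline{Op}_1(\underline{S}), \dots, \underline{Op}_t(\underline{S}))$
where $\underline{S} = (\underline{S}_1, \dots, \underline{S}_t)$, $\underline{Op}'$ is a basic construction, and the $\underline{Op}_i$ are composite constructions or the identity.
Case: $\underline{Op}'$ is intersection. We have:
$\mathrm{Rel}(\underline{Op}(\underline{S})) = \mathrm{Rel}(\bigwedge_i \underline{Op}_i(\underline{S}))$
$= \mathrm{conv}(\bigcap_i \mathrm{Rel}(\underline{Op}_i(\underline{S})))$ (Lemma 1)
$= \bigcap_i \mathrm{Rel}(\underline{Op}_i(\underline{S}))$ (Rel is convex)
$= \bigcap_i \mathrm{conv}(Op_i(\mathrm{Rel}(\underline{S})))$ (induct.)
$= \bigcap_i Op_i(\mathrm{Rel}(\underline{S}))$ (Lemma 2)
$= Op(\mathrm{Rel}(\underline{S}))$
$= \mathrm{conv}(Op(\mathrm{Rel}(\underline{S})))$ (Lemma 2)
Case: $\underline{Op}'$ is T(.), linear affine
$\mathrm{Rel}(\underline{Op}(\underline{S})) = \mathrm{Rel}(\underline{T}(\underline{Op}_1(\underline{S})))$
$= T(\mathrm{Rel}(\underline{Op}_1(\underline{S})))$ (Lemma 3)
$= T(\mathrm{conv}(Op_1(\mathrm{Rel}(\underline{S}))))$ (induct.)
$= \mathrm{conv}(T(Op_1(\mathrm{Rel}(\underline{S}))))$ (Lemma 3)
$= \mathrm{conv}(Op(\mathrm{Rel}(\underline{S})))$
Etc. for other cases.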
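Idea of proof for sharpness: Also by induction:
Case: $\underline{Op}'$ is union
$\mathrm{Rel}(\underline{Op}(\underline{S})) = \mathrm{Rel}(\bigvee_i \underline{Op}_i(\underline{S}))$
$= \mathrm{conv}(\bigcup_i \mathrm{Rel}(\underline{Op}_i(\underline{S})))$ (Com.)
$= \mathrm{conv}(\bigcup_i \mathrm{conv}(Op_i(S)))$ (Ind. Hyp.)
$= \mathrm{conv}(\bigcup_i Op_i(S))$
$= \mathrm{conv}(Op(S))$
Case: $\underline{Op}'$ is linear affine T.
$\mathrm{Rel}(\underline{Op}(\underline{S})) = T(\mathrm{Rel}(\underline{Op}_1(\underline{S})))$ (Lemma 3)
$= T(\mathrm{conv}(Op_1(S)))$ (Ind. Hyp.)
$= \mathrm{conv}(T(Op_1(S)))$
$= \mathrm{conv}(Op(S))$
Etc. for Cartesian product.
We illustrate the use of these technical theorems. Suppose: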
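Since $(\underline{S}_1 \times \underline{S}_2 + \underline{S}_3) \vee \underline{S}_4$ does not contain intersection, by the Theorem on Commutativity we have:
$\mathrm{Rel}((\underline{S}_1 \times \underline{S}_2 + \underline{S}_3) \vee \underline{S}_4) = \mathrm{conv}((\mathrm{Rel}(\underline{S}_1) \times \mathrm{Rel}(\underline{S}_2) + \mathrm{Rel}(\underline{S}_3)) \cup \mathrm{Rel}(\underline{S}_4))$
In addition, if the $\underline{S}_i$ are sharp for i = 1, 2, 3, 4 we have:
$\mathrm{Rel}((\underline{S}_1 \times \underline{S}_2 + \underline{S}_3) \vee \underline{S}_4) = \mathrm{conv}((S_1 \times S_2 + S_3) \cup S_4)$
Since $\underline{S}_3 + \underline{S}_5 \wedge \underline{S}_4$ does not contain union, we have:
3.7 Hereditary sharpness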
Def: A representation $\underline{S}$ is hereditarily sharp if, for every partial assignment W of control variables to binary values, with resulting representation $\underline{S}(W)$, we have
$\mathrm{Rel}(\underline{S}(W)) = \mathrm{conv}(\mathrm{Ev}(\underline{S}(W)))$
Thus, a hereditarily sharp representation is sharp (take $W = \emptyset$). A hereditarily sharp representation does not require reformulation at lower nodes of the search tree solely in order to retain sharpness.
Theorem on Hereditary Sharpness: If the composite construction $\underline{Op}(\cdot)$ does not contain occurrences of intersection, it is hereditarily sharp (i.e. takes hereditarily sharp representations to the same).
See [Jer 1984c] for more details on hereditary sharpness. Disjunctive representations are hereditarily sharp. In [Jer Low 1985] we give a sharp representation which is not hereditarily sharp and which arises naturally.
LECTURE 4
TOPICS IN REPRESENTABILITY
Summary: We conclude our survey of MIP representability with a brief discussion of these four topics:
4.1 Reformulation of MIP via distributive laws, which will serve to justify some claims made in Lecture 3, and will be used again in Lecture 9 in connection with logic;
4.2 A discussion of the regularity condition which arises in nonlinear (specifically, convex union) representability, and its relation to some classical results;
4.3 An example of the kind of result which allows a significant reduction in the size of disjunctive representations, when a "combinatorial index set" is present; this result simultaneously extends common uses of Martin's variable redefinition and the union construction of disjunctive methods;
4.4 Some experimental results which illustrate the value of disjunctive formulations.
4.1 Reformulation via distributive laws
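The material in this section occurs in [Jer 1986b]. It is well-known that:
$S \cap (\bigcup_i S_i) = \bigcup_i (S \cap S_i)$
More generally we have:
Proposition: If Op(U, W) has one occurrence of the set U, and W is a vector of sets, then:
$Op(\bigcup_i S_i, W) = \bigcup_i Op(S_i, W)$
Theorem: For a composite construction $\underline{Op}(\underline{U}, \underline{W})$ with one occurrence of the representation $\underline{U}$, and $\underline{W}$ a vector of representations:
(*) $\mathrm{Rel}(\underline{Op}(\bigvee_i \underline{S}_i, \underline{W})) \supseteq \mathrm{Rel}(\bigvee_i \underline{Op}(\underline{S}_i, \underline{W}))$
with = in place of $\supseteq$ when $\underline{Op}$ has no occurrence of intersection.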
The content of the theorem is that the LR is improved (made smaller) by moving union outward. The reader can verify that this is what we did in Lecture 3 (Section 3.5) to derive (2) from (1) and to derive (3) from (2). Note that this improvement in the LR generally increases the size of the representation, unless there are simplifications which occur.
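The effect of moving a union outward can be seen on intervals of the real line. This is a minimal sketch under stated assumptions: W, S1 and S2 are hypothetical intervals with sharp representations, the relaxation of an intersection construction is taken as the intersection of the relaxations, and the relaxation of a union construction as the convex hull of the union of the relaxations. The helper names are ours.

```python
def intersect(a, b):
    """Intersection of two closed intervals (None stands for the empty set)."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def conv_union(*intervals):
    """Convex hull of a union of closed intervals; None entries are empty sets."""
    parts = [iv for iv in intervals if iv is not None]
    if not parts:
        return None
    return (min(lo for lo, _ in parts), max(hi for _, hi in parts))


W, S1, S2 = (2, 3.5), (0, 1), (3, 4)   # hypothetical intervals

# Union inside the intersection: relaxation of W ∧ (S1 ∨ S2).
inner = intersect(W, conv_union(S1, S2))
# Union moved outward: relaxation of (W ∧ S1) ∨ (W ∧ S2).
outer = conv_union(intersect(W, S1), intersect(W, S2))
print(inner, outer)
```

Both constructions describe the same set, the interval from 3 to 3.5, but only the second relaxation equals it; the first relaxation is all of W.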
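Idea of proof: An induction on the formation of $\underline{Op}$:
where $\underline{Op}'$ is a basic construction and we allow that any $\underline{Op}_i(\underline{W})$ may simply be a representation in $\underline{W}$, similarly for $\underline{Op}_1(\underline{U}, \underline{W})$. We have by commutativity
Also note:
Therefore we have that $\mathrm{Rel}(\underline{Op}(\bigvee_i \underline{S}_i, \underline{W}))$ contains: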
59
A slightly more involved argument treats the = case. It is well-knownthat
Su
(nSi) = n ( su si) 1
1
More generally, we have Proposition: If Op(U, W) has one occurrence of the set U and W is a vector of sets oP(n si, W )c OP(si, W )
n i
a
Moreover, if every linear a 5 e
T in Op is 1- 1,
op(nsi, W ) = i
O P ( ~ ; ,W ) i
Theorem: For a canonical construction Op(U. W) with one occurrence of the representation U,and W a vector of repreemtations
(Asi ,w))c R d ( A Op(Si,w))
R ~WP
i
a
Note: This is opposife the has the better LR.
2 for v. Here a generally smaller representation
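Monotonicity Principle: Let $W = (W_1, \dots, W_t)$ and $W' = (W'_1, \dots, W'_t)$ be vectors of sets with representations $\underline{W}$, $\underline{W}'$ and let $\underline{Op}$ be a composite construction of t arguments. Let $\mathrm{Rel}(\underline{W}) \supseteq \mathrm{Rel}(\underline{W}')$ abbreviate $\mathrm{Rel}(\underline{W}_i) \supseteq \mathrm{Rel}(\underline{W}'_i)$ for i = 1,...,t. Then
$\mathrm{Rel}(\underline{W}) \supseteq \mathrm{Rel}(\underline{W}') \rightarrow \mathrm{Rel}(\underline{Op}(\underline{W})) \supseteq \mathrm{Rel}(\underline{Op}(\underline{W}'))$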
By using the two distributive laws and the monotonicity principle, we obtain lattices of reformulations of the same MIP constraint set, where we can predict changes in the LR. There are, at any given point, many options for which distributive laws will be used to obtain a reformulation. The example below illustrates this, with $D = \bigwedge_k D_k$. The arrows point to an improved LR, while "Inc" written on an arrow indicates that the problem representation would increase, unless simplifications occur. Similarly, "Dec" indicates a decreased size of the representation:
Figure 15: A Lattice of Representations
Here are some reformulation principles, to aid in tightening the LR:
1. Always begin by pushing intersections inward (via distributive laws) wherever this is valid (e.g. all linear affine T are 1-1).
2. Gradually move unions outward. When a union reaches the outermost part of the construction, consider the (generally preferred) alternative of branching on it.
3. Always check reformulations for possible simplifications which may reduce size.
4. In choosing which union to move outward, consult the user for information to rank which 'or' subproblems appear to be most crucial. Pretesting may be used to validate user perceptions.
4.2 Convex union representability
In this section, we consider the representation of certain nonlinear sets, specifically, convex constraints with some variables binary. This material is drawn from [Jer 1984b].
Let $F(x; y)$ be a vector of positively homogeneous closed convex functions, with no $-\infty$ values. The following technical regularity condition plays an essential role:
(*) $F(0; y) \le 0 \rightarrow y = 0$
Def: A set S is bounded convex representable (b.c.r.) when there is a vector b, an $F(x; y)$ as above, satisfying (*), and a subset K of the indices of auxiliary variables y with
$x \in S \leftrightarrow$ there is y with $y_j \in \{0,1\}$ for $j \in K$ and $F(x; y) \le b$
Theorem: A set S is b.c.r. iff $S = S_1 \cup \dots \cup S_t$ is a finite union of closed, convex sets $S_i$ with $\mathrm{rec}(S_i)$ independent of i. When S is b.c.r., it has a sharp representation, i.e. an $\underline{S}$, $\mathrm{Ev}(\underline{S}) = S$, with
$\mathrm{Rel}(\underline{S}) = \mathrm{conv}(S)$
We remark that the disjunctive construction can be adapted to produce the sharp representation $\underline{S}$. Also, the canonical constructions go over, with new conditions on their domains (see below). Moreover, the sets representable with K empty are exactly the closed, convex sets.
In the regularity condition (*) the constraints $0 \le y_k \le 1$ for $k \in K$ can be viewed as included in the vector of functions F, without changing its homogeneous nature, etc. Thus (*) is actually equivalent to the apparently stronger requirement:
$F(0; y) \le 0$ and $y_k = 0$ for $k \in K$ $\rightarrow$ $y = 0$
We now check what the regularity condition becomes in the case of a linear affine operation. Here: If S is represented by F then:
$x \in T(S) \leftrightarrow$ there are y, w with $y_j \in \{0,1\}$ for $j \in K$, $F(w; y) \le b$, $x = T(w)$
Writing $T(w) = A(w) + w_0$, with A linear, the regularity condition becomes:
$F(w; y) \le 0$, $0 = A(w)$ $\rightarrow$ $y = 0$, $w = 0$
It need not be true.
Let us now specifically focus on the two common applications of linear affine operations, namely, sums and projections. Let $T(S_1, \dots, S_t) = S_1 + \dots + S_t$ and each $S_i$ be described by $F_i(x; y^{(i)}) \le b^{(i)}$:
$x \in S_1 + \dots + S_t \leftrightarrow$ there exist $y^{(i)}, w^{(i)}$ with $y_j^{(i)} \in \{0,1\}$ for all $j \in K(i)$, all i, $0 \le y_j^{(i)} \le 1$ for all $j \in K(i)$ and all i, $F_i(w^{(i)}; y^{(i)}) \le b^{(i)}$ all i, and $x = \sum_i w^{(i)}$
The regularity condition is:
Lemma: [Jer 1984b] $\mathrm{rec}(S_i) = \{w \mid$ for some $y^{(i)}$ with $y_j^{(i)} = 0$ for $j \in K(i)$, $F_i(w; y^{(i)}) \le 0\}$ provided $S_i \ne \emptyset$.
Hence to meet the regularity condition, we need only assume:
$w^{(i)} \in \mathrm{rec}(S_i)$ for all i and $\sum_i w^{(i)} = 0$ $\rightarrow$ all $w^{(i)} = 0$
This is a classical requirement for $S_1 + \dots + S_t$ to be closed and convex. Similarly for projection of $x = (x^{(1)}, x^{(2)})$ onto $x^{(1)}$:
$x^{(1)} \in \mathrm{proj}(S) \leftrightarrow$ there exist $x^{(2)}$, y with $y_j \in \{0,1\}$, $j \in K$, $F(x^{(1)}, x^{(2)}; y) \le b$
The regularity condition becomes:
$F(0, x^{(2)}; y) \le 0$ with $y_j = 0$ for $j \in K$ $\rightarrow$ $x^{(2)} = 0$, $y = 0$
By the lemma, we need only assume the following to get a representation of proj(S):
$(0, x^{(2)}) \in \mathrm{rec}(S) \rightarrow x^{(2)} = 0$
This also is a classical condition to insure that proj(S) is closed and convex.
In summary, the regularity conditions to obtain representations for T(S) are typically classical hypotheses to insure that T(S) is closed. They are therefore fairly minimal conditions, since a non-closed set S cannot be representable.
Since recession cone restrictions occur on the domain of the union construction in the linear case, and on the domain of the linear affine construction in the nonlinear case, it is worthwhile to collect together results which allow us to compute recession cones.
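Def: $\mathrm{rec}^*(\underline{S}) = \{x \mid$ for some y with $y_j = 0$ for $j \in K$, $F(x; y) \le 0\}$
Proposition: For $S \ne \emptyset$ closed and convex, $\mathrm{rec}^*(\underline{S}) = \mathrm{rec}(S)$. Hence $\mathrm{rec}^*(\underline{S})$ is independent of $\underline{S}$ representing S if $S \ne \emptyset$.
Proposition: $\mathrm{rec}^*(\underline{S}) = \mathrm{rec}^*(RL(\underline{S}))$. Thus for $S \ne \emptyset$, $\mathrm{rec}(\mathrm{Rel}(\underline{S})) = \mathrm{rec}(S)$ is ind. of $\underline{S}$.
Theorem: (See [Jer 1984b])
(a) $\mathrm{rec}^*(\bigvee_i \underline{S}_i) = \mathrm{rec}^*(\underline{S}_i)$ (recall $\mathrm{rec}(S_i)$ is ind. of i)
(b) $\mathrm{rec}^*(\bigwedge_i \underline{S}_i) = \bigcap_i \mathrm{rec}^*(\underline{S}_i)$
(c) $\mathrm{rec}^*(\times_i \underline{S}_i) = \times_i \mathrm{rec}^*(\underline{S}_i)$
(d) $\mathrm{rec}^*(\underline{T}(\underline{S})) = T(\mathrm{rec}^*(\underline{S}))$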
4.3 Using combinatorial principles in representability
A number of recent papers on representability have addressed the issue of the representation of "simple lot-sizing models" which occur in production, for example [Epp Mar 1985], [Bar Van Wol 1984a, 1984b]. The model itself is a classic of Operations Research and it and its variants have been widely used and are solved by dynamic programming [Wag Whi 1968], [Vei 1969], [Zan 1969]. The representation of the constraints associated with lot sizing allows their use in general MIPs in which lot sizing is only part of the constraint set. Our discussion here follows the lines of [Epp Mar 1985] and is based on Martin's variable redefinition approach.
In simple lot sizing, decisions are to be made concerning a single product, which can be produced in each of T periods. Its production involves a fixed charge in each period of positive production, and a concave cost (typically, a linear variable cost) for the amount produced. Finished product inventory may be carried, with a similar cost structure.
Let us use the variables
$x_t$ = amount produced in period t
$I_t$ = inventory carried from period t to period (t + 1)
$y_t$ = 0 if $x_t = 0$, 1 if $x_t > 0$
and let $D_t$ be the (known) demand in period t. It is known [Vei 1969] that the extreme points of the set of solutions have the property that, in any period of positive production the entering inventory is zero. For example, in a T = 5 period problem where every three adjacent components of the 15-component vector represent a triple $(x_t, I_t, y_t)$ in ascending order of the period t, the following vector represents one extreme point solution:
$(D_1 + D_2, D_2, 1, 0, 0, 0, D_3 + D_4 + D_5, D_4 + D_5, 1, 0, D_5, 0, 0, 0, 0)$
In this solution, we manufacture in period 1 for the first two periods and in period 3 for the last three periods, using inventory to satisfy demand in periods 2, 4 and 5 where we do no production. Note that the vector above can be viewed as the sum of the vectors
$(D_1 + D_2, D_2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)$
and
$(0, 0, 0, 0, 0, 0, D_3 + D_4 + D_5, D_4 + D_5, 1, 0, D_5, 0, 0, 0, 0)$
These two vectors can, in turn, be viewed as labels for certain arcs in the following graph; specifically, the first vector labels the arc from node 1 to node 3, while the second labels the arc from node 3 to node 5:
Figure 16: Network on an Index Set
In a similar manner, all extreme point vectors are obtained by choosing a path from node 1 to node 5 in the network of Figure 16, and adding up the associated arc vectors.
In general, there are exponentially many paths through such "lot sizing networks;" nevertheless, a compact representation of the extreme points will be possible, using the network structure as an "index set" for the vectors. In addition, the linear relaxation of the representation of extreme points will provide a representation of all solutions, i.e. will be sharp.
The sharpness will be a consequence of a sharp representation for the index set, i.e. for the set of all paths in a network. Indeed, by placing unit capacities on all arcs in the network, a unit flow into the "source" node 1 and a unit flow out of the "sink" node 5, from the work of Ford and Fulkerson [For Ful 1962], the extreme point solutions to the standard conservation equations at nodes will be the indicator vectors of the paths, where these latter vectors have a component for each arc in the network.
In this manner, we have an example of a set of vectors which is not itself a combinatorial set, and so not the indicator vectors of a combinatorial problem, but for which there is a "hidden" combinatorial structure that "indexes" the set and can be used to achieve sharpness for a compact representation. Let us now abstract this example somewhat.
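The path-indexing idea can be made concrete with a short enumeration. This is a minimal sketch under stated assumptions: nodes are numbered 1 to T + 1 so that arc (s, t) means "produce in period s to cover periods s through t - 1" (this numbering is ours and may differ from Figure 16), the demands are hypothetical positive numbers, and the function names are illustrative.

```python
from itertools import combinations


def arc_vector(s, t, demand):
    """Vector for arc (s, t): produce in period s to cover periods s, ..., t-1."""
    T = len(demand)
    vec = [0] * (3 * T)                            # triples (x_t, I_t, y_t), periods 1..T
    vec[3 * (s - 1)] = sum(demand[s - 1:t - 1])    # x_s: all demand covered by this arc
    vec[3 * (s - 1) + 2] = 1                       # y_s: production takes place in period s
    for k in range(s, t):                          # I_k: inventory leaving period k
        vec[3 * (k - 1) + 1] = sum(demand[k:t - 1])
    return vec


def extreme_point(path, demand):
    """Sum of the arc vectors along a path such as (1, 3, 6)."""
    total = [0] * (3 * len(demand))
    for s, t in zip(path, path[1:]):
        total = [a + b for a, b in zip(total, arc_vector(s, t, demand))]
    return total


def all_paths(T):
    """All forward paths from node 1 to node T + 1."""
    inner = range(2, T + 1)
    return [(1,) + mid + (T + 1,)
            for r in range(T)
            for mid in combinations(inner, r)]


demand = [3, 1, 4, 1, 5]      # hypothetical positive demands D_1..D_5
paths = all_paths(len(demand))
point = extreme_point((1, 3, 6), demand)
print(len(paths), point)
```

The number of paths doubles with each added period, which is why a representation indexed by the arcs of the network, rather than by the paths, is attractive.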
In place of the arc vectors, we consider general representable (b-MIP.r) sets $T_j$, rather than simply singleton sets. The natural representation of singleton sets is sharp, and in general, we will require a sharp representation for each $T_j$. In place of the network, we consider a general combinatorial structure F of ordered pairs (i, j), where in our application $(i, j) \in F$ abbreviates "arc j is on path i." We shall require a sharp representation for the incidence vectors of the collection of sets $\{j \mid (i, j) \in F\}$ as i (the paths) varies. In this broad setting, we are now prepared to state the representation result which will account for lot sizing as one application.
Theorem: Let $S = \bigcup_i S_i$, each $S_i = \sum_{j \mid (i,j) \in F} T_j$, with each $T_j \ne \emptyset$, $\mathrm{rec}(T_j)$ independent of j. Let $f_j(x; y^{(j)}) \le b^{(j)}$, $y_p^{(j)} \in \{0,1\}$ for $p \in K_j$, be a sharp representation of $T_j$. Let $g(u; w) \le b$, $w_p \in \{0,1\}$ for $p \in K$ be a sharp representation of
$V = \{u$ binary $\mid$ for some i, $[u_j = 1$ iff $(i, j) \in F]$ for all j$\}$
Then the following is a sharp representation of S:
$x \in S \leftrightarrow$ for some $x^{(j)}, y^{(j)}, u, w$ with $w_p \in \{0,1\}$ for $p \in K$, $y_p^{(j)} \in \{0,1\}$ for $p \in K_j$ all j, $0 \le y_p^{(j)} \le u_j$ for $p \in K_j$ all j, we have $f_j(x^{(j)}; y^{(j)}) \le b^{(j)} u_j$ all j, $g(u; w) \le b$ and $x = \sum_j x^{(j)}$
We shall provide a proof of the theorem above in a forthcoming paper.¹
The description $\underline{S} = \bigvee_i (\sum_{j \mid (i,j) \in F} \underline{T}_j)$ is already sharp, since it has no occurrence of intersection. Only the size of the representation is at issue in this result. When the $T_j$ are general b-MIP.r sets, but F is the diagonal, i.e. $F = \{(j, j) \mid$ all j$\}$ with sharp description $\sum_j u_j = 1$, this result becomes the union construction of the disjunctive methods.
Several variations of simple lot sizing are treated by the same approach, so long as the problem reduces to the set of extreme points described earlier (or a very similar set). The versions in the literature differ in this respect from what we described above, that only the implication from $x_t > 0$ to $y_t = 1$ is required in the literature.
To summarize, polyhedral combinatorics can often be used to obtain sharp representations of the base sets (the "modules") of a composite construction. It is equally useful as a means of entirely restructuring the representation of a composite set, when relevant combinatorial objects are used merely to index such a construction. The results obtainable from polyhedral combinatorics are, on the one hand, fairly specialized. However, they are among the most powerful principles for limiting enumeration and making representations compact.
¹ Added in proof: See our technical report, "Two Mixed Integer Programming Formulations Arising in Manufacturing Management."
These new formulations further extend the concept of "special structures."
4.4 Some experimental results
Here we summarize two computer experiments reported in [Jer Low 1985] and described in further detail in an earlier technical report. All problems were randomly generated with specific structure as described below.

4.4.1 Either/or constraints
The scenario is a multi-division firm in which each division has a choice of technologies. The composite structure is of depth 3:
Here $P$ represents the common constraints on corporate resources, and $P_{ij}$ is the $j$-th technology available for division $i$. In all problems run, $P$ has three constraints, and all $P_{ij}$ are 3 by 3. "$N_1$-$N_2$" below means $N_1$ divisions, $N_2$ technologies per division. In the following tables:
MIPP = formulation via composite construction, with disjunctive methods for the either/or constraints of technology choice;

MIPS = "standard" formulation for either/or constraints.

The right-hand-side (r.h.s.) multiplier is a measure of the degree to which the common constraints dominate the problem. At the setting 1.1, these are moderately tight, and so diminish the advantage of the sharp formulation for either/or constraints. At the setting 1.9, the either/or constraints dominate. In all three tables, the feature which stands out is the LP/Discrete ratio, which gives the ratio of the value of the linear programming relaxation to the value of the integer program. The fact that the two programs are so close in value accounts for the favorable results, and it is an algorithm-independent measure (our problems were run on APEX IV). In general the size of our "sharp" formulation MIPP was at least twice that of the "standard" formulation MIPS.
I TIME
MIPP
I
I
I
I
MI S
NODES
MIPP
MIPS
7
1.0005
1.084
0.92
19
1.0047
1.098
5
29.08
65
1.0007
1.174
3.92
12
6.67
13
1.0048
1.088
6.93
3.75
7
24.25
57
1.00275
1.102
8.85
33.5
5.16
10
69.25
165
1.0008
1.1336
9
21.43
484'
4.11
9
140,
1.00168
1.1466**
4
29.1
399'
5
6
1.00046
1.144.'
#
MIPP
3-2
12
.63
0.39
1.83
3
3.75
5-2
12
1.86
2.36
2.5
6
8-2
12
3.58
3-3
12
1.64
0.91
5-3
12
3.15
8-3
12
12-3 15-3
Problem
MIPS
Avg
Max
Avg
Max
-
11.52 2.17
only one sample ++
382
-
the ratio is the LP over the best solution found
Table II: MULTI-DIVISION PROBLEMS: R.H.S. MULTIPLIER (1.3)

                    TIME          NODES (MIPP)    NODES (MIPS)    RATIO LP/DISCRETE
Problem    #     MIPP     MIPS     Avg    Max      Avg    Max      MIPP      MIPS
3-2       18     0.83     0.69     2.2      4      4.9      7     1.0066    1.142
5-2       18     1.92     2.72     2.3      4      9.0     29     1.0009    1.118
8-2       18     5.89    18.60     2.7      7     39.6     91     1.0013    1.174
3-3       18     1.69     1.52     3.4     10      9.7     17     1.0032    1.133
5-3       18     4.20    11.52     3.9     10     38.2    119     1.0011    1.174
8-3       18    10.99   123.5      4.5     18    194.7    597     1.0017    1.116
12-3
3
19.4
-
6.3
9
-
1.00008
9
-
1.00008
15-3
3
-
30.5
-
4
I
I
Table III: MULTI-DIVISION PROBLEMS: R.H.S. MULTIPLIER (1.9)

                    TIME          NODES (MIPP)    NODES (MIPS)    RATIO LP/DISCRETE
Problem    #     MIPP     MIPS     Avg    Max      Avg    Max      MIPP      MIPS
3-2        8     0.97     1.44     2.0      3      8.75    14     1.0026    1.186
5-2        8     2.32    10.42     2.0      3     32.75    55     1.0006    1.2383
8-2        8     6.60    64.92     3.0      6    115.4    278     1.0034    1.1926
3-3        8     1.59     4.13     2.38     3     21.1     31     1.0088    1.1971
5-3        8     5.57    41.77     4.12     9    107.7    337     1.0041    1.2393
8-3        8    12.22   311.68     3.63     7    407.4    762     1.0013    1.2283
12-3       3    25.0    325*       3.66     7    218*       -     1.0004    1.1408
15-3       4    35.15   306*       4       14    270*       -     1.0008    1.315

* only one sample
** the ratio is the LP over the best discrete found when stopped
4.4.2 Multiple fixed charges
Fixed charge problems with multiple charges and increasing returns to scale were modelled. The graph of a typical function of this type is as drawn below
Figure 17: Complex Fixed Charges

In this scenario, the $z_i$ were the number of components of various types which could be used to assemble several different types of end products, with known demand for sets of "equivalent" end products. We explored several formulations. In the "sharp" formulation, the cost functions $F_i$ were represented by the disjunctive formulation. In the "linked sharp" formulation, the cost functions $F_i$ were each additively decomposed into a sum of fixed charges plus an "economies of scale" function. Each component was modeled sharply via a disjunctive formulation, and these were then added. In addition, formulations I and II are similar to those appearing in the literature and were used as two "standard" formulations. In our data, the parameter p is a measure of the (minimum) per cent of cost tied up in fixed charges; these were relatively high on the average. Also NXi is the number of component variables $z_i$.

Table IV

MODEL           # CONSTRAINTS     TOTAL # VARIABLES    INTEGER VARIABLES
STANDARD I      10*NXi + NC       11*NXi + NYt         6*NXi
STANDARD II     13*NXi + NC       14*NXi + NYt         6*NXi
SHARP            6*NXi + NC       10*NXi + NYt         3*NXi
LINKED-SHARP    11*NXi + NC       19*NXi + NYt         6*NXi

As can be seen in Table IV, the size of the sharp representation is the smallest of all representations. Here is an instance where hereditary sharpness is achieved with an improvement in representation size, due to problem-dependent simplifications. As can be seen from the data in Tables V to VIII, these problems were difficult for all methods of representation. This occurs because the linear relaxation for even a sharp representation of fixed costs is a poor estimate of the actual cost in most of the range of the variable, and here the fixed costs were dominant in the data. Nevertheless, the sharp representation was better, and its relative advantage increases with problem size, as in 4.4.1.
t+
SAMPLES
P = 0.5
DISCRETE/ LPRATIO
SHARP
1.36 1.86
34.04
I
11
1.36
I #
# SAMPLES
I
P = 0.1 1 TIME NODES ~
SHARP I
II
, r 83138
DISCRETE/ LP RATIO 1.22
25.01
65.61
138.25
1.48
106.29
131.38
1.22
I
Table VI: Seven problems with P = 0.3, NXi = 5

                Avg. time    Avg. time to     Total    Number
Formulation     to LP        find optimum     time     of nodes    Discrete/LP
Sharp           1.3 sec      14.3             18.7     63.3        1.26
I               2.0           8.7             41.3     84.4        1.50
Linked-sharp    3.8          30.0             60.9     79.3        1.26
Table VII: Six problems with P = 0.3, NXi = 6

                Avg. time    Avg. time to     Total    Number
Formulation     to LP        find optimum     time     of nodes    Discrete/LP
Sharp           2.0 sec.     22.6              36.5     94.8       1.32
I               3.0          40.3             120.2    176.6       1.56
Linked-sharp    5.5          25.2             129.2    119.2       1.32
Table VIII: A harder problem at NXi = 6, P = 0.3
Formulation
1
Time to LP
2.0 sec. Linked-sharp
I
I Total time
of nodes
57.4
62.9
137
67.7
2133
2900
113.4
2400 unknown
Time to OPT
Number
1 :!i!
Discrete/LP
Part II
LOGIC-BASED APPROACHES TO DECISION SUPPORT
LECTURE 5 PROPOSITIONAL LOGIC AND MIXED INTEGER PROGRAMMING

Summary: We begin our discussion of the logic-based approaches to systematizing human intelligence with an exposition of the propositional logic and its relation to mixed-integer programming. This is a natural starting point for mathematical programmers, since propositional logic can be viewed as a special kind of integer programming constraint set. In addition, many of the successful current practical uses of logic in decision support do not go far beyond propositional logic, although this situation may soon change. We defer until the next lecture a fuller discussion of the logic-based approaches, and a treatment of a more complex logic, the predicate logic. Predicate logic is the theoretical basis of the theorem-proving framework of PROLOG [Clo Mel 1984] and of related technological efforts, including the Japanese "Fifth Generation Project" [Feig McCor 1982].
5.1 Introduction
The propositional logic concerns assertions such as "John is tall," "Mary went to the store," etc. with a definite meaning, such that these assertions are either true or false. The focus of this logic is not on the meaning of the assertions, nor even necessarily on whether they are true or false. Rather, the focus concerns how the "unanalyzed" basic assertions are combined by means of the logical connectives 'and', 'or', 'not', 'implies', etc. and on the laws governing such combinations. The unanalyzed assertions are represented by a numbered sequence of letters $P_1, P_2, P_3, \ldots$, which, in informal discussions, are written $P$, $Q$, $R$, etc. A literal is a letter ($P_j$) or its negation ($\neg P_j$). More complex propositions are built up from $P_1, P_2, P_3, \ldots$ by means of the connectives. These propositions are denoted $A$, $B$, $C$, etc. The meaning of the connectives is as follows:

$A \wedge B$ asserts that both $A$ and $B$ are true. '$\wedge$' is read 'and' (conjunction).

$A \vee B$ asserts that at least one of $A$, $B$ is true (possibly both are true). '$\vee$' is read 'or' (disjunction).

$\neg A$ asserts that $A$ is false. '$\neg$' is read 'not' (negation).

$A \supset B$ abbreviates $\neg A \vee B$. '$\supset$' is read 'implies' (implication).

We remark that $A \supset B$ does not assert that $A$ causes $B$, only that either $A$ is false or $B$ is true. Other notations used elsewhere for $A \supset B$ are $A \rightarrow B$ and (in logic programming) $B \leftarrow A$ (read: $B$ if $A$). Let $+P_j$ abbreviate $P_j$ and let $-P_j$ abbreviate $\neg P_j$. A disjunctive clause is a proposition of the form $\bigvee_{j \in K} \pm P_j$ for some finite index set $K \subseteq \{1, 2, 3, \ldots\}$.
E.g. $(\neg P_2 \vee P_1 \vee P_{20} \vee \neg P_{17})$ is a disjunctive clause.

A conjunctive normal form (CNF) is a conjunction of disjunctive clauses:

$(\neg P_2 \vee P_1 \vee P_{20} \vee \neg P_{17}) \wedge (\neg P_1 \vee P_{10}) \wedge (P_2 \vee \neg P_{20} \vee P_{17})$

is a CNF.

Sometimes a CNF is given as a list of disjunctive clauses:

$\neg P_2 \vee P_1 \vee P_{20} \vee \neg P_{17}$
$\neg P_1 \vee P_{10}$
$P_2 \vee \neg P_{20} \vee P_{17}$

It is tacitly understood that all clauses are asserted as true.
The propositional logic has greatly influenced the disjunctive methods of mixed-integer programming. In fact, these methods concern simply the negationless propositional logic in which unanalyzed propositions have been replaced by systems of linear inequalities. We observe that a binary mixed integer program constraint set

$(Az \ge b) \wedge (z_1 = 0 \vee z_1 = 1) \wedge \ldots \wedge (z_r = 0 \vee z_r = 1)$

is a CNF, in which the propositional letters are systems of linear inequalities. In MIP, negation does not occur. In fact, the spatial negation of a linear inequality system (e.g. $z_1 = 0$) is typically not closed (as e.g. $z_1 \neq 0$), hence not representable.
In many settings, relative complement (relative to the b-MIP.r set forming the 'universe') can serve as a negation for at least some of the 'letters.' E.g.

$\neg(z_1 = 0) = (Az \ge b) \wedge (z_1 = 1) \wedge (z_2 = 0 \vee z_2 = 1) \wedge \ldots \wedge (z_r = 0 \vee z_r = 1)$
$\neg(Az \ge b) = \emptyset$

(but typically no negation for individual inequalities inside $Az \ge b$). Let us return to the propositional logic proper. We note some obvious basic laws, such as:

Symmetry: $A \wedge B = B \wedge A$, $A \vee B = B \vee A$
Associativity: $A \wedge (B \wedge C) = (A \wedge B) \wedge C$, $A \vee (B \vee C) = (A \vee B) \vee C$
De Morgan Laws: $\neg(A \wedge B) = \neg A \vee \neg B$, $\neg(A \vee B) = \neg A \wedge \neg B$

In these laws, the meaning of the equality is that the left-hand-side (l.h.s.) proposition always has the same truth value (true or false) as does the right-hand-side (r.h.s.), and this holds regardless of the truth values of the unanalyzed propositions $P_j$. These laws are used to 'drive negations inward' until they are against letters ($\neg P_j$) or disappear (as $\neg\neg P_j = P_j$). This subroutine is efficient; in fact, it is linear time. Two other important laws are the:
Distributive laws:
(1) $A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C)$
(2) $A \vee (B \wedge C) = (A \vee B) \wedge (A \vee C)$

After using the De Morgan Laws to put any negations against letters, (2) can be repeatedly employed to reach a conjunctive normal form. We illustrate this general fact by a simple example:

$P \vee (Q \wedge R) \vee (X \wedge Y)$
$= [(P \vee Q) \wedge (P \vee R)] \vee (X \wedge Y)$
$= [(X \wedge Y) \vee (P \vee Q)] \wedge [(X \wedge Y) \vee (P \vee R)]$
$= [(X \vee P \vee Q) \wedge (Y \vee P \vee Q)] \wedge [(X \vee P \vee R) \wedge (Y \vee P \vee R)]$
This use of distributive laws can, in the 'worst case', require exponential space. To see this, apply it to e.g. $(P_1 \wedge P_2) \vee (P_3 \wedge P_4) \vee \ldots \vee (P_{2n-1} \wedge P_{2n})$. A corollary of the above use of distributive laws is:

Corollary: Any proposition has an equivalent proposition in conjunctive normal form.

The concept of a disjunctive normal form is developed by analogy with the CNF, by interchanging the roles of 'and' with 'or'. Specifically, a conjunctive clause is a conjunction of literals $\bigwedge_{j \in K} \pm P_j$. A disjunctive normal form (DNF) is a disjunction of conjunctions of literals (as e.g. above in $P \vee (Q \wedge R) \vee (X \wedge Y)$). Using the De Morgan Laws and distributivity we have:

Corollary: Every proposition has an equivalent proposition in disjunctive normal form.

With this perspective, some of the computational issues in connection with the disjunctive methods can be explained as follows. The unanalyzed MIP is in CNF, while the disjunctive methods require a DNF, and the natural conversion of CNF to DNF requires exponential time and space in the worst case. For this reason, there is a need to analyze substructures where either the conversion is simple or simplified, or where the DNF is the natural formulation. For the same reason, there is a need to be able to use formulations intermediate between the CNF and the DNF, and to provide means of moving stepwise from CNF to DNF in a way which guarantees step-by-step improvements (thus our distributive laws for relaxations in Lecture 4). Glover's polyhedral annexation approach is notable in this perspective, as it allows derivation of cutting-planes directly from a CNF formulation [Glo 1975b], and this can be advantageous.
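The exponential growth caused by the distributive laws can be observed directly. The following is a minimal sketch, not part of the original lectures: it assumes a literal is encoded as a nonzero integer (+j for $P_j$, -j for $\neg P_j$), and the function names `dnf_to_cnf` and `pairs_dnf` are hypothetical. It converts a DNF to a CNF by choosing one literal from each conjunctive clause, which is what repeated use of distributive law (2) amounts to.

```python
from itertools import product

def dnf_to_cnf(dnf):
    """Distribute 'or' over 'and': every way of picking one literal from
    each conjunctive clause of the DNF yields one disjunctive clause."""
    clauses = set()
    for choice in product(*dnf):
        clause = frozenset(choice)
        # A clause containing both P_j and not-P_j is always true; skip it.
        if any(-lit in clause for lit in clause):
            continue
        clauses.add(clause)
    return clauses

def pairs_dnf(n):
    """(P1 & P2) | (P3 & P4) | ... | (P_{2n-1} & P_{2n}) as integer literals."""
    return [[2 * k - 1, 2 * k] for k in range(1, n + 1)]

# The illustration P v (Q & R) v (X & Y), with P, Q, R, X, Y numbered 1..5.
small = dnf_to_cnf([[1], [2, 3], [4, 5]])

# Number of CNF clauses for n = 1..5 pairs; each additional pair doubles it.
sizes = [len(dnf_to_cnf(pairs_dnf(n))) for n in range(1, 6)]
```

For the illustration above this yields the same four clauses as the hand derivation, while the family of $n$ pairs yields $2^n$ clauses.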
5.2 A "natural deduction" system for propositional logic
In the propositional logic, the tautologies play a special role: these are the composite propositions $A$ which are true regardless of the truth values assigned to the letters $P_j$. For example, $P_1 \vee \neg P_1$ is a tautology, and so is $A \wedge B \supset B \wedge A$ (by the symmetry laws) as well as $A \wedge (B \vee C) \supset (A \wedge B) \vee (A \wedge C)$ (by the distributive laws).
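The definition of a tautology can be checked by brute force over all truth valuations. The sketch below is only an illustration under assumptions of our own: propositions are encoded as nested tuples such as `("implies", A, B)`, and the names `letters`, `value` and `is_tautology` are hypothetical. Its running time is exponential in the number of letters.

```python
from itertools import product

def letters(f):
    """Collect the names of the letters occurring in formula f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(letters(g) for g in f[1:]))

def value(f, v):
    """Truth value of formula f under the valuation v (dict: name -> bool)."""
    op = f[0]
    if op == "var":
        return v[f[1]]
    if op == "not":
        return not value(f[1], v)
    if op == "and":
        return value(f[1], v) and value(f[2], v)
    if op == "or":
        return value(f[1], v) or value(f[2], v)
    if op == "implies":          # A implies B abbreviates (not A) or B
        return (not value(f[1], v)) or value(f[2], v)
    raise ValueError(f"unknown connective: {op}")

def is_tautology(f):
    """True iff f is true under every assignment of truth values."""
    names = sorted(letters(f))
    return all(value(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

A, B, C = ("var", "A"), ("var", "B"), ("var", "C")
symmetry = ("implies", ("and", A, B), ("and", B, A))
distributive = ("implies", ("and", A, ("or", B, C)),
                ("or", ("and", A, B), ("and", A, C)))
excluded_middle = ("or", A, ("not", A))
```

Here `is_tautology` accepts the three propositions named in the text and rejects, for instance, `("or", A, B)`.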
A mechanical way of testing a proposition $A$ to decide if it is a tautology is to try out all (exponentially many) possible truth values for the letters in $A$, and to see if all of these make $A$ true. This method can be wasteful of computation, as we see from $A \wedge B \supset B \wedge A$, where the form of the proposition makes it a tautology. Generally, many methods have been devised to speed up tautology testing, none of which is known to be faster than exponential time in the worst case. The classical approach of logic to generating (as opposed to testing) exactly the set of tautologies is through various systems of deduction. We favor the natural deduction systems, as they seem closest to human reasoning. Natural deduction is due to Gentzen (see e.g. [Gen 1969]) and the specific system we next present is the propositional logic part of Prawitz' system [Pra 1955]. Prawitz's monograph is now hard to obtain, and also the majority of logic texts present other systems, which focus more on characterizing the tautologies by logical axioms and rules than on the "naturality" of the system. Good texts are [Men 1964] and [Shoen 1967]. In the natural deduction systems, there is a special symbol '$\Lambda$' which stands for 'absurdity.' $\neg A$ abbreviates $A \supset \Lambda$. (If 'absurdity' is true, then everything is true.) Each propositional connective, except negation, has an introduction rule and an elimination rule.
($\wedge$I): from $A$ and $B$, infer $A \wedge B$.

($\wedge$E): from $A \wedge B$, infer $A$; and from $A \wedge B$, infer $B$.

The assumptions $(A)$ and $(B)$ in the ($\vee$E) rule, $(A)$ in the ($\supset$I) rule, as well as $(\neg A)$ in the ($\Lambda_c$) rule, which are written above the premiss of a rule, can be discharged by the rule (i.e. no longer count as an 'assumption'). Deductions are in tree form. Natural deduction systems have no axioms. Their theorems are those propositions having proofs with no assumptions (i.e. all top formulae of the proof are discharged somewhere in the proof). We next give some illustrations of proofs in this natural deduction system.
Example 5.2.1:

  A ∧ B¹     A ∧ B¹
  ------     ------
    B          A
  -----------------
        B ∧ A
  ----------------- ¹
   A ∧ B ⊃ B ∧ A

Since all assumptions (i.e. top formulas) are discharged by the bottom line, the bottom line is proven. Let us read this proof, line-by-line. From the leftmost top formula $A \wedge B$ we deduce $B$ by ($\wedge$E); from the rightmost we deduce $A$ also by ($\wedge$E). Then we deduce $B \wedge A$ using ($\wedge$I). Now we note that $B \wedge A$ has been proven from $A \wedge B$; so we deduce $A \wedge B \supset B \wedge A$ using ($\supset$I), and discharge both top formulae as assumptions. Since $A \wedge B \supset B \wedge A$ has no assumptions, it is a tautology. The proof above certainly does follow humanlike reasoning, if perhaps a little slowly. We next give three more proofs of tautologies, which will illustrate that some "less natural" proofs can also be implemented in natural deduction, and provide some practice with this system. The reader should justify every step in each proof. We use superscripts to mark places in the proofs where assumptions are discharged, with the same number occurring at the discharged formulae.
Example 5.2.2: B ∨ ¬B (excluded middle)

     B¹
  --------
   B ∨ ¬B      ¬(B ∨ ¬B)³
  ------------------------
             Λ
          ------- ¹
            ¬B
          --------
           B ∨ ¬B      ¬(B ∨ ¬B)³
          ------------------------
                     Λ
                 --------- ³
                  B ∨ ¬B
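Example 5.2.3: (half a De Morgan Law)

                    A ∧ B¹                A ∧ B²
                    ------                ------
             ¬A³      A            ¬B³      B
             ----------            ----------
                 Λ                     Λ
             ---------- ¹          ---------- ²
  ¬A ∨ ¬B⁴    ¬(A ∧ B)              ¬(A ∧ B)
  -------------------------------------------- ³
                   ¬(A ∧ B)
          --------------------------- ⁴
           ¬A ∨ ¬B ⊃ ¬(A ∧ B)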
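Example 5.2.4: (half a Distributive Law)

                  A ∧ (B ∨ C)²               A ∧ (B ∨ C)²
                  ------------               ------------
  A ∧ (B ∨ C)²         A       B¹                 A       C¹
  ------------    ----------------           ----------------
     B ∨ C             A ∧ B                      A ∧ C
                 -------------------        -------------------
                 (A ∧ B) ∨ (A ∧ C)          (A ∧ B) ∨ (A ∧ C)
  -------------------------------------------------------------- ¹
                       (A ∧ B) ∨ (A ∧ C)
           ----------------------------------------- ²
            A ∧ (B ∨ C) ⊃ (A ∧ B) ∨ (A ∧ C)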
While the system above possesses a naturality, it can seem cumbersome to use, due to the need to keep track of the tree structure and discharging. While compact notations alleviate this for machine implementation, many authors prefer simpler proof systems which involve some axioms, a rule of deduction, and linear proofs with no discharging. All the different logical systems for propositional logic must be justified by a completeness theorem, i.e. a theorem that they prove exactly the tautologies. Such a result holds in this case as well.
Theorem: The above system proves $B$ $\Leftrightarrow$ $B$ is a tautology.
5.3 Propositional logic as done by integer programming - 1
A propositional form $B$ is satisfiable iff it is true for at least one truth valuation (equivalently, $\neg B$ is not a tautology, so that $B$ can be true). The satisfiability problem is:

Given: a proposition $B$
To determine: is $B$ satisfiable?
Integer programming is oriented more toward satisfiability testing than toward tautology testing, although these are equivalent tasks. There is a 'standard' way of 'imbedding' propositional logic into Euclidean space, so as to deal with it by integer programming techniques. By 'standard', we mean it is found frequently in the technical literature and in textbooks. (In Lecture 9, we shall briefly cite alternative imbeddings, of which there are many with advantages over the standard imbedding.) The standard representation of a disjunctive clause $\bigvee_{j \in K} \pm P_j$ is

$\sum_{j \in K} z(\pm P_j) \ge 1$, $z(P_j) \in \{0, 1\}$

where $z(\neg P_j)$ is $1 - z(P_j)$. For example, the standard rep. of $\neg P_1 \vee P_2 \vee P_5$ is

$(1 - z(P_1)) + z(P_2) + z(P_5) \ge 1$
Here the value '1' stands for 'true' and '0' for 'false'. This linear inequality holds exactly if the clause is true. In this manner, a list of disjunctive clauses (i.e. a CNF) becomes a list of (pure) integer programming constraints of equivalent satisfiability. How hard are satisfiability problems to solve, when done by integer programming? Our experience with satisfiability problems is confined to randomly-generated problems. We have made a fairly extensive search for "real-world" (non-Horn) satisfiability problems, only to find that few of these occur. Theoretical results about randomly generated problems appear to depend very much on the probability distribution chosen for problem generation and the solution method used. For example, one version of satisfiability problems is very easy to solve by simple heuristics, when satisfiable [Fra Ho 1986], at least with a diminishing probability of error, while other versions are intractable by a moderately skillful exact method (private communication from V. Chvatal). In our experience, randomly-generated satisfiability problems are easy to solve by a standard MIP code (APEX IV) with no adaptations or special features. Our random generation methods are described in detail in [Low 1984]. Briefly summarized, for pure satisfiability problems, after fixing a clause length and a number of clauses and letters, each clause is filled by drawing equiprobably from the letters without replacement. The sign of the letters is then chosen either at random in each occurrence (first method), or to be opposite the sign of the previous occurrence (second method) with the first occurrence sign at random.
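The claim that the inequality holds exactly if the clause is true can be checked mechanically on small instances. The sketch below is illustrative only: the integer encoding of literals (+j for $P_j$, -j for $\neg P_j$) and the names `standard_row`, `clause_true` and `row_holds` are assumptions of ours, not part of the codes used in the experiments.

```python
from itertools import product

def standard_row(clause, n):
    """Standard representation of a clause over letters 1..n, returned as
    (coeffs, rhs) meaning sum_j coeffs[j] * z[j] >= rhs.
    Each z(not P_j) = 1 - z(P_j) contributes -z[j] and moves 1 to the rhs."""
    coeffs = [0] * (n + 1)      # index 0 is unused
    rhs = 1
    for lit in clause:
        j = abs(lit)
        if lit > 0:
            coeffs[j] += 1
        else:
            coeffs[j] -= 1
            rhs -= 1
    return coeffs, rhs

def clause_true(clause, z):
    """Truth of the clause when z[j] = 1 means P_j is true."""
    return any((z[abs(lit)] == 1) == (lit > 0) for lit in clause)

def row_holds(row, z):
    coeffs, rhs = row
    return sum(c * zj for c, zj in zip(coeffs, z)) >= rhs

# The example clause: not-P1 or P2 or P5.
clause = [-1, 2, 5]
row = standard_row(clause, 5)

# Compare clause truth and inequality on all 2^5 zero-one points.
agree = all(clause_true(clause, [0, *bits]) == row_holds(row, [0, *bits])
            for bits in product([0, 1], repeat=5))
```

For the example clause the row is $-z(P_1) + z(P_2) + z(P_5) \ge 0$, which is the displayed inequality with the constant moved to the right-hand side.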
In some problems, very fast heuristics were used to prescreen the problems and to leave a part which was "hard for humans." These problems are the ones in which either the number of clauses or letters fail to be a multiple of five. In other problems, the prescreening was not done. It seemed not to make any difference to the computer, as was also the case with the "boundary" between NP and polynomial time (i.e. clausal size s = 3 versus s = 2). We used three MIP codes: BANDBX supplied by Clarence H. Martin (actually an IP code), a code from the book by Land and Powell, and APEX IV. We used APEX IV after it arrived at our campus. We summarize some of our results in the next three tables.

                                                  TIME (CPU secs)
PROBLEM SIZE          SATISFIABLE?    NODES       LP       TOTAL
L=31, C=44, s=2       NO              2            4.0       6.5
L=35, C=45, s=2       NO              3            4.7       9.8
L=37, C=45, s=2       YES             4            4.7      10.8
L=36, C=52, s=2       NO              3            6.0      14.4
L=46, C=63, s=2       NO              3            8.7      21.3
L=53, C=68, s=2       YES             3           10.1      21.5
L=36, C=39, s=3       YES             1            3.9       4.1
L=38, C=45, s=3       YES             2            5.5       6.5
L=43, C=45, s=4       YES             1            5.2       5.4
L=40, C=45, s=4       YES             2            5.3       7.0
L=25, C=35, s=5       YES             1            3.5       3.6

Table I: SATISFIABILITY TESTS USING BANDBX
                                                  TIME (CPU secs)
PROBLEM SIZE          SATISFIABLE?    NODES       LP       TOTAL
L=38, C=40, s=2       YES             1            3.4       3.5
L=42, C=40, s=2       YES             1            3.6       3.7
L=44, C=45, s=2       YES             2            4.8       6.5
L=45, C=45, s=2       YES             5            4.7      13.0
L=44, C=52, s=2       YES             1            5.5       5.6
L=43, C=55, s=3       NO              3            6.3      13.7
L=45, C=60, s=3       NO              3            7.6      13.4
L=45, C=60, s=4       NO              3            7.3      15.5

Table II: SATISFIABILITY TESTS USING LAND AND POWELL'S CODE
                                                  TIME (APEX)
CLAUSES    LITERALS    CONSISTENT?    NODES       LP       TOTAL
300        160         YES            1           1.4       1.8
300        120         YES            1           1.0       1.4
300        100         YES            1           1.3       1.7
400        100         YES            1           0.9       1.3
400         60         YES            1           1.1       1.4
500         60         YES            2           1.8       2.9
500         50         YES            1           1.2       1.6
600         60         YES            5           4.2      35.3

Table III: TESTS USING APEX IV
We were able to create a hard problem by selecting certain letters in a satisfiability problem, and fixing these to truth values. When "too many" were fixed, the problem was inconsistent. As we gradually relaxed the number of letters fixed, the problem moved toward LP feasibility, and the highest run times occurred just at the point feasibility began. The original satisfiability problem which we modified had 400 clauses, 100 letters, and three literals per clause. The data is in Table IV. We believe that the "incumbent finding feature" of branch-and-bound, which is present in MIP approaches to satisfiability, but not present in traditional logic approaches, was crucial to our favorable run times on consistent problems. It remains open whether incumbent finding by linear programming can be replaced by a faster routine developed by list processing. I would conjecture that our run times can be improved by two orders of magnitude, via specialized codes.
                                              TIME (APEX UNITS)
TOTAL FIXED    CONSISTENT?    TOTAL NODES     LP        TOTAL
38             NO             1               INFEAS     11.3
20             NO             1               INFEAS     40.9
18             NO             1               INFEAS     71.0
16             NO             1               INFEAS     62.4
14             NO             1               INFEAS     84.1
12             NO             13              88.7      266.8
10             NO             16              77.0      299.8
8              YES            13              75.6      170.2

Table IV: CREATING A HARD PROBLEM
5.4 Clausal chaining: a subroutine
We shall be studying one of the most effective logic-based algorithms for satisfiability testing, the algorithm of Davis and Putnam [Dav Put 1960] in the form treated in [Lov 1978], which we call DPL. DPL is very closely related to MIP algorithms, at least when these utilize the standard representation of disjunctive clauses. To learn DPL, we first learn its most important subroutine, which we call "clausal chaining" (CC) and which is also called "unit resolution" [Lov 1978]. Here is a description of clausal chaining:

Given: A list of disjunctive clauses.

First: Delete any clause which contains both a $P_j$ and $\neg P_j$. Go to repeat.

Repeat: Look for unit clauses (i.e. one-literal clauses). If there are none, stop. If there is a unit clause $\pm P_j$ which has been made false, declare the problem inconsistent and stop; similarly if both $P_j$ and $\neg P_j$ are unit clauses. Otherwise, make $\pm P_j$ true; delete clauses in which $\pm P_j$ occurs; delete $\mp P_j$ from any clause in which it occurs. Go to repeat.
Diagnosis: If the list of clauses is empty when clausal chaining stops, the original list was consistent. If CC declares inconsistency, it is correct. If neither case holds, we don't know.
Clausal chaining is not a satisfiability tester, since it can stop due to non-unit clauses, although the problem is inconsistent. However, for certain distributions of problem instances, it is very powerful when combined with some trivial tests (see [Fra Ho 1986]). Clausal chaining can be implemented in linear time, by using the proper data structures adapted from [Dow Gal 1984]. We next give an example of CC. Note that the first step is to remove the fifth clause, where both $P_4$ and $\neg P_4$ occur.
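The description of clausal chaining above can be sketched in a few lines. This is a simple illustrative version, not the linear-time implementation based on [Dow Gal 1984]: it rescans the clause list on every pass. The integer encoding of literals (+j for $P_j$, -j for $\neg P_j$) and the name `clausal_chaining` are assumptions made for the sketch.

```python
def clausal_chaining(clauses):
    """Unit resolution on a list of clauses (each an iterable of nonzero ints).
    Returns (status, settings, remaining), where status is 'consistent',
    'inconsistent' or 'unknown', following the Diagnosis step."""
    # First: delete any clause containing both P_j and not-P_j.
    work = [set(c) for c in clauses if not any(-lit in c for lit in c)]
    settings = {}
    while True:
        # Repeat: look for unit clauses; stop if there are none.
        units = [next(iter(c)) for c in work if len(c) == 1]
        if not units:
            break
        lit = units[0]
        if -lit in units:                 # both P_j and not-P_j are units
            return "inconsistent", settings, work
        settings[abs(lit)] = lit > 0      # make the unit literal true
        new_work = []
        for c in work:
            if lit in c:
                continue                  # clause is satisfied: delete it
            c = c - {-lit}                # delete the falsified literal
            if not c:                     # a clause has been made false
                return "inconsistent", settings, work
            new_work.append(c)
        work = new_work
    status = "consistent" if not work else "unknown"
    return status, settings, work
```

A small usage example: on the clauses $P_1$, $\neg P_1 \vee P_2$, $\neg P_2 \vee P_3$, the call `clausal_chaining([[1], [-1, 2], [-2, 3]])` sets all three letters true and ends with the empty list.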
Example: [clausal chaining trace; truth settings $P_1 \rightarrow F$, $P_5 \rightarrow F$, $P_2 \rightarrow T$, $P_3 \rightarrow T$, ending with the empty list (satisfiable)]

Clausal chaining is a special instance of resolution, in which one of the clauses is a unit. Resolution is the following rule of logic:

  A ∨ P_j      ¬P_j ∨ B
  ---------------------
          A ∨ B
In the above, $A$ and $B$ denote the remainder of a disjunctive clause. For CC, either $A = \emptyset$ or $B = \emptyset$.

Lemma: All truth settings and inconsistencies which are discovered by CC are also discovered by the linear relaxations (LR) of the standard formulation.

Proof: If both $P_j$ and $\neg P_j$ occur in a clause, the standard formulation has

$\ldots + z(P_j) + \ldots + (1 - z(P_j)) + \ldots \ge 1$

Due to cancellation, this constraint is always satisfied in the LR. Suppose there is a unit clause $P_j$. Then in the standard formulation

$z(P_j) \ge 1$

occurs. So if we have already set $z(P_j) = 0$, the LR is inconsistent. If also $\neg P_j$ is a unit clause, we have

$1 - z(P_j) \ge 1$, i.e. $z(P_j) \le 0$

so again the LR is inconsistent. Otherwise the LR sets $z(P_j) = 1$, so all clauses in which $P_j$ occurs are satisfied in the LR. Any clause in which $\neg P_j$ occurs, e.g. $\neg P_j \vee B$, gives

$(1 - z(P_j)) + z(B) \ge 1$

so it is equivalent in the LR to

$z(B) \ge 1$

This analysis is then applied inductively on the number of steps in CC.

Q.E.D.

Conclusion: [Bla Jer Low 1985] The truth settings and inconsistency diagnoses of clausal chaining are exactly those of the linear relaxation of the standard formulation.
Proof: We need consider only the case in which CC terminates with a nonempty list and no inconsistency diagnosis. In this case, all clauses left have two or more literals. Set $z(P_j) = \frac{1}{2}$ for all unset $P_j$, and the LR is satisfied. Thus there are no more truth settings to find and no inconsistency in the LR.
Q.E.D. While the conclusion above at first seems to state that linear programming and clausal chaining are equivalent, that equivalence is restricted to inconsistency diagnoses and to variables fixed at "true" or "false." By chance, the linear program may find a satisfying truth valuation involving many variables which are not fixed in value (incumbent finding). This "chance" event happens more frequently than one expects, particularly when linear programs are solved repeatedly in branch-and-bound. Resolution with nonunit clauses can go beyond the linear relaxation of the standard formulation. For example:
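$P_2 \vee \neg P_1$ and $P_1 \vee P_2$ resolve to $P_2$.

However, in the linear relaxation the two clauses give $z(P_2) + (1 - z(P_1)) \ge 1$ and $z(P_1) + z(P_2) \ge 1$, whose sum yields only $2 z(P_2) \ge 1$, i.e. $z(P_2) \ge \frac{1}{2}$, i.e. $P_2$ is "half true."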
We now turn to the "Horn clauses", which are a restricted form of disjunctive clauses, specifically, those with zero or one positive literals. For predicate logic, these have played a central role in both expert systems and in the PROLOG language. We present Horn clauses in their forms as implications. An implication is Horn if all its hypotheses are (unnegated) letters and it has zero or one conclusion, which is also an (unnegated) letter:

$H_1, \ldots, H_n \supset H$, i.e. $\neg H_1 \vee \ldots \vee \neg H_n \vee H$

or

$H_1, \ldots, H_n \supset$ , i.e. $\neg H_1 \vee \ldots \vee \neg H_n$
This is a restricted format for representing knowledge. Here is one difference in scope for some important artificial intelligence methods, as contrasted with MIP models. The Horn clause format requires that some definite single conclusion follow from a consistent set of positive facts. In MIP models, for instance, while statistical data may justify a warehouse being located in a certain metropolitan area, there can be alternatives as to its nature and size. Much is gained algorithmically by restricting attention to Horn clauses, as we see in the next results. Note first that a unit resolution done on Horn clauses results in a (shorter) Horn clause.

Conclusion: [Bla Jer Low 1985] A set of Horn clauses is inconsistent iff clausal chaining finds an inconsistency iff the linear relaxation of the standard formulation is empty.

Proof: We need consider only the case that CC terminates with no inconsistency and a nonempty list of Horn clauses of size two or more. Just make all unset letters false and these remaining clauses are satisfied.
Q.E.D.

From the above and Khachian's result [Kha 1979] that linear programming is polynomial time, we see that Horn clause consistency testing is polynomial time. Actually, by the result of Dowling and Gallier, it is linear time. The above results do not generalize to non-Horn clauses. For example, the following list of clauses are Horn except for the first, and the list is inconsistent:

$P_1 \vee P_2$
$\neg P_1 \vee P_2$
$P_1 \vee \neg P_2$
$\neg P_1 \vee \neg P_2$

Clausal chaining takes no action (no unit clauses occur). So it does not detect inconsistency. Resolution does detect inconsistency:

  P_1 ∨ P_2    ¬P_1 ∨ P_2        P_1 ∨ ¬P_2    ¬P_1 ∨ ¬P_2
  ------------------------       --------------------------
            P_2                             ¬P_2
  ----------------------------------------------------------
                              Λ
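The resolution derivation for this four-clause list can be reproduced by saturating the clause set under the resolution rule. The sketch below is a naive illustration (it may generate exponentially many clauses) and is not an algorithm from these lectures; literals are encoded as nonzero integers, and the names `resolvents` and `refutable` are hypothetical.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on a complementary pair."""
    out = set()
    for lit in c1:
        if -lit in c2:
            r = (c1 - {lit}) | (c2 - {-lit})
            # Drop always-true clauses containing both P_j and not-P_j.
            if not any(-x in r for x in r):
                out.add(frozenset(r))
    return out

def refutable(clauses):
    """True iff the empty clause (absurdity) is derivable by resolution."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True       # empty clause: the list is inconsistent
                if r not in known:
                    new.add(r)
        if not new:
            return False              # saturated without deriving absurdity
        known |= new

# P1 v P2, not-P1 v P2, P1 v not-P2, not-P1 v not-P2
four_clauses = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
```

On `four_clauses` the first pass derives the unit clauses $P_2$ and $\neg P_2$ (among others), and the second pass resolves them to the empty clause.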
Similarly, branching on $z(P_1)$ in IP would detect inconsistency for both branches. A Horn+ clause is a Horn clause with a nonempty conclusion $H$. Note that a set of Horn+ clauses is always consistent (make all letters "true").
5.5 Some properties of frequently-used algorithms of expert systems
Two algorithms which are frequently used in expert systems are "forward chaining" and "backward chaining". Sometimes they are used together, and with other algorithms. Here we shall define them, illustrate them, and relate them conceptually to clausal chaining.

Forward Chaining (FC) consists of evaluating all positive unit clauses $P_j$ as "true", all negative unit clauses $\neg P_j$ as "false", and then inductively: If $H_1, \ldots, H_n$ have been set "true" and if

$H_1, \ldots, H_n \supset H$ is among the rules, then set $H$ true;
$H_1, \ldots, H_n \supset$ is among the rules, then declare inconsistency and stop.

This process of using a rule is called "firing" the rule, and the overall process of forward chaining is sometimes called "the chase." Forward chaining is implied by clausal chaining, as we illustrate: the unit clauses $H_1, \ldots, H_n$ are resolved one at a time against the rule, so that

$\neg H_1 \vee \neg H_2 \vee \ldots \vee \neg H_n \vee H$

becomes, after using $H_1$,

$\neg H_2 \vee \ldots \vee \neg H_n \vee H$

and so on until $\neg H_n \vee H$ and the unit clause $H_n$ yield $H$.
Remark: For Horn+ clauses, the truth settings achieved by FC and CC are the same.

Proof: For Horn+ clauses, initially all unit clauses $P_j$ are set "true." Inductively, the only truth settings possible are "true." All rules fired by FC are true for the valuation found by FC. Consider a rule not fired by FC:

$H_1, \ldots, H_n \supset H$

At least one of the $H_i$ has not been given a truth value. (Possibly $H$ has a truth value "true".) All unvalued letters can be set "true" or all can be set "false," and this will make all rules "true." Thus there are no more settings to find, so CC also has no more settings.

Q.E.D.

Such a result does not hold for Horn clauses, e.g.:

$\neg H$, i.e. $H \supset$
$\neg G \vee H$, i.e. $G \supset H$

FC puts no value on $G$, but CC sets $G$ "false." Thus, CC is more powerful than FC. However the following holds:
Remark: For Horn clauses, the consistency diagnoses of FC and CC agree.

Proof: Without loss of generality, FC finds no inconsistency. Consider an unfired rule:

$H_1, \ldots, H_n \supset H$ or $H_1, \ldots, H_n \supset$

At least one $H_i$ has not been set "true." (It may have been set "false".) If these have not already been set "false", they can be set "false" and the rule is "true." Hence the list of Horn clauses is consistent.
Q.E.D.
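Forward chaining as defined in this section can be sketched as follows. This is an illustrative version under assumptions of ours: a rule is a pair `(hypotheses, conclusion)` with `conclusion = None` for a rule with empty conclusion, letters are strings, and the name `forward_chaining` is hypothetical. We also assume that setting a letter "true" after it was evaluated "false" counts as an inconsistency, which the text does not spell out.

```python
def forward_chaining(facts_true, facts_false, rules):
    """Fire Horn rules until nothing changes.
    facts_true / facts_false: letters given by positive / negative unit clauses.
    rules: list of (hypotheses, conclusion); conclusion None means 'H1,...,Hn implies nothing'.
    Returns (status, true_letters, false_letters)."""
    true = set(facts_true)
    false = set(facts_false)
    if true & false:
        return "inconsistent", true, false
    changed = True
    while changed:
        changed = False
        for hyps, concl in rules:
            if all(h in true for h in hyps):          # the rule can fire
                # Assumption: a conclusion already set "false" is a conflict.
                if concl is None or concl in false:
                    return "inconsistent", true, false
                if concl not in true:
                    true.add(concl)
                    changed = True
    return "no inconsistency found", true, false

# The example from the text: not-H, and G implies H.
status, true_letters, false_letters = forward_chaining([], ["H"], [(("G",), "H")])
```

On the example from the text, FC reports no inconsistency and leaves $G$ without a value, whereas clausal chaining would set $G$ "false."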
On non-Horn clauses, FC is much weaker than CC, e.g.:

CC will find this list of clauses inconsistent, but FC has no provision to handle such non-Horn clauses (recall that only Horn clauses are permitted in expert systems). We now turn to a discussion of backward chaining, an algorithm based on the humanlike procedure of reducing a task to its subtasks and trying to accomplish these subtasks, either directly or by a further reduction. As there may be several ways of accomplishing the task, alternative lists of subtasks can arise which lead to a complex structure to represent the overall reduction process. These structures are the and/or trees, and can require significant computation. We describe backward chaining largely by illustration. Backward chaining (BC) works as follows. Given a "goal" or "subgoal" of finding $H$ true, when $H$ is not known to be true, all rules concluding $H$ are retrieved from the rule base. E.g.:
H1, H7, H9 ⊃ H    (rule 17)
H3, H5 ⊃ H    (rule 23)
H2, H7, H12 ⊃ H    (rule 108)
At least one set of premisses must be found true, which sets up subgoals in an and/or tree (see Figure 18). The double arcs in Figure 18 indicate "and." The process is then iterated with each subgoal viewed as a goal. The number of subgoal nodes of the tree cannot exceed the total number of conditions in all rules (see the two occurrences of H7 above). Loops can occur, e.g. as caused by a rule
H ⊃ H12
which can be ignored during BC.
Backward chaining has a serious deficiency: it can fail to diagnose an inconsistency on Horn clauses, e.g.:
Goal: G
Figure 18: An And/or Tree
Rules concluding goal: several (lots of computation!), which eventually are found to imply G.
Some other rules: G ⊃ H, H ⊃
These are not activated by BC, but ¬G is implied. To avoid running into such difficulties, BC should be restricted in use to Horn+ clauses.
Given the problematic nature of both FC and BC, as well as the significant computation time of BC, I do not see reasons to use these algorithms when an excellent linear-time algorithm for Horn clauses can be adapted to linear-time unit resolution. These linear-time algorithms also come in "forward" and "backward" versions, but not problematic ones. Beyond being linear time, they are exceptionally efficient.
For those who are concerned with the study of cognitive processing, a heuristic method used by human problem solvers is of interest in its own right. However, toward the goal of decision support, the humanlike nature of a heuristic is not a sufficient recommendation. Humans are not always optimally efficient or even free of error. Nevertheless, consideration of humanlike heuristics can be a useful starting point for an exact analysis and for experimental evaluation. In point of fact, the linear-time algorithms of Dowling and Gallier can be viewed as an approach to making forward and backward chaining both efficient and complete for Horn clauses. Humanlike heuristics ought therefore to be a subject for further analysis.
When the "rule base" (i.e. set of rules) of an expert system naturally partitions into parts, with different parts solely relevant to different queries, then there is a basis for a potentially less-than-linear-time procedure. In general, one cannot hope for less time in execution than simply that needed to read all the rules. Without very special structural features, one cannot hope for a heuristic to quickly find just those few rules which pertain to answering a query; typically, there are more than a few.
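The reduction of a goal to subgoals in backward chaining can be sketched as a recursive search of the and/or tree. This is only an illustration under stated assumptions: the rule base and fact names below are hypothetical, and a rule that loops back to a goal already under reduction is simply ignored, as suggested in the text.

```python
def backward_chain(goal, rules, facts, path=frozenset()):
    """Return True if `goal` can be established by backward chaining.

    rules maps a conclusion to a list of premise lists: the outer list is the
    "or" over rules concluding the goal, each inner list is an "and" node.
    `path` holds the goals currently being reduced, so loops are ignored.
    """
    if goal in facts:
        return True
    if goal in path:
        return False  # loop: ignore this branch
    path = path | {goal}
    for premises in rules.get(goal, []):
        if all(backward_chain(p, rules, facts, path) for p in premises):
            return True
    return False


# Hypothetical rule base: two rules conclude H, and "H implies A" creates a loop.
rules = {
    "H": [["A", "B"], ["C", "D"]],
    "A": [["H"]],
    "C": [["E"]],
}
facts = {"D", "E"}
print(backward_chain("H", rules, facts))
```

Like BC as described, this sketch only ever retrieves rules that conclude the current goal, so rules such as G ⊃ H and H ⊃ are never consulted when the goal is G; that is the deficiency noted above.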
5.6 The Davis-Putnam Algorithm in Two Forms
To complete our presentation of DPL, we need to give its two remaining subroutines: monotone variable fixing (MVF) and splitting.
MVF is used after CC stops without having determined the consistency of the list given. Any Pj appearing only as Pj is set true. Any Pj appearing only as ¬Pj is set false. MVF cannot change the consistency of the list. It is not done by math programming algorithms, although it can be validly added as a subroutine for consistency testing. MVF is not valid in general if there is a nonzero objective function.
Splitting can be done by either of two subroutines, resolution (the original method) and branching (the second method). Splitting is done after MVF, when the consistency of the list has still not been determined. Any remaining Pj must then occur as both Pj and ¬Pj. The situation is:
Pj ∨ R1, ..., Pj ∨ Ra    (clauses containing Pj)
¬Pj ∨ S1, ..., ¬Pj ∨ Sb    (clauses containing ¬Pj)
T1, ..., Tc    (clauses with no occurrences of Pj)
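Splitting via resolution
Replace the given list by all the resolvents Ri ∨ Sj, together with the clauses Tk which do not contain Pj. This list of clauses is actually equivalent (for satisfiability) to the given list. To see this, the reader should first prove the following lemma.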
Lemma: Any truth valuation making all Ri ∨ Sj true either makes all Ri true or makes all Sj true.
Splitting via branching
Create two subproblems:
"Pj false": R1, ..., Ra, T1, ..., Tc
"Pj true": S1, ..., Sb, T1, ..., Tc
The original problem is consistent iff at least one of the two subproblems is consistent.
In Lecture 9, we shall prove a general result which implies that branching is superior to resolution, in terms of the linear relaxation and hence also in terms of unit resolution (it is usually superior in terms of total size as well).
We summarize the relations between DPL and branch-and-bound (for the standard formulation) with these intuitive equations:
DPL = CC + MVF + Branching
Since
CC = LP − Incumbent Finding
we have:
DPL = BB (standard formulation of clauses) + MVF − Incumbent Finding
As noted earlier, MVF can be added to BB. We return to DPL in Lecture 9, where we use it to assist in MIP, just as here we have used MIP to assist in satisfiability testing.
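The equation DPL = CC + MVF + Branching can be read as a recursive procedure. The sketch below is a minimal illustration under assumptions of our own: literals are encoded as nonzero integers (j for Pj, -j for ¬Pj), the splitting variable is chosen arbitrarily, and the function names are hypothetical. It is not the enhanced linear-time implementation reported in Section 5.7.

```python
def simplify(clauses, lit):
    """Set literal `lit` true: drop satisfied clauses, shorten the others."""
    out = []
    for c in clauses:
        if lit in c:
            continue
        reduced = c - {-lit}
        if not reduced:
            return None  # empty clause: this branch is inconsistent
        out.append(reduced)
    return out


def unit_propagate(clauses):
    """CC step: apply unit resolution until no unit clause remains."""
    assignment = {}
    clauses = [frozenset(c) for c in clauses]
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assignment
        (lit,) = unit
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
        if clauses is None:
            return None, assignment


def monotone_fix(clauses):
    """MVF step: fix every variable that occurs with only one sign."""
    assignment = {}
    while True:
        literals = {l for c in clauses for l in c}
        pure = [l for l in literals if -l not in literals]
        if not pure:
            return clauses, assignment
        for l in pure:
            assignment[abs(l)] = l > 0
            clauses = [c for c in clauses if l not in c]


def dpl(clauses):
    """Return a satisfying assignment (dict: variable -> bool) or None.
    Variables left out of the dict may take either value."""
    clauses, a1 = unit_propagate(clauses)
    if clauses is None:
        return None
    clauses, a2 = monotone_fix(clauses)
    assignment = {**a1, **a2}
    if not clauses:
        return assignment
    var = abs(next(iter(clauses[0])))   # splitting variable Pj
    for lit in (var, -var):             # subproblems "Pj true" and "Pj false"
        branch = simplify(clauses, lit)
        if branch is not None:
            result = dpl(branch)
            if result is not None:
                return {**assignment, var: lit > 0, **result}
    return None


unsat = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
sat = [[1, 2], [-1, 3], [-2, -3]]
print(dpl(unsat))
print(dpl(sat))
```

Splitting is done here by branching rather than by resolution, following the remark that branching is usually the better of the two.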
5.7 Some recent developments (December 1987)
When we performed the earlier testing reported in Section 5.4, we made a few informal attempts to contact other researchers who had experience with satisfiability algorithms. We were told that most algorithms would simply not be expected to work (in any practical amount of time) on problems of the size routinely solved by APEX.
On the other hand, our belief was that the main contribution of linear programming, for just satisfiability problems in the standard formulation, was its incumbent-finding feature. If an effective list-processing procedure could be devised to perform incumbent finding, we felt that the comparison could favor DPL. After all, it is very inefficient to carry around bases when all one is doing is list processing.
In joint research with Jinchang Wang, we have confirmed our earlier conjecture. By enhancing DPL with a linear-time version of CC and with incumbent finding, we have experienced run times roughly ten times faster than APEX on "easier" problems, and over a hundred times faster on "harder" problems. We report our methods and results in our paper, "Solving Propositional Satisfiability Problems."
However, we do not feel that these results are the final story for a comparison of logic-based versus discrete programming-based approaches. There remain issues of alternate embeddings of logic, and enhancements of logic to include other commonly occurring constraints. We will discuss these issues in Lecture 9. We feel it is not a question of one approach versus the other, but of utilizing ideas from logic and discrete programming together.
In another recent development, Mr. Wang and I have discovered an interesting property of essentially the standard imbedding of logic. By working with the negation variables z(¬Pj) = 1 − z(Pj), and nonnegative criterion functions Σ_{j=1}^{n} bj z(¬Pj) (with b = (b1, ..., bn) ≥ 0), one obtains a dual whose solution can be interpreted as describing the structure of Horn clause proofs. The precise interpretation varies with the vector b ≥ 0 chosen. For b the j-th unit vector ej, dual optimal solutions can be interpreted as proofs of Pj, when Pj is provable.
Dual optimal solutions give near proofs of Pj when Pj is not provable, where a near proof is a proof structure which would be entirely valid if only exactly one unproved hypothesis were given as a "fact" (trivially, Pj is always a near proof of itself, but the interest lies in other alternate near proofs). If several propositions are viewed as "targets" of proof, it would be appropriate to use as b ≥ 0 a vector with ones in the coordinates of these targets, and zeros otherwise.
An interesting aspect of this approach to Horn clause logic via linear programming is the new features it allows. Changes of the given facts of a situation correspond to changes only in the objective function of the dual program, and the same is true of changes in the rules of reasoning. Thus linear programming postoptimality can be used and a problem can be restarted from the previously optimal basis, rather than having to rerun it from scratch (or, even worse, recode the algorithm to add or delete rules of reasoning).
The results are reported in our joint paper, "Dynamic Programming, Integral Polyhedra, and Horn Clause Knowledge Bases," where connections to classical topics in Operations Research are also made. The principles we develop and apply for Horn clause logic are also applicable for obtaining compact and sharp MIP formulations of the problems described in Lecture 4, with regard to variable redefinition, as well as for formulations of more difficult problems.
5.8 Exercises
All remaining problems 12-15 can now be worked.
LECTURE 6
A PRIMER ON PREDICATE LOGIC
Summary: We introduce and discuss the logic of predicates from an intuitive point of view, with either [Men 1964] or [Shoen 1967] as references which go into detail. We seek here to tie it into its potential uses in problem solving, and to indicate some of the potential obstacles from a theoretical perspective.
6.1 Introduction
The predicate logic concerns models, i.e. relations (predicates) on domains of individual objects. While the propositional logic treats isolated assertions like "John is tall," the predicate logic treats assertions such as "John is the father of Susan," where the latter is viewed as
Father(John, Susan)
i.e. as an instance of the relation Father(x, y) of fatherhood on the domain of all persons. The specific assertion Father(John, Susan) is called a complete instantiation (instance) of Father(x, y), obtained by instantiating "John" for the variable x and "Susan" for the variable y.
The predicate logic allows us to concisely state general principles, such as these two:
Father(x, y) ⊃ Anc(x, y)
Mother(x, y) ⊃ Anc(x, y)
These Horn clauses state that both fathers and mothers are ancestors. Together with the Horn clause
Anc(x, z) ∧ Anc(z, y) ⊃ Anc(x, y)
stating that an ancestor of an ancestor is an ancestor, we have completely captured the ancestor relation on the domain of all persons.
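To see in what sense the three Horn clauses capture the ancestor relation, one can compute the smallest relation closed under them. The following is a small sketch; the persons and the Father and Mother facts are hypothetical and serve only as illustration.

```python
def ancestors(father, mother):
    """Smallest relation Anc closed under the three Horn clauses."""
    # Father(x, y) implies Anc(x, y); Mother(x, y) implies Anc(x, y)
    anc = set(father) | set(mother)
    while True:
        # Anc(x, z) and Anc(z, y) imply Anc(x, y)
        new = {(x, y) for (x, z) in anc for (z2, y) in anc if z == z2} - anc
        if not new:
            return anc
        anc |= new


# Hypothetical facts, for illustration only.
father = {("John", "Susan")}
mother = {("Susan", "Ann"), ("Ann", "Tom")}
anc = ancestors(father, mother)
print(sorted(anc))
```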
We can replace the predicate logic knowledge representation above by complete instantiation as (x, y, z) runs over all triples of persons. However, this makes three Horn clauses become billions of Horn clauses. It is a reduction of predicate to propositional logic, but its value is limited.
The use of predicate logic as a language thus significantly increases our ability to express knowledge and general principles. However, by a result of Plaisted cited in [Den Lew 1983], this expressibility is purchased at the cost of a significant jump in computational complexity. Specifically, predicate logic just for Horn clauses is complete for exponential time. Since Horn clause propositional logic is linear time, the great increase in time is entirely accounted for by the generally exponential number of complete instantiations.
Exponential-time complexity is a new barrier, substantially more serious than NP-completeness. Even if P ≠ NP, NP may be "only slightly nonpolynomial." In contrast, "exponential time" cannot be reduced. While the worst-case complexity need not dominate Horn clause predicate logic, it is a "flag" not to be ignored, and focuses attention on the use of special features of a predicate logic rule base. It marks the deduction problem as probably harder than MIP.
Indeed, in the successful expert system applications for which large segments of the rule base have been published, the bulk of such rule bases involve a good deal of instantiation. These rule bases are close to being Horn-clause propositional logic in which the use of predicates serves to structure the database.
The predicate logic is a very useful language for expressing constraints between relations over a set of objects. As a mathematical subject, it comes together with a number of different techniques for carrying out deductions from these constraints and for answering queries regarding objects in the set.
Predicate logic is used as the primary or sole approach in automated theorem proving and in the "logic based" approaches to artificial intelligence. It also finds application as a language within many of the other AI approaches, and therefore is an essential subject to those interested in machine intelligence. As a language, it "meshes" well with modern database methods, particularly relational databases [Ull 1982], [Codd 1972]. However, for use in query processing there is need for further research on streamlining means of answering commonly occurring queries on large databases.
Predicate logic was developed in the period 1910-1940 as a language formalization of logical deduction. Löwenheim, Skolem, Bernays, Hilbert, Gödel, Herbrand and Gentzen all made significant contributions. The subarea of mathematical logic concerned with deduction in formal systems is called proof theory.
6.2 Predicate logic: basic concepts, notation
In the predicate logic, for each integer n ≥ 0, there is an infinite supply of "predicate symbols" which intuitively represent n-ary relations on a set:
P_1^n, P_2^n, P_3^n, ...
For example, with n = 2, P_1^2(x, y) may represent the "parent" relation on the set of persons ("x is a parent of y"), and P_1^1(x) may represent the fact that "x is a parent" in a given model.
Constants represent individual objects in the set:
c1, c2, c3, ...
In addition to the propositional connectives for constructing more complex relations, there are quantifiers:
∀ - universal quantifier, read "for every"
∃ - existential quantifier, read "there exists"
Example:
(∀x1)((∃x2) P_1^2(c1, x2) ⊃ P_1^1(x1))
is read: "For every individual x1 in the set, if there is an individual x2 for which (c1, x2) is in the relation denoted by P_1^2, then x1 is in the set denoted by P_1^1." However, this rigorously correct reading is too complex! Instead one would read: "For every x1, if P_1^2(c1, x2) for some x2, then also P_1^1(x1) is true."
In the model where P_1^2(x, y) is the parent relation and P_1^1(x) asserts that "x has black hair," this formula of the predicate logic asserts: "If c1 has any children, then everyone has black hair." It can be true in the usual model of all persons only if c1 is childless; otherwise, it is certainly false.
A sentence like the one above is either true or false in a given model. A valid sentence is one that is true in all models. It is the predicate logic analogue of a tautology. A satisfiable sentence is one that is true in some model.
Above we have described the pure (first order) predicate logic with individual constants. In the applied (first order) predicate logic there are also function symbols; in the second order logic we have quantifiers ∃T^n and ∀T^n, with T^n ranging over n-ary relations on S. Unless otherwise stated, our discussion concerns only pure (first order) logic with individual constants.
As regards validity, the sentence above is not valid. Put S = {c1}, P_1^2 = S × S, P_1^1 = ∅ and it is false. It is satisfiable. Put S = {c1}, P_1^2 = ∅, P_1^1 = ∅.
The variable occurrence of x1 in P_1^1(x1) is called bound by the quantifier occurrence (∀x1) which has x1 in its scope, i.e. in
(*) (∃x2) P_1^2(c1, x2) ⊃ P_1^1(x1)
Similarly, the occurrence of x2 in P_1^2(c1, x2) is bound by the ∃x2. The opposite of bound is free. Thus x1 is free in (*). Some occurrences of a variable can be bound, others free. For example in
(∃x1) P_1^2(c1, x1) ∧ P_1^1(x1)
the first occurrence of x1, in P_1^2, is bound, while the occurrence in P_1^1 is free. Also in
(∃x1)(P_1^2(c1, x1) ∧ P_1^1(x1))
both occurrences of x1 are bound. A sentence is a formula with no free variables.
Parameters a1, a2, ... can be inserted for occurrences of variables. Since we never quantify over parameters, they always occur free. We might view parameters as generic individuals.
We shall need the following symbolism:
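F^x_u = result of substituting u for x in all its free occurrences in F
To obtain a system for predicate logic, we add these new rules to the earlier ones for propositional logic (see [Pra 1965]):
(∀I): from F infer (∀x)F^a_x
(∀E): from (∀x)F infer F^x_t, t a constant or a parameter
(∃I): from F^x_t infer (∃x)F, t a constant or a parameter
(∃E): from (∃x)F and a derivation of G from the assumption F^x_a, infer G (discharging F^x_a)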
In the predicate logic, there are special restrictions on the use of these rules:
(∀I): 'a' does not occur in any assumption (i.e. undischarged top formula) on which F depends
(∃E): 'a' does not occur in (∃x)F or in G, and 'a' does not occur in any assumption on which (the top occurrence of) G depends, save only for occurrences of F^x_a
The reason for the restriction on (∀I) is to avoid erroneous "proofs," such as:
P_1^1(a1)
(∀x1) P_1^1(x1)    (erroneous use of (∀I))
P_1^1(a1) ⊃ (∀x1) P_1^1(x1)    (⊃I)
(∀x2)(P_1^1(x2) ⊃ (∀x1) P_1^1(x1))    (∀I)
The sole top formula has been discharged by a (⊃I), but the bottom formula is not valid. The bottom formula asserts that, if there is any individual x2 such that P_1^1(x2) holds, then P_1^1(x1) holds for all individuals x1. In some models, this is not true.
The restriction on (∀I) ensures that no special assumptions have been made regarding 'a', so that a general claim (∀x)F^a_x can follow from F. The restriction on (∃E) is similarly motivated.
For this logic, Prawitz has established a completeness theorem (by combining his results with those of [Gödel 1930] and [Gen 1969]).
Completeness theorem (following Gödel, Gentzen, Prawitz)
The natural deduction system above proves a sentence F if and only if it is valid.
A similar result holds for applied predicate logic.
This completeness theorem is remarkable, since it states that the sentences true in every model - finite or infinite - are exactly those proven in the natural deduction system described. In contrast to propositional logic, however, where one can test for tautologies, there is no algorithm whatever for testing validity. If a sentence is valid, a proof eventually will show up. If not, one will never show up. However, we cannot tell when to "stop looking. "
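Although validity over all models cannot be tested, truth in one given finite model is a finite computation. As a small sketch, the sentence read earlier as "If c1 has any children, then everyone has black hair" can be evaluated in the two one-element models used above to show that it is satisfiable but not valid. The function and argument names are hypothetical.

```python
def sentence_holds(domain, c1, parent, black_hair):
    """Evaluate: for every x1, if (c1, x2) is in `parent` for some x2,
    then x1 is in `black_hair`.

    domain: the set S; parent: set of pairs interpreting the binary predicate;
    black_hair: subset of S interpreting the unary predicate;
    c1: the element of S named by the constant c1.
    """
    has_child = any((c1, x2) in parent for x2 in domain)
    return all((not has_child) or (x1 in black_hair) for x1 in domain)


S = {"c1"}
full = {(a, b) for a in S for b in S}  # the binary relation S x S

print(sentence_holds(S, "c1", full, set()))   # model in which the sentence is false
print(sentence_holds(S, "c1", set(), set()))  # model in which the sentence is true
```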
We use a simple example from [Pra 1965] to illustrate a proof in predicate logic.
Example 6.2.1:
Some other surprising properties of the predicate logic are cited in the next results and example. A set Σ is consistent if absurdity Λ cannot be deduced from Σ in predicate logic.
Theorem: (Henkin) Any consistent set of formulas has a model.
Corollary: If Σ is a set of formulas, such that every finite subset has a model, then Σ has a model.
Proof: Σ is consistent, since a derivation of Λ would necessarily be due to a finite subset of Σ.
Q.E.D.
Example 6.2.2: Σ = axioms for equality + all true sentences of arithmetic on nonnegative integers + ¬(c = 0), ¬(c = 1), ¬(c = 2), etc.
Every finite subset of Σ has a model - the usual one, with 'c' larger than any integer named in the subset of formulas. By the corollary, Σ has a model, but clearly it is not the "standard" one. Skolem was probably the first to understand how to create nonstandard models (using different techniques than those here).
We next turn to the Prenex Laws, used for moving the quantifiers within formulas. These will be very useful in Lectures 7 and 8, as well as immediately below. The Prenex Laws are easily verified as valid, and they are:
Prenex Laws
(∃x)P ⊃ Q ≡ (∀x)(P ⊃ Q)
(∀x)P ⊃ Q ≡ (∃x)(P ⊃ Q)    x not free in Q
P ⊃ (∃x)Q ≡ (∃x)(P ⊃ Q)
P ⊃ (∀x)Q ≡ (∀x)(P ⊃ Q)    x not free in P
(∃x)P ∨ Q ≡ (∃x)(P ∨ Q)
(∀x)P ∨ Q ≡ (∀x)(P ∨ Q)    x not free in Q
¬(∃x)P ≡ (∀x)¬P
¬(∀x)P ≡ (∃x)¬P
Using these laws, quantifiers can always be "moved to the front" to obtain an equivalent formula in Prenex Normal Form, with a propositional matrix.
We next turn to Skolem's reduction, which from our perspective gives an efficient method which takes a predicate logic sentence into another sentence, with these properties:
1. The first sentence is satisfiable exactly if the second is;
2. The second sentence is in Prenex Normal Form, with all occurrences of any ∀ preceding all occurrences of ∃.
This result is stated in synopsis below, and we only illustrate it. However, our method in the illustration is general.
Theorem: The satisfiability problem for sentences of pure predicate logic is equivalent to the satisfiability of just the ∀∃ fragment.
Example 6.2.3
Without loss of generality, we can assume a Prenex Form with a propositional matrix:
(∀x1)(∃x2)(∀x3)(∃x4) M(x1, x2, x3, x4)
This formula is satisfiable exactly if there is a model and a function F (called a "Skolem function") on it with
(∀x1)(∀x3)(∃x4) M(x1, F(x1), x3, x4)
i.e. exactly if there is a model and functions F, G on it with
(∀x1)(∀x3) M(x1, F(x1), x3, G(x1, x3))
Let us use F' respectively G' for the graphs of F resp. G. Then the original formula is satisfiable exactly if there is a model for
(∀x1)(∃x2) F'(x1, x2) ∧ (∀x1)(∀x3)(∃x4) G'(x1, x3, x4) ∧ (∀x1)(∀x2)(∀x3)(∀x4)(F'(x1, x2) ∧ G'(x1, x3, x4) ⊃ M(x1, x2, x3, x4))
i.e. model for
(∀x1)(∀x3)(∀y1)(∀y2)(∀y3)(∀y4)(∃x2)(∃x4)[F'(x1, x2) ∧ G'(x1, x3, x4) ∧ (F'(y1, y2) ∧ G'(y1, y3, y4) ⊃ M(y1, y2, y3, y4))]
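The first step of the illustration, replacing each existentially quantified variable by a Skolem function of the universally quantified variables that precede it, is mechanical. A minimal sketch follows; the prefix encoding ('A' for ∀, 'E' for ∃) and the generated function names are assumptions made for illustration.

```python
def skolemize_prefix(prefix):
    """Replace each existential variable by a Skolem term over the universal
    variables that precede it in the prefix.

    prefix: list of (quantifier, variable), quantifier 'A' (for all) or 'E' (exists).
    Returns (universals, substitution), where substitution maps an existential
    variable to (function name, argument variables).
    """
    universals, substitution = [], {}
    names = iter("FGHIJKLMN")  # hypothetical Skolem function names
    for quantifier, var in prefix:
        if quantifier == "A":
            universals.append(var)
        else:
            substitution[var] = (next(names), tuple(universals))
    return universals, substitution


prefix = [("A", "x1"), ("E", "x2"), ("A", "x3"), ("E", "x4")]
universals, substitution = skolemize_prefix(prefix)
print(universals)
print(substitution)
```

For the prefix of Example 6.2.3 this yields x2 = F(x1) and x4 = G(x1, x3), matching the functions F and G of the text.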
6.3 Applications for problem-solving
Kowalski [Kow 1979] has pioneered the application of predicate logic in problem-solving. In this section, we use illustrations drawn from his book.
The methods for predicate logic problem solution are drawn from the resolution-based approaches to theorem proving, as developed initially by Robinson [Rob 1965, 1968], which have led to a large literature (see e.g. [Ble Lov 1983], [Lov 1978]). There are approaches to theorem proving which do not use resolution (see e.g. [Ble 1974], [Nev 1974]). The resolution-based approach for Horn clause predicate logic is embodied in the PROLOG interpreter first developed by Roussel [Rou 1975]. The resolution-based approach requires CNF ("clausal form"), although some other approaches do not.
Example 6.3.1: We are given
Mother(x, y) ⊃ Parent(x, y), i.e. ¬Mother(x, y) ∨ Parent(x, y)
Father(x, y) ⊃ Parent(x, y), i.e. etc.
Parent(x, z) ∧ Parent(z, y) ⊃ Grandparent(x, y), i.e. etc.
plus perhaps other rules. These rules are stated with free variables as they involve general principles. In addition, we are given a database which contains:
Father(Zeus, Aries)
Father(Aries, Harmonia)
etc., plus a wealth of other data.
Question: Is Zeus a grandparent of Harmonia?
Solution method: Add ¬Grandparent(Zeus, Harmonia) and obtain a contradiction (if the added negation is false) or show there is no contradiction.
If we were to proceed by complete instantiation, we would substitute constants for free variables in every possible way. If there is a contradiction, this will uncover it, and it reduces the problem to propositional logic. Upon "instantiating" in every possible way, we obtain:
Father(Zeus, Aries) ⊃ Parent(Zeus, Aries)
Father(Aries, Harmonia) ⊃ Parent(Aries, Harmonia)
Parent(Zeus, Aries) ∧ Parent(Aries, Harmonia) ⊃ Grandparent(Zeus, Harmonia)
Clearly, with these instances ("instantiations") of the rules, we can obtain Grandparent(Zeus, Harmonia). In detail, the resolutions are:
Father(Zeus, Aries)
¬Father(Zeus, Aries) ∨ Parent(Zeus, Aries)
-----
Parent(Zeus, Aries)

Father(Aries, Harmonia)
¬Father(Aries, Harmonia) ∨ Parent(Aries, Harmonia)
-----
Parent(Aries, Harmonia)

Parent(Zeus, Aries)
Parent(Aries, Harmonia)
¬Parent(Zeus, Aries) ∨ ¬Parent(Aries, Harmonia) ∨ Grandparent(Zeus, Harmonia)
-----
Grandparent(Zeus, Harmonia)

The last line contradicts ¬Grandparent(Zeus, Harmonia) and we are done. However, complete instantiation also produces many useless instantiations, e.g.
Mother(Zeus, Aries) ⊃ Parent(Zeus, Aries)
Father(Harmonia, Zeus) ⊃ Parent(Harmonia, Zeus)
etc.
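The growth caused by complete instantiation is easy to quantify on this example. The sketch below grounds the three rules over only the three constants Zeus, Aries and Harmonia; the tuple encoding of atoms and rules is an assumption made for illustration.

```python
from itertools import product

constants = ["Zeus", "Aries", "Harmonia"]
# Each rule: (variables, premises, conclusion); atoms are (predicate, arg, arg).
rules = [
    (("x", "y"), [("Mother", "x", "y")], ("Parent", "x", "y")),
    (("x", "y"), [("Father", "x", "y")], ("Parent", "x", "y")),
    (("x", "y", "z"),
     [("Parent", "x", "z"), ("Parent", "z", "y")],
     ("Grandparent", "x", "y")),
]


def ground(atom, binding):
    """Substitute the bound constants for the variables of an atom."""
    pred, *args = atom
    return (pred, *[binding[a] for a in args])


def complete_instantiation(rules, constants):
    """Substitute constants for variables in every possible way."""
    instances = []
    for variables, premises, conclusion in rules:
        for values in product(constants, repeat=len(variables)):
            binding = dict(zip(variables, values))
            instances.append(
                ([ground(p, binding) for p in premises], ground(conclusion, binding))
            )
    return instances


instances = complete_instantiation(rules, constants)
print(len(instances))  # 9 + 9 + 27 ground rules for three constants
```

Only three of these 45 ground rules are used in the derivation above; with N constants the count is 2N² + N³.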
In even a moderate size database, vast numbers of useless instantiations are created by this process. Furthermore, to complicate our task, it is not always clear which are useless.
PROLOG does not proceed by complete instantiation. Instead, it seeks to do resolution in the most general setting possible. It fixes a value of a variable (binding the variable) only where that is needed to let resolution proceed. This technique of "late binding" tries to keep variables free as long as possible. For example, we have in the database
Father(Zeus, Aries)
Father(Aries, Harmonia)
and the rule
Father(x, y) ⊃ Parent(x, y)
No resolution is possible at this point. However, the bindings (x, y) = (Zeus, Aries) and (x, y) = (Aries, Harmonia) allow resolutions and we have
Parent(Zeus, Aries)
Parent(Aries, Harmonia)
We also have in our database
Parent(x, z) ∧ Parent(z, y) ⊃ Grandparent(x, y)
Under the binding (x, z) = (Zeus, Aries), by resolution we have:
Parent(Aries, y) ⊃ Grandparent(Zeus, y)
With the binding y = Harmonia, resolution obtains:
Grandparent(Zeus, Harmonia)
Here we used forward chaining, also called "bottom up" inference. This result can also be obtained by backward chaining, called "top down" inference. In general, any complete implementation of clausal chaining - via all bindings which would continue CC - will be adequate in the Horn clause setting.
While "late binding" techniques are better than complete instantiation, they also face combinatorial growth: Out of all possible bindings that allow resolution to continue, which should be done now? This is the issue of a "control strategy", and more sophisticated versions of PROLOG permit complex heuristics for choosing the "best" resolution and binding.
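The bottom-up derivation just described, in which a variable is bound only when a premise is matched against a known fact, can be sketched as follows. This is an illustration under assumptions, not a description of how PROLOG itself works (PROLOG proceeds top-down): lowercase argument names stand for variables, and matching against ground facts stands in for full unification.

```python
def match(pattern, fact, binding):
    """Extend `binding` so that pattern equals fact, or return None.
    Lowercase argument names are variables; anything else is a constant."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.islower():  # variable: bind it only when needed
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding


def bindings_for(premises, facts, binding=None):
    """Yield every binding under which all premises match known facts."""
    binding = binding or {}
    if not premises:
        yield binding
        return
    for fact in facts:
        extended = match(premises[0], fact, binding)
        if extended is not None:
            yield from bindings_for(premises[1:], facts, extended)


def forward_chain(rules, facts):
    """Apply the rules bottom-up until no new fact is derived."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for b in bindings_for(premises, facts):
                derived = (conclusion[0], *[b.get(a, a) for a in conclusion[1:]])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new


rules = [
    ([("Father", "x", "y")], ("Parent", "x", "y")),
    ([("Mother", "x", "y")], ("Parent", "x", "y")),
    ([("Parent", "x", "z"), ("Parent", "z", "y")], ("Grandparent", "x", "y")),
]
database = {("Father", "Zeus", "Aries"), ("Father", "Aries", "Harmonia")}
result = forward_chain(rules, database)
print(("Grandparent", "Zeus", "Harmonia") in result)
```

No useless instance such as Mother(Zeus, Aries) ⊃ Parent(Zeus, Aries) is ever formed, since a rule is instantiated only when its premises match facts already present.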
For non-Horn clauses, the "partial instantiation" of late binding plus resolution needs to be supplemented by factoring, which unifies variables within a clause (see [Lov 1978]).
Example 6.3.2
Here we give a database application. Suppose that a certain relational database has three relations, with fields as indicated.
SR(Supplier No (key), Name, Status, City)
Part(Part No (key), Name, Color, Weight)
SY(Supplier No, Part No, Quantity)
and a relation "generated as needed"
Lt(x, y)    "x is less than y"
(Computable functions can also be entered in this "as needed" manner as a graph, thus avoiding function symbols.)
We show how possible queries (questions regarding the database) can be formulated in predicate logic, so that theorem-proving devices can be used to retrieve answers to the queries. In the queries below, we restrict ourselves to queries answered by lists. The predicate formula in the right-hand column is to be refuted, e.g. for the first query one is to obtain a contradiction from the database plus the clause ¬Part(x, bolt, y, z) ∨ ¬SY(u, x, w) ∨ ¬SR(u, a, v, t). A listing of all the values of the parameter 'a' is desired.
Possible queries, each with its translation for logical deduction:
Query: What are the names of the suppliers of bolts?
Translation: Part(x, bolt, y, z) ∧ SY(u, x, w) ∧ SR(u, a, v, t) ⊃
Query: What are the names of the parts supplied by Apex?
Translation: SR(x, Apex, y, z) ∧ SY(x, u, v) ∧ Part(u, a, w, t) ⊃
Query: What are the names of suppliers located in London who supply nuts weighing more than one ounce?
Translation: SR(x, a, y, London) ∧ SY(x, u, v) ∧ Part(u, nut, w, t) ∧ Lt(1, t) ⊃
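Refuting the first query clause against the database amounts to finding every binding of the variables that matches stored tuples, and then projecting on the parameter a. A small sketch follows; the stored tuples are hypothetical, and the field order follows the relations as described in the text.

```python
# Hypothetical contents of the three relations.
SR = [(1, "Apex", 20, "London"), (2, "Acme", 10, "Paris")]  # (supplier no, name, status, city)
PART = [(10, "bolt", "grey", 2), (11, "nut", "red", 3)]      # (part no, name, color, weight)
SY = [(1, 10, 300), (2, 10, 50), (1, 11, 75)]                # (supplier no, part no, quantity)


def suppliers_of(part_name):
    """All values of the parameter a for which
    Part(x, part_name, y, z), SY(u, x, w) and SR(u, a, v, t) hold together."""
    return sorted({
        a
        for (x, name, _y, _z) in PART if name == part_name
        for (u, x2, _w) in SY if x2 == x
        for (u2, a, _v, _t) in SR if u2 == u
    })


print(suppliers_of("bolt"))
```

The set comprehension performs the join on the shared variables x and u, and the final set of names is the projection on a.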
In all three instances, logical deduction is to be used to determine all possible settings of variables and parameters which lead to a contradiction, and then to print out the parameters only (projection).
As an approach to improving response at run time, the forms of the most commonly asked queries can be anticipated, and all possible deductions can be precompiled. The original query is thus reduced to a union of "easier" queries which require only database lookup. In this manner, deduction can be largely avoided at run time, and if there is not a large number of alternative "look ups" for a given query, this approach can be efficient. The precompilation approach is due to Reiter [Rei 1978] and Henschen [Hen Nag].
In Reiter's approach to interfacing logical inference and database, the purely existential validity fragment of predicate logic plays an important role. We illustrate some of these ideas in the next example and the results following.
Example 6.3.3
Given a database of "facts" D (no variables) and given principles which govern the database domain, with quantifier-free matrices Ci, Horn or not:
(∀x^(i)) Ci(x^(i)),  i = 1, ..., t
We have a quantifier-free matrix Q(a, y) and we wish to know all a such that in that domain there necessarily are y with Q(a, y) true. We note that the validity of
D ∧ (∀x^(1))C1(x^(1)) ∧ ... ∧ (∀x^(t))Ct(x^(t)) ⊃ (∃y)Q(a, y)
is equivalent to, by Prenex Laws, the validity of
(∃x^(1)) ... (∃x^(t))(∃y)[D ∧ C1(x^(1)) ∧ ... ∧ Ct(x^(t)) ⊃ Q(a, y)]
i.e. it is of the purely existential form (∃z)Q'(a, z). The following well-known result is thus helpful (see e.g. [Den Lew 1983]).
Theorem: (∃y)B(y), B quantifier free, is valid (i.e. true in all models) iff the disjunction of B(c) over all c in V is a tautology, where V is the set of all vectors of constants drawn from B (if none in B, add one).
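The theorem turns validity of a purely existential sentence into a finite propositional test. The following sketch illustrates this under assumptions of our own: formulas are nested tuples, ground atoms play the role of propositional letters, and the tautology test is by brute-force enumeration. All names are hypothetical.

```python
from itertools import product


def atoms(f):
    """Ground atoms of a formula built from
    ('atom', pred, args), ('not', f), ('or', f, g), ('imp', f, g)."""
    if f[0] == "atom":
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))


def evaluate(f, v):
    """Truth value of f under the valuation v of its ground atoms."""
    op = f[0]
    if op == "atom":
        return v[f]
    if op == "not":
        return not evaluate(f[1], v)
    if op == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    if op == "imp":
        return (not evaluate(f[1], v)) or evaluate(f[2], v)
    raise ValueError(op)


def is_tautology(f):
    names = sorted(atoms(f))
    return all(
        evaluate(f, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )


def existential_valid(matrix, arity, constants):
    """Test validity of (exists y) B(y) via the disjunction of B(c) over all
    vectors c of the given constants; `matrix` maps a tuple c to B(c)."""
    instances = [matrix(c) for c in product(constants, repeat=arity)]
    disjunction = instances[0]
    for inst in instances[1:]:
        disjunction = ("or", disjunction, inst)
    return is_tautology(disjunction)


def P(t):
    return ("atom", "P", (t,))


# (exists y)(P(a) or P(b) implies P(y)): the constants drawn from B are a and b.
def first(c):
    return ("imp", ("or", P("a"), P("b")), P(c[0]))


# (exists y) P(y): no constants occur in B, so one constant is added.
def second(c):
    return P(c[0])


print(existential_valid(first, 1, ["a", "b"]))
print(existential_valid(second, 1, ["c"]))
```

In the first sentence neither instance B(a) nor B(b) is a tautology by itself, but their disjunction is, which is why the theorem quantifies over all vectors of constants.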
Corollary: In every domain where D is true and where
(∀x^(i)) Ci(x^(i))
is true for all i = 1, ..., t, there necessarily are domain elements c with Q(a, c) true, for a among domain elements, if and only if
D ∧ (∀x^(1))C1(x^(1)) ∧ ... ∧ (∀x^(t))Ct(x^(t)) ⊃ (∃y)Q(a, y)
is a theorem of predicate logic.
Reiter has shown that, in addition to the axioms (∀x^(i))Ci(x^(i)), one may add axioms for equality, together with the graph of the equality relation, plus an axiom ("domain closure") that all objects occur in the database, and this will not change the answers to existential queries. The result is significant because equality is not easy for resolution-based procedures to handle.
In Lecture 8, we give an approach to theorem proving which focuses on the ∃ validity fragment of predicate logic, or, equivalently, its ∀ satisfiability fragment. Indeed, (∃y)Q(y) is valid exactly if (∀y)¬Q(y) is not satisfiable. From Skolem's reduction, satisfiability in predicate logic is reducible to ∀∃ satisfiability, so that the latter is not testable. However, we shall see that ∃∀ satisfiability is reducible to ∀ satisfiability, which, by the theorem above, reduces to propositional logic and so is testable.
In the applied predicate logic, the ∀∃ fragment reduces (by use of function symbols) to the ∀ fragment, and hence all of predicate logic reduces to the ∀ fragment. Hence the ∀ fragment of applied predicate logic is not testable, and the theorem above is restricted to the pure predicate logic with constant symbols.
LECTURE 7
COMPUTATIONAL COMPLEXITY ABOVE NP: A RETROSPECTIVE OVERVIEW
Summary: We survey some of the complexity results on problems which are harder than NP, and interject our own perspective. This lecture is a digression and is not needed to understand the subsequent lectures. However, it will be useful for the reader to have a broader framework for algorithms for the predicate logic, which appears to have a complexity above the NP class widely known to those in Operations Research. Moreover, we will relate some of these higher complexity classes to problems which naturally occur in Operations Research.
7.1 Introduction
Researchers have long sought general measures by which they could discern various degrees of ”difficulty” in different practical problems. Such measures would guide in the modelling of a practical situation, favoring models of lower difficulty; and would help to set expectations for the performance of algorithms. In the 1970’s the precise concept of ”computationalcomplexity” was viewed a s meeting this need for a measure of difficulty, to the extent that polynomialtime algorithms (at least those of low degree) were viewed as ”tractible” while others were viewed as “intractible”. However, this view is very much in dispute today, due to the fact that the Simplex Algorithm can be exponential time for certain problems [Kee Min 19711, [Jer 1973b], while some nonpolynomial problems (such as knapsack problems and satishbility problems) are typically satisfactorily solved for practical needs. Moreover, the entire thrust of current research in MD? is to develop efiicient means for solving problems which, in the terminology of complexity theory, are “intractible”. The fact is, that this research is meeting with success. At this point in time, most
120
R. JEROSLOW
applications-oriented applied scientists, including those in Computer Science, simply ignore the intractibility recommendations of complexity theory. However, computational complexity retains a central role for setting expectations as to dgotithm frcrmeutorks for problem solution. For instance, upon learning that a problem is NP-complete, one often is lead to consider a branch-and-bound approach (or dynamic programming or cutting-planes) as the general solution method, within which many other partial solution methods may be imbedded. In addition, computational complexity can be used to set expectations on the worst-case performance of proposed algorithms. To some extent, complexity can be used to suggest or motivate algorithms by means of the conceptud schemes for computation that are associated with complexity results. This is a valuable contribution. In terms of practical measures of computational difliculty, the current choice is among the "wind tunnel" method, sophisticated analyses of the performance of specific algorithms on randomly generated data, and sophisticated analyses of probabilistic algorithms in the worst case. There have been a number of critiques of the use of randomly-generated data and their match to real-world problems, which we do not wish to repeat here. Moreover, results in this area are often exceptionally hard to obtain, even for "rudimentary" algorithms which lack realism. The "wind tunnel" method consists of trying algorithms experimentally against real-world data. The folklore ascribes to Gene Lawler the idea of a "computer Olympics". Here "contestants" would try out their algorithms against a library of real-world problems on which many other algorithms have been tested. A variant of this idea is a centralized algorithm testing facility, which could be a long-term, sustained activity with the capability of aiding many industries via quantitative modelling techniques. 
A major function of such a facility would be to create and maintain an extensive library of actual or modified problems from various industrial settings. Of course, such a facility would be open to all researchers who wished to try out their algorithms and approaches. While computer usage and run time certification would be on a fee basis, access to the problem library would be inexpensive and easy. To avoid conflict of interest, such a facility would neither engage in nor contract out any algorithm development, etc. It would be overseen by a board of respected scientists representing a variety of interests and approaches. In my view, a centralized testing facility is clearly a very appropriate way of linking applied decision sciences to applications. It dominates the problem-at-a-time, client-at-a-time approach which is currently the dominant practice, where often even the difficulty of solved problems has not been ascertained by
alternative algorithms. Occasionally, the problem datasets are not available for testing and verification by other experimenters.
In terms of computational complexity, which remains an essential part of applied science and which is necessary for any sophisticated perspective in our area, a theoretical consideration is the need to address trial-and-error procedures (see e.g. [Jer 1975], [Gol 1965], [Put 1965]). In these procedures, while no one algorithm may be efficient, there can nevertheless be a sequence of algorithms between which users switch, over time, to achieve solution of larger and larger problems. This issue has not been addressed in the current complexity theory.
A more recent development for computational complexity derives from database theory, where, in very large databases, even a quadratic-time algorithm may be far too slow. Here the emphasis is often on algorithms with worst case time complexity that is linear, or even less-than-linear, in the size of the database.
In the following sections, we assume a familiarity with the elements of the theory of NP complexity, and we review more than we introduce concepts. Background material is the first three chapters of [Gar Joh 1979].
7.2 The fundamental distinction: conceptions vs. their instances
Many of the results on computational complexity derive from the huge gap between what humans can conceive of in principle, on a theoretical level, versus what they can actually implement. Most people have little sophistication concerning the process of obtaining, from the powerful human imagination, concrete, usable outcomes. We can have absolutely clear and concise conceptions of the basic mechanical functioning of a computing device (or of axioms) and yet have no understanding of what will be the output of a long calculation (or of a long deduction). It is these clear conceptions which lead to succinct formulations in a logic. It is our lack of understanding of output which leads to "hard" problems. The result is that tasks which are seemingly simple to describe are hard, and predicate logic may prove "impossible" without taking advantage of special processing which is possible for some structured problems. A second phenomenon is as follows. If we can give an absolutely clear conception of a class of tasks, we obtain from this a clear conception of a harder task. This is the "diagonalization"
principle, which we shall shortly illustrate. As a consequence, at least some "complexity hierarchies" do not "collapse" to lower levels, so that complexity takes on a graded structure.
We now illustrate the diagonalization process. Fix a function, e.g. $F(n) = 2^n$. Consider all programs which, given an input $x$ of length $|x| \le n$, output a "yes" or "no" in time $\le F(n)$. Call this the class $P_F$ of programs $p$. Now consider this program $p_0$: Given $x = 11\ldots1$ ($n$ ones), apply the $n$-th program to $x$. If it does not stop in time $\le F(n)$, output a "yes" or "no". If it stops in time $\le F(n)$, output a "no" if it outputs a "yes"; output a "yes" otherwise.
Fact: $p_0$ takes time "a little longer" than $F(n)$, and it is not equivalent to a program in $P_F$. For if $p_0$ has number $n_0$ and stops in time $\le 2^{n_0}$, it must answer both "yes" and "no" on the input of $n_0$ ones.
Diagonalization is an ancient phenomenon, as e.g. the Cretan Liar Paradox, "this sentence is false". It is reflected in the "paradoxes" of informal set theory (e.g. "the set of all those sets which are not members of themselves"). With great technical skill, Gödel showed it to be a means of proving complexity results and incompleteness results [Godel 1931, 1934]. It is a widely used technique of complexity theory. Here the ingenuity of the proof lies in finding means to express a "paradox like" condition in a logic or computation which does not a priori appear to be sufficiently expressive. The surprising expressiveness of simple fragments of logic in turn derives from the power of even simple conceptions.
Further progress in complexity theory has been hindered by a lack of new insights beyond the two phenomena of "gap" and "diagonalization". There is a consequent inability to determine central interrelations, e.g., if polynomial time is the same as nondeterministic polynomial time (P=?NP).
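The diagonal construction above can be tried out on a finite scale. The following Python sketch is illustrative only: it assumes a finite list `programs` of total yes/no deciders (a hypothetical stand-in for an enumeration of the class of programs considered above) and it ignores the time bound $F(n)$. The names `Decider`, `diagonal` and `p0` are ours.

```python
from typing import Callable, List

# Assumption: a "program" is a total function from a string of ones to
# True ("yes") or False ("no"). A real enumeration would also carry a
# step budget F(n); that part is omitted here.
Decider = Callable[[str], bool]


def diagonal(programs: List[Decider]) -> Decider:
    """Build a decider that disagrees with the n-th program on n ones."""

    def p0(x: str) -> bool:
        n = len(x)
        if 1 <= n <= len(programs):
            # Answer the opposite of what program number n answers.
            return not programs[n - 1](x)
        return False  # arbitrary answer outside the enumerated range

    return p0


programs = [
    lambda x: True,             # program 1: always "yes"
    lambda x: len(x) % 2 == 0,  # program 2: "yes" on even length
    lambda x: False,            # program 3: always "no"
]
p0 = diagonal(programs)

# p0 differs from program n on the input consisting of n ones.
differs = [p0("1" * n) != programs[n - 1]("1" * n) for n in range(1, 4)]
```

Since `p0` differs from every listed program on at least one input, it cannot itself be any member of the list, which is the content of the "Fact" above in miniature.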
7.3 Two fundamental results
We now give two examples, both central to complexity theory, of clear, concise ideas having consequences which are hard to determine. We assume that the reader is familiar with the concept of a "Turing machine," a general-purpose programmable computer which is user unfriendly, having a primitive machine language and no compiler. Only persistent students of logic have actually programmed these machines. Conceptually, they could hardly be simpler as each
• works on tape divided in "squares"
• there is one symbol per square
• there is one read/write head over one square
• the next move is 0 or $\pm 1$ squares
This is certainly a clear, concise conception. It is, in principle, adequate for computing anything we can compute on the most modern digital computers, although the modern machines of course run much faster. These primitive, theoretical computing devices will allow us to see the surprising expressiveness of the propositional calculus. We will see that any polynomial-time computation with these machines can be expressed as a polynomial-size proposition, such that an accepting computation corresponds exactly to a satisfiable proposition in conjunctive normal form. Furthermore, a polynomial-time computation even "with guesses" (i.e. a nondeterministic polynomial time computation) yields to the same treatment. The sole difference between a deterministic and a nondeterministic polynomial-time computation will be that the former yields a proposition with Horn clauses only. As a consequence of this fact, clausal chaining (unit resolution) or linear programming is proven complete for polynomial-time [Dob Lip Rei 1979] (see also [Jon Laa 1974], [Sky Val 1985], and [Val 1982]). The result for linear programming of course relies on Khachian's algorithm [Kha 1979], to show that linear programming is in polynomial time. It is "hard" to anticipate what a Turing machine will do, with guesses, in polynomial time. As a consequence, it will be hard to determine satisfiability in conjunctive normal form. We follow essentially the original construction of Cook [Coo 1971], and proceed as follows:
• We introduce a propositional letter for each possible tape location, at each time, for each alphabet letter and each possible machine state. This leads to a polynomial number of propositional letters.
• Then, in order to say that only permissible moves can be made in going from time $t$ to $(t+1)$, we write implication clauses like:
$$(\text{letter } \sigma_1 \text{ alone in square } (i-1) \text{ at time } t) \wedge (\text{letter } \sigma_2 \text{ in square } i \text{ with read head and in state } q \text{ at time } t) \wedge (\text{letter } \sigma_3 \text{ alone in square } (i+1) \text{ at time } t) \supset \text{allowable transition} \ldots$$
We use other implications similarly to state that other squares are unchanged, that we start with a specified input, etc. Now note that non-determinism comes "for free": replace "allowable transition ..." by "allowable trans$_1$ $\vee$ allowable trans$_2$ $\vee \ldots$".
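In the deterministic case every clause of the construction is a Horn clause, and the clausal chaining (unit resolution) mentioned above can decide such a clause set. A minimal Python sketch follows, under assumptions of our own: a clause is a pair `(body, head)` meaning "the atoms in `body` jointly imply `head`", a fact has an empty body, and a head of `None` encodes a purely negative clause. The atom names are hypothetical and do not come from Cook's construction, and the simple repeated-pass loop is not the most efficient way to do chaining.

```python
from typing import List, Optional, Set, Tuple

# Assumed encoding of a Horn clause: (body atoms, head atom or None).
HornClause = Tuple[Tuple[str, ...], Optional[str]]


def horn_satisfiable(clauses: List[HornClause]) -> Tuple[bool, Set[str]]:
    """Forward chaining: derive forced atoms until nothing changes.

    Returns (satisfiable, atoms forced to be true).
    """
    true_atoms: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if all(atom in true_atoms for atom in body):
                if head is None:
                    # A negative clause is violated by the forced atoms.
                    return False, true_atoms
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return True, true_atoms


# Toy chain: start configuration, one step, then acceptance.
chain: List[HornClause] = [
    ((), "config_0"),
    (("config_0",), "config_1"),
    (("config_1",), "accept"),
]
sat_chain, forced = horn_satisfiable(chain)

# Adding "accept implies false" makes the clause set unsatisfiable.
sat_blocked, _ = horn_satisfiable(chain + [(("accept",), None)])
```

When a nondeterministic machine is encoded instead, the heads become disjunctions, the clauses are no longer Horn, and this chaining procedure no longer suffices.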
"Polynomial time" is of interest for theoretical reasons, e.g. closure under multiplication (as above) or addition, dominance of a maximum of polynomials by a polynomial, and other properties. Practical interest lies in the low order polynomials, e.g. $O(n)$, $O(n^2)$, $O(n^3)$, and marginally $O(n^4)$, $O(n^5)$. Further interest in nondeterministic polynomial time NP was motivated by the discovery of a huge number of Operations Research problems which are in NP time (=NP), i.e. polynomial time given the correct "guesses". In fact, there are hundreds of instances cited in the book by Garey and Johnson. The connections between Operations Research problems and NP computational complexity were first developed by Karp [Kar 1972]. Propositional satisfiability is a "special case" of many of these, while from Cook's construction, it is also the fundamental problem of the NP completeness paradigm.
As we noted, Turing machines are relatively inefficient and cumbersome. However, changes in the model of computation do not usually change the tasks computable in polynomial time, although such changes can speed up tasks by constant factors or even orders of magnitude. As long as the minute operations of a model of computation are to change bits, its computation will be expressable in propositional logic. Only when more global "minute" operations are used (as e.g., multiplication of integers in constant time, independent of size) can there be changes in what is computable in polynomial time.
We now turn to our second fundamental result. How much logical expressability do we need to concisely describe nondeterministic exponential time computation? The answer is due to H. Lewis, and it is: $\forall$ satisfiability in predicate logic, i.e., the complement of what R. Reiter identifies as the fragment of predicate logic most relevant to database queries.
Lewis' main technical idea is that, to discuss an exponential length tape during an exponential time computation, we need to be able to use base $m$ numbers of length $m$ (where $m^m = c^n$, and $c^n$ is the computation time). In this manner, a short symbol indicates an exponentially large time.
We shall use the constants $d_0, d_1, \ldots, d_{m-1}$ of predicate logic to denote $0, 1, \ldots, m-1$ to base $m$. The base $m$ successor relation can be developed using universally-quantified principles: e.g. for all $i = 0, \ldots, m-2$ an axiom:
$$(\forall x_0) \cdots (\forall x_{m-2})\; SC(x_0, x_1, \ldots, x_{m-2}, d_i;\; x_0, x_1, \ldots, x_{m-2}, d_{i+1})$$
plus other axioms to precisely specify successor. We then define $\le$ in terms of successor, etc. We next develop a computation predicate:
"At time $x_0 \ldots x_{m-1}$ in tape square $y_0 \ldots y_{m-1}$ the symbol is $\sigma$ and the read head is/is not located (with internal state $q$)"
These predicates can be introduced and axiomatized in a manner similar to Cook's approach for propositional logic, but using the universal quantifier to replace what would otherwise be an exponential number of propositions. The details are in [Lew 1980]. As in the propositional logic, nondeterminism comes "for free" in Lewis' construction.
Lewis' construction takes a nondeterministic exponential-time computation and makes it correspond to a small set of universal sentences of pure predicate logic with constant symbols, such that the computation ends in acceptance exactly if the sentences are satisfiable. As we saw in Lecture 6, this $\forall$ fragment of predicate logic is, in turn, decidable in nondeterministic exponential time, since complete instantiation results in an exponential-size proposition. Thus, nondeterministic exponential time can be identified with this fragment of logic.
Plaisted's related result is that, for deterministic exponential time, only $\forall$ Horn clauses are needed, and so that time class can be identified with that logic class. The reader may consult the useful figure in [Den Lew 1983] for placement of the complexity of other fragments of predicate logic. In that reference, the pure predicate calculus (without constants) is used, so initial existential prefixes need to be removed to compare to our way of stating results here.
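The role of base $m$ numbers of length $m$ can be illustrated numerically. The Python sketch below is an assumption-laden illustration, not Lewis' axiomatization: digits are plain integers rather than the constants of predicate logic, and `successor` and `count_values` are names of our own. The no-carry branch plays the part of the successor axiom displayed above; the carry loop stands in for the "other axioms".

```python
from typing import Optional, Tuple


def successor(digits: Tuple[int, ...], m: int) -> Optional[Tuple[int, ...]]:
    """Successor of a base-m number given as digits, most significant first.

    Returns None when the number is the largest one representable.
    """
    out = list(digits)
    i = len(out) - 1
    while i >= 0:
        if out[i] != m - 1:
            out[i] += 1      # no carry needed at this position
            return tuple(out)
        out[i] = 0           # carry into the next position to the left
        i -= 1
    return None


def count_values(m: int) -> int:
    """Count how many length-m base-m numbers successor walks through."""
    digits: Optional[Tuple[int, ...]] = (0,) * m
    count = 0
    while digits is not None:
        count += 1
        digits = successor(digits, m)
    return count


total_for_3 = count_values(3)  # equals 3 ** 3
```

A vector of only $m$ symbols thus indexes $m^m$ distinct times or tape squares, which is how a short formula can refer to an exponentially long computation.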
7.4 What if we increase expressability "a little bit"?
We've seen how expressability of universal quantification leads to nondeterministic exponential time computation. Suppose we add function symbols for
successor, and plus; and we add in the equality relation, but we ignore the other symbols of predicate logic. In this language, let Presburger arithmetic consist of all true statements of arithmetic. Now the time complexity will jump above $2^n$, or even the $2^{2^n}$ upper bound for nondeterministic exponential time, according to our next two results.
Theorem: [Coop 1972] There is a $2^{2^{2^n}}$-time algorithm for determining if a sentence is in Presburger arithmetic.
(Presburger proved decidability in [Pres 1929]).
Theorem: [Fis Rab 1974] Any nondeterministic algorithm for deciding Presburger arithmetic requires at least $2^{2^n}$ time.
Next suppose we add a three-place letter to represent multiplication, and we let first order arithmetic (FOA) be all true statements of arithmetic in this language. A subset of FOA is Peano's arithmetic (FPA), which consists of all statements in this language provable from Peano's axioms. The Peano axioms are a small list of basic facts about successor, the recursion relating successor to addition, the recursion relating addition to multiplication, and then the axiom schema of induction; see [Men 1964] or [Shoen 1967] for details. The following well-known result is a direct consequence of [Godel 1931].
Theorem: [Godel 1931] There is no algorithm for determining which statements are in FPA.
Gödel's proof essentially consists of an ingenious way of representing any computation of any length in FPA, so that all halting Turing machine computations can be proven from Peano's axioms. A diagonalization then completes the proof. Subsequently, Robinson [Tar Mos Rob 1953] produced a small, finite theory Q which also is adequate to establish all halting computations. Hence Q also has no algorithm for deciding theorems, and neither does predicate logic (since the theorems $A$ of Q exactly correspond to the theorems $Q \supset A$ of predicate logic).
In addition to such "negative" results as the non-existence of an algorithm, Gödel's techniques can also be used to obtain some algorithms. Specifically, from a mechanical enumeration of some of the nontheorems of FPA, we automatically can find a nontheorem this procedure does not enumerate. We illustrate this with a second diagonalization. Indeed, from such an enumeration one obtains a program $n_0$ such that it stops on $\{ n \mid$ "program $n$ stops on input $n$" is enumerated $\}$. We ask: does program $n_0$ stop on input $n_0$? If it did, "program $n_0$ stops on input $n_0$" would be a theorem of FPA, hence not in the enumeration, so $n_0$ would not stop on input $n_0$, by definition. This is a contradiction. Hence, program $n_0$ does not stop on input $n_0$. Is "program $n_0$ stops on input $n_0$" enumerated? No, for if it is, program $n_0$ would stop on input $n_0$, by definition. However, we know it does not stop. Hence "program $n_0$ stops on input $n_0$" is a non-theorem yet not enumerated!
How expressive is FOA? Suppose we added an "oracle" to decide all halting computations instantly? If that isn't enough, add an oracle to the oracle, etc.
Theorem: (Due to E. Post and cited in [Rog 1967]). No finite-level oracle can determine the theorems of FOA.
From Post's theorem above, even among problems which are unsolvable, using "oracles" we can distinguish degrees of difficulty, and in fact a whole hierarchy of unsolvable problems.
Let's reverse direction and cut back on expressiveness. When are things "only as complex" as in Operations Research?
Theorem: [Opp 1968] The existential statements of Presburger arithmetic form an NP-complete set.
Theorem: (Lewis) The existential satisfiable sentences of pure predicate logic form an NP-complete set.
Theorem: (Bledsoe and Shostak) There is a $2^n$-time procedure for the universal statements of Presburger arithmetic.
As we have seen from our excursion into more complex, and hence more expressive, logical theories, things get "curiouser and curiouser". In the next section, we turn to what are the relatively "lower levels" of complexity which still lie above NP, but which are not definitely known to be nonpolynomial. In this lecture, we have not touched on some very significant higher complexity results (as e.g. [Mey Sto 1972], [Sto Mey 1973], [Har Stea 1965], [Har Lew Stea 1965]) as these are of an automata-theoretic nature, and do not directly intersect with our topic in these lectures. Also see [Jer 1973a] for an Operations Research problem which is unsolvable (i.e. integer programming with quadratic constraints), as a consequence of the results in [Mat 1970] and [Dav Put Rob 1961].
7.5 The Polynomial Hierarchy, Probabilistic Models, and Games
The Polynomial Hierarchy of Meyer, Stockmeyer and Karp uses the concept of an "oracle," and is done in analogy with Post's hierarchy.
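NP = NPtime or $\Sigma_1$ are those sets of the form:
$$\{ x \mid (\exists |y| \le p(|x|))\, P(x, y) \}$$
Here $p$ is a polynomial and $P$ is a polynomial-time computable predicate. Thus $\Sigma_1$ are those problems decidable in polynomial time, given correct "guesses". (We use $|x|$ to denote the length of $x$.)
Suppose there is an oracle for any given $\Sigma_1$ set which answers membership questions instantly. An NP computation using such an oracle for e.g. $\{ w \mid (\exists |u| \le p(|w|))\, R(w, u) \}$ would have the form: [formula lost in extraction]
Here $R'$ is the predicate for lists of words in $R$. Using Prenex Laws, this converts to: [formula lost in extraction; the surviving fragment is $(\forall |u'| \le p(|w'|))$]
This is of the form
$$\{ x \mid (\exists |y^{(1)}| \le p(|x|))\, (\forall |y^{(2)}| \le p(|x|))\, P(x, y^{(1)}, y^{(2)}) \}$$
which is called $\Sigma_2$-form and involves an alternation of quantifiers from $\exists$ to $\forall$. Conversely, any set defined by a $\Sigma_2$ form is NPtime in an oracle for a $\Sigma_1$ (i.e. NP) set. Now suppose we had oracles for all $\Sigma_2$ sets. The analogy continues, i.e. by an NPtime computation we arrive at $\Sigma_3$ sets, which have two quantifier alternations:
$$\{ x \mid (\exists |y^{(1)}| \le p(|x|))\, (\forall |y^{(2)}| \le p(|x|))\, (\exists |y^{(3)}| \le p(|x|))\, P(x, y^{(1)}, y^{(2)}, y^{(3)}) \}$$
We proceed similarly for all finite levels $\Sigma_n$, $n \ge 0$. Let $\Sigma_0$ be Ptime (polynomial time) by definition. We have the hierarchy:
$$\Sigma_0 \subseteq \Sigma_1 \subseteq \cdots \subseteq \Sigma_n \subseteq \cdots$$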
It is unknown if the hierarchy is one of strict inclusions. As presented in this way, the polynomial hierarchy is very abstract. However, even at its lower levels, it relates to Operations Research problems, specifically, to parametric mixed integer programming. Suppose that an MIP description is given of a plant and warehouse location problem, and we need to know: "Is it the case that, no matter where I locate plants 5-10 on the allowable sites, there will be a way of meeting all customer demand with our client Apex Corp. requiring no more than 1 million dollars in transportation costs this year?" Our query has the form:
$$(\forall\ \text{locations 5-10})\ (\exists\ \text{other locations, distributions})\ (\text{meets demand and Apex trans} \le \$1\ \text{Mil})$$
The question is in "co-$\Sigma_2$" (complement of $\Sigma_2$). Suppose instead of "a way" we want "an optimal way". Then our question has the form:
$$(\forall\ \text{locations 5-10})\ (\exists\ \text{other locations, distributions})\ (\forall\ \text{possibly different locations, distributions})$$
(The other locations, distributions meet demand; and, if the possibly different locations, distributions meet demand, they have at least as high a cost; and Apex transportation costs do not exceed 1 million dollars.)
This is co-$\Sigma_3$. See [Bla Jer 1983] for some parametric programming problems which are in P or NP.
Different levels of the hierarchy suggest different types of algorithms. $\Sigma_1$ suggests the use of "or" trees. If any choice succeeds, accept $x$. "OR" trees are like branch-and-bound, except that in an "OR" tree there is no communication between alternative paths (unlike BB, where there is). $\Sigma_2$ suggests the use of "or/and" trees.
We note that all $\Sigma_n$ involve binary trees with branches of polynomial length. So membership in $\Sigma_n$ can be tested in polynomial space (and exponential time, via backtracking). Letting PSPACE denote polynomial space we have:
$$\Sigma_0 \subseteq \Sigma_1 \subseteq \cdots \subseteq \Sigma_n \subseteq \cdots \subseteq \text{PSPACE}$$
The following result "caps" the polynomial hierarchy:
Theorem: [Sto 1977]
$$\text{PSPACE} = \bigcup_{n \ge 0} \Sigma_n$$
From the theorem, if P=NP then P=PSPACE, and the converse implication is clear. Moreover, in polynomial space at most an exponential number of
tape configurations, combined with read head positions and states, is possible. Thus, PSPACE is contained in exponential time (EXPTIME), so the entire polynomial hierarchy lies below the Horn (satisfiable) fragment of predicate logic. Here is another reason why predicate logic is probably very difficult. Just as boolean expressions, in the form of propositional logic, play a central role in NP or $\Sigma_1$, they also provide complete sets at the various levels of the polynomial hierarchy.
Figure 19: "Or" Tree (such trees can be recast as a binary tree)
Theorem: [Sto 1977] The set of all true statements of the following form is complete for $\Sigma_n$ (under Ptime functions):
$$(\exists y^{(1)})(\forall y^{(2)}) \cdots (Q_n y^{(n)})\, B(y^{(1)}, \ldots, y^{(n)})$$
in which $B$ is a propositional form, and there are $(n-1)$ alternations of quantifiers ($Q_n$ is $\exists$ or $\forall$ according as $n$ is odd or even).
Papadimitriou gives a variant construction which involves random variables (continuing work of Gill, Valiant, others). This construction exactly follows that for $\Sigma_n$ except that the universal quantifiers are replaced by random choices of a vector, and the final condition is to be attained more than half the time (for $\Sigma_n$, it is "all the time").
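The "or/and" tree reading of quantified propositional forms can be sketched as a backtracking evaluator. The Python below is a minimal illustration under our own assumptions: each quantifier binds a single boolean variable rather than a vector, the prefix is a list of `("E", name)` or `("A", name)` pairs, and the matrix is any Python function of the assignment. The name `evaluate` is ours. The recursion depth equals the number of variables, which reflects the polynomial-space bound, while the running time is exponential.

```python
from typing import Callable, Dict, List, Optional, Tuple

Prefix = List[Tuple[str, str]]            # ("E" or "A", variable name)
Matrix = Callable[[Dict[str, bool]], bool]


def evaluate(prefix: Prefix, matrix: Matrix,
             assignment: Optional[Dict[str, bool]] = None) -> bool:
    """Decide a quantified propositional form by depth-first backtracking."""
    current = dict(assignment or {})
    if not prefix:
        return matrix(current)
    quantifier, var = prefix[0]
    branches = (
        evaluate(prefix[1:], matrix, {**current, var: value})
        for value in (False, True)
    )
    # "E" is an OR node (some branch succeeds); "A" is an AND node.
    return any(branches) if quantifier == "E" else all(branches)


def differ(a: Dict[str, bool]) -> bool:
    return a["x"] != a["y"]


def x_suffices(a: Dict[str, bool]) -> bool:
    return (a["x"] or a["y"]) and (a["x"] or not a["y"])


exists_forall_true = evaluate([("E", "x"), ("A", "y")], x_suffices)
exists_forall_false = evaluate([("E", "x"), ("A", "y")], differ)
forall_exists_true = evaluate([("A", "y"), ("E", "x")], differ)
```

The last two calls use the same matrix with the quantifier order exchanged, which shows why the order of alternation matters for the level of the hierarchy.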
Figure 20: "Or/and" Tree. Such trees can be recast as binary-branching. To succeed, one needs any $y^{(1)}$ such that all $y^{(2)}$ from it succeed.
So $P\Sigma_3$ is:
$$(\exists |y^{(1)}| \le p(|x|))\ (\text{given a realization of a random } |y^{(2)}| \le p(|x|))\ (\exists |y^{(3)}| \le p(|x|))\ P(x, y^{(1)}, y^{(2)}, y^{(3)}) \text{ is true more than half the time.}$$
Similarly for $P\Sigma_n$, and by definition PPSPACE (probabilistic PSPACE) is the union $\bigcup_{n \ge 0} P\Sigma_n$.
Theorem: [Pap] PPSPACE = PSPACE
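Replacing universal quantifiers by random choices changes the tree evaluation only at the former "and" nodes, which now average instead of requiring every branch to succeed. The following Python sketch is an illustration under simplifying assumptions of our own: each quantifier binds one boolean variable, a random variable is a fair coin, and the names `success_probability` and `accepts` are hypothetical.

```python
from fractions import Fraction
from typing import Callable, Dict, List, Optional, Tuple

Prefix = List[Tuple[str, str]]            # ("E" or "R", variable name)
Matrix = Callable[[Dict[str, bool]], bool]


def success_probability(prefix: Prefix, matrix: Matrix,
                        assignment: Optional[Dict[str, bool]] = None) -> Fraction:
    """Best achievable probability that the matrix comes out true."""
    current = dict(assignment or {})
    if not prefix:
        return Fraction(1) if matrix(current) else Fraction(0)
    kind, var = prefix[0]
    branches = [
        success_probability(prefix[1:], matrix, {**current, var: value})
        for value in (False, True)
    ]
    if kind == "E":
        return max(branches)      # the decision maker picks the better branch
    return sum(branches) / 2      # "R": a fair coin decides the branch


def accepts(prefix: Prefix, matrix: Matrix) -> bool:
    """Accept when the condition holds more than half the time."""
    return success_probability(prefix, matrix) > Fraction(1, 2)


prefix = [("E", "x"), ("R", "y"), ("E", "z")]


def hedge(a: Dict[str, bool]) -> bool:
    return (a["x"] and a["y"]) or ((not a["x"]) and a["z"])


def must_match(a: Dict[str, bool]) -> bool:
    return (a["x"] == a["y"]) and a["z"]


p_hedge = success_probability(prefix, hedge)
p_match = success_probability(prefix, must_match)
```

In the second example the first choice must guess the coin, so the best probability is exactly one half and the form is rejected, since "more than half the time" is a strict requirement.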
PPSPACE is essentially equivalent to finite horizon, dynamic non Markovian decision processes with terminal state rewards (i.e. non additive) in which one gets to see the realization of a random event before having to make the next decision (which will be followed by another random event). Although there is an exponential number of possible policies, there are only polynomially many states over all stages. However, transition probabilities depend on the entire past history of decisions. Call this FHDNM. Any policy is evaluated by its expected reward. Given a description of a FHDNM process and a quantity $B$, we consider this form of a question: "Is there a policy of cost $\le B$?"
Theorem: [Pap] FHDNM is PSPACE complete.
The "surprise feature" of this result is not that FHDNM is "hard" (it seems impossible!) but that it places "so low". We now have this picture:
Figure 21: A Complexity Chain (recoverable labels, in order: IP, PARA MIP, PPSPACE, Nexptime, N($2^{\ldots}$-time), no algorithm)
The polynomial hierarchy can also be related to various generalizations of linear programming, generalizations motivated by considerations of public policy and of delegation of authority (agent problems). Bi-level programs (games) are LP's in which two players have control over disjoint sets of the variables. They move in a definite order.
"Policy maker" sets variables first, then "citizen" reacts. "Policy maker" and "citizen" each have their own linear criterion function involving both sets of variables. All data is known to both players. In what follows, we assume the LP gives a polytope (bounded). We ask the following:
Question: What move should policy maker choose, to maximize his/her benefits, with clear knowledge that citizen is also a maximizer?
A variant of our question (for use in complexity) is: can the value of the bi-level program be $\ge B$?
Bi-level programs occurred in a policy setting at the World Bank (Candler and Townsley) and more general sequenced-move games date much earlier (Von Stackelberg). Since all variables are continuous and all constraints linear, they are very simplified programs and do not consider e.g. possible controls over price (criterion functions), taxation policies, discrete alternatives, etc. Therefore complexity results for bi-level programs are very serious statements about potential barriers to efficient solution of actual policy questions, etc. Two results here are:
Theorem: (Falk, see [Bard Fal 1982]) The optimum value of a bi-level program occurs at an extreme point.
Theorem: [Jer 1985c] Bi-level programming is NP-complete.
$p$-level linear programs are a direct generalization of $p = 2$ levels. Again, it is sequenced-move with complete information, and each player must leave feasible moves for those yet to go. Variables are continuous. A practical example for $p = 3$ occurs when a CEO (Chief Executive Officer) specifies possibilities to a divisional president, who further specifies possible actions to a divisional executive.
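The sequenced-move structure for two levels can be seen on a toy instance. The Python sketch below is a hypothetical example of our own, not a problem from the literature: both players choose integer values on a small grid (a crude stand-in for continuous variables), the shared constraint is $x + y \le 4$, the policy maker's criterion is $x + 2y$ and the citizen's criterion is $-y$. Enumeration is used only because the instance is tiny; it is not a practical bi-level algorithm.

```python
from typing import Callable, Iterable, Optional, Tuple

Objective = Callable[[int, int], float]
Feasible = Callable[[int, int], bool]


def solve_bilevel(xs: Iterable[int], ys: Iterable[int], feasible: Feasible,
                  leader_obj: Objective,
                  follower_obj: Objective) -> Optional[Tuple[float, int, int]]:
    """Leader moves first; follower then maximizes its own criterion."""
    ys = list(ys)
    best: Optional[Tuple[float, int, int]] = None
    for x in xs:
        replies = [y for y in ys if feasible(x, y)]
        if not replies:
            continue  # the leader must leave the follower a feasible move
        y = max(replies, key=lambda v: follower_obj(x, v))
        value = leader_obj(x, y)
        if best is None or value > best[0]:
            best = (value, x, y)
    return best


def solve_cooperative(xs: Iterable[int], ys: Iterable[int], feasible: Feasible,
                      leader_obj: Objective) -> Tuple[float, int, int]:
    """Single-level problem: the follower is assumed to share the leader's goal."""
    ys = list(ys)
    return max((leader_obj(x, y), x, y)
               for x in xs for y in ys if feasible(x, y))


grid = range(5)


def within_budget(x: int, y: int) -> bool:
    return x + y <= 4


def policy_maker(x: int, y: int) -> float:
    return x + 2 * y


def citizen(x: int, y: int) -> float:
    return -y


bilevel = solve_bilevel(grid, grid, within_budget, policy_maker, citizen)
cooperative = solve_cooperative(grid, grid, within_budget, policy_maker)
```

On this instance the cooperative model promises the policy maker a value of 8, while the sequenced-move model delivers only 4, because the citizen sets $y = 0$ whatever the policy maker does.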
Theorem: [Jer 1985c] From the ability to solve $(p+1)$ level linear programs up to 50% of the optimal value to the first-moving player, one can decide membership in sets at level $\Sigma_p$ of the polynomial hierarchy.
(Upper bounds are not known).
We ask this question, in view of the above complexity results: How can a game theoretic solution concept be "normative" when it cannot be computed? How can we recommend as a "solution" what we ourselves cannot implement? For a use of complexity theory to challenge a cooperative model interpretation in game theory, see [Chv 1978].
As a practical matter, Candler reports on a policy question solved to optimality as a linear program, under the assumption that the "citizens" share the goals of the "policy maker," versus the same question solved to within 1% as a bi-level program. The LP solution was wrong by a factor of two. It would be very useful to have better algorithms for bi-level programs.
Artificial intelligence methods are viewed as particularly suited to ill-structured situations, in which a clear problem definition or goal statement may be lacking, and in which an implementable normative framework of a traditional type is lacking (see [Sim 1973]). It follows from Chvátal's work and the work cited above that, even in highly structured situations with clear normative measures, computational complexity alone can obscure the implementability of these measures. Potentially this provides another role of nontraditional (e.g. satisficing) approaches; this matter needs to be made precise. Conversely, in those instances of structured problems with usable normative measures, all approaches to problem-solving can be gauged by these measures.
LECTURE 8 THEOREM-PROVING TECHNIQUES WHICH UTILIZE DISCRETE PROGRAMMING
Summary: We show how ideas from discrete programming can be used in conjunction with theorem-proving techniques, with the potential to improve the efficiency of formal deduction. From discrete programming, we utilize the emphasis on propositional logic and, specifically, the use of incumbent-finding techniques for satisfying valuations. Incumbent-finding can be efficiently performed by suitable list processing routines, as well. From theorem-proving techniques, we utilize the simple form of unification for pure predicate logic with constant symbols.
We focus on decidable fragments of this predicate logic, notably the $\exists\forall$ satisfiability fragment, although in principle all predicate logic can be treated by the methods discussed here. For the decidable fragments, we develop finite algorithms with worst case time bounds equal to the theoretical ones from complexity theory, provided that nondeterminism is replaced by exponentiation. This property ensures that our algorithm does not waste time unnecessarily. However, the intrinsic complexity of these fragments of logic is exceptionally high, and we expect that, while practical methods may utilize the gamut of known devices, problem structure will be essential to exploit. Our theorem-proving philosophy is influenced by Nevins' view [Nev 1974], on the value of doing efficient logic subroutines first, prior to utilizing routines which can cause explosive growth in space or time requirements. Our algorithm was described in [Jer 1985d].
8.1 Reduction of Predicate Logic to a Structured Propositional Logic
We begin with some general results on reducing pure predicate logic, with constant symbols, to propositional logic. The specific reductions here are not typically useful directly, but they will serve to guide our algorithm development. We proved the theorems below in [Jer 1985d], only to later confirm that much of them were in the logical folklore (see e.g. [Den Lew 1983]), but at the time we did not know a reference in the literature. We include a sketch of our proofs.
Theorem: Let $A = A(\vec{x}, \vec{y})$ respectively $B = B(\vec{x})$ be quantifier free, let $\vec{x}$ be of length $n$, and let $C$ be the set of constants occurring in $A$ resp. $B$. (If no constants occur in $A$ resp. $B$, put $C = \{c\}$ where $c$ is a constant). Let $\vec{C}$ denote the set of all vectors of constants of $C$ of length $n$, and put $t = |C|$. Let $\vec{d}$ be a vector of constants of length $n$, none of which is drawn from $C$. Then:
(A) $(\exists \vec{x}) B(\vec{x})$ is satisfiable iff $B(\vec{d})$ is satisfiable iff $(\exists \vec{x}) B(\vec{x})$ has a model of size $\le n + t$.
(B) $(\forall \vec{x}) B(\vec{x})$ is satisfiable iff $\bigwedge_{\vec{c} \in \vec{C}} B(\vec{c})$ is satisfiable iff $(\forall \vec{x}) B(\vec{x})$ has a model of size $\le t$.
(C) $(\exists \vec{x})(\forall \vec{y}) A(\vec{x}, \vec{y})$ is satisfiable iff $(\forall \vec{y}) A(\vec{d}, \vec{y})$ is satisfiable iff $(\exists \vec{x})(\forall \vec{y}) A(\vec{x}, \vec{y})$ has a model of size $\le n + t$.
Corollary: $\exists$-SAT is NP-complete.
Sketch of proof:
(A) If $(\exists \vec{x}) B(\vec{x})$ has a model, let those elements which satisfy $B(\vec{x})$ be denoted by a vector of new constants $\vec{d}$. From the model we obtain a truth valuation making $B(\vec{d})$ true. Conversely, from such a truth valuation we obtain a model making $B(\vec{d})$, and so $(\exists \vec{x}) B(\vec{x})$, true.
(B) If $(\forall \vec{x}) B(\vec{x})$ is satisfied in a model, that model provides a truth valuation making $\bigwedge_{\vec{c} \in \vec{C}} B(\vec{c})$ true, since for each $c \in C$ the model contains an element denoted by $c$. Conversely, from such a truth valuation, we obtain a model whose elements are exactly $C$ in which $\bigwedge_{\vec{c} \in \vec{C}} B(\vec{c})$, i.e. $(\forall \vec{x}) B(\vec{x})$, is true.
(C) Combines ideas of (A) and (B).
Q.E.D.
Recall from Lecture 6 that $\forall\exists$ satisfiability subsumes all satisfiability in predicate logic. By the next result, $\forall\exists$ satisfiability reduces to the simultaneous satisfiability of a structured (infinite) sequence of propositions, and in this manner it can also be treated by our algorithm. Logicians will note that our result gives a variant form of the Skolem universe.
Theorem: Let $A(\vec{x}, \vec{y})$ and $C$ be as in the previous theorem. Put $D_1 = C$ and, inductively, let $D_{k+1}$ consist of the union of $D_k$ with a set of new constants, one for each $\vec{d} \in \vec{D}_k$, where $\vec{D}_k$ is all vectors of size $n$ drawn from $D_k$.
Then $(\forall \vec{x})(\exists \vec{y}) A(\vec{x}, \vec{y})$ is satisfiable iff there is a truth valuation making ...
8.2 Preliminary discussion
From the first theorem in 8.1, the predicate logic form $(\forall \vec{x}) B(\vec{x})$ is satisfiable iff the propositional logic form $\bigwedge_{\vec{c} \in \vec{C}} B(\vec{c})$ is satisfiable. However, the latter is of exponential size. We need techniques of "late binding" similar to those from theorem-proving, in order to try to avoid forming $\bigwedge_{\vec{c} \in \vec{C}} B(\vec{c})$ if possible, while still extracting the information desired. The approach is called partial instantiation. It will differ from most other approaches in that:
• Clausal form is not required (as for Nevins' and Bledsoe's approaches);
• In the decidable cases of $\forall$-SAT and $\exists\forall$-SAT, explicit worst-case bounds are available (for $\exists\forall$-SAT, the worst case is $c_2 (3L)^L$);
• It is guided by truth valuations and can terminate well in advance of all possible unifications;
• It extends to certain combinations of logic with linear inequalities, so as to subsume both MIP and theorem-proving as two primary applications.
It shares with other approaches:
• The use of the unification algorithm of Robinson
• The need for heuristics to guide unification, and other subroutines, in order to limit search
• The need for precompilation of commonly used logic and/or special representation techniques to implement simple logic (as inheritance in frames).
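The exponential size of the fully instantiated form, noted at the start of this section, is easy to make concrete. The Python sketch below is illustrative only: the quantifier-free formula is a hypothetical text template of our own, and `ground_instances` is our name for the routine that substitutes every vector of constants for the universally quantified variables.

```python
from itertools import product
from typing import Iterable, List, Sequence


def ground_instances(template: str, variables: Sequence[str],
                     constants: Iterable[str]) -> List[str]:
    """All complete instantiations of a quantifier-free template.

    With t constants and n variables there are t ** n instances.
    """
    ordered = sorted(constants)
    instances = []
    for values in product(ordered, repeat=len(variables)):
        binding = dict(zip(variables, values))
        instances.append(template.format(**binding))
    return instances


# Hypothetical formula B(x1, x2) = (not P(x1)) or Q(x1, x2).
instances = ground_instances(
    "(~P({x1}) | Q({x1},{x2}))", ["x1", "x2"], {"a", "b", "c"}
)
count = len(instances)
```

Three constants and two variables already give nine conjuncts, and the count grows as $t^n$, which is what partial instantiation tries to avoid generating in full.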
Before we proceed, we need to review that part of the unification algorithm we shall use [Rob 1965, 1968]. We do so by illustration.

First we ask: what is a most general common instance of $P(x_1, c_2, x_2, x_4)$ and $P(x_2, x_4, x_3, x_5)$? We need: $x_1 = x_2$, $x_4 = c_2$, $x_2 = x_3$, $x_4 = x_5$. The solution is thus: $P(x_1, c_2, x_1, c_2)$. Here $x_1$ can be "renamed", resulting in what is called a variant of the formula. For the first occurrence of $P$, we use the substitution: $x_2 \to x_1$, $x_4 \to c_2$. For the second occurrence, use: $x_2 \to x_1$, $x_3 \to x_1$, $x_4 \to c_2$, $x_5 \to c_2$. When the first occurrence of $P$ is in a larger quantifier-free (Q.F.) formula $B(x_1, x_2, x_3, x_4, \ldots)$ this substitution needs to be performed throughout $B$, in order to "unify" the first occurrence of $P$ with the second (which may be in a different formula). Up to "renaming" of $x_1$ (i.e. up to variants), $P(x_1, c_2, x_1, c_2)$ is a most general unifier (common instance). Any fully-instantiated instance of both has the form $P(c, c_2, c, c_2)$ with the same constant $c$ in the first and third slot.

Next, we ask: what is a most general common instance of $P(x_1, c_1, x_2, x_4)$ and $P(c_2, x_2, x_1, x_5)$? Here there is no common instance. A common instance would require: $x_1 \to c_2$, $x_2 \to c_1$, $x_2 = x_1$. This cannot be done since $c_1$ and $c_2$ are distinct syntactic objects (even if in some models they denote the same object).

We next need to discuss a technical concept unique to our algorithm, specifically: covering of domain vectors by a partially instantiated quantifier-free (Q.F.) formula $B(\vec{x})$. A Q.F. formula $B(x_1, x_2, \ldots, x_n)$ initially covers all domain vectors $(c_1, c_2, \ldots, c_n)$ of constants. During computation, certain variables are fixed at constant values and certain others are set equal, by unifications. E.g., we may later encounter $B(x_1, c_3, x_1, x_5, x_5)$ which covers domain vectors $(c_i, c_3, c_i, c_j, c_j)$ since these describe all the possible complete instantiations $B(c_i, c_3, c_i, c_j, c_j)$.

We will consider procedures which initially start with $B(x_1, x_2, \ldots, x_n)$. Then, due to unifications, a direct specialization (or two) will arise and be added to a list of such instances of $B(x_1, x_2, \ldots, x_n)$. At any point of time, we will have a list $B_1, B_2, \ldots, B_t$ (also denoted $B_1 \wedge B_2 \wedge \cdots \wedge B_t$). Unification of some predicate letter $P$ occurring in $B_i$ and $B_j$ (possibly $i = j$) will be performed, resulting in formulas $B_{t+1}$ respectively $B_{t+2}$, which are direct specializations of $B_i$ resp. $B_j$ and specializations of those formulas of which $B_i$ resp. $B_j$ are direct specializations or specializations. (A formula is also viewed as a specialization of itself).
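The two unification illustrations can be checked mechanically. The following is only a minimal sketch, under the assumption that atoms contain nothing but variables and constants (no function symbols), which is all the illustrations need. The naming convention (`x...` for variables, `c...` for constants) and the helper names `unify` and `substitute` are hypothetical choices made for this example, not part of [Rob 1965].

```python
def is_var(term):
    """Assumed convention: variables are named x1, x2, ...; constants c1, c2, ..."""
    return term.startswith("x")


def resolve(term, subst):
    """Follow bindings until a constant or an unbound variable is reached."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(args1, args2):
    """Return a most general unifier of two argument lists, or None if none exists."""
    if len(args1) != len(args2):
        return None
    subst = {}
    for a, b in zip(args1, args2):
        a, b = resolve(a, subst), resolve(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return None  # two distinct constants are distinct syntactic objects
    return subst


def substitute(args, subst):
    """Apply a substitution to every slot of an argument list."""
    return [resolve(t, subst) for t in args]


# First illustration: a common instance exists.
first = ["x1", "c2", "x2", "x4"]
second = ["x2", "x4", "x3", "x5"]
mgu = unify(first, second)
print(substitute(first, mgu), substitute(second, mgu))

# Second illustration: c1 and c2 would have to coincide, so there is no unifier.
print(unify(["x1", "c1", "x2", "x4"], ["c2", "x2", "x1", "x5"]))
```

The common instance comes out with `x3` in the first and third slots rather than `x1`; this is a variant of $P(x_1, c_2, x_1, c_2)$ in the sense of the text, since only the name of the surviving variable differs.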
The variables of $B_{t+1}$ will be distinct from any of those occurring in $B_1, \ldots, B_t$ and in the other new formula $B_{t+2}$ (by renaming, i.e. by forming variants).

A formula $B_i$ on the list directly covers a domain vector $\vec{c} = (c_1, \ldots, c_n)$ if it covers $\vec{c}$ and no nontrivial specialization of $B_i$ (on the list) covers $\vec{c}$. Given a domain vector $\vec{c}$ and a list $B_1 \wedge B_2 \wedge \cdots \wedge B_t$ one can quickly determine (using the "specialization of" directed graph) if $B_i$ covers $\vec{c}$ and if $B_i$ directly covers $\vec{c}$. Since the list began with $B(x_1, \ldots, x_n)$ which covers all $\vec{c}$, at least one $B_i$ on the list will directly cover $\vec{c}$.

We may remove from the list any $B_i$ which does not directly cover any $\vec{c} \in \hat{C}$. We may do so as this fact is discovered. Note however: "$B_i$ directly covers some ..."
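The notions of "covers" and "directly covers" can be illustrated on argument patterns alone. This is a hedged sketch: it decides specialization by a direct syntactic instance test rather than by the "specialization of" directed graph that the text maintains, and the function names `is_instance` and `directly_covers` are hypothetical.

```python
def is_var(term):
    """Assumed convention: variables are named x..., constants c..."""
    return term.startswith("x")


def is_instance(special, general):
    """True if `special` arises from `general` by substituting for its variables."""
    if len(special) != len(general):
        return False
    binding = {}
    for g, s in zip(general, special):
        if is_var(g):
            # The same variable must receive the same value in every slot.
            if binding.setdefault(g, s) != s:
                return False
        elif g != s:
            return False
    return True


def directly_covers(i, patterns, vector):
    """patterns[i] covers vector, and no nontrivial specialization on the list does."""
    if not is_instance(vector, patterns[i]):
        return False
    for j, other in enumerate(patterns):
        nontrivial = (j != i and is_instance(other, patterns[i])
                      and not is_instance(patterns[i], other))
        if nontrivial and is_instance(vector, other):
            return False
    return True


# The pattern from the text covers exactly the vectors (ci, c3, ci, cj, cj).
pattern = ("x1", "c3", "x1", "x5", "x5")
print(is_instance(("c1", "c3", "c1", "c2", "c2"), pattern))   # covered
print(is_instance(("c1", "c3", "c2", "c2", "c2"), pattern))   # not covered

# A two-element list: the original pattern and one specialization of it.
patterns = [("x1", "x2", "x3"), ("x4", "c1", "x4")]
print(directly_covers(0, patterns, ("c2", "c1", "c2")))
print(directly_covers(1, patterns, ("c2", "c1", "c2")))
```

In the two-element list, the vector $(c_2, c_1, c_2)$ is covered by both patterns but directly covered only by the specialization, while a vector such as $(c_1, c_2, c_3)$ remains directly covered by the original pattern.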
8.3 The algorithm framework
In the algorithm we shall describe, one proceeds by use of a sequence of partial reductions of predicate to propositional logic, motivated by the results in Section 8.1 above. Truth valuations on each partial reduction can be used to guide the computation. In some instances, the truth valuations are sufficient to let us know definitely that $(\forall \vec{x})B(\vec{x})$ is or is not satisfiable, in which case our work can stop. In other instances, we do not reach a definite conclusion, and we must continue on to the next propositional "approximation", which is typically larger. Truth valuations are thus used to help to "fathom" the problem, as done in Operations Research. Eventually, we shall fathom the problem, if for no other reason than the fact that we eventually will reach complete instantiation, a finite but unfortunate outcome.

In order to describe our algorithm, we introduce the technical concepts of "blocked" and "unblocked" valuations. A blocked valuation causes the next approximation to be taken, hence the name (i.e. we are blocked from terminating computation).

Given a list $B_1 \wedge B_2 \wedge \cdots \wedge B_t$ arising from $B(x_1, x_2, \ldots, x_n)$ by a series of unifications, suppose we have a truth valuation of the list which makes it true. The truth valuation is called blocked, if, for some predicate letter $P$, there is a candidate pair, or pair of variants for $P$, the first occurrence of $P$ in $B_i$ and the second in $B_j$ (possibly $i = j$), such that the truth values of the two occurrences are opposite. Otherwise, the truth valuation is unblocked.
In this definition, the truth valuation is purely propositional, and treats e.g. $P(c_1, x_1)$ as a propositional letter having no relation to $P(x_2, c_1)$, even though they have the common instance $P(c_1, c_1)$. An unblocked valuation results in fathoming, as we see in our next result.
Theorem: If an unblocked truth valuation exists which makes $B_1 \wedge B_2 \wedge \cdots \wedge B_t$ true, then $(\forall \vec{x})B(\vec{x})$ is satisfiable.

Proof: Fix $P'$ as any fully-instantiated instance of a predicate letter $P$. Let $\vec{c}$ be any vector of constants of $B(\vec{x})$ such that $B(\vec{c})$ contains $P'$. (Recall that if $B(\vec{x})$ has no constants, one is added). Let $B_i$ be any formula on the list which directly covers $\vec{c}$. In $B_i$ there is an occurrence of $P$ which becomes $P'$ after $B_i$ is fully instantiated by $\vec{c}$. Give $P'$ the truth value assigned by the valuation to this occurrence of $P$.

We claim first: the truth value found for $P'$ does not depend on $\vec{c}$, $B_i$, or the specific occurrence of $P$ chosen in $B_i$. The proof is by contradiction. If the claim were false, there would be two occurrences of $P$ in formulas $B_i$ and $B_j$ with opposite truth values. By hypothesis, these occurrences cannot be variants; consequently, these occurrences would be unifiable, as both have $P'$ as common instance. If either of the formulas resulting from unification were a specialization of $B_i$ on the list, say the result from $B_i$, then $B_i$ could not be a direct cover of $\vec{c}$, which it is. So neither result is on the list as a specialization, and the pair of occurrences of $P$ constitute a candidate pair that blocks the valuation. However, the valuation was assumed to be unblocked.

We now claim: the truth valuation thus constructed for all fully-instantiated predicate letters makes all $B(\vec{c})$ true. This is clear: it makes $B_i$ true and, as $B_i$ is a direct cover of $\vec{c}$, the truth valuation for $B_i$ is that for $B(\vec{c})$. Thus $\bigwedge_{\vec{c} \in \hat{C}} B(\vec{c})$ is true and so $(\forall \vec{x})B(\vec{x})$ is satisfiable. Q.E.D.
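The blocked/unblocked distinction used in this theorem can be sketched on small lists of predicate occurrences. The sketch below is a simplification made for illustration only: it treats every unifiable, oppositely valued pair as blocking and does not check the extra condition that the unification must yield a formula not already on the list as a specialization. The data layout `(predicate, arguments, truth value)` and the names `unifiable` and `blocking_pairs` are hypothetical.

```python
from itertools import combinations


def is_var(term):
    """Assumed convention: variables are named x..., constants c..."""
    return term.startswith("x")


def unifiable(args1, args2):
    """Unifiability of two argument lists made of variables and constants only."""
    subst = {}

    def find(term):
        while is_var(term) and term in subst:
            term = subst[term]
        return term

    for a, b in zip(args1, args2):
        a, b = find(a), find(b)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return False
    return True


def blocking_pairs(occurrences):
    """Pairs of occurrences of one predicate that unify but are valued oppositely."""
    pairs = []
    for (p, a, va), (q, b, vb) in combinations(occurrences, 2):
        if p == q and len(a) == len(b) and va != vb and unifiable(a, b):
            pairs.append((a, b))
    return pairs


# P(c1, c2) valued true and P(x2, x3) valued false: the pair unifies, so blocked.
example_1 = [("P", ("c1", "c2"), True), ("P", ("x2", "x3"), False)]
# P(c1, c2) valued true and P(c3, x3) valued false: c1 and c3 clash, so unblocked.
example_2 = [("P", ("c1", "c2"), True), ("P", ("c3", "x3"), False)]

print(blocking_pairs(example_1))
print(blocking_pairs(example_2))
```

An empty result corresponds to an unblocked valuation, in which case the theorem above lets the computation stop with the answer "satisfiable".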
If a satisfying valuation does not exist, again we can fathom.

Theorem: If no valuation exists which makes $B_1 \wedge B_2 \wedge \cdots \wedge B_t$ true, then $(\forall \vec{x})B(\vec{x})$ is not satisfiable.

Proof: Each $B_i$ covers at least one $\vec{c}_i \in \hat{C}$. $\bigwedge_{i=1}^{t} B(\vec{c}_i)$ arises from a substitution of constants for variables in $B_1 \wedge B_2 \wedge \cdots \wedge B_t$. As the variables of the different $B_i$ are all distinct, from a valuation which makes $\bigwedge_{i=1}^{t} B(\vec{c}_i)$ true, we can obtain one which makes $B_1 \wedge B_2 \wedge \cdots \wedge B_t$ true, by simply making a letter in $B_i$ true or false accordingly as its instance in $B(\vec{c}_i)$ is true or false. Hence $\bigwedge_{i=1}^{t} B(\vec{c}_i)$ is not satisfiable, so neither is $(\forall \vec{x})B(\vec{x})$.
Q.E.D.

The chart in Figure 22 summarizes our algorithm framework.

Conceptual outline:
• Initialize the list as $B(x_1, \ldots, x_n)$.
• Select $\vec{c}_i$ covered by $B_i$. Can $\bigwedge_i B(\vec{c}_i)$ be made true? If not, $(\forall \vec{x})B(\vec{x})$ is not satisfiable.
• If two variants are oppositely valued, instantiate a position in one of them. Otherwise, perform unification on the candidate pair; update the "specialization of" relation; add to the formula list as necessary, keeping distinct variables in newly-added formulas.

Figure 22: Algorithm Outline
This becomes an algorithm, once given a subroutine for truth valuations. In the case of position instantiation in a variant, means can be developed to avoid more than two specializations. Our procedure for treating variants as stated above has the undesirable property that it greatly multiplies the size of the list of $B_i$'s.

Note that the test for a satisfying truth valuation for $\bigwedge_{i=1}^{t} B(\vec{c}_i)$, as cited in the outline, is actually more stringent than the test first described above, in which predicate variables were used.

When the test for a truth valuation is done by tree search, and fathoming does not occur, the previous tree can be updated to form the basis for the next search. More details are given in [Jer 1985d], where this algorithm is presented (the version to appear in fact has an improved algorithm).

We now analyze the worst-case computation time of our algorithm for deciding the satisfiability of $(\forall \vec{x})B(\vec{x})$.
Theorem: For a formula $B(\vec{x})$ of length $L$, this procedure requires time at most $c\,2^{3L^L}$ to determine if $(\forall \vec{x})B(\vec{x})$ is satisfiable.

Proof: In $B(x_1, x_2, \ldots, x_n)$ with $n$ variables $x_i$ and $m$ domain constants $c_j$, there are at most $(n+m)^n$ distinct ways of substituting variables and constants into the slots. As two variants of the same formula do not occur in a list, this bounds the list size. If $w$ propositional letters occur in $B$, then the number of propositional letters in a list is at most $w(m+n)^n$; hence the number of truth valuations is at most $2^{w(m+n)^n}$ at any iteration.

At an iteration, a given formula on the list has a "state" which consists of all the $\vec{c}$ it directly covers. As there are $m^n$ such $\vec{c}$, it has at most $2^{m^n}$ states. At an iteration, the "state" of a list is the Cartesian product of the states of its formulas. Thus there are at most $(m+n)^n 2^{m^n}$ list states. These cannot be repeated, since a unification changes list state (at least one formula no longer directly covers at least one $\vec{c} \in \hat{C}$) in a nonreversible way.

Total time is thus bounded by:

$$c\,2^{w(m+n)^n} \ \text{(work per iteration, assuming complete enumeration of truth valuations)} \ \times \ (m+n)^n 2^{m^n} \ \text{(number of iterations)}$$

$$= c\,2^{m^n + w(m+n)^n + n\log_2(m+n)} \le c\,2^{L^L + L \cdot L^{L-1} + L^2} \le c\,2^{3L^L}$$

(note that $m, n, w \ge 1$ and $w + m + 2n \le L$). Q.E.D.
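The last two inequalities of the proof can be spot-checked numerically for small formula lengths. This is only an arithmetic sanity check, not part of the proof; the function names `exponent` and `bound_holds` are hypothetical, and the side condition $w + m + 2n \le L$ with $m, n, w \ge 1$ is taken from the statement above.

```python
from math import log2


def exponent(w, m, n):
    """Exponent of 2 in the product (work per iteration) x (number of iterations)."""
    return w * (m + n) ** n + m ** n + n * log2(m + n)


def bound_holds(L):
    """Check exponent <= 3 * L**L for every admissible (w, m, n) at length L."""
    return all(
        exponent(w, m, n) <= 3 * L ** L
        for w in range(1, L + 1)
        for m in range(1, L + 1)
        for n in range(1, L + 1)
        if w + m + 2 * n <= L
    )


# The smallest admissible case is w = m = n = 1 with L = 4.
print(exponent(1, 1, 1), 3 * 4 ** 4)
print([bound_holds(L) for L in range(4, 9)])
```

The gap between the exact exponent and $3L^L$ is very large in these small cases, which is consistent with the bound being a coarse worst-case estimate.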
8.4 Illustrations and comments
We next give some simple illustrations of the algorithm.

Example 8.4.1: We wish to determine if $(\exists x_1)P(c_1, x_1) \wedge (\forall x_2)(\forall x_3)\neg P(x_2, x_3)$ is satisfiable. First, we transform it, using Prenex Laws, to $(\exists x_1)(\forall x_2)(\forall x_3)[P(c_1, x_1) \wedge \neg P(x_2, x_3)]$ and then we use the reduction of $\exists\forall$ to $\forall$. We obtain $(\forall x_2)(\forall x_3)[P(c_1, c_2) \wedge \neg P(x_2, x_3)]$. Thus we begin with $P(c_1, c_2) \wedge \neg P(x_2, x_3)$. A unique valuation makes it true ($P(c_1, c_2)$ true, $P(x_2, x_3)$ false) and this valuation is blocked. We can unify by $x_2 \to c_1$, $x_3 \to c_2$, and then we obtain a new list:

$P(c_1, c_2) \wedge \neg P(x_2, x_3) \ \longleftarrow \ P(c_1, c_2) \wedge \neg P(c_1, c_2)$

The arrow indicates "specialization of" (i.e. it is not a search tree). This list has no satisfying valuation: hence the original formula is not satisfiable.

Example 8.4.2: Is $(\exists x_1)P(c_1, x_1) \wedge (\exists x_2)(\forall x_3)\neg P(x_2, x_3)$ satisfiable? We first transform to $(\exists x_1)(\exists x_2)(\forall x_3)[P(c_1, x_1) \wedge \neg P(x_2, x_3)]$ and we use the reduction to obtain: $(\forall x_3)[P(c_1, c_2) \wedge \neg P(c_3, x_3)]$.
Again there is a unique valuation making this true. However, no unification is possible, and the valuation is unblocked. From our fathoming results, the sentence is satisfiable.

In propositional logic we can define positive and negative occurrences of a letter $P$ (following Craig, Beth, Schütte, others). A given occurrence of $P$ as $P$ is positive. Then inductively: if an occurrence of $P$ in $A$ is positive (negative), then it is negative (positive) in $\neg A$; positive (negative) in $A \wedge B$ or in $A \vee B$; negative (positive) in $A \supset B$; and positive (negative) in $B \supset A$.

Next, we observe that, if all occurrences of any specific propositional letter $P$ in the formula $B$ are of the same sign, one can make $B$ true by setting "true" all positive letters and "false" all negative letters. We call the resulting valuation the quick valuation.
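The inductive definition of positive and negative occurrences translates directly into a recursive traversal. The sketch below assumes a hypothetical encoding of formulas as nested tuples (`"var"`, `"not"`, `"and"`, `"or"`, `"imp"`); the names `polarities`, `quick_valuation` and `evaluate` are likewise illustrative choices.

```python
def polarities(formula, sign=+1, found=None):
    """Collect, for each letter, the set of signs (+1 / -1) of its occurrences."""
    if found is None:
        found = {}
    op = formula[0]
    if op == "var":
        found.setdefault(formula[1], set()).add(sign)
    elif op == "not":
        polarities(formula[1], -sign, found)            # negation flips the sign
    elif op in ("and", "or"):
        polarities(formula[1], sign, found)
        polarities(formula[2], sign, found)
    elif op == "imp":
        polarities(formula[1], -sign, found)            # antecedent flips the sign
        polarities(formula[2], sign, found)
    else:
        raise ValueError("unknown connective: %r" % (op,))
    return found


def quick_valuation(formula):
    """Positive letters true, negative letters false; None if some letter is mixed."""
    signs = polarities(formula)
    if any(len(s) > 1 for s in signs.values()):
        return None
    return {letter: (+1 in s) for letter, s in signs.items()}


def evaluate(formula, valuation):
    """Ordinary truth-table evaluation of a formula under a valuation."""
    op = formula[0]
    if op == "var":
        return valuation[formula[1]]
    if op == "not":
        return not evaluate(formula[1], valuation)
    if op == "and":
        return evaluate(formula[1], valuation) and evaluate(formula[2], valuation)
    if op == "or":
        return evaluate(formula[1], valuation) or evaluate(formula[2], valuation)
    if op == "imp":
        return (not evaluate(formula[1], valuation)) or evaluate(formula[2], valuation)
    raise ValueError("unknown connective: %r" % (op,))


def var(name):
    return ("var", name)


# A clause in which every letter has a single sign.
clause = ("or", ("not", var("P1")),
          ("or", var("P2"), ("or", var("P3"), ("not", var("P4")))))
mixed = ("and", var("P"), ("not", var("P")))

print(quick_valuation(clause), evaluate(clause, quick_valuation(clause)))
print(quick_valuation(mixed))
```

When a letter occurs with both signs the sketch returns `None`, since the quick valuation is only defined when every letter has a single sign.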
For instance, on an individual clause $\neg P_1 \vee P_2 \vee P_3 \vee \neg P_4$ the letters $P_2$ and $P_3$ are set "true," and $P_1$ and $P_4$ are set "false." More generally, we observe that if there is any satisfying valuation at all, without loss of generality any letter occurring only positively can be set "true," and any letter occurring only negatively can be set "false." This principle saves significant time in looking for satisfying valuations, since each $B_i$ typically contains many letters occurring only in it.

For clausal form $B(x_1, \ldots, x_n)$, each $B_i$ is of clausal form. In the two clauses

$B_i = \cdots \vee P(x_1, c_3, x_4) \vee \cdots$
$B_j = \cdots \vee \neg P(c_1, x_5, x_6) \vee \cdots$

we can assume that, if there is any satisfying truth valuation, $P(x_1, c_3, x_4)$ is "true" and $P(c_1, x_5, x_6)$ is "false". This pair of occurrences can unify, and will be candidates for unification, if that does not result in a specialization of either $B_i$ or $B_j$.

Recall that classical resolution-based theorem-proving proceeds from clausal form to unify and resolve any opposite occurrences of predicates, if they can unify at all. We reach the following conclusion: When classical methods use a nontrivial unification, they proceed as if the list were satisfiable. However, when the unification is trivial and produces no new bindings (so that the pair of occurrences would not be a candidate pair), the operations of residue formation can be viewed as part of a procedure to decide if the list is satisfiable. Thus, resolution procedures mix features of the quick valuation with features related to the tree search for a satisfying valuation.

As mentioned in our summary, the tree search procedure need not necessarily be implemented via Operations Research techniques. Linear programming can be replaced by list processing heuristics. However, linear-programming-based methods have a versatility, due to the ability to change the representation of logic problems from the standard one. We next give one illustration of this, using the disjunctive methods.

Given propositional letters $P_1, \ldots, P_t$ we can assign variables $z_j = z(P_j)$ to these and use the standard imbedding:
$P_j$ is viewed as the b-MIP.r set: $z_j = 1$, all $z_i \in \{0, 1\}$
$\neg P_j$ is viewed as: $z_j = 0$, all $z_i \in \{0, 1\}$
Then the basic constructions of Lectures 3 and 4 can be used to obtain representations of any propositional form in $P_1, \ldots, P_t$. The question "is the form satisfiable?" converts to "is this b-MIP.r set nonempty?" and satisfying truth valuations correspond to elements of the set (i.e. feasible solutions). For example, the union construction for $\neg P_1 \vee P_2 \vee P_3$ gives:
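$$z_j = z_j^{(1)} + z_j^{(2)} + z_j^{(3)}, \quad j = 1, 2, 3$$
$$z_1^{(1)} = 0 \cdot m_1, \quad 0 \le z_2^{(1)} \le 1 \cdot m_1, \quad 0 \le z_3^{(1)} \le 1 \cdot m_1, \quad z_2^{(1)}, z_3^{(1)} \text{ binary}$$
$$z_2^{(2)} = 1 \cdot m_2, \quad 0 \le z_1^{(2)} \le 1 \cdot m_2, \quad 0 \le z_3^{(2)} \le 1 \cdot m_2, \quad z_1^{(2)}, z_3^{(2)} \text{ binary}$$
$$z_3^{(3)} = 1 \cdot m_3, \quad 0 \le z_1^{(3)} \le 1 \cdot m_3, \quad 0 \le z_2^{(3)} \le 1 \cdot m_3, \quad z_1^{(3)}, z_2^{(3)} \text{ binary}$$
$$m_1 + m_2 + m_3 = 1, \quad \text{all } m_i \text{ binary}$$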
We seek to simplify the above, and we note: Clearly, in the LR (Linear Relaxation) we have:

$$(1 - z_1) + z_2 + z_3 \ge m_1 + m_2 + m_3 = 1$$

Actually, $(1 - z_1) + z_2 + z_3 \ge 1$ is the only constraint needed in the LR in addition to $0 \le z_j \le 1$ for all $j = 1, 2, 3$. (This is an exercise, similar to the logic example in Lecture 2). So here we have only recovered the standard representation. However, in our next example, something new will occur. We next consider the (inconsistent) logic constraint
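$P_1 \vee P_2$, $\quad \neg P_1 \vee P_2$, $\quad P_1 \vee \neg P_2$, $\quad \neg P_1 \vee \neg P_2$

i.e. $(P_1 \vee P_2) \wedge (\neg P_1 \vee P_2) \wedge (P_1 \vee \neg P_2) \wedge (\neg P_1 \vee \neg P_2)$. This will formulate in the standard imbedding as:

$z_1 + z_2 \ge 1$
$(1 - z_1) + z_2 \ge 1$
$z_1 + (1 - z_2) \ge 1$
$(1 - z_1) + (1 - z_2) \ge 1$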
The LR is consistent ($z_1 = z_2 = \frac{1}{2}$). It can be found inconsistent by the LR after branching on either $P_1$ (i.e. $z_1 = 0$ vs. $z_1 = 1$) or on $P_2$ (i.e. $z_2 = 0$ vs. $z_2 = 1$). Branching on $P_1$ is an implementation of this reformulation.
Here is a second reformulation, via the distributive law applied to (PI V Pa):
The composite construction for the first conjunct gives:
Clearly, the above forces $z_2 = 1$. Also from the clauses $(P_1 \vee \neg P_2) \wedge (\neg P_1 \vee \neg P_2)$ we have: $z_1 - z_2 \ge 0$ (so $z_1 = 1$), $-z_1 - z_2 \ge -1$. This is also inconsistent although "less distributed." In this way, the linear relaxation can be tightened via a new representation, which is an improvement on the standard one (but here, not as powerful as fully branching). For this logic constraint, the inconsistency can also be detected via three resolutions, as we saw in Lecture 5.
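The behaviour of the standard imbedding of the four clauses can be verified by hand arithmetic, which the following sketch automates with exact fractions. It is a minimal illustration rather than an LP solver: since only two variables occur, fixing $z_1$ reduces the linear relaxation to an interval for $z_2$. The names `CLAUSES`, `satisfies` and `z2_interval` are hypothetical.

```python
from fractions import Fraction as F

# Each row (a1, a2, rhs) encodes a1*z1 + a2*z2 >= rhs, i.e. the four clauses
# z1 + z2 >= 1, (1 - z1) + z2 >= 1, z1 + (1 - z2) >= 1, (1 - z1) + (1 - z2) >= 1
# after moving constants to the right-hand side.
CLAUSES = [(1, 1, 1), (-1, 1, 0), (1, -1, 0), (-1, -1, -1)]


def satisfies(z1, z2):
    """True if (z1, z2) satisfies all four clause inequalities."""
    return all(a1 * z1 + a2 * z2 >= rhs for a1, a2, rhs in CLAUSES)


def z2_interval(z1):
    """Interval [lo, hi] of z2 in [0, 1] feasible for the LR once z1 is fixed."""
    lo, hi = F(0), F(1)
    for a1, a2, rhs in CLAUSES:
        bound = F(rhs - a1 * z1) / a2      # a2 is +1 or -1 in every row
        if a2 > 0:
            lo = max(lo, bound)
        else:
            hi = min(hi, bound)
    return lo, hi


half = F(1, 2)
print(satisfies(half, half))                                   # LR point
print([satisfies(a, b) for a in (0, 1) for b in (0, 1)])       # binary points
print(z2_interval(0), z2_interval(1), z2_interval(half))       # branching on P1
```

An interval with `lo > hi` is empty, so both branches $z_1 = 0$ and $z_1 = 1$ are infeasible in the relaxation, while the unbranched relaxation admits exactly the point $z_1 = z_2 = \frac{1}{2}$ along the slice $z_1 = \frac{1}{2}$.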
A comparison of resolution vs. branching for propositional logic will be given in Lecture 9. Here we saw the value of an ”intermediate distribution.” In a complete distribution, where the or connective becomes outermost, limited experimentation indicates that it is more efficient in the LR to subdivide into subprograms (i.e.
”branch”) than to keep the problem in one LP via the union construction. However, simplifications in the formulation can occur which may change this recommendation.
8.5 A generalization: predicate logic together with linear constraints
Let a $t$-dimensional b-MIP.r set $S$ be given. To each co-ordinate $z_j$ of an element $(z_1, z_2, \ldots, z_t) \in R^t$, we assign a variable $z_j(\vec{x})$ with "variable index" $\vec{x} = (x_1, \ldots, x_n)$. We allow that $z_j(\vec{x})$ actually only depends on some elements of $\vec{x}$, and possibly is constant (i.e. does not depend on $\vec{x}$). In addition, a set $C = \{c_1, \ldots, c_m\}$ of domain constants $c_i$ is given.

This includes the case of predicate logic by having $z_j(\vec{x})$ be the binary variable for the predicate letter $P_j$ occurring in $B(\vec{x})$, and by letting $S$ be the composite construction for a linear representation of $B(\vec{x})$. It also includes MIP (take $C = \{c_1\}$) and some MIPs with repetitive structural patterns in the constraints.

The previous techniques go over. In place of a satisfying truth valuation, we seek a solution to the constraints. Unification is now triggered by candidate pairs where the variables, whose indexing can unify, have different values (however small). Here unification provides dynamic activation of constraints, when there is a large number (potentially exponential) of these.

Types can easily be added, in which certain co-ordinates of $\vec{x} = (x_1, x_2, \ldots, x_n)$ can take only a certain set of constants as values. Certain changes in the concept of unification accomplish this.

For example, let $z_1(x_1, x_2)$ intuitively mean that customer $x_1$ is serviced from distribution center $x_2$. Let $z_2(x_2)$ mean that DC $x_2$ is open. We have the constraints:
i.e. a customer is serviced only by an open DC, and also
i.e. a customer is serviced by exactly one DC. However, the formulation device does not handle constraints where the coefficients or constants change with $\vec{x}$. These must be listed out individually. Thus it is more restrictive than the ordinary usage of "for all" regarding a set of MIP constraints.
LECTURE 9
SPATIAL EMBEDDINGS FOR LINEAR AND LOGIC STRUCTURES

Summary: The predicate logic is concerned with models, specifically, what is true in all models. To treat any specific model, axioms for it and its database of relational information must be added to predicate logic. An alternative approach would be the construction of a representation for any specific model. This representation should be in Euclidean space in order to have access to polyhedral theory and to many of the methods of Mathematical Programming, such as linear programming algorithms, decomposition methods, etc. These methods may in turn be studied in specific applications, with a view toward simplification to list processing routines.

In this lecture, we introduce the concept of an embedding (or: imbedding) of a model structure into Euclidean space, and we cite some basic results regarding embeddings. Surprisingly, a given model structure, when embeddable, has a wide variety of fundamentally different embeddings. This in turn leads into several other issues, including: alternative MIP representations, a generalization of Benders' partitioning, and the relation between branching and resolution as means of splitting. The material here is reported in [Jer 1985b] and [Jer 1986b] and formed the basis for our technical remarks in an earlier essay in OPTIMA [Jer 1985e].
9.1 Definition of an Embedding
Our motivation in this lecture is to achieve a spatial representation of a model structure which is, in principle, adequate for optimization subject to propositional constraints. We will also obtain results for predicate constraints, but these are (at this time) theoretical and require exponential space in general.

The canonical example to keep in mind is the standard embedding of propositional logic on $n$ letters $P_1, \ldots, P_n$ into $R^n$ by:

$P_j \to z_j = 1$, $z_i \in \{0, 1\}$ all $i$
$\neg P_j \to z_j = 0$, $z_i \in \{0, 1\}$ all $i$
For this lecture, it is important to note the change of focus from satisfiability and validity to specific models. Embeddings will provide us with:

• A conceptual unification of logic-based Artificial Intelligence with Operations Research.
• The possibility of using Operations Research algorithms to aid logic processing (e.g. MIP reformulations).
• The possibility of using logic algorithms and ideas to improve MIP calculations.
• The ability to add optimization to logic, as well as "at least ..." and "at most ..." conditions.
• The ability to use Euclidean space in the natural manner for those parts of a model structure which are already spatial (e.g. flow balances, budget constraints, etc.)
As regards the last item above, on flow balances, etc., we note that present techniques in Artificial Intelligence for "planners" have great difficulty in going beyond pure logic, to treat these kinds of constraints to which Operations Research is well-suited. Typical planners, utilizing various heuristics for resource allocation and reservation, are often unable to fathom an infeasible plan until it becomes quite detailed.

We now begin the technical development. By a model $M$ we shall mean a structure $M = (X, D, W; P_1, P_2, \ldots, P_s)$ where $X \subseteq R^n$ is a subset of some Euclidean space $R^n$, $D$ is a set, $W \subseteq X \times D$ is a nonempty subset of the Cartesian product of $X$ and $D$, and each $P_j \subseteq W$ is a predicate on $W$. We call $X$ the "spatial" coordinates of the model $M$, and $D$ the "logic" or "database" coordinates. We define $\neg P_j$ by relative complement: $\neg P_j = W \setminus P_j$.
An embedding will involve nonempty b-MIP.r sets $IMB(W), IMB(P_j) \subseteq R^t$ satisfying conditions to be specified below. The dimension $t$ of an embedding need have no relation to the dimension $n$ of $X$ or to the number of predicates on $M$.

First of all, $IMB$ shall extend to a mapping of propositional forms in $P_1, P_2, \ldots, P_s$ by the conditions:

$IMB(\neg P_j) = IMB(W) \setminus IMB(P_j)$

where $IMB(\neg P_j)$ shall also be b-MIP.r with $rec(IMB(P_j)) = rec(IMB(W))$ ($= rec(IMB(\neg P_j))$ if $\neg P_j \ne \emptyset$) for all $j$, and also: for propositional forms $L_1, L_2$ we shall require

$IMB(L_1 \vee L_2) = IMB(L_1) \cup IMB(L_2)$
$IMB(L_1 \wedge L_2) = IMB(L_1) \cap IMB(L_2)$
$IMB(L_1 \supset L_2) = IMB(\neg L_1 \vee L_2)$

and for propositional $L$ we will have

$IMB(\neg L) = IMB(W) \setminus IMB(L)$

where $IMB(\neg L)$ shall be b-MIP.r with $rec(IMB(\neg L)) = rec(IMB(W))$ if $IMB(\neg L) \ne \emptyset$.

Note that here $IMB(\neg L)$ is the complement relative to $IMB(W)$, and it is not the set-theoretic complement relative to $R^t$. The latter typically is not b-MIP.r. For example, in the standard embedding of propositional logic, the complement of $z_j = 1$ relative to the vertices of the hypercube consists of those vertices with $z_j = 0$.

In addition to the previous requirements of an embedding, we shall also require that the spatial part of the model shall imbed in space in an "as is" manner. Precisely put:

(*) For all vectors $c \in R^n$ and all propositional forms $L$, $\inf\{cx \mid (x, d) \in L\} = \min\{cx \mid (x, y) \in IMB(L)\}$
In particular $L \ne \emptyset$ iff $IMB(L) \ne \emptyset$. We note the following, with regard to the preceding definition of an embedding:

• Imbeddings concern relations $P_j \subseteq W$ and do not cite individuals in $M$. Each instantiation of concern is viewed as a new relation $P_j(\vec{c})$ and $P_j = \bigcup_{\vec{c} \in \hat{C}} P_j(\vec{c})$.
• The dimension $t$ of $R^t$, the space of the embedding, need not count auxiliary variables which are used in a representation for $IMB(L)$. Which ones are or are not counted is an option.
• Embeddings are designed to treat all possible (propositional) queries regarding $M$. They provide a general conceptual framework. However, when only a limited list of queries is of interest, many simplifications are needed in any practical setting.
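For the canonical example, the standard embedding of propositional logic, the defining conditions on $IMB$ can be carried out literally on sets of hypercube vertices. The sketch below is an illustration under that assumption only (an entirely "database" model with $IMB(W)$ equal to all 0/1 vertices); the tuple encoding of formulas and the function name `imb` are hypothetical.

```python
from itertools import product


def imb(form, n):
    """Vertices of the n-cube assigned to a propositional form on letters P1..Pn."""
    W = set(product((0, 1), repeat=n))           # IMB(W): all vertices
    op = form[0]
    if op == "P":                                # IMB(Pj): vertices with z_j = 1
        j = form[1]
        return {z for z in W if z[j - 1] == 1}
    if op == "not":                              # complement relative to IMB(W)
        return W - imb(form[1], n)
    if op == "or":
        return imb(form[1], n) | imb(form[2], n)
    if op == "and":
        return imb(form[1], n) & imb(form[2], n)
    if op == "imp":                              # L1 > L2 is treated as (not L1) or L2
        return imb(("or", ("not", form[1]), form[2]), n)
    raise ValueError("unknown connective: %r" % (op,))


P1, P2, P3 = ("P", 1), ("P", 2), ("P", 3)

# The clause (not P1) or P2 or P3 excludes exactly one vertex of the 3-cube.
clause = ("or", ("not", P1), ("or", P2, P3))
print(sorted(imb(clause, 3)))

# The inconsistent four-clause constraint of Lecture 8 embeds as the empty set.
four_clauses = ("and",
                ("and", ("or", P1, P2), ("or", ("not", P1), P2)),
                ("and", ("or", P1, ("not", P2)), ("or", ("not", P1), ("not", P2))))
print(imb(four_clauses, 2))
```

The second result illustrates the remark that a form is nonempty iff its image under $IMB$ is nonempty: the unsatisfiable constraint is sent to the empty set of vertices.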
Define an elementary conjunct to be $L' = \bigwedge_{j} P_j^{\sigma(j)}$ where $\sigma(j) \in \{-1, 1\}$ for all $j$ (with $P_j^{1} = P_j$ and $P_j^{-1} = \neg P_j$). The following is a basic result, which characterizes embeddable model structures:

Theorem: Let $M = (X, D, W; P_1, \ldots, P_s)$ be a model with $X \subseteq R^n$. If $n = 0$, $M$ is imbeddable. If $n \ge 1$, $M$ is imbeddable iff for every nonempty elementary conjunct $L'$, $clconv(T(L'))$ is a polyhedron and, if $T(L') \ne \emptyset$, $rec[clconv(T(L'))]$ is independent of $L'$, where:

$T(L') = \{x \mid \text{for some } d \in D, \ (x, d) \in L'\}$
As we will see in the next section, embeddability is a concept which includes representability but is broader. Embeddability is not restricted to unions of polyhedra, even in the purely spatial case ($D = \emptyset$).

For a propositional form $L$, $IMB(L)$ is a subset of Euclidean space. Since it is b-MIP representable, it will have a representation $\underline{IMB(L)}$, which we designate by underlining. We next explore a phenomenon which arises, when the embeddings are represented in a way which "commutes" with the basic constructions of Lecture 3. Given an embedding of $M$, there will be representations of elementary conjuncts. We can then extend these to representations of general propositional forms by:

$\underline{IMB(L_1 \vee L_2)} = \underline{IMB(L_1)} \vee \underline{IMB(L_2)}$
$\underline{IMB(L_1 \wedge L_2)} = \underline{IMB(L_1)} \wedge \underline{IMB(L_2)}$
$\underline{IMB(L_1 \supset L_2)} = \underline{IMB(\neg L_1)} \vee \underline{IMB(L_2)}$
$\underline{IMB(\neg(L_1 \vee L_2))} = \underline{IMB(\neg L_1)} \wedge \underline{IMB(\neg L_2)}$
$\underline{IMB(\neg(L_1 \wedge L_2))} = \underline{IMB(\neg L_1)} \vee \underline{IMB(\neg L_2)}$

Let $\underline{IMB(L)}$ be abbreviated by $\underline{L}$. We have the following result.
Theorem: (Existence of a sharp embedding). If model $M = (X, D, W; P_1, \ldots, P_s)$ is embeddable, it has an embedding with this property: for all $c \in R^n$ and all propositional forms $L$,

$\inf\{cx \mid (x, d) \in L\} = \min\{cx \mid (x, y) \in Rel(\underline{L})\}$

Recall that, for typical representations $\underline{L_1}$ and $\underline{L_2}$ of b-MIP.r sets $L_1$ and $L_2$, $\underline{L_1} \wedge \underline{L_2}$ is not a sharp representation of $L_1 \wedge L_2$ even when $\underline{L_1}$ and $\underline{L_2}$ are sharp. However, the theorem above asserts, as a special case, that $\underline{L_1} \wedge \underline{L_2}$ will be sharp provided that the embedding is suitably chosen and sharp representations of the elementary conjuncts are used. In particular, any representation can be "redone", in theory, to achieve this kind of additional sharpness. The next result shows that, in order to achieve such a sharp embedding, typically a very large number of auxiliary variables are needed.
Theorem: (Dimension restriction on sharp embeddings) Let $M = (X, D, W; P_1, \ldots, P_s)$ be a model with an embedding and fixed representations $\underline{IMB(W)}$, $\underline{IMB(P_j)}$, $\underline{IMB(\neg P_j)}$. Let $q$ be the dimension of the auxiliary variables in the embedding, $n$ the dimension of $x$, and let $q^*$ be the number of nonempty elementary conjuncts $L'$. Then if $q < q^* - n - 1$, there are disjunctive normal forms $L_1$ and $L_2$ with

(i.e. the embedding cannot be sharp)

As an illustration of the dimension restriction, consider a propositional logic on $s$ letters $P_1, \ldots, P_s$. For general propositions with no specific logical restrictions (i.e. for the free boolean algebra), there are $q^* = 2^s$ nonempty elementary conjuncts $L'$, and $n = 0$. If the dimension $q$ of the auxiliary variables $y$ of the imbedding satisfies $q < 2^s - 1$, the embedding cannot be sharp. In particular (as is well known), the standard embedding into dimension $s$ is not sharp. A sharp embedding requires at least dimension $2^s - 1$.

In fact, one can obtain a sharp embedding of this propositional logic in dimension $2^s - 1$. One uses the unit vectors together with the origin, each point representing an elementary conjunct. Properties of the simplex then will ensure a sharp embedding (we leave details to the reader).

Current research on embeddings focuses on dynamic means of "increasing the sharpness" of an embedding. In specific applications, only a few logical forms are of interest, for only a few criterion functions. This fact allows for the possibility of "adequate sharpness" at lower dimensions of auxiliary variables. The dimension restriction above, which typically gives an exponential number of variables, is a restriction only if one is truly concerned with all possible queries and all possible criterion functions.
9.2 Illustrations of embeddings
Example 9.2.1: An ordinary MIP constraint set

$W = \{x \mid Ax \ge b, \ x_j \in \{0, 1\} \text{ for } j \in K\}$

can be viewed as a model $M = (W, \emptyset, W; P_1, \ldots, P_s)$ with possibly $s = 0$. Here certain of the $P_j$ may be defined by $P_j = \{x \in W \mid x_j = 1\}$, with others defined by other conditions. Put $IMB(W) = W$, $IMB(P_j) = P_j$. Models of this type are "entirely spatial" ($D = \emptyset$) and already embedded in Euclidean space as given.

A rule imposed upon elements of $W$:

$L_1$ and ... $L_t$ implies $L_0$

can be viewed as a domain restriction to $IMB(L)$ where $L = \neg L_1 \vee \cdots \vee \neg L_t \vee L_0$. Also constraints with strict inequalities among $Ax \ (\ge, >) \ b$ can be treated, as long as $clconv(T(L'))$ is polyhedral (with the recession condition) for elementary conjuncts $L'$. Thus, nonrepresentable $S \subseteq R^n$ may be nevertheless embeddable. However for nonrepresentable $P_j$ we cannot put $IMB(P_j) = P_j$.
Example 9.2.2: A propositional logic on $s$ letters $P_1, \ldots, P_s$ corresponds to a model structure $M = (\emptyset, D, D; P_1, \ldots, P_s)$ which is "entirely database", in which $D$ consists of all truth valuations on these letters and $P_j$ is identified with all valuations which make it true.

A predicate logic on $s$ letters $P_1, \ldots, P_s$ is viewed as a propositional logic in letters $P(\vec{x})$, where $\vec{x}$ is a vector of constants of $C$ and variables. Then $\forall$ satisfiability can be treated by reduction of $(\forall \vec{x})B(\vec{x})$ to $\bigwedge_{\vec{c} \in \hat{C}} B(\vec{c})$. However our technique of Lecture 8 (partial instantiation) does not use this embedding directly. It proceeds instead via a series of embeddings of larger and larger substructures, until "enough" is embedded to answer a specific query (dynamic activation of an embedding).

A database can be viewed as a model restriction on predicate logic, in which only those truth valuations making $P_j(\vec{c})$ true (respectively false) are allowed if $P_j(\vec{c})$ (resp. $\neg P_j(\vec{c})$) is in the database. Here $\exists$ validity is treated via $\forall$ satisfiability for the model restriction.

9.3 Results for predicate logic embeddings
For a model $M = (X, D, W; P_1, \ldots, P_s)$ let a quantifier-free propositional form $L = B(\vec{y})$ be given, where $\vec{y}$ represents a list of variables $y_h$ inserted for co-ordinates of the relations $P_j$ (where the same $y_h$ can be used for different coordinate positions in $L$). A quantified predicate may then be formed as e.g.

which represents a relation on the Cartesian product of the co-ordinate domains for the free variables (possibly different from $W \subseteq X \times D$). For any finite collection of quantified variables, a model structure $M'$ can be specified over which these are relations. We have the following result (which can be strengthened; see [Jer 1985b]).

Theorem: Let a model $M = (X, D, W; P_1, \ldots, P_s)$ be given. Suppose that $D$ is finite and that, for each $d \in D$, these sets are all b-MIP.r, with zero recession:
Then any finite collection of quantified predicates yields an embeddable model $M'$.

It is not practical in general to construct embeddings for quantified predicates "at once". Instead, dynamic activation can be used. We illustrate this process for two common cases.

Example 9.3.1: (Benders' partitioning extended to polyhedral union sets; see [Bend 1962]). Write $S = P_1 \cup P_2 \cup \cdots \cup P_t$ where each $P_i$ is a polyhedron and let $y$ be a vector of variables drawn from $x = (y, z)$. Write $A^{(i)}x = C^{(i)}y + D^{(i)}z$. We have:

$$(\exists y)S = (\exists y)P_1 \cup (\exists y)P_2 \cup \cdots \cup (\exists y)P_t = \{z \mid C^{(1)}y + D^{(1)}z \ge b^{(1)} \text{ for some } y\} \cup \cdots \cup \{z \mid C^{(t)}y + D^{(t)}z \ge b^{(t)} \text{ for some } y\}$$

and for each $i = 1, \ldots, t$

$$(\exists y)P_i = \{z \mid \text{for all basic feasible } w \ge 0 \text{ with } wC^{(i)} = 0, \ \textstyle\sum_k w_k = 1, \text{ we have } wD^{(i)}z \ge wb^{(i)}\}$$
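The description of $(\exists y)P_i$ through basic feasible multipliers $w$ can be made concrete in the smallest case. The sketch below assumes a single polyhedron ($t = 1$) with scalar $y$ and scalar $z$, so that the rows are $c_k y + d_k z \ge b_k$; under that assumption the basic feasible solutions of $\{w \ge 0, \ wC = 0, \ \sum_k w_k = 1\}$ are the unit vectors on rows with $c_k = 0$ together with the weighted pairs of one row with $c_k > 0$ and one with $c_l < 0$. The function names and the sample data are hypothetical.

```python
from fractions import Fraction as F


def basic_multipliers(c):
    """Basic feasible w >= 0 with sum_k w_k c_k = 0 and sum_k w_k = 1 (scalar y)."""
    K = len(c)
    ws = []
    for k in range(K):
        if c[k] == 0:                        # a row not involving y stands alone
            w = [F(0)] * K
            w[k] = F(1)
            ws.append(w)
    for k in range(K):
        for l in range(K):
            if c[k] > 0 and c[l] < 0:        # pair rows with opposite signs of c
                w = [F(0)] * K
                w[k] = F(-c[l], c[k] - c[l])
                w[l] = F(c[k], c[k] - c[l])
                ws.append(w)
    return ws


def projection_cuts(c, d, b):
    """Cuts (coef, rhs), meaning coef * z >= rhs, one per basic multiplier."""
    cuts = []
    for w in basic_multipliers(c):
        coef = sum(wk * dk for wk, dk in zip(w, d))
        rhs = sum(wk * bk for wk, bk in zip(w, b))
        cuts.append((coef, rhs))
    return cuts


def in_projection(z, cuts):
    """True if z satisfies every cut, i.e. some y completes z to a point of P."""
    return all(coef * z >= rhs for coef, rhs in cuts)


# Sample polyhedron: y + z >= 2, -y + z >= 0, -z >= -5.
c, d, b = [1, -1, 0], [1, 1, -1], [2, 0, -5]
cuts = projection_cuts(c, d, b)
print(cuts)
print([in_projection(z, cuts) for z in (F(1, 2), 1, 5, 6)])
```

For the sample data the two cuts are $-z \ge -5$ and $z \ge 1$, so the projection is the interval $1 \le z \le 5$; this agrees with eliminating $y$ directly from $2 - z \le y \le z$. Enumerating all multipliers at once, as done here, is exactly what dynamic activation tries to avoid in larger problems.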
From polyhedral theory, $(\exists y)P_i$ is a polyhedron. Dynamic activation here consists of computing the basic feasible $w \ge 0$ as needed. The case $t = 1$ is Benders' partitioning.

At a given point in time, we have an "outer approximation" to $(\exists y)S$ obtained by adding a set of previously generated "cuts" which place restrictions on $z$. We optimize over these cuts of the "master problem", obtaining a solution $z^*$. If this leads to a $z^*$ for which there are no new "cuts" for at least one $P_i$, we may stop (i.e. without completing the calculation of $(\exists y)S$). Otherwise, we find a $w^*$ which cuts off $z^*$, i.e. for which $w^* D^{(i)} z^* < w^* b^{(i)}$ for at least one $i$. We add $w^* D^{(i)} z \ge w^* b^{(i)}$ to our list of cuts for the "master problem", and continue.

As regards the criterion function, without loss of generality, it can be taken to be among the variables of $z$. Indeed, we may always start by adding a new $z$ variable set equal to the criterion function.

Example 9.3.2: With notation as before, we wish to compute $(\forall y)S$, or more generally $(\forall y \in P)S$, where $P = \{y \mid Cy \ge b\}$ is a polyhedron. We have
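$$(\forall y \in P)S = (\forall y \in P)(P_1 \cup P_2 \cup \cdots \cup P_t) = \neg(\exists y \in P)(P_1^c \cap P_2^c \cap \cdots \cap P_t^c)$$

where

$$P_i^c = \{(y,z) \mid \text{some constraint in } C^{(i)} y + D^{(i)} z \ge b^{(i)} \text{ fails}\} = \{(y,z) \mid \text{for some } k,\ C_k^{(i)} y + D_k^{(i)} z < b_k^{(i)}\}$$

and $C_k^{(i)}$ = $k$-th row of $C^{(i)}$, $D_k^{(i)}$ = $k$-th row of $D^{(i)}$, $b_k^{(i)}$ = $k$-th element of $b^{(i)}$.

Let $g$ be a function from $\{1, \ldots, t\}$ such that $g(i)$ is a row index of $C^{(i)}$ for all $i$. Then:

$$(\exists y \in P)(P_1^c \cap \cdots \cap P_t^c) = \bigcup_g \{z \mid \text{for some } y \in P,\ C_{g(i)}^{(i)} y + D_{g(i)}^{(i)} z < b_{g(i)}^{(i)} \text{ for all } i\}$$

Now note that, for a given $g$ and $z$, there is a $y \in P$ with $C_{g(i)}^{(i)} y + D_{g(i)}^{(i)} z < b_{g(i)}^{(i)}$ for all $i$ iff for every basic feasible solution to

$$uC = \sum_i w_i C_{g(i)}^{(i)}, \quad u \ge 0,\ w \ge 0, \quad \sum_i w_i + \sum_j u_j = 1$$

we have $ub \le 0$ if $w = 0$ and we have $\sum_i w_i D_{g(i)}^{(i)} z < \sum_i w_i b_{g(i)}^{(i)} - ub$ if $w \ne 0$. Thus if $P \ne \emptyset$ (so $uC = 0$, $u \ge 0$ implies $ub \le 0$) we have

$$(\forall y \in P)S = \bigcap_g \Big\{z \;\Big|\; \text{for some basic feasible solution } (u, w) \text{ to the system above with } w \ne 0 \text{ we have } \sum_i w_i D_{g(i)}^{(i)} z \ge \sum_i w_i b_{g(i)}^{(i)} - ub \Big\}$$

We obtain an inner approximation as more b.f.s. are generated. In this context, if $z^*$ is the current solution to the "master problem" we add a new alternative inequality if, for some function $g$, we find a b.f.s. $(u^*, w^*)$ with a nonredundant inequality of the form $\sum_i w_i^* D_{g(i)}^{(i)} z \ge \sum_i w_i^* b_{g(i)}^{(i)} - u^* b$. If we find no such basic feasible solution, we may stop with the current $z^*$.

Here in the universal (spatial) quantifier case, the search is more extensive, and can be computationally very demanding. Of course, this is due to the nature of this quantifier.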
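As a check on the universal-quantifier construction, here is a small worked instance of our own (it does not appear in the lecture). We take $t = 1$, scalar $y$ and $z$, $P = \{y \mid 0 \le y \le 1\}$ and $P_1 = \{(y, z) \mid y \le z\}$, so that $(\forall y \in P)S$ should come out as $\{z \mid z \ge 1\}$. The sign convention assumed is that multipliers $u \ge 0$ on the rows of $Cy \ge b$ and $w \ge 0$ on the selected rows satisfy $uC = \sum_i w_i C_{g(i)}^{(i)}$.

```latex
\begin{align*}
P   &= \{\, y \mid Cy \ge b \,\}, &
C   &= \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad
b    = \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \\
P_1 &= \{\, (y,z) \mid -y + z \ge 0 \,\}, &
C^{(1)} &= (-1), \quad D^{(1)} = (1), \quad b^{(1)} = (0).
\end{align*}
% There is only one row in P_1, so g(1) = 1. The multiplier system is
\[
u_1 - u_2 = -w_1, \qquad u_1, u_2, w_1 \ge 0, \qquad w_1 + u_1 + u_2 = 1 .
\]
% Adding and subtracting the two equations forces u_2 = 1/2 and u_1 + w_1 = 1/2,
% so the basic feasible solutions are (u_1, u_2, w_1) = (1/2, 1/2, 0) and (0, 1/2, 1/2).
% For the one with w = 0:  ub = -1/2 \le 0, consistent with P being nonempty.
% For the one with w \ne 0:
\[
w_1 D^{(1)} z \;\ge\; w_1 b^{(1)} - ub
\quad\Longleftrightarrow\quad
\tfrac12 z \;\ge\; 0 - \bigl(0 \cdot 0 + \tfrac12 \cdot (-1)\bigr) = \tfrac12
\quad\Longleftrightarrow\quad
z \ge 1 .
\]
```

The single alternative inequality $z \ge 1$ is exactly the condition that every $y \in [0, 1]$ satisfies $y \le z$.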
9.4 Logic as pre-processing routines for MIP: an example via the DP/DPL algorithm
In this section, we address this issue: to what extent do the DP and DPL algorithms generalize to MIP processing? During our discussion, we will also uncover an interesting relationship between branching and resolution.

For simplicity, assume that each proposition $P_j$ is replaced by a polyhedron. (Actually, $P_j$ can be a general b-MIP.r set; see our fuller discussion in [Jer 1986b]). In general, the variables in each $P_j$ are shared. We also consider the Cartesian Product (CP) case, in which the variables $z^{(j)}$ occurring in $P_j$ do not occur in $P_i$ for $i \ne j$. (The CP case generalizes truth valuations). In this context, there also are polyhedral $P_j$ and $\neg P_j$ with
$$P_j \cap \neg P_j = \emptyset,$$

there is a domain constraint

$$z \in P_j \cup \neg P_j \quad \text{all } j,$$

and $P_j$ and $\neg P_j$ are to have the same recession cone. This essentially is an instance of an embedding with $D = \emptyset$, and we have:
$$W = \bigcap_j (P_j \cup \neg P_j), \qquad \neg P_j = W \setminus P_j$$

Suppose first that we are given a list of clauses in $P_1, \ldots, P_s$. We ask if clausal chaining goes over. In fact, it does. We now explain why. If both $P_j$ and $\neg P_j$ occur in a clause, $z \in P_j \cup \neg P_j$ by the domain constraint, so the clause can be deleted. If unit clauses $P_j$ and $\neg P_j$ both occur, the problem is inconsistent by $P_j \cap \neg P_j = \emptyset$. If $P_j$ occurs alone as a unit clause, we add "$z \in P_j$" as a side constraint; and we can delete any clause containing $P_j$, and remove $\neg P_j$ from other clauses (by $P_j \cap \neg P_j = \emptyset$).

We next see that, in general, monotone variable fixing does not go over to MIP. E.g. if the clauses are $P_1 \vee P_2$, $P_2 \vee P_3$, we cannot assume "$z \in P_1$." If $P_2, P_3 \subseteq \neg P_1$, actually all solutions have $z \in \neg P_1$. However, monotone fixing is valid in the Cartesian product case (as e.g. in propositional logic) for then the setting "$z^{(1)} \in P_1$" does not affect $z^{(2)}$ or $z^{(3)}$.

We next ask if splitting goes over to MIP. Here we are given three sets of clauses:

$$P_j \vee R_1, \ldots, P_j \vee R_a$$
$$\neg P_j \vee S_1, \ldots, \neg P_j \vee S_b$$
$$T_1, \ldots, T_c$$
In the resolution form splitting is

$$R_i \vee S_k \quad (i = 1, \ldots, a;\ k = 1, \ldots, b), \qquad T_1, \ldots, T_c$$

It is always implied (as $z \in P_j \cup \neg P_j$). However, in general it need not be equivalent. We may have all $R_i = P_j$ and all $S_k = \neg P_j$, so the problem is inconsistent, but the resolved form need not be. In the Cartesian product case, resolution splitting does give an equivalent problem (exercise).

Splitting in the branching form is always equivalent provided the relevant condition "$z \in P_j$" or "$z \in \neg P_j$" is added as a side constraint.

This completes our discussion of the three basic subroutines of DP/DPL. Note that, as we proceed, we accumulate a list of side constraints, all of which are polyhedral. Thus all side constraints together define a polyhedron. Prior to splitting, we attempt to fathom the problem by solving the linear relaxation (LR) of the current list (using representations of the $P_i$):

$$z \in \mathrm{Rel}\big((P_j \vee R_1) \wedge \cdots \wedge (P_j \vee R_a) \wedge (\neg P_j \vee S_1) \wedge \cdots \wedge (\neg P_j \vee S_b) \wedge T_1 \wedge \cdots \wedge T_c\big)$$
$$z \in \text{side constraints}$$
$$z \in \mathrm{Rel}(P_j \vee \neg P_j) \quad \text{all } j$$
If we prove inconsistency, or obtain a solution, we may terminate calculation.

We now ask: which version of splitting has the superior LR? Branching does, in all cases (consult Lecture 4 for the distributive laws):

Of course, $\mathrm{Rel}\big((\bigwedge_{i,k} (R_i \vee S_k)) \wedge (\bigwedge_p T_p)\big)$ describes the linear relaxation of the problem resulting from resolution, while $\mathrm{Rel}(\bigwedge_i R_i \wedge \bigwedge_p T_p)$ respectively $\mathrm{Rel}(\bigwedge_k S_k \wedge \bigwedge_p T_p)$ is the linear relaxation of the problem on the first resp.
second branch due to branching. In this MIP setting, any side constraints obtained in clausal chaining are already implied by the linear relaxation, hence in both branches more unit resolution will be possible than in the completely resolved form. This comparison of LR's is further strengthened if we take into account the additional side constraints from branching. Moreover, in the case of only propositional logic, the above analysis shows that clausal chaining is at least as powerful in generating new side constraints in either branch as it is in the resolution form. We proved this earlier in [Bla Jer Low 1985].

The relaxation dominance of branching is clear from the above. Moreover, in most cases, the sum of the sizes of both branches is smaller than that of the resolved form, although this statement is not a rigorous result.

This kind of advantage, of branching over resolution, continues in the predicate logic treatment of Lecture 8. In the search tree process of locating a truth valuation, branching on a propositional letter will subsume doing all possible combinations of resolutions on that letter (prior to unification). Thus, branching on a letter resulting from unification subsumes all possible resolutions arising from that specific unification.

In this section, we have seen how ideas and algorithms from logic can be used to aid in MIP solution, here in the form of pre-processing devices. This complements our earlier work in Lecture 5 and our remarks above on branching versus resolution, which generally show how to aid logic processing by devices from mathematical programming. Indeed, the two subjects of discrete programming and applied logic are very much interrelated.
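Two observations of this section, that monotone fixing can fail when variables are shared and that resolution-form splitting is implied but not always equivalent, can be checked on a toy model in which finite sets of points stand in for the polyhedra. This is only an illustrative sketch under that assumption: the universe, the particular sets, and the helper names `solutions`, `resolved_form` and `branching_form` are our own and are not constructions from the lecture.

```python
from itertools import product

UNIVERSE = frozenset(range(6))          # toy stand-in for the domain W

def complement(A):
    """not-A, taken inside the toy universe, so A and not-A are disjoint and cover it."""
    return UNIVERSE - A

def solutions(clauses, side=UNIVERSE):
    """Points of `side` that lie in at least one set of every clause."""
    return {z for z in side if all(any(z in A for A in clause) for clause in clauses)}

def resolved_form(R, S, T):
    """Resolution-form splitting on P_j: clauses R_i v S_k for all i, k, plus T."""
    return [r + s for r, s in product(R, S)] + T

def branching_form(Pj, R, S, T):
    """Branching-form splitting: solve each branch under its side constraint."""
    on_pj = solutions(S + T, side=Pj)                  # z in P_j leaves the S_k
    on_not_pj = solutions(R + T, side=complement(Pj))  # z in not-P_j leaves the R_i
    return on_pj | on_not_pj

# Monotone fixing fails: P1 occurs only positively, yet no solution has z in P1.
P1 = frozenset({0, 1})
P2 = frozenset({2, 3})                  # P2 and P3 both lie inside not-P1
P3 = frozenset({3, 4})
clauses = [[P1, P2], [P2, P3]]
print(solutions(clauses), solutions(clauses, side=P1))

# Resolution is implied but not equivalent: take R_1 = P_j and S_1 = not-P_j.
Pj = P1
R, S, T = [[Pj]], [[complement(Pj)]], []
original = [[Pj] + r for r in R] + [[complement(Pj)] + s for s in S] + T
print(solutions(original), solutions(resolved_form(R, S, T)), branching_form(Pj, R, S, T))
```

In the second experiment the original clause list has no solutions, the resolved form admits every point of the universe, and the two branches together reproduce the original solution set.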
LECTURE 10
TASKS AHEAD

Summary: In this discursive lecture, we present some of our own views on trends and possible futures for Mathematical Programming, and we cite some general research projects which appear to be worthwhile. We mention the many research activities that connect parts of Artificial Intelligence to Operations Research, which are quite different from the connections covered in our lectures here. We conclude with some information on Artificial Intelligence instruction at a few business schools which we contacted.

In trying to obtain a strategic picture of Mathematical Programming, we have been very much influenced by the ideas of Michael Porter in both [Por 1980] and [Por 1985], although the frameworks there do not directly apply to academic research areas. The main difficulty I found, in trying to anticipate future developments, has been the speed with which possibilities become realities in the very fast-moving areas of computer technology. Some of the developments I "guess" at here are already underway.
10.1 Three "top-down" Views of Mathematical Programming
We begin by giving three ways to view Mathematical Programming: intellectual history, academic settings for instructional programs, and end uses and users of our methods. All three of these views are at such a high level, that virtually none of us will recognize much in common with our day-to-day activities. Nevertheless, such a "stratospheric" analysis is appropriate in order to see trends and opportunities, and to understand how our aggregate activities may be viewed by others outside our field. The section concludes with a prognosis and recommendations from my own perspective.
10.1.1 The Intellectual Heritage
Operations Research is an area, born during the Second World War out of military needs of that period, which represented a fusion of several academic areas, notably those in Economics and Mathematics. We might draw the influences of the 1940's and 1950's in this manner:

[Figure 23: Intellectual Heritage. Influences of the 1940's and 50's; surviving labels: Economics; Graph theory and combinatorics.]
In this diagram, the influence from linear algebra is represented by linear programming algorithms, from real analysis by nonlinear programming theory and algorithms of that period, and the influence from graph theory and combinatorics began of course with network flow theory. The "theory of the firm" to which we refer is largely microeconomic profit-maximization models, and "competition" is reflected in the early work in noncooperative game theory, which first entered Operations Research via zero-sum two-person games. Each link above can be identified with the research of one or a small number of individuals at this early stage of our field; let us not attempt to praise famous men by citing them!
In the mid 1980's the picture looks more like this:

[Figure 24: Current Influences. 1980's; surviving labels: Mathematics; numerical analysis; Electrical Engineering; Complexity; Computer Science; Data structures and algorithms; Data base; Art. Int.; Psychology.]
Here the influence of general equilibrium has entered Mathematical Programming through the development of pivoting methods and associated results in Mathematics and Game Theory. The effects of advances in these areas were widely felt in the 1960's and 1970's. Much of the influence of economic modeling in Operations Research attained national prominence in the work of W. Hogan and in later work by H. Greenberg on the PIES model and other energy models. Related work is continuing and we shall cite it in 10.3 below.

The most recent change in the diagram is the emergence of a new field, Computer Science, created at the juncture of applied logic and electrical engineering. The engineers supply the hardware, plus studies in pattern recognition and signal processing; the logicians supply most of the theory for hardware and software, particularly the latter. Computer Science breaks down in turn into many subfields, as it has become increasingly specialized and tends to define itself more autonomously from its origins.

Computational complexity was the first to influence Mathematical Programming via the efforts of R. M. Karp, who built on J. Edmonds' earlier concept of a "good" algorithm and S. A. Cook's work in automata theory. The influence from data structures (for use in algorithms) originates in the work of F. Glover and D. Klingman on network algorithms. Initially their research proceeded from practical experience with Mathematical Programming algorithms, quite independently of Computer Science. Tarjan's monograph [Tarj 1983] has continued the intellectual emphasis in this important and growing area.

In the 1980's diagram, I have drawn two dotted lines for connections which are now beginning to develop. More and more mathematical programmers are learning and occasionally teaching database query languages and related database topics, including database design via programming models. A. Geoffrion's development in "structured modelling" [Geo 1985] reflects this connection as well, although it has a far broader scope I will cite below. While database is slowly moving into programming, at least outside computer science departments, information systems design and systems policy have moved into Management Information Systems, rather than Operations Research. However, a recent paper by D. Klingman, N. Phillips, and R. Padman, which details an extremely successful implementation of OR techniques, along with techniques from AI and IS, describes how the starting point for applications work can lie in the information system.

Our efforts in these lectures have been to illustrate a part of the linkage that I have drawn between Mathematical Programming and Artificial Intelligence. I will overview other links briefly later on in this lecture. What is significantly different about Artificial Intelligence is that, unlike
the other areas of Computer Science, it has been influenced by cognitive processing models drawn from Psychology. This influence has, in turn, led to the very ambitious goal of mimicking human intelligence. This goal may not be feasible, and we should not expect to see it achieved during our lifetimes, although some new computer architectures now being discussed may change things. What we can expect, and in fact do see now, is far more intelligent software which further assists in the decision-making processes of organizations. This latter decision support function is the primary role to which Mathematical Programming makes valuable contributions. Hence the connection to systematized human intelligence is important for us, and in that respect the role of Psychology is and will remain central.

The potential for creating decision support systems with intelligent capabilities is a major theme of [Bon Hol Whi 1982], which strongly influenced my own efforts. A. Whinston is probably the first to articulate this development, and to arrive at this perspective from a background of Economics and Operations Research. His work and that of his collaborators and students are reflected in the research and projects of the Management Information Systems group at the Krannert School at Purdue, as well as in commercial product offerings for micro computers, of which GURU is the most recent. In his approach, intelligent modelling is interpreted by and evaluated by traditional utility measures [Moo Whi 1986].

The link from Mathematical Programming to Economics has traditionally been weak and almost pro forma. The primary activity in this linkage has been algorithms for computing solutions, or results on structural forms of (conditions for) solutions. Relatively little work has been done on the interpretation of solutions or on solution concepts (e.g. interpretive applied microeconomics), even though this was a focus of research initially [Koop 1951]; Wolsey's paper [Wol 1981] is a happy exception. Our failure to develop this link further is probably a missed opportunity. As a social entity, the values and reference points for basic Operations Research have remained in Applied Mathematics, where most of the founders of the field were trained. Via the use of competing automata based on concrete AI paradigms, the link to economics may be strengthened.

10.1.2 Academic settings for Mathematical Programming
As we saw in the preceding lectures, Mathematical Programming has been historically defined as a collection of certain areas of Mathematics and, to a
lesser extent, of Economics, which are useful in many contexts of decision-making and resource allocation. The justification for a special identity for these topic areas is to allow a focus for the further development of these areas, and the associated algorithms, modeling techniques and applications. The possibility for such an identity also arises for these reasons, via the support of user communities in many other academic areas, and in industries. The outstanding success of the Simplex Algorithm greatly aided this process. Operations Research is, in this regard, entirely similar to Statistics and theoretical aspects of Computer Science, Electrical Engineering, and other areas of Applicable Mathematics.

We must keep in mind that, until fairly recently, Mathematics communities proper had little interest in applications, and little understanding of the role of computer experimentation in applied mathematics. Even within Applied Mathematics, there was substantial competition between "traditional" Applied Mathematics as used e.g. in physics, and the newer Applied Mathematics exemplified by Statistics, Computer Science, and Operations Research.

Attitudes within the Mathematics communities have changed and are continuing to change (see e.g. [Renew 1984]). These productive developments will curtail unnecessary fissions and bring new vitality to Mathematics, as they remove the earlier "push" felt by applied mathematicians. However, the "pull" from user communities is, if anything, stronger than ever; so more and more applied mathematicians will be working in settings outside of Mathematics proper.

The various "pushes" and "pulls", which earlier defined Operations Research, represent forces which change over time. Thus its continued viability needs to be periodically re-examined, particularly when very new trends become evident.

To see the current state of affairs, we can begin by taking note of academic settings in which Operations Research and Mathematical Programming are represented. Here we take the perspective that Operations Research consists of Mathematical Programming plus several areas of Applied Stochastic Methods (including applied probability and statistics, simulation, stochastic processes and stochastic optimization). These two major divisions of Operations Research, which do overlap at many points (e.g. dynamic programming, stochastic programming, stochastic analysis of algorithms, etc.), are not always housed in the same academic unit.

Mathematical Programming and Operations Research have been located in Colleges of Engineering, of Science, and sometimes of Liberal Arts. This diversity reflects in part the diverse academic settings for Mathematics and Applied Mathematics. In addition, Operations Research has been located in
Colleges of Management, which is a professional school setting, and in some instances Operations Research is a free-standing department. We can draw the different academic settings as follows:
[Figure 25: Academic Settings. Surviving labels: Education; Management; OR Dept.]
The most common academic settings are in Departments of Mathematics, Statistics, Industrial and Systems Engineering, Computer Science, or of Management. Operations Research is also carried out in some Departments of Electrical or Mechanical Engineering and of Economics; no doubt other settings have also arisen.

In each case, there is usually significant adaptation at the level of the instructional program. For example, in the introductory linear programming course in a Management setting, one stresses recognition of real-world problems as linear programs, the use and interpretation of computer output, "what if" questions and some decision-support issues. In Industrial Engineering, the focus for the same topic would be on the algorithms and software for computer solutions. In Mathematics, issues arise of both numerical analysis and extensions to ordered fields; although today's students, with weak Mathematics backgrounds, often can explore these topics only at a graduate level. Thus the "same" course actually becomes three very different courses in three different academic settings.

The diversity of settings, and the degree of adaptation to teaching programs, is very unusual. Equally surprising at the intellectual level of research is the cohesion and identity as Operations Researchers. This derives from professional associations, meetings, and journals, which continue the values of the founders. As a consequence of the diversity of settings, changes in the content of Mathematical Programming will be driven by needs of instructional programs, as much as by intellectual thrusts.

10.1.3 Users' Perspectives
For a moment, let us adopt the view of users of Operations Research. They may see things in this way:

• Both Operations Research and Artificial Intelligence construct models to represent situations and have techniques to assist in achieving satisfactory outcomes.

• OR is stronger on numerical calculations.

• AI is stronger on calculations with symbols.
The users have problems which typically involve both numbers and symbols. Therefore, users will seek one source to go to for both kinds of calculations. Currently, there is no such source. This is not simply a matter of juxtaposing related disciplines. Fundamental approaches and issues in both fields are deeply intertwined. Substantial intellectual cross-fertilization is to be expected. In more specific detail, historically the primary user areas of Operations Research have included Production/Operations Management, Finance, and Marketing. In Production/Operations Management, sophisticated schematic techniques from Artificial Intelligence are being used, in conjunction with some Operations Research methods, to represent and to solve scheduling problems.
The relatively more developed application approach of expert systems is beginning to be used in Finance, Marketing, and Accounting. Retention of these applications areas seems crucial, yet unsophisticated user communities can perceive that the choice of a technique from one area (Operations Research or Artificial Intelligence) precludes the use of techniques from the other area. Operations Research has historically a greater and long-standing emphasis on the real-time performance of its algorithms on large-scale, real-world applications. This relative advantage, however, cannot be indefinitely guaranteed. For instance, for those problems which are accessible to solution by Horn clause expert systems, we saw earlier that linear time algorithms exist to implement computation. On the other hand, some practitioners of Artificial Intelligence are gradually becoming more aware of ways in which their algorithms can be improved by use of quantitative algorithms. This presents a good opportunity to combine methods to mutual advantage. For well-developed AI technologies, such as expert systems, one expects these to move directly to the user communities as concrete applications. Other more formative AI symbolic technologies can profitably undergo a combination with OR quantitative technologies, and even expert systems can be combined with other OR techniques and software modules in configuring large decision support systems. The opportunity to benefit users, by combining traditional quantitative techniques with newer symbolic techniques, is by no means restricted to Operations Research. It constitutes a "gap" of huge potential for many subject areas of application, for Computer Science, and for Applied Mathematics generally. The issue one must consider is the proper positioning of Operations Research for the long range to better exploit that part of these opportunities which are accessible to its methods. 
Within the business school environments, instruction in expert systems is emerging in Management Information Systems groups more readily than in Operations Research groups. This increases the likelihood that MIS will absorb more of AI concepts and technologies. There is also a trend for Operations Researchers to make commitments to the MIS area. While situations vary at different universities, overall it would be fruitful for MIS and OR to develop cooperative efforts, and for more OR academics to deepen interest in MIS. It would be beneficial to do so for database, modelling, decision support, and implementation reasons, as well as for exploiting opportunities in AI.
Such a development would be, after all, only a continuation of the association between OR and computer theory and technology, which began when the Simplex Algorithm was first coded up. Had computers existed in the last century, very likely OR would have started then. Advances in computer technology have always been to the benefit of OR.

10.1.4 Some conclusions
Now about midway through this tenth lecture, it can hardly be surprising to the reader that I recommend the growth of the linkage between Operations Research and Artificial Intelligence. However, I see this linkage developing as a consequence of wider changes which are needed in the orientation and graduate training of Operations Research.

I believe that, after nearly forty years, the base provided us by the founders of our field is no longer adequate, by itself, for the continued growth of the field. Operations Research remains viable on this base, but only for a decreased number of applied scientists. A contraction of the field, provided it also resulted in higher quality research, would not be bad per se; but the increasing isolation from user communities is a very negative development.

To some extent, a contraction in the size of the Operations Research community can be offset by a growth from abroad, as more developing nations seek to acquire the skills of our field. Export, however, is not an attractive path to take. Our efforts should also benefit our own nations. Moreover, export is a temporary solution.

If we conceive of Operations Research today as being composed, half by Applied Stochastic Methods, and half by Mathematical Programming, I would propose that both these areas be reduced to one-third and that the remaining third be taken up primarily by Computer Science and Applied Logic, with contributions from Economics, Organizational Behavior and Psychology. Within Computer Science, Artificial Intelligence will naturally emerge as crucial; it does not need any special emphasis per se.

The linkage to Applied Logic is equally crucial; a thorough one-semester course in logic should be in all OR curricula at the doctoral level, in my view. A second logic course is advisable. There is sufficient common interest between Operations Research and Applied Logic to act on it directly, rather than waiting for the "trickle down" through Computer Science.
The emphasis on Computer Science will help to take Operations Research further "downstream" toward the users. In addition, we need means of bypassing, to some extent, the many obstacles in moving from improved methodology to its use in practical contexts. We should not remain entirely dependent on
the "trickle through" of our techniques to the functional areas of business, or via individual consulting efforts, even the best of which are necessarily of limited duration and scope. To this end, the establishment of a library for storing datasets of applied problems from industry is the crucial link to rapid and rapidly-used advances in methodology.

In terms specifically of the relation of OR to AI, I would not discount either the efforts or the risks involved in making the connections to AI. Such work is primarily for the more entrepreneurial among us. Nevertheless, as I shall discuss in the next section, several OR academics are already far along in this direction. A good place to begin is with an understanding of expert systems (e.g. [Har King 1985], [Hay Wat Len 1983]), including a "hands-on" technology, and a reading of that part of the OR literature I will cite in 10.3 that seems most interesting to the reader.

The shift to a weight of one third on Computer Science and Applied Logic, as I have just suggested, does indeed imply that the two-thirds in our inheritance retains its value. After forty years in a much changed world, some adjustment would be expected. It remains important to retain our roots in Mathematics, and in axiomatic studies. I would expect that, following the influence from Applied Logic, much progress will come to depend on new uses of Mathematical Analysis.
10.2 Some research challenges related to these lectures
In this section, we sketch briefly several research projects related to our work, which we believe can be fruitfully implemented now. We divide these into two groups, in accordance with whether they concern primarily MIP representability or primarily the interface of AI and OR.

10.2.1 Research on MIP representability
We list five potential projects:

10.2.1.1 Further integration of polyhedral combinatorics with disjunctive methods, including a detailing of potential applications.

10.2.1.2 Empirical studies of formulations which arise in practice; useful taxonomies; exploration of "automatic formulation" for a user description of the MIP; detailing of frequently-occurring simplifications.

10.2.1.3 Non lattice-theoretic treatments to obtain "fairly sharp" formulations, for problems which have neither efficient disjunctive formulations nor efficient combinatorial ones.

10.2.1.4 Delineation of cases where the linear relaxation (LR) "almost" solves the problem, with practical asymptotic results (perhaps related to Shapley-Folkman-Starr, see Cassels' proof).

10.2.1.5 Adaptation of search algorithms to best exploit good LR's, including dynamic reformulation at search nodes and postoptimality information.

Of these potential projects, 10.2.1.2 is the most needed and, as an empirical effort, it would require a small team of researchers.
10.2.2 Research on the AI/OR Interface
These five projects are all worthwhile:
10.2.2.1 Improvements in the representation and handling of propositional logic.

10.2.2.2 Use of theorem schema, saving of search trees, precompilation (to save effort at run time; this is a form of "learning").

10.2.2.3 Development of techniques for utilizing the Horn clause fragment of a problem which is "almost Horn".

10.2.2.4 Integration of logic techniques into frame-based reasoning; use of "generalized pointers".

10.2.2.5 Development of techniques for combining list processing routines of logic with OR "number crunching," for problems where both logic and linear structure occur.
Of these potential projects, work is underway at several OR locations for 10.2.2.1; some AI researchers are beginning to address 10.2.2.3; and some software systems are currently offered which claim capabilities similar to those mentioned in 10.2.2.4.

10.3 Some other research programs in the AI/OR Interface
Here we list those methodological efforts in the AI/OR interface of which we are aware, and which are not related to MIP representability. We would be happy to learn of other efforts.
10.3.1 Work on heuristics, borrowing from AI and OR (Glover, Pearl);

10.3.2 Integration with databases (Geoffrion; Klingman, Phillips, and Padman);

10.3.3 Data structures (Tarjan, Glover and Klingman);

10.3.4 Model management (Greenberg, Vance) including automated explanations;

10.3.5 Techniques for configuring decision support systems from rule bases, databases, and algorithms (Whinston);

10.3.6 Uses of parallelism, particularly to aid search algorithms of MIP (Meyer, Phillips and Rosen, Balm);

10.3.7 Natural language interfaces for OR software (Greenberg);

10.3.8 Uses of intelligent systems to select among Mathematical Programming models and to guide search strategies (Schittkowski; Greenberg; Murphy and Stohr; Minoux).
10.4 Some programs and courses in the AI/OR Interface
We did a nonsystematic survey in April and May 1986, by calling up some colleagues and asking them about AI activity in their group or department. In some cases, programs were under revision and may be very different now. We summarize the information briefly, in close to the order it came in. What emerges (except for Purdue) is a picture of AI as "starting up" in business schools, with the initial course as one in expert systems, and given by the MIS group.

10.4.1 Purdue University, MIRC
The Management Information Research Center (MIRC) of the Krannert School has 800 undergraduates and 8-12 Ph.D. students in residence, with four faculty. Its main intellectual interests are: decision support systems; accounting information systems; and automated manufacturing. It is heavily into Artificial Intelligence and has been so for several years. Source: Andrew Whinston
10.4.2 University of Texas at Austin, Ph.D. Programs in MIS and OR
The MIS program has six required courses and eighteen courses available. Among the required courses are database, decision support, and database administration. Other courses include expert systems and artificial intelligence. The Management Science program requires five courses and an elective, and thirteen courses are available. Software design is required, and students can select two minor areas of three courses each. Source: Darwin Klingman
10.4.3 Carnegie-Mellon University, GSIA and SUPA
The course offerings in Artificial Intelligence are under major revision at the Graduate School of Industrial Administration (GSIA) and the School of Urban and Public Affairs (SUPA). Currently there is a general introduction to AI in two half semester courses, followed by an expert systems course. A LISP workshop and a Texas Instruments Personal Consultant workshop are offered. A significant broadening of AI instruction is expected, including expansion to full semester courses and a differentiation between Ph.D. (technical) and MBA (managerially-oriented) course offerings. Source: Peng Si Ow
10.4.4 University of Iowa, Management Sciences
An AI course including Prolog programming, practice with an expert system shell and exposure to automated AI planning has been taught twice and is expected to become a required course for Ph.D. students with an MIS emphasis. A workshop on management-oriented AI applications will also be offered frequently. Source: Colin Bell
10.4.5 University of Colorado at Boulder, MIS and OR
The Center for Applied Artificial Intelligence is a research center which includes OR and MIS faculty. About four courses have been or will be taught, including experimental courses. An expert systems course is a regular offering, using Texas Instruments Personal Consultant Plus. In 1986-87 an AI programming course will also be taught on an experimental basis.
Sources: Claude McMillan and David Monarchi
10.4.6 Northwestern University, MIS
Courses are offered in Decision Support Systems in Planning, Information Systems Analysis and Design, Database Management Systems, and Artificial Intelligence and Expert Systems. Sources: Eitan Zemel and Benjamin Mittman
10.4.7 Duke University, the Fuqua School
A course on "Expert Systems in Management" has been offered twice on an experimental basis by Marketing faculty, and is likely to become a permanent offering. There is no MIS group and interest in AI is restricted to proven technologies. Software support has been through use of M1, with current plans to switch to GURU. Sources: Joe Mazzola and John McCann.
10.4.8 Massachusetts Institute of Technology, the Sloan School
There is an expert systems course which is jointly listed with Computer Science. Other AI topics, including frames, are taught in other MIS courses. Students learn LOTUS, and LISP is the only programming language required in the introductory MIS course.
10.4.9 Georgia Institute of Technology, Management Science
A special topics course has been taught on decision support systems, which includes expert systems as a decision support tool and, potentially, as a form of assistance in working with a toolbench of integrated software packages. Readings have been used from the books by Harmon and King, Sprague and Watson, and Bonczek, Holsapple and Whinston. The course included exercises in PROLOG, and a project which required the construction of a simple expert system in TURBO PROLOG. A system shell was demonstrated (TI Personal Consultant Easy).
10.5 Guessing Ahead
The "shocks" to our industrial and geopolitical environments, which emanate from computer technology, are likely to increase in severity in the next five to ten years. This is because ideas for drastic changes are numerous, as are well-funded experiments to test these ideas. Several of them will "pan out." Competitive pressures dictate continued experimentation. A stable technology is a distant dream.
Having followed all these developments from the 1920-21 period of Emil Post's notebook, through Turing's 1930's conceptualization of a general-purpose computer, into the modern age, I am, personally, amazed at the effect of useful ideas on the practical world. What was needed for realization of the ideas was advances in materials, the commitment to proceed, and good development work. The challenge now is to manage the outcome of this process, from basic science to everyday use, and see what principles we can learn as the pace of the process accelerates.
I am going to conclude the lecture series with a few guesses at likely developments which will affect Operations Research. They are mostly extrapolations on current trends or themes which seem to have near-term potential. I hope I will not be too embarrassed if I re-read this list in three or four years. I insert it here more as a "fun" way to conclude, than as a serious exercise.
10.5.1 Probably
Exploratory programming environments will drop further in price to the "high price" end of personal computers, and will become available in non-LISP languages with high numerical computation capabilities, greatly accelerating the pace and diversity of software development. Comment: This was written in May 1986. Now in August 1986 I am aware of two product offerings which claim these capabilities, one priced at $5000 per copy, with a quantity discount.
10.5.2 Probably
Effective means of utilizing parallel and "non-von" architecture will become available, and will give a strong boost to artificial intelligence by allowing substantially more real-time processing of complex tasks. Comment: This is a very risky guess, since it is not clear at all as to how to exploit parallelism. I am betting based on the number and quality of scientists engaged in this work and the availability of new architectures for experimentation. There will be some surprises.
10.5.3 Probably
Linear programming will be put on a chip (once the algorithm framework stabilizes!) which will improve real-time response and search methods which rely on the linear relaxation. Comment: This idea was related to me in the 1960's. While technically feasible now, the proliferation of new LP algorithms has made it riskier to invest heavily in hard-wiring one algorithm. Thus the likelihood of 10.5.3 is smaller than three years ago. A compromise solution is the hard-wiring of certain common subroutines, as e.g. Gaussian elimination. Let us hope for fewer surprises in the future!
10.5.4 Probably
Computers will become available which utilize analog means of processing 3-dimensional spatial scenes. When combined with new methods for spatial representation of symbolic structures, this will allow powerful techniques for pattern matching, reasoning by analogy, and several kinds of learning. Comment: This guess reflects my bias, that most of human intelligence is based on combining a vast data store and an intricate pointer system in it (for cross-referencing), with a very powerful pattern matcher probably adapted from vision centers. Logical deduction, which is often associated with human uniqueness, is probably done in a "recent" and small set of circuits, primarily as a check on the matching activity. I do not believe that people do logic particularly well. That is why they are so aware of it, as when one consciously places one's feet when beginning to walk. Even the construction of formal proofs is probably supported by a similar symbol matcher. The pointer structure in the memory store can provide research for logicians, as the pattern matcher can for analysts.
Of course, all these guesses may be wrong.
Atlanta, Georgia
August 1986
Revised November 1986, March 1987 and 1988.
ILLUSTRATIVE EXAMPLES
1. Give a representation for the condition: $x \in [L_1, U_1]$ or $x \in [L_2, U_2]$ or ... or $x \in [L_t, U_t]$. Simplify as far as possible (as always).
2. Give a representation for the graph of a piecewise linear function of one variable, with the two line segments
Figure 26: Piecewise Linear Function
3. Give a representation for the graph of this function, where $0 < u_1$:
[function definition not recoverable; one surviving case reads "if $x_1 = 0$"]
4. Give a representation for the epigraph of the following function, which uses only one binary variable (and no other integer variable). (In general, $t$ convex sections require $(t-1)$ binary variables.)
Figure 27: Two Convex Sections
In each of the three problems to follow, decide whether (from the point of view of the linear relaxation) it is better to represent the function on the left directly, or as the sum or minimum of the two functions on the right. (Use graph representability and a disjunctive representation.) (HINT: first obtain $\min(u,v)$ for $0 \le u, v \le 2$.)
5.
Figure 28: Concave Sum
6.
Figure 29: Linear Sum
7.
Figure 30: Minimum of Functions
8. Suppose that the following constraint system holds for at least one $i = 1, \dots, m$:
$\sum_{j=1}^{n} a_{ij} x_j \le b_i$, all $x_j \ge 0$.
Develop a disjunctive characterization when all $a_{ij} > 0$ and all $b_i \ge 0$.
9. Give a representation of the graph of the AND function, i.e. $x_1, x_2, x_3$ are binary variables where $x_3 = 1$ iff $x_1 = x_2 = 1$.
10. What occurs if the disjunctive construction is used on the following condition:
The variables $x_1, x_2, x_3$ are continuous and nonnegative and:
either $x_1 + 2x_2 + x_3 \le 10$ or $2x_1 + x_2 \le 5$.
11. Provide a disjunctive formulation of the following condition, which has the convex span as linear relaxation. Simplify this formulation as far as possible.
The condition is: "Exactly two of the nonnegative variables $x_1, x_2, x_3$ and $x_4$ are positive, and those which are positive lie in the interval $[1, 10]$. Moreover, if $x_1$ is positive, then either $x_2$ or $x_3$ is positive."
After giving your formulation, determine if this "standard" formulation has the same linear relaxation:
$z_j \le x_j \le 10 z_j$ and $z_j \in \{0, 1\}$ for $j = 1,2,3,4$,
$z_1 + z_2 + z_3 + z_4 = 2$,
$z_1 \le z_2 + z_3$.
(HINT: The standard formulation has $x_1 = 5$, $x_2 = \tfrac{1}{2}$, $x_3 = 0$, $x_4 = 10$ possible in its LR.)
12. (This example is useful after Lecture 5, and after reading the paper by Blair, Jeroslow and Lowe on MIP approaches to propositional logic. It is concerned with Tseitin's efficient algorithm for obtaining an equivalent conjunctive normal form (CNF).) Use the linear-time CNF procedure to obtain a CNF (in possibly additional letters) for the following condition of propositional logic. Compare the linear relaxation of this formulation to that obtained from the usual CNF.
The condition is:
$(\neg P_1 \wedge P_2 \wedge P_3) \vee (P_4 \wedge P_5 \wedge P_6) \vee \neg(P_1 \rightarrow P_7 \vee P_8)$
13. Apply the Davis-Putnam procedure in branching form to determine the satisfiability of the following condition of propositional logic. Use $P_1, P_2, P_3, \dots$ as the order of choosing branching variables.
The condition is:
$P_1 \vee \neg P_2$
$\neg P_1 \vee \neg P_3 \vee P_4$
$P_3 \vee \neg P_4$
$P_2 \vee P_3 \vee P_4$
14. Determine if the following set of Horn clauses is satisfiable and, if so, give a satisfying valuation:
15. For the following logic problem, give an alternate formulation of your own which has a better relaxation (LR) than the "standard formulation." Show that your LR is better. (HINT: Look at the alternative settings and use these in a disjunctive formulation, possibly introducing new variables. A complete listing is too long, so "condition" your analysis on the four possible truth values for $P_1$ and $P_2$. To show that your LR is stronger, it may help to note that 1, 1,i) or some related point, solves the given LR but perhaps not the stronger one).
(3, i, i,
SOLUTIONS TO EXAMPLES
1. $P_i = \{x \mid L_i \le x \le U_i\}$, $\mathrm{rec}^*(P_i) = \{0\}$. A disjunctive formulation is:
Actually, from the simplification given in Lecture 2, as long as $L_i \le U_i$ for $i = 1, \dots, t$ the following simplification is also sharp:
A second disjunctive representation uses a different description of the same polyhedra:
$P_i = \{x \mid \text{for some } y,\ x = yL_i + (1 - y)U_i,\ 0 \le y \le 1\}$
$\mathrm{rec}^*(P_i) = \{x \mid \text{for some } y,\ x = y(L_i - U_i),\ y = 0\} = \{0\}$
The disjunctive formulation is:
This simplifies upon substituting out for each $x^{(i)}$.
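The simplification for Example 1 can be checked numerically. The sketch below is not from the lectures: it assumes the simplified formulation takes the usual form $\sum_i L_i \lambda_i \le x \le \sum_i U_i \lambda_i$, $\sum_i \lambda_i = 1$, all $\lambda_i$ binary, and the function names and the sample intervals are hypothetical. It confirms by brute force that the binary solutions give exactly the union of the intervals, and it records that relaxing to $\lambda_i \ge 0$ yields the interval $[\min_i L_i, \max_i U_i]$, which is the convex span of the union when every $L_i \le U_i$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample data: three intervals [L_i, U_i] with L_i <= U_i.
INTERVALS = [
    (Fraction(0), Fraction(1)),
    (Fraction(3), Fraction(4)),
    (Fraction(6), Fraction(8)),
]


def in_union(x, intervals):
    """Direct test of the condition: x lies in at least one interval."""
    return any(lo <= x <= hi for lo, hi in intervals)


def in_simplified_mip(x, intervals):
    """Test the assumed simplified formulation by enumerating binary lambda."""
    for lam in product((0, 1), repeat=len(intervals)):
        if sum(lam) != 1:
            continue  # enforce sum of lambda_i = 1
        lower = sum(l * lo for l, (lo, _) in zip(lam, intervals))
        upper = sum(l * hi for l, (_, hi) in zip(lam, intervals))
        if lower <= x <= upper:
            return True
    return False


def lr_interval(intervals):
    """Range of x in the linear relaxation (lambda >= 0, sum = 1)."""
    return min(lo for lo, _ in intervals), max(hi for _, hi in intervals)


# Compare both tests on a grid of half-integers from -1 to 9.5.
GRID = [Fraction(k, 2) for k in range(-2, 20)]
AGREE = all(in_union(x, INTERVALS) == in_simplified_mip(x, INTERVALS) for x in GRID)
```

With these sample intervals the two tests agree on the whole grid, and the relaxed range is the single interval from 0 to 8.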
2.
Simplifies to: (use
3.
X = A,)
Simplification gives (using X = ~
2 ,=
y(')):
and similarly $\mathrm{rec}^*(P_2) = \{(x_1, x_2) \mid x_1 = 0,\ x_2 \ge 0\}$. The disjunctive formulation simplifies, after substituting out the auxiliary variables, to:
To obtain only one binary variable $\lambda$ ($= \lambda_2$), use e.g. $\lambda_1 = 1 - \lambda$ in the above and delete "$\lambda_1 + \lambda_2 = 1$."
In Examples 5 to 7, the left-hand-side (LHS)function is called f , and the right-hand-side (RHS) functions are called g and h. LR=linear relaxation. 5. (3, Q)is in conv(grph(g)) (i.e. the convex span of the graph of g) and also (3,3) E conv(grph(h)). So (3, is in the LR of the RHS formulation,
y)
but not in the LR of the LHS formulation. The LHS formulation is superior.
i)
6. (1, E conv(grph(g)) and ( 1 , O ) E conv(grph(h)), so (1,;) is in the LR of the R H S formulation-but not of the LHS formulation.
Again the LHS formulation is superior.
7. When the $\min(u,v)$ function is formulated disjunctively, its LR (linear relaxation) is the convex span of $(z,u,v) = (0,0,0)$, $(0,2,0)$, $(0,0,2)$ and $(2,2,2)$. Therefore $(z, x_1, 1 - x_1)$ is in the convex span of these four points, when the RHS (i.e. separate) formulation is used. Thus for suitable $\theta_1, \theta_2, \theta_3, \theta_4$ we have:
In fact, (1) is simply long-hand for $(z, x_1, 1 - x_1) = \theta_1(0,0,0) + \theta_2(0,2,0) + \theta_3(0,0,2) + \theta_4(2,2,2)$. Note that (1) implies
by adding the rows for $x_1$ and $1 - x_1$. We will use (2) later.
When the LHS (i.e. pooled) representation is used, $(z, x_1)$ is in the LR iff it is in the convex span of $(0,0)$, $(1,1)$ and $(0,2)$, i.e. iff there are $\delta_1, \delta_2, \delta_3$ with $\delta_1 + \delta_2 + \delta_3 = 1$ and all $\delta_k \ge 0$
It suffices to show that a solution to (1) gives a solution to (3). (We know that the converse is true, since the pooled formulation is best possible.) Given a solution to (1), put $\delta_2 = 2\theta_4$, $\delta_3 = \theta_2$. Then from (1) we at once have $z = \delta_2$, $x_1 = 2\delta_3 + \delta_2$. Also using (2) we get $\delta_2 + \delta_3 = \theta_2 + 2\theta_4 \le 1$, so if we set $\delta_1 = 1 - \delta_2 - \delta_3$ we get $\delta_1 \ge 0$. So we do obtain a solution to (3).
We conclude: as far as the linear relaxations are concerned, here the pooled and separate formulations are equally good. This outcome is atypical, as commonly the pooled formulation is strictly superior.
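8. Here $P_i = \{x \mid \text{for some } \theta_1, \dots, \theta_n \ge 0 \text{ with } \sum_{j=1}^{n} \theta_j \le 1, \text{ we have } x = \sum_{j=1}^{n} [\theta_j b_i / a_{ij}]\, e_j\}$ where $e_j$ is the $j$-th unit vector. We have that the desired set $S = P_1 \cup \dots \cup P_m$.
After simplification, we obtain: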
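9. One way of representing it is from the four polyhedra (the four points $(0,0,0)$, $(1,0,0)$, $(0,1,0)$ and $(1,1,1)$ of the graph):
The disjunctive construction gives:
This simplifies to:
$x_1 = \lambda_1 + \lambda_2$, $x_2 = \lambda_1 + \lambda_3$, $x_3 = \lambda_1$, $\lambda_1 + \lambda_2 + \lambda_3 \le 1$, all $\lambda_i$ binary.
In this case, since $x_3 = \lambda_1$ one may also use the more standard simplification:
$x_1 \ge x_3$, $x_2 \ge x_3$; $x_1, x_2, x_3$ binary;
$\underbrace{x_3}_{\lambda_1} + \underbrace{(x_2 - x_3)}_{\lambda_3} + \underbrace{(x_1 - x_3)}_{\lambda_2} \le 1$, i.e. $x_1 + x_2 \le 1 + x_3$.
For if the $x_j$ are binary, and satisfy these constraints, the $\lambda_i$ are also binary.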
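As a quick sanity check of the "standard" simplification, the following sketch (an illustration only, with hypothetical function names) enumerates all binary triples and compares the solutions of $x_3 \le x_1$, $x_3 \le x_2$, $x_1 + x_2 \le 1 + x_3$ with the graph of AND. It also recovers $\lambda_1 = x_3$, $\lambda_2 = x_1 - x_3$, $\lambda_3 = x_2 - x_3$ and checks that they are binary with sum at most 1.

```python
from itertools import product


def satisfies_standard(x1, x2, x3):
    """The three linear constraints of the standard simplification."""
    return x3 <= x1 and x3 <= x2 and x1 + x2 <= 1 + x3


def lambdas(x1, x2, x3):
    """Control variables recovered from a binary solution."""
    return (x3, x1 - x3, x2 - x3)


# Graph of the AND function: all (x1, x2, x1 AND x2).
AND_GRAPH = sorted((a, b, a & b) for a, b in product((0, 1), repeat=2))

# Binary triples that satisfy the linear constraints.
FEASIBLE = sorted(p for p in product((0, 1), repeat=3) if satisfies_standard(*p))

# Every recovered lambda vector is binary and has sum at most 1.
LAMBDAS_OK = all(
    set(lambdas(*p)) <= {0, 1} and sum(lambdas(*p)) <= 1 for p in FEASIBLE
)
```

The enumeration covers only the integer points; it says nothing about the linear relaxation, which is the subject of the disjunctive construction itself.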
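10. Put $x = (x_1, x_2, x_3)$ and
$P_1 = \{x \ge 0 \mid x_1 + 2x_2 + x_3 \le 10\}$, $P_2 = \{x \ge 0 \mid 2x_1 + x_2 \le 5\}$.
We have $\mathrm{rec}^*(P_1) = \{(0,0,0)\}$ but $\mathrm{rec}^*(P_2) = \{(0,0,x_3) \mid x_3 \ge 0\}$ so the recession condition fails and $P_1 \cup P_2$ is not representable.
If we proceed anyway with the disjunctive construction we obtain:
$x^{(1)} \ge 0$, $x^{(2)} \ge 0$,
$x_1^{(1)} + 2x_2^{(1)} + x_3^{(1)} \le 10\lambda_1$,
$2x_1^{(2)} + x_2^{(2)} \le 5\lambda_2$,
$\lambda_1 + \lambda_2 = 1$, $x = x^{(1)} + x^{(2)}$, $\lambda_i$ binary.
If $\lambda_1 = 1$ then $\lambda_2 = 0$ and $x_1^{(2)} = x_2^{(2)} = 0$ but $x_3^{(2)}$ can be anything. Thus we obtain $Q_1 = \{x \ge 0 \mid x_1 + 2x_2 \le 10\}$ if $\lambda_1 = 1$. If $\lambda_2 = 1$, we obtain $P_2$, as is easily checked.
So the above represents $Q_1 \cup P_2$ instead of $P_1 \cup P_2$.
11. Let "$x_j$ int" abbreviate "$1 \le x_j \le 10$." Here is an enumeration of the logical possibilities:
(1) $x_1$ int, $x_2$ int, $x_3 = 0$, $x_4 = 0$; or
(2) $x_1$ int, $x_2 = 0$, $x_3$ int, $x_4 = 0$; or
(3) $x_1 = 0$, $x_2$ int, $x_3$ int, $x_4 = 0$; or
(4) $x_1 = 0$, $x_2$ int, $x_3 = 0$, $x_4$ int; or
(5) $x_1 = 0$, $x_2 = 0$, $x_3$ int, $x_4$ int.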
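The disjunctive construction introduces variables $x_j^{(i)}$ ($j = 1,2,3,4$) for the $i$-th logical possibility ($i = 1, \dots, 5$) together with control variables $\lambda_i \in \{0, 1\}$.
In the $i$-th system above, replace $x_j$ by $x_j^{(i)}$, and "$x_j$ int" is replaced by "$1 \cdot \lambda_i \le x_j^{(i)} \le 10 \cdot \lambda_i$". Then erase all "or"s (i.e. these become "and"s) and add constraints
$x_j = \sum_{i=1}^{5} x_j^{(i)}$ for $j = 1,2,3,4$, and $\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5 = 1$.
By direct addition on $i$, the previous implies:
$\lambda_1 + \lambda_2 \le x_1 \le 10(\lambda_1 + \lambda_2)$
$\lambda_1 + \lambda_3 + \lambda_4 \le x_2 \le 10(\lambda_1 + \lambda_3 + \lambda_4)$
$\lambda_2 + \lambda_3 + \lambda_5 \le x_3 \le 10(\lambda_2 + \lambda_3 + \lambda_5)$
$\lambda_4 + \lambda_5 \le x_4 \le 10(\lambda_4 + \lambda_5)$
$\sum_{i=1}^{5} \lambda_i = 1$, all $\lambda_i \in \{0, 1\}$ (*)
The above is easily checked to be a correct integer formulation. (Actually, it also preserves the sharp LR.)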
In the standard formulation, we may have:
$x_1 = 5$, $z_1 = \tfrac{1}{2}$, $x_2 = \tfrac{1}{2}$, $z_2 = \tfrac{1}{2}$, $x_3 = 0$, $z_3 = 0$, $x_4 = 10$, $z_4 = 1$
(note $z_1 + z_2 + z_3 + z_4 = 2$, $z_1 \le z_2 + z_3$).
We suppose that such $x_j$ settings are possible in the LR of (*), and we shall arrive at a contradiction. From $x_3 = 0$, (*) gives $\lambda_2 = \lambda_3 = \lambda_5 = 0$. From $\lambda_5 = 0$ and $x_4 = 10$, (*) gives $\lambda_4 = 1$. Since $\sum \lambda_i = 1$, $\lambda_1 = 0$. But $x_1 = 5$ gives $\lambda_1 + \lambda_2 \ge \tfrac{1}{2}$ in (*), and this is the contradiction.
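The two ingredients of this argument can be reproduced mechanically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the fractional values in the hint are $\tfrac{1}{2}$. It enumerates the admissible supports of the condition in Example 11, and verifies with exact rational arithmetic that the hinted point satisfies the linear relaxation of the standard formulation although it violates the condition itself.

```python
from fractions import Fraction
from itertools import combinations


def logical_possibilities():
    """Pairs of positive variables allowed by the condition of Example 11."""
    keep = []
    for support in combinations((1, 2, 3, 4), 2):
        # If x1 is positive, then x2 or x3 must be positive as well.
        if 1 in support and not (2 in support or 3 in support):
            continue
        keep.append(support)
    return keep


def in_standard_lr(x, z):
    """Linear relaxation of the standard formulation (z_j relaxed to [0, 1])."""
    bounds = all(0 <= zj <= 1 and zj <= xj <= 10 * zj for xj, zj in zip(x, z))
    return bounds and sum(z) == 2 and z[0] <= z[1] + z[2]


def satisfies_condition(x):
    """The original logical condition on x = (x1, x2, x3, x4)."""
    support = tuple(j + 1 for j, xj in enumerate(x) if xj > 0)
    in_range = all(1 <= x[j - 1] <= 10 for j in support)
    return support in logical_possibilities() and in_range


HALF = Fraction(1, 2)
X = (Fraction(5), HALF, Fraction(0), Fraction(10))  # assumed reading of the hint
Z = (HALF, HALF, Fraction(0), Fraction(1))
```

The enumeration returns the five supports in the same order as the five logical possibilities, so the index of each support matches the index $i$ of its control variable $\lambda_i$.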
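12. Put $A = \neg P_1 \wedge P_2 \wedge P_3$, $B = P_4 \wedge P_5 \wedge P_6$, $C = P_1 \wedge \neg P_7 \wedge \neg P_8$ $(= \neg(P_1 \rightarrow P_7 \vee P_8))$. The linear algorithm produces these clauses:
for $A$: $P_1 \vee \neg P_2 \vee \neg P_3 \vee A$; $\neg P_1 \vee \neg A$; $P_2 \vee \neg A$; $P_3 \vee \neg A$
for $B$: $\neg P_4 \vee \neg P_5 \vee \neg P_6 \vee B$; $P_4 \vee \neg B$; $P_5 \vee \neg B$; $P_6 \vee \neg B$
for $C$: $\neg P_1 \vee P_7 \vee P_8 \vee C$; $P_1 \vee \neg C$; $\neg P_7 \vee \neg C$; $\neg P_8 \vee \neg C$
for $D$: $\neg A \vee D$; $\neg B \vee D$; $\neg C \vee D$; $A \vee B \vee C \vee \neg D$
together with the unit clause $D$.
Thus, we have $16 + 1 = 17$ inequalities, which actually simplify to 13 upon using $z(D) = 1$ (simplifications occur in the last group). The usual CNF has $3^3 = 27$ clauses and so 27 inequalities, of which 3 are redundant (as $P_1$ and $\neg P_1$ occur in the disjunction). These correspond to the 27 ways of picking one element from each of $A$, $B$ and $C$. The (projection of the) LR from the linear-time procedure is identical with the LR from the usual procedure, since no literal appears twice in the same clause of the usual CNF (letters may appear twice, if they are of opposite sign). See [Bla Jer Low 1985].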
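The clause counts quoted in this solution can be reproduced with a short script. This is a sketch under stated assumptions rather than the procedure of [Bla Jer Low 1985]: literals are encoded as signed integers, the numbering of the new letters $A, B, C, D$ as 9 to 12 is hypothetical, and each new letter is defined by a full equivalence. The last function checks by brute force that both clause sets have the same solutions in $P_1, \dots, P_8$.

```python
from itertools import product

# Letters P1..P8 are 1..8; new letters A, B, C, D are 9..12 (hypothetical numbering).
A_LITS, B_LITS, C_LITS = [-1, 2, 3], [4, 5, 6], [1, -7, -8]
A, B, C, D = 9, 10, 11, 12


def define_and(name, lits):
    """Clauses stating name <-> (conjunction of lits)."""
    return [[-l for l in lits] + [name]] + [[l, -name] for l in lits]


def define_or(name, parts):
    """Clauses stating name <-> (disjunction of parts)."""
    return [[-p, name] for p in parts] + [list(parts) + [-name]]


# Linear-time CNF: definitions of A, B, C, D plus the unit clause D.
LINEAR = (define_and(A, A_LITS) + define_and(B, B_LITS) + define_and(C, C_LITS)
          + define_or(D, [A, B, C]) + [[D]])

# Fix D true: drop satisfied clauses, delete the literal "not D" elsewhere.
SIMPLIFIED = [[l for l in c if l != -D] for c in LINEAR if D not in c]

# Usual CNF: one literal from each of A, B, C.
USUAL = [list(choice) for choice in product(A_LITS, B_LITS, C_LITS)]
TAUTOLOGIES = [c for c in USUAL if any(-l in c for l in c)]


def holds(clauses, value):
    """True if every clause has a literal made true by the valuation."""
    return all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses)


def same_solutions():
    """Compare both CNFs on every valuation of P1..P8."""
    for bits in product((False, True), repeat=8):
        value = dict(zip(range(1, 9), bits))
        usual = holds(USUAL, value)
        linear = any(
            holds(LINEAR, {**value, **dict(zip((A, B, C, D), aux))})
            for aux in product((False, True), repeat=4)
        )
        if usual != linear:
            return False
    return True
```

The script compares only the integer solutions; the statement about identical linear relaxations is the result cited from [Bla Jer Low 1985] and is not tested here.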
13. As there is no unit clause or monotone letter, we begin by branching on $P_1$ (see Figure 31 for a complete solution).
[Search tree: put a letter T (monotone); later put a letter T (unit); done, a satisfying valuation is found.]
Figure 31: Truth Valuation Search Tree
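The search summarized in Figure 31 can be sketched in a few lines. The code below is a minimal illustration, not the author's implementation: it assumes the clause set of Example 13 reads $P_1 \vee \neg P_2$, $\neg P_1 \vee \neg P_3 \vee P_4$, $P_3 \vee \neg P_4$, $P_2 \vee P_3 \vee P_4$, encodes literals as signed integers, and uses hypothetical function names. It applies the unit rule, then the monotone (pure) letter rule, and branches in the order $P_1, P_2, P_3, \dots$ only when neither rule applies.

```python
def simplify(clauses, lit):
    """Set literal lit true: drop satisfied clauses, shorten the others."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]


def davis_putnam(clauses, order, trail=()):
    """Return a list of (literal, reason) steps, or None if unsatisfiable."""
    if not clauses:
        return list(trail)  # every clause is satisfied
    if any(len(c) == 0 for c in clauses):
        return None  # an empty clause: this branch fails
    for c in clauses:  # unit rule
        if len(c) == 1:
            step = (c[0], "unit")
            return davis_putnam(simplify(clauses, c[0]), order, trail + (step,))
    lits = {l for c in clauses for l in c}
    for l in sorted(lits, key=abs):  # monotone (pure) letter rule
        if -l not in lits:
            step = (l, "monotone")
            return davis_putnam(simplify(clauses, l), order, trail + (step,))
    var = next(v for v in order if v in lits or -v in lits)
    for l in (var, -var):  # branch: try true first, then false
        step = (l, "branch")
        result = davis_putnam(simplify(clauses, l), order, trail + (step,))
        if result is not None:
            return result
    return None


# Assumed reading of the clause set of Example 13.
CLAUSES = [[1, -2], [-1, -3, 4], [3, -4], [2, 3, 4]]
ORDER = [1, 2, 3, 4]
TRAIL = davis_putnam(CLAUSES, ORDER)
```

Under the assumed clause set the trail contains one monotone step and one unit step after the first branch, and the literals it sets true satisfy every clause.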
14. For Horn clauses, the subroutine CC (clausal chaining) alone will detect an inconsistency, if any exists. Here there is one:
1
starting unit clauees
Pl p3 p4 +2
1PS lP1 v 7P3 v Pe TP6 v TP7 v P2 7Pa v -Pa
V
Pa
1P1 v lP8 v 7P7 v PS l P @v lP10 v PI1 lP1 v l P , v Pro 1Pe v -Pro v P7 after all starting units are used : unit+ unit
+
unit+ unit+
Pe p3i p 4 -7Pe v 7P7 P2,PS + F 7Ps v 7P7 Pro 1Pe v lP10 v P7 1P7 p3, p4, p6, PI0 TPS v TP7 p21 pi3 + F P7
--.)
Two contradictory literals, hence inconsistent.
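Clausal chaining for Horn clauses amounts to forward propagation of unit clauses. The sketch below is an illustration with hypothetical names and a small hypothetical clause set (it does not use the clauses of Example 14): each clause is a list of signed integers with at most one positive literal, and a clause whose negated letters are all known to be true either forces its positive letter or, if it has none, exposes an inconsistency.

```python
def clausal_chaining(clauses):
    """Forward chaining on Horn clauses.

    Returns (consistent, true_letters). When consistent, setting the returned
    letters true and all other letters false satisfies every clause.
    """
    for c in clauses:
        if sum(1 for l in c if l > 0) > 1:
            raise ValueError("not a Horn clause: %r" % (c,))
    true_letters = set()
    changed = True
    while changed:
        changed = False
        for c in clauses:
            body = {-l for l in c if l < 0}   # letters that occur negated
            heads = [l for l in c if l > 0]   # at most one positive letter
            if body <= true_letters:
                if not heads:
                    return False, true_letters  # all literals false
                if heads[0] not in true_letters:
                    true_letters.add(heads[0])
                    changed = True
    return True, true_letters


# Hypothetical examples (not the clause set of Example 14).
INCONSISTENT = [[1], [3], [-1, -3, 6], [-6, 7], [-3, -7]]
SATISFIABLE = [[1], [-1, 2], [-2, -3]]
```

In the first hypothetical set the units 1 and 3 force 6 and then 7, which contradicts the clause "not 3 or not 7". In the second set propagation stops with letters 1 and 2 true, and setting letter 3 false completes a satisfying valuation.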
15. The idea is to look at all minimal fixings (of truth values) which make all the clauses true. For truth values which do not matter, we put "-", which will later be interpreted as any (0,1) value. As $P_1$ and $P_2$ occur so frequently in all clauses, we condition the analyses on their settings. Our approach is illustrative of a pre-processing routine.
Pi Pa P3 P 4 0 0 0 1 0 1 1 0 1 0 1 1
1
-
-
0
-
Pa 1 1 1
s
1 0 0 1 0 0 1
- -
This kind of preprocessing can be implemented by first a 4-way branching on $P_1, P_2$; then clausal chaining (CC) to locate fixed values; then some more branching and CC. Via the same kind of simplifications as in Example 11, we obtain:
521 5 ( A 6 + h + A 7 f x 8 )
(AI+xb+A7+x8) (14
+ +
AS)
+
A6)
A7
(A4
5 2 2 5 (A4 + A7 + AS) 5 21 5 (h+ A6 + 'b) I
24
5 (Xi
+ As + Aa + As + 'b + A7 + As)
(=I-&) A1
+& + + xb + A, A7
5 a6 5 5 ae 5
(A1 (A1
+ + + + + Aa + A4
A7
AS)
A4
+As
+ x6 + + As) A7
(= 1)
where 2 j E {0,I} is the truth value of Pj, and
+ A2 + ...+ A0 = 1
all A i E {O,I) (#,#,l,i,l,#) is clearly in the LR of the "standard" repn of dauses (as in Problem two). However: A1
23
so then
= 1 -+
26
A4+ AS
= 1 -+
A4
+ A6 = 1
= 1 + As =
XI = 1 -+
23
-+
A1
= A1 = A3 = A7 = A8 = 0
= 0 and we reach a contradiction:
= 1, but
23
1 =5
Actually, the formulation above describes the convex hull of the satisfying truth valuations-a property of this technique in general.
BIBLIOGRAPHY
[Aho Hop Ull 1974] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley Pub. Co., Reading, Mass., 1974.
[And 1976] P. Andrews, "Refutations by Mating," IEEE Transactions on Computers C-25 (1976) 801-806.
[Bal 1965] E. Balas, "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research 13 (1965) 517-546.
[Bal 1974] E. Balas, "Disjunctive Programming: Facets of the Convex Hull of Feasible Points," no. 348, GSIA, Carnegie-Mellon University, 1974.
[Bal 1975] E. Balas, "Disjunctive Programming: Cutting-planes from Logical Conditions," in O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, Nonlinear Programming 2, Academic Press, New York (1975) 279-312.
[Bal 1979] E. Balas, "Disjunctive Programming," in P. L. Hammer, E. L. Johnson, and B. H. Korte, eds., Discrete Optimization II, North Holland Publishing Company, 1979, pp. 3-52.
[Bal 1985] E. Balas, "Disjunctive Programming and a Hierarchy of Relaxations for Discrete Optimization Problems," SIAM Journal on Algebraic and Discrete Methods 6 (1985) 466-486.
[Balc 1984] Osman Balci, "Requirements for Model Development Environments," Tech Rep. SRC-85-006, Dept. Computer Science, Virginia Polytechnic Institute and State University.
[Balc Nan 1985] O. Balci and R. E. Nance, "Formulated Problem Verification as an Explicit Requirement of Model Credibility," Simulation 45 (1985) 76-86.
[Bal Pul 1983] E. Balas and W. Pulleyblank, "The Perfectly Matchable Subgraph Polytope of a Bipartite Graph," Networks 13 (1983) 495-516.
[Bar Van Wol 1984a] Imre Barany, Tony J. van Roy, and Lawrence A. Wolsey, "Strong Formulations for Multi-item Capacitated Lot Sizing," Management Science 30 (1984) 1255-1261.
[Bar Van Wol 1984b] Imre Barany, Tony van Roy, and Lawrence A. Wolsey, "Uncapacitated Lot Sizing: The Convex Hull of Solutions," Mathematical Programming Study 22 (1984) 32-43.
[Bar Van Wol 1985] I. Barany, T. J. van Roy, and L. A. Wolsey, "Valid Linear Inequalities for Fixed Charge Problems," Operations Research 33 (1985) 842-861.
[Bard Fal 1982] J. F. Bard and J. E. Falk, "An Explicit Solution to the Multi-level Programming Problem," Computers and Operations Research 9 (1982) 77-100.
[Barr Fei 1981] Avron Barr and Edward A. Feigenbaum, The Handbook of Artificial Intelligence, HeurisTech Press and William Kaufmann, Inc., Stanford and Los Altos, California, vol. I, 1981.
[Barre 1985] Eamon Barrett, "Proving Theorems in Continuous Logic by Linear Programming Techniques," Smart Systems Technology, McLean, VA, 1985.
[Baz She 1976] M. S. Bazaraa and C. M. Shetty, Foundations of Optimization, Springer-Verlag, Berlin, 1976.
[Bea 1979] E. M. L. Beale, "Branch-and-Bound Methods for Mathematical Programming," in Discrete Optimization II, eds. P. L. Hammer, E. L. Johnson, and B. H. Korte, North Holland, Amsterdam, 1979, 201-221.
[Bend 1962] J. F. Benders, "Partitioning Procedures for Solving Mixed Variable Extremum Problems," Numerische Math 4 (1962) 238-251.
[Ben 1980] M. K. Bennett, "An Embedding Theorem for Finite Lattices," Journal of Combinatorics, Information, and Systems Sciences 5 (1980) 54-57.
[Ber 1977] Leonard Berman, "Precise Bounds for Presburger Arithmetic and the Reals with Addition: Preliminary Report," Proceedings of the 18th Annual Symposium on Foundations of Computer Science, 1977, pp. 95-99.
[Bla 1977] C. E. Blair, "Two Rules for Deducing Valid Inequalities for Zero-One Programs," SIAM Journal of Applied Math 31 (1977) 614-617.
[Bla 1978] C. E. Blair and R. G. Jeroslow, "A Converse for Disjunctive Constraints," Journal of Optimization Theory and Its Applications 25 (1978) 195-206.
[Bla Jer 1983] C. E. Blair and R. G. Jeroslow, "Computational Complexity of Some Problems in Parametric Discrete Programming, I." See Mathematics of Operations Research 11 (1986) 241-260.
[Bla Jer 1984] C. E. Blair and R. G. Jeroslow, "Extensions of a Theorem of Balas," Discrete Applied Mathematics 9 (1984) 11-26.
[Bla Jer Low 1985] C. E. Blair, R. G. Jeroslow, and J. K. Lowe, "Some Results and Experiments on Programming Techniques for Propositional Logic," January 1985. See Computers and Operations Research 13 (1986) 633-645.
[Ble 1974] W. W. Bledsoe, "The SUP-INF Method for Presburger Arithmetic," University of Texas memo ATP-18, Dept. of Mathematics, 1974.
[Ble Lov 1983] W. W. Bledsoe and D. W. Loveland, eds., Automated Theorem Proving: After 25 Years, Contemporary Mathematics, vol. 29, American Mathematical Society, Rhode Island, 1983.
[Bon Hol Whi 1982] Robert H. Bonczek, Clyde W. Holsapple, and Andrew B. Whinston, Foundations of Decision Support Systems, Academic Press, New York, 1982.
[Bor 1972] J. M. Borwein, "The Minimum of a Family of Programs," in Operations Research Verfahren 31, Franz Steffens, ed., Athenaum Verlag, 1979, pp. 99-111.
[Bro Pur 1982] Cynthia A. Brown and Paul W. Purdom, "An Empirical Comparison of Backtracking Algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence 4 (1982) 309-316.
[Cass 1975] J. W. S. Cassels, "Measures of the Nonconvexity of Sets and the Shapley-Folkman-Starr Theorem," Mathematical Proc. of the Cambridge Philosophical Soc. 78 (1975) 433-436.
[Cha Sto 1976] A. K. Chandra and L. J. Stockmeyer, "Alternation," in Proceedings of the 17th Annual Symposium on Foundations of Computer Science, 1976, pp. 98-108.
[Cha Har 1982] Ashok K. Chandra and David Harel, "Horn Clauses and the Fixpoint Query Hierarchy," 1982.
[Cha McD 1985] E. Charniak and D. McDermott, Introduction to Artificial Intelligence, Addison-Wesley Publ. Co., Reading, Mass. 662 pgs.
[Char Coo 1961] A. Charnes and W. W. Cooper, Management Models and Industrial Applications of Linear Programming, Vols. I and II, John Wiley, New York, 1961.
[Che Hen] Michael C. Chen and Lawrence J. Henschen, "On the Use and Internal Structure of Logic-Based Decision Support Systems," to appear in Decision Support Systems.
[Cho 1983] Ui C. Choe, Arc-Path Approaches to Fixed-Charge Network Problems, Ph.D. Dissertation, Georgia Institute of Technology, 1983.
[Chv 1973] V. Chvatal, "Edmonds Polytopes and a Hierarchy of Combinatorial Problems," Discrete Math 4 (1973) 305-337.
[Chv 1978] V. Chvatal, "Rational Behavior and Computational Complexity," McGill University, 1978.
[Cla 1984] A. Claus, "A New Formulation for the Travelling Salesman Problem," SIAM Journal Alg. Discrete Methods 5 (1984) 21-25.
[Clo Mel 1984] W. F. Clocksin and C. S. Mellish, Programming in PROLOG, 2nd Edition, Springer-Verlag, New York, 1984.
[Cod 1972] E. F. Codd, "Relational Completeness of Data Base Sublanguages," in Data Base Systems, edited by R. Rustin, Prentice Hall Publ. Co., 1972.
[Coo 1971] S. A. Cook, "The Complexity of Theorem Proving Procedures," Proceedings of the Third SIGACT Symposium, 1971, pp. 151-178.
[Coop 1972] D. C. Cooper, "Theorem Proving in Arithmetic without Multiplication," in Machine Intelligence 7, B. Meltzer and D. Michie, eds., American Elsevier, New York, 1972, pp. 91-99.
[Cro Joh Pad 1983] H. Crowder, E. L. Johnson and M. W. Padberg, "Solving Large-Scale 0-1 Linear Programming Problems," Operations Research 31 (1983) 803-835.
[Cro Pad 1980] H. Crowder and M. W. Padberg, "Solving Large-Scale Symmetric Travelling Salesman Problems to Optimality," Management Science 26 (1980) 495-509.
[Dan 1957] G. B. Dantzig, "Discrete Variable Extremum Problems," Operations Research 5 (1957) 266-277.
[Dan 1963] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, Princeton, New Jersey, 1963.
[Dan 1982] George B. Dantzig, "Reminiscences About the Origins of Linear Programming," Operations Research Letters 1 (1982) 43-48.
[Dav Put 1960] M. Davis and H. Putnam, "A Computing Procedure for Quantification Theory," Journal of the ACM 7 (1960) 201-215.
[Dav Put Rob 1961] M. Davis, H. Putnam, and J. Robinson, "The Decision Problem for Exponential Diophantine Equations," Annals of Mathematics 74 (1961) 425-436.
[Dav Len 1982] Randall Davis and Douglas B. Lenat, Knowledge-Based Systems in Artificial Intelligence, McGraw-Hill International, New York, 1982.
[Dav Buc Sho 1977] R. Davis, B. Buchanan, and E. Shortliffe, "Production Rules as a Representation for a Knowledge-Based Consultation Program," Artificial Intelligence 8 (1977) 15-45.
[Den 1984] L. Denenberg, Computational Complexity of Logical Problems: Formulas, Dependencies, and Circuits, Ph.D. Thesis, Harvard University, 1984.
[Den Lew 1983] Larry Denenberg and Harry R. Lewis, "Logical Syntax and Computational Complexity," in Proceedings of the Logic Colloquium at Aachen, Springer Lecture Notes in Mathematics 1104, 1983, pp. 109-115.
[Dob Lip Rei 1979] David Dobkin, Richard Lipton, and Steven Reiss, "Linear Programming is Log Space Hard for P," Information Processing Letters 8 (1979) 96-97.
[Dow Gal 1984] W. F. Dowling and J. H. Gallier, "Linear Time Algorithms for Testing the Satisfiability of Horn Formulae," Journal of Logic Programming 3 (1984) 267-284.
[Epp Mar 1985] Gary D. Eppen and R. Kipp Martin, "Solving Multi-item Capacitated Lot Sizing Problems Using Variable Redefinition," Graduate School of Business, University of Chicago, May 1985.
[Epp Gou 1985] G. D. Eppen and F. J. Gould, Quantitative Concepts for Management, Prentice-Hall, 2nd edition, 1985.
[Feig McCor 1983] Edward A. Feigenbaum and Pamela McCorduck, The Fifth Generation: Artificial Intelligence and Japan's Computer Challenge to the World, Addison-Wesley Publ. Co., Reading, Mass., 1983. 267 pp.
[Fis Rab 1974] M. J. Fischer and M. O. Rabin, "Super-exponential Complexity of Presburger Arithmetic," in Complexity of Computation, vol. 7, SIAM-AMS Proceedings, American Mathematical Society, Providence, RI, 1974.
[For Ful 1962] L. R. Ford, Jr., and D. R. Fulkerson, Flows in Networks, Princeton University Press, Princeton, New Jersey, 1962.
[Fra 1984] John Franco, "Probabilistic Analysis of the Pure Literal Heuristic for the Satisfiability Problem," Annals of Operations Research 1 (1984) 273-289.
[Fra Ho 1986] John Franco and Yuan Chuan Ho, "Probabilistic Performance of a Heuristic for the Satisfiability Problem," Computer Science Department, Indiana University, 1986.
[Fra Pau 1983] John Franco and Marvin Paull, "Probabilistic Analysis of the Davis Putnam Procedure for Solving the Satisfiability Problem," Discrete Applied Mathematics 5 (1983) 77-87.
[Gar Joh 1979] M. Garey and D. Johnson, Computers and Intractability, W. H. Freeman, 1979.
[Gar Nem 1972] R. Garfinkel and G. L. Nemhauser, Integer Programming, John Wiley and Sons, New York, 1972.
[Gen 1969] G. Gentzen, "Investigations into Logical Deduction," in The Collected Papers of Gerhard Gentzen, M. E. Szabo, ed., North Holland Publishing Company, London, 1969, 68-131.
[Geo 1974] A. M. Geoffrion, "Lagrangean Relaxation for Integer Programming," in Mathematical Programming, Study 2, North-Holland, Amsterdam, 1974.
BIBLIOGRAPHY
209
[Geo 19851 Arthur M. GeoEion, Structured Modeling, UCLA Grad School of Management, 1985. [Geo Gra 19741 A. M. Geofiion and G. W. Graves, "Multicommodity Distribution System Design by Benders Decomposition," Management Science 20 (1974) 822-844.
[Glo 1975al F. Glover, "New Results on Equivalent Integer P r o g r m m h q Formulations, " Mathematical Prognamming 8 (1975) 84-90. [Glo 1975b] F. Glover, "Polyhedral Annexation in Mixed Integer and Combinatorial Programming, " Mathematical Prognamming 9 (1975) 161-188. [Glo 19851 Fred Glover, "Future Paths for Integer Programming and Links to Artificial Intelligence," CAM Report 85-8, University of Colorado, 1985. [Glo Hul Kli Stu 19781 F. Glover, J. Hulte, D. Klingman, and J. Stutz "Generalized Networks: A Fundamental Computer-Based Planning Tool," Management Science 24 (1978) 1209-1220.
[Glo Kli McM 1977] Fred Glover, Darwin Klingman, and Claude McMillan, "The NETFORM Concept: A More Effective Model Form and Solution Procedure for Large Nonlinear Problems," Center for Cybernetic Studies, University of Texas, 1977.
[Glo McM and Kli] Fred Glover, Claude McMillan, and Darwin Klingman, "Modeling Combinatorial Mathematical Programming Problems by NETFORMS: An Illustrative Application," in Nonlinear Programming 3, Academic Press, pp. 303-336.
[Godel 1930] K. Gödel, "Die Vollständigkeit der Axiome des logischen Funktionenkalküls," Monatschrift für Math. Phys. 37 (1930) 349-360.
[Godel 1931] K. Gödel, "Ueber formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I," Monatschrift für Math. Phys. 38 (1931) 173-198.
[Gol 1965] E. M. Gold, "Limiting Recursion," Journal of Symbolic Logic 30 (1965) 28-48.
[Gom 1963] R. E. Gomory, "An Algorithm for Integer Solutions to Linear Programs," in R. L. Graves and P. Wolfe, eds., Recent Advances in Mathematical Programming, McGraw-Hill, 1963.
[Gom 1969] R. E. Gomory, "Some Polyhedra Related to Combinatorial Problems," Linear Algebra and its Applications 2 (1969) 451-558.
[Gran Ham 1974] F. Granot and P. L. Hammer, "On the Role of Generalized Covering Problems," Cahiers du Centre d'Etudes de Recherche Operationnelle 16 (1974) 277-289.
[Granl 1980] D. Granot and F. Granot, "Generalized Covering Relaxation for 0-1 Programs," Operations Research 28 (1980) 1442-1449.
[Green 1969] C. C. Green, "Theorem Proving by Resolution as a Basis for Question Answering Systems," in Machine Intelligence, vol. 4, edited by B. Meltzer and D. Michie, American Elsevier Publishing Company, New York, 1969, pp. 183-208.
[Greenb 1983a] H. J. Greenberg, "MIP Tactics," in Mixed Integer Programming and Mathematical Programming Systems, R. H. F. Jackson and R. P. O'Neill, eds, MPS/COAL Special Joint Issue, 1983.
[Greenb 1983b] Harvey J. Greenberg, "A Functional Description of ANALYSE: A Computer-Assisted Analysis System for Linear Programming Models," ACM Transactions on Mathematical Software 9 (1983) 18-56.
[Greenb 1985] H. J. Greenberg, "The Fifth Generation of Mathematical Programming Systems: Toward an Intelligent MPS," presentation at the TIMS College on the Practice of Management Science Workshop, July 28-August 2, 1985.
[Greenb 1986] H. J. Greenberg, "Diagnosing Infeasibility in Min-Cost Network Flow Models," University of Colorado at Denver, 1986.
[Greenb Lund May 1984] H. J. Greenberg, J. R. Lundgren, and J. S. Maybee, "Signed Graphs of Netforms," in Congressus Numerantium 44 (1984) 105-115.
[Ham 1974] P. L. Hammer, "Boolean Procedures for Bivalent Programming," in "Mathematical Programming in Theory and Practice", P. L. Hammer and G. Zoutendijk (editors), North Holland Publ. Comp., Amsterdam New York, 1974, pp. 311-363.
[Ham 1976] P. L. Hammer, "Boolean Procedures in Optimization," Symposia Mathematica 19 (1976) 103-121.
[Ham 1979] P. L. Hammer, "Boolean Elements in Combinatorial Optimization," Annals of Discrete Mathematics 4 (1979) 51-71.
[Ham Joh Pel 1974] P. L. Hammer, E. L. Johnson and U. N. Peled, "Regular 0-1 Programs," Cahiers du Centre d'Etudes de Recherche Operationnelle 16 (1974) 267-276.
[Ham Han Sim 1979] P. L. Hammer, P. Hansen and B. Simeone, "Best Linear Relaxations for Quadratic 0-1 Optimization," RR no. 79-37, Dept. of Comb. and Opt., University of Waterloo, 1979.
[Ham Han Sim 1984] P. L. Hammer, P. Hansen, B. Simeone, "Roof-Duality, Complementation and Persistence in Quadratic 0-1 Optimization," Mathematical Programming 28 (1984) 121-155.
[Ham Ngu 1979] P. L. Hammer and S. Nguyen, "APOSS - A Partial Order in the Solution Space of Bivalent Programs," Combinatorial Optimization, N. Christofides et al (editors), John Wiley and Sons, 1979, 93-106.
[Ham Pel 1972] P. L. Hammer and U. N. Peled, "On the Maximization of a Pseudo-Boolean Function," Journal of the Association for Computing Machinery 19 (1972) 265-282.
[Ham Rud 1970] P. L. Hammer and S. Rudeanu, Boolean Methods in Operations Research and Related Areas, Springer Verlag, Berlin/Heidelberg/New York 1968, 330 pages. Edition Francaise: Dunod, Paris, 1970.
[Ham Sim 1987a] P. L. Hammer and B. Simeone, "Order Relations of Variables in 0-1 Programming," Annals of Discrete Mathematics 31 (1987) 83-112.
[Ham Sim 1987b] P. L. Hammer and B. Simeone, "Quadratic Functions of Binary Variables", RUTCOR Research Report, Rutgers University 1987. To appear in: Trends in Combinatorial Optimization, Bruno Simeone (editor), Lecture Notes in Mathematics, Springer Verlag Berlin, 1987.
[Han 1979] P. Hansen, "Methods of Nonlinear 0-1 Programming," Annals of Discrete Mathematics 5 (1979) 53-71.
[Har Kin 1985] Paul Harmon and David King, Expert Systems: Artificial Intelligence in Business, John Wiley and Sons, New York, 1985.
[Har Lew Stea 1965] J. Hartmanis, P. M. Lewis, and R. E. Stearns, "Classification of Computations by Time and Memory Requirements," Proc.
IFIP Congress 65, International Federation for Information Processing, Spartan Books, Washington, D.C., pp. 31-35.
[Har Stea 1965] J. Hartmanis and R. E. Stearns, "On the Computational Complexity of Algorithms," Transactions of the AMS 117 (1965) 285-306.
[Hay Wat Len 1983] F. Hayes-Roth, D. Waterman, and D. B. Lenat, Building Expert Systems, Addison-Wesley, Reading, Mass, 1983.
[Hel Kar 1970] M. Held and R. M. Karp, "The Travelling Salesman Problem and Minimum Spanning Trees," Operations Research 18 (1970) 1138-1162.
[HenNaq] Lawrence J. Henschen and Shamim A. Naqvi, "On Compiling Queries in Recursive First-Order Databases," to appear.
[Hen 1979] L. J. Henschen, "Theorem Proving by Covering Expressions," Journal of ACM 26 (1979) 385-400.
[Henk 1949] L. Henkin, "The Completeness of the First Order Functional Calculus," Journal of Symbolic Logic 14 (1949) 159-166.
[Henk 1950] L. Henkin, "Completeness in the Theory of Types," Journal of Symbolic Logic 15 (1950) 81-91.
[Ho 1976] A. C. Ho, "Cutting-Planes for Disjunctive Programs: Balas' Aggregated Problem," Carnegie-Mellon University, 1976.
[Hoc 1984] Dorit S. Hochbaum, "The Minimum Spanning Tree Problem," notes, 1984.
[Hof Pad 1984] Karla Hoffman and Manfred Padberg, "LP-Based Combinatorial Problem-Solving," 1984.
[Hol Whi 1986] Clyde W. Holsapple and Andrew B. Whinston, Manager's Guide to Expert Systems Using GURU, Dow Jones-Irwin, Homewood, Illinois, 1986.
[Hop Ull 1969] J. E. Hopcroft and J. D. Ullman, Formal Languages and Their Relation to Automata, Addison-Wesley, Reading, Mass, 1969.
[Iba 1976] T. Ibaraki, "Integer Programming Formulation of Combinatorial Optimization Problems," Discrete Mathematics 16 (1976) 39-52.
[Imm 1982] Neil Immerman, "Relational Queries Computable in Polynomial Time: Extended Abstract," 1982.
[Jack 1986] Peter Jackson, Introduction to Expert Systems, Addison-Wesley Publ. Co., Reading, Mass. 1986.
[Jac O'Ne 1981] R. H. F. Jackson and R. P. O'Neill, eds. COAL Special Issue: Mixed Integer Programming in Mathematical Programming Systems, an ORSA and COAL/MPS publication, 1981.
[Jer 1973a] R. Jeroslow, "There Cannot be any Algorithm for Integer Programming with Quadratic Constraints," Operations Research 21 (1973) 221-224.
[Jer 1973b] R. G. Jeroslow, "The Simplex Algorithm with the Pivot Rule of Maximizing Criterion Improvement," Discrete Mathematics 4 (1973) 367-377.
[Jer 1974] R. Jeroslow, "Cutting-Planes for Relaxations of Integer Programs," no. 347, GSIA, Carnegie-Mellon University, 1974.
[Jer 1975] R. G. Jeroslow, "Experimental Logics and $\Delta^0_2$-Theories," Journal of Philosophical Logic 4 (1975) 253-267.
[Jer 1977] R. Jeroslow, "Cutting-Plane Theory: Disjunctive Methods," Annals of Discrete Mathematics 1 (1977) 293-330.
[Jer 1978a] Robert G. Jeroslow, "Cutting-plane Theory: Algebraic Methods," Discrete Mathematics 23 (1978) 121-150.
[Jer 1978b] R. G. Jeroslow, "Cutting-planes for Complementarity Constraints," SIAM Journal on Control and Optimization 16 (1978) 56-62.
[Jer 1980] R. Jeroslow, "Representations of Unbounded Optimizations as Integer Programs," Journal on Optimization Theory and Its Applications 30 (1980) 339-351.
[Jer 1984a] R. Jeroslow, "Representability of Functions," Georgia Institute of Technology, 1984. To appear in Discrete Applied Mathematics.
[Jer 1984b] R. Jeroslow, "Representability in Mixed Integer Programming, I: Characterization Results," Georgia Institute of Technology, 1984. See Discrete Applied Mathematics 17 (1987) 223-243.
[Jer 1984c] R. Jeroslow, "Representability in Mixed Integer Programming, II: A Lattice of Relaxations," Georgia Institute of Technology, 1984.
[Jer 1985a] R. Jeroslow, "On Monotone Chaining Procedures of the CF Type," College of Management, Georgia Institute of Technology, 1985. To appear in Decision Support Systems.
[Jer 1985b] R. Jeroslow, "An Extension of Mixed Integer Programming Models and Techniques to Some Database and Artificial Intelligence Settings", Georgia Institute of Technology, 1985.
[Jer 1985c] Robert G. Jeroslow, "The Polynomial Hierarchy and a Simple Model for Competitive Analysis," Mathematical Programming 32 (1985) 146-164.
[Jer 1985d] Robert G. Jeroslow, "Computation-oriented Reductions of Predicate to Propositional Logic," Georgia Institute of Technology, 1985. To appear in Decision Support Systems.
[Jer 1985e] R. Jeroslow, "The Programming of (Some) Intelligence: Opportunities at the OR/AI Interface," OPTIMA (Newsletter of the Mathematical Programming Society) 14 (1985) 1-3.
[Jer 1986a] Robert G. Jeroslow, "A Simplification for Disjunctive Formulations," Georgia Institute of Technology, 1986. To appear in European Journal of Operations Research.
[Jer 1986b] R. Jeroslow, "Alternative Formulations of Mixed-Integer Programs," Georgia Institute of Technology, 1986. To appear in Approaches to Intelligent Decision Support.
[Jer Low 1984] R. Jeroslow and J. K. Lowe, "Modelling with Integer Variables," Mathematical Programming Studies 22 (1984) 167-184.
[Jer Lowe 1985] R. G. Jeroslow and J. K. Lowe, "Experimental Results on the New Techniques for Integer Programming Formulations," Journal of the Operational Research Society 36 (1985) 393-403.
[Joh Kos Suh 1985] Ellis L. Johnson, Michael M. Kostreva, and Uwe H. Suhl, "Solving 0-1 Integer Programming Problems Arising from Large Scale Planning Models," Operations Research 33 (1985) 803-819.
[Jon Laa 1974] N. Jones and W. Laaser, "Complete Problems for Deterministic Polynomial Time," Proc. Sixth Ann. ACM Symp. Th. of Computing (1974) 40-46.
[Kar 1972] R. M. Karp, "Reducibility Among Combinatorial Problems," in Complexity of Computer Computations, edited by R. E. Miller and J. W. Thatcher, Plenum Press, New York, 1972, pp. 95-104.
[Kha 1979] L. G. Khachian, "A Polynomial Algorithm for Linear Programming," Doklady Akad. Nauk USSR 224 (1979) 1093-96. Translated in Soviet Math. Doklady 20, pp. 191-194.
[Kle Min 1971] V. Klee and G. J. Minty, "How Good is the Simplex Algorithm?" in Inequalities III, O. Shisha, ed., Academic Press, New York, 1971.
[Kli Pad Phi 1986] D. Klingman, R. Padman, and N. Phillips, "Intelligent Decision Support Systems - A Unique Application in the Petroleum Industry," University of Texas and University of Minnesota, 1986.
[Koop 1951] T. C. Koopmans, ed., Activity Analysis of Production and Allocation, John Wiley and Sons, New York, 1951. 404 pp.
[Kow 1979] Robert Kowalski, Logic for Problem-Solving, North Holland, Amsterdam (1979).
[Kre 1967] G. Kreisel and J. L. Krivine, Elements of Mathematical Logic, North-Holland Publ. Co., Amsterdam, 1967.
[LanDoi 1960] A. H. Land and A. G. Doig, "An Automatic Method for Solving Discrete Programming Problems," Econometrica 28 (1960) 497-520.
[Land Pow 1973] A. Land and S. Powell, Fortran Codes for Mathematical Programming: Linear, Quadratic, and Discrete, Wiley, London, 1973.
[Law 1976] Eugene L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart and Winston, 1976.
[Leu 1985] Janny M. Y. Leung, Polyhedral Structure of Capacitated Fixed Charge Problems and a Problem in Delivery Route Planning, Ph.D. Dissertation, M.I.T., 1985.
[Lew 1980] H. R. Lewis, "Complexity Results for Classes of Computational Formulas," Journal of Computer and Systems Sciences 21 (1980) 317-353.
[Lew Pap 1981] H. R. Lewis and C. H. Papadimitriou, Elements of the Theory of Computation, Prentice-Hall, 1981.
[Lit Mur Swe Kar 1963] J. D. C. Little, K. G. Murty, D. W. Sweeney and C. Karel, "An Algorithm for the Travelling Salesman Problem," Operations Research 11 (1963) 979-987.
[Lov 1978] Donald W. Loveland, Automated Theorem Proving: A Logical Basis, North Holland, Amsterdam, 1978.
[Low 1984] James K. Lowe, Modelling With Integer Variables, Ph.D. thesis, Georgia Institute of Technology, March 1984.
[Mar 1984] R. Kipp Martin, "Generating Alternative Mixed-0/1 Linear Programming Models Using Variable Redefinition," Graduate School of Business, University of Chicago, July 1984.
[Mar 1986b] R. Kipp Martin, "A Sharp Polynomial Size Linear Programming Formulation of the Minimum Spanning Tree Problem," Graduate School of Business, University of Chicago.
[Mat 1970] Yu. V. Matijasevic, "Enumerable Sets are Diophantine," Doklady AN SSSR 191 (1970) 279-282 (English translation in Soviet Mathematics Doklady 11 (1970) 354-357).
[Men 1964] E. Mendelson, Introduction to Mathematical Logic, D. van Nostrand Company, Inc., New York, 1964.
[Mey 1974] R. R. Meyer, "On the Existence of Optimal Solutions to Integer and Mixed-Integer Programming Problems," Math. Programming 7 (1974) 223-235.
[Mey 1975] R. R. Meyer, "Integer and Mixed-Integer Programming Models: General Properties," Journal of Optimization Theory and Applications 16 (1975) 191-206.
[Mey 1976] R. R. Meyer, "Mixed-Integer Minimization Models for Piecewise-Linear Functions of a Single Variable," Discrete Mathematics 16 (1976) 163-171.
[Mey 1981] R. R. Meyer, "A Theoretical and Computational Comparison of 'Equivalent' Mixed Integer Formulations," Naval Research Logistics Quarterly 28 (1981) 115-131.
[Mey Tha Hal 1980] R. R. Meyer, M. V. Thakkar and W. P. Hallman, "Rational Mixed Integer and Polyhedral Union Minimization Models," Mathematics of Operations Research 5 (1980) 135-146.
[Mey Sto 1972] A. R. Meyer and L. J. Stockmeyer, "The Equivalence Problem for Regular Expressions with Squaring Requires Exponential Space," Conf. Record, IEEE 13th Annual Symposium on Switching and Automata Theory, New York, pp. 125-129.
[Mit 1973] G. Mitra, "Investigation of Some Branch-and-Bound Strategies for the Solution of Mixed-Integer Linear Programs," Mathematical Programming 4 (1973) 155-170.
[Moo Whi 1986] James C. Moore and Andrew Whinston, "A Model of Decision-Making with Sequential Information-Acquisition," Purdue Univ., 1986.
[Mur Sto 1985] F. H. Murphy and E. A. Stohr, "An Intelligent System for Formulating Linear Programs," €US 9s and GBA 85-40, Temple University and New York University.
[Nan Bal 1983] R. E. Nance and O. Balci, "The Objective and Requirements of Model Management," Tech Rep. CS8302-4-R, Virginia Polytechnic Institute and State University.
[Nan Mez Ove 1981] R. E. Nance, A. L. Mezaache, C. M. Overstreet, "Simulation Model Management: Resolving the Technological Gaps," Proc. 1981 Winter Simulation Conf., T. I. Oren, C. M. Delfosse, C. M. Shab, eds, pp. 173-179.
[Nev 1974] Arthur J. Nevins, "A Human Oriented Logic for Automatic Theorem-Proving," Journal of the Association for Computing Machinery 21 (1974) 606-621.
[New Sim 1972] H. A. Simon and A. Newell, Human Problem Solving, Prentice-Hall, Englewood Cliffs, New Jersey, 1972.
[Nil 1971] Nils J. Nilsson, Problem-Solving Methods in Artificial Intelligence, McGraw-Hill, New York.
[Nil 1980] Nils J. Nilsson, Principles of Artificial Intelligence, Tioga Publ. Co., Palo Alto, 1980. 427+ pgs.
[Opp 1978] Derek C. Oppen, "A $2^{2^{2^{pn}}}$ Upper Bound on the Complexity of Presburger Arithmetic," Journal of Computer and Systems Sci 18 (1978) 323-332.
[Ove Nan 1985] C. M. Overstreet and R. E. Nance, "A Specification Language to Assist in Analyses of Discrete Event Simulation Models," Communications of the ACM 28 (1985) 190-200.
[Owe 1973] G. Owen, "Cutting-planes for Programs with Disjunctive Constraints," Journal of Optimization Theory and its Applications 11 (1973) 49-55.
[Pad 1979] Manfred W. Padberg, "Covering, Packing and Knapsack Problems," Annals of Discrete Mathematics 4 (1979) 265-287.
[Pad Van Wol 1985] M. W. Padberg, T. J. Van Roy, and L. A. Wolsey, "Valid Linear Inequalities for Fixed Charge Problems," Operations Research 33 (1985) 842-861.
[Pap ] Christos H. Papadimitriou, "Games Against Nature," to appear.
[Pap Ste 1982] Christos H. Papadimitriou and Kenneth Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Englewood Cliffs, New Jersey, 1982.
[Papp ] M. Pappalardo, "A Characterization of the Convex Hull of the Union of Convex Sets," Dipartimento di Ricerca Operativa e Scienze Statistiche, Pisa.
[Pea 1984] Judea Pearl, Heuristics: Intelligent Search Strategies for Computer Problem Solving, Addison-Wesley Publ. Co., Reading, Mass, 1984.
[Phi Ros 1986] A. T. Phillips and J. B. Rosen, "Multitasking Mathematical Programming Algorithms," Tech Rep 86-10, University of Minnesota, Computer Science Department, 1986.
[Pla ] J. A. Plaisted, "Complete Problems in the First Order Predicate Calculus," manuscript.
[Poc Wol 1986] Y. Pochet and L. A. Wolsey, "Lot-Size Models with Backlogging: Strong Reformulations and Cutting Planes," CORE Discussion Paper no. 8618, Universite Catholique de Louvain.
[Por 1980] M. E. Porter, Competitive Strategy: Techniques for Analysing Industries and Competitors, The Free Press, Macmillan Publ. Co., New York, 1980. 382+ pp.
[Por 1985] Michael E. Porter, Competitive Advantage: Creating and Sustaining Superior Performance, Free Press, Collier Macmillan Publ., New York, 1985.
[Pos 1944] Emil Post, "Recursively Enumerable Sets of Positive Integers and Their Decision Problems," Bulletin of the American Mathematical Society 50 (1944) 284-316.
[Pra 1965] Dag Prawitz, Natural Deduction: A Proof-Theoretical Study, Almqvist and Wiksell, Stockholm, 1965. (Stockholm Studies in Philosophy 3).
[Pres 1929] M. Presburger, "Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt," Congr. Math des Pays Slaves, Warsaw 1 (1929) 92-101.
[Put 1965] H. Putnam, "Trial and Error Predicates and the Solution to a Problem of Mostowski," Journal of Symbolic Logic 30 (1965) 49-57.
[Rar Cho 1979] R. L. Rardin and V. Choe, "Tighter Relaxations of Fixed Charge Network Flow Problems," 1979.
[Rei 1978] R. Reiter, "Deductive Question Answering on Relational Databases," in Logic and Data Bases, edited by H. Gallaire and J. Minker, Plenum Press, 1978.
[Renew 1984] Renewing U.S. Mathematics: Critical Resources for the Future, Report of the Ad Hoc Committee on Resources for the Mathematical Sciences, Academy Press, 1984.
[Ric 1983] Elaine Rich, Artificial Intelligence, McGraw-Hill, 1983.
[Rob 1965] J. Robinson, "A Machine-Oriented Logic Based on the Resolution Principle," Journal of the ACM 12 (1965).
[Rob 1968] J. A. Robinson, "A Generalized Resolution Principle," Machine Intelligence 3, Dale and Michie, eds., Oliver and Boyd, Edinburgh, 1968, 77-93.
[Roc 1970] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, New Jersey, 1970.
[Rog 1967] H. Rogers, Jr., Theory of Recursive Functions and Effective Computability, McGraw-Hill, New York, 1967.
[Rou 1975] P. Roussel, Prolog: Manuel de Reference et d'Utilisation, Groupe d'Intelligence Artificielle, Marseille-Luminy.
[Sat Fox Gre 1985] A. Sathi, M. S. Fox, and M. Greenberg, "Representation of Activity Knowledge in Projects Management," CMU-RI-TR-85-17, Robotics Institute, Carnegie-Mellon University, 1985.
[Sch 1980] A. Schrijver, "On Cutting Planes," Annals of Discrete Math 9 (1980), pp. 291-296.
[She and She 1980] H. D. Sherali and C. M. Shetty, Optimization with Disjunctive Constraints, Springer Verlag, Lecture Notes in Economics and Mathematical Systems, Berlin and New York, 1980, 156 pp.
[Shoen 1967] J. R. Shoenfield, Mathematical Logic, Addison-Wesley, Reading, Mass, 1967.
[Sho 1977] Robert E. Shostak, "On the SUP-INF Method for Proving Presburger Arithmetic," Journal of the ACM (1977) 529-543.
[Sho 1979] Robert E. Shostak, "A Practical Decision Procedure for Arithmetic with Function Symbols," Journal of the ACM (1979) 351-360.
[Sim 1973] Herbert A. Simon, "The Structure of Ill-structured Problems," Artificial Intelligence 4 (1973) 181-201.
[Sky Val 1985] S. Skyum and L. G. Valiant, "A Complexity Theory Based on Boolean Algebra," Journal of the ACM 32 (1985) 484-502.
[Spr Wat 1986] Ralph H. Sprague, Jr., and Hugh J. Watson, Decision Support Systems: Putting Theory into Practice, Prentice-Hall, Englewood Cliffs, New Jersey, 1986.
[Sto 1977] Larry J. Stockmeyer, "The Polynomial-Time Hierarchy," Theoretical Computer Science 3 (1977) 1-22.
[Sto Mey 1973] L. J. Stockmeyer and A. R. Meyer, "Word Problems Requiring Exponential Time," Proceedings of the Fifth Annual ACM Symposium on the Theory of Computing, New York, pp. 1-9.
[Stoe and Wit 1970] J. Stoer and C. Witzgall, Convexity and Optimization in Finite Dimensions I, Springer-Verlag, 1970.
[Swa 1986] E. R. Swart, "P = NP," University of Guelph.
[Tarj 1983] Robert Endre Tarjan, Data Structures and Network Algorithms, CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, PA., 1983.
[Tar 1953] A. Tarski, A. Mostowski and R. Robinson, Undecidable Theories, Amsterdam, 1953.
[Tar 1978] Robert E. Tarjan, "Complexity of Combinatorial Algorithms," SIAM Review 20 (1978) 457-491.
[Tse 1970] G. S. Tseitin, "On the Complexity of Derivations in Propositional Calculus," in Studies in Constructive Mathematics and Mathematical Logic, Part II, edited by A. O. Slisenko, Steklov Math Inst., Leningrad, 1968. English translation by Consultants Bureau, New York, 1970, pp. 115-125.
[Tur 1936] A. M. Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem," Proceedings of the London Mathematical Society 42 (1936) 230-265.
[Tur 1950] A. M. Turing, "Computing Machinery and Intelligence," Mind 64 (1950) 433-460.
[Ull 1982] Jeffrey D. Ullman, Principles of Database Systems, 2nd ed., Computer Science Press, Rockville, Maryland, 1982.
[Val 1982] L. G. Valiant, "Reducibility by Algebraic Projections," L'Enseignement Mathematique, T. XXVIII, fasc. 3-4, 1982, pp. 253-268.
[Van H 1977] Jean van Heijenoort, From Frege to Gödel: A Sourcebook in Mathematical Logic, Harvard University Press, Cambridge, Mass, 1977.
[Van Wol 1985] Tony J. Van Roy and Lawrence A. Wolsey, "Valid Inequalities and Separation for Uncapacitated Fixed Charge Networks," Operations Research Letters 4 (1985) 105-112.
[Vei 1969] A. F. Veinott, Jr., "Minimum Concave-Cost Solution of Leontief Substitution Models of Multi-facility Inventory Systems," Operations Research 17 (1969) 262-291.
[Wag Whi 1968] H. M. Wagner and T. M. Whitin, "Dynamic Version of the Economic Lot Size Model," Management Science 5 (1968) 89-96.
[Wil 1974] H. P. Williams, "Experiments in the Formulation of Integer Programming Problems," Mathematical Programming Studies, Study 2 (1974) 180-197.
[Wil 1985] H. P. Williams, Model Building in Mathematical Programming, John Wiley and Sons (Wiley Interscience), 2nd edition, 1985.
[Win 1984a] Patrick Henry Winston, Artificial Intelligence, 2nd ed., Addison-Wesley, London, 1984.
[Win Pren 1984b] P. H. Winston and K. A. Prendergast, The AI Business: The Commercial Uses of Artificial Intelligence, MIT Press, Cambridge, Massachusetts, 1984.
[Wit 1963] C. Witzgall, "An All-Integer Programming Algorithm with Parabolic Constraints," Journ. Society for Industrial and Applied Mathematics 11 (1963) 855-870.
[Wol 1981] Lawrence A. Wolsey, "Integer Programming Duality: Price Functions and Sensitivity Analysis," Mathematical Programming 20 (1981) 173-195.
[Won 1980] Richard T. Wong, "Integer Programming Formulations of the Travelling Salesman Problem," Proceedings of 1980 IEEE International Conference on Circuits and Computers, pp. 149-152.
[Wra 1976] C. Wrathall, "Complete Sets and the Polynomial Hierarchy," Theoretical Computer Science 3 (1976) 23-33.
[You 1971] R. D. Young, "Hypercylindrically-deduced Cuts in Zero-One Integer Programs," Operations Research 19 (1971) 1393-1405.
[Zan 1969] W. I. Zangwill, "A Backlogging Model and a Multi-Echelon Model of a Dynamic Economic Lot Size Production System - A Network Approach," Management Science 16 (1969) 506-527.