Optimal Shutdown Control of Nuclear Reactors
MATHEMATICS IN SCIENCE AND ENGINEERING: A SERIES OF MONOGRAPHS AND TEXTBOOKS
Edited by Richard Bellman University of Southern California
1. TRACY Y. THOMAS. Concepts from Tensor Analysis and Differential Geometry. Second Edition. 1965
2. TRACY Y. THOMAS. Plastic Flow and Fracture in Solids. 1961
3. RUTHERFORD ARIS. The Optimal Design of Chemical Reactors: A Study in Dynamic Programming. 1961
4. JOSEPH LASALLE and SOLOMON LEFSCHETZ. Stability by Liapunov's Direct Method with Applications. 1961
5. GEORGE LEITMANN (ed.). Optimization Techniques: With Applications to Aerospace Systems. 1962
6. RICHARD BELLMAN and KENNETH L. COOKE. Differential-Difference Equations. 1963
7. FRANK A. HAIGHT. Mathematical Theories of Traffic Flow. 1963
8. F. V. ATKINSON. Discrete and Continuous Boundary Problems. 1964
9. A. JEFFREY and T. TANIUTI. Non-Linear Wave Propagation: With Applications to Physics and Magnetohydrodynamics. 1964
10. JULIUS T. TOU. Optimum Design of Digital Control Systems. 1963
11. HARLEY FLANDERS. Differential Forms: With Applications to the Physical Sciences. 1963
12. SANFORD M. ROBERTS. Dynamic Programming in Chemical Engineering and Process Control. 1964
13. SOLOMON LEFSCHETZ. Stability of Nonlinear Control Systems. 1965
14. DIMITRIS N. CHORAFAS. Systems and Simulation. 1965
15. A. A. PERVOZVANSKII. Random Processes in Nonlinear Control Systems. 1965
16. MARSHALL C. PEASE, III. Methods of Matrix Algebra. 1965
17. V. E. BENES. Mathematical Theory of Connecting Networks and Telephone Traffic. 1965
18. WILLIAM F. AMES. Nonlinear Partial Differential Equations in Engineering. 1965
19. J. ACZÉL. Lectures on Functional Equations and Their Applications. 1966
20. R. E. MURPHY. Adaptive Processes in Economic Systems. 1965
21. S. E. DREYFUS. Dynamic Programming and the Calculus of Variations. 1965
22. A. A. FEL'DBAUM. Optimal Control Systems. 1965
23. A. HALANAY. Differential Equations: Stability, Oscillations, Time Lags. 1966
24. M. NAMIK OĞUZTÖRELI. Time-Lag Control Systems. 1966
25. DAVID SWORDER. Optimal Adaptive Control Systems. 1966
26. MILTON ASH. Optimal Shutdown Control of Nuclear Reactors. 1966
27. DIMITRIS N. CHORAFAS. Control System Functions and Programming Approaches. (In Two Volumes.) 1966
28. N. P. ERUGIN. Linear Systems of Ordinary Differential Equations. 1966
29. SOLOMON MARCUS. Algebraic Linguistics; Analytical Models. 1966
In preparation

A. KAUFMANN. Graphs, Dynamic Programming, and Finite Games
MINORU URABE. Nonlinear Autonomous Oscillations
A. KAUFMANN and R. CRUON. Dynamic Programming: Sequential Scientific Management
GEORGE LEITMANN (ed.). Optimization: A Variational Approach
A. M. LIAPUNOV. Stability of Motion
Y. SAWARAGI, Y. SUNAHARA, and T. NAKAMIZO. Statistical Decision Theory in Adaptive Control Systems
MASANAO AOKI. Optimization of Stochastic Processes
F. CALOGERO. Variable Phase Approach to Potential Scattering
J. H. AHLBERG, E. N. NILSON, and J. L. WALSH. The Theory of Splines and Their Application
HAROLD J. KUSHNER. Stochastic Stability and Control
Optimal Shutdown Control of Nuclear Reactors

MILTON ASH
E. H. Plesset Associates, Inc.
Santa Monica, California
Academic Press 1966
New York and London
Copyright © 1966, by Academic Press Inc.
All rights reserved. No part of this book may be reproduced in any form, by photostat, microfilm, or any other means, without written permission from the publishers.

Academic Press Inc.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by Academic Press Inc. (London) Ltd.
Berkeley Square House, London W.1

Library of Congress Catalog Card Number: 66-22148
Printed in the United States of America
To my wife, Shulamite, and children, Avner, Miryam, Reuel

[Hebrew inscription, illegible in this copy] (Sayings of the Fathers, IV, 5)
Preface
The shutdown problem of a high-flux, thermal-neutron nuclear reactor, with respect to the kinetics of the radioactive fission products it contains, principally xenon-135, is about twenty years old, having emerged from the Manhattan Project experience of World War II. That the existence of the isotope xenon-135 in thermal reactors is to be contemplated with a healthy respect is attested to by the fact that xenon is one of the principal factors that limit the maximum operating power of present conventional thermal neutron reactors. There is another class of xenon problems which come under the heading of xenon spatial oscillations, sometimes called “flux tilt” oscillations. These are discussed only peripherally in this monograph.

Simultaneous with the development of nuclear reactor physics, and nucleonics in general, were the advances made in the areas of applied mathematics stemming from the requirements of practitioners of military operations research and analysis. Such advances also began as a product of the World War II experience and are still being made today. One of these is dynamic programming, for which a better descriptor is multistage decision theory, and which is presently coming of age in manifold applications. These include applications in most branches of engineering, economics, statistics, and physics, not to mention the needs of modern operations analysis in a fast-changing geopolitical and technological world.

In the late 1950's, the obvious interdisciplinary potential in the confluence of the above type of applied mathematics, principally dynamic programming, with certain problem areas in nuclear reactor engineering and physics was recognized. Out of this grew the application to the specific class of problems that comprise the contents of this book. Aspects of this confluence are built into the structure of the first three chapters for pedagogical reasons. That is, an exposition of elementary dynamic programming is included in the first three chapters, which also comprise the background for, and introduction to, the central problem of this work: the investigation of nuclear reactor optimal xenon shutdown programs.

This monograph is addressed to those engaged or interested in nuclear reactor physics and the engineering sciences, as well as to their counterparts in modern optimal control theory. It is felt that readers with a mathematical bent will appreciate the portions of this class of problems that are cast in a mathematical framework, while those with more physical predilections will derive much from what is presented on the physical level.

The writing of most of the manuscript, as well as the final computations, was done during my stay as an Israel Atomic Energy Commission Fellow, in the academic year 1964-1965, at the Soreq Nuclear Research Center, Yavne, Israel. I wish to thank the Israel Defense Ministry for their courtesy in allowing me the use of their large-scale digital computing facilities; Professors S. Yiftah and I. Pelah, respectively Scientific Director and Head of the Physics Department of the Soreq Nuclear Research Center, under whose auspices this work was completed; Dr. R. Bellman, University of Southern California, Los Angeles, and Dr. R. Kalaba, The RAND Corporation, Santa Monica, for their assistance, suggestions, and reading of the manuscript; my son Avner for helping to proofread the manuscript and galleys; Mr. Wayne Jones of the System Development Corporation, Santa Monica, who essentially wrote the central digital computer program DYNPROG described in Chapter 6, as well as provided vital critical discussion; and Mrs. Ilana Maik, also of the Soreq Nuclear Research Center, for her painstaking efforts in expertly typing the manuscript.

MILTON ASH
Nahal Soreq, Israel
Santa Monica, California
Contents

PREFACE

Chapter 1. Xenon in Nuclear Reactors. Dynamic Programming I
1.1. Introduction and Historical Review
1.2. Xenon Spatial Oscillations
1.3. Fission-Product Poison Production
1.4. Absorption Cross Section of Xenon
1.5. Thermal-Reactor Xenon Difficulties
1.6. Conventional Approaches to Circumvent Xenon
1.7. Dynamic Programming
1.8. Principle of Optimality and Two Examples

Chapter 2. Reactor Poisons. Dynamic Programming II
2.1. Long-Term Fission-Product Poisons
2.2. Poison Reactivity
2.3. Xenon Spatial Oscillations Revisited
2.4. Discrete Optimal Control
2.5. Averaged Control and Terminal Control
2.6. Impact on Linearized Control Theory

Chapter 3. Poison Kinetics and Xenon Shutdown. Dynamic Programming III
3.1. Reactor-Poison Kinetics Equations
3.2. Immediate Flux Shutdown
3.3. Xenon and Samarium after Protracted Shutdown
3.4. Xenon Minimum and Minimax Problem Statements
3.5. Constraints
3.6. Dynamic Programming. Absolute Value and Minimax Criteria

Chapter 4. The Maximum Principle
4.1. Introduction
4.2. Two Examples
4.3. Bang-Bang Control
4.4. Continuous and Bang-Bang Control
4.5. Optimal Orbital-Rendezvous Control
4.6. Simplified Xenon Shutdown Control
4.7. The Two-Point Boundary-Value Problem

Chapter 5. Minimum and Minimax Xenon Shutdown
5.1. Mathematical Restatement of Optimal Xenon Shutdown
5.2. Mathematical Restatement of Constraints
5.3. Dynamic-Programming Functional Equation
5.4. Derivation of Bellman's Equation
5.5. Bang-Bang Control Dilemma
5.6. Dynamic-Programming versus Maximum-Principle Optimal Shutdown Solutions

Chapter 6. Computational Aspects
6.1. Introduction and Calculation of Fk Tables
6.2. The Xenon Override Constraint
6.3. DYNPROG and COAST Input-Data Format
6.A. Appendix to Chapter 6

Chapter 7. Experimental Verification
7.1. Introduction and IRR-1 Reactor Description
7.2. Immediate Shutdown of IRR-1 to Zero Flux
7.3. Shutdown to Nonzero Power Levels
7.4. Xenon and Iodine Buildup and Decay
7.5. Experimental Results
7.A. Appendix to Chapter 7

Chapter 8. Results and Conclusions
8.1. Introduction and Xenon Unconstrained Extremals
8.2. Xenon Constrained Extremals
8.3. Interdependence of Flux and Xenon Constraints
8.4. Two Types of Optimal Shutdown Payoffs
8.5. Short Allowable Shutdown Durations
8.6. Strongly Limited Xenon Override Shutdown
8.7. Conclusions of Experimental Investigation

Chapter 9. Summary and Equivalences
9.1. Reprise
9.2. Equivalence between the Optimality Principle and the Maximum Principle
9.3. Comparison of Optimal Shutdown Criteria
9.4. Other Equivalences
9.5. Higher-Order-System Formulations

References

Bibliography
Document Glossary
Xenon Bibliography

INDEX
CHAPTER 1
Xenon in Nuclear Reactors. Dynamic Programming I
1.1. Introduction and Historical Review

In a large high-power thermal-energy† nuclear reactor that is operating at steady state, the myriad fissions from which the power output is derived produce high concentrations of various radioactive fission-product nuclei which are detrimental to normal operation. Certain fission-product nuclei, and their decay products, have tremendous absorption cross sections‡ for thermal-energy neutrons. The neutrons are the “lifeblood” of nuclear reactors, because they maintain the chain reaction by causing fissions, which in turn release more neutrons. These fission products are called poisons, because they adversely affect the maintenance of the constant neutron population required for equilibrium reactor operation. The principal fission-product poisons of interest are the isotopes samarium-149 and xenon-135, whose thermal neutron absorption cross sections are 50,000 and 3,500,000 barns,§ respectively. Such cross sections are orders of magnitude greater than those ordinarily encountered in reactor physics or engineering. As a matter of fact, the above xenon cross section is the largest known neutron absorption cross section of any nucleus on the Segrè chart (the chart of the nuclides). The reasons for the existence of such large cross sections will be discussed briefly later.

† This refers to the kinetic energy of the bulk of the reactor neutron population. Thermal energy is essentially the energy at room temperature (300° Kelvin), which corresponds to 0.025 electron volts.
‡ “Cross section” can be thought of as the microscopic interaction probability per pair of interacting particles. Thus the above absorption cross section is the probability that a fission-fragment nucleus will literally absorb an incident neutron. The noun, cross section, is derived from the fact that it has dimensions of area.
§ 1 barn = $10^{-24}$ cm$^2$; this unit of cross section originated during Manhattan Project days and, as can be surmised, the derivation hints of someone or something that cannot hit the broad side of a barn.

In an operating reactor, at least enough fuel (uranium or plutonium) must be supplied, in addition to the required critical mass, that sufficient additional neutrons are released from fission to “satiate the appetite” of the fission-product poisons existing in the steady state. However, if the reactor is disturbed from steady-state operation, for example by being shut down, inordinately large amounts of poisons accumulate, which can severely restrict the flexibility of subsequent reactor control. The theme of this book centers about the manner of coping with this difficulty. Specifically, it is desired to find the means to shut down a large high-power thermal reactor on a program that permits maintenance of the flexibility of subsequent control. Such flexibility is delineated in the later discussion.

The problem of fission-product poisons, with respect to their effect on preserving the operating integrity of a high-power thermal reactor, has existed for at least a score of years. It first became manifest in the initial operation of the large plutonium-producing thermal reactors at Hanford, Washington, during the World War II Manhattan Project. It was found that these reactors were slowly shutting themselves down for no apparent reason. Since they were intended to produce weapon-grade plutonium, as opposed to the gaseous-diffusion methods just starting to produce weapon-grade uranium (the diffusion technology had just been born), the whole atomic-bomb program was thought at that time to be in jeopardy.
As is well known, both methods for producing atom-bomb fuel were “successful,” in that uranium and plutonium were the separate constituents used in the two atom bombs dropped on Japan during the closing phase of World War II.

As can be imagined, a top-priority investigation was launched into the reasons for the improper performance of the Hanford reactors. Within two days, Enrico Fermi and J. A. Wheeler found the difficulty. It was caused mainly by the fission product xenon-135, which was poisoning the reactors by depleting them of neutrons because of its (now known) tremendous absorption cross section. The Hanford reactors were immediately supplemented with additional fuel to maintain steady-state operation, i.e., to keep the reactors critical.

Thus in presently existing high-power thermal nuclear reactors, at least enough additional fuel must be incorporated to counteract the influence of fission-product poisons to provide for steady-state reactor operation. For such operation, the amount of fuel must be increased over that required for a critical mass, often by a factor of 2. However, the fuel load is increased over critical for other important reasons, such as the simple fact that fuel is being used up. For example, the fuel charge of a modern Polaris nuclear submarine is many times greater than that called for by the criticality equations, to provide enough fuel over the anticipated military life of the submersible. Such heavy fuel loading in a submarine yields, at the same time, sufficient poison-control flexibility. That is, there is always enough reactor fuel, except when it is severely depleted just prior to recharging with a new fuel core, to override the adverse effect of the fission-product poison concentration in order to restart at will after shutdown. In the shutdown state, the xenon concentration, for example, can rise to many orders of magnitude over that at steady state. As will be seen, this is due to the kinetic imbalance of the production, decay, and absorption of the various fission products following shutdown.

However, in conventional land-based stationary high-power reactors, such heavy fuel loading is prohibitively expensive, so that only partial xenon override is possible. That is, the reactor can be restarted only during a short time following shutdown. If this time is exceeded, the reactor cannot be restarted until the excess poison has decayed away naturally, which is a matter of two days or more.
1.2. Xenon Spatial Oscillations

There is another class of reactor xenon problems. These are caused by spatial oscillations of the flux or power† throughout a large thermal reactor due principally to the space-time kinetics of xenon-135. Such oscillations tend to occur when the reactor is so heavily loaded with fuel that the power density is constant almost throughout the reactor volume. A classic case of a tendency to xenon spatial oscillations occurs in the tritium-producing reactors at the Savannah River facility. These flux-tilt oscillations are discussed only qualitatively herein. Suffice it to say at this point that the oscillation periods are measured in days or fractions thereof. Thus their effect on the reactor can be controlled adequately and easily by manual surveillance on the part of the reactor operator.

† It is easily shown that the reactor power is proportional to the neutron flux (see footnote, page 8). Hence flux and power are often used synonymously in this book.

1.3. Fission-Product Poison Production

It would be well now to discuss briefly the origin and subsequent behavior of the radioactive-isotope species that play an important role in xenon control considerations. As mentioned, fissions are the specific causative agent of reactor power production, by virtue of the kinetic energy of the resulting fission-fragment particles, which heat the fuel and consequently the reactor itself. Coolant is circulated through the reactor to extract the heat generated, which is transformed to useful power external to the reactor system. Certain of these fragment-particle nuclei transmute through beta decay (i.e., by emitting β particles, which is another term for electrons) to form the poison isotopes. Specifically, tellurium-135 is a fission fragment occurring in 5.6 per cent of fissions. It has a half-life of 2 minutes, β-decaying into iodine-135. With a half-life of about 6.7 hours, iodine-135 in turn β-decays into the troublesome isotope xenon-135. Iodine-135 decay provides the principal source of xenon in a steady-state reactor, even though xenon itself is also produced directly as a fission fragment; the latter occurs in only about 0.3 per cent of the fissions.
Xenon-135, if it does not absorb a neutron to become xenon-136, which is a harmless nonpoison, will β-decay into cesium-135 with a half-life of 9.2 hours. Cesium-135 will ultimately β-decay, with a half-life of 20,000 years, to barium-135, which is a stable isotope. The pertinent decay schemes are depicted in Fig. 1.1.

[Figure 1.1: two panels. “Xenon-Iodine Decay Scheme”: Te135 (fission product), 2 min β decay to I135, then β decay to Xe135 along two branches labelled 30 per cent and 70 per cent. “Samarium-Promethium Decay Scheme”: Nd149 (fission product), ending in a stable isotope.]
FIG. 1.1. Decay schemes for xenon-iodine and samarium-promethium reactor poisons.

For xenon kinetics purposes, once it is realized that the two principal isotopes to be considered are xenon and iodine, whose half-lives are measured in hours, two approximations usually are made. The first is that tellurium does not “exist” (its half-life is only 2 minutes) and that iodine-135 is assumed to be formed directly from fission with the same fission yield (5.6 per cent) as tellurium. The second is that although iodine-135 itself, like xenon-135, absorbs neutrons, its absorption cross section is millions of times smaller than that of xenon. The corresponding term in the xenon and iodine kinetics equations, to be developed later, is therefore ignored. In a similar manner, samarium-149 kinetics can be considered.
Samarium has the next highest thermal neutron absorption cross section of the reactor-poison fission products, but it is some 1/70th that of xenon. Analogously, neodymium-149 is a fission fragment in 1.4 per cent of fissions and β-decays, with a half-life of 1.7 hours, to promethium-149. The latter also β-decays, with a half-life of 53 hours, to the neutron-absorbing samarium-149. Samarium-149, unlike xenon-135, is a stable isotope. Also, analogous to xenon-iodine, the approximation made on realizing that the two important isotopes are samarium and promethium is that neodymium does not “exist” (its half-life is but 1.7 hours compared to 53 hours for promethium) and that promethium is assumed to be created directly from fission with the neodymium fission yield of 1.4 per cent. As before, the absorption of neutrons by promethium is small compared to that by samarium, so that the corresponding term in the samarium and promethium kinetics equations, discussed later, is ignored. However, because of the small thermal neutron absorption cross section of samarium compared to xenon, the reactor optimal-control-flexibility problem depends principally on the xenon concentration, with samarium poison being minor by comparison for high-power thermal reactors. In the main theoretical development later, the emphasis will be on xenon, with the mental reservation that samarium poison is present as well.
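The half-lives quoted in this section fix the decay constants that enter the kinetics equations developed later. As an illustration only, the following Python sketch converts the quoted half-lives to decay constants and evaluates the iodine-to-xenon chain for the special case of zero flux, where no fission production and no neutron absorption occur. The closed form used is the standard two-member decay-chain (Bateman) solution, not a formula taken from this chapter, and the initial inventories in the usage lines are arbitrary illustrative numbers.

```python
import math

# Half-lives in hours, as quoted in the text.
HALF_LIFE_H = {"I-135": 6.7, "Xe-135": 9.2, "Pm-149": 53.0}


def decay_constant(half_life_h):
    """Decay constant (1/hour) corresponding to a half-life in hours."""
    return math.log(2.0) / half_life_h


def zero_flux_chain(i0, x0, t_h):
    """Iodine and xenon inventories t_h hours after the flux drops to zero.

    With no neutrons present, only the decays I-135 -> Xe-135 -> Cs-135
    remain, so the standard two-member decay-chain solution applies.
    i0 and x0 are the inventories at t = 0 in any common unit.
    """
    lam_i = decay_constant(HALF_LIFE_H["I-135"])
    lam_x = decay_constant(HALF_LIFE_H["Xe-135"])
    iodine = i0 * math.exp(-lam_i * t_h)
    xenon = (x0 * math.exp(-lam_x * t_h)
             + lam_i * i0 / (lam_x - lam_i)
             * (math.exp(-lam_i * t_h) - math.exp(-lam_x * t_h)))
    return iodine, xenon


# Illustrative (hypothetical) starting inventories: ten units of iodine
# for every unit of xenon.
for hours in (0, 5, 10, 20, 40):
    iodine, xenon = zero_flux_chain(10.0, 1.0, hours)
    print(f"t = {hours:2d} h  iodine = {iodine:6.3f}  xenon = {xenon:6.3f}")
```

Because iodine decays faster than xenon, an iodine-rich inventory first feeds the xenon population before both die away, which is the kinetic imbalance referred to in Section 1.1.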
1.4. Absorption Cross Section of Xenon

As the reason for existence of the xenon control problem hinges on the stupendous thermal neutron absorption cross section of xenon-135, it is appropriate to discuss briefly the reason for such a neutron affinity. When nuclei contain a large number of protons and neutrons, they exhibit a complicated behavior with regard to their interactions with other subatomic particles. This applies as well to the xenon-135 nucleus and incident thermal neutrons. Without going into the quantum mechanical and nuclear physics explanation, it is found experimentally that the cross sections of such nuclei exhibit resonance properties, in that for certain incident neutron energies, the cross section can increase greatly compared to the remainder of the energy range. In the case of xenon, such a resonance occurs at thermal neutron energies. In fact, for the case of xenon and thermal-energy neutrons, the cross section value at resonance is found experimentally to be quite close to the theoretical upper bound. As mentioned before, it is the largest known thermal neutron absorption (capture) cross section. Figure 1.2 depicts the behavior of the xenon cross section as found by experiment.

[Figure 1.2: plot of the total cross section of xenon-135 against incident neutron energy in electron volts.]
FIG. 1.2. Total cross section of xenon-135. (After H. M. Sumner [32].)
1.5. Thermal-Reactor Xenon Difficulties
As will be seen from later considerations, the xenon poison concentration at steady-state reactor operation increases with increasing equilibrium power output for low-power reactors. These correspond roughly to the conventional research reactor of up to 1-megawatt power output. For high-power thermal reactors, whose output is reckoned in scores or hundreds of megawatts, the xenon concentration at steady state approaches a limiting value independent of the power.

However, when the power output of a high-power thermal reactor is reduced, and especially if this reactor is shut down, the xenon concentration quickly builds up (almost as soon as the shutdown procedure is completed) to such proportions that subsequent control flexibility to increase power or to start up is lost, unless sufficient extra fuel has been incorporated into the reactor fuel charge. That is, the xenon poison concentration is so high after shutdown that it is unfortunately necessary to wait until the xenon decays away naturally in order to restart, a matter of some 30 to 50 hours, depending on the equilibrium flux (power) prior to shutdown. After enough xenon has decayed away, the subsequent neutron population will be able to maintain the equilibrium state (critical reactor), since the tremendous xenon absorption of neutrons is then essentially absent.

On the other hand, if there is sufficient additional fuel to provide enough neutrons to “feed” the xenon at all times, whatever the xenon concentration in the post-shutdown period, then there is no difficulty. However, for high-power, or high-flux,† thermal reactors, it will be seen presently that the amount of additional fuel required is at least an order of magnitude greater than that needed if no xenon were present. In fact, this consideration, plus the amount of fuel needed even to partially cope with the burgeoning xenon poison following shutdown, limits the present-day practical maximum design power of large thermal reactors. However, it is the thesis of this work that steps can be taken to alleviate this xenon difficulty through optimal shutdown programs that reduce the post-shutdown xenon poison concentration.

To provide a concrete example of how much additional fuel is needed to override the post-shutdown xenon concentration, the following development is given. This derivation will dwell only on aspects important to the thread of this work. For more details, nuclear reactor physics and engineering texts can be consulted [1-3]. Consider a reactor at equilibrium.
Then for such a critical reactor, a parameter $k$, called the multiplication factor, must be equal to unity.

† Flux, another important reactor parameter, can be thought of as the number of neutrons per square centimeter per second. Its knowledge is important in reactor physics and engineering, since multiplying it by the appropriate macroscopic cross section of the element under consideration yields the corresponding number of interactions per cubic centimeter per second.
If $k < 1$, the reactor is subcritical, which means that, for whatever reason (e.g., insertion of control rods), the neutron population is decreasing, resulting in eventual shutdown. If, on the other hand, $k > 1$, then the population is increasing, a situation called supercritical. The fractional change in $k$ from the equilibrium state, called reactivity ($K$) and usually measured in units of $\beta$, which is another reactor parameter (the delayed-neutron fraction), is $K = (k - 1)/(\beta k)$. One unit of $\beta$ is called one dollar of reactivity, and fractions thereof are expressed in cents of reactivity. For U235, $\beta = 0.0064$.

It is first important to derive an expression for the negative reactivity corresponding to a given amount of xenon concentration in the reactor. In terms of xenon poisoning, $k$ can be expressed as a ratio of particular cross sections. Thus for a reactor not containing xenon poison, $k$ can be considered to within a constant of proportionality as given [1] by

$$k = \frac{\Sigma_{\text{fuel}}}{\Sigma_{\text{fuel}} + \Sigma_{\text{moderator}}}, \tag{1.1}$$

whereas for the same reactor containing xenon poison the analogous proportion is

$$k' = \frac{\Sigma_{\text{fuel}}}{\Sigma_{\text{fuel}} + \Sigma_{\text{moderator}} + \Sigma_{\text{poison}}}. \tag{1.2}$$
$\Sigma_{\text{fuel}}$ is the total macroscopic absorption cross section of the fuel for incident thermal neutrons. This includes absorption that produces fissions, in turn producing a new generation of neutrons, as well as parasitic absorption, which merely implies neutron loss. $\Sigma_{\text{fuel}} = N_{\text{fuel}}\,\sigma_{\text{fuel}}$, where $\sigma_{\text{fuel}}$ is the total microscopic absorption cross section (see footnote, page 1) and $N_{\text{fuel}}$ is the average number of fuel nuclei per cubic centimeter in the reactor. Correspondingly, $\Sigma_{\text{moderator}}$ is the macroscopic absorption cross section of the moderator† for incident thermal neutrons, while $\Sigma_{\text{poison}}$ is the macroscopic absorption cross section of the xenon poison concentration. As the values of $k$ and $k'$ are normally not too far from unity, an equivalent xenon poison negative reactivity can be defined as

$$K_x = \frac{k' - k}{\beta k'}. \tag{1.3}$$

† The fuel core of a thermal nuclear reactor is immersed in a moderator, often water or graphite, which slows (moderates) the fast neutrons, born of fission, down to thermal energies through collisional processes. This must be done so that the resulting thermal neutrons can cause new fissions efficiently (fuel fission cross sections are highest for thermal neutrons), thus maintaining the chain reaction. The moderator often acts as the coolant, in which case it is normally circulated through the reactor out to an external heat exchanger.
Then in terms of the previous definitions for $k$ and $k'$,

$$K_x = -\frac{mP}{\beta(1 + m)} \quad \text{(dollars)}, \tag{1.4}$$

where the poisoning factor $P = \Sigma_{\text{poison}}/\Sigma_{\text{fuel}}$ and $m = \Sigma_{\text{fuel}}/\Sigma_{\text{moderator}}$. This is the desired relationship for the reactivity due to the poison. For enriched-fuel reactors, $m \gg 1$, so that the poison reactivity is directly proportional to the poisoning factor; thus $K_x = -P/\beta$ (dollars).

It is an easy step to obtain an expression for the required increase in fuel concentration for a given amount of xenon poison reactivity in dollars. That is, for maintaining a reactor critical after xenon has accumulated, the fuel concentration $N_{\text{fuel}}$ must be increased to the value $N'_{\text{fuel}}$ to compensate for the xenon poison, so that $k = k'$. Therefore,

$$\frac{\Sigma_{\text{fuel}}}{\Sigma_{\text{fuel}} + \Sigma_{\text{moderator}}} = \frac{\Sigma'_{\text{fuel}}}{\Sigma'_{\text{fuel}} + \Sigma_{\text{moderator}} + \Sigma_{\text{poison}}} \tag{1.5}$$

must hold, where $\Sigma'_{\text{fuel}} = N'_{\text{fuel}}\,\sigma_{\text{fuel}}$. Solving (1.5) for $\Sigma'_{\text{fuel}}/\Sigma_{\text{fuel}}$ gives $1 + mP$, and from (1.4) $mP = \beta(1 + m)\,|K_x|$. The ratio of poisoned to unpoisoned fuel concentration is then given in terms of the (poison) reactivity by

$$\frac{N'_{\text{fuel}}}{N_{\text{fuel}}} = 1 + \beta(m + 1)\,|K_x|. \tag{1.6}$$

For $\beta = 0.0064$, which corresponds to U235 fuel, and a typical value of $m \approx 20$ for a water-moderated reactor using highly enriched fuel (almost pure U235),

$$\frac{N'_{\text{fuel}}}{N_{\text{fuel}}} = 1 + 0.13\,|K_x|. \tag{1.7}$$

As will be seen, the post-shutdown xenon concentration in a high-power thermal reactor can climb to a maximum of hundreds of dollars of negative reactivity. Then, from (1.7), it is seen that the fuel concentration required for xenon poison override at will, in the post-shutdown phase, can be at least an order of magnitude greater than that called for when no xenon is present.

† See note added in proof, page 20.
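The chain of relations (1.1) through (1.7) can be checked numerically. The short Python sketch below is an illustration under stated assumptions, not part of the original derivation: the function names are invented here, and the cross-section values in the usage lines are arbitrary numbers chosen so that $m = 20$ and $P = 0.05$. It evaluates the definition (1.3) directly from (1.1) and (1.2), compares it with the closed form (1.4), and then applies (1.6).

```python
def multiplication_factor(sigma_fuel, sigma_moderator, sigma_poison=0.0):
    """Eqs. (1.1) and (1.2): k (no poison) or k' (with poison), up to a constant."""
    return sigma_fuel / (sigma_fuel + sigma_moderator + sigma_poison)


def xenon_reactivity_dollars(poisoning_factor, m, beta):
    """Eq. (1.4): xenon poison reactivity in dollars (negative)."""
    return -m * poisoning_factor / (beta * (1.0 + m))


def fuel_ratio(k_x_dollars, m, beta):
    """Eq. (1.6): poisoned-to-unpoisoned fuel concentration ratio."""
    return 1.0 + beta * (m + 1.0) * abs(k_x_dollars)


BETA = 0.0064  # value quoted in the text for U235

# Hypothetical macroscopic cross sections in arbitrary common units,
# chosen so that m = 20 and P = 0.05.
sigma_fuel, sigma_moderator, sigma_poison = 20.0, 1.0, 1.0
m = sigma_fuel / sigma_moderator
P = sigma_poison / sigma_fuel

k = multiplication_factor(sigma_fuel, sigma_moderator)
k_prime = multiplication_factor(sigma_fuel, sigma_moderator, sigma_poison)

from_definition = (k_prime - k) / (BETA * k_prime)   # Eq. (1.3)
from_closed_form = xenon_reactivity_dollars(P, m, BETA)
ratio = fuel_ratio(from_closed_form, m, BETA)

print("K_x from (1.3):", from_definition)
print("K_x from (1.4):", from_closed_form)
print("fuel ratio from (1.6):", ratio)

# Raising the fuel cross section by this ratio should restore the
# unpoisoned multiplication factor, which is the content of Eq. (1.5).
print("restored k:", multiplication_factor(ratio * sigma_fuel,
                                           sigma_moderator, sigma_poison),
      "unpoisoned k:", k)
```

With the quoted $\beta$ and $m = 20$, the coefficient $\beta(m + 1)$ is 0.1344, which the text rounds to 0.13 in (1.7).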
1.6. Conventional Approaches to Circumvent Xenon

There exist a number of limited methods by which the xenon problem can be overcome, at least in principle, and sometimes in practice. Most of these methods come under the category of broad physical or chemical means. For example, circulating-fuel reactors can be constructed to circumvent the xenon difficulty. This type of reactor, in the research and development phase at present, contains a mixture of fuel and fluid (slurry) which circulates into and out of the reactor proper. Part of the circulation loop is external to the reactor, so that heat can be extracted from the slurry. Then the xenon itself can be extracted from the slurry, external to the reactor, by chemical means. With the advent of the newly discovered xenon compounds, perhaps methods can be found to extract the xenon from within the reactor in situ.

The existence of the reactivity poisoning effect of xenon was one of the important reasons that much early interest was evinced in epithermal reactors. In these reactors, means are used to keep the thermal neutron population negligible, so that the xenon poison concentration is of no consequence. In this system the neutron population peaks above thermal energies (epithermal), where the xenon cross section is negligible compared to its very large thermal-energy magnitude. However, other difficulties with this type of reactor, especially in early applications to the nuclear-submarine-reactor program, resulted in a curtailment of military interest.

One new wave of the future in nuclear energy seems to be the fast reactor. This reactor is built to function with fast (high-energy) neutrons only, because of the desirability of breeding new fuel, which is done most efficiently with fast neutrons. That is, an operating fast reactor will simultaneously produce fuel by breeding the fertile U238 isotope into new fissionable Pu239 fuel, for example.
Ultimately, the use of fast reactors will produce sufficient fissionable fuel (Pu$^{239}$ from U$^{238}$, or U$^{233}$ from Th$^{232}$) from the plentiful world supplies of U$^{238}$ and Th$^{232}$. Then the availability of cheap fuel, and therefore cheap power, will no longer be a world problem, especially for the developing nations. However, more to the point in the present context, the xenon accumulation will no longer pose a problem because of its negligible cross section for fast neutrons.
Another approach to maintain a modicum of xenon control flexibility is the use of an on-line xenon analog computer. That is, the reactor flux as a function of time is monitored and read into this computer, which then computes and displays the corresponding xenon concentration in real time using the xenon-iodine kinetics differential equations, developed later. The reactor operator can thereby monitor the computed xenon concentration at all times, especially after shutdown. For example, he would then be apprised of when it is most expeditious to restart the reactor following shutdown.

With a view toward calculating the optimal shutdown programs mentioned earlier, the thought might occur to one to set up the xenon-iodine kinetics differential equations on an analog computer, and then attempt a search for an optimal shutdown by trying various flux shutdown functions. However, such a “cut and try” method would result, at most, in obtaining relative optimization while missing the actual class of shutdown programs that yield the correct answer. This will be recognized later, as the optimal shutdown functions will be seen to be essentially piecewise-constant, hardly the type of results that conventional analog computers would yield.
1.7. Dynamic Programming

Simultaneous with the post-World War II reactor development program was the development of the mathematical discipline of dynamic programming, which grew out of post-war operations research and analysis needs. This technique, invented by the mathematician R. Bellman, is an interesting variant of multistage decision theory [4]. It has application to complicated control processes in which it is desired to optimize the control so that a predetermined criterion, or cost, functional is satisfied. By compartmentalizing the particular problem into a series of coupled subproblems, dynamic programming can render solutions to quite complicated control processes. The mathematical aspects of the coupling are manifest in a novel functional equation. This equation provides a recurrence relation or algorithm by which the problem is solved stepwise, obtaining a value for the control variable at each step. The resulting sequence of control or decision values forms the optimal control policy with respect to extremizing the predetermined criterion functional.
In the limit of an infinite number of stages, each of which is infinitesimally short, the sequence of decisions devolves to the sought-for control function, while the preceding recurrence relation becomes a novel partial differential equation called Bellman’s equation. It resembles an ordinary first-order partial differential equation with the salient difference that the maximum or minimum operation occurs in the midst of the equation, which provides the novel aspect. As will be discussed further, Bellman’s equation is essentially the Hamilton-Jacobi differential equation description of the control process. The Pontryagin maximum principle, also to be discussed later, provides a complementary description of the dynamics of the control process, which corresponds to a formulation in terms of Hamilton’s canonical equations. These descriptions are two modern approaches to control processes, or problems, which come under the aegis of the calculus of variations. Actually, the Pontryagin maximum principle is a generalization of the Weierstrass necessary condition for the existence of an extremal, to include control functions that form a closed set. In our context, this means merely control functions that are constrained. Both dynamic programming and the maximum principle provide new avenues for the solution of complicated control processes containing unwieldy constraints and difficult state-variable behavior, which the classical calculus of variations handles in a strained and artificial manner, at best.
1.8. Principle of Optimality and Two Examples

The essence of the method of dynamic programming is contained in its principle of optimality. First, the problem or process is compartmentalized into stages in a way that is as natural as possible. Then the principle of optimality is applied, from which the functional equation, unique to dynamic programming, springs. The principle of optimality is:

An optimal policy has the property that whatever the initial state and initial decisions are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

This is a statement of principle about optimal policy, which will be understood as that policy which extremizes a predetermined criterion unique to the problem at hand. Although “optimal policy” appears in its own statement of principle, it will be realized from the examples
that follow, where an optimal policy is defined, that the optimality principle does not constitute a tautology.

Example 1. An idealized breeder reactor [5]. Consider a hypothetical enriched uranium reactor. This means that the naturally occurring fertile† fuel (mostly U$^{238}$), containing 0.7 per cent of the fissionable U$^{235}$, has been enriched in the diffusion plants to upward of 90 per cent of U$^{235}$. The reactor, while consuming U$^{235}$, breeds Pu$^{239}$ from the fertile U$^{238}$ as well. For a given fuel charge (lumping U$^{235}$ and Pu$^{239}$ together) containing $x$ kilograms, assume that the reactor breeds a net amount $rx$ kilograms per fuel cycle. At the end of such a cycle, certain fuel elements which have been depleted of fuel through irradiation are rearranged and/or removed and new elements are substituted. Such complications will be neglected in that the fuel will be considered homogeneous, so that any given fraction up to $(r-1)x$, since $x$ kilograms must remain as a critical mass, can be removed at the end of each fuel cycle. It is also assumed that the reactor is able to accommodate more fuel than the critical mass $x$, if need be, through its control system.

The reactor will be operated for $N$ fuel cycles before it is dismantled for overhaul. At the end of each cycle, fuel is removed from the reactor and used according to a utility function $g(y_k)$, where $y_k$ is the amount of fuel removed at the beginning of the $k$th cycle. The problem is to determine the optimal fuel-removal policy, which is the one that maximizes the over-all utility for $N$ cycles of operation.

Let $f_N(x)$ be the maximum over-all utility obtained from the fuel removed from the breeder reactor after $N$ cycles of operation using an optimal fuel-removal policy. First consider a one-cycle operation, which will be embedded in a two-cycle operation, etc. Assuming that $g(y)$ is an increasing function, then for a single cycle, one (trivially) obtains

$$f_1(x) = \max_{0 \le y_1 \le rx} g(y_1) = g(rx) \qquad (y_1 = rx). \tag{1.8}$$
That is, after a one-cycle operation, the optimal policy is to remove all fuel to obtain the maximum one-cycle utility $g(rx)$.

† A fertile isotope is one that, upon absorption of a neutron, through a particular radioactive decay chain will ultimately produce a fissionable isotope. An immediate example is that of fertile U$^{238}$ ultimately being transmuted to fissionable Pu$^{239}$.
For a two-cycle operation, at the beginning of the second (and last) cycle, the reactor has bred a net amount of fuel $rx$. If an arbitrary amount of fuel $y_2 \le (r-1)x$ is removed from the reactor, to be used with a return of $g(y_2)$, the total two-cycle return is $g(y_2) + f_1(rx - y_2)$. Here $g(y_2)$ is considered to be the return from the first cycle of a two-cycle process, while the remainder of the fuel, $rx - y_2$, is used in an optimal manner to give the remaining one-cycle return $f_1(rx - y_2)$. Now for an optimal choice of $y_2$, from the definition of $f_N$ and the optimality principle,

$$f_2(x) = \max_{0 \le y_2 \le (r-1)x} \left[\, g(y_2) + f_1(rx - y_2) \,\right]. \tag{1.9}$$
Generalizing to a $k$-cycle operation,

$$f_k(x) = \max_{0 \le y_k \le (r-1)x} \left[\, g(y_k) + f_{k-1}(rx - y_k) \,\right] \qquad (k = 2, 3, \ldots, N). \tag{1.10}$$
That is, the optimal return from a $k$-cycle operation is obtained by maximizing the return $g(y_k)$ plus that from a $(k-1)$-cycle operation starting with an amount of fuel $rx - y_k$. The maximization is over $0 \le y_k \le (r-1)x$.
The solution consists of the optimal fuel-removal policy, the sequence $\{y_k\}$, and the sequence of optimal returns $\{f_k(x)\}$, of which $f_N(x)$ is the
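return for the $N$-cycle process. Specifically, consider the cases

(1) $g(y) = y$. Here $f_1(x) = rx$ and

$$f_k(x) = \max_{0 \le y_k \le (r-1)x} \left[\, y_k + f_{k-1}(rx - y_k) \,\right] \qquad (k = 2, 3, \ldots, N). \tag{1.12}$$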
Now, for a two-cycle process,

$$f_2(x) = \max_{0 \le y_2 \le (r-1)x} \left[\, y_2 + r(rx - y_2) \,\right] = \max_{0 \le y_2 \le (r-1)x} \left[\, -(r-1)y_2 + r^2 x \,\right]. \tag{1.13}$$
As the bracket on the right side is linear in $y_2$ with a negative slope, its maximum lies at an extremity of the $y_2$ range. Here then, $y_{2\max} = 0$, giving $f_2(x) = r^2 x$. Continuing, it is easy to obtain the optimal control policy as $y_1 = rx$, $y_2 = y_3 = \cdots = y_N = 0$, and $f_N = r^N x$. The index $N$ is the number of cycles of operation remaining, so that an $N$-cycle operation telescopes into an $(N-1)$-cycle operation, etc., until only a one-cycle operation remains. Then the optimal policy for case (1) is to remove no fuel until the very end, and then remove all.

(2) $g(y) = y^{1/2}$. From (1.11),

$$f_1(x) = (rx)^{1/2}, \qquad f_k(x) = \max_{0 \le y_k \le (r-1)x} \left[\, y_k^{1/2} + f_{k-1}(rx - y_k) \,\right] \qquad (k = 2, 3, \ldots, N). \tag{1.14}$$

Then

$$f_2(x) = \max_{0 \le y_2 \le (r-1)x} \left[\, y_2^{1/2} + \bigl(r(rx - y_2)\bigr)^{1/2} \,\right]. \tag{1.15}$$
Now the bracket is seen to be concave in $y_2$, so that its maximum lies in the interior of the $y_2$ range. Therefore, equating the derivative of the bracket with respect to $y_2$ to zero gives $y_{2\max} = rx/(1+r)$. Inserting this value of $y_2$ back into the bracket above gives $f_2 = [(1+r)rx]^{1/2}$. By induction, it is realized that the return function $f_k$ is proportional to $(rx)^{1/2}$. Then let $f_k = c_k (rx)^{1/2}$, where the coefficient $c_k$ is to be determined. Inserting this $f_k$ into (1.14) and carrying out the maximization operation yields the recurrence relations

$$y_k = \frac{rx}{1 + r c_{k-1}^2}, \qquad c_1 = 1, \qquad c_k = \left(1 + r c_{k-1}^2\right)^{1/2} \qquad (k = 2, 3, \ldots, N), \tag{1.16}$$

so that

$$f_k(x) = \left(\frac{r^k - 1}{r - 1}\, rx\right)^{1/2}, \qquad y_k = \frac{r(r-1)x}{r^k - 1} \qquad (k = 1, 2, \ldots, N). \tag{1.17}$$
For this type of utility, the optimal policy is seen to decrease quickly for small k, less rapidly for large k. Since k is to be construed as the number of cycles remaining, the optimal policy asserts that relatively small amounts of fuel are removed at the beginning of the N-cycle operation, with increasingly large amounts removed as the irradiation progresses, as depicted in Fig. 1.3.
[Figure 1.3: optimal amount of fuel removed, $y_k$, and optimal return function, $f_k$, plotted against cycles to go ($k$); time increases from the start of irradiation to the end of irradiation, when the reactor core is dismantled.]

FIG. 1.3. Optimal fuel-removal policy for utility function $y^{1/2}$.
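The recurrence for the square-root utility can be checked numerically. The sketch below is illustrative only: it evaluates the closed-form recurrence $y_k = rx/(1 + r c_{k-1}^2)$, $c_k = (1 + r c_{k-1}^2)^{1/2}$ and compares the resulting return with a direct grid search on the functional equation $f_k(x) = \max[\,y_k^{1/2} + f_{k-1}(rx - y_k)\,]$. The function names, the grid size, and the values $r = 2$, $x = 1$ are assumptions chosen so that the interior maximum stays within $0 \le y_k \le (r-1)x$.

```python
import math


def closed_form_policy(r, x, n_cycles):
    """Closed-form policy and return for g(y) = sqrt(y).

    Returns dictionaries y[k] and f[k], where k is the number of cycles to go.
    """
    c = {1: 1.0}
    y = {1: r * x}                      # one cycle to go: remove all the fuel
    for k in range(2, n_cycles + 1):
        denom = 1.0 + r * c[k - 1] ** 2
        y[k] = r * x / denom
        c[k] = math.sqrt(denom)
    f = {k: c[k] * math.sqrt(r * x) for k in c}
    return y, f


def brute_force_return(k, x, r, grid=200):
    """Evaluate f_k(x) from the functional equation by grid search over y_k."""
    if k == 1:
        return math.sqrt(r * x)         # f_1(x) = g(r x)
    best = -math.inf
    for i in range(grid + 1):
        y = (r - 1.0) * x * i / grid    # admissible range 0 <= y_k <= (r - 1) x
        best = max(best, math.sqrt(y) + brute_force_return(k - 1, r * x - y, r, grid))
    return best


if __name__ == "__main__":
    r, x, n = 2.0, 1.0, 3               # illustrative values only
    y, f = closed_form_policy(r, x, n)
    for k in range(1, n + 1):
        print(k, round(y[k], 4), round(f[k], 4), round(brute_force_return(k, x, r), 4))
```

The grid search is exponential in the number of cycles and is meant only as a check for small $N$; the closed form is what one would use in practice for this utility.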
Example 2. A simple one-dimensional reactor [5]. This example, which is closer to the context of optimal control processes in the sense of the variational calculus, is the consideration of a simple one-dimensional reactor whose kinetic differential equation of state is

$$\dot{n} = un, \tag{1.18}$$
where $n(t)$ is the state variable and $u$ is the control variable. The object is to find the optimal control function $u$ that minimizes the following criterion, or “cost,” functional

$$J = \alpha \log^2 n(T) + \beta \int_0^T u^2\, dt \tag{1.19}$$
when the system proceeds from its present non-equilibrium state to its final equilibrium state $c$, accomplishing this task in time $T$. In the functional $J$, the first term is the cost of control at the end of the duration of control $T$. The second term is a “mean-squared” cost of control as it is exerted during the given control duration $T$. First define the functional

$$F(c,T) = \min_u \left[\alpha \log^2 n(T) + \beta \int_0^T u^2\, dt\right], \qquad F(c,0) = \alpha \log^2 c. \tag{1.20}$$
$F(c,T)$, even though it is a functional of the control variable $u$, for a unique optimal control $u^*$, is a function only of the state $c$ and the control duration $T$. Now $F(c,T)$ is rewritten

$$F(c,T) = \min_u \left[\beta \int_0^a u^2\, dt + \beta \int_a^T u^2\, dt + \alpha \log^2 n(T)\right], \tag{1.21}$$

where $a$ is an intermediate time between 0 and $T$. Using the optimality principle, this can be rewritten

$$F(c,T) = \min_u \left[\beta \int_0^a u^2\, dt + F\!\left(c + \int_0^a \dot{n}(t)\, dt,\; T - a\right)\right]. \tag{1.22}$$

That is, the optimal cost $F(c,T)$ is given by the minimum, over $u$, of the cost from 0 to $a$ seconds, plus the optimal cost $F$ of a new control process (embedded in the original control process) with the system in the state $c + \int_0^a \dot{n}(t)\, dt$, but with a control duration of $T - a$ seconds. The cost of the “initial condition” $F(c,0)$, which is the terminal cost, since the time remaining to control is zero, is $F(c,0) = \alpha \log^2 c$.

The process can be thought of as being compartmentalized into two stages. The first stage lasts from 0 to $a$ seconds, during which the optimal control is exerted. This control thereby advances the system state to $c + \int_0^a \dot{n}(t)\, dt$, where now only $T - a$ seconds of control duration
remain in the control phase. From $a$ seconds on, the system is controlled optimally, incurring the cost $F\!\left(c + \int_0^a \dot{n}(t)\, dt,\; T - a\right)$. Then the optimal cost $F(c,T)$ over the whole control duration $T$ is obtained by choosing $u$ to minimize the right side of (1.22). Now let $a$ become a small time interval $\Delta$ so that, after $F$ is expanded about $(c,T)$ in the right side of (1.22), the result is

$$F(c,T) = \min_u \left[\beta u^2 \Delta + F(c,T) + \frac{\partial F}{\partial c}\, \dot{n}(\Delta)\, \Delta - \frac{\partial F}{\partial T}\, \Delta + O(\Delta^2) + \cdots \right]. \tag{1.23}$$
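Canceling $F$ on both sides of (1.23), dividing by $\Delta$, then letting $\Delta$ approach zero and substituting from the equation of state (1.18) yields Bellman's equation,

$$\frac{\partial F}{\partial T} = \min_u \left[\beta u^2 + uc\, \frac{\partial F}{\partial c}\right], \qquad F(c,0) = \alpha \log^2 c. \tag{1.24}$$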
As mentioned before, the above is essentially the Hamilton-Jacobi equation for the control process. In general, such equations are difficult to solve analytically, so that resort must be made to numerical integration on large-scale digital machinery. For this example, however, the optimal $u^*$ is found by first equating the derivative of the bracket on the right side of (1.24), with respect to $u$, to zero. This is possible because the bracket is quadratic in $u$, and not linear as it will be for the problems to be considered later. The vanishing of the derivative with respect to $u$ gives

$$u^* = -\frac{c}{2\beta}\, \frac{\partial F}{\partial c}. \tag{1.25}$$

When this is substituted back in (1.24), the result is

$$\frac{\partial F}{\partial T} = -\frac{c^2}{4\beta} \left(\frac{\partial F}{\partial c}\right)^2, \qquad F(c,0) = \alpha \log^2 c \qquad (c > 0), \tag{1.26}$$

with the solution, the optimal cost, given by

$$F(c,T) = \frac{\alpha \beta \log^2 c}{\beta + \alpha T}. \tag{1.27}$$
Knowing $F(c,T)$, the optimal control is obtained by inserting (1.27) in (1.25), to give

$$u^* = -\frac{\alpha \log c}{\beta + \alpha T}. \tag{1.28}$$

It is seen that the optimal cost decreases with the control duration $T$. This simply reflects the type of cost criterion selected. The same holds for the magnitude of the optimal control $u^*$, which is seen to be negative if the system (reactor) is supercritical, i.e., $c > 1$, and positive if subcritical, $c < 1$. This assumes that the critical or equilibrium state is $c = 1$. If the system is in the desired $c = 1$ state, $F = u^* = 0$, so that no control is exerted and thereby no control cost is incurred.
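A minimal numerical sketch follows, assuming the closed forms $F(c,T) = \alpha\beta \log^2 c/(\beta + \alpha T)$ and $u^* = -\alpha \log c/(\beta + \alpha T)$ as the solution of Bellman's equation (1.24) for this example. It checks, by central differences, that the candidate cost satisfies $\partial F/\partial T + (c^2/4\beta)(\partial F/\partial c)^2 = 0$, and it integrates the equation of state under the feedback law to confirm that the accumulated cost $J$ reproduces $F(c,T)$. All function names, step sizes, and parameter values are illustrative assumptions.

```python
import math


def optimal_cost(c, T, alpha, beta):
    """Candidate optimal cost F(c, T) = alpha*beta*log(c)^2 / (beta + alpha*T)."""
    return alpha * beta * math.log(c) ** 2 / (beta + alpha * T)


def optimal_control(c, T, alpha, beta):
    """Candidate feedback law u*(c, T) = -alpha*log(c) / (beta + alpha*T)."""
    return -alpha * math.log(c) / (beta + alpha * T)


def bellman_residual(c, T, alpha, beta, h=1e-5):
    """Central-difference residual of dF/dT + c^2 (dF/dc)^2 / (4 beta)."""
    dF_dT = (optimal_cost(c, T + h, alpha, beta) - optimal_cost(c, T - h, alpha, beta)) / (2 * h)
    dF_dc = (optimal_cost(c + h, T, alpha, beta) - optimal_cost(c - h, T, alpha, beta)) / (2 * h)
    return dF_dT + c ** 2 * dF_dc ** 2 / (4 * beta)


def simulated_cost(c, T, alpha, beta, steps=2000):
    """Apply the feedback law along dn/dt = u n and accumulate the cost J."""
    dt = T / steps
    n, running = c, 0.0
    for i in range(steps):
        u = optimal_control(n, T - i * dt, alpha, beta)   # T - t seconds remain
        running += beta * u * u * dt
        n *= math.exp(u * dt)                             # exact update for u held constant over dt
    return alpha * math.log(n) ** 2 + running


if __name__ == "__main__":
    c, T, alpha, beta = 2.0, 1.5, 1.0, 0.5                # illustrative values only
    print("Bellman residual:", bellman_residual(c, T, alpha, beta))
    print("F(c, T):         ", optimal_cost(c, T, alpha, beta))
    print("simulated J:     ", simulated_cost(c, T, alpha, beta))
```

In this example the feedback law turns out to be constant in time along the optimal trajectory, which is why the simulated cost agrees with the closed form to rounding error even with a coarse time step.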
*
*
*
NOTE ADDED IN PROOF (see page 10). Actually Eq. (1.5) should read

$$\frac{\Sigma_{\mathrm{fuel}}}{\Sigma_{\mathrm{fuel}} + \Sigma_{\mathrm{mod}}} = \frac{\Sigma'_{\mathrm{fuel}}}{\Sigma'_{\mathrm{fuel}} + \Sigma'_{\mathrm{mod}} + \Sigma_{\mathrm{poison}}}$$

since adding fuel must displace moderator atoms in this assumed homogeneous reactor model. However, the relationship corresponding to Eq. (1.6) is

$$\frac{N'_{\mathrm{fuel}}}{N_{\mathrm{fuel}}} = \frac{N'_{\mathrm{mod}}}{N_{\mathrm{mod}}} + \beta (m + 1)\, |K_{\mathrm{xe}}|.$$

For normal concentrations of fuel, the ratio of fuel atoms to moderator atoms is less than one per cent. Hence, $N'_{\mathrm{mod}}/N_{\mathrm{mod}}$ is unity to good approximation, so that Eq. (1.6) follows.
CHAPTER 2
Reactor Poisons. Dynamic Programming II
2.1. Long-Term Fission-Product Poisons

Besides the fact that fission-product poisons play a central role in the xenon control problem, as mentioned in Chapter 1, they also affect the long-term behavior of nuclear-reactor operation. It is this latter aspect that is discussed in the following, even though it has only a secular effect on the xenon kinetics.

As can be inferred from the illustrative example of Chapter 1, normal long-term reactor operation can be thought of as being compartmentalized into “fuel cycles.” At the beginning of a fuel cycle, a new charge of fuel is inserted into the reactor core. As the reactor operates, fuel is consumed as it is being irradiated by the reactor neutron flux. Fuel cycle times are normally reckoned in weeks to years. As the fuel is irradiated, in a U$^{235}$-U$^{238}$ thermal reactor, for example, the reactivity first increases, because more fissionable fuel (Pu$^{239}$ from U$^{238}$) is being produced than is being consumed. As long-term irradiation proceeds, the fissionable fuel available from the fertile fuel approaches an equilibrium concentration, since it is now also being consumed (burned) as well. As time passes, the fission-product concentration begins to increase rapidly, and as some of these are neutron absorbers, like xenon, the reactivity tends toward a negative value. The original fissionable fuel charge is being consumed as well, depleting the total amount of fuel, which further contributes to the negative reactivity trend. The agglomeration of these effects persists until the reactor begins to go subcritical, marking the end of the particular fuel
cycle. The depleted fuel elements must be removed, or else the reactor will shut itself down. After the depleted fuel elements are replaced by fresh fuel, they are usually chemically processed for removal of Pu239, other fissionable materials, and the remaining unburned fuel.

There are interesting substantive optimal control processes in the area of fuel cycle methodology. These center around finding optimal methods of fuel irradiation, such as only removing certain depleted fuel elements, rearranging others within the core, and adding new ones. It is felt that this is another field in which modern optimal control theory, and especially dynamic programming, has something to offer. Much recent progress in optimal processing in the chemical industry has been made using dynamic programming [6].

For the case of xenon, which is a consideration in short-term as well as long-term xenon kinetics, one must provide sufficient extra fuel to counteract the effect of the poison. If optimal shutdown programs are ignored in high-flux thermal reactors, the amount of additional fuel needed to restart at will following shutdown is at least an order of magnitude more than that required merely to counteract the existing xenon concentration during equilibrium power operation. This can be expensive, since for one thing, U235 fuel costs $12,000 per kilogram. From (1.7) and the xenon concentration maxima appearing in Fig. 3.1, Fig. 2.1 depicts the required increase in fuel concentration needed to “override” the xenon at its post-shutdown maximum. The xenon maxima are expressed on the abscissa in terms of the corresponding flux magnitude at equilibrium operation. It is seen from the graph in Fig. 2.1 that for high-flux thermal reactors, the required fuel concentration grows prohibitively large.

[Figure 2.1: logarithmic plot of the relative fuel concentration N'_fuel/N_fuel = 1 + 0.13 |K_xenon| against the equilibrium thermal flux, 10^12 to 10^16 neutrons/cm^2-sec; |K_xenon| is obtained from the xenon maxima of Fig. 3.2.]

FIG. 2.1. Relative fuel concentration required, over equilibrium xenon, to override post-shutdown xenon maximum.

2.2. Poison Reactivity
From Chapter 1 it should be realized that there is a direct proportion between negative reactivity and the xenon poison concentration. In the case of xenon and iodine, which both decay radioactively, it will be seen that a maximum in the xenon concentration will occur from 7 to 12 hours after immediate reactor shutdown, depending on the steady-state (equilibrium) operating power. After the reactor is shut down, xenon-135 is no longer transmuted to the harmless xenon-136, because there are virtually no neutrons remaining in the shutdown state for it to absorb. Therefore, the xenon-135 concentration increases as the major xenon source, iodine-135, decays. Xenon itself decays away as well, so that after a sufficient time following shutdown has elapsed, the rate of iodine decay into xenon equals the rate of xenon decay loss. At this time, the xenon concentration is a maximum. For subsequent times, the xenon decay loss is greater than its production from iodine, since the latter is rapidly depleting. The xenon concentration then decreases, so that after 30 to 60 hours it is reduced to a negligible value.

In the case of samarium and promethium, however, the situation following shutdown is somewhat different, because samarium is a stable isotope, so that it is not lost due to radioactive decay as is xenon. Therefore, as the promethium decays following shutdown, the samarium concentration builds up and approaches a final asymptotic concentration which depends on the equilibrium power prior to shutdown. Fortunately, even though samarium does not decay away like xenon, its accumulation does not pose nearly as severe a poison problem as xenon, because the thermal neutron absorption cross section of samarium is only 1.4 per cent that of xenon. At high steady-state operating power, it will be seen that the xenon equilibrium poison reactivity is about $8.00, while even the asymptotic post-shutdown
samarium reactivity is only $6.00. For xenon, the post-shutdown maximum reactivity can be hundreds of dollars, depending on the equilibrium power prior to shutdown.
2.3. Xenon Spatial Oscillations Revisited

There is a second mode of xenon behavior, mentioned briefly in Section 1.2, which is the spatial power oscillations that can take place in a large high-power thermal reactor. The spatial aspect refers to the fact that when the power waxes in one part (or half) of the reactor, it wanes simultaneously in the other part. Such oscillations are most pronounced in very large systems. Large implies that the principal dimensions of the reactor are large compared to the average “crowflight” distance that a neutron in such a reactor travels from birth to death. Even though these oscillations do not pertain directly to the xenon poison control problem at hand, they are sufficiently interesting for a qualitative description of their behavior to be given. Actually, for a more refined treatment of the xenon poison control problem, account must be taken of the spatial oscillations. This is not included in this book, however. For further details of spatial oscillations, references in the nuclear engineering literature should be consulted [7, 8].

Suppose that in a large thermal reactor operating in the steady state, a perturbation occurs, for whatever reason, resulting in a slight increase in power in one half of the reactor. An immediate increase in the absorption of neutrons by the xenon present ensues. This is much larger than the immediate increase in production of xenon by fission, as the fission yield of xenon is only 0.3 per cent, as discussed earlier. With a net reduction now in the xenon poison in this half of the reactor, its corresponding power will begin to rise, since fewer neutrons are being gobbled up by xenon. As the power rises, reinforcing the original perturbational power increase, more xenon is destroyed by neutron absorption, increasing the power further.
The power continues to increase until the xenon poison being produced by the augmented iodine concentration (due to the power increase, which produces more iodine with a half-life for decay to xenon of 9.58 hours) becomes manifest, and begins to counteract the power increase. The power now at its maximum, corresponding to the first quarter-cycle of oscillation, begins to
decrease as the iodine keeps decaying to xenon. This increase in xenon thereby reduces the power still further, until it approaches its equilibrium value, which marks the first half-cycle of oscillation. The xenon poison persists, now dropping the power below its equilibrium value. The increasing xenon, 180 degrees out of phase with the power, drops the power to its lowest value corresponding to a state where the bulk of its source, the iodine, is now depleted. The power is at its minimum, and three-fourths of the cycle has transpired. With the iodine depleted, little xenon is produced, so that the power begins to rise and return to its equilibrium value, thus completing the cycle. With regard to the spatial aspect of the oscillation, assuming that the total reactor-power output is kept constant, an increase of power in one half of the reactor will thereby result in a decrease of power in the other half, and vice versa. Then, not only is the xenon concentration 180 degrees out of phase with the power in its respective half of the reactor, but the power magnitudes in each of the halves of the reactor are also 180 degrees out of phase. Calculations reveal that the above cycle has a period of approximately one day. It has been shown that for high-power thermal-reactor normal operation, such oscillations are bounded. That is, the reactor is “stable.” These oscillations contribute no direct hazard except that connected with the fact that an unanticipated increase in power in one part of the reactor may cause local overheating if the heat-extraction capability (coolant) is not large enough. As mentioned earlier, the period of the oscillation is sufficiently long that the reactor operator can take remedial action, to compensate for the power increase, by standard means at his disposal.
2.4. Discrete Optimal Control

The second example given in Section 1.8 was the dynamic-programming solution to a continuous optimal control problem. This was meant in the sense of the limit of a discrete control process made up of discrete stages. In the following no such limit will be taken. For numerical computation reasons, all control processes must be discretized; i.e., the resulting algorithm which is used in the computation must be in discrete form. The following is one illustration of such an algorithm, similar to that in Section 1.8. As in that example (p. 17),
consider the identical reactor kinetics equation of state

$$\dot{n} = un. \tag{2.1}$$

In terms of the time being discretized, so that a given control epoch is given by $T = N\Delta$, the above is rewritten in difference form,

$$n_{k+1} = (1 + u_k)\, n_k, \qquad n_0 = c. \tag{2.2}$$
Let the control criterion functional be given by $C = \alpha n^2(T) + \beta \int_0^T u^2\, dt$, as before. The system is controlled so as to travel, from whatever present state it finds itself, in duration $T$ with final state $c$, so as to minimize $C$ while on trajectories given by (2.2). The discretization of the criterion is $C = \alpha n_N^2 + \beta \sum_{i=1}^{N} u_i^2$. The first term is the final cost at the termination of control; the second term is the accrued cost due to the control itself.

Now define $F_j(c)$ = cost of the control process in state $c$, with $j$ stages remaining to control, $j = N, N-1, \ldots, 1, 0$, and using an optimal policy, i.e., one that minimizes $C$. At the end of the control phase, with no more control decisions remaining to make,

$$F_0(c) = \alpha n_N^2 = \alpha c^2, \tag{2.3}$$
since for this trivial process, with no stages remaining, the system is simultaneously at the beginning and final states. Then, with $j$ stages remaining to control, the principle of optimality yields the functional equation

$$F_j(c) = \min_{u_j} \left[\beta u_j^2 + F_{j-1}\bigl(c(1 + u_j)\bigr)\right] \qquad (j = 1, 2, 3, \ldots, N). \tag{2.4}$$
The first term on the right side above is the cost of control, per se, when in the $j$th stage. The second term is the subsequent cost incurred after the $j$th-stage control is exercised, $F_{j-1}\bigl(c(1 + u_j)\bigr)$, when the system state has been advanced from $c$ to the next stage and state $c(1 + u_j)$, as given by the equation of state (2.2). Since $F_0 = \alpha c^2$, from (2.3),

$$F_1(c) = \min_{u_1} \left[\beta u_1^2 + F_0\bigl(c(1 + u_1)\bigr)\right] = \min_{u_1} \left[\beta u_1^2 + \alpha c^2 (1 + u_1)^2\right]. \tag{2.5}$$

As the bracket is quadratic in $u_1$, simple differentiation yields the optimal one-stage control function $u_1^*$, which is

$$u_1^* = -\frac{\alpha c^2}{\beta + \alpha c^2}. \tag{2.6}$$
Substitution of $u_1^*$ back into the bracket of (2.5) yields the optimal one-stage cost. It is

$$F_1(c) = \frac{\alpha \beta c^2}{\beta + \alpha c^2}. \tag{2.7}$$
The above process is then repeated for $j = 2, 3, \ldots$, until the desired $N$-stage cost, $F_N$, is found. This together with the control sequence $\{u_j\}$ constitutes the solution. As this is a tutorial example, the solution was found analytically. For substantive optimal control problems, such is a rare situation. For most control processes, including the xenon control problem, resort must be had to large-scale high-speed digital computational facilities, to compute the optimal cost functional and control policy.
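The stagewise recursion of this section can be sketched in a few lines. The code below is an illustrative assumption, not a production algorithm: it evaluates $F_j(c) = \min_u[\beta u^2 + F_{j-1}(c(1+u))]$ with $F_0(c) = \alpha c^2$ by a plain grid search over $u$ in $[-1, 0]$, and compares the one-stage result with the closed forms $u_1^* = -\alpha c^2/(\beta + \alpha c^2)$ and $F_1 = \alpha\beta c^2/(\beta + \alpha c^2)$ obtained by differentiating the bracket of (2.5). The function names, the search interval, and the grid size are hypothetical choices.

```python
def one_stage_closed_form(c, alpha, beta):
    """Closed-form one-stage optimum of beta*u^2 + alpha*c^2*(1 + u)^2."""
    denom = beta + alpha * c * c
    return -alpha * c * c / denom, alpha * beta * c * c / denom


def stage_cost(j, c, alpha, beta, grid=400):
    """Return (F_j(c), minimizing u_j) by grid search; the control is None for j == 0.

    The search is restricted to u in [-1, 0], an assumption of this sketch:
    for this cost, values outside that interval only increase both terms.
    """
    if j == 0:
        return alpha * c * c, None          # terminal cost F_0(c) = alpha c^2
    best_cost, best_u = float("inf"), None
    for i in range(grid + 1):
        u = -1.0 + i / grid
        cost = beta * u * u + stage_cost(j - 1, c * (1.0 + u), alpha, beta, grid)[0]
        if cost < best_cost:
            best_cost, best_u = cost, u
    return best_cost, best_u


if __name__ == "__main__":
    c, alpha, beta = 1.0, 1.0, 1.0           # illustrative values only
    print("closed form (u1*, F1):", one_stage_closed_form(c, alpha, beta))
    print("grid search (F1, u1): ", stage_cost(1, c, alpha, beta))
    print("grid search (F2, u2): ", stage_cost(2, c, alpha, beta))
```

The nested search grows exponentially with the number of stages, which is one way to see why tabulating $F_{j-1}$ once per stage, rather than recomputing it, is the practical form of the algorithm.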
2.5. Averaged Control and Terminal Control

There are two general kinds of control criterion functionals. These are, first, the integral type, which is to obtain $\int_0^T F(x,u)\, dt = \min$. This is equivalent to an average (running) cost of control, which accrues as the control proceeds. This is seen from the example in Section 2.4, which possesses in part such a control criterion. The second type of control is that of the terminal cost functional. This means that the criterion is a functional only of the final control state. If this is the only control “cost” considered, the implication is that the running cost of control is zero. That is, it normally does not matter how the system is transformed from the initial to the final state, since there is no running control cost. The xenon shutdown control process is essentially a terminal type of control, as will be seen.

The first type of control is exemplified by desiring to minimize
$$I(u) = \int_0^T F(x,u)\, dt, \tag{2.8}$$

where the state variable is $x$ and the control variable is $u$. The process “trajectories” are governed by the constraining differential equation,

$$\dot{x} = G(x,u), \qquad x(0) = c. \tag{2.9}$$

It is convenient to discretize this process by dividing the control interval $T$ into $N$ parts, so that $T = N\Delta$. Time is counted in the sense
that $N$ corresponds to the stages remaining to control, as depicted in Fig. 2.2. Equation (2.8), discretized, becomes

$$I_N(u) = \sum_{k=1}^{N} F(x_k, u_k)\, \Delta. \tag{2.10}$$

[Figure 2.2: time line running from control start to control finish; the stages are labeled both by elapsed time index (0, 1, 2, ..., j-1, j, j+1, ..., N) and by stages remaining (N, N-1, N-2, ..., N-j+1, N-j, N-j-1, ..., 0).]

FIG. 2.2. Optimal-control-time line.
Likewise, (2.9) has the discrete analogue

$$\frac{x_{k+1} - x_k}{\Delta} = G(x_k, u_k), \qquad x_N = c \qquad (k = 0, 1, \ldots, N). \tag{2.11}$$
Now define the optimal criterion functional as

$$f_N(c) = \min_u \sum_{k=1}^{N} F(x_k, u_k)\, \Delta. \tag{2.12}$$
Then, for a one-stage process,

$$f_1(c) = \min_{u_1} F(c, u_1)\, \Delta. \tag{2.13}$$
From the optimality principle, with $j$ stages remaining to control,

$$f_j(c) = \min_{u_j} \left[F(c, u_j)\, \Delta + f_{j-1}\bigl(c + G(c, u_j)\, \Delta\bigr)\right] \qquad (j = 2, \ldots, N). \tag{2.14}$$
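One common way to turn (2.13) and (2.14) into a computation is to tabulate each $f_j$ on a grid of states and to interpolate when the next state falls between grid points. The sketch below follows that approach under stated assumptions: the state and control grids, the linear interpolation, and the clamping of states that leave the grid are choices of this illustration, and the names `interpolate` and `solve_running_cost` are hypothetical.

```python
from bisect import bisect_right


def interpolate(grid, values, x):
    """Piecewise-linear interpolation of tabulated values; x is clamped to the grid."""
    x = min(max(x, grid[0]), grid[-1])
    i = min(max(bisect_right(grid, x) - 1, 0), len(grid) - 2)
    w = (x - grid[i]) / (grid[i + 1] - grid[i])
    return (1.0 - w) * values[i] + w * values[i + 1]


def solve_running_cost(F, G, states, controls, n_stages, delta=1.0):
    """Tabulate f_1, ..., f_N on `states` using (2.13) and (2.14).

    Returns a list `tables` with tables[j - 1][i] approximating f_j(states[i]).
    """
    # One-stage problem, Eq. (2.13).
    tables = [[min(F(c, u) * delta for u in controls) for c in states]]
    # Recurrence, Eq. (2.14): each stage reuses the table of the previous one.
    for _ in range(2, n_stages + 1):
        prev = tables[-1]
        tables.append([
            min(F(c, u) * delta + interpolate(states, prev, c + G(c, u) * delta)
                for u in controls)
            for c in states
        ])
    return tables


if __name__ == "__main__":
    # Illustrative test problem (not from the text): F = x^2 + u^2, G = u.
    states = [-2.0 + i / 20 for i in range(81)]
    controls = [-1.0 + i / 20 for i in range(41)]
    tables = solve_running_cost(lambda x, u: x * x + u * u, lambda x, u: u,
                                states, controls, n_stages=2)
    print("f_1(1) =", tables[0][60], " f_2(1) =", tables[1][60])
```

For the illustrative problem, minimizing $c^2 + u^2 + (c + u)^2$ by hand gives $u = -c/2$ and $f_2(c) = 1.5\,c^2$, which the tabulated value at $c = 1$ can be compared against. The work per stage is proportional to the number of states times the number of controls, rather than growing exponentially with $N$.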
Often, as in the functional equation (2.4) of Section 2.4, the units of time are chosen so that $\Delta = 1$. Equations (2.13) and (2.14) constitute the dynamic-programming functional equations of the problem. The latter is also a recurrence relation which can be used as an algorithm for numerical computation on digital-computing machinery. As already mentioned, the running-cost part of the example of Section 2.4 is a special case of the above, where $F(x_k, u_k) = \beta u_k^2$ and $G(x_k, u_k) = u_k x_k$, so that $x_{k+1} = (1 + u_k)\, x_k$ when $\Delta = 1$.

For the case of a terminal control criterion, it is desired to minimize a functional of the final state, that is, to obtain

$$\min_u I = \min_u J\bigl(x(T)\bigr) \tag{2.15}$$
and, for example, with the same system of trajectories,

$$\dot{x} = G(x,u), \qquad x(0) = c. \tag{2.16}$$
Discretization of this type of process can be obtained in the same way as the previous example. Then, define the optimal return functional as

$$f_N(c) = \min_u J\bigl(x(T)\bigr). \tag{2.17}$$

Again, for the trivial single-stage process, where the system finds itself simultaneously in the initial and final states, since “control” has ended,

$$f_1(c) = \min_{u_1} J(c) = J(c). \tag{2.18}$$
With . j stages remaining to control, the principle of optimality yields fi(c)
= minfj-
(c
+ G ( c ,u i ) d )
( j = 2, ...,N ) .
(2.19)
uj
It will be seen that the above type of problem is a one-dimensional version, akin to the xenon shutdown-control problem. As an interesting simple application of the above type of functional equation consider the following [4]. It is desired to choose the magnitudes of the $N$ numbers $x_1, x_2, \ldots, x_N$ so as to maximize their product
$$x_1 x_2 \cdots x_N, \tag{2.20}$$
but with the two constraints
$$x_1 + x_2 + \cdots + x_N = c, \qquad x_i \ge 0. \tag{2.21}$$
Let
$$f_N(c) = \max \prod_{i=1}^{N} x_i. \tag{2.22}$$
Trivially,
$$f_1(c) = \max_{0 \le x_1 \le c} x_1 = c. \tag{2.23}$$
From the optimality principle, the corresponding functional equation is given by
$$f_j(c) = \max_{0 \le x_j \le c} x_j\, f_{j-1}(c - x_j) \qquad (j = 2, \ldots, N). \tag{2.24}$$
Equations (2.23) and (2.24) comprise the recurrence algorithm from which the solution can be built up. The solution is easily found to be
$$f_N = (c/N)^N, \qquad x_j = c/N. \tag{2.25}$$
This incidentally provides a simple proof of the fact that the arithmetic mean of positive numbers is never less than their geometric mean. This is apparent, since (2.22) and (2.25) assert that
$$f_N = \max \prod_{i=1}^{N} x_i = (c/N)^N. \tag{2.26}$$
Then if the maximum operator is removed, the $N$th root of both sides yields the inequality
$$\left(\prod_{i=1}^{N} x_i\right)^{1/N} \le \frac{c}{N} \tag{2.27}$$
or, since $c = x_1 + x_2 + \cdots + x_N$ by (2.21),
$$\left(\prod_{i=1}^{N} x_i\right)^{1/N} \le \frac{1}{N}\sum_{i=1}^{N} x_i, \tag{2.28}$$
with equality holding, of course, when $x_1 = x_2 = \cdots = x_N$.
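The recurrence (2.23)-(2.24) can be checked numerically against the closed form (2.25). The sketch below is illustrative only: it restricts the $x_i$ to integer multiples of an assumed grid spacing $h$, works in integer units to avoid rounding, and rescales at the end. The function name and the choices $c = 1$, $N = 4$, $h = 0.01$ are hypothetical.

```python
def max_product(total_units, n_factors):
    """f_N on an integer grid: f_1[m] = m, f_j[m] = max_i i * f_{j-1}[m - i]."""
    f = list(range(total_units + 1))            # (2.23): f_1(c) = c
    for _ in range(2, n_factors + 1):           # (2.24) for j = 2, ..., N
        f = [max(i * f[m - i] for i in range(m + 1))
             for m in range(total_units + 1)]
    return f[total_units]


h = 0.01        # grid spacing (assumed); c = 100 * h = 1
N = 4
f_N = max_product(100, N) * h ** N
closed_form = (1.0 / N) ** N                    # (2.25) with c = 1
```

Because $c/N$ falls on the grid here, the tabulated maximum and the closed form agree to rounding error; on a grid that misses $c/N$ the tabulated value would fall slightly below $(c/N)^N$.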
2.6. Impact on Linearized Control Theory
In the realm of control-theory-design analysis and synthesis, the emphasis and tradition developed over the past score of years have been essentially on the perturbation approach. This amounts to a linearization of the particular system (control process) model, and it is manifest in the transfer function and all its ramifications. As can be appreciated, this approach is one that holds only "in the small," that is, for small fluctuations about equilibrium states of the particular system under study. Further, in terms of the closed-loop feedback linear theory, the specific feedback-control criterion using the transfer-function idea is essentially to minimize the difference between the output and the input reference. As is realized, actual control systems must have the capability to operate in the large. The salient state variables do change over orders of magnitude, depending on the particular control process. Then the transfer-function analysis falls short of providing the correct control regimen. However, if such processes are considered from the point of view of the calculus of variations, in that the process is controlled according to the extremization of a given criterion functional, then this is an important step toward being able to handle systems in the large. Such a schema includes the linear system, and its associated
transfer function, as a special case. However, the classical calculus of variations has limitations, and it will be seen how some of the modern approaches, including dynamic programming, provide new avenues of investigation for overcoming them. To provide a specific conceptualization of the above statements, consider a general system described by the following differential equations:
$$\begin{aligned}
\dot{x}_1 &= f_1(x_1, x_2, \ldots, x_N;\ u_1, u_2, \ldots, u_m;\ t), & x_1(0) &= c_1,\\
\dot{x}_2 &= f_2(x_1, x_2, \ldots, x_N;\ u_1, u_2, \ldots, u_m;\ t), & x_2(0) &= c_2,\\
&\ \ \vdots & &\\
\dot{x}_N &= f_N(x_1, x_2, \ldots, x_N;\ u_1, u_2, \ldots, u_m;\ t), & x_N(0) &= c_N.
\end{aligned} \tag{2.29}$$
The above set of equations can be written in a more compact (vector) notation as
$$\dot{\bar{x}} = \bar{f}(\bar{x},\bar{u}), \qquad \bar{x}(0) = \bar{c}, \tag{2.30}$$
where $\bar{x}$ is the state vector $(x_1, x_2, \ldots, x_N)$, $\bar{f}(\bar{x},\bar{u})$ is a corresponding vector function, $\bar{u}$ is the control vector $(u_1, u_2, \ldots, u_m)$, and $m < N$. The essence of feedback control is to make use of the deviation of the system state from the desired state (usually the equilibrium state) to restore the system to the desired state. Let the control vector $\bar{u}$ be given by a function of the desired state $\bar{z}$ and the actual state $\bar{x}$. That is,
$$\bar{u} = \bar{G}(\bar{x},\bar{z}). \tag{2.31}$$
To obtain the linear theory approximation, both $\bar{x}$ and $\bar{f}(\bar{x},\bar{u})$ in (2.30) are expanded about the steady state vector $\bar{z}$ through linear terms to yield the linear approximation to the system description,
$$\dot{\bar{y}} = A\bar{y} + b\bar{u}, \tag{2.32}$$
where $\bar{y}$ is the vector deviation from the steady state; i.e., $\bar{y} = \bar{x} - \bar{z}$. $A$ and $b$ are constant matrices embodying the parameters that characterize the system kinetics. If the Laplace transform of (2.32) is taken, the result is
$$\bar{y}(s) = [s - K(s)]^{-1}\, b(s)\, \bar{u}(s), \tag{2.33}$$
which yields the generic "response = transfer function × input-forcing function" relationship.
The open-loop (without feedback) transfer function is then
$$T(s) = [s - K(s)]^{-1}\, b(s), \tag{2.34}$$
where $s$ denotes the Laplace-transform variable. In the linear theory, the control variable is taken as the difference between the input and the output, as depicted in Fig. 2.3. Then from (2.31), let
$$\bar{G}(\bar{x},\bar{z}) = \bar{u} - \bar{y}. \tag{2.35}$$
FIG. 2.3. Open- and closed-loop systems. [Two block diagrams, labeled "Open-Loop System" and "Closed-Loop System".]
Inserting this expression instead of $\bar{u}$ in the right side of (2.33) gives the analogous closed-loop relationship,
$$\bar{y}(s) = \frac{T(s)}{1 + T(s)}\, \bar{u}(s), \tag{2.36}$$
where $T(s)/(1 + T(s))$ is the closed-loop transfer function. This is depicted in Fig. 2.3. Again, the above holds only in the small, since it is based on a linear approximation about the desired state. Therefore, only small deviations from the desired state can be tolerated, in order that the aforementioned expansion through linear terms remain valid. The theme of the generalized optimal control problem is to choose optimally the control vector $\bar{u}$ in (2.30), where $\bar{u}$ is given by (2.31). No loss in generality is incurred if $\bar{u}$ is made to depend on the difference
$\bar{x} - \bar{z}$; then
$$\bar{u} = \bar{G}(\bar{x} - \bar{z}), \tag{2.37}$$
where in the linear theory $\bar{G}(\bar{x} - \bar{z}) = \bar{x} - \bar{z}$, a simple difference between the actual and the desired states. To investigate the control process in the large, the problem is restated in terms of the calculus of variations. That is, for the system characterized by the state equations
$$\dot{\bar{x}} = \bar{f}(\bar{x},\bar{u}), \tag{2.38}$$
it is desired to find the optimal control vector $\bar{u}$, which is the $\bar{u}$ that renders a control criterion functional
$$I(\bar{u}) = \int_0^T L(\bar{x},\bar{u})\, dt \tag{2.39}$$
an extremum. The usual physical implication is that $I(\bar{u})$ is a cost functional, so that $\min_{\bar{u}} I$ is sought. The function $L(\bar{x},\bar{u})$ is usually chosen to emphasize a particular physical aspect of the control process. A typical control criterion functional is of the mean-squared deviation type; i.e., $I = \int_0^T (\bar{x})^2\, dt$. Then the optimal control policy $\bar{u}$ is chosen so as to minimize the (time) mean-squared deviation of the system from its desired state. $T$ is the given epoch during which control is exerted.
However, for many of the modern optimal control processes, such as those that occur in orbital guidance and space-vehicle trajectory synthesis, as well as the xenon optimal shutdown control problem, the classical calculus of variations also falls short of providing optimal control policies. It is at its best for certain kinds of $\bar{f}(\bar{x},\bar{u})$ and $L(\bar{x},\bar{u})$. For example, if $\bar{f}(\bar{x},\bar{u})$ is a linear form in $\bar{x}$ and $\bar{u}$, and $L(\bar{x},\bar{u})$ is a quadratic form in $\bar{x}$ and $\bar{u}$, and when both $\bar{f}$ and $L$ are continuous with continuous derivatives, then the classical variational calculus yields straightforward optimal control policies. For other than the above forms, and especially if $L$ and $\bar{f}$ are discontinuous functions and/or constraints exist on $\bar{x}$ and $\bar{u}$ and their derivatives, formidable difficulties arise with the classical approach. As a matter of fact, for some of the interesting modern problems, no Euler equation exists at all, because of lack of sufficiently smooth state variables. It will be seen, however, that solutions to many of these difficult but important optimal control problems can be found using some of the newer techniques, such as dynamic programming, which among other things provides a basic computational approach using large-scale digital computers.
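The closed-loop relationship of the linear theory can be illustrated on a scalar example. The sketch below is a hedged illustration, not the author's computation: it assumes a hypothetical scalar plant $\dot{y} = ay + bu$ (so that the open-loop transfer function is $T(s) = b/(s - a)$), applies the feedback law "control = input minus output" with a constant input $r$, and compares the simulated steady state with the closed-loop gain $T(0)/(1 + T(0))$.

```python
def closed_loop_steady_state(a, b, r, dt=1e-3, t_end=20.0):
    """Euler-integrate dy/dt = a*y + b*u with the feedback law u = r - y."""
    y = 0.0
    for _ in range(int(t_end / dt)):
        u = r - y                 # control = input minus output, cf. (2.35)
        y += dt * (a * y + b * u)
    return y


a, b, r = -1.0, 4.0, 1.0          # hypothetical stable scalar plant
T0 = b / (0.0 - a)                # open-loop T(s) = b / (s - a) at s = 0
predicted = T0 / (1.0 + T0) * r   # closed-loop gain, cf. (2.36)
simulated = closed_loop_steady_state(a, b, r)
```

The agreement holds here because the plant is exactly linear; for a nonlinear plant the same comparison would be valid only for small deviations about the operating point, which is the limitation discussed above.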
CHAPTER 3
Poison Kinetics and Xenon Shutdown. Dynamic Programming III
3.1. Reactor-Poison Kinetics Equations
The principal reactor poisons, xenon-135 and samarium-149, are described temporally by the following kinetics equations. These are ordinary differential equations which are essentially concentration-rate-balance relationships. They are the commonly accepted pair for xenon and iodine, as alluded to in Section 1.3. The reactor is assumed to be initially in a steady state, so that the initial conditions are given by the vanishing of the respective derivatives of the xenon and iodine concentrations:
$$\frac{dX}{dt} = \gamma_X \Sigma_f \varphi + \lambda_1 I - (\lambda_2 + \sigma\varphi)X, \qquad \dot{X}(0) \equiv \dot{X}_0 = 0, \tag{3.1}$$
$$\frac{dI}{dt} = \gamma_I \Sigma_f \varphi - \lambda_1 I, \qquad \dot{I}(0) \equiv \dot{I}_0 = 0, \tag{3.2}$$
where $X(t)$ is the xenon concentration, xenon-135 nuclei/cm³; $I(t)$ is the iodine concentration, iodine-135 nuclei/cm³; $\gamma_X$ is the relative yield of xenon produced from fission (0.003); $\gamma_I$ is the relative yield of iodine produced from fission† (0.056); $\Sigma_f$ is the macroscopic fission cross section of the fuel (U²³⁵), cm⁻¹; $\varphi$ is the thermal neutron flux, neutrons/cm²-sec; $\lambda_1$ is the decay constant of iodine, inverse of mean life ($2.9 \times 10^{-5}$ sec⁻¹); $\lambda_2$ is the decay constant of xenon, inverse of mean life ($2.1 \times 10^{-5}$ sec⁻¹); and $\sigma$ is the microscopic absorption (capture) cross section of xenon for thermal neutrons ($3.5 \times 10^{6}$ barns $= 3.5 \times 10^{-18}$ cm²).
† Iodine-135 is not a direct fission product, but a decay product of certain isotope progenitors, as explained in Section 1.3. The above yield values are for U²³⁵ fuel fission.
The first term on the right side of (3.1) is the production rate of xenon due to fission, since $\Sigma_f\varphi$ is the number of fissions/cm³-sec in the reactor and $\gamma_X$ is its fractional yield produced per fission. The next term is the production rate of xenon due to the decay rate of iodine. The third term is the loss rate of xenon due to its natural $\beta$ decay, as described in Section 1.3. The fourth (and last) term is also a loss rate due to the capture (absorption) of neutrons by xenon. $\sigma X = \Sigma_X$ is the macroscopic absorption cross section of xenon for thermal neutrons. Therefore, multiplication by the thermal neutron flux, to obtain $\sigma X \varphi$, yields the number of neutrons absorbed by, and thereby the loss rate of, xenon nuclei/cm³-sec.†
† Cf. the footnote in Section 1.5.
Similarly, for the iodine concentration rate of change, the first term of the right side of (3.2) is the production rate of iodine due to fission. The second term is the loss rate of iodine due to its decay into xenon, where it appears as a production term in the xenon equation (3.1). Iodine also absorbs neutrons, but its microscopic absorption cross section is orders of magnitude smaller than that of xenon, as discussed in Section 1.3, so the corresponding loss-rate term, $\sigma_I I \varphi$, is neglected in (3.2).
It is appropriate now to investigate the equilibrium concentration, $X_0$, of xenon-135. This is obtained easily by allowing the left side of (3.1) and (3.2) to vanish. Eliminating the equilibrium iodine concentration between the resulting two equations yields, where $\varphi_0$ is the equilibrium thermal neutron flux,
$$X_0 = \frac{(\gamma_X + \gamma_I)\,\Sigma_f\,\varphi_0}{\lambda_2 + \sigma\varphi_0}. \tag{3.3}$$
The corresponding magnitude of the equilibrium xenon poison reactivity, from its definition in (1.4) is, for an enriched-fuel reactor such that $m = \Sigma_{fuel}/\Sigma_{moderator} \gg 1$,
$$K_x = -\Sigma_{poison}/\beta\Sigma_{fuel} = -\sigma X_0/\beta\Sigma_{fuel}$$
or
$$K_x = -\frac{\sigma(\gamma_X + \gamma_I)}{\beta}\,\frac{\Sigma_f}{\Sigma_{fuel}}\,\frac{\varphi_0}{\lambda_2 + \sigma\varphi_0}. \tag{3.4}$$
For U²³⁵ fuel, $\Sigma_f/\Sigma_{fuel} = \sigma_f/(\sigma_f + \sigma_c + \sigma_s) \simeq 0.84$. Here $\sigma_{fuel} = \sigma_f + \sigma_c + \sigma_s$; in other words, the microscopic fuel cross section is the sum of the microscopic fission, capture (absorption), and scattering cross sections. The microscopic scattering cross section, $\sigma_s$, is negligible by comparison to the other cross sections and so is ignored in the above computation of $\sigma_f/\sigma_{fuel}$. Using $\beta$(U²³⁵) $= 0.0064$, and the previously given standard values for the other parameters, the resulting equilibrium xenon poison reactivity is
$$K_x = -\frac{2.7 \times 10^{-17}\,\varphi_0}{2.1 \times 10^{-5} + 3.5 \times 10^{-18}\,\varphi_0} \quad \text{(dollars)}. \tag{3.5}$$
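As a quick numerical check of (3.5), the coefficient can be rebuilt from the parameter values quoted in the text and the reactivity evaluated at a low and a high flux. This is a minimal sketch; the constant and function names are hypothetical, and the two flux values are chosen only for illustration.

```python
SIGMA_XE = 3.5e-18      # xenon absorption cross section, cm^2
LAMBDA_XE = 2.1e-5      # xenon decay constant, 1/sec
GAMMA_TOTAL = 0.003 + 0.056     # xenon plus iodine fission yields
FISSION_FRACTION = 0.84         # Sigma_f / Sigma_fuel for U-235
BETA = 0.0064                   # beta for U-235, as quoted in the text


def equilibrium_xenon_reactivity(flux):
    """K_x in dollars at equilibrium flux (neutrons/cm^2-sec), cf. (3.5)."""
    coeff = SIGMA_XE * GAMMA_TOTAL * FISSION_FRACTION / BETA
    return -coeff * flux / (LAMBDA_XE + SIGMA_XE * flux)


low = equilibrium_xenon_reactivity(1e11)    # low-power reactor
high = equilibrium_xenon_reactivity(1e16)   # far into the high-flux limit
```

The product of the quoted parameters reproduces the numerator coefficient of about $2.7 \times 10^{-17}$, and the high-flux value saturates because the flux cancels between numerator and denominator.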
Figure 3.1 depicts $K_x$ as a function of the equilibrium thermal neutron flux. Another way by which high-power and low-power thermal reactors are distinguished is whether or not the corresponding equilibrium flux term, $3.5 \times 10^{-18}\varphi_0$, is large or small compared to the other term in the denominator of (3.5). For low-power reactors, i.e., those whose equilibrium operating flux is about $10^{11}$ neutrons/cm²-sec or less, the equilibrium flux term can be neglected by comparison in the denominator of (3.5). The corresponding xenon poison reactivity will be quite small, as it is given by
$$K_x \simeq -1.3 \times 10^{-12}\,\varphi_0 \quad \text{(dollars)}, \tag{3.6}$$
so that even for a flux of $10^{11}$ neutrons/cm²-sec, $K_x \simeq -13$ cents, which is a negligible amount of poison reactivity. For high-power (high-flux) reactors, the equilibrium flux term dominates the denominator of (3.5), so that, in the limit of high flux, (3.5) gives
$$K_x \simeq -\frac{2.7 \times 10^{-17}}{3.5 \times 10^{-18}} \simeq -7.7 \quad \text{(dollars)}. \tag{3.7}$$
Thus the equilibrium xenon poison reactivity is about 8 dollars, independent of the power in thermal reactors. As will be seen, however, the post-shutdown xenon poison reactivity can build up to two orders of magnitude over the steady-state value.
FIG. 3.1. Equilibrium xenon poison reactivity versus equilibrium flux. [Vertical axis: equilibrium xenon poison reactivity, $K_x$ (dollars); horizontal axis: equilibrium neutron flux, $\varphi_0$, from $10^{11}$ to $10^{15}$.]
In a similar manner, the samarium-promethium kinetics equations are given by, where equilibrium initial conditions are understood,
$$\frac{dS}{dt} = -\sigma_S S \varphi + \lambda_p P, \tag{3.8}$$
$$\frac{dP}{dt} = \gamma_p \Sigma_f \varphi - \lambda_p P, \tag{3.9}$$
where $S(t)$ is the samarium concentration, samarium-149 nuclei/cm³; $P(t)$ is the promethium concentration, promethium-149 nuclei/cm³; $\gamma_p$ is the relative yield of promethium from fission† (0.011); $\lambda_p$ is the decay constant of promethium, inverse of mean life ($3.6 \times 10^{-6}$ sec⁻¹); and $\sigma_S$ is the microscopic absorption cross section of samarium for thermal neutrons ($5 \times 10^{4}$ barns $= 5 \times 10^{-20}$ cm²).
† Promethium-149 is not a direct fission product, but a decay product, as explained in Section 1.3.
Again, in a manner similar to xenon and iodine, the second term on the right side of (3.8) is the production rate of samarium due to promethium decay. Samarium is not directly formed as a fission product, so the corresponding fission production term is absent. Also, samarium is a stable isotope, so there is only a loss term, due to samarium capturing thermal neutrons and thereby being transmuted to samarium-150, which is harmless from our point of view. However, promethium, like iodine, can be construed to be produced from fission, to good approximation, which accounts for the first term on the right side of (3.9). The second term is the loss rate of promethium decaying into samarium.
Analogous to xenon-iodine, the equilibrium samarium concentration, $S_0$, is obtained by letting the left side of (3.8) and (3.9) vanish. Eliminating the equilibrium concentration of promethium from the resulting equations yields
$$S_0 = \frac{\gamma_p \Sigma_f}{\sigma_S}. \tag{3.10}$$
The corresponding magnitude of the equilibrium samarium poison reactivity is independent of the equilibrium flux and is given by ($\Sigma_f/\Sigma_{fuel} = 0.84$ for U²³⁵)
$$K_s = -\sigma_S S_0/\beta\Sigma_{fuel} = -\gamma_p \Sigma_f/\beta\Sigma_{fuel} = -1.43 \ \text{dollars}. \tag{3.11}$$
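For mathematical convenience the xenon-iodine and samarium-promethium equations are rewritten in dimensionless form. For xenon and iodine, respectively,
$$\dot{x} = -(w + r_0 u)\,x + \gamma_1 (w + r_0)\,y + \gamma_2 (w + r_0)\,u, \tag{3.12}$$
$$\dot{y} = u - y. \tag{3.13}$$
For samarium and promethium, respectively,
$$\dot{s} = m_0 (p - s u), \tag{3.14}$$
$$\dot{p} = u - p. \tag{3.15}$$
The equilibrium initial conditions $\dot{x}(0) = \dot{y}(0) = \dot{s}(0) = \dot{p}(0) = 0$ and $u(0) = 1$ imply $x(0) = y(0) = s(0) = p(0) = 1$, where the time scale for the normalized xenon and iodine concentration is measured in units of $\lambda_1^{-1}$ (9.58 hours), while that for samarium and promethium is measured in units of $\lambda_p^{-1}$ (77.2 hours), and
$$x = X/X_0, \quad y = I/I_0, \quad s = S/S_0, \quad p = P/P_0, \quad u = \varphi/\varphi_0,$$
$$r_0 = \sigma\varphi_0/\lambda_1 \simeq 1.2 \times 10^{-13}\,\varphi_0, \quad m_0 = \sigma_S\varphi_0/\lambda_p \simeq 0.14 \times 10^{-13}\,\varphi_0, \quad w = \lambda_2/\lambda_1 \simeq 21/29,$$
$$\gamma_1 = \gamma_I/(\gamma_X + \gamma_I), \qquad \gamma_2 = \gamma_X/(\gamma_X + \gamma_I).$$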
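The dimensionless kinetics can be written as two small rate functions and checked at equilibrium. This is an illustrative sketch only. The xenon-iodine right sides below are an assumed general-flux form, chosen so that they reduce to (3.16) for $u = 0$ and to (3.26)-(3.27) for $u = 1$; the samarium-promethium pair is (3.14)-(3.15). The function names and the flux value are hypothetical.

```python
W = 21.0 / 29.0                 # lambda_2 / lambda_1
GAMMA_1 = 0.056 / (0.003 + 0.056)
GAMMA_2 = 0.003 / (0.003 + 0.056)


def xenon_iodine_rates(x, y, u, r0):
    """Right sides of the assumed dimensionless xenon-iodine pair."""
    dx = -(W + r0 * u) * x + GAMMA_1 * (W + r0) * y + GAMMA_2 * (W + r0) * u
    dy = u - y
    return dx, dy


def samarium_promethium_rates(s, p, u, m0):
    """Right sides of (3.14) and (3.15)."""
    return m0 * (p - s * u), u - p


r0 = 1.2e-13 * 1e14             # equilibrium flux of 1e14 (hypothetical)
m0 = 0.14e-13 * 1e14
dx0, dy0 = xenon_iodine_rates(1.0, 1.0, 1.0, r0)
ds0, dp0 = samarium_promethium_rates(1.0, 1.0, 1.0, m0)
```

All four rates vanish at $x = y = s = p = u = 1$, which is the statement that the normalized equilibrium concentrations equal unity at full power.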
3.2. Immediate Flux Shutdown
In Section 1.5 the xenon concentration (poison reactivity) upon immediate reactor flux shutdown was said to rise to a maximum which is orders of magnitude greater than its value at pre-shutdown equilibrium operation because of the xenon-iodine kinetics imbalance. The magnitude of the xenon maximum, or xenon peak, will be seen to depend on the equilibrium flux (power). Similarly, the samarium concentration increases greatly over its equilibrium concentration, approaching, however, an asymptote instead of a maximum. Immediate flux shutdown implies an initial negative-step-function flux behavior. That is, $\varphi(0^-) = \varphi_0$, while $\varphi(0^+) = 0$. Then for $t \ge 0^+$, the xenon and iodine equations (3.12) and (3.13), respectively, are rewritten for immediate shutdown to zero† flux from the steady-state value. They are
$$\dot{x} = -wx + \gamma_1 (w + r_0)\,y, \quad x(0) = 1; \qquad \dot{y} = -y, \quad y(0) = 1. \tag{3.16}$$
† Zero flux is to be construed as a shutdown flux of to $10^{-9}$ times the equilibrium value, as the flux can never be brought to zero in an operating reactor, owing to the many and diverse neutron sources therein.
In terms of the above initial conditions at equilibrium power these equations are easily integrated, to yield
$$x(t) = \left[1 + \frac{\gamma_1 (w + r_0)}{1 - w}\right] e^{-wt} - \frac{\gamma_1 (w + r_0)}{1 - w}\, e^{-t}. \tag{3.17}$$
The corresponding poison reactivity is calculated from (1.4), using the dimensionless variables defined in Section 3.1. In terms of the dimensionless xenon concentration, and for an enriched U²³⁵ fuel reactor ($\Sigma_{fuel}/\Sigma_{moderator} \gg 1$, $\Sigma_f/\Sigma_{fuel} = 0.84$), the xenon poison reactivity is
$$K_x = -\frac{(\gamma_X + \gamma_I)\,\Sigma_f}{\beta\Sigma_{fuel}}\,\frac{r_0 x}{w + r_0} = -7.7\,\frac{r_0 x}{w + r_0} \quad \text{(dollars)}. \tag{3.18}$$
Figure 3.2 depicts a one-parameter ($r_0$) family of the xenon poison reactivity curves for various initial equilibrium flux levels. It is seen from Fig. 3.2 that the xenon maximum, measured in tens and hundreds of dollars of negative reactivity, occurs from 7 to 12 hours after shutdown, depending on the equilibrium flux. An expression for the time at which the xenon maximum occurs is obtained simply by equating the derivative of the xenon concentration, $\dot{x}$, from (3.17), to zero. It is
$$t_{\max} = \frac{1}{1 - w}\,\ln\left[w\left(1 + \frac{1 - w}{\gamma_1 (w + r_0)}\right)\right]^{-1} \quad \text{units of iodine mean life, 9.58 h.} \tag{3.19}$$
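The closed forms (3.17)-(3.19) are easy to evaluate directly. The following is a minimal sketch under the parameter values quoted in the text; the function names are hypothetical, and the equilibrium flux of $5 \times 10^{14}$ is used only as an example.

```python
import math

W = 21.0 / 29.0                     # lambda_2 / lambda_1
GAMMA_1 = 0.056 / (0.003 + 0.056)
IODINE_MEAN_LIFE_H = 9.58


def xenon_after_shutdown(t, r0):
    """Normalized xenon concentration (3.17); t in iodine mean lives."""
    a = GAMMA_1 * (W + r0) / (1.0 - W)
    return (1.0 + a) * math.exp(-W * t) - a * math.exp(-t)


def time_of_xenon_peak(r0):
    """Time of the xenon maximum (3.19), in iodine mean lives."""
    a = GAMMA_1 * (W + r0) / (1.0 - W)
    return -math.log(W * (1.0 + 1.0 / a)) / (1.0 - W)


def xenon_reactivity(t, r0):
    """Xenon poison reactivity in dollars, (3.18)."""
    return -7.7 * r0 * xenon_after_shutdown(t, r0) / (W + r0)


r0 = 1.2e-13 * 5e14                 # equilibrium flux 5e14 neutrons/cm^2-sec
t_peak = time_of_xenon_peak(r0)
peak_hours = t_peak * IODINE_MEAN_LIFE_H
peak_dollars = xenon_reactivity(t_peak, r0)
```

Evaluating `xenon_after_shutdown` slightly before and after `t_peak` gives smaller values on both sides, which is a simple way to confirm that (3.19) locates the maximum of (3.17).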
Similarly, the samarium concentration following shutdown is governed by the differential equations (3.14) and (3.15), in which the normalized flux, $u$, is set to zero. The resulting equations are, then,
$$\dot{s} = m_0\, p, \qquad s(0) = 1, \tag{3.20}$$
$$\dot{p} = -p, \qquad p(0) = 1, \tag{3.21}$$
where the time is measured in units of $\lambda_p^{-1}$ (77.2 hours), as opposed to the xenon-iodine equations, where time is measured in units of $\lambda_1^{-1}$ (9.58 hours); the integrals of the above equations yield the samarium concentration,
$$s(t) = 1 + m_0 (1 - e^{-t}). \tag{3.22}$$
The corresponding samarium poison reactivity for an enriched U²³⁵ reactor is obtained, from the discussion in Section 1.5, as
$$K_s = -(\gamma_p \Sigma_f/\beta\Sigma_{fuel})\, s = -1.43\, s \quad \text{(dollars)}. \tag{3.23}$$
With $s$ inserted from (3.22),
$$K_s = -1.43\,\bigl(1 + m_0 (1 - e^{-t})\bigr) \quad \text{(dollars)}. \tag{3.24}$$
At initial equilibrium, $K_s(0) = -1.43$ dollars, independent of the equilibrium flux. A long time after shutdown to zero flux, the samarium reactivity reaches its limiting value of
$$K_s(\infty) = -1.43\,(1 + m_0) = -1.43\,(1 + 0.14 \times 10^{-13}\,\varphi_0) \quad \text{(dollars)}. \tag{3.25}$$
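A short numerical reading of (3.24) and (3.25) follows. It is only a sketch: the function name is hypothetical and the flux value is an example.

```python
import math


def samarium_reactivity(t, flux):
    """K_s in dollars, (3.24); t in promethium mean lives (77.2 h)."""
    m0 = 0.14e-13 * flux
    return -1.43 * (1.0 + m0 * (1.0 - math.exp(-t)))


flux = 5e14
k_start = samarium_reactivity(0.0, flux)
k_limit = -1.43 * (1.0 + 0.14e-13 * flux)       # (3.25)
k_late = samarium_reactivity(50.0, flux)
```

Unlike xenon, the samarium reactivity has no peak: it moves monotonically from its equilibrium value toward the limit (3.25), since samarium-149 is stable and is removed only by neutron capture.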
Even if the equilibrium operating flux $\varphi_0 = 5 \times 10^{14}$ neutrons/cm²-sec, which is the generally considered maximum engineering design limit for thermal reactors, $K_s(\infty) \simeq -11.44$ dollars, following immediate shutdown to zero flux. This is small compared to the maximum reactivity, following immediate shutdown, of $-150$ dollars due to xenon for the same equilibrium flux (Fig. 3.2).
FIG. 3.2. Reactivity due to xenon poison after complete step-function flux shutdown, for various equilibrium flux levels. [Horizontal axis: post-shutdown time, in iodine mean lifetimes (9.58 hours).]
3.3. Xenon and Samarium after Protracted Shutdown
After the reactor has been shut down for a long time, the xenon and iodine concentrations have diminished to negligible levels, which can be taken as zero in the same sense as that of the shutdown flux as discussed in the footnote on page 39. As seen earlier, the samarium concentration has reached its limiting value, as given by (3.25). If the reactor is now restarted to its previous equilibrium operating power, the xenon concentration will also build up to its previous equilibrium value. This is seen by letting $u = 1$ in (3.12) and (3.13) with zero initial conditions, to obtain
$$\dot{x} = -(w + r_0)\,x + \gamma_1 (w + r_0)\,y + \gamma_2 (w + r_0), \qquad x(0) = 0, \tag{3.26}$$
$$\dot{y} = 1 - y, \qquad y(0) = 0. \tag{3.27}$$
The integral gives the xenon concentration after startup from "zero" concentration, which is (units of iodine mean life, 9.58 hours)
$$x(t) = \gamma_1\left[1 - \frac{(w + r_0)\,e^{-t} - \exp[-(w + r_0)t]}{w + r_0 - 1}\right] + \gamma_2\,\{1 - \exp[-(w + r_0)t]\}. \tag{3.28}$$
This is sketched in Fig. 3.3, where it is combined with the appropriate xenon concentration shutdown curve taken from Fig. 3.2.
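The restart behavior can be checked by integrating (3.26)-(3.27) numerically and comparing with a closed-form solution. The sketch below is hedged: the closed form coded here was obtained by solving (3.26)-(3.27) directly and may be arranged differently from the printed (3.28); the forward-Euler integrator, the step size, and the flux value are assumptions made for illustration.

```python
import math

W = 21.0 / 29.0
GAMMA_1 = 0.056 / (0.003 + 0.056)
GAMMA_2 = 0.003 / (0.003 + 0.056)


def xenon_after_restart(t, r0):
    """Closed-form solution of (3.26)-(3.27) with x(0) = y(0) = 0."""
    k = W + r0
    iodine_part = 1.0 - (k * math.exp(-t) - math.exp(-k * t)) / (k - 1.0)
    return GAMMA_1 * iodine_part + GAMMA_2 * (1.0 - math.exp(-k * t))


def integrate_restart(t_end, r0, dt=1e-4):
    """Forward-Euler integration of (3.26)-(3.27), for comparison only."""
    x = y = 0.0
    k = W + r0
    for _ in range(int(round(t_end / dt))):
        dx = -k * x + GAMMA_1 * k * y + GAMMA_2 * k
        dy = 1.0 - y
        x, y = x + dt * dx, y + dt * dy
    return x


r0 = 1.2e-13 * 1e13                 # hypothetical equilibrium flux of 1e13
exact = xenon_after_restart(2.0, r0)
numeric = integrate_restart(2.0, r0)
```

The closed form starts at zero and tends to unity, so the xenon concentration returns to its previous equilibrium value, as stated above.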
FIG. 3.3. Xenon and samarium concentration buildup after long shutdown. [Sketch of the xenon and samarium concentrations versus time: immediate shutdown, a shutdown period of many days, then immediate startup; the samarium level $1 + m_0$ is marked.]
Similarly for samarium, except that since it is a stable isotope, its asymptotic concentration prior to restarting the reactor after a long time is obtained by integrating (3.14) and (3.15) with $s(0) = p(0) = 1$ as initial conditions and $u = 0$. This gives (units of promethium mean life, 77.2 hours)
$$s(t) = 1 + m_0 (1 - e^{-t}), \qquad p = e^{-t}, \tag{3.29}$$
so that $s_{asym} = 1 + m_0$, and $p_{asym} = 0$. Upon restarting after a long time, $u = 1$ and, from (3.14) and (3.15),
$$\dot{s} = m_0 (p - s), \qquad s(0) = s_{asym}, \tag{3.30}$$
$$\dot{p} = 1 - p, \qquad p(0) = p_{asym}, \tag{3.31}$$
giving
$$s(t) = 1 + \frac{m_0}{1 - m_0}\left[e^{-t} - m_0 \exp(-m_0 t)\right]. \tag{3.32}$$
This is sketched in Fig. 3.3 as well.
3.4. Xenon Minimum and Minimax Problem Statements
With the background of the previous discussions in the areas of nuclear-fission-product poison kinetics and dynamic programming, the optimal xenon shutdown class of problems is here stated in a literal fashion. Their mathematical restatement will come later.
The xenon concentration quickly rises, following an abrupt flux shutdown from equilibrium conditions to very low, or "zero," power, possessing a maximum of considerable magnitude, especially for high equilibrium power (flux) thermal reactors. The corresponding xenon poison maximum reactivities are measured in hundreds of dollars. As discussed in Section 1.5, this necessitates additional fuel loading of one and possibly two orders of magnitude greater than that required for equilibrium xenon operation, to allow for override of the influence of the xenon poison at will in the post-shutdown phase. If such large amounts of fuel are not available in addition to the normal fuel charge, the post-shutdown control flexibility is severely curtailed, because of the ease with which the xenon poison reactivity can overcome the available reactor positive reactivity. If this happens, the xenon poison will keep the reactor in a shutdown state for a protracted length of time, some 30 to 60 hours, until the xenon decays away naturally, as discussed earlier. For example, in the Materials Testing Reactor (MTR) at the National Reactor Testing Station, Arco, Idaho, which is a large high-power thermal reactor, only enough additional fuel is available to override xenon for the first 30 minutes, following an immediate (abrupt) flux shutdown.
To improve the post-shutdown control flexibility where reasonable, but still limited, amounts of additional fuel for xenon override are available, perhaps optimal flux shutdown programs can be found, in the sense of the following problems.
Problem (a). With certain constraints on the flux magnitude, and on the allowable xenon concentration given in Section 3.5, and for a given allowable time $T$ in which to shut the reactor down, what is the flux shutdown program that minimizes the xenon maximum wherever it occurs in the post-shutdown period (later than $T$)? This is the xenon minimax problem. The time at which the reactor is restarted in the post-shutdown period is assumed irrelevant, but is considered in the formulation of the second problem.
Problem (b). For like constraints on the flux magnitude and the xenon concentration, and with an allowable shutdown time $T$, what is the flux shutdown program that minimizes the xenon concentration at a given time $T_0 > T$ in the post-shutdown epoch? $T_0$ is, of course, the time at which it is desired to restart the reactor. This is the xenon minimization problem. It is seen that (a) is a special case of (b) where $T_0$ coincides with the occurrence of the xenon peak.
3.5. Constraints
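Before the constraints are taken up, the two criteria of problems (a) and (b) can be made concrete by evaluating a given flux shutdown program. The sketch below is illustrative only and does not solve either problem: it assumes a piecewise-constant normalized flux over the shutdown interval, zero flux afterwards, the general-flux dimensionless xenon-iodine equations of Section 3.1 in the form that reduces to (3.16) and (3.26)-(3.27), and a forward-Euler integrator. The function name, the two example programs, the horizon, and the step size are hypothetical.

```python
W = 21.0 / 29.0
GAMMA_1 = 0.056 / (0.003 + 0.056)
GAMMA_2 = 0.003 / (0.003 + 0.056)


def evaluate_program(program, shutdown_time, r0, restart_time,
                     horizon=6.0, dt=1e-3):
    """Return (post-shutdown xenon maximum, xenon at restart_time).

    program: list of normalized flux levels u, held constant on equal
    subintervals of [0, shutdown_time]; the flux is zero afterwards.
    Times are in iodine mean lives; integration is forward Euler.
    """
    x = y = 1.0                      # equilibrium initial conditions
    peak = None
    x_at_restart = None
    steps = int(round(horizon / dt))
    for n in range(steps + 1):
        t = n * dt
        if t >= shutdown_time:
            peak = x if peak is None else max(peak, x)
        if x_at_restart is None and t >= restart_time:
            x_at_restart = x
        if t < shutdown_time:
            index = min(int(t / shutdown_time * len(program)),
                        len(program) - 1)
            u = program[index]
        else:
            u = 0.0
        dx = (-(W + r0 * u) * x + GAMMA_1 * (W + r0) * y
              + GAMMA_2 * (W + r0) * u)
        dy = u - y
        x, y = x + dt * dx, y + dt * dy
    return peak, x_at_restart


r0 = 1.2e-13 * 1e14                  # hypothetical equilibrium flux of 1e14
abrupt = evaluate_program([0.0], 0.0, r0, restart_time=1.0)
stepped = evaluate_program([1.0, 0.0, 1.0, 0.0], 0.5, r0, restart_time=1.0)
```

The first returned number is the quantity minimized in problem (a) and the second is the quantity minimized in problem (b) with $T_0$ equal to `restart_time`; an optimization method would search over admissible programs, which this sketch does not attempt.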
As is realized, the problems just formulated are couched in the language of the calculus of variations. In this sense, it is well known that the type of constraints imposed is the principal influence on the behavior of the optimal flux control programs. Further, the constraints often spell the difference between whether or not the problem is tractable, using the classical variational calculus. This is because of the severe limitations imposed on the smoothness of the state and control variables by the constraints. Such smoothness, or continuity, is a major requirement for solutions of these kinds of control processes, using the classical approach.
On the other hand, the computational algorithm obtained from the dynamic programming formulation, as employed on large-scale high-speed digital computers, actually thrives on constraints. This is due to the fact that the constraints actually limit the "search space" of the problem, as this algorithm can be construed as the instrument of a dynamic programming search-theoretic method to obtain optimal control policies. That is, the amount of fast (core) memory, and therefore the computation time, can be reduced greatly by the limits imposed on the state variable space by the constraints. The principal constraints to be considered are those on (1) flux magnitude, (2) inverse period, and (3) xenon concentration. These are considered in turn.
(1) Flux magnitude. The constraints on the magnitude of the normalized flux, $u$, are that it possesses a minimum of zero (see footnote, p. 39) and a maximum $M$. Usually the maximum corresponds to equilibrium operating power. However, as will be seen later, because of the xenon constraint, there are two modes of obtaining a given optimal-flux shutdown-control policy. One is to conform strictly to the flux constraint maximum, resulting in one form of optimal flux shutdown. The second is to ignore the flux maximum constraint temporarily, so that the system will proceed to a given flux magnitude somewhat higher than equilibrium operating power over a portion of the optimal shutdown program, and then fall off with a negative exponential behavior back to equilibrium power, as will be seen. From the mathematical point of view, it is shown [9] that the flux magnitude must be constrained at least over part of the optimal shutdown program, resulting in a bang-bang form of control; otherwise problems (a) and (b) of Section 3.4 are not properly posed.
(2) Inverse Period. In an operating reactor, the logarithmic time derivative of the flux, called the inverse period in nuclear-reactor parlance, is also constrained. It has a positive upper bound given by the particular reactor-plant regulations governing safe standard operating procedure. The inverse period upper bound must ensure that the reactor does not vary temporally on what is termed a dangerously "fast period."† For example, if the reactor were proceeding from a startup state on a positive inverse period of large magnitude, it could pass through its desired equilibrium state (critical) too quickly, becoming supercritical‡ and possibly causing an accident before safety mechanisms and/or the human operator could intervene. A lower (negative) bound on the inverse period is normally provided in an operating reactor by the inertia of the control rods and associated mechanisms; occurring as well is the phenomenon of delayed neutrons, which also increase the sluggishness of a decreasing (subcritical) neutron population.
† The term period means the inverse of the logarithmic time derivative of the flux; inverse period means the direct logarithmic time derivative of the flux.
‡ Supercritical implies that the reactor multiplication factor $k > 1$ results in increasing neutron population which could herald a reactor accident; cf. Section 1.5.
(3) Xenon Concentration. To account for the most important fact, and the practical raison d'être of the search for optimal xenon shutdown programs, namely that there will probably not be a complete xenon override capability in the reactor, i.e., the ability to restart at will following abrupt flux shutdown, there must be a constraint on the allowable xenon concentration. If the reactor possesses complete xenon override capability, optimal shutdown is trivial. In other words, the reactor can be restarted at will following shutdown, so that no optimal shutdown programs are necessary unless it is desired to minimize the post-shutdown xenon concentration for some other reason besides attempting to acquire better post-shutdown control flexibility. The constraint on the xenon concentration corresponds to the given amount of positive reactivity available for partial xenon override, which is provided by additional fuel, as discussed earlier.
In the discussion to follow, optimal flux shutdown programs for problems (a) and (b) will be obtained using constraints (1) and (3) only, that is, constraints on the flux and xenon concentration magnitudes only. Solutions that specifically include constraint (2) on the inverse period as well are beyond the scope of this work. This will be discussed with regard to the amount and type of future elaboration of optimal xenon shutdown control. The approach here is that the flux and xenon constraints delineate an idealization of reality sufficient for present needs. The corresponding shutdown programs obtained will be suboptimal because they do not take account of inverse period constraints that exist in an actual operating reactor. The idealization turns
3.61
47
Dynamic Programming. Absolute Value and Minimax Criteria
out to be quite a good approximation to the actual situation because of the time scales involved. The allowable shutdown duration T, time to xenon maximum, post-shutdown epoch, etc., are measured in hours, while the allowable inverse period is measured in minutes, so that flux-level changes can be approximated in a discontinous manner by flux steps or pulses required by the optimal shutdown programs (as derived later). 3.6. Dynamic Programming. Absolute Value and Minimax Criteria One of the optimal xenon control class of problems, which is the central control process investigated in this book, is to find the shutdown control program to minimize the maximum post-shutdown xenon concentration at whatever time it occurs. Before discussing the formulation of that problem, it is important to study some simple illustrations using a “minimax” criterion functional. Such problems are quite difficult to handle in a general way using the classical calculus of variations. However, at the very least using dynamic programming, a computational algorithm can be derived in a straightforward manner from which solutions to such problems can be found. This also holds true for the less-complicated criterion of minimizing the absolute value of a state variable. For example, what is the optimal control u that yields the minimum of the absolute value of the final state of the system, i.e., the terminal control criterion, lxNl = min, (3.33) where the equation of state is x,+1 = ax,
+ u,
xo = c
and the control is constrained in that 0 < u, < M ? Let j” ( c ) = min lxNl .
(3.34)
(3.35)
U
Then at the final state, since zero stages remain to control, trivially
(3.36)
For one stage remaining,
$$f_1(c) = \min_{0 \le u_1 \le M} f_0(ac + u_1) = \min\bigl(f_0(ac),\, f_0(ac + M)\bigr), \tag{3.37}$$
when $u_1$ is zero or $M$, respectively, since $f_0$ is linear in $u_1$. Then $f_1(c) = |ac|$. In general, for $j$ stages remaining to control,
$$f_j(c) = \min_{0 \le u_j \le M} f_{j-1}(ac + u_j) \qquad (j = 1, 2, \ldots, N). \tag{3.38}$$
Now consider a minimax criterion problem: to find the $\{u_j\}$ sequence, also constrained as above, that yields
$$\min_{u_j} \max_{j} |x_j| \qquad (j = 1, 2, \ldots, N). \tag{3.39}$$
That is, find the optimal control policy $u$ that minimizes the maximum $|x|$ in whichever interval $j$ it happens to occur, using state equation (3.34). Let
$$f_N(c) = \min_{u_j} \max_{j = 1, 2, \ldots, N} |x_j|. \tag{3.40}$$
Again, trivially,
$$f_0(c) = |c|, \tag{3.41}$$
since there are no further decisions to make. Now
$$f_1(c) = \min_{u_1} \max_{j = 0, 1} |x_j| \tag{3.42}$$
or
$$f_1(c) = \min_{u_1} \max\bigl(|c|,\, |ac + u_1|\bigr), \tag{3.43}$$
but
$$f_1(c) = \min_{0 \le u_1 \le M} \max\bigl(|c|,\, f_0(ac + u_1)\bigr), \tag{3.44}$$
so that with $j$ stages remaining to control,
$$f_j(c) = \min_{0 \le u_j \le M} \max\bigl(|c|,\, f_{j-1}(ac + u_j)\bigr) \qquad (j = 1, 2, \ldots, N). \tag{3.45}$$
It is seen that an equivalent way of writing (3.45) is
$$f_j(c) = \max\Bigl(|c|,\, \min_{0 \le u_j \le M} f_{j-1}(ac + u_j)\Bigr) \qquad (j = 1, 2, \ldots, N). \tag{3.46}$$
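The recurrences (3.45) and (3.46) can be evaluated directly on a small control grid. The following is a minimal sketch, not a production algorithm: the values of `a`, `M`, and the grid size `n_u` are illustrative assumptions, and the brute-force recursion is only practical for a few stages.

```python
# Grid-based sketch of the minimax recurrences (3.45) and (3.46).
# Assumed for illustration only: a = 0.5, M = 1.0, a uniform control grid.

def control_grid(M, n_u):
    """Uniform grid of admissible controls on [0, M] (an assumed discretization)."""
    return [M * i / (n_u - 1) for i in range(n_u)]

def f_minimax(c, stages, a=0.5, M=1.0, n_u=21):
    """f_j(c) as in (3.45): min over u of max(|c|, f_{j-1}(a*c + u))."""
    if stages == 0:
        return abs(c)  # (3.41): no decisions remain
    return min(
        max(abs(c), f_minimax(a * c + u, stages - 1, a, M, n_u))
        for u in control_grid(M, n_u)
    )

def f_minimax_alt(c, stages, a=0.5, M=1.0, n_u=21):
    """The same quantity written as in (3.46): max(|c|, min over u of f_{j-1})."""
    if stages == 0:
        return abs(c)
    return max(
        abs(c),
        min(f_minimax_alt(a * c + u, stages - 1, a, M, n_u)
            for u in control_grid(M, n_u)),
    )

if __name__ == "__main__":
    for c in (-1.0, 0.0, 0.7, 2.0):
        print(c, f_minimax(c, 3), f_minimax_alt(c, 3))
```

The two functions agree because $\max(|c|, \cdot)$ is nondecreasing in its second argument, so the minimization over $u_j$ can be moved inside the maximum.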
As a second example, examine the following two-dimensional system [6], which achieves a similar solution, but in a more deductive manner. It is desired to find the optimal control sequence $\{u_j\}$ that yields
$$\min_{u_j} \max_{j = 1, 2, \ldots, N} |x_j|, \tag{3.47}$$
where the system state is governed by the pair of difference equations
$$x_{r+1} = x_r + y_r \Delta, \quad x_0 = c_1; \qquad y_{r+1} = y_r + g(x_r, y_r)\Delta, \quad y_0 = c_2. \tag{3.48}$$
It is important to appreciate the following seemingly simple idea. For a sequence of $k$ numbers $\{N_k\}$, any chosen number $N_i$ obeys the partition relationship
$$\max(N_1, N_2, \ldots, N_k) = \max\bigl(N_i,\, \max(N_1, N_2, \ldots, N_{i-1}, N_{i+1}, \ldots, N_k)\bigr). \tag{3.49}$$
Now let
$$f_N(c_1, c_2) = \min_{u_j} \max_{j = 1, 2, \ldots, N} |x_j|. \tag{3.50}$$
Using the partition relationship above, this can be rewritten
$$f_N(c_1, c_2) = \min_{u_j} \max\Bigl(|x_N|,\, \max_{j = 1, 2, \ldots, N-1} |x_j|\Bigr). \tag{3.51}$$
As the minimum is over the set $u_j$, i.e., the minimum operation is on the index $j$ only, it can be placed inside the parentheses to give
$$f_N(c_1, c_2) = \max\Bigl(|x_N|,\, \min_{u_j} \max_{j = 1, 2, \ldots, N-1} |x_j|\Bigr) \tag{3.52}$$
or
$$f_N(c_1, c_2) = \max\bigl(|x_N|,\, f_{N-1}(c_1 + c_2 \Delta,\; c_2 + g(c_1, c_2)\Delta)\bigr), \tag{3.53}$$
since with one less $j$ choice to make for $u_j$, the system now finds itself confronted with a similar minimax problem, but now advanced from the old state $(c_1, c_2)$ to the new state $(c_1 + c_2 \Delta,\; c_2 + g(c_1, c_2)\Delta)$ as given by the difference equations (3.48). Note that this method, if applied to the previous one-dimensional problem, will yield (3.46) as one of its solutions.

As an example of a simplified minimax criterion functional, consider the old chestnut of finding a bogus coin from a seemingly identical pile of coins, using an ungraduated balance, where it is known that the bogus coin is heavier (or lighter) than the others [10]. That is, find the minimum number of weighings required to guarantee finding the bogus coin from among $N$ coins. Let $f(N)$ be the minimum number of weighings required, for $N$ coins, to guarantee finding the bogus coin using an optimal weighing policy. The principle of optimality asserts that
$$f(N) = 1 + \min_{k} \max\bigl(f(k),\, f(N - 2k)\bigr). \tag{3.54}$$
It is obvious that the $N$ coins should be divided into three equal stacks, as nearly as possible. Then two stacks each contain $k$ coins, while the third contains $N - 2k$ coins, yielding $k = [N/3]$, where $[n]$ denotes the integer nearest to $n$. The two equal stacks containing $k$ coins each are weighed. This accounts for the first term (unity) on the right side of (3.54). If the scale becomes unbalanced, the search is immediately narrowed to $k$ coins. That case is fortuitous, because if the scale balances, the search narrows to the possibly larger stack of $N - 2k$ coins, since $N - 2k \ge N/3$ is possible. Then, at worst, the search narrows to the stack that has one more coin than the other two. Hence $k = [N/3]$, and
$$f(N) = 1 + \min_{k = [N/3]} \max\bigl(f(k),\, f(N - 2k)\bigr). \tag{3.55}$$
Since the minimum operator commutes with the maximum operator in the above equation, it becomes
$$f(N) = 1 + \max\bigl(f([N/3]),\, f(N - 2[N/3])\bigr). \tag{3.56}$$
This implies that the minimum number of weighings, $f(N)$, equals the first weighing plus the subsequent minimum number of weighings, $f([N/3])$ or $f(N - 2[N/3])$, whichever is the larger, to guarantee finding the bogus coin. To solve (3.56), it is convenient first to let $N = 3m$ to obtain
$$f(3m) = 1 + \max\bigl(f(m),\, f(m)\bigr) = 1 + f(m). \tag{3.57}$$
Second, let $N = 3m + 1$ in (3.56), to obtain
$$f(3m + 1) = 1 + \max\bigl(f(m),\, f(m + 1)\bigr) = 1 + f(m + 1). \tag{3.58}$$
Third, let $N = 3m + 2$ in (3.56), to obtain
$$f(3m + 2) = 1 + \max\bigl(f(m + 1),\, f(m)\bigr) = 1 + f(m + 1). \tag{3.59}$$
Using these recurrence relations, with $m = 1, 2, \ldots$, the nontrivial solutions are easily obtained,
$$f(N) = M, \qquad \text{where} \qquad 3^{M-1} < N \le 3^{M}. \tag{3.60}$$
For example, six weighings are sufficient to guarantee finding a bogus coin (known to be lighter or heavier) from among collections ranging from 243 to 729 coins.
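The recurrence (3.56) is easy to check by direct evaluation. The sketch below assumes the base cases $f(0) = f(1) = 0$ (no weighing is needed when at most one coin remains) and implements $[N/3]$ as `(n + 1) // 3`; the function names are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def weighings(n):
    """Minimum guaranteed number of weighings from recurrence (3.56)."""
    if n <= 1:
        return 0  # assumed base case: nothing left to search
    k = (n + 1) // 3  # integer nearest to n / 3
    return 1 + max(weighings(k), weighings(n - 2 * k))

def ceil_log3(n):
    """Smallest M with 3**M >= n, i.e. the M of (3.60)."""
    m, power = 0, 1
    while power < n:
        power *= 3
        m += 1
    return m

if __name__ == "__main__":
    for n in (3, 9, 10, 243, 244, 729):
        print(n, weighings(n), ceil_log3(n))
```

Under these assumptions the recurrence gives exactly six weighings for every $N$ from 244 through 729, while 243 coins need only five.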
CHAPTER 4
The Maximum Principle
4.1. Introduction
As alluded to in the earlier discussion, the modern theory of optimal control has seen the emergence in the last decade of two complementary methodologies for treating control processes. These are Bellman's optimality principle of dynamic programming and Pontryagin's maximum principle. Both of these possess advantages and disadvantages with respect to the formulation and solution of particular control problems. The central problem herein, optimal xenon shutdown control, will be mathematically formulated later, using both principles. The difficulties will be pointed out, and it will be seen that for this class of problems, especially from the computational viewpoint, dynamic programming is the more straightforward and hence the more efficacious method.

The aim of this chapter is to introduce the maximum principle. It will be stated but not proved, and illustrative examples and ramifications will be presented. Proof of the maximum principle can be found in many places in the control-theory literature of today [10, 11]. Consider the system described by the set of ordinary differential equations (2.11), rewritten here in expanded form
$$\begin{aligned}
\dot x_1 &= f_1(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; t), & x_1(0) &= x_{10}, \\
\dot x_2 &= f_2(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; t), & x_2(0) &= x_{20}, \\
&\;\;\vdots \\
\dot x_n &= f_n(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; t), & x_n(0) &= x_{n0},
\end{aligned} \tag{4.1}$$
where the state of the system is described by the vector $(x_1, x_2, \ldots, x_n)$, and the control vector is $(u_1, u_2, \ldots, u_n)$. The number of components in the state vector $\bar x$ and the control vector $\bar u$ need not be the same. Assume that it is desired to find the optimal control policy, i.e., the vector $\bar u = (u_1, u_2, \ldots, u_n)$ which will transform the system described by equations (4.1) from its initial state $(x_{10}, x_{20}, \ldots, x_{n0})$ to a final state in duration $T$ such that the criterion or cost functional,
$$J[\bar u] = \int_0^T f_{n+1}(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; t)\, dt, \tag{4.2}$$
is minimized. If it turns out that a particular criterion is to be maximized, one merely minimizes the negative in this context. Some typical examples of cost functionals are: (1) minimum time control, $f_{n+1} = 1$, so that trivially $J = T$; (2) minimum mean-squared control, $f_{n+1} = |\bar u|^2$, so that $J = \int_0^T |\bar u|^2\, dt$; (3) minimum mean-squared deviation, $f_{n+1} = |\bar x|^2$, so that $J = \int_0^T |\bar x|^2\, dt$; or various combinations of these.
Now define a new additional state variable
$$x_{n+1} = \int_0^t f_{n+1}(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; \tau)\, d\tau \tag{4.3}$$
with its differential equation
$$\dot x_{n+1} = f_{n+1}(x_1, x_2, \ldots, x_n;\; u_1, u_2, \ldots, u_n;\; t) \tag{4.4}$$
and initial condition $x_{n+1}(0) = 0$ and final condition $x_{n+1}(T) = J$. $x_{n+1}$ is an additional state variable which should be considered as added to the system described by equations (4.1).

As can be appreciated, the above general optimal control process is written in the language of the calculus of variations. With the integral constraint on the motion of the system, Eq. (4.2), such a formulation is called a Lagrange problem. Defining an additional variable $x_{n+1}$ as above, so that the motion of the system, augmented by the additional variable, is now controlled in order that $x_{n+1}(T) = \min$ is obtained, converts it to a Mayer problem. That is, the augmented system is transformed to a final state where one particular variable, $x_{n+1}$, has its extremum, while the others acquire certain fixed values. This is done because the maximum principle is couched in terms of a Mayer formulation.
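Now define a Hamiltonian $H$ by letting
$$H = p_1 f_1 + p_2 f_2 + \cdots + p_{n+1} f_{n+1}, \tag{4.5}$$
where a set of adjoint (dual, auxiliary) variables $p_1, p_2, \ldots, p_{n+1}$ is defined in that they satisfy the following system of equations, adjoint to the system given by (4.1):
$$\dot p_i = -\left(p_1 \frac{\partial f_1}{\partial x_i} + p_2 \frac{\partial f_2}{\partial x_i} + \cdots + p_{n+1} \frac{\partial f_{n+1}}{\partial x_i}\right), \quad p_i(T) = 0 \quad (i = 1, 2, \ldots, n); \qquad \dot p_{n+1} = 0, \quad p_{n+1}(T) = -1. \tag{4.6}$$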
From the definition of the Hamiltonian, it is seen that the state vector $\bar x$ and its dual vector $\bar p$ (the generalized momentum of classical mechanics) satisfy the canonical equations of Hamilton, i.e.,
$$\dot x_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial x_i} \qquad (i = 1, 2, \ldots, n + 1). \tag{4.7}$$
$x_{n+1}$ and $p_{n+1}$ are seen to be "special" variables, since the definition of $x_{n+1}$ from (4.3) implies
$$\dot p_{n+1} = -\frac{\partial H}{\partial x_{n+1}} = 0, \tag{4.8}$$
because the Hamiltonian does not depend on $x_{n+1}$ explicitly. In classical mechanics, it is said that the Hamiltonian is cyclic in the coordinate $x_{n+1}$, so that the corresponding generalized momentum $p_{n+1}$ is a constant of the motion. Then, for mathematical convenience, $p_{n+1} = -1$ is chosen, which is the reason for the final condition on $p_{n+1}$ in (4.6).

With regard to the control vector $\bar u$, the maximum principle requires that it belong to a closed set of functions. As mentioned, the maximum principle is an extension of the Weierstrass necessary condition for the existence of an extremal, in the variational calculus, to closed sets of control functions. In our terms, this merely implies that the control vector is constrained, so that for each of its components, $a_i \le u_i \le b_i$, $i = 1, \ldots, n$. The proof of the maximum principle requires closed sets of $\bar u$ vectors; otherwise the preceding statement of the control problem is meaningless. This often has a physical analogue as well, since an unconstrained control vector implies the availability of infinite amounts of energy, momentum, thrust, etc., which is physically impossible.

The maximum principle, briefly stated, is that the sought-for optimal control $\bar u^*$, which is the control vector that guides the system satisfying equations (4.1) so that $J$ in (4.2) is minimized, is also that control vector which maximizes the Hamiltonian defined in (4.5). That is,
$$\max_{\bar u} H\bigl(\bar p(t), \bar x(t), \bar u(t)\bigr) = H\bigl(\bar p(t), \bar x(t), \bar u^*(t)\bigr) = \text{constant}, \tag{4.9}$$
where the adjoint vector $\bar p(t)$ satisfies equations (4.6) and $p_{n+1} < 0$. In other words, considering the Hamiltonian as a functional of $\bar u$, the optimal control $\bar u^*$ is the vector that maximizes $H$, and this maximum is a constant of the motion. $H(\bar p, \bar x, \bar u^*)$ is a constant throughout the control duration $T$, and further $H(\bar p, \bar x, \bar u^*) = 0$ if $T$ is not fixed a priori. The maximum principle is a necessary, but not sufficient, condition. However, for most control processes, physical intuition will provide sufficiency arguments as well for the calculated optimal $\bar u^*$.

With respect to the $\bar u^*$ that gives the sought-for $H(\bar p, \bar x, \bar u^*) = \max$, the optimal control behavior falls into two general categories. The first is that if the system equations (4.1) (usually linear) are such that the resulting Hamiltonian is linear in the $u_i$, then with the constraints $a_i \le u_i \le b_i$, $i = 1, 2, \ldots, n$, the optimal control lies on the boundary of its closed set. That is, all the $u_i$ will equal either $a_i$ or $b_i$ over all or part of the control duration $T$. If the preceding is true only over portions of the control duration, then the $u_i$ will switch back and forth between $a_i$ and $b_i$ during the control phase, so that $H(\bar p, \bar x, \bar u^*) = \max$ will be maintained throughout the control duration. This type of control is called bang-bang, for obvious reasons. The second category is the case where $H(\bar p, \bar x, \bar u)$ is nonlinear in $\bar u$. Then the $u_i$ are obtained from $\partial H / \partial u_i = 0$, which generally results in a continuous function of time for the optimal control. Both of these types will be illustrated in the following examples.
4.2. Two Examples
Example 1. The first example is one-dimensional (first-order) and is quite similar to the one in Section 1.8 (which was solved by dynamic programming). The equation of state is
$$\dot n = u n, \qquad n(0) = 1. \tag{4.10}$$
The cost functional is $J = \beta \int_0^T u^2\, dt$. Following the scheme of the maximum principle, define the additional variable
$$y = \beta \int_0^t u^2\, d\tau \tag{4.11}$$
with the corresponding additional equation of state
$$\dot y = \beta u^2, \qquad y(0) = 0, \tag{4.12}$$
thereby converting the Lagrange problem of minimizing $\beta \int_0^T u^2\, dt$ to the Mayer problem of seeking $y(T) = \min$. The problem now is one of finding the optimal control $u$ that transfers the system from its initial state $n(0) = 1$ to a prescribed final state $n_T$, in control duration $T$, such that $y(T)$ is a minimum. The Hamiltonian is simply
$$H = p_1 u n + p_2 \beta u^2, \tag{4.13}$$
and the adjoint variables satisfy
$$\dot p_1 = -\frac{\partial H}{\partial n} = -u p_1, \qquad \dot p_2 = -\frac{\partial H}{\partial y} = 0. \tag{4.14}$$
Since the Hamiltonian is nonlinear in $u$, $\partial H / \partial u = 0$, together with $p_2 = -1$, yields
$$u^* = n p_1 / 2\beta. \tag{4.15}$$
Then to obtain $u^*$ explicitly, it is necessary to solve the following two-point boundary-value problem:
$$\dot n = u n, \qquad n(0) = 1, \tag{4.16a}$$
$$\dot y = \beta u^2, \qquad y(0) = 0, \tag{4.16b}$$
$$\dot p_1 = -u p_1, \qquad p_1(T) = 0, \tag{4.16c}$$
$$\dot p_2 = 0, \qquad p_2(T) = -1. \tag{4.16d}$$
That is, both $n$ and $p_1$ must be obtained from solutions of equations (4.16). In this case it is easy, since multiplication of (4.16c) by $n$ yields
$$n \dot p_1 + p_1 u n = 0. \tag{4.17}$$
Using equation (4.16a), this is equivalent to
$$\frac{d}{dt}(n p_1) = 0 \qquad \text{or} \qquad n p_1 = c_1 \ \text{(constant)}. \tag{4.18}$$
This implies that $u^* = c_1 / 2\beta$, a constant of the motion. Integrating equation (4.16a) yields
$$n = \exp(c_1 t / 2\beta). \tag{4.19}$$
In terms of the final state $n_T$, $c_1 = (2\beta / T) \ln n_T$, which follows from (4.19). In turn, the optimal control is
$$u^* = \frac{1}{T} \ln n_T \tag{4.20}$$
and the corresponding state behavior is
$$n = (n_T)^{t/T}. \tag{4.21}$$
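A short numerical check of the constant control (4.20) is sketched below. It integrates $\dot n = un$ exactly over small steps on which $u$ is held constant, accumulates the cost $\beta \int u^2\, dt$, and compares the result with a two-level control chosen to have the same time integral, so that it reaches the same final state. The values $n_T = 2$, $T = 1$, $\beta = 1$ and the comparison control are assumptions made only for illustration.

```python
import math

def simulate(control, T, beta=1.0, steps=20000):
    """Integrate dn/dt = u*n from n(0) = 1 and accumulate beta * integral of u^2."""
    dt = T / steps
    n, cost = 1.0, 0.0
    for k in range(steps):
        u = control(k * dt)
        n *= math.exp(u * dt)      # exact update while u is held constant
        cost += beta * u * u * dt
    return n, cost

n_T, T = 2.0, 1.0                  # assumed illustrative values
u_star = math.log(n_T) / T         # constant optimal control, equation (4.20)

def constant_control(t):
    return u_star

def two_level_control(t):
    # Same time integral as u_star, hence the same final state, but not constant.
    return 1.5 * u_star if t < T / 2 else 0.5 * u_star

if __name__ == "__main__":
    print(simulate(constant_control, T))
    print(simulate(two_level_control, T))
```

Both controls reach approximately $n_T$, but the constant control does so at the lower cost, as expected from the quadratic criterion.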
If the final state is greater than the initial state, i.e., $n_T > 1$, then $u^*$ is positive, so that the system increases exponentially to its final state, reaching there in time $T$. On the other hand, if $n_T < 1$, then $u^*$ is negative, so that the system decreases exponentially to its final state. Further,
$$\min J = \beta T u^{*2} = \frac{\beta}{T} \ln^2 n_T, \tag{4.22}$$
with
$$\max H = \beta u^{*2} = \frac{\beta}{T^2} \ln^2 n_T, \tag{4.23}$$
or that
$$\max H = \frac{1}{T} \int_0^T \beta u^{*2}\, dt, \tag{4.24}$$
so that the optimal control $u^*$ transforms the system motion so that the Hamiltonian is proportional to the minimum mean-squared control cost. Though $u^*$ remained constant throughout the control duration, this is not true bang-bang control, as $u^*$ does not switch between its bounds at a certain time, or times, during the control duration $T$. A bang-bang control example will be presented in Section 4.3.

Example 2. As the next example, change only the control criterion to $J_2 = \int_0^T (u^2 + \theta^2)\, dt$, where the equation of state is the same as before, but the state variable is redefined as $\theta = \log n$. Then the augmented equation of state is, from (4.10),
$$\dot\theta = u. \tag{4.25}$$
As before, define a second variable $z = \int_0^t (u^2 + \theta^2)\, d\tau$ with
$$\dot z = u^2 + \theta^2. \tag{4.26}$$
The Hamiltonian is
$$H = p_1 u + p_2 (u^2 + \theta^2), \tag{4.27}$$
and
$$\dot p_1 = -\frac{\partial H}{\partial \theta} = -2 p_2 \theta, \qquad \dot p_2 = -\frac{\partial H}{\partial z} = 0, \qquad p_2 = -1. \tag{4.28}$$
From $\partial H / \partial u = 0$,
$$u^* = p_1 / 2. \tag{4.29}$$
Using (4.25), (4.28), and (4.29), it is seen that the optimal control satisfies
$$\ddot u^* - u^* = 0. \tag{4.30}$$
Since now $\dot\theta = u^*$, the initial conditions on $u^*$ needed to solve (4.30) can be inferred from those on $\dot\theta$. Let $\dot\theta(0) = u_0^*$, so that the system is initially disturbed from equilibrium, and $\dot\theta_T = u_T^* = 0$, for equilibrium to be achieved at the end of the control duration. With these conditions, the optimal control is obtained from (4.30) as
$$u^* = \frac{u_0^* \sinh(T - t)}{\sinh T}, \tag{4.31}$$
which is seen to be an example of a continuous optimal control function. The state-variable motion then follows simply from $\dot\theta = u^*$ and $\theta(0) = 0$.
4.3. Bang-Bang Control [12]
Consider the linear second-order system described by
$$\dot x_1 = -x_1 + x_2, \quad x_1(0) = 0; \qquad \dot x_2 = \tfrac{5}{4} x_1 - 3 x_2 + u, \quad x_2(0) = 0. \tag{4.32}$$
It is desired to transform this system from its initial equilibrium state $(0, 0)$ to the final state $(2, 2)$ in minimum time. There are constraints on the magnitude of $u$ given by $0 \le u \le 7$. For minimum time control, the cost functional is evidently $J = T$. The additional variable is $x_3 = t$, so $\dot x_3 = 1$. The Hamiltonian is simply
$$H = p_1(-x_1 + x_2) + p_2\bigl(\tfrac{5}{4} x_1 - 3 x_2 + u\bigr) - 1. \tag{4.33}$$
The adjoint variables satisfy
$$\dot p_1 = -\frac{\partial H}{\partial x_1} = p_1 - \tfrac{5}{4} p_2, \qquad \dot p_2 = -\frac{\partial H}{\partial x_2} = -p_1 + 3 p_2, \qquad \dot p_3 = -\frac{\partial H}{\partial x_3} = 0. \tag{4.34}$$
The fact that the control is bang-bang can be ascertained immediately, since $H$ can be rewritten
$$H = p_2 u + \text{terms not containing } u. \tag{4.35}$$
Hence, to maximize $H$,
$$u^* = \begin{cases} 7, & p_2 > 0, \\ 0, & p_2 < 0. \end{cases} \tag{4.36}$$
Then, over the initial part of the control epoch, $u^* = 7$, switching at a time $t_s < T$ to $u^* = 0$ until the control duration is ended. If $p_2 = 0$ over part of the control duration, $u^*$ is arbitrary, which behavior is called singular control. This occurs here only at isolated points,† so that no difficulties of this type arise. Singular control has only recently begun to come under scrutiny by investigators in modern optimal control theory.

† On a set of measure zero (cf. reference [10], Chapter 5).
The solutions of equations (4.32) are, within arbitrary constants, given by

(1) $0 < t < t_s$:
$$x_1 = A e^{-(7/2)t} + B e^{-(1/2)t} + \tfrac{4}{7} u, \qquad x_2 = -\tfrac{5}{2} A e^{-(7/2)t} + \tfrac{1}{2} B e^{-(1/2)t} + \tfrac{4}{7} u, \tag{4.37}$$

(2) $t_s < t < T$:
$$x_1 = F e^{-(7/2)t} + G e^{-(1/2)t}, \qquad x_2 = -\tfrac{5}{2} F e^{-(7/2)t} + \tfrac{1}{2} G e^{-(1/2)t}, \tag{4.38}$$

and the solutions of equations (4.34) are
$$p_1 = C e^{(7/2)t} + D e^{(1/2)t}, \tag{4.39}$$
$$p_2 = -2 C e^{(7/2)t} + \tfrac{2}{5} D e^{(1/2)t}, \tag{4.40}$$
$$p_3 = -1. \tag{4.41}$$
Using the initial and final states, the two-point boundary-value solution for $(x_1, x_2)$ is as follows:

(1) $0 < t < t_s$, $u = 7$:
$$x_1 = \tfrac{2}{3} e^{-(7/2)t} - \tfrac{14}{3} e^{-(1/2)t} + 4, \tag{4.42}$$
$$x_2 = -\tfrac{5}{3} e^{-(7/2)t} - \tfrac{7}{3} e^{-(1/2)t} + 4. \tag{4.43}$$

(2) $t_s < t < T$, $u = 0$:
$$x_1 = -\tfrac{1}{3} e^{(7/2)(T - t)} + \tfrac{7}{3} e^{(1/2)(T - t)}, \tag{4.44}$$
$$x_2 = \tfrac{5}{6} e^{(7/2)(T - t)} + \tfrac{7}{6} e^{(1/2)(T - t)}. \tag{4.45}$$

Since $x_1$ and $x_2$ are continuous (but not their derivatives) at the switching time $t_s$, the switching time itself can be determined by matching the above solutions at $t = t_s$. This results in two transcendental equations involving the control duration $T$ and the switching time $t_s$. These can be solved using numerical approximation methods. It turns out that $t_s = 1.605$ and $T = 1.802$. The constants $C$ and $D$ are obtained from $H(\bar p(0), \bar x(0), u^*(0)) = 0$, since the time $T$ is unspecified,† and $p_2(t_s) = 0$ in order that $H(\bar p(t_s), \bar x(t_s), u^*(t_s)) = 0$ holds.

† Since the time is unspecified a priori, $H(\bar p(t), \bar x(t), u^*(t)) = 0$ during the entire control duration, as stated in the maximum principle.
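The matching of the two arcs at $t = t_s$ can be carried out with a simple bisection. The sketch below is one possible way to do it, not the method used in the text: it evaluates the $u = 7$ arc and the $u = 0$ arc from the closed-form solutions, picks the coasting time $\tau = T - t_s$ so that the combination $5x_1 + 2x_2$ (which isolates the slow mode) matches, and then drives the remaining mismatch in $x_1 - 2x_2$ (the fast mode) to zero. The bracket $[1.4, 2.0]$ and the function names are assumptions.

```python
import math

def phase1(t):
    """State reached from (0, 0) under u = 7 after time t."""
    x1 = (2 / 3) * math.exp(-3.5 * t) - (14 / 3) * math.exp(-0.5 * t) + 4
    x2 = -(5 / 3) * math.exp(-3.5 * t) - (7 / 3) * math.exp(-0.5 * t) + 4
    return x1, x2

def phase2(tau):
    """State from which u = 0 reaches (2, 2) after a further time tau = T - t."""
    x1 = -(1 / 3) * math.exp(3.5 * tau) + (7 / 3) * math.exp(0.5 * tau)
    x2 = (5 / 6) * math.exp(3.5 * tau) + (7 / 6) * math.exp(0.5 * tau)
    return x1, x2

def mismatch(ts):
    """Match the slow mode exactly, return the fast-mode mismatch and tau."""
    x1, x2 = phase1(ts)
    tau = 2 * math.log((5 * x1 + 2 * x2) / 14)   # from 5*x1 + 2*x2 = 14*exp(tau/2)
    y1, y2 = phase2(tau)
    return (x1 - 2 * x2) - (y1 - 2 * y2), tau

def solve_switching(lo=1.4, hi=2.0, iters=60):
    """Bisection on the switching time; the bracket is an assumed starting guess."""
    f_lo = mismatch(lo)[0]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid)[0]
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    ts = 0.5 * (lo + hi)
    return ts, ts + mismatch(ts)[1]

if __name__ == "__main__":
    ts, T = solve_switching()
    print(f"t_s = {ts:.3f}, T = {T:.3f}")
```

With these closed forms the bisection reproduces the values quoted above to three decimal places.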
4.4. Continuous and Bang-Bang Control

Let the linear one-dimensional equation of state be given by
$$\dot x = -x + u, \qquad x(0) = 1. \tag{4.46}$$
It is desired to control the system so that it reaches equilibrium at an unspecified time $T$, i.e., $x(T) = 0$, while minimizing the cost functional [12]
$$J = \int_0^T (x^2 + u^2)\, dt. \tag{4.47}$$
There are two cases: (1) no constraint on $u$, and (2) the control constraint $|u| \le \tfrac{1}{4}$.

Case (1). The additional variable $y$ satisfies
$$\dot y = x^2 + u^2, \tag{4.48}$$
with the Hamiltonian given by
$$H = p_1(-x + u) - (x^2 + u^2). \tag{4.49}$$
As $H$ is nonlinear in $u$, $\partial H / \partial u = 0$ yields the optimal control as
$$u^* = p_1 / 2 \tag{4.50}$$
and
$$\dot p_1 = -\frac{\partial H}{\partial x} = p_1 + 2x. \tag{4.51}$$
Eliminating $p_1$ from (4.46), (4.50), and (4.51) results in
$$\ddot x - 2x = 0, \qquad \ddot u^* - 2u^* = 0, \tag{4.52}$$
with solutions
$$x = A e^{\sqrt{2}\, t} + B e^{-\sqrt{2}\, t}, \tag{4.53}$$
$$u^* = (1 + \sqrt{2}) A e^{\sqrt{2}\, t} + (1 - \sqrt{2}) B e^{-\sqrt{2}\, t}. \tag{4.54}$$
$A$ and $B$ are found by using $x(0) = 1$ and $H(p, x, u^*) = 0$. Two pairs occur, viz., $A = 0$, $B = 1$ and $A = 1$, $B = 0$. The first pair is chosen because the desired final (equilibrium) state could not be reached otherwise. Then the optimal path, control, and duration are given, respectively, by
$$x = e^{-\sqrt{2}\, t}, \qquad u^* = (1 - \sqrt{2}) e^{-\sqrt{2}\, t}, \qquad T = \infty. \tag{4.55}$$
Case (2). This is the same problem as in case (1), but with the additional control constraint $|u| \le \tfrac{1}{4}$. Hence the equations of state, the Hamiltonian, and the adjoint equations are the same as in case (1). $u^*$ is depicted in Fig. 4.1.

FIG. 4.1. Optimal control $u^*$ for continuous and bang-bang control (horizontal axis: time, with the switching time $t = t_s$ marked).
It is first appropriate to find the switching time, $t_s$, at which the optimal control $u^*$ changes from $u^* = -\tfrac{1}{4}$ to a continuous function. Since $u^* = p_1 / 2$ holds where $u^*$ is continuous, which is the region for the validity of obtaining $u^*$ from $\partial H / \partial u = 0$, then $t_s$ is found from
$$-\tfrac{1}{4} = u^*(t_s) = p_1(t_s) / 2. \tag{4.56}$$
Therefore $p_1$ must be found in the region $0 < t < t_s$, i.e., where $u^* = -\tfrac{1}{4}$. This is done by first solving (4.46) with $u = -\tfrac{1}{4}$. Its solution is
$$x = \tfrac{5}{4} e^{-t} - \tfrac{1}{4}. \tag{4.57}$$
With this $x$ inserted in (4.51), its solution gives
$$p_1 = \bigl(p_1(0) + \tfrac{3}{4}\bigr) e^{t} + \tfrac{1}{2} - \tfrac{5}{4} e^{-t}. \tag{4.58}$$
$p_1(0)$ is obtained from $H(p_1(0), x(0), -\tfrac{1}{4}) = 0$ in (4.49). This yields
$$p_1 = -\tfrac{1}{10} e^{t} + \tfrac{1}{2} - \tfrac{5}{4} e^{-t}, \qquad 0 < t < t_s. \tag{4.59}$$
Hence $t_s$ is obtained from (4.56) and (4.59) evaluated at $t_s$. The behavior of $u^*$ for times following the switching time $t_s$ is obtained in the same way as in case (1).
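The switching time defined by (4.56) and (4.59) can be found numerically. The sketch below bisects on $p_1(t) + \tfrac{1}{2} = 0$ and then checks that the unconstrained feedback law of case (1), written here as $u^* = (1 - \sqrt{2})\,x$ (which follows from (4.55)), takes the value $-\tfrac{1}{4}$ at the switch, so that the control is continuous there. The bracket $[0, 1]$ is an assumption.

```python
import math

def p1(t):
    """Adjoint variable on the constrained arc, equation (4.59)."""
    return -0.1 * math.exp(t) + 0.5 - 1.25 * math.exp(-t)

def x_constrained(t):
    """State on the constrained arc u = -1/4, equation (4.57)."""
    return 1.25 * math.exp(-t) - 0.25

def switching_time(lo=0.0, hi=1.0, iters=60):
    """Solve p1(t_s) / 2 = -1/4, equation (4.56), by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p1(mid) + 0.5 < 0:
            lo = mid       # control still saturated at -1/4
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    ts = switching_time()
    u_after = (1 - math.sqrt(2)) * x_constrained(ts)
    print(f"t_s = {ts:.4f}, u just after the switch = {u_after:.4f}")
```

The design choice here is to treat (4.56) as a scalar root-finding problem; since (4.59) is a combination of $e^{t}$ and $e^{-t}$, the same root could also be obtained in closed form from a quadratic in $e^{t_s}$.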
4.5. Optimal Orbital-Rendezvous Control
It is desired to transform a system (space vehicle) from an initial state to a final state that is not completely fixed but lies on a given curve [12]. This can be construed as orbital-rendezvous control, where a space vehicle is to be launched and controlled to minimize, for example, fuel consumption and to rendezvous with an orbiting space station within a given time $T$. The additional orbital rendezvous condition that the optimal path must satisfy comes from the calculus of variations, where it is referred to as a transversality condition. This derived condition is a generalized orthogonality property, as can be realized by consulting references in the variational calculus [13]. Consider the second-order system whose equations of state are
$$\dot x_1 = x_2, \qquad x_1(0) = 2, \tag{4.60}$$
$$\dot x_2 = u, \qquad x_2(0) = 0. \tag{4.61}$$
It is desired to find the optimal control $u^*$ that transforms the state of the system, initially at $(2, 0)$, to the orbit (circle) $x_1^2(T) + x_2^2(T) = 1$ in the control duration time $T$. The control is constrained so that $|u| \le 1$, and the cost functional to be minimized is
$$J = \int_0^T \bigl(|u| + \tfrac{1}{2}\bigr)\, dt. \tag{4.62}$$
Here $\int_0^T |u|\, dt$ can be construed as a fuel-consumption cost, while $T/2$ is the rendezvous cost. The additional variable $y = \int_0^t (|u| + \tfrac{1}{2})\, d\tau$ satisfies the differential equation
$$\dot y = |u| + \tfrac{1}{2}. \tag{4.63}$$
The Hamiltonian is simply
$$H = p_1 x_2 + p_2 u - |u| - \tfrac{1}{2}. \tag{4.64}$$
The adjoint equations are given by
$$\dot p_1 = 0, \qquad \dot p_2 = -p_1, \qquad \dot p_3 = 0, \qquad p_3 = -1. \tag{4.65}$$
The control is seen to be of the "bang-off-bang" type. That is, to maximize $H$,
$$u^* = \begin{cases} +1, & p_2 > 1, \\ \phantom{+}0, & -1 < p_2 < 1, \\ -1, & p_2 < -1. \end{cases} \tag{4.66}$$
The solutions of equations (4.65) are, within arbitrary constants,
$$p_1 = C, \qquad p_2 = B - Ct. \tag{4.67}$$
To determine the constants $B$ and $C$, assume that the control regime begins with $u^* = -1$, which is prior to the switching time $t_s$. That is, let $u^* = -1$ for $0 < t < t_s$, so that $H(\bar x(0), \bar p(0), -1) = 0$ can be used as one condition to determine $B$ and $C$ in (4.67). The other condition is obtained from (4.66); i.e., $p_2(t_s) = -1$ if $u^* = -1$ marks the first phase of control. These two conditions result in $B = -\tfrac{3}{2}$ and $C = -1/(2 t_s)$, so that
$$p_1 = -\frac{1}{2 t_s}, \qquad p_2 = -\frac{3}{2} + \frac{t}{2 t_s}. \tag{4.68}$$
Hence, from the case where the first phase of the optimal control is $u^* = -1$, the second-phase control is $u^* = 0$. Now suppose that the rendezvous occurs during this phase of the control duration. Then the transversality condition discussed earlier yields [13]
$$p_1(T) / x_1(T) = p_2(T) / x_2(T). \tag{4.69}$$
Using (4.60), (4.61), (4.68), and (4.69) at the end of the control duration $T$, together with the orbit $x_1^2(T) + x_2^2(T) = 1$, will determine the switching time $t_s$ and the control duration $T$. To provide a measure of confidence in the solution, it is readily shown that $H(\bar x, \bar p, u^*) = 0$ over the control duration $T$.
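The determination of $t_s$ and $T$ described above can be sketched numerically. The code below assumes, as in the text, a first phase with $u^* = -1$ followed by coasting with $u^* = 0$, and additionally assumes that the rendezvous takes place at the first crossing of the circle, on the side $x_1 > 0$. For a trial $t_s$ it integrates (4.60) and (4.61) in closed form, finds the arrival time on the circle, and bisects on the transversality condition (4.69). The bracket and the function names are illustrative; the resulting numbers come from this sketch and are not quoted from the text.

```python
import math

def coast_end(ts):
    """State on the unit circle and arrival time for a given switching time."""
    x2_T = -ts                                # x2 is frozen once u = 0
    x1_switch = 2.0 - 0.5 * ts * ts           # x1 at the end of the u = -1 phase
    x1_T = math.sqrt(1.0 - ts * ts)           # assumed first crossing, x1 > 0
    T = ts + (x1_switch - x1_T) / ts          # coasting at speed |x2| = ts
    return x1_T, x2_T, T

def adjoints(ts, t):
    """Adjoint variables from (4.68)."""
    return -1.0 / (2.0 * ts), -1.5 + t / (2.0 * ts)

def transversality_gap(ts):
    """Left side minus right side of (4.69) at the arrival time."""
    x1_T, x2_T, T = coast_end(ts)
    p1, p2 = adjoints(ts, T)
    return p1 / x1_T - p2 / x2_T

def solve_rendezvous(lo=0.3, hi=0.9, iters=60):
    """Bisection on the switching time; the bracket is an assumed starting guess."""
    f_lo = transversality_gap(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = transversality_gap(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    ts = 0.5 * (lo + hi)
    return ts, coast_end(ts)[2]

if __name__ == "__main__":
    ts, T = solve_rendezvous()
    p1, p2 = adjoints(ts, T)
    print(f"t_s = {ts:.4f}, T = {T:.4f}, p2(T) = {p2:.4f}")
```

A useful consistency check on any solution found this way is that $p_2(T)$ lies between $-1$ and $1$, as (4.66) requires for the coasting phase, and that $H = p_1 x_2 - \tfrac{1}{2}$ vanishes there.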
4.6. Simplified Xenon Shutdown Control

To supply a foretaste of what is to come, and to become acquainted with an unusual type of criterion functional, the following greatly simplified xenon control problem is given. For convenience, the dimensionless xenon and iodine equations (3.12) and (3.13) are rewritten here as
$$\dot x = -(w + r_0 u)x + \gamma_1 (w + r_0) y + \gamma_2 (w + r_0) u, \qquad x(0) = 1, \tag{4.70}$$
$$\dot y = u - y, \qquad y(0) = 1. \tag{4.71}$$
It is desired to find the optimal control $u^*$ such that the xenon concentration reaches the minimum value possible, from initial equilibrium, at a time $T_0$ in the post-shutdown phase. The control duration is $T$, so that $T_0 > T$. It is also desired in the post-shutdown phase that $u^* \equiv 0$, and that the control constraints are $0 \le u \le M$. To allow a simplification of the above equations of state, the control period will be considered short (1 to 3 hours) compared to the half-life of the iodine isotope. This means that the iodine concentration can be approximated by its equilibrium value $y(0) = 1$ in (4.70). Two further approximations are: (1) The high equilibrium flux is such that $r_0 \gg w$. This corresponds to an equilibrium flux $\varphi_0 \ge 10^{14}$ neutrons/cm$^2$-sec. (2) The production of xenon directly from fission is small compared to that produced from iodine decay. Hence, the third term on the right side of (4.70) is neglected in comparison to the second. All this reduces the state description to one dimension, expressed by
$$\dot x = -(w + r_0 u)x + \gamma_1 r_0. \tag{4.72}$$
The cost functional is given by
$$J = \int_0^T \dot x\, dt = x(T) - x(0), \tag{4.73}$$
since it is desired to make $x(T_0) = \min$, which will be seen to depend on obtaining $x(T) = \min$. The additional variable is $y = \int_0^t \dot x\, d\tau$, so that
$$\dot y = \dot x. \tag{4.74}$$
The Hamiltonian is
$$H = p_1\bigl(-(w + r_0 u)x + \gamma_1 r_0\bigr) + p_2 \dot y. \tag{4.75}$$
Since $\dot p_2 = -\partial H / \partial y = 0$ implies $p_2 = -1$, and $\dot y = \dot x$, then
$$H = (p_1 - 1)\bigl(-(w + r_0 u)x + \gamma_1 r_0\bigr) \tag{4.76}$$
or
$$H = -r_0 u x (p_1 - 1) + \text{terms not containing } u. \tag{4.77}$$
The control is seen to be bang-bang, since the maximum $H$ is given when
$$u^* = \begin{cases} M, & p_1 - 1 < 0, \\ 0, & p_1 - 1 > 0, \end{cases} \tag{4.78}$$
as $x$ is always positive. Now
$$\dot p_1 = -\partial H / \partial x = (p_1 - 1)(w + r_0 u), \tag{4.79}$$
so that
$$p_1 - 1 = c \exp \int_0^t (w + r_0 u)\, dt. \tag{4.80}$$
Since the exponential is always positive, irrespective of the magnitude and sign of $u$, the sign of $c$ determines that of $p_1 - 1$ and in turn the $u^*$. For $u^* = M$, the solution of (4.72) is
$$x = \frac{\gamma_1 r_0}{w + r_0 M}\bigl(1 - \exp[-(w + r_0 M)t]\bigr) + \exp[-(w + r_0 M)t], \tag{4.81}$$
while for $u^* = 0$ (in the post-control region), it is
$$x = \frac{\gamma_1 r_0}{w}\bigl(1 - e^{-w(t - T)}\bigr) + x_T e^{-w(t - T)}, \tag{4.82}$$
where $x_T$ is obtained from (4.81). Assuming $M \ge 1$, it is seen that $u^* = M$ in the control phase, followed by a post-control $u^* = 0$, will result in a minimum $x_T$, and thus a minimum $x(T_0)$. Therefore, the optimal control is to let $u^* = M$ during the control duration $T$, and then shut down to $u^* = 0$, which begins the post-shutdown phase. This is depicted in Fig. 4.2 and is the control procedure substantiated from physical intuition. For this simple one-dimensional example, no xenon maximum occurs in the post-control phase as in the actual situation. However, the above xenon behavior is analogous to the actual samarium poison behavior discussed in Section 3.2.

FIG. 4.2. Optimal shutdown for simplified xenon shutdown (upper panel: flux $u$ versus time, with levels $M$ and 0; lower panel: xenon concentration $x$ versus time; times $T$ and $T_0$ marked).
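The effect of the flux level on the xenon concentration can be illustrated with the closed-form solutions (4.81) and (4.82). The sketch below generalizes (4.81) to any constant control $u$ (same derivation from (4.72)) and compares $x_T$ and $x(T_0)$ for a few constant flux levels. All numerical values (`w`, `r0`, `gamma1`, `T`, `T0`) are hypothetical and chosen only so that $r_0 \gg w$; they are not taken from the text.

```python
import math

def x_control(t, u, w, r0, gamma1):
    """Solution of (4.72) for constant flux u with x(0) = 1; (4.81) is the case u = M."""
    rate = w + r0 * u
    x_inf = gamma1 * r0 / rate
    return x_inf * (1.0 - math.exp(-rate * t)) + math.exp(-rate * t)

def x_post(t, T, x_T, w, r0, gamma1):
    """Post-control solution with u = 0, equation (4.82)."""
    decay = math.exp(-w * (t - T))
    return (gamma1 * r0 / w) * (1.0 - decay) + x_T * decay

# Hypothetical dimensionless parameters, for illustration only.
W, R0, GAMMA1 = 0.73, 10.0, 0.95
T, T0, M = 0.2, 0.5, 1.0

def outcome(u):
    """Return (x_T, x(T0)) when the flux is held at u during the control phase."""
    x_T = x_control(T, u, W, R0, GAMMA1)
    return x_T, x_post(T0, T, x_T, W, R0, GAMMA1)

if __name__ == "__main__":
    for u in (0.0, 0.5 * M, M):
        x_T, x_T0 = outcome(u)
        print(f"u = {u:.2f}: x_T = {x_T:.3f}, x(T0) = {x_T0:.3f}")
```

Among the constant controls compared here, $u = M$ gives the smallest $x_T$ and hence the smallest $x(T_0)$, in line with the bang-bang result above.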
4.7. The Two-Point Boundary-Value Problem

An immediate difficulty with the maximum-principle method, apparent at this point, is that the corresponding systems of equations constitute two-point boundary-value problems. That is, some of the boundary conditions are given initially, at $t = 0$, while the remainder are given at the end of the control duration, $t = T$. In these illustrations the two-point boundary-value difficulty is not felt, but in substantive higher-order problems, as for example the xenon shutdown problem, severe computational difficulties are manifest. These will be discussed in detail later, when a comparison of the solution procedure for the xenon shutdown problem will be made using both the maximum principle and dynamic programming.

Suffice it to say at this point that the numerical integration of the differential equations, which is a necessary aspect of the maximum-principle procedure for complex systems, when run on large-scale high-speed digital computers, must proceed in one time direction, whether forward or backward. Then, to initiate the computation, all the boundary conditions must be specified at one end or the other. Since not all the boundary conditions can be so stipulated in the maximum-principle formulation, some must be guessed at when the computational run is begun. Therefore a trial-and-error procedure ensues which is in general less than satisfactory, especially in the situation where there is little physical intuition for the behavior of the solutions. Further, for certain bang-bang control problems, the times at which the control is switched between its bounds must also be found by trial and error. This will be seen when the maximum-principle and dynamic-programming methods are compared for the xenon shutdown problem.

One of the salient features of dynamic programming is that all the boundary conditions are stipulated at one end of the control duration. This is usually at the end of the control epoch, as will be seen in the xenon problems (a) and (b). Then the corresponding dynamic-programming computational algorithm proceeds in a straightforward fashion, yielding solutions without a trial-and-error procedure. Thus, another advantage of dynamic programming is that it reformulates often-recalcitrant two-point boundary-value problems into initial-value problems, so that their computational solution easily ensues.
CHAPTER 5
Minimum and Minimax Xenon Shutdown
5.1. Mathematical Restatement of Optimal Xenon Shutdown

The optimal xenon shutdown control policies that (1) minimize the post-control xenon maximum, and (2) minimize the post-control xenon concentration at a given time, are stated in Section 3.3 and are called problems (a) and (b), respectively. Problems (a) and (b), in terms of the calculus of variations, are called Mayer problems, but because of their particular characteristics, are of an implicit and nonclassical type, especially because of the types of constraints involved. They have features in common with certain problems in modern optimal control theory, particularly those in the areas of orbital guidance and trajectory synthesis for space vehicles. In fact, using an optimal flux shutdown program over an allowed shutdown control duration, after which the flux is zero, to minimize the xenon concentration in the post-control (coasting) phase is analogous to the optimally controlled powered flight of an ICBM until engine cutoff, after which the missile coasts ballistically to minimize the target circular-error probability (CEP).

In terms of the dynamic-programming hierarchy of optimal control processes, problems (a) and (b) are essentially terminal control problems. This is because the cost functional is expressed in terms of the state of the system at the termination of control [13]. This aspect will be discussed in detail later.

In this context then, the normalized xenon and iodine concentrations, $x$ and $y$, respectively, constitute the state variables of the xenon-iodine phase space. Equations (3.12) and (3.13) are the equations of state of the "optimally controlled process" and the normalized flux $u$ is the control variable. At the end of the control phase, i.e., the allowable shutdown duration $T$, the flux becomes zero from then on, so that $u^*(t \ge T) = 0$. $u^*(t)$ is the optimal shutdown flux program. Equations (3.12) and (3.13) in the post-shutdown phase, $t \ge T$, where $u^* = 0$, are
$$\dot x = -w x + \gamma_1 (w + r_0) y, \qquad x(T) = x_T, \tag{5.1}$$
$$\dot y = -y, \qquad y(T) = y_T, \tag{5.2}$$
with xT and y , the respective xenon and iodine concentrations at the end of the allowable shutdown duration. For problem (a), the xenon maximum x, occurring in the post-shutdown period is a functional of the optimal shutdown program. This is by virtue of the fact that x, can be expressed in terms of the state at the end of the shutdown duration ( x T ,y T ) ,which state is a functional of the optimal shutdown program. x, is obtained by integrating (5.1) and (5.2), as in Section 3.2, and using the fact that i ( f , ) = O , where t, is the time of occurrence of the xenon maximum. In terms of the final state, the result is
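In the same way, for problem (b), the xenon concentration at a given time $T_0$ in the post-shutdown period ($T_0 \ge T$) is the linear functional†

$$x(T_0) = \Psi(x_T, y_T) = x_T \exp[-w(T_0 - T)] + \frac{\gamma_1 (w + r_0)}{1 - w}\, y_T \left\{ \exp[-w(T_0 - T)] - \exp[-(T_0 - T)] \right\}. \tag{5.4}$$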
Now problems (a) and (b) can be restated: determine the optimal shutdown flux $u^*$, with flux constraints $0 \le u^* \le M$ during the allowable shutdown duration $T$, with $u^* \equiv 0$ in the post-control phase, with the xenon override constraint $x_c$ discussed in Section 3.4, and with trajectories given by (3.12) and (3.13), which yields

(a)
$$\min_{0 \le u \le M} \max_{t \ge T} x = \min_{0 \le u \le M} \Phi(x_T, y_T), \qquad x(t \le T) \le x_c, \tag{5.5}$$

(b)
$$\min_{0 \le u \le M} x(T_0) = \min_{0 \le u \le M} \Psi(x_T, y_T), \qquad x(t \le T) \le x_c. \tag{5.6}$$

† It is to be noted that for high-equilibrium-flux reactors ($r_0 \gg w$), two terms of the binomial expansion of the bracketed term in the expression for $\Phi$ should suffice. Then $\Phi$ becomes a linear functional like $\Psi$, so that the optimal flux shutdown programs for problems (a) and (b) will be quite similar. Recall that (a) is a special case of (b).
As can be seen, both problems (a) and (b) are expressed as the minimization of the functional of the control variable, the shutdown flux.
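The two terminal functionals can be checked numerically. The following is a minimal sketch, not the author's code. It assumes that the post-shutdown equations are $\dot{x} = -wx + \gamma_1(w + r_0)y$ and $\dot{y} = -y$, that $w < 1$, and that the parameter values below are reasonable illustrations (they are not taken from the text). The function names `psi` and `phi` are hypothetical, and the closed form used for the xenon maximum is derived here from $\dot{x}(t_m) = 0$ rather than quoted.

```python
import math

# Illustrative values only (assumptions, not taken from the text).
W = 0.73       # xenon decay constant in units of the iodine decay constant (w < 1 assumed)
R0 = 10.0      # normalized equilibrium burnout parameter r0
GAMMA1 = 0.95  # coefficient gamma_1 of the iodine term


def psi(x_T, y_T, tau, w=W, r0=R0, g1=GAMMA1):
    """Xenon concentration a time tau = T0 - T after shutdown (u = 0)."""
    c = g1 * (w + r0)
    decay_x, decay_y = math.exp(-w * tau), math.exp(-tau)
    return x_T * decay_x + c * y_T * (decay_x - decay_y) / (1.0 - w)


def phi(x_T, y_T, w=W, r0=R0, g1=GAMMA1):
    """Post-shutdown xenon maximum, located from dx/dt = 0."""
    c = g1 * (w + r0)
    if c * y_T <= w * x_T:
        return x_T  # xenon is already non-increasing at shutdown
    # ratio equals exp(-(1 - w) * tau_m), with tau_m the time of the maximum
    ratio = w * ((1.0 - w) * x_T + c * y_T) / (c * y_T)
    return (c * y_T / w) * ratio ** (1.0 / (1.0 - w))


if __name__ == "__main__":
    print("peak xenon from equilibrium:", phi(1.0, 1.0))
    print("xenon one iodine mean life after shutdown:", psi(1.0, 1.0, 1.0))
```

Scanning `psi` over a fine grid of `tau` and comparing the largest value with `phi` is a simple consistency check on both expressions.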
5.2. Mathematical Restatement of Constraints

As mentioned in Section 3.4, constraints play a major role in the behavior of optimal control processes in general and xenon shutdown policies in particular. There are three principal constraints associated with reactor shutdown behavior when xenon is used: those on (1) flux magnitude, (2) inverse period sign and magnitude, and (3) xenon concentration magnitude.

(1) The flux magnitude $u$.

$$0 \le u \le M, \tag{5.7}$$
which implies that the lower bound on the flux is taken to be zero. However, in an operating reactor, there are many and diverse nuclear isotopic reactions taking place. A number of these result in end products that emit neutrons for a long time compared to the useful life of the reactor. Hence when the reactor is shut down after normal operation, these neutron emitters provide sufficient neutrons during any conceivably long shutdown period that the flux will not decrease to zero in the reactor. Also, in many systems, the original radioactiveisotope neutron source used to start the nascent reactor is simply left inside from that time forward. In sum, the asymptotic reactor flux never drops to zero following shutdown, but to a low value which is usually about to times its equilibrium value. The upper bound M is usually taken as rated operating power, so that in terms of the normalized flux, M = 1. However, it is often desired to make M > 1 if only for reasonably short periods of time. This property will be seen to be of advantage later, in the sense that over a portion of the optimal shutdown program it may be desired that
$M > 1$. Such an $M$ behavior may be preferred because it reduces the number of shutdowns and restarts, if the cost of an elaborate shutdown control procedure involving much alternate reactor shutdown and startup is felt to be too high.

(2) The “$\alpha$,” or inverse period, $\dot{\varphi}/\varphi$.

$$-\alpha_L \le \dot{\varphi}/\varphi \le \alpha_H. \tag{5.8}$$
The lower (negative) bound, $-\alpha_L$, is automatically present by virtue of the decrease in the reactor neutron population following normal shutdown. This is due to the mechanical inertia of the control-rod mechanisms and the phenomenon of delayed neutrons. Delayed neutrons are a concomitant of fission, whereby certain fission-product isotopes emit neutrons in seconds to minutes following their creation from fission. Such production of delayed neutrons is exploited in the control of reactors, since they can be made to comprise the additional neutron source by which the system is kept in exact equilibrium (critical). Since delayed neutrons act as a source with a time lag of approximately 10 seconds, their amount can be manipulated so that the reactor state changes in reasonable time with respect to present “state of the art” control systems [14]. If delayed neutrons were nonexistent, the time behavior of the reactor neutron population would be capricious indeed, since the corresponding system time constant would then be measured in fractions of a millisecond [14]. The reactor difficulties would become horrendous, owing to the atom-bomb-like quality of the consequent reactor temporal behavior. The positive upper bound, $\alpha_H$, is limited by reactor standard operating safety procedures, since for very large $\alpha_H$ the neutron population will increase exponentially at too rapid a rate. If this occurs for a supercritical ($k > 1$) reactor, the burgeoning neutron population would cause the reactor to far exceed its rated power, and an accident could result. Even if the shutdown alarm (reactor scram) system operated in a satisfactory manner, thus avoiding an accident, it is economically very costly to restart a reactor following an unscheduled shutdown. However, both $|\alpha_L|$ and $|\alpha_H|$ correspond to time scales measured in seconds, while the time behavior of the xenon and iodine concentrations is measured in hours.
Hence the approximation will be made in the ensuing solutions for optimal xenon shutdown policies that the reactor can be shut down and restarted instantaneously. To actually include the inverse period constraints would result in a modified problem formulation. This will be discussed further in Section 9.5. It is just this type of derivative constraint that would pose difficulties if optimal xenon shutdown were to be investigated using the classical calculus of variations, even using the Valentine artifice [15].

(3) The xenon override constraint $x_c$.

$$x \le x_c. \tag{5.9}$$
This is an inequality constraint on one of the state variables. When using the classical approach, the constraint $x_c$ is handled by the Valentine artifice, which results in an additional variational constraint equation. In the maximum principle formulation, inequality constraints on the state variables give rise to additional difficulties, since, for one thing, the Hamiltonian must be modified to include a Lagrange multiplier, thus introducing an additional variable in the problem. Further, at the bounds of the region of phase space in which $x = x_c$, the corresponding adjoint variable $p_1$ experiences a jump discontinuity [16]. As discussed in Section 3.4, the constraint $x \le x_c$ is most important, since if $x_c$ were very large, the xenon concentration could increase almost without bound, so that no optimal shutdown programs would be necessary. This is the situation in which there is sufficient fuel in the reactor, corresponding to enough positive reactivity, to override the negative xenon reactivity in the shutdown phase at all times. Then the reactor can be restarted at will following shutdown, irrespective of the amount of xenon in the system. As mentioned before, this amount of fuel inventory is prohibitively large, and therefore extremely costly, in the conventional thermal-reactor power plant. Therefore, the normal situation is that the system possesses sufficient fuel to override xenon only partially following shutdown. This situation can be ameliorated by using optimal shutdown programs, as will be developed in subsequent chapters.
5.3. Dynamic-Programming Functional Equation

The xenon minimax problem (a) will be considered first, and it will be seen that the xenon minimum problem (b) is described analogously (both problems were mathematically formulated in Section 5.1).

Let the allowable shutdown control duration $T$ be divided into $N$ intervals, each of length $\Delta = T/N$. As this class of problems is of the terminal control type discussed in Section 2.5, define a xenon criterion or cost functional as

$$F_N(x(0), y(0)) = \min_{0 \le u \le M} \Phi(x_T, y_T), \tag{5.10}$$
which implies that $N$ control decisions on the set $0 \le u \le M$ must be made to minimize the right side of Eq. (5.10) [17]. Even though $F_N$ is a functional of $u$, it is assumed that the optimal flux shutdown sequence $\{u_N^*\}$ is unique, so that $F_N$ must be a function of the initial state $(x(0), y(0))$ only [18]. To derive the functional equation, which is simultaneously a recurrence relation algorithm from which $F_k$, $k = 0, 1, 2, \ldots, N$, and $\{u_N^*\}$ are obtained, first consider the trivial case beginning at the termination of the control duration, where the system is in one of the set of final states $(x_T, y_T)$. In this state, the reactor is shut down, so that $u^* = 0$ from there on, and no further control decisions are made. Since the subscript on $F$ denotes the number of control decisions remaining in the shutdown phase, the corresponding relationship is that, for problem (a),

$$F_0(x_T, y_T) = \Phi(x_T, y_T). \tag{5.11}$$

Now proceeding backward along an extremal toward the initial state for one state removed from the termination of control (a one-stage process), the principle of optimality of dynamic programming asserts that

$$F_1(x_1, y_1) = \min_{0 \le u_1 \le M} F_0(x_0 + \dot{x}_0(u_1)\Delta,\; y_0 + \dot{y}_0(u_1)\Delta), \tag{5.12}$$

since $(x_0, y_0) = (x_T, y_T)$. That is, the optimal cost for a one-stage process, $F_1(x_1, y_1)$, is given by the minimum over $u$ of the terminal cost $F_0$ but one state removed from the terminal state, viz., $(x_1, y_1) = (x_0 + \dot{x}_0(u_1^*)\Delta,\; y_0 + \dot{y}_0(u_1^*)\Delta)$. Equation (5.12) can, of course, be rewritten

$$F_1(x_1, y_1) = \min_{0 \le u_1 \le M} \Phi(x_0 + \dot{x}_0(u_1)\Delta,\; y_0 + \dot{y}_0(u_1)\Delta). \tag{5.13}$$
$\dot{x}_0(u_1)$ and $\dot{y}_0(u_1)$ are obtained from the xenon and iodine state equations (3.12) and (3.13). By induction then, for $k$ stages remaining in the shutdown phase of duration $T$, one can write the following functional equation:

$$F_k(x_k, y_k) = \min_{0 \le u_k \le M} F_{k-1}(x_{k-1} + \dot{x}_{k-1}(u_k)\Delta,\; y_{k-1} + \dot{y}_{k-1}(u_k)\Delta) \qquad (k = 1, 2, \ldots, N). \tag{5.14}$$
That is, in the $k$th stage, $F_k$ is obtained by choosing $0 \le u_k \le M$ so as to minimize $F_{k-1}$ one state removed from $(x_{k-1}, y_{k-1})$, thus obtaining an increment of the extremal arc lying between $(x_k, y_k)$ and $(x_{k-1}, y_{k-1})$. By repeating this procedure (i.e., $k = 1, 2, \ldots$), one proceeds along an extremal in the $(x, y)$ phase space to obtain ultimately $F_N$ and, more important, the optimal flux shutdown sequence $\{u_N^*\}$. The same procedure holds for problem (b), except for a different terminal control cost functional, which is $F_0(x_0, y_0) = \Psi(x_T, y_T)$. The xenon override constraint $x \le x_c$ is implied in the above and does not detract from the generality of the derivational argument thus far. More will be said about the xenon constraint later, where it will be seen to constrain a portion of the extremal arc, thereby rendering a modified xenon shutdown program. The important point should be made that along an extremal in the phase space, $F_N = F_{N-1} = \cdots = F_0$, so that $F_k(u_k^*)$, $k = 0, 1, 2, \ldots, N$, is a constant of the motion. It is easily shown that along an extremal in the phase space, $F = -H$, where $H$ is the Hamiltonian of the maximum principle. This is one manifestation of the equivalence between the optimality principle of dynamic programming and the maximum principle [19]. This equivalence will be discussed in detail in Chapter 9.
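A worked one-stage instance may help fix ideas. The following is a sketch under stated assumptions, not a result quoted from the text. It takes problem (b) with $T_0 = T$, so that the terminal cost is simply $F_0(x, y) = x$. It assumes state equations of the form $\dot{x} = -(w + r_0 u)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u$ and $\dot{y} = u - y$. It also reads the recurrence as $F_k(x, y) = \min_u F_{k-1}(x + \dot{x}\Delta,\; y + \dot{y}\Delta)$, with $(x, y)$ the state before the step. With one decision remaining,

$$F_1(x, y) = \min_{0 \le u \le M} \left\{ x + \Delta\left[-(w + r_0 u)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u\right] \right\}$$

$$\phantom{F_1(x, y)} = x + \Delta\left[\gamma_1(w + r_0)y - wx\right] + \Delta \min_{0 \le u \le M} u\left[\gamma_2(w + r_0) - r_0 x\right].$$

The bracket multiplying $u$ does not depend on $u$, so the minimum lies at an end point of the control set: $u_1^* = M$ when $x > \gamma_2(w + r_0)/r_0$, and $u_1^* = 0$ when $x < \gamma_2(w + r_0)/r_0$. Even a single stage therefore shows the on-off character of the optimal flux.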
5.4. Derivation of Bellman’s Equation

The continuous analogue of the functional recurrence relation (5.14) is obtained by first elevating the time-like stage index $k$ into the argument of the criterion functional $F_k$, so that for a time increment $\Delta$,

$$F(x, y, T) = \min_{0 \le u \le M} F(x + \dot{x}(u)\Delta,\; y + \dot{y}(u)\Delta,\; T - \Delta), \tag{5.15}$$

with terminal control, or boundary, surfaces $F(x, y, 0) = \Phi(x, y)$ or $F(x, y, 0) = \Psi(x, y)$ for problems (a) and (b), respectively. Again, (5.15) enunciates the optimality principle of dynamic programming in that the criterion functional $F(x, y, T)$, with allowable control duration $T$ and system state $(x, y)$, is given by the minimum over the possible control set $0 \le u \le M$ of the criterion functionals $F(x + \dot{x}(u)\Delta,\; y + \dot{y}(u)\Delta,\; T - \Delta)$ embedded in a set of control processes beginning in neighboring states $(x + \dot{x}(u)\Delta,\; y + \dot{y}(u)\Delta)$ but of allowable control duration $T - \Delta$. $\dot{x}(u)$ and $\dot{y}(u)$ are, of course, obtained from (3.12) and (3.13). Now expand $F$ on the right side of (5.15) in a Taylor series about $(x, y, T)$ through linear terms explicitly, so that

$$F(x, y, T) = \min_{0 \le u \le M} \left[ F(x, y, T) + \dot{x}\Delta \frac{\partial F}{\partial x} + \dot{y}\Delta \frac{\partial F}{\partial y} - \Delta \frac{\partial F}{\partial T} + O(\Delta^2) + \cdots \right]. \tag{5.16}$$

$F(x, y, T)$ is canceled on both sides, since it does not depend on $u$ explicitly. The result is then divided by $\Delta$, the limit $\Delta \to 0$ is taken, and (3.12) and (3.13) are substituted, which yields finally

$$\frac{\partial F}{\partial T} = \min_{0 \le u \le M} \left\{ \left[-(w + r_0 u)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u\right] \frac{\partial F}{\partial x} + (u - y)\frac{\partial F}{\partial y} \right\}, \tag{5.17}$$

which is Bellman’s differential equation. The boundary curves are

$$F(x, y, 0) = \Phi(x, y) \quad \text{or} \quad F(x, y, 0) = \Psi(x, y), \tag{5.18}$$
as mentioned above, for problems (a) and (b), respectively. The novel aspect of Bellman’s equation is due to the minimum operator, which, if removed, would result in (5.17) becoming the garden variety of first-order partial differential equation. Such an equation would then be meaningless in our context. If the optimal flux shutdown program u* is known and substituted in (5.17), the result is the Hamilton-Jacobi equation for this optimal control process. Bellman’s equation, as is the case for most substantive partial differential equations, is in general difficult to solve analytically except
for illustrative examples, so that resort must be had to large-scale high-speed digital computers to obtain solutions by numerical integration. In principle, the right side of (5.17) must first be minimized with respect to $u$, and then solutions to the resulting equation obtained [20]. For numerical computations the discrete analogue (5.14) is used instead, where the minimization and the solutions are generated together in a stepwise manner, as will be discussed later. As in the maximum principle formulation of the xenon shutdown problem to be given later, the discontinuous, or bang-bang, nature of the optimal shutdown control $u^*$ is immediately evident. This is seen by merely rewriting (5.17) as

$$\frac{\partial F}{\partial T} = \left[\gamma_1(w + r_0)y - wx\right]\frac{\partial F}{\partial x} - y\frac{\partial F}{\partial y} + \min_{0 \le u \le M} u\,r, \tag{5.19}$$

where

$$r = \left[\gamma_2(w + r_0) - r_0 x\right]\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}. \tag{5.20}$$

It is seen that the minimization in (5.19) calls for an optimal $u^*$ given by

$$u^* = \begin{cases} 0, & r > 0, \\ M, & r < 0. \end{cases} \tag{5.21}$$
That is, if $r$ is positive, the optimal flux $u^*$ immediately drops to zero. For $r$ negative, $u^*$ immediately jumps to its maximum value $M$. Therefore, the optimal flux shutdown program consists of pulses, switching back and forth between 0 and $M$, where $r$ changes sign. If $r = 0$ for a finite interval in the allowable control duration $T$, then $u^*$ is arbitrary. However, $r = 0$ above only at isolated points. Arbitrary $u^*$ is termed singular control, which is largely an unexplored area in modern optimal control theory. Where are the points at which $r = 0$? They are the points at which the control is switched from zero to $M$, or vice versa, and are found by solving Bellman’s equation to yield an expression for $r$. As mentioned, the discrete analogue (5.14) is used for numerical solution.
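The switching rule can be written out as a small sketch. This is an illustration under assumptions, not code from the text. It assumes the switching function of Eq. (5.20) has the form $r = [\gamma_2(w + r_0) - r_0 x]\,\partial F/\partial x + \partial F/\partial y$. The parameter values and the function names `switching_function` and `bang_bang_flux` are hypothetical, and the singular case $r = 0$ is resolved arbitrarily as $u^* = 0$.

```python
# Illustrative values only (assumptions, not taken from the text).
W, R0, GAMMA2 = 0.73, 10.0, 0.05


def switching_function(x, dF_dx, dF_dy, w=W, r0=R0, g2=GAMMA2):
    """Coefficient r of the control u in Bellman's equation (assumed form of Eq. 5.20)."""
    return (g2 * (w + r0) - r0 * x) * dF_dx + dF_dy


def bang_bang_flux(r, M=1.0):
    """Minimizer of u * r over 0 <= u <= M; r = 0 (singular case) is mapped to 0 here."""
    return M if r < 0.0 else 0.0


if __name__ == "__main__":
    r = switching_function(x=1.0, dF_dx=1.0, dF_dy=0.0)
    print("r =", r, "-> u* =", bang_bang_flux(r))
```

The gradients of $F$ are inputs here because they are known only once Bellman’s equation (or its discrete analogue) has been solved.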
5.5. Bang-Bang Control Dilemma

For bang-bang control processes in general, Bellman’s equation poses an interesting dilemma. Referring to (5.19) to (5.21), $r$ must be known to determine the optimal $u^*$. However, $r$ cannot be known until Bellman’s equation is solved, which in turn cannot be accomplished until the correct $u^*$ is inserted therein. To break out of this difficulty, additional information about the nature of the optimal process must be known and applied to “start the solution off.” Alternatively, if the discrete analogue is used, as it must be in a numerical computation, this difficulty is not encountered, since the optimal control $u^*$ and the optimal cost criterion functional are determined stepwise together. No such difficulty occurs when $u$ appears in a nonlinear fashion in Bellman’s equation, since $u^*$ can ordinarily be determined by differentiation and then eliminated by substitution back into the latter equation. As an interesting illustration of the former method for resolving this dilemma, consider the following reinvestment control problem [21].
[Figure: curves labeled “Optimal return on investment,” “Optimal profit production,” and “Optimal reinvestment policy.”]

FIG. 5.1. Optimal reinvestment policy, optimal return on investment, and optimal profit production versus plant age.
Let $x$ be the rate of profit produced by a manufacturer, $u$ the rate of money reinvested (reinvestment policy) in the manufacturing plant, and $x - u$ the rate of dividend production. It is desired to find the reinvestment policy that maximizes the total dividend produced at the end of time $T$. Assume a growth industry so that

$$\dot{x} = a u \quad (a > 0), \qquad x(0) = c. \tag{5.22}$$

Define the criterion functional $f(c, T)$ as the optimal return on the investment in that

$$f(c, T) = \max_{0 \le u \le x} \int_0^T (x - u)\, dt. \tag{5.23}$$
Then, as in Section 5.4, the corresponding Bellman equation is easily derived as

$$\frac{\partial f}{\partial T} = \max_{0 \le u \le c} \left[ c - u + a u \frac{\partial f}{\partial c} \right], \tag{5.24}$$

or

$$\frac{\partial f}{\partial T} = c + \max_{0 \le u \le c} u \left[ a \frac{\partial f}{\partial c} - 1 \right], \tag{5.25}$$

and here is where the dilemma appears. The optimal reinvestment policy $u^*$ cannot be known (except that it is obviously partially bang-bang) until the sign of $[a(\partial f/\partial c) - 1]$ is known. The latter cannot be found until $u^*$ is inserted in (5.25) to determine the optimal return $f$ and thereby $\partial f/\partial c$. However, if the following intuition is brought to bear, the difficulty can be resolved. Consider initially the situation for small $T$. Since $f(c, 0) = 0$, $\partial f/\partial c$ is also near zero for small $T$, as the rate of optimal return with respect to profit production is assumed small for a young company. Then $|a(\partial f/\partial c)| \ll 1$, so that $[a(\partial f/\partial c) - 1] < 0$. Therefore, from (5.24) or (5.25), $u$ should be made as small as possible for small $T$, so that in this region $u^* = 0$, and from (5.25) $\partial f/\partial T = c$, hence $f = cT$. This implies that

$$\left[ a \frac{\partial f}{\partial c} - 1 \right] = (aT - 1) < 0, \qquad T < 1/a, \tag{5.26}$$

until $T = 1/a$, where $f(c, 1/a) = c/a$. On the other hand, for $T > 1/a$,

$$\left[ a \frac{\partial f}{\partial c} - 1 \right] > 0, \tag{5.27}$$
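which from (5.25) dictates that $u^* = c$. This results in [also from (5.25)], with $u^* = c$,

$$\frac{\partial f}{\partial T} = a c \frac{\partial f}{\partial c}, \qquad f(c, 1/a) = \frac{c}{a}. \tag{5.28}$$

The solution is simply

$$f(c, T) = \frac{c}{a}\, e^{aT - 1}, \qquad T \ge \frac{1}{a}. \tag{5.29}$$

Therefore, the optimal rate of reinvestment control is

$$u^* = \begin{cases} 0, & T < \dfrac{1}{a}, \\[4pt] c, & T \ge \dfrac{1}{a}. \end{cases} \tag{5.30}$$

Then the rate of profit making is given by

$$\dot{x} = \begin{cases} a x, & T \ge \dfrac{1}{a}, \\[4pt] 0, & T < \dfrac{1}{a}. \end{cases} \tag{5.31}$$

Since $T$ is measured backward with respect to $t$, $dT = -dt$, and

$$x = \begin{cases} C e^{-a(T - 1/a)}, & T \ge \dfrac{1}{a}, \\[4pt] C, & T < \dfrac{1}{a}, \end{cases} \tag{5.32}$$

where $C$ is the constant rate of profit reached at $T = 1/a$.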
All of the above is depicted in Fig. 5.1.
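The reinvestment result lends itself to a quick numerical check. The sketch below is illustrative only, and its function names are hypothetical. It assumes that the optimal return is $f = cT$ for $T < 1/a$ and that, for $T \ge 1/a$, it is the solution of $\partial f/\partial T = ac\,\partial f/\partial c$ with $f(c, 1/a) = c/a$, namely $(c/a)e^{aT - 1}$. It compares this with the family of one-switch policies that reinvest all profit until a time `t_switch` and pay out everything afterwards.

```python
import math


def optimal_return(c, T, a):
    """Assumed closed form of f(c, T): c*T before the switch, exponential growth beyond it."""
    if T < 1.0 / a:
        return c * T
    return (c / a) * math.exp(a * T - 1.0)


def one_switch_return(c, T, a, t_switch):
    """Total dividend when all profit is reinvested until t_switch and none afterwards."""
    profit_rate = c * math.exp(a * t_switch)  # x grows as dx/dt = a*x while u = x
    return profit_rate * (T - t_switch)       # constant dividend rate for the remaining time


if __name__ == "__main__":
    c, T, a = 1.0, 5.0, 0.5
    candidates = [0.01 * k for k in range(501)]
    best = max(candidates, key=lambda t: one_switch_return(c, T, a, t))
    print("best switch time:", best, "expected:", T - 1.0 / a)
    print("return:", one_switch_return(c, T, a, best), "closed form:", optimal_return(c, T, a))
```

The best switch time found by the scan should fall at $T - 1/a$ in forward time, i.e., at a time-to-go of $1/a$, which is the switch point of the bang-bang policy.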
5.6. Dynamic-Programming versus Maximum-Principle Optimal Shutdown Solutions

As can be realized at this point, the optimal xenon shutdown control process can be formulated using the maximum principle as well as the principle of optimality of dynamic programming. In the following, problems (a) and (b) will be formulated using the maximum principle, and the manner of obtaining solutions will be described. Then a comparison will be made between the two methods in terms of the efficacy of deriving optimal shutdown programs. To incorporate the control criterion functionals $\Phi$ and $\Psi$ corresponding to the xenon minimax problem (a) and the xenon minimum problem (b), respectively, define an additional variable, after Rosonoer [19]:

$$\text{for problem (a),} \quad z = \Phi(x, y); \qquad \text{for problem (b),} \quad z = \Psi(x, y). \tag{5.33}$$

For the minimax problem (a), the equations of state are now

$$\dot{x} = -(w + r_0 u)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u, \qquad u(0) = x(0) = 1, \tag{5.34a}$$

$$\dot{y} = u - y, \qquad y(0) = 1, \tag{5.34b}$$

$$\dot{z} = \frac{\partial \Phi}{\partial x}\dot{x} + \frac{\partial \Phi}{\partial y}\dot{y}, \qquad z(0) = 0. \tag{5.34c}$$

The state or phase space is now three-dimensional, so that the adjoint vector is $p = (p_1, p_2, p_3)$, and $p_3 = -1$ is assumed as usual. The Hamiltonian follows simply from the maximum-principle schema as outlined in Section 4.1.
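$$H = \left[p_1 - \frac{\partial \Phi}{\partial x}\right]\left[-(w + r_0 u)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u\right] + \left[p_2 - \frac{\partial \Phi}{\partial y}\right][u - y]. \tag{5.35}$$

The adjoint system of equations follows from those given in Eqs. (4.6):

$$\dot{p}_1 = \left(p_1 - \frac{\partial \Phi}{\partial x}\right)(w + r_0 u) + \frac{\partial^2 \Phi}{\partial x^2} f_1 + \frac{\partial^2 \Phi}{\partial x\, \partial y} f_2, \qquad p_1(T) = 0, \tag{5.36a}$$

$$\dot{p}_2 = -\left(p_1 - \frac{\partial \Phi}{\partial x}\right)\gamma_1(w + r_0) + \left(p_2 - \frac{\partial \Phi}{\partial y}\right) + \frac{\partial^2 \Phi}{\partial x\, \partial y} f_1 + \frac{\partial^2 \Phi}{\partial y^2} f_2, \qquad p_2(T) = 0, \tag{5.36b}$$

$$\dot{p}_3 = 0, \qquad p_3(T) = -1, \tag{5.36c}$$

where $f_1$ and $f_2$ are the right sides of (5.34a) and (5.34b), respectively.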
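For the linear functional of problem (b), viz.,

$$\Psi(x, y) = A(T, T_0)\,x + B(T, T_0)\,y, \tag{5.37}$$

where $A(T, T_0)$ and $B(T, T_0)$ are parameters which can be identified by inspection of (5.4), the corresponding adjoint equations are

$$\dot{p}_1 = (p_1 - A)(w + r_0 u), \qquad p_1(T) = 0, \tag{5.38a}$$

$$\dot{p}_2 = -(p_1 - A)\gamma_1(w + r_0) + (p_2 - B), \qquad p_2(T) = 0, \tag{5.38b}$$

$$\dot{p}_3 = 0, \qquad p_3(T) = -1. \tag{5.38c}$$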
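As in the dynamic-programming formulation, the bang-bang nature of the optimal shutdown control program is immediately evident by rewriting the Hamiltonian for problem (a) as

$$H_a = u\,r_a + y\left[\gamma_1(w + r_0)\left(p_1 - \frac{\partial \Phi}{\partial x}\right) - \left(p_2 - \frac{\partial \Phi}{\partial y}\right)\right] - wx\left(p_1 - \frac{\partial \Phi}{\partial x}\right), \tag{5.39}$$

where

$$r_a = \left(p_1 - \frac{\partial \Phi}{\partial x}\right)\left(\gamma_2(w + r_0) - r_0 x\right) + \left(p_2 - \frac{\partial \Phi}{\partial y}\right), \tag{5.40}$$

while for problem (b),

$$H_b = u\,r_b + y\left[\gamma_1(w + r_0)(p_1 - A) - (p_2 - B)\right] - wx(p_1 - A), \tag{5.41}$$

where

$$r_b = (p_1 - A)\left(\gamma_2(w + r_0) - r_0 x\right) + (p_2 - B). \tag{5.42}$$

By inspection, it is seen that to maximize $H_a$ or $H_b$,

$$u^* = \begin{cases} M, & r > 0, \\ 0, & r < 0. \end{cases} \tag{5.43}$$

Again, $r$ is known only when the two-point boundary-value problem which comprises (5.34) and (5.36) or (5.38) is solved. Knowledge of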
$r$ will provide the points in the control phase where $u^*$ switches to $M$, and vice versa, so that the number, widths, and spacings of the flux pulses will be obtained. To compare the relative efficacy of using the maximum principle or the optimality principle of dynamic programming for obtaining xenon shutdown programs, consider the specific optimal control process for problem (b) when $T = T_0$ and $0 \le u^* \le 1$. That is, it is desired to find the optimal xenon shutdown flux program, with constraints of zero and rated power, that minimizes the xenon concentration level immediately after the shutdown phase is completed. Then from (5.4), with $T = T_0$, it follows that $A(T, T_0) = 1$, $B(T, T_0) = 0$, and the cost functional is simply $\Psi = x$. The adjoint system of equations is then, from (5.38),

$$\dot{p}_1 = (p_1 - 1)(w + r_0 u), \qquad p_1(T) = 0, \tag{5.44a}$$

$$\dot{p}_2 = -(p_1 - 1)\gamma_1(w + r_0) + p_2, \qquad p_2(T) = 0, \tag{5.44b}$$

$$\dot{p}_3 = 0, \qquad p_3(T) = -1, \tag{5.44c}$$

and from the Hamiltonian in (5.41), the switching times for $u^*$ occur at the zeros of

$$r_b = \left(\gamma_2(w + r_0) - r_0 x\right)(p_1 - 1) + p_2. \tag{5.45}$$
To obtain $r_b$ as a function of time, in order to compute the switching times explicitly, $x$, $p_1$, and $p_2$ must be obtained. Hence, the two-point boundary-value problem corresponding to the system of equations (5.34) and (5.44) must be solved. Analytical solution is out of the question, unless drastic approximations are made which would distort the system model (cf. the example in Section 4.5). This system, which is made up of the six equations (5.34) and (5.44), would be numerically integrated starting from the end of the shutdown duration ($t = T = T_0$). This is because more is known about the system behavior at $t = T$, so that fewer “initial conditions” must be chosen a priori. For example, physical intuition tells us that $t = T$ must be the last switching time, since from then on the reactor is shut down, so that $u^*(T) = 0$ and hence $r_b(T) = 0$. The final state $(x_T, y_T)$, i.e., the state at the end of the shutdown phase, must be guessed to start off the computation. Then the equations of state and their adjoint equations are integrated backward in an attempt to reach the initial state $x = y = u = 1$, by trying
various final states $(x_T, y_T)$. $u^*$ is then switched from zero to $M$ and vice versa, as the zeros of $r_b$ occur in the computation. Another approximate method of solution is to assume a given number of switching times, guided in part by certain switching theorems [22]. With the switching times as parameters, it can be easily shown that the solutions to the system of differential equations (5.34) and (5.44) will reduce to a system of algebraic equations in the switching times which can be determined using the ordinary methods of calculus to minimize the criterion functional [23]. A third approximation method for either problem (a) or (b) is to recalculate $\Phi$ and/or $\Psi$ assuming a piecewise-constant $u^*$ (successively 0 and $M$) for an a priori chosen number of switching times. As the criterion functional is now also a function of the switching times, their number and magnitude can be varied until a minimum is realized [23]. This is essentially the Rayleigh-Ritz procedure of the variational calculus, used often for such problems in classical and quantum physics. Of course, the above formulation has not yet taken into account the all-important xenon override constraint $x \le x_c$. It has been implied above that either (1) there is always a sufficient excess amount of fuel for xenon override, or (2) the xenon override constraint will be accommodated by an extra train of $u^*$ pulses to “burn out” the xenon as one proceeds along the resulting suboptimal extremal in the xenon-iodine phase space. How many pulses per train are required is, a priori, an open question. Their number and duration could be determined empirically by experiments on an actual operating reactor, but this is an extremely cumbersome undertaking.
Another method to incorporate the xenon override constraint is to modify the cost functional by the addition of an artificial cost term of the form $C(x/x_c)^{\alpha}$, where $C$ and $\alpha$ are sufficiently large, such that when $x$ approaches $x_c$ from below, or exceeds $x_c$, the cost becomes tremendous. This “brute force” method will be reflected in the $u^*$ behavior, which will divert the extremal from the $x = x_c$ boundary line in the phase space. However, it will be seen that an equivalent optimal shutdown program includes the $x = x_c$ boundary line as part of the corresponding extremal. This will be discussed later. To take the xenon override constraint into account from the formal point of view, the maximum principle must be modified to include a
variable Lagrange multiplier. The adjoint equations are also modified in that a jump discontinuity must occur in the $p_i$ when the extremal intersects the $x = x_c$ boundary line. The details have been discussed by Pontryagin et al. [22]. Suffice it to say that the xenon override constraint multiplies the foregoing difficulties, especially when the above approximation methods are used. The maximum principle approach is left at this point, and the dynamic programming method of solution is resumed for comparison. The dynamic programming computational algorithm (5.14) is rewritten here as

$$F_k(x_k, y_k) = \min_{0 \le u_k \le M} F_{k-1}(x_{k-1} + \dot{x}_{k-1}(u_k)\Delta,\; y_{k-1} + \dot{y}_{k-1}(u_k)\Delta) \qquad (k = 1, 2, \ldots, N), \tag{5.46}$$

with “initial” conditions

$$F_0(x_0, y_0) = \Phi(x_0, y_0) \quad \text{or} \quad F_0(x_0, y_0) = \Psi(x_0, y_0), \tag{5.47}$$

since $(x_0, y_0) = (x_T, y_T)$ and $u_T^* = 0$, with constraints

$$0 \le u \le M, \qquad x \le x_c. \tag{5.48}$$
Since all the boundary information is given at the final state through the terminal cost functional, no two-point boundary-value difficulties are encountered. The computation is begun by forming a two-dimensional matrix of $(x_0, y_0)$ and numerically computing $\Phi$ or $\Psi$, as the case may be, which gives $F_0(x_0, y_0)$ in the form of a stored table. Using (5.46) and (5.48) for $k = 1$, $F_1$ and $u_1^*$ are also computed and stored. This is repeated for $k = 2, 3, \ldots, N$, which is discussed in more detail in Chapter 6. The xenon override constraint actually helps matters in that the computer search for $\min_u F_k$ is limited by the $x \le x_c$ boundary, implying that less search space, i.e., fewer $F_k(x_k, y_k)$ grid points, need be probed. This speeds the computation and reduces the necessary computer fast-access (core) memory requirements. Here, as in the maximum-principle formulation, the xenon override constraint can also be taken into account by redefining the criterion functional through the addition of a $C(x/x_c)^{\alpha}$ term. In sum, it seems that the dynamic programming formulation, at least for two-dimensional control processes of the above type and
complexity, is the more straightforward approach, especially from the point of view of using high-speed digital computation. However, for multidimensional problems, corresponding to higher-order control processes, both methods begin to lose their practical effectiveness. For example, the dynamic-programming algorithm for the corresponding $F_k(x_k, y_k, z_k, w_k, \ldots)$ demands huge amounts of computer memory storage. As will be seen in Chapter 6, each $F_k$ table comprises a 30 x 30 matrix, or 900 numbers. For $N = 20$, 30 x 30 x 20 = 18,000 fast-access memory words are required for storage. The present modern-computer fast-memory storage capability is adequate. But a four-dimensional system needs four-dimensional matrices, so that 30 x 30 x 30 x 30 x 20 = 16,200,000 fast-memory words would be required.† Computers with core memory sizes of this order of magnitude are just appearing on the commercial computer market horizon. Actually there are other methods under present investigation which give promise of partially circumventing this storage difficulty. These have to do with incorporating Lagrange multipliers into the functional equation to reduce its dimension, as well as approximating the $F_k$ by sets of orthogonal polynomials [24]. $F_k$ is then represented only by the stored polynomial coefficients, and it is generated as needed, thus greatly conserving core memory space. This multidimensional difficulty plagues the maximum-principle method as well. A four-dimensional system formulation using the maximum principle consists of 10 simultaneous differential equations, which would constitute a formidable two-point boundary-value problem. As can be appreciated, the treatment of high-order optimal control processes is difficult using either the maximum-principle or the dynamic-programming method. Finding satisfactory methods for handling such systems is still an open question in modern optimal control theory.

† Storage on magnetic tape, disk files, or other slower-access memory systems begins to increase the computer running time prohibitively.
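The storage arithmetic quoted above is easy to reproduce. The short sketch below is illustrative. The function name is hypothetical, and the count follows the text in taking one word per grid point and one table per stage.

```python
def fk_storage_words(points_per_axis, state_dims, stages):
    """Words needed to hold every F_k table at once (one word per grid point per stage)."""
    return points_per_axis ** state_dims * stages


CORE_WORDS = 16_384  # fast-memory size quoted in Section 6.1 for the TRANSAC 2000

if __name__ == "__main__":
    for dims in (2, 3, 4):
        words = fk_storage_words(30, dims, 20)
        print(dims, "state variables:", words, "words,", round(words / CORE_WORDS, 1), "x core")
```

Each added state variable multiplies the requirement by the number of grid points per axis, which is the growth the text describes. With the 16,384-word core quoted in Section 6.1, even the two-dimensional case does not fit entirely in core, which is consistent with the $F_k$ tables being kept on magnetic tape.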
CHAPTER 6
Computational Aspects
6.1. Introduction and Calculation of $F_k$ Tables

This chapter comprises a detailed description of the dynamic-programming computational algorithm solutions for obtaining optimal xenon shutdown programs. The digital computer code for the solution of problems (a) and (b), called DYNPROG, is written in the ALTAC III (FORTRAN II) algebraic programming language. A detailed listing of the computer instructions in the above language, a glossary of symbols used, and a sample printout are given in the ensuing sections. This computer program (code) contains approximately 300 ALTAC III instructions. For a twenty-stage ($N = 20$) problem, with the 20 $F_k$ tables, each consisting of two-dimensional 30 x 30 matrices, and $u^* = 0$ or $M$, the average computer running time per problem is about 5.5 minutes on the Philco TRANSAC 2000, Model 210, or about 2 minutes on the Model 212. The fast-access memory of these computers contains 16,384 48-bit words. Because of the relative paucity of fast memory storage with respect to this class of problems, using this class of computers, the $F_k$ tables are successively stored on magnetic tape after computation. To compute the $F_k$ table, the $F_{k-1}$ table must be searched to find the optimal $F_{k-1}$ and $u_k^*$ values, as will be explained later. This necessitates recalling the $F_{k-1}$ table from the magnetic tape back into the fast (core) memory, an operation which consumes a relatively large amount of computer time. The point is that most of the computer running time is consumed by reading from and writing on the $F_k$ table magnetic tape.
If the core memory is large enough to accommodate all $F_k$ tables, $k = 0, 1, 2, \ldots, N$, simultaneously, the problem running time can be reduced by at least an order of magnitude. However, if the core memory is truly large, e.g., of the order of 250,000 words or more, higher core-memory priority should be given to expanding the $F_k$ tables for greater accuracy. Double interpolation, used presently in DYNPROG, is time-consuming and less accurate than a finer $F_k$ table grid mesh would be [18]. Using (5.46), (5.47), and (5.48), which comprise the dynamic-programming computational algorithm, the computation proceeds “backward” in time through each stage of the multistage control decision process by which the xenon optimal shutdown problem is construed. Assume for illustrative purposes that the xenon override constraint is initially not invoked. It will be incorporated easily later in the discussion. The computation begins at the possible termination set of states $(x_T, y_T) = (x_0, y_0)$. With zero stages remaining in the control phase, a set of $(x_0, y_0)$ values is chosen, where $0 \le x_0 \le x_{\max}$ and $0 \le y_0 \le y_{\max}$, and the $F_0$ table is computed from

$$F_0(x_0, y_0) = \Phi(x_0, y_0) \quad \text{or} \quad F_0(x_0, y_0) = \Psi(x_0, y_0). \tag{6.1}$$
The $F_0$ table is shown in Table 6.1. Using problem (a) to illustrate,
$$F_1(x_1, y_1) = \min_{0 \le u_1 \le M} \Phi\bigl(x_0 + \dot{x}_0(u_1)\Delta,\; y_0 + \dot{y}_0(u_1)\Delta\bigr), \tag{6.2}$$
so that the $F_1$ table is manufactured by first choosing a particular pair of $(x_0, y_0)$ values. For arbitrary $u_1$, the corresponding transformed state is $x_1 = x_0 + \dot{x}_0(u_1)\Delta$ and $y_1 = y_0 + \dot{y}_0(u_1)\Delta$, where $\dot{x}_0(u_1)$ and $\dot{y}_0(u_1)$ are obtained from the xenon-iodine state equations (3.12) and (3.13), respectively. For given values† of $u_1$, such that $0 \le u_1 \le M$, and corresponding $(x_1, y_1)$, a set $F_1(x_1, y_1)$ is computed, using double interpolation from the $F_0$ table.

† Since the optimal control is bang-bang, as shown earlier, $u_1 = 0, M$ are the only allowed $u$ values from which to choose in the minimum $F_k$ search computation. The DYNPROG code has the incremental-u-choice option available for future work, when the optimal control may not be bang-bang. Exercising this option even for bang-bang control serves as a good code check to see whether the discontinuous nature of the control is actually evidenced.

TABLE 6.1. F_0(x_T, y_T)

y_T \ x_T   x_1^0      x_2^0      x_3^0      x_4^0      x_5^0      x_6^0
y_1^0       F_0^11     F_0^12     F_0^13     F_0^14     F_0^15     F_0^16
            u*_11=0    u*_12=0    u*_13=0    u*_14=0    u*_15=0    u*_16=0
y_2^0       F_0^21     F_0^22     F_0^23     F_0^24     F_0^25     F_0^26
            u*_21=0    u*_22=0    u*_23=0    u*_24=0    u*_25=0    u*_26=0
y_3^0       F_0^31     F_0^32     F_0^33     F_0^34     F_0^35     F_0^36
            u*_31=0    u*_32=0    u*_33=0    u*_34=0    u*_35=0    u*_36=0

Then the minimum $F_0$ is chosen to give
$$F_1(x_1, y_1) = \min_{0 \le u_1 \le M} F_0\bigl(x_0 + \dot{x}_0(u_1)\Delta,\; y_0 + \dot{y}_0(u_1)\Delta\bigr) = F_0(u_1^*). \tag{6.3}$$
The $(x_1^*, y_1^*)$ are the locations in the $F_1$ table where $F_1(x_1^*, y_1^*)$ and $u_1^*$ obtained above are written. The $F_1$ table is shown in Table 6.2. The $F_k$ tables, $k = 2, 3, \ldots, N$, are successively computed in the same way and stored contiguously on magnetic tape, so that the $F_{k-1}$ table can be read into the core memory to be searched to compute the $F_k$ table.
After all the $F_0, F_1, \ldots, F_N$ tables have been computed and stored on tape, the extremal $(x_k, y_k)$, $k = 0, 1, \ldots, N$, in the phase space, and also the optimal flux shutdown sequence $\{u_N^*\}$, are obtained by numerically integrating the equations of state (3.12) and (3.13). This is done in the forward direction, starting from the equilibrium conditions $x = y = u = 1$. The code obtains $u_N^*$, since $N$ control decisions remain at the outset, from the entry of the $F_N$ table corresponding to $x_N = y_N = 1$. Using $u_N^*$, the state equations are integrated over a time $\Delta = T/N$ to obtain $(x_{N-1}, y_{N-1})$. If $\Delta$ is small enough, $x_{N-1} = x_N + \dot{x}_N(u_N^*)\Delta$ and $y_{N-1} = y_N + \dot{y}_N(u_N^*)\Delta$. The $F_{N-1}$ table is now entered at $(x_{N-1}, y_{N-1})$, again using double interpolation, to obtain the corresponding $u_{N-1}^*$.
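The backward table construction and the forward sweep described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a transcription of DYNPROG: the function names (`step`, `phi`, `interp`, `solve`) are hypothetical, the one-interval state update and the terminal cost for problem (a) are taken from the formulas in the instruction listing of Section 6.A.3, all tables are held in memory rather than on tape, coordinates are clamped to the grid instead of being extrapolated, and the forward sweep recomputes the minimizing control instead of reading a stored u* table. The default r0 = 2 is an arbitrary illustrative value.

```python
import math

# Constants as they appear in the DYNPROG listing (w = 21/29, gamma_1 = 56/59).
W = 21.0 / 29.0
GAM1 = 56.0 / 59.0

def step(x, y, u, r0, delta):
    """Advance (x, y) over one interval of length delta at constant flux u."""
    ru = W + r0 * u
    u1 = GAM1 * (W + r0) * (y - u) / (ru - 1.0)
    u2 = (W + r0) * u / ru
    e1 = math.exp(-delta)
    x_new = (x - u1 - u2) * math.exp(-delta * ru) + u1 * e1 + u2
    y_new = u + (y - u) * e1
    return x_new, y_new

def phi(x, y, r0):
    """Terminal cost for problem (a), in the form A*y*(1 + B*x/y)**C."""
    if y <= 0.0:
        return 1.0e11  # large sentinel, as in the listing
    g = GAM1 * (W + r0)
    a = W ** (W / (1.0 - W)) * g
    b = (1.0 - W) / g
    return a * y * (1.0 + b * x / y) ** (1.0 / (1.0 - W))

def interp(table, x, y, h):
    """Double (bilinear) interpolation; coordinates are clamped to the grid."""
    n = len(table)
    top = (n - 1) * h
    x = min(max(x, 0.0), top)
    y = min(max(y, 0.0), top)
    i = min(int(x / h), n - 2)
    j = min(int(y / h), n - 2)
    tx, ty = x / h - i, y / h - j
    return ((1 - tx) * (1 - ty) * table[i][j] + (1 - tx) * ty * table[i][j + 1]
            + tx * (1 - ty) * table[i + 1][j] + tx * ty * table[i + 1][j + 1])

def solve(r0=2.0, T=1.0, N=20, M=1.0, h=0.1, n=30):
    delta = T / N
    controls = (0.0, M)  # bang-bang candidates only
    # F_0 table: terminal cost on the grid.
    tables = [[[phi(i * h, j * h, r0) for j in range(n)] for i in range(n)]]
    # Backward sweep: F_k(x, y) = min over u of F_{k-1}(next state).
    for _ in range(N):
        prev = tables[-1]
        cur = [[min(interp(prev, *step(i * h, j * h, u, r0, delta), h)
                    for u in controls)
                for j in range(n)] for i in range(n)]
        tables.append(cur)
    # Forward sweep from equilibrium x = y = 1.
    x, y = 1.0, 1.0
    policy, path = [], [(x, y)]
    for k in range(N, 0, -1):
        u = min(controls,
                key=lambda c: interp(tables[k - 1], *step(x, y, c, r0, delta), h))
        x, y = step(x, y, u, r0, delta)
        policy.append(u)
        path.append((x, y))
    return tables, policy, path

if __name__ == "__main__":
    tables, policy, path = solve()
    print(policy)
    print(path[-1])
```

Holding every table in a Python list corresponds to the large-core-memory case discussed above; the tape traffic that dominates the running time of the original code has no counterpart in this sketch.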
TABLE 6.2. F_1(x_1, y_1)

y \ x     x_1^1      x_2^1      x_3^1      x_4^1      x_5^1      x_6^1
y_1^1     F_1^11     F_1^12     F_1^13     F_1^14     F_1^15     F_1^16
          u_11       u_12       u_13       u_14       u_15       u_16
y_2^1     F_1^21     F_1^22     F_1^23     F_1^24     F_1^25     F_1^26
          u_21       u_22       u_23       u_24       u_25       u_26
y_3^1     F_1^31     F_1^32     F_1^33     F_1^34     F_1^35     F_1^36
          u_31       u_32       u_33       u_34       u_35       u_36
This procedure is repeated until all the $F_k$ tables are exhausted, thus generating the extremal locus, the sequence $\{x_N, y_N\}$.
6.2. The Xenon Override Constraint
As mentioned earlier, the preceding explanation of the DYNPROG code did not include the xenon override constraint $x \le x_c$. One method by which this constraint can be accommodated in the code is simply to constrain all $x_k \le x_c$ using an appropriate computer instruction. The line $x = x_c$ is a horizontal straight line in the xenon-iodine phase space, cutting through the unconstrained extremals and denying them the region of phase space beyond. Evidently this reduces the search space when computing the $F_k$ tables, which reduces the computing-time and fast-memory-storage requirements. The memory storage thereby freed is available for other purposes, such as reducing the mesh size of the $(x_k, y_k)$ grid.
The second way in which the xenon override constraint can be introduced is through the addition of an artificial cost to the criterion functional, as mentioned earlier. In the DYNPROG code the artifice used is the addition of $10(x/x_c)^{20}$ to $\Phi$ or $\Psi$. This cost becomes tremendous when $x$ is very close to $x_c$, and rapidly dwindles to a negligible amount for $x < x_c$. Then the optimal $u_k^*$ choice is forced, so that the
extremal never crosses the x = x c line, but on the contrary is diverted from x = x c in a series of sawtooth-shaped segments in the phase space, as discussed later. These segments will be seen to correspond to a flux pulse train which “burns out” sufficient xenon so that the optimal extremal can be rejoined at a later point in the phase space. As mentioned in Section 5.6, another mode of suboptimal shutdown is to make a portion of the extremal coincident with the line x = x c . This, however, requires a relaxation of the upper bound M of the flux constraint, as will be discussed later as well.
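To give a feel for how sharply the artificial cost $10(x/x_c)^{20}$ switches on, the following minimal Python sketch evaluates it at a few fractions of the override level. The function name `override_penalty` is hypothetical; the value x_c = 10 is borrowed from the XMAX = 10 example used in the input description and is otherwise arbitrary.

```python
def override_penalty(x, x_c):
    """Artificial cost added to the criterion: 10 * (x / x_c) ** 20."""
    return 10.0 * (x / x_c) ** 20

if __name__ == "__main__":
    x_c = 10.0  # illustrative override level, in normalized xenon units
    for fraction in (0.5, 0.8, 0.9, 1.0):
        print(fraction, override_penalty(fraction * x_c, x_c))
```

At half the override level the penalty is below 1e-5, while at the override level itself it equals 10, so the minimization is only influenced in a narrow band just below $x = x_c$.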
6.3. DYNPROG and COAST Input-Data Format
The input-data parameters which are required for a given computer-problem run of DYNPROG are punched on a single hollerith card (punched card) according to the format to be described. The card-to-tape converter enters this data card together with the DYNPROG code stack of punched hollerith cards, to be read onto the computer input tape for processing prior to the problem run. The input data consists of the following parameters.

INDEX (1 or 2)   determines whether (1) problem (a) or (2) problem (b) is to be solved for the particular computer run
NMAX             number of stages, $N = T/\Delta$, or $u_N^*$ decision intervals
T                length of the control duration (units of iodine mean life, 9.58 hours)
RO               $r_0 = \sigma\varphi_0/\lambda_1$, equilibrium power, or flux, parameter
TO               post-shutdown time at which the xenon concentration is minimized; for problem (b) only
UMAX             flux-constraint upper bound, $M$; $M = 1$ is rated power
DU               flux increments used in search for $u_k^*$; DU = UMAX corresponds to bang-bang control
DXY              $\Delta x$ and/or $\Delta y$; size of $F_k$-table grid mesh
SKIP (0 or 1)    determines whether or not $F_k$ tables are printed out on off-line printer following the computer-problem run
MAXIJ            dimension of $F_k$ tables, $\le$ 100 x 100
CON              conversion factor for xenon concentration level to poison reactivity, in dollars
XMAX             xenon override constraint; units of normalized xenon concentration; e.g., XMAX = 10 means an override reactivity constraint corresponding to 10 times the equilibrium poison reactivity
The above parameters are tabulated in Table 6.3 in terms of their location on a hollerith card.

TABLE 6.3. DYNPROG INPUT

Card column   Number type   Parameter   Description
1             I1            INDEX       1 for PHI, 2 for PSI problems
2-4           I3            NMAX        number of stages (usually 20)
5-9           F5.0          T           end of control phase (units of $\lambda_1^{-1}$)
10-14         F5.0          RO          $r_0$ parameter corresponding to operating power prior to shutdown
15-19         F5.0          TO          $x_{\min}$ time, in PSI problems only
20-24         F5.0          UMAX        upper bound on u* (usually 1)
25-29         F5.0          DU          u* increment (usually UMAX)
30-34         F5.0          DXY         size of $F_k$-table grid mesh
35-39         F5.0          SKIP        $F_k$ table dump bypass (0 prints $F_k$ tables, 1 does not)
40-43         I4            MAXIJ       size of $F_k$ table array
44-49         F6.3          CON         poison reactivity conversion factor to dollars
50-57         F8.1          XMAX        xenon override constraint
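The fixed-column layout of Table 6.3 can be illustrated with a short Python sketch that slices a card image into its twelve fields. The names `FIELDS` and `parse_card` are hypothetical, and the sketch assumes every real-valued field carries an explicit decimal point; it does not reproduce the implied-decimal rule of FORTRAN F-format input. The sample card uses values resembling the sample printout of Section 6.A.2, with XMAX = 10 chosen arbitrarily.

```python
# (name, first column, last column, converter), columns numbered from 1
FIELDS = [
    ("INDEX", 1, 1, int), ("NMAX", 2, 4, int), ("T", 5, 9, float),
    ("RO", 10, 14, float), ("TO", 15, 19, float), ("UMAX", 20, 24, float),
    ("DU", 25, 29, float), ("DXY", 30, 34, float), ("SKIP", 35, 39, float),
    ("MAXIJ", 40, 43, int), ("CON", 44, 49, float), ("XMAX", 50, 57, float),
]

def parse_card(card):
    """Split one 80-column card image into the DYNPROG input parameters."""
    card = card.ljust(80)
    record = {}
    for name, first, last, convert in FIELDS:
        text = card[first - 1:last].strip()
        record[name] = convert(text) if text else convert(0)  # blank reads as zero
    return record

if __name__ == "__main__":
    # Build a card image field by field so the column widths are explicit.
    card = (f"{2:1d}{20:3d}{1.0:5.1f}{20.0:5.1f}{1.0:5.1f}{2.0:5.1f}"
            f"{2.0:5.1f}{1.0:5.1f}{1.0:5.1f}{30:4d}{8.125:6.3f}{10.0:8.1f}")
    print(repr(card))
    print(parse_card(card))
```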
The COAST program, also written in the ALTAC III language, computes families of solutions of the xenon and iodine differential equations (3.16), which describe the conditions following shutdown, where $u^* \equiv 0$. COAST is used, with input initial conditions from DYNPROG at shutdown, to continue the extremal in its post-shutdown “coasting” phase. It also generates the data from which Fig. 3.2 is drawn; that is, COAST can also provide the xenon and iodine behavior following immediate shutdown, where $x(0) = y(0) = 1$ and $u^* = 0$. The input data to COAST consists of the following parameters.

XO     xenon-concentration shutdown level, or xenon initial condition for immediate shutdown
YO     iodine-concentration shutdown level, or iodine initial condition for immediate shutdown
DEL    time increment
RO     $r_0 = \sigma\varphi_0/\lambda_1$, power or flux parameter
IMAX   number of time increments per run
CON    xenon poison reactivity conversion factor to dollars
The above parameters are tabulated in Table 6.4 as they would appear on a hollerith (punched) data card.

TABLE 6.4. COAST INPUT

Card column   Number type   Parameter   Description
1-9           F9.4          XO          xenon concentration initial condition
10-18         F9.4          YO          iodine concentration initial condition
19-27         F9.4          DEL         time increment (usually 0.1)
28-36         F9.4          RO          equilibrium flux parameter
37-41         I5            IMAX        number of time increments
42-47         F6.3          CON         reactivity conversion factor
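The post-shutdown “coasting” computation that COAST performs can be sketched as follows. This is a hedged Python paraphrase of the closed-form expressions that appear in the COAST listing of Section 6.A.3, not the program itself; the function name `coast` is hypothetical, and the constants w = 21/29 and γ1 = 56/59 are those used in the listing.

```python
import math

W = 21.0 / 29.0      # w, as used in the listing
GAM1 = 56.0 / 59.0   # gamma_1, as used in the listing

def coast(x0, y0, delta, r0, imax, con):
    """Tabulate xenon x, iodine y, and reactivity x*con after shutdown (u = 0)."""
    a = GAM1 * (W + r0) / (1.0 - W)
    rows = []
    for i in range(1, imax + 1):
        t = (i - 1) * delta
        x = (x0 + a * y0) * math.exp(-W * t) - a * y0 * math.exp(-t)
        y = y0 * math.exp(-t)
        rows.append((i, x, y, x * con))
    return rows

if __name__ == "__main__":
    # Immediate shutdown from equilibrium, x(0) = y(0) = 1; r0 and CON are illustrative.
    for row in coast(1.0, 1.0, 0.1, 1.86, 31, 8.125):
        print("%5d %9.4f %9.4f %11.4f" % row)
```

With initial conditions x(0) = y(0) = 1 the table reproduces the immediate-shutdown transient; with the final state of a DYNPROG run as input it continues the extremal past the end of the control phase.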
6.A. Appendix to Chapter 6
6.A.1. DYNPROG Glossary

A                  $= w^{w/(1-w)}\,\gamma_1(w + r_0)$
B                  $= (1 - w)/[\gamma_1(w + r_0)]$
C                  $= 1/(1 - w)$
CON                factor for converting x to reactivity in dollars
COY                $= -[\mathrm{GAM1}/(1 - w)][\exp(-\Delta) - \exp(-w\Delta)]$
DEL                $= \Delta = T/\mathrm{NMAX}$; time increment used throughout code
DRU                $= r_0\,\Delta u$
DU                 $= \Delta u$; u increment in u* search loop
DXY                x and y interval which fixes $F_k$ matrix grid interval size
8 DYNPROG, ~ 2 3 , ~   code name; ALTAC 3; compiler TAC; binary deck and code check; execute with conditional post-mortem dumps
DY1                $= \Delta u\,(1 - \exp(-\Delta))$
EDEL               $= \exp(-\Delta)$
EDEO               $= \exp(-w\Delta)$
EX                 $= \exp[-\Delta(w + r_0\,\Delta u)]$
EU                 $= \exp[-(\Delta u)(r_0\Delta)]$
F(I,J)             generic symbol for $F_k(x, y)$
FR(J)              interpolated $F_k$ corresponding to jth column
4700-31000F, 0-5100C, 30000-35000H   floating-point dump, command format, hollerith (input-output) dump
FU                 interpolated $F_k$ with next value of u for comparison with FR(J) in u* search loop
FWRDF              $F_k$ table entry on solution printout format
GAM1               $= \gamma_1(w + r_0)$
GAM2               $= \gamma_2(w + r_0)$
I                  index of x (rows) of F matrix, $1 \le I \le \mathrm{MAXIJ} - 1$
INDEX 1,2          1 is $\Phi(x, y)$, problem (a); 2 is $\Psi(x, y)$, problem (b)
IPRIME             tape-read check for correct I
IXNMTROTO UMDUDXY SKP MIJCON   format for punching data cards: INDEX, NMAX, T, RO, TO, UMAX, DU, DXY, SKIP, MAXIJ, CON
J                  index of y (columns) of $F_k$ matrix, $1 \le J \le \mathrm{MAXIJ} - 1$
MAXIJ              maximum number of rows (I) and columns (J) in $F_k$ table matrix
MOUNT TAPE         instruction to mount extra tape used to store F tables
NMAX               maximum number of stages of $F_k$
NMA                $= \mathrm{NMAX} + 1$; tape-positioning parameter
NPRIME             tape-read check for correct N
NRBACK             tape backspace-positioning parameter
RAYDMP             special dump
RC                 $= w + r_0$
RO                 $= r_0 = \sigma\varphi_0/\lambda_1$; equilibrium power parameter
RU                 $= w + r_0\,\Delta u$
/"                 computer time-sharing card
THX, THY           $= x, y/\mathrm{DXY}$; linear interpolation interval
THXC, THYC         $= 1 - \mathrm{THX}, \mathrm{THY}$
U1                 $= \gamma_1(w + r_0)(y - u)/(w + r_0\,\Delta u - 1)$
U2                 $= (w + r_0)u/(w + r_0\,\Delta u)$
UMAX               $u_{\max} = M$
UO1                $= -\mathrm{GAM1}/(1 - w)$
US(J)              $= u^*(J\,\Delta u)$, $1 \le J \le \mathrm{MAXIJ} + 1$
US0                $u^* \equiv 0$
USTAR              $u^*$
XMAX               $x_c$, xenon override constraint
XY                 x or y
XYMAX              maximum value of x or y in $F_k$ matrix
6.A.2. Sample Printout of DYNPROG

NMAX = 20   UMAX = 2.00   DU = 2.00   MAXIJ = 30   XYMAX = 29.0   DEL = 0.05   B = 0.01
GAM1 = 19.7   GAM2 = 1.1   EDEL = 0.95   DY1 = 0.1   DRU = 40.00   EU = 0.14

Objective Function PSI

RO = 20   T = 1.000   DEL = 0.05000   TO = 1.00   XMAX =

  N        X          Y        USTAR       FN          CON
 20     1.00000    1.00000     0.       0.313142     8.1250
 19     1.90647    0.95123     0.       0.300607    15.4901
 18     2.73477    0.90484     0.       0.264368    22.2200
 17     3.48991    0.86071     0.       0.204034    28.3555
 16     4.17663    0.81873     0.       0.119307    33.9351
 15     4.79938    0.77880     0.       0.321586    38.9950
 14     5.36238    0.74082     0.       0.193685    43.5693
 13     5.86957    0.70469     0.       0.372604    47.6903
 12     6.32469    0.67032     0.       0.199694    51.3881
 11     6.73125    0.63763     0.       0.353619    54.6914
 10     7.09256    0.60653     0.       0.131254    57.6271
  9     7.41173    0.57695     0.       0.259472    60.2203
  8     7.69168    0.54881     0.       0.381486    62.4949
  7     7.93517    0.52205     0.       0.497165    64.4732
  6     8.14478    0.49659     2.000    0.175914    66.1764
  5     1.33704    0.56991     0.       0.231196    10.8635
  4     1.82637    0.54211     0.       0.480335    14.8393
  3     2.27211    0.51567     0.       0.224100    18.4609
  2     2.67710    0.49052     2.000    0.448630    21.7514
  1     0.62092    0.56414     2.000    0.386202     5.0450
  0     0.38246    0.63417     0.       0.634170     3.1075
6.A.3. DYNPROG and COAST ALTAC III Instruction Listing
20
40 ABC I
8.15
HLT 14-087-8 ON MAGTAPE, LIBS ALTAC3 DYNPROGS IOWS IDENTIFYF,
4
C
C     FUNCTION AFMIF(I,DXY) COMPUTES VARIABLE FROM SUBSCRIPT
C
      AFMIF(I,DXY)=(I-1.)*DXY
C
C     SUBROUTINE IFMA(XY,TH,IJ) COMPUTES SUBSCRIPT FROM X AND Y VARIABLE
C
      SUBROUTINE IFMA(XY,TH,IJ)
      COMMON MAXIJ,XYMAX,DXY
      IF(XY) 10,10,20
   10 IJ=1
      TH=XY/DXY
      RETURN
   20 IF(XY-XYMAX) 40,30,30
   30 IJ=MAXIJ-1
      TH=(XY-XYMAX+DXY)/DXY
      RETURN
   40 IJ=XY/DXY+1.
      TH=(XY-AFMIF(IJ,DXY))/DXY
      RETURN
      END
C
C     MAKES HOLLER CODE FOR PHI AND OR PSI
C
      SUBROUTINE TYPE(P)
      DIMENSION P(1)
      PRINT 902,P
  902 FORMAT(1H0,20X,19HOBJECTIVE FUNCTION ,A3)
      RETURN
      END
C
C
C     MAIN PROGRAM BEGINS
C
      COMMON MAXIJ,XYMAX,DXY
      DIMENSION F(100,100),FR(100),US(100)
      CONTINUE
C     READ 900 $ GETS BY CASE CARD
C     COLS   TYPE  PARAM  DESCRIPTION
C     1      I1    INDEX  1 FOR PHI OR 2 FOR PSI PROBS
C     2-4    I3    NMAX   NUMBER OF STAGES (USUALLY 20)
C     5-9    F5.0  T      END OF CONTROL PHASE (UNITS OF INVERSE LAMBDA ONE)
C     10-14  F5.0  RO     PHI NAUGHT PARAMETER
C     15-19  F5.0  TO     XMIN TIME (IN PSI PROBS ONLY)
C     20-24  F5.0  UMAX   UPPER BOUND ON U (USUALLY 1.)
C     25-29  F5.0  DU     U INCREMENT (USUALLY 1.)
C     30-34  F5.0  DXY    COORDINATE INTERPOLATION INCREMENT (USUALLY 0.1)
C     35-39  F5.0  SKIP   FN TABLE DUMP BYPASS (0. DOESNT AND 1. DOES BYPASS)
C     40-43  I4    MAXIJ  SIZE OF FN TABLE (NOT GREATER THAN 100 X 100)
C     44-49  F6.3  CON    K MULTIPLIER FOR X
C     50-57  F8.1  XMAX   XENON OVERRIDE CONSTRAINT
    1 READ 900,INDEX,NMAX,T,RO,TO,UMAX,DU,DXY,SKIP,MAXIJ,CON,XMAX
      REWIND 6
  900 FORMAT(I1,I3,7F5.0,I4,F6.3,F8.1)
      IF(INDEX) 2,2,5
    2 STOP
C     COMPUTATION OF CONSTANTS
    5 DEL=T/NMAX
      B=8.*59./(56.*(21.+29.*RO))
      XYMAX=(MAXIJ-1.)*DXY
      RC=21./29.+RO
      GAM1=56./59.*RC
      GAM2=3./59.*RC
      EDEL=EXPF(-DEL)
      DY1=DU*(1.-EDEL)
      DRU=RO*DU
      EU=EDEL**DRU
      EDEO=EDEL**(21./29.)
      UO1=GAM1*(-29./8.)
      COY=UO1*(EDEL-EDEO)
      GO TO (100,200),INDEX
C     PHI COMPUTATION
  100 A=(21./29.)**(21./8.)*56./59.*(21./29.+RO)
      C=29./8.
  101 DO 105 I=1,MAXIJ
      X=AFMIF(I,DXY)
  102 DO 105 J=1,MAXIJ
      Y=AFMIF(J,DXY)
      IF(Y) 108,108,109
  108 F(I,J)=10.E10
      GO TO 105
  109 F(I,J)=A*Y*(1.+B*X/Y)**C
  105 CONTINUE $ I101,J102
      GO TO 250
C     PSI COMPUTATION
  200 ATO=EXPF(21./29.*(T-TO))
      BTO=(ATO-EXPF(T-TO))/B
  201 DO 205 I=1,MAXIJ
      X=AFMIF(I,DXY)
      ATOX=ATO*X
  202 DO 205 J=1,MAXIJ
      Y=AFMIF(J,DXY)
      F(I,J)=ATOX+BTO*Y
  205 CONTINUE $ I201,J202
C     STORES F AND USTAR TABLES FOR N=0
  250 DO 251 J=1,MAXIJ
  251 US(J)=0.
      N=0
      DO 260 I=1,MAXIJ
      IF(SKIP) 255,255,260
C
C
C     PRINT F0 AND US0
C
  255 PRINT 905,N,I,(F(I,J),J=1,MAXIJ)
  905 FORMAT(3H0N=I2,5X2HI=I3/(1H 10F12.5))
      PRINT 907
      PRINT 906,(US(J),J=1,MAXIJ)
  906 FORMAT(1H 10F12.5)
  260 WRITE TAPE 6,N,I,(F(I,J),J=1,MAXIJ),(US(J),J=1,MAXIJ)
C     COMPUTATION OF F(N,I,J) AND US(N,I,J)
  300 DO 499 N=1,NMAX
  301 DO 399 I=1,MAXIJ
      X=AFMIF(I,DXY)
  302 DO 365 J=1,MAXIJ
      Y=AFMIF(J,DXY)
C     COMPUTATION OF F(I,J) FOR U=0 FOLLOWS
      X1=EDEO*X+COY*Y
      Y1=Y*EDEL
      CALL IFMA(X1,THX,I1)
      CALL IFMA(Y1,THY,J1)
      THXC=1.-THX
      THYC=1.-THY
      FR(J)=THXC*THYC*F(I1,J1)+THXC*THY*F(I1,J1+1)+
     1THX*THYC*F(I1+1,J1)+THX*THY*F(I1+1,J1+1)+10.*(X/XMAX)**20
      US(J)=0.
C     INITIALIZATION FOR U LOOP
      U=DU
      RU=21./29.+DRU
      EX=EDEO*EU
C     U LOOP
  310 U1=GAM1*(Y-U)/(RU-1.)
      U2=RC*U/RU
      X1=(X-U1-U2)*EX+U1*EDEL+U2
      Y1=Y1+DY1
  350 CALL IFMA(X1,THX,I1)
      CALL IFMA(Y1,THY,J1)
      THXC=1.-THX
      THYC=1.-THY
      FU=THXC*THYC*F(I1,J1)+THXC*THY*F(I1,J1+1)+
     1THX*THYC*F(I1+1,J1)+THX*THY*F(I1+1,J1+1)
  704 IF(FU-FR(J)) 355,360,360
  355 FR(J)=FU
      US(J)=U
C     INCREMENT U,RU,EX
  360 U=U+DU
      RU=RU+DRU
      EX=EX*EU
      IF(U-UMAX) 310,310,365
C     END U LOOP
  365 CONTINUE $ J FROM 302
      WRITE TAPE 6,N,I,(FR(J),J=1,MAXIJ),(US(J),J=1,MAXIJ)
      IF(SKIP) 370,370,399
C
C
C     PRINT FN AND USN
C
  370 PRINT 905,N,I,(FR(J),J=1,MAXIJ)
      PRINT 907
      PRINT 906,(US(J),J=1,MAXIJ)
  907 FORMAT(1H )
  399 CONTINUE $ I FROM 301
      DO 401 M=1,MAXIJ
  401 BACKSPACE 6
  402 DO 415 I=1,MAXIJ
      READ TAPE 6,NPRIME,IPRIME,(F(I,J),J=1,MAXIJ)
      IF(N-NPRIME) 410,403,410
  403 IF(I-IPRIME) 410,415,410
  410 PRINT 915,N,NPRIME,I,IPRIME
  915 FORMAT(18H0 TAPE NOT SYNCHED,4I5)
      STOP
  415 CONTINUE $ I FROM 402
  499 CONTINUE $ N FROM 250
C     FORMATTING USTAR AND FN TABLE PLUS COMPUTATION OF BOTH
      PRINT 921,NMAX,UMAX,DU,MAXIJ,XYMAX,DEL,B,RC,GAM1,GAM2,EDEL,
     1DY1,DRU,EU,EDEO,UO1,COY
  921 FORMAT(6H1NMAX=I2,4X,5HUMAX=F5.2,4X,3HDU=F5.2,4X,6HMAXIJ=I4,4X,
     16HXYMAX=F5.1,4X,4HDEL=F4.1,4X,2HB=F6.2,4X,3HRC=F5.1/
     26H0GAM1=F~.1,4X,5HGAM2=F~.1,4X,5HEDEL=F~.2,4X,
     34HDY1=F4.1,4X,4HDRU=F7.2,4X,3HEU=F7.2,4X,5HEDEO=F6.3,4X,
     44HUO1=F5.2,4X,4HCOY=F6.3)
      PRINT 920,SYSDATE
  920 FORMAT(1H0,20X,20HOPTIMAL USTAR POLICY,2X,A8)
      GO TO (501,502),INDEX
  501 CALL TYPE(3HPHI)
      PRINT 925,RO,T,DEL,XMAX
  925 FORMAT(6H0RO = ,F5.0,5X4HT = F5.3,5X6HDEL = F6.5,5X7HXMAX = F5.2)
      GO TO 503
  502 CALL TYPE(3HPSI)
      PRINT 930,RO,T,DEL,TO,XMAX
  930 FORMAT(6H0RO = ,F5.0,5X,4HT = F5.3,5X6HDEL = F7.5,5X5HTO = F5.2,
     15X7HXMAX = F5.2)
      GO TO 503
  503 PRINT 935
  935 FORMAT(/1H ,7X,1HN,15X,1HX,15X,1HY,12X,5HUSTAR,14X,2HFN//)
C     FIRST LINE OF USTAR AND FNMAX TABLE
      X=1.
      Y=1.
      I=0 $ TAPE 6 NOW AT END OF F(NMAX)=ROW 1 OF F(NMAX+1)
      NMA=NMAX+1
  510 DO 525 LINE=1,NMA
      N=NMAX-LINE+1
      NRBACK=MAXIJ+I+1
      CALL IFMA(X,THX,I)
      CALL IFMA(Y,THY,J)
      THXC=1.-THX
      THYC=1.-THY
      IF(THX) 544,545,545
  544 IOTA=1
      GO TO 560
  545 IF(THX-.5) 549,549,546
  546 IF(THX-1.) 547,547,548
  547 IOTA=I+1
      GO TO 560
  548 IOTA=MAXIJ
      GO TO 560
  549 IOTA=I
      GO TO 560
  560 IF(THY) 561,562,562
  561 JOTA=1
      GO TO 588
  562 IF(THY-.5) 566,566,563
  563 IF(THY-1.) 564,564,565
  564 JOTA=J+1
      GO TO 588
  565 JOTA=MAXIJ
      GO TO 588
  566 JOTA=J
      GO TO 588
  588 NRBACK=NRBACK-I
      DO 520 M=1,NRBACK
  520 BACKSPACE 6
C     READS ROW I OF FN TABLE
      READ TAPE 6,NPRIME,IPRIME,(FR(L),L=1,MAXIJ),(US(L),L=1,MAXIJ)
C     TAPE CHECK SUBROUTINE
  600 IF(NPRIME-N) 601,605,603
C     NPRIME LESS THAN N
  601 KR=MAXIJ*(N-NPRIME)-1
      DO 602 K=1,KR
  602 READ TAPE 6
      GO TO 600
C     NPRIME GREATER THAN N
  603 KR=MAXIJ*(NPRIME-N)+1
      DO 604 K=1,KR
  604 BACKSPACE 6
      GO TO 600
C     IPRIME LESS THAN I
  605 IF(IPRIME-I) 606,515,608
  606 KR=I-IPRIME-1
      DO 607 K=1,KR
  607 READ TAPE 6
      GO TO 605
C     IPRIME GREATER THAN I
  608 KR=IPRIME-I+1
      DO 609 K=1,KR
  609 BACKSPACE 6
      GO TO 605
C     TAPE 6 NOW AT RECORD I+1 OF USTAR AND FN TABLE
  515 F00=FR(J)
      F10=FR(J+1)
      IF(IOTA-I) 517,516,517
  516 U=US(JOTA)
  517 READ TAPE 6,NPRIME,IPRIME,(FR(L),L=1,MAXIJ),(US(L),L=1,MAXIJ)
      BACKSPACE 6
C     TAPE 6 RESTORED TO RECORD I+1
      F01=FR(J)
      F11=FR(J+1)
      IF(IOTA-I) 518,519,518
  518 U=US(JOTA)
  519 FWRDF=THXC*THYC*F00+THXC*THY*F01+THX*THYC*F10+THX*THY*F11
      XC=X*CON
      PRINT 940,N,X,Y,U,FWRDF,XC
  940 FORMAT(1H ,7X,I2,10X,F9.5,8X,F9.5,9X,F5.3,7X,E13.6,2X,F11.4)
C     OPTIMAL TRAJECTORY EXTRAPOLATION
      RU=21./29.+RO*U
      U1=GAM1*(Y-U)/(RU-1.)
      U2=RC*U/RU
      EX=EDEL**RU
      X=(X-U1-U2)*EX+U1*EDEL+U2
  702 Y=U+(Y-U)*EDEL
  525 CONTINUE $ LINE FROM 510
      GO TO 1
      END $
S SCOAST rF2 3C O35OOOF 035OOOC O35OOOH * N I COAST IDENTIFYFS
      DIMENSION X(1000),Y(1000)
C     READ 900 $ CASE CARD
C     FORMAT  CARD COLUMN  PARAMETER
C     F9.4    1-9          XO, INITIAL X
C     F9.4    10-18        YO, INITIAL Y
C     F9.4    19-27        DEL, TIME INCREMENT
C     F9.4    28-36        RO, PHI NAUGHT
C     I5      37-41        IMAX, INDEX UPPER LIMIT
C     F6.3    42-47        CON, K MULTIPLIER FOR X(I)
C
    1 READ 900,XO,YO,DEL,RO,IMAX,CON
  900 FORMAT(4F9.4,I5,F6.3)
    3 PRINT 905,XO,YO,DEL,RO,IMAX
  905 FORMAT(4H1XO=F9.4,5X,4HDEL=F9.4,5X,3HRO=F9.4,5X,
     15HIMAX=I5//1H ,3X,1HI,5X,4HX(I),5X,4HY(I)//)
      A=56.*(21.+29.*RO)/(8.*59.)
      DO 110 I=1,IMAX
  100 X(I)=(XO+A*YO)*EXPF(-21./29.*(I-1)*DEL)-A*YO*EXPF(-(I-1)*DEL)
  101 Y(I)=YO*EXPF(-(I-1)*DEL)
      XOC=X(I)*CON
      GO TO 105
  105 PRINT 500,I,X(I),Y(I),XOC
  500 FORMAT(1H ,I5,2F9.4,2X,F11.4)
  110 CONTINUE $ I
      GO TO 1
      END $
(DATA
XO YO DEL RO IMAX CON
END DATA
CHAPTER 7
Experimental Verification
7.1. Introduction and IRR-1 Reactor Description
To verify the calculations, it is highly desirable to shut down an actual operating thermal reactor using the optimal shutdown programs developed thus far. Besides providing an experimental check on the calculated shutdown programs, such a test meets a practical need: the operating schedule of the reactor used for the shutdown runs requires an optimal flux shutdown program to ensure continuous two-shift steady-state operation. For this reactor the end of normal daily operation occurs at approximately 2300 hours, at which time the reactor is shut down. It is restarted the next morning at approximately 0800 hours. This desired startup time at the anticipated equilibrium power of approximately 5 megawatts almost coincides with the time at which the xenon maximum, caused by shutdown the night before, occurs. If there is sufficient xenon override reactivity, then no startup difficulty arises, and no optimal shutdown program is required. On the other hand, when there is insufficient positive reactivity available to override xenon, which can occur using cores with partially depleted fuel elements, then an optimal shutdown program is needed.
The particular system on which the experimental results were obtained is a water-moderated swimming-pool type of research reactor, IRR-1, located at the Soreq Nuclear Research Center, Yavne, Israel [25]. It has a core composed of enriched fuel (90 per cent U235) elements of the MTR type.† A new “clean” fuel-element configuration consists of about 20 fuel elements with an operating mass of approximately 4.4 kilograms of U235. A critical mass for such a fuel-element configuration is about 2.9 kilograms of U235. The control rods are made of neutron-absorbing material, and each “reciprocates” in a special semifuel element whose lower half is filled with fuel. The fuel-element configuration is submerged in one end of a large dumbbell-shaped swimming pool about 23 feet deep. The reactor presently operates at a nominal power of 2 megawatts and at a mean thermal flux of approximately $1.5 \times 10^{13}$ neutrons/cm$^2$-sec. The available xenon override reactivity for a brand new (cold clean) core at room temperature is about 8 dollars. Of course, as the reactor slowly uses up fuel, the available positive reactivity falls. This makes the consequent override reactivity difficulty more and more acute as the fuel becomes depleted; hence the need for an optimal shutdown program.
The swimming pool is surrounded with lead as well as special (barytes) concrete for shielding, which is pierced with a number of beam holes. They are used to provide collimated beams of thermal neutrons to the system periphery for various experimental nuclear physics research investigations (which is the main function of the reactor). As the moderator is transparent, one can look down through the swimming-pool moderator during normal reactor operation and see the fuel-element configuration (core) aglow with a bluish light. This is called Cerenkov radiation and is due to fast electrons released in fission, which lose their energy by radiating light because their speed is greater than the speed of light in the medium, i.e., in the water moderator. This is a well-known phenomenon, easily accounted for theoretically. Further details of this reactor are provided in Appendix 7.A.1.

† MTR (Materials Testing Reactor, U.S. National Reactor Testing Station, Arco, Idaho) fuel elements consist essentially of 3-inch-square aluminium cylinders, 284 inches in length, containing 18 longitudinal fuel sheets. Each sheet, or layer, is curvilinear in section and clad in aluminium. The moderator-coolant water circulates through the fuel element between the fuel sheets.

7.2. Immediate Shutdown of IRR-1 to Zero Flux
In the range of equilibrium flux levels at which the IRR-1 operates, the low- or high-flux asymptotic expressions for equilibrium xenon reactivity given in (3.6) and (3.7) are not valid. The exact relationships for xenon reactivity magnitude at its maximum following immediate shutdown, and the time at which the maximum occurs, will be developed. From equations (3.16) the xenon behavior following immediate shutdown, in terms of initial equilibrium conditions $x(0) = y(0) = u(0^-) = 1$, $u(0^+) = 0$, is
$$x(t) = \left[1 + \frac{\gamma_1(w + r_0)}{1 - w}\right] e^{-wt} - \frac{\gamma_1(w + r_0)}{1 - w}\, e^{-t}. \tag{7.1}$$
Equating the derivative of the above expression to zero yields the time at which the maximum occurs following initial immediate shutdown:
$$t_{\max} = -\frac{1}{1 - w}\,\ln\left\{ w\left[1 + \frac{1 - w}{\gamma_1(w + r_0)}\right]\right\} \quad \text{(units of iodine mean life, 9.58 hr)}. \tag{7.2}$$
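Inserting $t_{\max}$ in (7.1) gives the corresponding magnitude of the xenon maximum as
$$x_{\max} = w^{w/(1-w)}\,\gamma_1(w + r_0)\left[1 + \frac{1 - w}{\gamma_1(w + r_0)}\right]^{1/(1-w)} \quad \text{(dimensionless units)}. \tag{7.3}$$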
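The relations for the post-shutdown maximum are easy to check numerically. The Python sketch below evaluates the xenon transient after immediate shutdown from equilibrium, the time of its maximum, and the peak value, and also inverts the time of the maximum for $r_0$. The function names are hypothetical, and the closed forms used are the ones obtained by solving the shutdown equation with $x(0) = y(0) = 1$, $u = 0$; the default constants $\gamma_1 = 0.95$ and $w = 0.724$ are the values quoted in this section.

```python
import math

def x_after_shutdown(t, r0, w=0.724, g1=0.95):
    """Xenon after immediate shutdown from equilibrium, x(0) = y(0) = 1."""
    a = g1 * (w + r0) / (1.0 - w)
    return (1.0 + a) * math.exp(-w * t) - a * math.exp(-t)

def t_max(r0, w=0.724, g1=0.95):
    """Time of the xenon maximum, in units of iodine mean life."""
    return -math.log(w * (1.0 + (1.0 - w) / (g1 * (w + r0)))) / (1.0 - w)

def x_max(r0, w=0.724, g1=0.95):
    """Magnitude of the xenon maximum, in dimensionless units."""
    g = g1 * (w + r0)
    return w ** (w / (1.0 - w)) * g * (1.0 + (1.0 - w) / g) ** (1.0 / (1.0 - w))

def r0_from_t_max(t, w=0.724, g1=0.95):
    """Invert the t_max relation for the flux parameter r0."""
    return w * (1.0 - w) / (g1 * (math.exp(-(1.0 - w) * t) - w)) - w

if __name__ == "__main__":
    r0 = 1.86
    tm = t_max(r0)
    print("t_max =", tm, "iodine mean lives, about", tm * 9.58, "hours")
    print("x_max =", x_max(r0), " check:", x_after_shutdown(tm, r0))
    print("r0 recovered from t_max:", r0_from_t_max(tm))
```

Because the maximum is broad, the recovered $r_0$ is sensitive to the measured time: a change of a few thousandths in $t_{\max}$ moves $r_0$ by a few hundredths.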
To convert this to poison reactivity, (3.18) gives for U235 fuel,
The constant multiplier differs from that in (3.18) because $m = \Sigma_{\text{fuel}}/\Sigma_{\text{mod}}$ is not extremely large with respect to unity in the IRR-1 reactor. Using the parameters given in Appendix 7.A.1 for this reactor gives $m = 4.75$. Then, from (1.4), $K_x = -[m/(m + 1)](P/p)$, which results in (7.4). The time at which the xenon maximum occurs can be used as a determination of the average equilibrium flux $\varphi_0$. Samarium changes very slowly with respect to the time behavior of xenon, and thereby essentially does not affect the time of the xenon maximum. Figure 7.1 gives the experimentally determined xenon-plus-samarium behavior as a function of time for the IRR-1 at about 2-megawatts operating power. It is seen there that the maximum occurs at approximately 7.5 hours following immediate shutdown. From (7.2), $r_0$ is written in terms of the time at which the maximum occurs as
$$r_0 = \frac{w(1 - w)}{\gamma_1\left[e^{-(1-w)t_{\max}} - w\right]} - w. \tag{7.5}$$
From Fig. 7.1, $t_{\max} = 0.78$, in units of iodine mean lifetime. With $\gamma_1 = 0.95$ and $w = 0.724$, $r_0 = 1.86$, implying an average thermal flux of $\varphi_0 = 1.55 \times 10^{13}$ neutrons/cm$^2$-sec. With this value of $\varphi_0$, the post-shutdown xenon behavior can be computed using (7.1), converted to the corresponding xenon reactivity, and also plotted in Fig. 7.1. From there, it is seen that the agreement between the computed xenon behavior for $\varphi_0 = 1.55 \times 10^{13}$ neutrons/cm$^2$-sec and the adjusted experimentally measured xenon behavior is very good. Even though the adjustment covers a multitude of calibration difficulties, the shapes of the computed and measured curves coincide quite well.

FIG. 7.1. Measured poison reactivity following immediate step shutdown for IRR-1 reactor at nominal power of 2 Mw. (Experimental curve and computed curve for $\varphi_0 = 1.55 \times 10^{13}$ neutrons/cm$^2$-sec; abscissa: time, iodine mean life (9.58 hours).)

At these flux levels, and xenon and samarium concentrations, the
poison due to samarium is an appreciable part of the total xenon-plus-samarium poison reactivity. For example, using the dimensionless form of the samarium and promethium differential equations (3.14) and (3.15), which are
$$\dot{s} = m_0(p - su), \qquad u(0) = s(0) = 1, \qquad m_0 = \sigma_s\varphi_0/\lambda_p \approx 0.14 \times 10^{-13}\,\varphi_0, \tag{7.6}$$
$$\dot{p} = u - p, \qquad p(0) = 1, \tag{7.7}$$
and eliminating the promethium concentration $p$, gives the following equation in the samarium concentration:
$$\ddot{s} + (1 + m_0 u)\dot{s} + m_0(\dot{u} + u)s = m_0 u. \tag{7.8}$$
The immediate shutdown initial conditions are $s(0) = 1$, $\dot{s}(0) = m_0$, and $u = \dot{u} = 0$. Then the solution of (7.8) is†
$$s = 1 + m_0(1 - e^{-t}) \quad \text{(in units of iodine mean life, 9.58 hr)}. \tag{7.9}$$
The corresponding samarium poison reactivity from (3.24) is
$$K_s = -\$1.43\left[1 + m_0(1 - e^{-t})\right] \quad \text{(in units of iodine mean life, 9.58 hr)}. \tag{7.10}$$
Hence the equilibrium samarium reactivity is $K_s(0) = -1.43$ dollars, independent of the equilibrium flux, and the long-time asymptotic value following immediate shutdown, which does depend on the equilibrium flux, is $K_s(\infty) = -\$1.43(1 + m_0)$. However, with $m_0$ from (7.6), and an $m/(m + 1) = 0.825$ correction factor, it is seen that for an equilibrium flux of $10^{13}$ neutrons/cm$^2$-sec, the samarium poison reactivity varies from -1.18 to -1.34 dollars. This is about 15 to 25 per cent of the xenon poison reactivity at this equilibrium flux level. As mentioned previously, the samarium concentration changes slowly with respect to the xenon poison reactivity changes of interest, as shown in Fig. 7.2, so that it is assumed to contribute a constant negative reactivity of about -1.25 dollars if the samarium reactivity was -1.18 dollars prior to shutdown.

† Hence $m_0 = 0.114\,r_0$.
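The quoted range of samarium reactivity can be reproduced with a few lines of Python. This is a minimal sketch under the assumptions stated in the text: the solution $s = 1 + m_0(1 - e^{-t})$, the equilibrium value of -1.43 dollars, the 0.825 correction factor, and $m_0 = 0.14$ for a flux of $10^{13}$ neutrons/cm$^2$-sec. The function name `samarium_reactivity` is hypothetical.

```python
import math

def samarium_reactivity(t, m0, correction=0.825, k_eq=-1.43):
    """Samarium poison reactivity in dollars at dimensionless time t after shutdown."""
    return correction * k_eq * (1.0 + m0 * (1.0 - math.exp(-t)))

if __name__ == "__main__":
    m0 = 0.14  # flux of 1e13 neutrons/cm^2-sec, per the flux dependence quoted above
    print("at shutdown:      ", samarium_reactivity(0.0, m0))
    print("long after:       ", samarium_reactivity(50.0, m0))
```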
FIG. 7.2. Samarium poison reactivity following immediate flux shutdown. (Curve for $\varphi_0 = 10^{13}$ neutrons/cm$^2$-sec; abscissa: time after immediate shutdown, days.)
7.3. Shutdown to Nonzero Power Levels
For reactor-control reasons, it is convenient in the IRR-1 to shut down to only 200 watts, from an operating power of approximately 2 megawatts, when using optimal shutdown programs. Hence it is desired to know whether a ratio of $10^4$ between operating and shutdown flux approximates “zero” shutdown well enough. It is then necessary to calculate the xenon behavior following an immediate shutdown to a nonzero constant flux level, $u_+ < 1$. It is obtained from (3.12) and (3.13), where the initial conditions are $x(0) = y(0) = u(0) = 1$, which are rewritten
$$\dot{x} = -(w + r_0 u_+)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0)u_+, \tag{7.11}$$
$$\dot{y} = u_+ - y. \tag{7.12}$$
Integrated, the post-shutdown xenon behavior is
$$x(t) = \frac{p\,u_+}{q} + \frac{\gamma_1 p\,(1 - u_+)}{q - 1}\, e^{-t} + \left[1 - \frac{p\,u_+}{q} - \frac{\gamma_1 p\,(1 - u_+)}{q - 1}\right] e^{-qt}, \tag{7.13}$$
where $p = w + r_0$ and $q = w + r_0 u_+$. By differentiating and eliminating the time at which the maximum occurs, we obtain the post-shutdown xenon maximum as
This is plotted in Fig. 7.3 for an equilibrium flux of $1.55 \times 10^{13}$ neutrons/cm$^2$-sec, where it is expressed in terms of $x_{\max}(0)$, the post-shutdown xenon maximum for zero shutdown flux. It is seen that an immediate flux shutdown to only 1 per cent or less is tantamount to shutdown to zero flux. Therefore, a shutdown to 200 watts should approximate shutdown to zero flux very well.

FIG. 7.3. Time at which post-shutdown xenon maximum occurs, and xenon maximum poison reactivity corresponding to immediate shutdown to flux level $u_+ < 1$ expressed as percentage of xenon maximum for step shutdown to zero flux. For equilibrium flux, $\varphi_0 = 1.55 \times 10^{13}$ neutrons/cm$^2$-sec. (Curves: xenon maximum reactivity for immediate shutdown to flux level $u_+$; time at which post-shutdown xenon maximum occurs, hours after shutdown. Abscissa: normalized shutdown flux level, $u_+$.)
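The claim that a shutdown to 1 per cent of the equilibrium flux is practically a shutdown to zero can be checked with a short numerical experiment. The Python sketch below integrates nothing: it evaluates the closed-form solution of the shutdown equations for a constant residual flux $u_+$ and locates the maximum on a fine time grid. It assumes $\gamma_1 + \gamma_2 = 1$ (as in the constants of the instruction listing) and $q = w + r_0 u_+ \ne 1$; the function names are hypothetical, and $r_0 = 1.86$, $w = 0.724$, $\gamma_1 = 0.95$ are the values quoted in Section 7.2.

```python
import math

W, G1 = 0.724, 0.95  # gamma_2 = 1 - gamma_1 is assumed

def x_partial(t, u_plus, r0):
    """Xenon after a step from equilibrium to constant flux u_plus (requires q != 1)."""
    p = W + r0
    q = W + r0 * u_plus
    x_inf = p * u_plus / q                   # new equilibrium level
    c1 = G1 * p * (1.0 - u_plus) / (q - 1.0)
    return x_inf + c1 * math.exp(-t) + (1.0 - x_inf - c1) * math.exp(-q * t)

def peak(u_plus, r0, t_end=3.0, steps=3000):
    """Return (time, value) of the largest xenon level on a uniform time grid."""
    dt = t_end / steps
    best_t, best_x = 0.0, x_partial(0.0, u_plus, r0)
    for k in range(1, steps + 1):
        x = x_partial(k * dt, u_plus, r0)
        if x > best_x:
            best_t, best_x = k * dt, x
    return best_t, best_x

if __name__ == "__main__":
    r0 = 1.86
    _, x0 = peak(0.0, r0)
    for u_plus in (1.0e-4, 1.0e-2, 5.0e-2):
        t, x = peak(u_plus, r0)
        print(u_plus, t * 9.58, 100.0 * x / x0)  # flux level, hours, per cent of zero-flux peak
```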
The corresponding time at which the maximum occurs depends on the shutdown flux level u + . This time, a by-product of the above development, is given by
which is also plotted in Fig. 7.3 for the above equilibrium flux. For computation purposes, $t_{\max}$ can be obtained first and used in the expression for $x_{\max}$, which is then rewritten
(7.16)
Also, as can be seen from Fig. 7.3, a shutdown to 1 per cent of the equilibrium flux is essentially equivalent to zero shutdown as far as $t_{\max}$ is concerned. General results for partial shutdown, as well as changes to higher flux levels from equilibrium conditions, are available in tabular and graphical form [26, 27].

7.4. Xenon and Iodine Buildup and Decay
After the reactor has been shut down for a long time (two days or more), whether or not optimally, essentially all the iodine-135 has decayed to xenon-135, which in turn has either absorbed a neutron to be transmuted to xenon-136 or has β-decayed to cesium-135. Both xenon-136 and cesium-135 are essentially neutronically inert for our purposes. If the reactor is now immediately restarted to operating power, the iodine-135 and xenon-135 begin to build up. It is important to determine the behavior of the buildup, especially the length of time it takes both the iodine and xenon to approach their asymptotic equilibrium concentrations. If the reactor is again shut down prior to the time at which the xenon and iodine can be considered to be at equilibrium, the post-shutdown time at which the xenon maximum occurs, as well as the magnitude of the xenon peak, will be different from their counterparts for an initial equilibrium state. To obtain the xenon buildup behavior, (3.12) and (3.13) are rewritten with $u = 1$, corresponding to immediate startup, and zero initial conditions, to give
$$\dot{x} = -(w + r_0)x + \gamma_1(w + r_0)y + \gamma_2(w + r_0), \qquad x(0) = 0, \tag{7.17}$$
$$\dot{y} = 1 - y, \qquad y(0) = 0. \tag{7.18}$$
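The integrals give the buildup behavior as

\[ x(t) = 1 - \left[\frac{\gamma_1 (w + r_0)}{w + r_0 - 1}\right] e^{-t} + \left[\frac{1 - \gamma_2 (w + r_0)}{w + r_0 - 1}\right] e^{-(w + r_0)t}, \qquad y(t) = 1 - e^{-t}. \tag{7.19} \]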
For the IRR-1, let r_0 = 2 for easy illustration, which corresponds to an average equilibrium flux φ_0 = 1.67 × 10^13 neutrons/cm²-sec. With γ_1 = 0.95 and w = 0.724,

\[ x(t) = 1 - 1.5\,e^{-t} + 0.5\,e^{-2.724t}, \qquad y(t) = 1 - e^{-t}, \tag{7.20} \]

\[ \dot{x}(t) = 1.5\,e^{-t} - 1.362\,e^{-2.724t}, \qquad \dot{y}(t) = e^{-t}. \tag{7.21} \]
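The closed forms (7.19) to (7.21) can be evaluated directly. The following is a minimal sketch, assuming γ_2 = 1 − γ_1 (which is what makes x = y = 1 the equilibrium of (7.17) and reproduces the coefficients 1.5, 0.5, and 1.362 above); the function name `buildup` is illustrative only.

```python
import math

W = 0.724              # w, value quoted in the text
R0 = 2.0               # r_0, illustrative IRR-1 value from the text
GAMMA1 = 0.95          # gamma_1 from the text
GAMMA2 = 1.0 - GAMMA1  # assumption: gamma_1 + gamma_2 = 1


def buildup(t, w=W, r0=R0, g1=GAMMA1, g2=GAMMA2):
    """Return (x, y, xdot, ydot) at time t, in iodine mean lives, after startup."""
    a = w + r0
    c1 = g1 * a / (a - 1.0)          # coefficient of exp(-t) in (7.19)
    c2 = (1.0 - g2 * a) / (a - 1.0)  # coefficient of exp(-(w + r0) t) in (7.19)
    x = 1.0 - c1 * math.exp(-t) + c2 * math.exp(-a * t)
    y = 1.0 - math.exp(-t)
    xdot = c1 * math.exp(-t) - a * c2 * math.exp(-a * t)
    ydot = math.exp(-t)
    return x, y, xdot, ydot


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0, 5.0):
        print(t, buildup(t))
```

At t = 0 the xenon slope equals γ_2 (w + r_0), the direct-yield term of (7.17), which is a quick consistency check on the coefficients.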
At t = 5.0, which is essentially 48 hours after startup, x(5) = 0.999, y(5) = 0.993, ẋ(5) = 0.0275, ẏ(5) = 0.0068, so that equilibrium conditions are assumed to be attained after this amount of time has elapsed. The xenon buildup is shown in Fig. 7.4. On the other hand, consider the case where the IRR-1 reactor is first allowed to achieve equilibrium xenon and then has been operating on its normal daily schedule, which consists of approximately 15 hours “on” and 9 hours “off,” as in Fig. 7.5. Then (3.12) and (3.13) with
FIG. 7.4. Xenon buildup after immediate startup from long shutdown. [Horizontal axis: time, iodine mean life (9.58 hours); vertical-axis ticks at 1.15, 2.30, and 4.60.]
FIG. 7.5. Daily changes in xenon and iodine due to periodic shutdown. [Panels: flux u between 0 and 1.0; iodine y with shutdown value y_i; xenon x with successive values x_i, x_{i+1}, x_{i+2}.]
u ≡ 0 are integrated to give, in terms of the state (x_i, y_i) at the onset of immediate shutdown on the ith day following the attainment of equilibrium xenon, for x_i < x < x_{i+1}, y_i < y < y_{i+1},
From (7.17) and (7.18) (u = 1), but with x(0) = x_{i+1} and y(0) = y_{i+1}, the xenon and iodine concentrations in the interval x_{i+1} < x < x_{i+2}, y_{i+1} < y < y_{i+2} are, respectively,
It is easily calculated that with such a power schedule neither the xenon nor the iodine has sufficient time to reach equilibrium, i.e., where ẋ = ẏ = 0, during the daily shutdown (u = 0) phases. During the shutdown phases, the times at which the xenon maxima occur are obtained by equating the derivative of the xenon concentration in (7.22) to zero. This gives
The magnitude of the corresponding xenon peak is obtained by inserting the above t_max in (7.22) to give
To ascertain how t_max changes with small changes in the ratio x_i/y_i, differentiation of (7.24) gives
with the above parameters and (x_i/y_i) in the vicinity of unity. Hence a
10 per cent change in (x_i/y_i), which could occur because the xenon and iodine have had only, say, 15 hours in which to approach equilibrium, will yield a 3 per cent change in t_max. Similarly, it is easily shown that t_max decreases with decreasing (x_i/y_i). For example, a 13 per cent decrease in (x_i/y_i) will result in a like percentage decrease in x_max.

The values x_i and y_i are almost impossibly difficult to determine experimentally in a direct fashion. However, they can be inferred quite easily, as described below. In the IRR-1 reactor the control-rod system must be recalibrated from time to time, as the core ages through fuel burnup, to ascertain changes in correspondence between control-rod position and negative reactivity inserted. The standard method is first to make the reactor critical for a particular configuration of the control rods. Then the rod to be calibrated is inserted a bit at a time and the asymptotic change in flux corresponding to each rod change is noted. The inhour relationships [14] yield the reactivity as a function of the asymptotic time constant. This method suffers from difficulties, among which the most important is the inevitable neutronic interaction between the control rods.

Another method for calibration is to use the changes in reactivity due to the xenon-iodine kinetics. Time changes, or slopes, due to xenon poison reactivity are easily measured. If (3.12) is examined prior to, and following, a step change in flux from u = 1 to u = 0, then

\[ \dot{x}_- = -(w + r_0)\,x_0 + \gamma_1 (w + r_0)\,y_0 + \gamma_2 (w + r_0), \tag{7.27} \]

\[ \dot{x}_+ = -w\,x_0 + \gamma_1 (w + r_0)\,y_0, \tag{7.28} \]

where ẋ_− is the xenon slope just prior to the step, when u = 1, and ẋ_+ is that immediately following the step, when u = 0. Recall that if ẋ = ẏ = 0 and u = 1, then x_0 = y_0 = 1 holds. The dimensionless xenon and iodine concentrations at the instant of shutdown are then, from (7.27) and (7.28),

\[ x_0 = \frac{1}{r_0}\left[\Delta\dot{x} + \gamma_2 (w + r_0)\right], \tag{7.29} \]
where Δẋ = ẋ_+ − ẋ_−. As mentioned previously, the equilibrium flux φ_0 can be determined by measuring the time at which the xenon maximum occurs when the reactor is immediately shut down from initial equilibrium conditions. From (7.2),

\[ t_{\max\,\mathrm{eq}} = \frac{1}{1 - w}\,\ln\left[w\left(1 + \frac{1 - w}{\gamma_1 (w + r_0)}\right)\right]^{-1}, \tag{7.31} \]

so that measuring t_max eq gives r_0, which in turn yields φ_0. With known equilibrium flux, the xenon negative reactivity K_x can now be calibrated, since from (3.18) and (7.4),
To keep the IRR-1 critical as the xenon concentration builds up following shutdown, the control-rod system automatically compensates for negative reactivity changes in the reactor, which are due predominantly to xenon, as discussed earlier. Hence the xenon reactivity mirrors the reactivity inserted by the rods and so can be used to calibrate them (and vice versa) when the calibration is known. At equilibrium, x = 1, so that K_x eq is given by (7.33).
For example, for a step shutdown to essentially zero flux (2 megawatts to 200 watts) from prior equilibrium, the resulting experimental curve, Fig. 7.1, yields t_max eq = 0.78 in units of iodine mean lifetime. Equation (7.31) gives r_0 = 1.86, corresponding to φ_0 = 1.55 × 10^13 neutrons/cm²-sec. Then

\[ K_{x\,\mathrm{eq}} = \cdots \left(\frac{1.86}{1.86 + 0.724}\right)(0.84) = 4.60 \text{ dollars}. \tag{7.34} \]
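The step from a measured t_max eq to r_0 amounts to inverting (7.31). The sketch below shows one way to do this in closed form; the function names and the flux scale factor `FLUX_PER_R0` (taken from the statement that r_0 = 2 corresponds to 1.67 × 10^13 neutrons/cm²-sec) are assumptions made for illustration, not part of the original procedure.

```python
import math

W = 0.724                    # w, value quoted in the text
GAMMA1 = 0.95                # gamma_1 from the text
FLUX_PER_R0 = 1.67e13 / 2.0  # assumed linear scale: neutrons/cm^2-sec per unit r_0


def t_max_eq(r0, w=W, g1=GAMMA1):
    """Time of the xenon peak after step shutdown from equilibrium, per (7.31)."""
    arg = w * (1.0 + (1.0 - w) / (g1 * (w + r0)))
    return -math.log(arg) / (1.0 - w)


def r0_from_t_max_eq(t, w=W, g1=GAMMA1):
    """Solve (7.31) for r_0 given a measured peak time t (iodine mean lives)."""
    ratio = math.exp(-(1.0 - w) * t) / w - 1.0  # equals (1 - w) / (g1 (w + r0))
    return (1.0 - w) / (g1 * ratio) - w


if __name__ == "__main__":
    t = t_max_eq(1.86)
    print("t_max_eq for r0 = 1.86:", t)
    print("recovered r0:", r0_from_t_max_eq(t))
    print("flux estimate:", 1.86 * FLUX_PER_R0)
```

Because the logarithm in (7.31) varies slowly with r_0, the inversion is sensitive to rounding of the measured peak time, so t_max eq should be carried to more than two decimals when r_0 is wanted to two.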
Recall from Section 7.2 that a correction for samarium must be made between the calculated and measured reactivity. As already discussed, the IRR-1 is seldom at equilibrium xenon because of its daily shutdown and startup operation. Then the calibration formulas must be modified. The quantities that can be readily
measured under these conditions are the ratio of the slopes, s = ẋ_−/ẋ_+, and the corresponding

\[ t_{\max} = \frac{1}{1 - w}\,\ln\left[w\left(1 + \frac{(1 - w)\,x_0}{\gamma_1 (w + r_0)\,y_0}\right)\right]^{-1}. \tag{7.35} \]
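The relations (7.27) to (7.29) and (7.35) can be exercised numerically. The following is a rough sketch under two assumptions: γ_2 = 1 − γ_1, and an iodine value obtained by rearranging (7.28) for y_0, which may differ in form from the original (7.30). All function names are illustrative.

```python
import math

W, GAMMA1 = 0.724, 0.95
GAMMA2 = 1.0 - GAMMA1  # assumption: gamma_1 + gamma_2 = 1


def slopes(x0, y0, r0, w=W, g1=GAMMA1, g2=GAMMA2):
    """Xenon slopes just before (u = 1) and just after (u = 0) the step."""
    a = w + r0
    before = -a * x0 + g1 * a * y0 + g2 * a  # (7.27)
    after = -w * x0 + g1 * a * y0            # (7.28)
    return before, after


def state_from_slopes(xdot_minus, xdot_plus, r0, w=W, g1=GAMMA1, g2=GAMMA2):
    """Recover (x0, y0) from the two measured slopes."""
    a = w + r0
    x0 = (xdot_plus - xdot_minus + g2 * a) / r0  # (7.29)
    y0 = (xdot_plus + w * x0) / (g1 * a)         # (7.28) rearranged for y0
    return x0, y0


def t_max(x0, y0, r0, w=W, g1=GAMMA1):
    """Post-shutdown time of the xenon peak, per (7.35)."""
    arg = w * (1.0 + (1.0 - w) * x0 / (g1 * (w + r0) * y0))
    return -math.log(arg) / (1.0 - w)


if __name__ == "__main__":
    before, after = slopes(0.9, 0.8, 2.0)
    print("slope ratio s:", before / after)
    print("recovered state:", state_from_slopes(before, after, 2.0))
    print("t_max:", t_max(0.9, 0.8, 2.0))
```

With x_0 = y_0 = 1 the slope before the step vanishes and `t_max` reduces to the equilibrium value of (7.31).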
From equations (7.29), (7.30), and the fact that s = 0 implies x_0 = y_0 = 1, the “nonequilibrium” values of x_0 and y_0 are (7.36)
With these values of x_0 and y_0 inserted in (7.35), the result is
which is seen to reduce to t_max eq for s = 0, i.e., initial equilibrium conditions. Then for measured values of s and t_max, r_0 and thus the equilibrium flux can be determined, from which the reactivity level of xenon can be determined, so that the control rods can be calibrated in this manner. A comparison can be made between t_max and t_max eq in that their difference Δt = t_max − t_max eq can be simplified by realizing that γ_2 w/r_0 ≪ γ_1, since r_0 is of the order of unity at the reactor equilibrium power under consideration. Further, the second term in the logarithm is small enough so that an expansion through linear terms can be made. Then to good approximation,

\[ \Delta t \approx \frac{s\,r_0}{(w + r_0)\,[\,w(1 - s) + r_0\,]}. \tag{7.39} \]
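As a quick numerical illustration of (7.39), the sketch below evaluates the approximate shift of the peak time for a given slope ratio. The function name and the sample value s = 0.1 are assumptions chosen only to show the order of magnitude; the conversion uses the 9.58-hour iodine mean life quoted in the text.

```python
W = 0.724            # w, value quoted in the text
MEAN_LIFE_H = 9.58   # iodine mean life in hours, as used in the text


def delta_t_approx(s, r0, w=W):
    """Approximate shift of the xenon peak time (iodine mean lives), per (7.39)."""
    return s * r0 / ((w + r0) * (w * (1.0 - s) + r0))


if __name__ == "__main__":
    dt = delta_t_approx(0.1, 2.0)  # hypothetical slope ratio s = 0.1, r0 = 2
    print("shift in mean lives:", dt)
    print("shift in minutes:", dt * MEAN_LIFE_H * 60.0)
```

The sign of the result follows the sign of s, and the shift vanishes for s = 0, in line with the reduction to the equilibrium case.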
Similarly, for the corresponding ratio of the magnitudes of the postshutdown xenon peaks to the above approximation,
Hence the time of the post-shutdown xenon peak will occur later (s > 0) or earlier (s < 0) than that for shutdown from equilibrium conditions. Similarly, the magnitude of the xenon peak will be larger (s < 0) or smaller (s > 0) than that for equilibrium shutdown.
If the reactor has been operating for a long time under periodic power conditions as in Fig. 7.5, so that the xenon has had an opportunity to come into only an average daily equilibrium, then s ≠ 0 upon immediate nonequilibrium shutdown. Of course, if the reactor is shut down from equilibrium conditions, s = 0. All of this is depicted in Fig. 7.6.
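The periodic schedule of Fig. 7.5 and the resulting slope ratio can be explored with a short simulation. This is only a sketch: the state equations for a general constant flux fraction u are inferred here from (7.17), (7.27), and (7.28), γ_2 = 1 − γ_1 is assumed, and r_0 = 2 is the illustrative IRR-1 value; `step`, `periodic_shutdown_state`, and `slope_ratio` are hypothetical helper names.

```python
import math

W, R0, GAMMA1 = 0.724, 2.0, 0.95
GAMMA2 = 1.0 - GAMMA1   # assumption: gamma_1 + gamma_2 = 1
MEAN_LIFE_H = 9.58      # iodine mean life in hours
A = W + R0


def step(x0, y0, u, t):
    """Closed-form advance of (x, y) over time t at constant flux fraction u."""
    b = W + R0 * u                         # xenon decay plus burnout rate
    x_inf = (GAMMA1 + GAMMA2) * A * u / b  # xenon level approached as t grows
    k = GAMMA1 * A * (y0 - u) / (b - 1.0)  # transient driven by the iodine term
    y = u + (y0 - u) * math.exp(-t)
    x = x_inf + k * math.exp(-t) + (x0 - x_inf - k) * math.exp(-b * t)
    return x, y


def periodic_shutdown_state(days=30, on_h=15.0, off_h=9.0):
    """State (x, y) at the nightly shutdown after many on/off cycles."""
    x, y = 1.0, 1.0  # start from equilibrium
    for _ in range(days):
        x, y = step(x, y, 0.0, off_h / MEAN_LIFE_H)  # shutdown phase
        x, y = step(x, y, 1.0, on_h / MEAN_LIFE_H)   # operating phase
    return x, y


def slope_ratio(x, y):
    """Ratio s of the xenon slopes just before and just after shutdown."""
    before = -A * x + GAMMA1 * A * y + GAMMA2 * A  # (7.27)
    after = -W * x + GAMMA1 * A * y                # (7.28)
    return before / after


if __name__ == "__main__":
    xs, ys = periodic_shutdown_state()
    print("state at shutdown:", xs, ys)
    print("slope ratio s:", slope_ratio(xs, ys))
```

Changing `on_h` and `off_h` shows how far the shutdown state drifts from x = y = 1, and hence how far s departs from zero, for other operating schedules.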
FIG. 7.6. Xenon slope ratio for immediate shutdown under various initial conditions. [Panels, each plotted against time: shutdown following recent startup; shutdown following long periodic operation; shutdown following initial equilibrium (s = 0).]
7.5. Experimental Results

The IRR-1 reactor optimal shutdown experiment was conducted with a core configuration consisting mainly of substantially depleted fuel elements. This is because the experiment happened to be run at a time immediately prior to changing the core configuration by adding new fuel elements and operating at an increased power of 5 megawatts with a mean thermal flux of about 4 × 10^13 neutrons/cm²-sec.
The operating mean flux during the experiment was about 1.5 × 10^13 neutrons/cm²-sec, corresponding to a nominal power of 2 megawatts. With the “old” core configuration, the equilibrium xenon poison reactivity was about −4.60 dollars, as calculated from (7.4). The measured positive reactivity over and above the equilibrium poison level was only approximately 50 cents with the depleted core. Since this would be inadequate to override the xenon, except for a short time following immediate shutdown, it was decided to shut the reactor down for a period of two days, to allow the xenon to decay away naturally before starting the experimental optimal shutdown runs. Then upon subsequent startup, prior to full buildup of xenon to its equilibrium concentration, the optimal shutdown programs could be executed, since the available positive reactivity margin was then some 3 to 4 dollars. It was decided to attempt an optimal shutdown program to minimize the post-shutdown xenon maximum, problem (a), with a shutdown duration T = 1.04, corresponding to 10 hours. At the above flux level, the ratio of xenon override reactivity to equilibrium xenon reactivity was assumed to be about 1.30; i.e., the xenon override constraint was taken as −$6.00. Such a program will shift the xenon peak ahead in time from about 0.8 to 1.7, in units of iodine mean life. This will avoid coincidence of the xenon peak with the desired daily morning time of restart, as discussed in Section 7.1. The computed shutdown program for this set of parameters and the corresponding experimental shutdown program are shown in Fig. 7.7. The evident difference in the programs is due mainly to the fact that the measured shutdown is less abrupt than the computed shutdown, as discussed in Section 5.2. There is a slight lag in the startup portions as well, since the reactor cannot be restarted too quickly, for safety reasons.
Figure 7.7 also compares the computed and measured xenon poison reactivity during and following the shutdown control phase. The initial disparity occurs because the computed optimal shutdown xenon reactivity assumed an initial equilibrium state ẋ(0) = ẏ(0) = 0, whereas the experiment was conducted prior to the reactor having attained equilibrium xenon, owing to the paucity of positive reactivity, as mentioned above. The measured changes in the xenon reactivity are
FIG. 7.7. Computed and experimental optimal shutdown programs to minimize post-shutdown xenon maximum. Allowable shutdown duration T = 1.04 (10 hr), xenon override constraint $6.00, equilibrium flux φ_0 = 1.5 × 10^13 neutrons/cm²-sec. [Upper panel: power, watts, between 2 × 10^2 and 2 × 10^6, for the computed optimal shutdown and the experimental optimal shutdown (dashed). Lower panel: xenon poison reactivity versus time.]
also much less abrupt than those computed, as seen in Fig. 7.7. The reasons are given in Section 8.7, which discusses the experimental conclusions.
7.A. Appendix to Chapter 7.

7.A.1. Physical Parameters of IRR-1 Reactor at Initial Startup †

Reactor type: heterogeneous thermal
Power level: 1 MW
Fuel: 90 per cent enriched U²³⁵ (MTR fuel)
Lattice configuration: 90 holes, 9 × 10 array
Nominal thermal flux: 5.7 × 10^12 neutrons/cm²-sec
Moderator coolant: H₂O
Reflector: water, lead, and graphite
Shielding: water, lead, standard concrete, and barytes concrete
Coolant flow rate: 3400 liters/min
Average heat flux: 0.75 cal/cm²-sec
Design moderator water temperature: 35 °C
Cold moderator water temperature: 20 °C
Startup neutron source: plutonium beryllium (10^7 neutrons/sec)
Fuel elements in reference core at 1 MW operating power: 19 fuel elements, 6 control elements,‡ and 1 partial fuel element ‡
Cold clean excess reactivity of reference core (core composed of brand new fuel elements): $7.81
Cold clean critical mass assuming an infinite water reflector: 2.9 kilograms U²³⁵
Cold clean critical mass with 3-inch layer of graphite plus infinite water reflector: 2.2 kilograms U²³⁵
Operating fuel mass: 4.41 kilograms U²³⁵
Reactivity temperature coefficient at 35 °C: −1.091 cents/°C
Reactivity void coefficient of moderator: −0.094 cents/cm³ void
Reactivity fuel mass coefficient, 3 inches of graphite plus infinite water reflector: 1.78 cents/gram U²³⁵
Reactivity fuel mass coefficient with infinite water reflector: 1.45 cents/gram U²³⁵
Number of control rods: 5 shim safety rods and 1 regulating rod
Type of control rod motion: vertical, gravity-fall, motor-driven
Control rod absorbers: shim rods, boron carbide (B₄C) with cadmium liner; regulating rod, stainless steel
Minimum insertion time for 5-shim safety-rod scram: 0.5-0.55 sec
Rod removal rate: 5 shim rods, 7.7 cm/min; 1 regulating rod, 63 cm/min
Control rod reactivity:
    5 shim safety rods (at average of −$4.69/rod): −$23.40
    1 regulating rod: −$0.78
    Total: −$24.18
k_eff of reference core with all control rods fully inserted: 0.895
Maximum reactivity withdrawal rate (all 5 shim safety rods): 10.5 cents/sec
Design (positive) reactivity allowances for reference core at 1 MW:
    Equilibrium xenon-135: $2.87
    Equilibrium samarium-149: $1.43
    Temperature changes: $0.08
    Experimental facilities (beam holes, thermal column, etc.): $1.72
    Fuel burnup and low cross section fission-product accumulation: $1.70
    Total: $7.80
Maximum xenon reactivity at 1 MW (occurs 5 hours after immediate shutdown): −$3.25
Calculated prompt neutron lifetime: 53 μsec

† See refs. [25, 28].
‡ Control fuel elements (called semifuel elements herein) contain fuel in their lower half only. The (absorbing) control rods reciprocate in the empty upper half. Partial fuel elements are essentially semifuel elements so placed in the core configuration that they do not receive a control rod.
7.A.2. Fuel-Element Characteristics

Characteristic: Standard fuel element
Weight of U²³⁵ (grams): 196
U²³⁵ enrichment (per cent): 90
Fuel-sheet thickness: 0.51
Aluminium fuel-clad thickness (mm): 0.38
Effective fuel length (cm): 60
Unit cell area (cm²): 62.64
Water circulation gap between fuel elements (mm): 1.0
Weight of fuel per sheet (grams): 10.89
Water channel thickness between fuel sheets (mm): 3.12
Metal-to-water volume ratio: 0.54
Fuel sheets per element: 18
Volume fraction of water: 0.6470
Volume fraction of aluminium: 0.3502
Volume fraction of fuel: 0.0028
L², thermal diffusion area (cm²)
Neutron age (cm²)
See footnote $, p. 118.
Partial fuel element +
98 90 0.51 0.38 60 62.64
98 90 0.51 0.38 60 62.64
1.o 10.89
1.o 10.89
3.12 0.48 9 0.6768 0.3218 0.0014
3.12 0.54 9 0.6470 0.3516 0.0014
0.08625
0.05100
0.05074
1.3265
1.3831
1.3265
2.91 3 52.3
4.725 49.1
4.952 52.3
0.05998
0.02999
0.02999
0.0710
0.0355
0.0355
1 .Ooo
1.Ooo 2.46 0.6963 2.078 1.4470
1.Ooo 2.46 0.6998 2.078 1.4542
2.46 0.8234 2.078 1.711 t
Control fuel element +
CHAPTER 8
Results and Conclusions
8.1. Introduction and Xenon Unconstrained Extremals

The calculational results and conclusions will be examined first; those stemming from experimental verification will come later. The first obvious result is that essentially all the optimal flux shutdown programs are piecewise-constant, in other words, bang-bang. That is, to achieve optimal control for both problems (a) and (b), the reactor is immediately shut down and then pulsed to a maximum power, with the number of pulses and the pulse and no-pulse widths dependent on the system parameters. The fact that the optimal control behavior is bang-bang, instead of continuous, seems at first blush to belie intuition. However, from the equations of state (3.12) and (3.13), and the physical situation, this type of control is called for because the principal source of xenon is the decay of iodine, since the direct fission yield of iodine is some 20 times larger than that of xenon. This implies that to achieve optimal flux shutdown for either problem (a) or (b), the reactor should be immediately shut down as the first step to stop production of iodine, even at the expense of sustaining a rapidly increasing xenon concentration. This is accentuated in the case of high-flux reactors. To show this, assume that x_T/y_T is not large, which is the usual case, and for high-flux reactors, r_0 ≫ w; then from its definition in (5.3), the problem (a) cost functional is closely approximated by
Then, for example, at a flux level of φ_0 = 2 × 10^14 neutrons/cm²-sec,

\[ \Phi \approx 0.42\,(20\,y_T + x_T). \tag{8.2} \]

Hence, to obtain min_u Φ, it is imperative that the iodine concentration y_T, at the control termination state (x_T, y_T), be minimized. Similar results hold also to obtain min_u Ψ, since problem (a) is a special case of problem (b), as implied by their definitions. Thus, to minimize y_T, the production of iodine should be curtailed as soon as possible. With the reactor in the shutdown state, no iodine is being produced, but the xenon concentration is increasing relatively fast and approaching the override limitation. However, at the propitious instant, which is a function of the system parameters, the flux is immediately jumped to its maximum level, M. This instantaneously supplies neutrons to be absorbed by xenon, thus immediately dropping its concentration while momentarily maintaining the iodine level constant. But the inevitable production of iodine due to this rapid flux jump is reflected in a subsequent increased concentration of xenon which will ultimately, however, peak at a lower level in the post-shutdown phase, principally because its major source, the iodine, has been reduced at the end of the shutdown phase through reduction of the modulus of the termination state vector, |(x_T, y_T)|.

FIG. 8.1. Phase (xenon-iodine) space plot of extremal arcs for problem (b), plus coasting-phase arcs for a control duration, T = 1, and time of occurrence of minimum xenon, T_0 = 1, u_max = 1.0, φ_0 = 5 × 10^14, u = φ/φ_0. Time in units of iodine mean life. [Horizontal axis: iodine, equilibrium iodine units, I/I_0; curves labeled “nonoptimal trajectory, u ≡ 0” and “extremal final state, coasting phase begins.”]

In terms of the xenon-iodine phase space extremals, such as those depicted in Fig. 8.1, bang-bang control transfers the system from an initial extremal trajectory to a final (coasting) trajectory which peaks at a lower maximum than otherwise. This result is analogous to the control of a space vehicle being transferred by thrust pulses from an equilibrium state (Earth parking orbit) to an interplanetary trajectory, which is an extremal orbit in terms of satisfying a cost functional such as minimum total energy consumption. Another example of this type of transfer scheme is the four-pulse optimal shutdown program of Fig. 8.2. Four pulses are required in this case, since a lesser number
FIG. 8.2. Xenon poison reactivity and corresponding optimal flux shutdown policies to minimize maximum xenon poison reactivity for various shutdown control durations T. Equilibrium flux φ_0 = 2 × 10^14 neutrons/cm²-sec, u_max = 1.0. [Panels: T = 0 (immediate shutdown, φ = 0 for t > 0), T = 0.2, T = 0.5, and T = 1.0, each followed by a post-control phase. The equilibrium xenon concentration corresponds to a poison negative reactivity of $7.82. Horizontal axis: time, iodine mean life (9.58 hours), with 1.0, 2.0, and 3.0 marked as 575, 1150, and 1725 min.]
will not accomplish the trajectory transfer consistent with the parameters. A physical analog of this type of bang-bang control is a child’s swing. With a person pushing from behind with the correct tempo (correctly timed pulse train), the amplitude of the swing easily can be made to increase or decrease with push pulses of limited magnitude.
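The effect of a flux pulse on the post-shutdown xenon peak can be illustrated with a crude search over single-pulse programs. This is a sketch only, not the optimal multi-pulse programs computed in the text: it ignores the xenon override constraint, uses state equations for general u inferred from (7.17), (7.27), and (7.28), assumes γ_2 = 1 − γ_1, and takes r_0 = 24 as an assumed value for a flux near 2 × 10^14 neutrons/cm²-sec (scaled from r_0 = 2 at 1.67 × 10^13). All function names are hypothetical.

```python
import math

W, GAMMA1 = 0.724, 0.95
GAMMA2 = 1.0 - GAMMA1  # assumption: gamma_1 + gamma_2 = 1
R0 = 24.0              # assumed high-flux value of r_0
A = W + R0


def step(x0, y0, u, t):
    """Closed-form advance of (x, y) over time t at constant flux fraction u."""
    b = W + R0 * u
    x_inf = (GAMMA1 + GAMMA2) * A * u / b
    k = GAMMA1 * A * (y0 - u) / (b - 1.0)
    y = u + (y0 - u) * math.exp(-t)
    x = x_inf + k * math.exp(-t) + (x0 - x_inf - k) * math.exp(-b * t)
    return x, y


def coast_peak(x, y):
    """Largest xenon level reached while coasting at u = 0 from state (x, y)."""
    arg = W * (1.0 + (1.0 - W) * x / (GAMMA1 * A * y))
    t_peak = max(0.0, -math.log(arg) / (1.0 - W))  # peak time as in (7.35)
    return step(x, y, 0.0, t_peak)[0]


def best_single_pulse(T=1.0, n=10):
    """Grid search over one full-power pulse [t_on, t_off] inside [0, T]."""
    best = None
    for i in range(n):
        for j in range(i + 1, n + 1):
            t_on, t_off = T * i / n, T * j / n
            x, y = step(1.0, 1.0, 0.0, t_on)      # shut down
            x, y = step(x, y, 1.0, t_off - t_on)  # pulse at full power
            x, y = step(x, y, 0.0, T - t_off)     # shut down again
            peak = coast_peak(x, y)
            if best is None or peak < best[0]:
                best = (peak, t_on, t_off)
    return best


if __name__ == "__main__":
    print("peak after immediate shutdown:", coast_peak(1.0, 1.0))
    print("best single pulse (peak, t_on, t_off):", best_single_pulse())
```

Even this one-pulse family lowers the coasting peak substantially relative to immediate shutdown under the assumed parameters, which is the qualitative behavior the trajectory-transfer picture describes.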
8.2. Xenon Constrained Extremals

The approach taken to compute optimal flux shutdown programs is to obtain them first without the xenon override constraint. Then the xenon constraint is imposed and the resulting shutdown programs are compared with those without the xenon constraint. First it should be noted that the unconstrained extremals are perfectly valid, provided that the corresponding optimal flux shutdown programs are physically
FIG. 8.3. Xenon poison reactivity and corresponding optimal flux shutdown policies to minimize maximum xenon reactivity for various shutdown control durations T. Equilibrium flux φ_0 = 2 × 10^14 neutrons/cm²-sec, u_max = 2.0. [Panels: T = 0.2, T = 0.5, and T = 1.0, each followed by a post-control phase. Horizontal axis: time, iodine mean life (9.58 hours).]
realizable. By this is meant that for a given set of parameters, there exists sufficient partial xenon override reactivity to counteract the xenon poison reactivity at the point in the control phase at which the first flux pulse is called for in the shutdown program. If this is so, then the unconstrained shutdown program can be executed; otherwise not, and the system must then wait until enough xenon has decayed away naturally in order to restart at all, much less in any optimal manner. For example, as seen from Figs. 8.2 and 8.3, a physically realizable shutdown program can require an amount of xenon override capability ranging up to 60 dollars' worth of reactivity, where the xenon maximum occurs at about 70 dollars at this equilibrium-flux magnitude (φ_0 = 2 × 10^14 neutrons/cm²-sec). A second note is that the xenon constrained extremals are coincident with their unconstrained counterparts except for the portion of the extremal field cut off by the xenon override constraint, the horizontal line x = x_c. The optimal shutdown program respects this constraint by generating a suboptimal extremal arc consisting of sawtooth segments adjacent to the line x = x_c between its two intersections with the corresponding unconstrained arc, as sketched in Fig. 8.4. The sawtooth
FIG. 8.4. Minimax phase (xenon-iodine) space plot of extremal arcs plus coasting-phase arcs for control duration T = 1. φ_0 = 2 × 10^14 neutrons/cm²-sec, u_max = 1.0; cf. Fig. 8.2. Time in units of iodine mean life. [Horizontal axis: iodine, equilibrium units, I/I_0; the final state, where the coast phase begins, is marked.]
extremal corresponds to a flux pulse train which burns out the xenon as the sawteeth zigzag opposite the line x = x_c. Sawteeth occur because of the flux constraints, i.e., the bang-bang nature of the control, as well as the constraining equations of state trajectories.
8.3. Interdependence of Flux and Xenon Constraints

If the flux-upper-bound constraint M is relaxed, the sawteeth can be dispensed with and the xenon constraint line x = x_c itself can be made part of the extremal. This could be desirable from a practical point of
FIG. 8.5. Optimal flux shutdown program for problem (b), to minimize xenon concentration at end of shutdown duration T = T_0 = 1. Equilibrium flux φ_0 = 2 × 10^14 neutrons/cm²-sec. [Upper panel: flux u, showing the alternate flux program which makes the xenon constraint line x = x_c part of the suboptimal extremal. Lower panel: the alternate extremal arc including the xenon constraint line x = x_c during the shutdown control phase, with t_2 = 0.80 marked. Horizontal axis: time, iodine mean life (9.58 hours).]
view in that reactor operating personnel might object strenuously to a large number of pulses in any optimal flux shutdown regime. Problems (a) and (b) are terminal-control types of problems; the cost functional depends only on the state at the termination of the control phase, so that the “running” or average cost of the control itself is assumed free (cf. Section 2.5). Hence the cost criterion is unaffected by either mode of suboptimal control, i.e., with or without sawteeth for a xenon constrained extremal. If the sawteeth are superseded in favor of the xenon constraint itself, then the line x = x_c connects to the unconstrained extremal at the points marked + in Fig. 8.4. However, the flux-upper-bound constraint must be relaxed along the x = x_c arc, or else u ≤ 0 for certain sets of the parameters. On the arc x = x_c, ẋ = 0 holds, which together with the state equations (3.12) and (3.13) results in a differential equation whose solution yields an expression for the shutdown flux u on x = x_c. It is

\[ u(t) = \cdots \ge M, \qquad t_1 < t < t_2, \tag{8.3} \]

where without loss of generality a high-power reactor is assumed, implying r_0 ≫ w and x_c ≫ y_c. The time t_1 is the instant when the unconstrained extremal first intersects the line x = x_c, while t_2 is the time at which the constrained extremal leaves x = x_c, as marked by + signs in Figs. 8.4 and 8.5. Both t_1 and t_2 are measured from initial equilibrium. At t_1 the flux is jumped from zero to its value given by (8.3); i.e., u(t_1) > M. Then u drops exponentially until time t_2, when u(t_2) = M, thus coinciding with the value of u on the unconstrained extremal. This is shown in Fig. 8.5 for the specific example M = 1.
8.4. Two Types of Optimal Shutdown Payoffs

Another result aids in partially answering the question of whether or not optimal flux shutdown programs pay off in terms of possible fuel savings. For a series of unconstrained xenon minimax cases at equilibrium flux φ_0 = 2 × 10^14 neutrons/cm²-sec, the xenon peak as a function of the shutdown duration T is plotted in Fig. 8.6 and is fitted by the empiricism min x_p = −70 exp(−0.7T) dollars. At this equilibrium flux level, immediate shutdown (T = 0) yields a min x_p = −70 dollars,
FIG. 8.6. Maximum xenon reactivity and its time of occurrence following minimax flux shutdown phase. Equilibrium flux φ_0 = 2 × 10^14 neutrons/cm²-sec, u_max = 1.0. [Horizontal axis: flux shutdown duration time T, iodine mean life (9.58 hours), from 0 to 1.0, with a second scale in hours.]
while if T = 1, corresponding to 9.58 hours of shutdown duration, min x_p ≈ −34.8 dollars. Also the time at which the maximum occurs is increased for the optimized shutdown. The 11-hour xenon maximum for zero shutdown time (immediate shutdown) has been shifted to 20 hours, for 9.58 hours of allowed shutdown time, as can be seen from Fig. 8.6. For high-flux reactors with normal fuel loads, and Σ_fuel/Σ_moderator ≫ 1, Eq. (1.7) asserts that the amount of additional fuel required to override the xenon peak poison reactivity is proportional to min x_p. Hence the amount of fuel needed will be approximately halved with the above parameters if optimal shutdown programs are used. If it is desired to override the xenon concentration at will in the post-shutdown phase (complete override), Eq. (1.7) reveals that, at the above flux level for an enriched-U²³⁵, water-moderated reactor, approximately 10.1 times as much fuel is required as called for by the critical mass equations. For example, if the critical mass is 6 kilograms, 61 kilograms are needed for complete override, whereas only 34 kilograms would be required if optimal shutdown programs are used with 9.58-hour shutdown durations. This would correspond to a saving of 324,000 dollars at the prevailing U.S. rate of 12,000 dollars per kilogram for enriched U²³⁵. What is done for economic reasons in high-power thermal reactors, except in military applications, is to settle for a very limited partial override situation, so that the fuel inventory is maintained at reasonably low levels, roughly 2 to 3 critical masses. This corresponds
to being able to restart within only an hour or less following immediate shutdown. Similar results are obtained for the minimization of the post-shutdown xenon concentration, problem (b). This can be seen from the similarity of the functions Φ and Ψ, as discussed previously. In terms of IRR-1 reactor scheduling, it will be seen that optimal shutdown flux programs can be used to advantage. Under normal daily operation, the IRR-1 is shut down each night at about 2300 hours. For the anticipated higher power operation at 5 megawatts, the post-shutdown xenon maximum will be larger than its magnitude at the present power of 2 megawatts. More important, the time at which the post-shutdown xenon maximum occurs following an immediate nonoptimal shutdown will increase from approximately 7.7 to 9 hours. This will unfortunately coincide with the daily morning startup time. For protracted power operation at 5 megawatts, it is doubtful that sufficient xenon-peak-override capability can be maintained in this reactor to restart the following morning. Therefore, an optimal shutdown program of the minimax type will serve to lower the xenon peak concentration occurring at the early morning startup hour, shifting the xenon peak ahead in time, as seen from Fig. 8.3. Or, an optimal shutdown program of the type of problem (b) can be invoked which will minimize the post-shutdown xenon concentration at a time corresponding to early morning startup. Both types of optimal shutdown programs are equivalent in this case, in that they both shift the xenon peak sufficiently far ahead in time to depress the post-shutdown xenon at the desired startup time.
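The fuel-saving arithmetic of this section can be retraced in a few lines. The sketch below simply encodes the empirical fit for the xenon peak and the quoted masses and price; the function and constant names are illustrative.

```python
import math


def min_peak_dollars(T):
    """Empirical fit from the text: minimax xenon peak (dollars) for duration T."""
    return -70.0 * math.exp(-0.7 * T)


CRITICAL_MASS_KG = 6.0    # example critical mass from the text
OVERRIDE_FACTOR = 10.1    # fuel multiple quoted for complete override
OPTIMAL_FUEL_KG = 34      # fuel quoted for optimal 9.58-hour shutdown programs
PRICE_PER_KG = 12_000     # dollars per kilogram of enriched uranium, as quoted

full_override_kg = round(CRITICAL_MASS_KG * OVERRIDE_FACTOR)
saving_dollars = (full_override_kg - OPTIMAL_FUEL_KG) * PRICE_PER_KG

if __name__ == "__main__":
    print("peak at T = 0:", min_peak_dollars(0.0))
    print("peak at T = 1:", min_peak_dollars(1.0))
    print("peak ratio:", min_peak_dollars(1.0) / min_peak_dollars(0.0))
    print("full override fuel (kg):", full_override_kg)
    print("saving (dollars):", saving_dollars)
```

The peak ratio at T = 1 comes out close to one half, which is the basis for the statement that the required fuel is approximately halved.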
8.5. Short Allowable Shutdown Durations

A third general result, which is seen by examining the figures, is that for short shutdown durations (T ≤ 0.20) the minimax xenon reactivity magnitude, problem (a), is relatively insensitive to the shutdown policy. Thus for shutdown durations of two hours or less, nothing much can be done to “minimax” the xenon reactivity poison. For short allowable shutdown durations, problem (b) is probably more apropos in that the corresponding criterion is only to minimize the xenon reactivity at a given post-shutdown time, especially so if it is desired to accomplish this precisely at the termination of the shutdown duration T.
8.6. Strongly Limited Xenon Override Shutdown
For many thermal reactors, such as the IRR-1 reactor, the fuel inventory is relatively small (about 2 to 2.5 critical masses), so that the corresponding reactivity available to override xenon is also small. In general this can be expressed in terms of the ratio of available xenon override reactivity K_c to the xenon reactivity at equilibrium operating conditions, K_x0. As can be appreciated from Figs. 8.7 to 8.11, the smaller this ratio for a given shutdown duration, the greater the number of pulses required for optimal shutdown. This is also depicted in Fig. 8.12, where the ratio K_c/K_x0 is plotted as a function of the percentage of time that the reactor is on, i.e., at operating power, in the shutdown duration. It is seen that as K_c/K_x0 approaches unity, the reactor approaches its limit of not being shut down. That is, if there is no xenon override reactivity available, no optimal shutdown programs exist, as the reactor cannot be restarted at all following shutdown. It is also seen from Figs. 8.7 to 8.11 that, as the allowable shutdown duration is reduced, the average pulse widths are larger. This indicates that the reactor must remain at operating power a greater percentage of the time during the shutdown control phase. This is required to burn out the xenon, as there is only a relatively small amount of xenon override reactivity available. However, the fact that the reactor is at operating power so frequently during the shutdown duration does not allow the iodine concentration to be decreased very much. The iodine concentration must be decreased to obtain a substantial post-shutdown xenon minimum. Hence, such a minimum will not be realized. This is another way of saying that the allowable shutdown duration T is too short, as discussed in Section 8.5.
The limited xenon override reactivity available, especially in the cases of short allowable shutdown duration, is reflected in the only slight reduction in minimax xenon attained compared to immediate shutdown, as shown in Figs. 8.7 to 8.11. If the flux maximum constraint M can be relaxed, much of the off-on behavior due to the large number of pulses required for optimal shutdown could be eliminated. That is, the flux would be increased to greater-than-rated operating power over part of the extremal, as described in Section 8.3, to burn out the xenon as the alternative to a long flux pulse train. However, for small available xenon override reactivity, this might prove difficult, as the extent to which the flux constraint must be relaxed might be intolerable. This can be easily discerned by examining Eq. (8.3) and Figs. 8.4 and 8.5.

[Figure 8.7: flux u (0 to 1.0) for $30, $25, and $15 override, and xenon versus time in iodine mean lives (9.58 hours). Scale I covers the shutdown control phase, Scale II the coasting phase. The xenon peak following step shutdown is marked.]
FIG. 8.7. Optimal shutdown for post-shutdown xenon minimax. Allowable shutdown duration T = 1 (9.58 hr). Equilibrium flux \(\Phi_0 = 2 \times 10^{14}\) neutrons/cm²-sec, \(u_{\max} = 1.0\).

[Figure 8.8: flux u (0 to 1.0) for $65, $45, $30, and $15 override, and xenon versus time in iodine mean lives (9.58 hours). Scale I covers the shutdown control phase, Scale II the coasting phase.]
FIG. 8.8. Optimal shutdown for post-shutdown xenon minimum (at end of shutdown duration). \(T = T_0 = 1\), \(\Phi_0 = 2 \times 10^{14}\) neutrons/cm²-sec.

[Figure 8.9: flux u (0 to 1.0) for $13, $10, and $8.30 override, and xenon versus time in iodine mean lives (9.58 hours). Scale I covers the shutdown control phase, Scale II the coasting phase. The nonoptimal step shutdown xenon peak is marked.]
FIG. 8.9. Optimal shutdown for post-shutdown xenon minimax. Allowable shutdown duration T = 1 (9.58 hr). Equilibrium flux \(\Phi_0 = 4.2 \times 10^{13}\) neutrons/cm²-sec, \(u_{\max} = 1.0\).

[Figure 8.10: flux u (0 to 1.0) for $7.80, $9.00, and $12.50 override, and xenon versus time in iodine mean lives (9.58 hours). Scale I covers the shutdown control phase, Scale II the coasting phase. The nonoptimal step shutdown xenon peak is marked.]
FIG. 8.10. Optimal shutdown for post-shutdown xenon minimax. Allowable shutdown duration T = 0.5 (4.79 hr). Equilibrium flux \(\Phi_0 = 4.2 \times 10^{13}\) neutrons/cm²-sec, \(u_{\max} = 1.0\).

[Figure 8.11: flux u (0 to 1.0) for $7.00, $8.75, and $10.80 override, and xenon versus time in iodine mean lives (9.58 hours), Scales I and II. The nonoptimal step shutdown xenon peak level is marked.]
FIG. 8.11. Optimal shutdown for post-shutdown xenon minimax. Allowable shutdown duration T = 0.3 (2.87 hr). Equilibrium flux \(\Phi_0 = 4.2 \times 10^{13}\) neutrons/cm²-sec, \(u_{\max} = 1.0\).

[Figure 8.12: ratio \(K_c/K_{x0}\) versus percentage of time reactor is on during shutdown phase.]
FIG. 8.12. Ratio of xenon override reactivity to xenon equilibrium reactivity versus number of pulses in shutdown program. The latter is in terms of percentage of “on” time during shutdown phase to minimax xenon, problem (a).
8.7. Conclusions of Experimental Investigation

The optimal shutdown experiment is described in Section 7.5, where the results are also given. With the anticipated increase in power of the IRR-1 reactor to 5 megawatts, the post-shutdown xenon maximum will occur almost at the desired morning startup time, so that commencing daily operation then necessitates a shift in the time of occurrence of the xenon peak. This is easily accomplished by using an optimal shutdown program that minimizes the xenon maximum, problem (a). As discussed in Section 8.4, the shutdown program to minimize the xenon at a given post-shutdown time, problem (b), can be used as well. As seen from Figs. 8.7 to 8.11, the shutdown programs for problem (a) or (b) are quite similar when strong xenon override constraints are imposed, other parameters being equal.

From Fig. 7.7 it is seen that there are differences between the computed optimal shutdown program and the one that is experimentally realized. This is because (1) the mechanical inertia of the control-rod system and the delayed neutrons constrain the shutdown rate of the reactor, as discussed in Section 5.2; and (2) the startup rate is constrained for safety reasons, so that the reactor does not pass through the desired equilibrium state (criticality at desired operating power) too quickly. If this happens, the reactor can become supercritical (k > 1) for a dangerously long time interval, presaging a reactor-core-meltdown accident. For reasons discussed in Section 7.5, the experimental shutdown program was initiated from a nonequilibrium state; i.e., \(\dot{x}(0) \neq 0\), \(\dot{y}(0) \neq 0\). This produced the initial “transient” in the measured xenon poison reactivity seen in Fig. 7.7. In general, however, the measured xenon reactivity changes are less abrupt than those computed, owing to the less abrupt experimentally realized optimal shutdown program.

The xenon peak time has been shifted from 0.8 (in units of iodine mean life) for an immediate nonoptimal shutdown to 2.0 using the optimal shutdown program described in Section 7.5. However, the magnitude of the peak has remained essentially the same as its nonoptimal shutdown value, because of the strong xenon override constraint imposed. The ratio of the xenon override constraint to the equilibrium xenon level was about 1.30, which is quite near the limiting value for effective optimal shutdown programs, as discussed in Section 9.1.
If no automatic control mechanism is available to exercise such a protracted optimal shutdown program (10 hours), T can be decreased, so that the operator on the late evening shift can manually exercise the shutdown program prior to shutting the reactor down for the night. Shorter optimal shutdown durations T will not depress the xenon reactivity at the termination of shutdown as much as the experiment under discussion (T= 1.04). How small T can be made depends on how much xenon override reactivity is available, which in turn depends on the age and status of the core configuration. Again, see Section 9.1.
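The dependence of the post-shutdown xenon peak on the state reached at the end of the shutdown program can be made concrete with a short calculation. The sketch below assumes the normalized equations of state with the flux set to zero after cutoff, dx/dt = -w x + g1 (w + r0) y and dy/dt = -y, whose solution gives the peak delay and height in closed form. The function names and the numerical constants (w, r0, g1) are illustrative assumptions, not values taken from the experiment.

```python
import math

def xenon_after_shutdown(tau, x_T, y_T, w=0.72, r0=18.0, g1=0.95):
    """Xenon level a time tau after the flux is cut to zero (requires w != 1)."""
    a = g1 * (w + r0) * y_T
    return (x_T * math.exp(-w * tau)
            + a * (math.exp(-w * tau) - math.exp(-tau)) / (1.0 - w))

def xenon_peak(x_T, y_T, w=0.72, r0=18.0, g1=0.95):
    """Return (tau_p, x_p): delay and height of the post-shutdown xenon peak.

    (x_T, y_T) is the xenon-iodine state when the flux is finally cut.
    Assumes 0 < w < 1, as for xenon-135 and iodine-135.
    """
    a = g1 * (w + r0) * y_T
    if w * x_T >= a:
        # Xenon is already decaying at cutoff, so the maximum is x_T itself.
        return 0.0, x_T
    rhs = w * (1.0 + (1.0 - w) * x_T / a)
    tau_p = -math.log(rhs) / (1.0 - w)
    x_p = g1 * (1.0 + r0 / w) * y_T * math.exp(-tau_p)
    return tau_p, x_p


if __name__ == "__main__":
    for y_T in (1.0, 0.75, 0.5):
        tau_p, x_p = xenon_peak(1.0, y_T)
        print(f"iodine {y_T:.2f}: peak {x_p:.2f} at {tau_p:.2f} iodine mean lives")
```

Because the peak height is proportional to the iodine level at cutoff, a program that cannot reduce the iodine (for example, because T is short or the override margin is small) cannot lower the peak appreciably.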
Since one principal reason for nightly shutdown is to conserve fuel, perhaps a better shutdown criterion would be one that takes into account fuel usage (burnup). Such a criterion is investigated in Section 9.3. In these experiments the ratio of no-pulse duration to pulse duration during shutdown is effectively 3/5; hence the reactor is down only 37.5 per cent of the time during the shutdown control phase. Again, this is because of the imposition of the strong xenon constraint, which forces the reactor “on” frequently to burn out the xenon to maintain at-will startup capability. Figure 8.13 depicts how the optimal shutdown program discussed would be executed on a daily basis.

It is interesting to examine the optimal shutdown programs to minimize the xenon reactivity in the post-shutdown period, problem (b), for the case of no xenon override constraint. From previous discussion, the latter can be construed as assuming that sufficient xenon reactivity override exists at the onset of the first pulse required by the optimal shutdown program, so that the particular shutdown program is realized. Figure 8.14 depicts such optimal shutdown programs and their resultant xenon reactivity changes for various shutdown durations T, with the post-shutdown time of xenon minimization \(T_0 = 1.0\) (9.58 hours). It is evident that the longer shutdown durations are more effective in reducing the xenon reactivity at the given post-shutdown time. This is simply because the reactor is operating at full power more of the time, so that more xenon is burned out. This is especially true in the case where \(T = T_0 = 1.0\) in Fig. 8.13, where the reactor is effectively “off” only 40 per cent of the shutdown control phase. This is seen to be the case, since the reactor is on during the last portion of the shutdown, which is immediately contiguous to the daily “on” period.

FIG. 8.13. Computed optimal flux shutdown to minimize xenon at \(T_0 = 1\) (9.58 hr) for various shutdown durations T. Sufficient override reactivity assumed available at onset of first pulse. Equilibrium flux \(\Phi_0 = 0.833 \times 10^{13}\) neutrons/cm²-sec.

FIG. 8.14. Suggested daily optimal shutdown program to minimax xenon in IRR-1 reactor for depleted cores.
CHAPTER 9
Summary and Equivalences
9.1. Reprise

The previous chapters have described the formulation, behavior, and optimal control policies of a substantive control process occurring in the field of nuclear reactor physics and engineering. It is that of controlling the reactor flux or power in order to shut the reactor down while minimizing the effect of xenon poison, which is a concomitant part of thermal-reactor operation. We have brought to bear one of the newer and more powerful methods, dynamic programming, for obtaining optimal power shutdown policies or programs to accomplish this task. Without belaboring the obvious, such programs should play a very important role in the control and operation of such reactors, since to date there are no generally accepted measures to determine how, and to what extent, the xenon override problem should be handled.

In large thermal reactors built thus far, only sufficient additional fuel has been provided to override the xenon poison for a very short time, compared to the mean life of iodine-135, following shutdown. Such times are of the order of up to 1 hour following immediate shutdown. This is because of the inordinately large amounts of additional fuel required to cope with the xenon poison when no optimal shutdown program is used. As seen in Chapter 2, the amount of fuel inventory required to override the post-shutdown xenon at will ranges from one to two orders of magnitude over that needed to counter the xenon poison at steady-state operation. Even the latter is often two to three times the amount of fuel called for by the criticality equations used in reactor-design computations. Besides the very great expense (hundreds of thousands of dollars at the prevailing U.S. price of $12,000/kilogram of U-235) involved in tying up such a large fuel inventory, the control of such an overfueled reactor poses difficulties from the operational safety point of view. This is because such a system possesses dangerously large amounts of potentially positive reactivity, which could cause the neutron population to multiply at a stupendous, exponentially increasing rate if the reactor became supercritical (k > 1) for a sustained period. Of course such a situation would only occur during a reactor accident. In general, to prevent the occurrence of such accidents, very elaborate precautions must be, and are, taken.

It is felt, as has been stated often enough, that the xenon override difficulty can be alleviated through the use of the optimal shutdown programs described in the previous chapters. This was demonstrated analytically in the description and consequent numerical computation of such programs using the dynamic-programming algorithm. It was also demonstrated experimentally on the (albeit low-to-medium power) IRR-1 reactor described in Chapter 7. For this reactor, considering the anticipated increase in its power, it will be sufficient to shift the xenon peak ahead in time, which is accomplished through optimal power shutdown to minimize the xenon maximum, also discussed in Chapter 7.

From the operational point of view, optimal flux shutdown to control xenon poison provides an example of a process that has sufficiently complicated features to make its solution one which cannot be deduced in detail intuitively. Even though the general nature of the optimal shutdown can be deduced from physical grounds, to obtain the quantitative aspects would prove quite difficult indeed without performing extensive “cut and try” optimal power shutdown experiments on an actual reactor.
As is by now evident, one must pulse the reactor to quickly reduce the iodine concentration, to ultimately minimize the xenon poison after shutdown. Upon initial shutdown there is an immediate increase of xenon which must be balanced against the simultaneous desired decrease of iodine. Hence there is a point at which the reactor must be pulsed to burn out the xenon to maintain control flexibility by respecting the xenon override constraint. However, this procedure increases the iodine concentration. How and when to pulse the reactor, within the limits imposed by the xenon override constraint, is not something that can normally be intuited from physical feeling about the vagaries of xenon behavior in large high-power thermal reactors.

Inroads into the determination of optimal shutdown policies also can be obtained by using the methods of the maximum principle described in Chapters 4 and 5. However, the resulting formulation yields a formidable two-point boundary-value problem which is difficult to handle from the computational point of view. Analytical solution to obtain optimal flux shutdown programs using the maximum principle is out of the question, as hardly more than tutorial examples can be solved in this manner. However, the general nature of the control regimen (whether bang-bang or not) can be determined by inspection of the analytical formulation in terms of either the maximum principle or the principle of optimality of dynamic programming. As mentioned in Chapter 5, certain trial-and-error procedures can be used in conjunction with the maximum principle to obtain optimal shutdown policies, but it is more natural to use a straightforward algorithm as provided by dynamic programming, one with no cumbersome two-point boundary-value difficulties.

There is a kind of equivalence between the optimal control policy corresponding to the xenon minimax criterion and that of, for example, the criterion of minimum-time optimal control. These and other aspects, including the relationship between the use of control criteria singly and in combination, as well as the equivalence between the maximum principle and the optimality principle of dynamic programming, are discussed in Sections 9.2 and 9.3, respectively. As to the question of the efficacy of optimal shutdown programs, they can be employed not only on high-power thermal reactors, but on those of low to medium power as well.
In the case of the IRR-1 reactor, it was seen that the use of an optimal shutdown program, either to minimize the post-shutdown xenon maximum or to minimize the xenon at a given post-shutdown time, will greatly facilitate its normal daily operation. This is necessary because, at the anticipated increase in power operation, the amount of reactivity available to override xenon will be small, and the time of occurrence of the post-shutdown xenon peak will almost coincide with the desired daily time of restarting the reactor. For the case of high-flux thermal reactors, up to approximately 50 per cent less fuel is needed if optimal shutdown programs are employed to override post-shutdown xenon at will. Lesser results are obtained for shorter shutdown duration times and more restrictive xenon override constraints. Generally, optimal shutdown programs are to no avail if the shutdown duration time is less than 2 hours, and/or the xenon override constraint is less than approximately 1.5 times the xenon poison reactivity at equilibrium operating power.

9.2. Equivalence between the Optimality Principle and the Maximum Principle
There is an equivalence between the optimality principle of dynamic programming and the maximum principle, which is readily apparent when both principles are applied to find the solution to a particular optimal control process [21]. This equivalence will be demonstrated by first using the principle of optimality to derive the functional equation of dynamic programming for a general class of optimal control processes. Using this functional equation, the adjoint equations of the maximum principle will follow, as well as the statement of the maximum principle itself. Consider the general control process of finding the optimal control policy that minimizes

\[ J = \int_0^T g(x_1, x_2, \ldots, x_N;\; u_1, u_2, \ldots, u_m)\,dt \qquad (m \le N), \tag{9.1} \]

where T is the allowable duration of control, \(\bar{x} = (x_1, x_2, \ldots, x_N)\) is the state vector, and \(\bar{u} = (u_1, u_2, \ldots, u_m)\) is the control vector. The components \(x_i\) of the state vector satisfy the following equations of state (equations of motion):

\[ \dot{x}_i = f_i(x_1, x_2, \ldots, x_N;\; u_1, u_2, \ldots, u_m) \qquad (i = 1, \ldots, N). \tag{9.2} \]

That is, it is desired to find the optimal control policy \(\bar{u}^* = (u_1^*, u_2^*, \ldots, u_m^*)\) that yields

\[ J^* = \int_0^T g(x_1, x_2, \ldots, x_N;\; u_1^*, u_2^*, \ldots, u_m^*)\,dt = \min, \tag{9.3} \]

where the \(\{x_i\}\) satisfy equations (9.2) with initial conditions \(\bar{x}(0) = (c_1, c_2, \ldots, c_N)\) such that the system proceeds to some given final state, \(\bar{x}_T\), at the end of the given control duration, T. Let a “cost,” or criterion, function \(S(c_1, c_2, \ldots, c_N; T)\) be defined as

\[ S(c_1, c_2, \ldots, c_N; T) = \min_{\bar{u}} \int_0^T g(x_1, x_2, \ldots, x_N;\; u_1, u_2, \ldots, u_m)\,dt. \tag{9.4} \]
Using the principle of optimality of dynamic programming allows one, as explained in previous chapters, to write the following functional equation for S:

\[ S(c_1, c_2, \ldots, c_N; T) = \min_{\bar{u}} \left[ g(c_1, c_2, \ldots, c_N;\; u_1, u_2, \ldots, u_m)\,\Delta + S(c_1 + \dot{c}_1\Delta, \ldots, c_N + \dot{c}_N\Delta;\; T - \Delta) \right]. \tag{9.5} \]

This equation enunciates the optimality principle in that the cost functional S is given by the minimum, over the allowable \(\{u_i\}\), of the sum of two terms on the right side of (9.5). The first term is the cost of control for a duration of \(\Delta\) units of time, and the second is the cost of a “new” optimal control process \(S(c_1 + \dot{c}_1\Delta, c_2 + \dot{c}_2\Delta, \ldots, c_N + \dot{c}_N\Delta;\; T - \Delta)\), beginning in the state \((c_1 + \dot{c}_1\Delta, c_2 + \dot{c}_2\Delta, \ldots, c_N + \dot{c}_N\Delta)\) but of duration \(T - \Delta\) units of time. In the usual manner, expansion of the second term on the right side of (9.5) about the initial state \((c_1, c_2, \ldots, c_N)\) yields

\[ S(c_1, c_2, \ldots, c_N; T) = \min_{\bar{u}} \left[ g(c_1, c_2, \ldots, c_N;\; u_1, u_2, \ldots, u_m)\,\Delta + S + \sum_{i=1}^{N} \frac{\partial S}{\partial c_i}\,\dot{c}_i\,\Delta - \frac{\partial S}{\partial T}\,\Delta + O(\Delta^2) + \cdots \right]. \tag{9.6} \]
Canceling S on both sides of (9.6), dividing the result by \(\Delta\), taking the limit of zero \(\Delta\), and substituting from the equations of state (9.2) results in Bellman's equation, viz.,

\[ \frac{\partial S}{\partial T} = \min_{\bar{u}} \left[ g(c_1, c_2, \ldots, c_N;\; u_1, u_2, \ldots, u_m) + \sum_{i=1}^{N} \frac{\partial S}{\partial c_i}\, f_i(c_1, c_2, \ldots, c_N;\; u_1, u_2, \ldots, u_m) \right]. \tag{9.7} \]

This is a partial differential equation from which the optimal control vector \(\bar{u}^*\) can be obtained in principle, together with the minimum cost, S. As already discussed a number of times, this equation is too difficult to solve in closed form, except for tutorial examples, so that one must resort to numerical computations, using its discrete analogue, Eq. (9.5). Now define, for convenience, an additional component, \(x_{N+1}\), of the state vector \(\bar{x}\) by letting \(x_{N+1}\) satisfy

\[ \dot{x}_{N+1} = g(x_1, x_2, \ldots, x_N;\; u_1, u_2, \ldots, u_m) \equiv f_{N+1}. \tag{9.8} \]

It should be noted that the right side of (9.8) does not depend explicitly on \(x_{N+1}\). This fact plays a role in the formulation of the solution using the maximum principle. Here \(x_{N+1}\) is introduced so that the resemblance between the two principles will become more apparent; with the convention \(\partial S/\partial c_{N+1} = 1\), Bellman's equation (9.7) can be rewritten

\[ \frac{\partial S}{\partial T}(c_1, c_2, \ldots, c_N; T) = \min_{\bar{u}} \sum_{i=1}^{N+1} \frac{\partial S}{\partial c_i}\, f_i(c_1, c_2, \ldots, c_N;\; u_1, u_2, \ldots, u_m). \tag{9.10} \]

To obtain the correspondence between this equation and the maximum principle, define a set of adjoint variables as

\[ p_i(t) = \frac{\partial S}{\partial c_i} \qquad (i = 1, \ldots, N+1). \tag{9.11} \]
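From the definition of \(p_i(t)\) in (9.11), its total derivative is

\[ \dot{p}_i(t) = \sum_{j=1}^{N+1} \frac{\partial^2 S}{\partial c_i\,\partial c_j}\, f_j^* - \frac{\partial^2 S}{\partial c_i\,\partial T}. \tag{9.12} \]

Furthermore, along an extremal in the phase space, (9.10) asserts that

\[ \frac{\partial S}{\partial T} = \sum_{i=1}^{N+1} \frac{\partial S}{\partial c_i}\, f_i^* = \min = \text{constant}, \tag{9.13} \]

where the optimal control vector \(\bar{u}^*\) has been inserted in the \(f_i\), giving \(f_i^* \equiv f_i(c_1, c_2, \ldots, c_N;\; u_1^*, u_2^*, \ldots, u_m^*)\). Differentiation of (9.13) with respect to, e.g., \(u_k^*\), assumed continuous, gives

\[ \sum_{i=1}^{N+1} \frac{\partial S}{\partial c_i}\,\frac{\partial f_i^*}{\partial u_k^*} = 0. \tag{9.14} \]

If (9.13) is also differentiated with respect to \(c_i\) (with the summation index relabeled j), the result is

\[ \sum_{j=1}^{N+1} \left[ \frac{\partial^2 S}{\partial c_i\,\partial c_j}\, f_j^* + \frac{\partial S}{\partial c_j}\,\frac{\partial f_j^*}{\partial c_i} + \frac{\partial S}{\partial c_j}\,\frac{\partial f_j^*}{\partial u_k^*}\,\frac{\partial u_k^*}{\partial c_i} \right] - \frac{\partial^2 S}{\partial c_i\,\partial T} = 0. \tag{9.15} \]

From (9.14) it is seen that the third term of (9.15) vanishes. Comparing (9.15) with (9.11) and (9.12) supplies the equation that the adjoint variables \(p_i\) must satisfy, viz.,

\[ \dot{p}_i(t) = -\sum_{j=1}^{N+1} p_j\,\frac{\partial f_j}{\partial c_i}. \tag{9.16} \]

Identifying a Hamiltonian, after Pontryagin, as

\[ H = -\sum_{i=1}^{N+1} p_i f_i, \tag{9.17} \]

then along an extremal, from (9.13), H must satisfy

\[ H^* = -\sum_{i=1}^{N+1} p_i f_i^* = \max = \text{constant}. \tag{9.18} \]

That is, the maximum principle asserts that for a system satisfying equations (9.2), the optimal control that minimizes J also maximizes H, where \(-H = \sum_{i=1}^{N+1} p_i f_i\) and where the \(p_i\) are defined by (9.16). As can now be seen from (9.10), Bellman's equation is the Hamilton-Jacobi equation, i.e., along an extremal

\[ \frac{\partial S}{\partial T} + H^* = 0, \tag{9.19} \]

and the adjoint equations are the canonical equations of Hamilton. In other words, from (9.16), the definition of H, and identifying \(c_i\) with \(x_i\) yields

\[ \dot{p}_i(t) = \frac{\partial H}{\partial x_i}, \qquad \dot{x}_i = -\frac{\partial H}{\partial p_i} \qquad (i = 1, \ldots, N+1). \tag{9.20} \]

9.3. Comparison of Optimal Shutdown Criteria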
As discussed in Section 2.5, there are two general types of control criterion functionals. They are the averaged or integral type, of which

\[ \min_u J[u] = \min_u \int_0^T L(\bar{x}, u)\,dt \tag{9.21} \]

is typical. Then there is the terminal control criterion, of which optimal xenon shutdown, as presented in the foregoing chapters, is a case in point. For the latter, it is desired to minimize a “cost” functional of the final state only. That is,

\[ \min_u J_1[u] = \min_u \Phi(\bar{x}(T), u) \quad \text{or} \quad \min_u \Psi(\bar{x}(T), u), \tag{9.22} \]

where \(\bar{x}(T)\) is the state of the system at the termination of control and T is the shutdown control duration.

Now, to appreciate how the interaction of the constraints influences the character of the optimal xenon shutdown programs, consider initially the xenon minimax criterion, problem (a), with no xenon override constraint but with given flux constraints. It should be noted that any formulation with no constraints whatever is not physically meaningful, as discussed in Section 4.1. Then for no xenon override constraint (complete xenon override capability) but with bounded flux, an optimal shutdown program and a final state \((x_T, y_T)\) can be determined which corresponds to a given xenon “minimax” \(\Phi(x_T, y_T)\). To attain the final state \((x_T, y_T)\), one can proceed along an extremal from the initial equilibrium state over a portion consisting of sawtooth segments, which is characteristic of the flux constraints and is depicted in Figs. 8.7 to 8.11. Or, the same final state \((x_T, y_T)\) can be achieved if the flux constraint is relaxed to allow part of the xenon override constraint boundary line to become part of the extremal. The optimal flux \(u^*\) is continuous (exponential) and exceeds the upper flux bound over this portion of the extremal. This is shown in Figs. 8.4 and 8.5 and discussed in Section 8.3.

Whether to choose the sawteeth or the xenon override constraint boundary as part of the extremal depends on the particular reactor system. If the reactor is operating at an equilibrium which is close to its maximum upper power bound, then the sawteeth are used, since the flux constraint cannot be relaxed. On the other hand, if the reactor is operating at a conservatively rated equilibrium power, including the xenon override constraint as part of the extremal would eliminate a number of pulses, as can be seen from Figs. 8.7 to 8.11. This would perhaps simplify the practical difficulties of requiring a complex shutdown control system and allow the reactor to exceed rated power temporarily, which should cause no undue harm.

The fact that the control criterion or cost functional depends only on the final state, so that the cost of control during the shutdown phase per se is assumed to be zero, may not reflect the actual control cost well. An additional cost associated with the shutdown phase can be readily identified as the control mechanism effort used to pulse the control rods. This is developed by first rewriting Bellman's equation (5.17) for either problem (a) or (b): to find the optimal control policy \(u^*\) and the corresponding control functional \(F = \min_u \Phi\) or \(F = \min_u \Psi\) that satisfies

\[ \frac{\partial F(\bar{c}, T)}{\partial T} = \min_{0 \le u \le M} \nabla F \cdot \dot{\bar{c}}(u); \qquad F(\bar{c}, 0) = \Phi(\bar{c}) \ \text{or} \ \Psi(\bar{c}), \tag{9.23} \]

with the usual flux, xenon, and state-equation trajectory constraints. The “velocity” vector in the xenon-iodine phase plane is \(\dot{\bar{c}}(u) = (\dot{x}(u), \dot{y}(u))\), the initial state vector of the system is \(\bar{c} = (x_0, y_0)\), and the components \(\dot{x}(u)\) and \(\dot{y}(u)\) are the equations of state (3.12) and (3.13).
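A discrete analogue of this terminal-cost equation can be sketched numerically. The code below is a minimal illustration, not the computation reported in the earlier chapters: it assumes the normalized xenon-iodine equations of state in the form dx/dt = -(w + r0 u) x + g1 (w + r0) y + g2 (w + r0) u, dy/dt = u - y, takes the post-shutdown xenon peak as the terminal cost (problem (a)), restricts the flux to the two bang-bang values 0 and M, and tabulates F on a coarse grid with bilinear interpolation. All constants, grid sizes, and function names are illustrative assumptions.

```python
import math

W, R0, G1 = 0.72, 5.0, 0.95        # assumed normalized constants
G2 = 1.0 - G1
M, X_C = 1.0, 2.0                  # assumed flux bound and xenon override limit
NX, NY = 21, 11                    # grid points in xenon and iodine
XS = [X_C * i / (NX - 1) for i in range(NX)]
YS = [j / (NY - 1) for j in range(NY)]

def velocity(x, y, u):
    """Right sides of the assumed equations of state."""
    dx = -(W + R0 * u) * x + G1 * (W + R0) * y + G2 * (W + R0) * u
    return dx, u - y

def peak_cost(x, y):
    """Terminal cost: xenon maximum reached after the flux is cut to zero."""
    a = G1 * (W + R0) * y
    if W * x >= a:                 # xenon already falling at cutoff
        return x
    tau = -math.log(W * (1.0 + (1.0 - W) * x / a)) / (1.0 - W)
    return G1 * (1.0 + R0 / W) * y * math.exp(-tau)

def interp(table, x, y):
    """Bilinear interpolation of table[i][j], clamped to the grid."""
    fx = min(max(x / X_C, 0.0), 1.0) * (NX - 1)
    fy = min(max(y, 0.0), 1.0) * (NY - 1)
    i, j = min(int(fx), NX - 2), min(int(fy), NY - 2)
    tx, ty = fx - i, fy - j
    return ((1 - tx) * (1 - ty) * table[i][j] + tx * (1 - ty) * table[i + 1][j]
            + (1 - tx) * ty * table[i][j + 1] + tx * ty * table[i + 1][j + 1])

def solve_shutdown_dp(T=1.0, steps=40):
    """Backward recursion F_n(x, y) = min over u in {0, M} of F_{n-1}(next state)."""
    dt = T / steps
    F = [[peak_cost(x, y) for y in YS] for x in XS]
    policy = [[M for _ in YS] for _ in XS]
    for _ in range(steps):
        new_F, policy = [], []
        for x in XS:
            row_F, row_u = [], []
            for y in YS:
                best, best_u = float("inf"), None
                for u in (0.0, M):
                    dx, dy = velocity(x, y, u)
                    x2, y2 = x + dx * dt, y + dy * dt
                    if x2 > X_C:   # would violate the xenon override constraint
                        continue
                    value = interp(F, x2, y2)
                    if value < best:
                        best, best_u = value, u
                row_F.append(best)
                row_u.append(best_u)
            new_F.append(row_F)
            policy.append(row_u)
        F = new_F
    return F, policy               # policy holds the first-stage decision


if __name__ == "__main__":
    F, policy = solve_shutdown_dp()
    print("step-shutdown peak:", round(peak_cost(1.0, 1.0), 3))
    print("DP value at equilibrium:", round(F[10][NY - 1], 3))
```

Because the minimization is taken only over the two extreme flux values, the resulting policy table is bang-bang by construction; a finer grid and smaller time step would be needed before the numbers could be compared with the published figures.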
Now suppose that one includes an integral cost as well as a terminal cost, in that it is desired to shut down on minimum-energy trajectories. This is analogous to maneuvering a space vehicle on an interplanetary trajectory so that minimum fuel is consumed. Then the reformulation consists of finding the optimal shutdown policy that yields

\[ \int_0^T u\,dt = \min \tag{9.24} \]

with terminal cost \(\Phi\) or \(\Psi\), flux constraints \(0 \le u \le M\), xenon override constraint \(x \le x_c\), and trajectories satisfying the differential equations of state (3.12) and (3.13). The corresponding Bellman's equation is obtained as in Section 5.4, where now

\[ F(\bar{c}, T) = \min_u \left[ \int_0^T u\,dt + \Phi(\bar{c}) \right] \quad \text{or} \quad \min_u \left[ \int_0^T u\,dt + \Psi(\bar{c}) \right], \]

\[ \frac{\partial F(\bar{c}, T)}{\partial T} = \min_{0 \le u \le M} \left[ u + \nabla F \cdot \dot{\bar{c}}(u) \right]; \qquad F(\bar{c}, 0) = \Phi(\bar{c}) \ \text{or} \ \Psi(\bar{c}). \tag{9.25} \]

If the right side of (9.25) is written in expanded form, it is seen to be still linear in the function u, so that the bang-bang nature of the control is unaltered. Of course, the details of the optimal control \(u^*\) would be obtained by numerically integrating (9.25) on digital-computing machinery.

9.4. Other Equivalences
It is also interesting to note a certain equivalence between the xenon minimax, problem (a), and the minimum shutdown duration time problem. For a given xenon constraint \(x_c\) and a given shutdown time, the optimal shutdown minimax control problem can be equivalent to the shutdown program used to obtain the same xenon minimax magnitude by minimizing the shutdown duration time, the two durations being equal. To restate this more explicitly, consider the time \(t_p\) at which the xenon peak occurs, which is obtained by solving (3.12) and (3.13) for u = 0 and equating \(\dot{x}\) to zero as discussed in previous chapters, in terms of the shutdown duration T. It is obtained from the expression

\[ \exp[-(1 - w)(t_p - T)] = w \left[ 1 + \frac{1 - w}{\gamma_1}\,\frac{x_T}{(w + r_0)\,y_T} \right]. \tag{9.26} \]

The corresponding magnitude of the xenon peak is

\[ x_p = x_T\,e^{-w(t_p - T)} + \frac{\gamma_1 (w + r_0)\,y_T}{1 - w} \left[ e^{-w(t_p - T)} - e^{-(t_p - T)} \right]. \tag{9.27} \]

Equations (9.26) and (9.27) can be combined to give the xenon peak explicitly in terms of the shutdown duration time T and \(t_p\) as

\[ x_p = \gamma_1 \left[ 1 + \frac{r_0}{w} \right] y_T \exp[-(t_p - T)]. \tag{9.28} \]

Disregarding the constant \(\gamma_1 [1 + (r_0/w)]\), the minimax problem for a fixed shutdown duration time T is to find \(u^*\) so that

\[ \min_{0 \le u \le M} x_p = e^{T} \min_{0 \le u \le M} y_T \exp(-t_p). \tag{9.29} \]

For fixed \(x_p\), the minimal-time optimal problem can be written from (9.28) as

\[ \min_{0 \le u_m \le M} e^{T} = \frac{x_p}{\displaystyle\max_{0 \le u_m \le M} y_T \exp(-t_p)}. \tag{9.30} \]
Here \(u_m^*\) is the optimal shutdown policy that minimizes the (assumed feasible) shutdown duration time required to transfer the system, with given xenon constraint, to achieve a fixed \(x_p\). This is equivalent to an optimal shutdown policy that yields the minimum \(x_p\) for a fixed feasible shutdown duration time equal to that minimum. A geometric proof by contradiction is discussed in refs. [30] and [31]. The converse is not necessarily true. Optimal shutdown to minimax xenon does not require minimum-time trajectories in the xenon-iodine space.

Still another kind of equivalence is that the xenon shutdown problems (a) and (b) can be construed as routing problems. Routing problems are those in which it is desired to proceed through a given network from a starting point to an end point on a given route that extremizes a particular criterion functional. The famous (unsolved) traveling-salesman problem is one example. It is desired that the traveling salesman proceed through a given network of cities, visiting all of them, and returning home on an optimal route, which is the one that minimizes his total gasoline mileage. In our case the control is bang-bang, so the network is composed of the intersections of the two single-parameter families of solutions to the differential equations (3.12) and (3.13) for u = 0 and M, respectively. This is shown in the xenon-iodine phase space of Fig. 9.1. To solve problem (a), for example, \(\Phi(x_k, y_k)\) is computed for all intersections \((x_k, y_k)\). The differences, e.g., between the jth and the kth interstices, \(\Phi_{kj} = \Phi(x_k, y_k) - \Phi(x_j, y_j)\), are then computed and each segment in the network is so labeled. Now let \(\Phi_k = \min \sum \Phi_{kj}\) be the minimum of the total value of the functional using an optimal route from the mesh point k to the mesh point N. The mesh point N corresponds to a final state \((x_T, y_T)\). From the principle of optimality, the recurrence equations from which the optimal route is obtained are given by

\[ \Phi_N = \Phi(x_T, y_T), \tag{9.31} \]

\[ \Phi_k = \min_{j \ne k} (\Phi_{kj} + \Phi_j) \qquad (k = 1, 2, \ldots, N - 1). \tag{9.32} \]

[Figure 9.1: families of u = 0 and u = M trajectories in the xenon-iodine plane.]
FIG. 9.1. Network in xenon-iodine phase space. Arrows indicate increasing time. Segments are marked 0 or M corresponding to the two control choices. Numerical labels represent the differences \(\Phi_{kj}\).

Equation (9.31) is trivial, since the system is already at its final state. Equation (9.32) is simply the minimum, over the j mesh points available, of the sum of \(\Phi_{kj}\) from the mesh point k to mesh point j plus the optimal \(\Phi_j\) from mesh point j to the final-state mesh point N. In problems (a) and (b) one merely works backward from a chosen final mesh point N. Since only two arrows point toward each mesh point (directed network), one simply chooses the minimum \(\Phi_{kj}\) segment at each stage, compiling a running total until a desired initial equilibrium state is reached. This procedure generates an optimal route (logical tree) throughout the network, with the optimal policy decision made at each stage, in accordance with (9.31) and (9.32). This method can be shown to yield the correct optimal shutdown program [9], simplifying the search procedure enormously, since the optimal route is delineated immediately without the necessity of examining the plethora of possible routes. The xenon override constraint is incorporated by merely confining the network to a strip between the abscissa and the line \(x = x_c\). Then \(x = x_c\) is, or is not, considered part of the network, depending on whether the flux constraint is relaxed, as discussed in Section 8.3.
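The backward recursion of (9.31) and (9.32) on a directed network can be sketched in a few lines. The example below is only an illustration of the recurrence: the node names, the segment labels, and the terminal value are made-up numbers on a hypothetical six-segment network, not values computed from the xenon-iodine trajectories.

```python
from functools import lru_cache

def optimal_route(edges, terminal_value, start, end):
    """Minimum total label from `start` to `end` on a directed acyclic network.

    edges: dict mapping a mesh point to a list of (next_point, segment_label).
    Returns (total, route), or None if `end` cannot be reached from `start`.
    """
    @lru_cache(maxsize=None)
    def best(k):
        if k == end:                      # terminal condition, as in (9.31)
            return terminal_value, (end,)
        candidates = []
        for j, label in edges.get(k, []):
            tail = best(j)
            if tail is not None:          # recurrence, as in (9.32)
                candidates.append((label + tail[0], (k,) + tail[1]))
        return min(candidates) if candidates else None

    return best(start)


# Hypothetical network: each mesh point offers at most two outgoing segments,
# standing in for the two control choices u = 0 and u = M.
EXAMPLE_EDGES = {
    "A": [("B", 0.4), ("C", -0.1)],
    "B": [("D", -0.3), ("E", 0.2)],
    "C": [("E", 0.4), ("F", 0.1)],
    "D": [("N", 0.1)],
    "E": [("N", -0.2)],
    "F": [("N", 0.3)],
}

if __name__ == "__main__":
    total, route = optimal_route(EXAMPLE_EDGES, 1.0, "A", "N")
    print(route, round(total, 6))
```

Memoizing `best` means each mesh point is evaluated once, which is the saving the text attributes to the principle of optimality: no route is enumerated in full.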
9.5. Higher-Order-System Formulations

Another avenue of investigation would be to program a three-dimensional version of problems (a) and (b) for the computer. Since the reactivity, as opposed to the flux, is the more natural control variable in the actual reactor-control situation, it would perhaps be a better, although more complex, formulation to let the state be represented by three state variables (x, y, u), with the reactivity control variable \(\kappa = K/K_0\), where K is the controllable reactivity and \(K_0\) is the reactivity required to balance the xenon reactivity at equilibrium conditions. That is, if the delayed and prompt neutrons are lumped together by a time constant \(\tilde{l}\), the mean lifetime of the neutron transient behavior, a reactor kinetics equation for the neutron density can be written [2],

\[ \tilde{l}\,\frac{d\varphi}{dt} = \left( K - \frac{\sigma_x X}{\beta \Sigma_{\text{fuel}}} \right) \varphi, \tag{9.33} \]

where \(\varphi\) is the thermal flux, K is the controllable reactivity, and \(\sigma_x X / \beta \Sigma_{\text{fuel}}\) is the xenon poison reactivity in dollars.
The reactivity that balances the xenon poison to maintain equilibrium is given by
Then the new equations of state in dimensionless form, with time in units of&’, are i=-(w )i=u-y,
+ rou)x + y l ( w + r o ) y + y z ( w + r o ) u ,
li = b ( K - X ) U
(9.34)
( b = K,-JAIf),
and the reactivity K is assumed to be suitably constrained. From either the optimality principle or the maximum principle, it is easily seen that the optimal reactivity control is also bang-bang. In fact, from the maximum principle it is straightforward that
(9.35)

where $K^{+}_{\max}$ and $K^{-}_{\max}$ are the respective positive and negative normalized reactivity limits, and $p_3$ is the “third” variable of the adjoint equations, satisfying $\dot{p}_3 = -\partial H/\partial u$, where $H$ is the Hamiltonian corresponding to the system of equations (9.34). Analogous to the two-dimensional version of problems (a) and (b), the dynamic-programming formulation using the optimality principle is

$$F_k(x, y, u) = \min_{K} F_{k-1}(x + \dot{x}\Delta,\; y + \dot{y}\Delta,\; u + \dot{u}\Delta),$$
$$F_0(x, y, 0) = \Phi \ \text{or} \ \Psi. \qquad (9.36)$$
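The argument of the previous-stage function in (9.36) is one forward-difference step of the state equations (9.34). The sketch below shows that step in Python. It assumes the state equations have the form dx = -(w + r0·u)x + g1(w + r0)y + g2(w + r0)u, dy = u - y, du = b(K - x)u, with g1 and g2 standing for the two yield fractions; the function names and every numerical constant are placeholders chosen for illustration, not reactor data from the text.

```python
def state_rates(x, y, u, K, w, r0, g1, g2, b):
    """Right-hand sides of the three state equations, in the form assumed above."""
    dx = -(w + r0 * u) * x + g1 * (w + r0) * y + g2 * (w + r0) * u
    dy = u - y
    du = b * (K - x) * u
    return dx, dy, du


def euler_step(state, K, dt, params):
    """Advance (x, y, u) by one time increment dt under a constant reactivity K."""
    x, y, u = state
    dx, dy, du = state_rates(x, y, u, K, **params)
    return (x + dx * dt, y + dy * dt, u + du * dt)


# Placeholder constants (assumed, with g1 + g2 = 1 so that x = y = u = 1 is an equilibrium).
params = {"w": 0.7, "r0": 2.0, "g1": 0.95, "g2": 0.05, "b": 10.0}
equilibrium = (1.0, 1.0, 1.0)

held = euler_step(equilibrium, K=1.0, dt=0.01, params=params)     # reactivity balances xenon
lowered = euler_step(equilibrium, K=0.5, dt=0.01, params=params)  # reactivity below balance
print(held, lowered)
```

With the normalized reactivity equal to the normalized xenon level the state does not move, while any lower value makes the flux variable decay, which is the mechanism a shutdown program exploits.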
The corresponding Bellman's equation is

$$F(x, y, 0, 0) = \Phi \ \text{or} \ \Psi, \qquad (9.37)$$

and the optimal control policy is

(9.38)
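The bang-bang character of the optimal reactivity can be traced to the fact that the control enters the third state equation linearly. The lines below are only a sketch of that argument, not the author's equation (9.35). They assume a Hamiltonian built from adjoint variables $p_1, p_2, p_3$ and maximized over $K$, a criterion that does not depend explicitly on $K$, and $b > 0$, $u > 0$; the sign convention is an assumption. The symbols $K^{+}_{\max}$ and $K^{-}_{\max}$ stand for the positive and negative normalized reactivity limits.

```latex
\begin{align*}
H &= p_1\,\dot{x} + p_2\,\dot{y} + p_3\,b\,(K - x)\,u, \\
\frac{\partial H}{\partial K} &= p_3\,b\,u, \\
K^{*} &=
\begin{cases}
K^{+}_{\max}, & p_3 > 0, \\
K^{-}_{\max}, & p_3 < 0.
\end{cases}
\end{align*}
```

Because $H$ is linear in $K$, its maximum over a bounded interval lies at an endpoint, and the sign of $p_3$ decides which one. The case $p_3 = 0$ over a finite interval would need separate treatment.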
These investigations can also be expanded in the direction of other reactor poisons which are present and which affect the neutron kinetics, but to a lesser extent. The ultimate system description is of high dimension (many variables), including, besides the above considerations, delayed neutrons in detail, temperature-moderator density changes, core structural changes, hydrodynamical considerations, etc. This results in a very complicated formulation. Handling the correspondingly higher-dimensional F tables would demand million-word core-memory computers with nanosecond arithmetic and memory-access times to achieve reasonable computer-run times, as discussed in previous chapters.
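The storage demand mentioned above can be made concrete with a short calculation. The sketch below counts the entries in one F table on a uniform grid. The figure of 100 mesh points per state variable is an assumption made for illustration, not a value from the text, and the function name is hypothetical.

```python
def table_entries(points_per_axis, state_dims):
    """Number of stored values in one F table on a uniform grid."""
    return points_per_axis ** state_dims


POINTS_PER_AXIS = 100  # assumed mesh resolution, for illustration only

sizes = {dims: table_entries(POINTS_PER_AXIS, dims) for dims in (2, 3, 4, 5)}
for dims, entries in sizes.items():
    print(f"{dims} state variables: {entries:,} table entries")
```

Under this assumed resolution the two-variable problem needs ten thousand entries per table, while the three-variable formulation already needs a million, and each further state variable multiplies the requirement by another factor of one hundred.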
References
(1) Glasstone, S., and Sesonske, A., “Nuclear Reactor Engineering,” pp. 260ff. Van Nostrand, Princeton, New Jersey, 1963.
(2) Ash, M., “Nuclear Reactor Kinetics,” Chapter 6. McGraw-Hill, New York, 1965.
(3) Weinberg, A. M., and Wigner, E. P., “Physical Theory of Neutron Chain Reactors,” Chapters 2, 17. Univ. Chicago Press, Chicago, Illinois, 1958.
(4) Bellman, R., “Dynamic Programming,” Chapter 1. Princeton Univ. Press, Princeton, New Jersey, 1957.
(5) Ash, M., “Nuclear Reactor Kinetics,” Chapter 6. McGraw-Hill, New York, 1965.
(6) Roberts, S. M., “Dynamic Programming in Chemical Engineering and Process Control,” Chapter 7. Academic Press, New York, 1964.
(7) Randall, D., and St. John, D. S., Xenon spatial oscillations. Nucleonics 16, 82-87 (1958).
(8) Lellouche, G. S., Space dependent xenon oscillations. Nucl. Sci. Eng. 12, 482-489 (1962).
(9) Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F., “The Modern Theory of Optimal Processes,” Chapter 2. Wiley (Interscience), New York, 1962.
(10) Bellman, R., “Adaptive Control Processes - A Guided Tour.” Princeton Univ. Press, Princeton, New Jersey, 1961.
(11) Leitmann, G. (ed.), “Optimization Techniques with Application to Aerospace Systems.” Academic Press, New York, 1962.
(12) Takahashi, Y., The maximum principle and its application. Am. Soc. Mech. Engrs. Paper 63-WA-333, 8-10 (1963). Trans. Winter Mtg. ASME, Philadelphia, Pennsylvania, Nov. 1963.
(13) Roberts, S. M., “Dynamic Programming in Chemical Engineering and Process Control,” Chapter 6. Academic Press, New York, 1964.
(14) Ash, M., “Nuclear Reactor Kinetics,” Chapter 1. McGraw-Hill, New York, 1965.
(15) Valentine, F. A., The problem of Lagrange with differential inequalities as added side conditions. Contrib. Var. Calc. Univ. Chicago, pp. 403-407 (1937).
(16) Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F., “The Modern Theory of Optimal Processes,” Chapter 6. Wiley (Interscience), New York, 1962.
(17) Ash, M., Bellman, R., and Kalaba, R., On control of reactor shutdown involving minimal xenon poisoning. Nucl. Sci. Eng. 6, 152-156 (1959).
(18) Ash, M., Application of dynamic programming to optimal shutdown control. Nucl. Sci. Eng. 24, 77-86 (1966).
(19) Rosonoer, L. I., L. S. Pontryagin's maximum principle in optimal control, automation and control. Avtom. Telemekhan. 20, 10-12 (1959).
(20) Bellman, R., “Adaptive Control Processes - A Guided Tour,” Chapter 4. Princeton Univ. Press, Princeton, New Jersey, 1961.
(21) Kalaba, R., Illustrative example from notes of a course in Dynamic Programming. Univ. Calif. Los Angeles Eng. Dept., Los Angeles, Calif., 1961.
(22) Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F., “The Modern Theory of Optimal Processes,” Chapter 5. Wiley (Interscience), New York, 1962.
(23) Rosztoczy, Z. R., and Weaver, L. E., Optimum reactor shutdown program for minimum xenon buildup. Nucl. Sci. Eng. 20 (5), 318-323 (1964).
(24) Bellman, R., and Dreyfus, S., “Applied Dynamic Programming.” Princeton Univ. Press, Princeton, New Jersey, 1962.
(25) Kahan, R. S., Israel Research Reactor IRR-1, Part 1, General Description IA 680. Israel Atomic Energy Commission, Soreq Nuclear Research Center, Yavne, Israel, 1960.
(26) Ward, A. G., A universal curve for the prediction of xenon poison after a reactor shutdown. At. Energy Can. Ltd., Chalk River Res. Project CRRP-685 (AECL-411), Jan. 1957.
(27) MacRae, W., and Parr, R., Xenon poisoning transients following changes in reactor power and their effect on reactivity. At. Energy Res. Estab. (Gt. Brit.), AERE ED/R1590, Jan. 1954.
(28) Tadmor, J., Israel Research Reactor IRR-1, Hazards Evaluation Rep. IA 689. Israel Atomic Energy Commission, Soreq Nuclear Research Center, Yavne, Israel, 1960.
(29) Glasstone, S., “Nuclear Reactor Engineering.” MacMillan, New York, 1956.
(30) Roberts, J. J., and Smith, H. P., Jr., Time optimal solution to the reactivity-xenon shutdown problem. Nucl. Sci. Eng. 22, 470-478 (1965); Equivalence of the time optimal and minimax solutions to the xenon shutdown problem. Nucl. Sci. Eng. 23, 397-399 (1965).
(31) Shinohara, Y., and Valat, J., Optimalisation de l'empoisonnement xenon par minimalisation du pic xenon. Compt. Rend. Acad. Sci. 259, 1623-1626 (1964).
(32) Sumner, H. M., The neutron cross section of xenon-135. U.K. At. Energy Authority, At. Energy Estab. (Gt. Brit.), Winfrith, Rep. AEEW-R116, June 1962.
Bibliography
Listed below are a document glossary and a bibliography of selected entries pertaining to the central xenon problems discussed in this monograph, including the space-dependent (flux tilt) xenon problem discussed in Sections 1.2 and 2.3. The selection emphasizes the nuclear reactor physics and nuclear engineering aspects, rather than the more mathematical material cited in the monograph proper and enumerated under References.
Document Glossary

Document Code    Sponsoring Agency
AB Atomenergi    Aktiebolaget Atomenergi, Stockholm, Sweden
AE               Aktiebolaget Atomenergi, Stockholm, Sweden
AECD (AECL)      U.S. Atomic Energy Commission Declassified (Limited), Germantown, Maryland
AERE             Atomic Energy Research Establishment, Harwell, Berkshire, England
ANL              Argonne National Laboratories, Lemont, Illinois
ASEA             Allmänna Svenska Elektriska Aktiebolaget, Stockholm, Sweden
BNL              Brookhaven National Laboratories, Upton, New York
B.W.             Babcock and Wilcox, Atomic Power Division, Lynchburg, Virginia
CA               Chemical Abstracts, Washington, D.C.
CEA              Commissariat à l'Energie Atomique, Paris, France
CF               University of Chicago Metallurgical Laboratories (Manhattan Project), Chicago, Illinois
CPLA             Centre d'Etudes Nucléaires, Saclay, France
CRP, CRRP        National Research Council of Canada, Atomic Energy Project, Chalk River, Ontario, Canada
DC               General Electric Corp. Aircraft Nuclear Propulsion Project, Cincinnati, Ohio
DP               Dupont Atomic Power Division, Savannah River, Georgia
E.443/N          Atomic Energy Research Establishment, Harwell, Berkshire, England
ED               Atomic Energy of Canada Ltd., Chalk River Project, Chalk River, Ontario, Canada
EUR              EURATOM, European Atomic Energy Community: France, Germany, Italy, Benelux
FXM              Pratt and Whitney Aircraft Division of United Aircraft Corp., Hartford, Connecticut
HW               Hanford Atomic Energy Works, Richland, Washington
IA               Israel Atomic Energy Commission/Ministry of Defence, Tel-Aviv, Israel
IDO              Idaho Operations Office of USAEC, Arco, Idaho
JENER            Joint Establishment for Nuclear Energy Research, Kjeller, Norway
KAPL             Knolls Atomic Power Laboratories, General Electric Corp., Schenectady, New York
MonP             Monsanto Chemical Corp. Atomic Energy Division, Oak Ridge, Tennessee
NAA              Atomics International Division of North American Corp., Canoga Park, California
NSA              Nuclear Science Abstracts, Washington, D.C.
ORNL-CF          Oak Ridge National Laboratories, Oak Ridge, Tennessee
RT/ING           Comitato Nazionale per l'Energia Nucleare (Ingegneria e Tech), Rome, Italy
TID              Division of Technical Information Extension of USAEC, Germantown, Maryland
TPI              Atomic Energy of Canada Ltd., Chalk River Project, Chalk River, Ontario, Canada
WAPD             Westinghouse Atomic Power Division, Bettis Field, Pittsburgh, Pennsylvania
Xenon Bibliography

H. Albers, et al., Xenon poisoning in a shut-down reactor. Nucl. Sci. Eng. 15, 342-344 (1962); NSA 17, 26812 (1963). M. Ash. The xenon minimax problem. IA-988, Soreq Nuclear Research Center, Yavne, Israel, Dec. 1964. M. Ash. Application of dynamic programming to optimal shutdown control. Nucl. Sci. Eng. 24, 71-86 (1965). M. Ash, R. Bellman, and R. Kalaba. On control of reactor shut-down involving minimal xenon poisoning. Nucl. Sci. Eng. 6, 152-156 (1959). I. Aviram and A. Pazy. The xenon effect on the statics of a thermal reactor. IA-741. Israel AEC Dimona Research Establishment, Mar. 1963; NSA 17, 26762 (1963).
W. M. Barss. Digital simulation of xenon instability in reactors. CRRP-1002 (1961). H. H. Baucom. Estimate of xenon reactivity with application to the SER reactor. NAA-SR-Memo-5302, May 1960; NSA 15, No. 4765 (1961). E. S. Bettis, W. B. Cottrell, E. R. Mann, J. L. Meem, and G. D. Whitman. The aircraft reactor experiment-physics. Nucl. Sci. Eng. 2, 841-853 (1957). D. R. deBoisblanc and W. Nyer. Xenon-135 generation in the MTR reactor. AECD-3699, May (1953); NSA 10, (1956), No. 3855. S. Breslouer. Analysis of xenon reactivity prediction in the HTRE-1 reactor. DC-60-4-118, Apr. 1960; NSA 16, No. 19984, (1962). F. Brown and L. Yaffe. The independent yield of Xe-135 produced in the fission of natural uranium by pile neutrons. Can. J . Chem. 31, 242-249 (1953); NSA 7, No. 4666, (1953). M. M. L. Bulati et al. A xenon poison computer. 2nd U.N . ConJ Peaceful Uses At. Energy, Geneva, 1958, p. 1627. W. Burch and L. Shappert. Behavior of iodine and xenon in the homogeneous reactor test. Trans. Am. Nucl. SOC.4, (2), 354 (1961); NSA 16, No. 830, (1962). R. S. Carlsmith. Xenon and samarium poisoning. CF-61-1-59, Jan. 1961; NSA 15, No. 12458, (1961). G. G. Casini and J. Pillon. Statistical weight factor for calculating xenon effect on reactivity in equilibrium conditions. EUR-l02f, 1962; NSA 17, No. 28425 (in French), (1963). J. Chernick. The dynamics of a xenon-controlled reactor. Nucl. Sci. Eng. 8,233-243 (1960); CA 55, No. 1727Oc, (1961). J. Chernick, et al. The effect of temperature on xenon instability. Nucl. Sci. Eng. 10, 120-131 (1961) (BNL-51127); NSA 15, NO. 22848, (1961). R. D. Cheverton. Xenon Chase and samarium burnup in the HFIR reactor. CF-61-7-87, July 1961 ; NSA 15, NO. 28833, (1961). H. Clark and J. English. Xenon tables. DP-200, May 1957; NSA 11, No. 12883, (1957). H. H. Clayton. Power instability due to poison. TPZ-41 (1961). B. Davison. Correction for non-zero time step in the numerical simulation of spatial xenon in a nuclear reactor. AECL-1292 (1961). B. 
Davison. A Semi-numerical semi-analytical method for the two-group theory of xenon oscillation. CRP-993 (1961). R. W. Deutch. Fission product build-up in enriched thermal reactors. Nucleonics 14, 89 (1956). D. Dickey and J. E. McEwen, Jr. Slide rule simplifies xenon computation. Nucleonics 18, 2, 88, 90, 92-93 (1960); NSA 14, No. 7495, (1960). K. Donelian and J. Menke. Xenon instability periods as a function of flux in thermal piles. Mon P-379, Sept. 1947; NSA 10, No 7341, (1956). T. R. England. Xenon-power oscillations in PWR-1 reactor analysis and observations. Trans. Am. Nucl. SOC.7, 2, 220 (1962). R. Eriksen and W. Halg. Xe-135 poisoning of the JEEP reactor. JENER 17, (1953); N S A 7, No. 4901, (1953).
R . L. Ewen. Calculation of complex natural modes for spatial xenon oscillations and comparison with a simple approximation. Trans. Am. Nucl. SOC.5 (l), 179 (1 961). J. R. Fresdall. Nuclear reactor dynamics I. Xenon poisoning effects; M.Sc. thesis, Univ. Washington, Seattle, Washington, 1960. J. Fresdall and A. Babb. Xenon-135 transients resulting from time-varying shutdown of thermal reactors. Trans. Am. Nucl. SOC.4,2, 316-317 (1961); NSA 16, No. 781, (1962). G. C. Fullmer. Xenon instability in graphite reactors. HW-69084 (1961). G. C. Fullmer. Let the reactor prevent xenon instability. Nucl. Sci. Eng. 9,93-94 (1961); CA 55, No. 218671,(1961). J. Furet. The influence of xenon poisoning in high-flux reactors on the choice of control rod speeds. CEA-2081 (1961); NSA 16, No. 15773,(1962). J. Furet and A. L. Garcie. Influence of xenon poisoning on the control and safety of high flux reactors. J. Nucl. Energy Pt. A and B 16,209-219 (in French) (1962); NSA 16,No. 21738,(1962). G. Galuppini and L. Sani. Xenon poisoning under different operating conditions in nuclear reactors. RT/ING (63)6, Mar. 1963;NSA 17,No.29726,(1963). W. Gibbard. Xenon transient characteristics. KAPL-44, Apr. 1948 (decl. Feb. 1957); NSA 11, No. 7830,(1957). H. E. Goeller. Production of gaseous fission products in homogeneous reactors. CF-49-9-114, Aug. 1949 (decl. Apr. 1957); NSA 11, No. 13890, (1957). G. Goertzel. Reactor dynamics. “The Reactor Handbook”, Chapter 1.6;AECD3645 (1955). J. N. Grace. Analysis of a reactivity instability experiment with boiling and non linear analysis of spatial stability and flux tilt. NAL-6205, 174-188; NSA 15, No. 3596, (1961). J. N. Grace and M. A. Schultz. Diffusion coupled oscillation with xenon reactivity feedback. Presentedat the Am. Nucl. SOC.Mtg., Los Angeles,California, June 1958. J. N. Grace, M. A. Schultz, and R. Fairey. Inherent reactor stability (fundamental mode xenon oscillations). Proc. Conf. Nucl. Eng., Univ. Calif. Los Angeles. 
1955, California Book Co., Berkeley, California, 1955. S. B. Gunst and J. C. Connor. The stability of “stable” fission product poisoning. Nucl. Sci. Eng. 8, 128-132 (1960). V. W. Gustafson. PRTR reactor power test results. HW-80253,NSA 18,No. 40834, (1964). G. L. Gyorey. On the theory of xenon induced instability in neutron flux distributions. Ph.D. thesis, Univ. Michigan, Ann Arbor, Michigan, 1960. G. L. Gyorey. The effect of modal interaction on the xenon instability problem. Nucl. Sci. Eng. 13, 338-344 (1962); NSA 16,No. 28443,(1962). G . L. Gyorey. The effect of modal interaction on the xenon instability problem. Trans. Am. Nucl. Soc. 4, 1, 83 (1961); NSA 15,No. 21776, (1961). W. Haelg, et al. Theoretical and experimental study of reactivity change. Z. Angew. Math. Phys. 14, 178-185 (in German) (1963); NSA 17,No. 24661,(1963).
D. R. Harris and P. S. Lacy. A simple approximate test for spatial xenon stability. Trans. Am. Nucl. SOC.3 (2), 437 (1961). D. R. Harris, S. Kaplan, and S. G. Margolis. Modal analysis of flux tilt transients in a nonuniform reactor. Trans. Am. Nucl. SOC2. (2), 178-179 (1959). P. Haubenreich. Xenon poisoning in the ISHR reactor. CF-53-5-202, May 1953 (decl. Feb. 1957); NSA 17, No. 7779, (1957). R. Haworth and R. Hockney. Xenon override in gas cooled reactors I. Sketching xenon transients. J. Nucl. Energy Pt. A and B 18 601-619 (1960). J. F. Hill. The build-up of poisons on shutdown of a reactor. At. Energy Res. Establ. (Harwell) Rept. R/M 18, (1956); Phys. Abstr. 60, 738 (1957). A. Hitchcock. “Nuclear Reactor Stability,” pp. 35-46. Temple Press, London, 1960. R. Hockney. Xenon override in gas cooled reactors 11. A simple theory of vector rod movement. J. Nucl. Energy 18, 621-643 (1960). R. R. Hoefner. Flux tilt oscillations at Savannah River. Nucl. Sci. Technol. 2 (3) 290 (1956). P. Hofmann, et al. A procedure for xenon calculations. TID-10055, Sept. 1953; NSA 11, No. 9871, (1957). P. L. Hoffman, H. Hurwitz, Jr., and E. Wachspress. A xenon calculation procedure adaptable to digital computers. KAPL-1594, Aug. 1956; NSA 10, No. 12024, (1956). B. Johnson and H. L. McMurry. Reactivity effects of Xe-135 and Sm-149 for the MTR reactor with U-235 and Pu-239 fuels. IDO-16398, Aug. 1957; NSA 11, No. 13496, (1957). S. Kaplan, et al. Space-time reactor dynamics. 3rd U.N . Conf. Peaceful Uses At. Energy, Geneva, 1964; NSA 18, No. 33016, (1964). S. Katcoff and W. Rubinson. Yield of Xe-133 in the thermal neutron fission of U-235. Phys. Rev. 91, 1458-1469 (1955). E. Kervi. Extending reactor time-to-poison and reducing poison shutdown time by pre-shutdown power alterations. Conf. -243-51 (from Am. Nucl. SOC.Conf. Problems Operating Research and Power Reactors, Ottawa, Oct. 1963); NSA 18 No. 2168f (1964). H. J. Kirk. 
Xenon and samarium concentration produced by various flux programmes. Westinghouse Atomic Power Division, Bettis Field, Pennsylvania; WAPD-183 (1956). T. Krieger, et al. The influence of non-l/v absorbers on reactor parameters in xenon poisoning. ORNL-2739 (Paper IV-B); NSA 13, No. 20344, (1958). G. S. Lellouche. Control of xenon spatial oscillations. Brookhaven National Laboratories, Upton, New York; BNL-6330 (1962). G. S. Lellouche. Space dependent xenon oscillations. Nucl. Sci. Eng. 12, 482489 (1962) (BNL-5573); NSA 16, NO. 18630 (1962). G . S. Lellouche. Reactor size sufficient for stability against spatial xenon oscillations. Nucl. Sci. Eng. 13, 60-62 (1962); CA 57, No. 6825h (1962). M. M. Levine. Equal charge displacement rule in fission product poisoning. Nucl. Sci. Eng. 9, 495 (1961).
G . J. R. MacLusky. The application of analogue methods to compute and predict xenon poisoning in a high flux nuclear reactor. J . Brif. Nucl. Energy Conf. 2, 361-370 (1957); NSA 12, NO. 6714 (1958). W.MacRae and R. Parr. Xenon poisoning transients following changes in reactor power and their effect on reactivity. ED/R-1590 (1958); C A 52, No. 9795d (1958). S. G. Margolis. A Nyquist criterion for spatial xenon stability. Trans. Am. Nucl. SOC.3,437 (1960). S . G. Margolis. Operator induced xenon oscillations. Westinghouse Atomic Power Division, Bettis Field, Pennsylvania; WAPD-BT-29 (1963). S . G. Margolis. Non convergence of steady state xenon calculations. Trans. Am. Nuc/. SOC.7, 2-3 (1964); NSA 18, NO. 28963 (1964). S. G . Margolis and N. J. Curles, Jr. Control induced xenon oscillations. Trans. Am. Nuc/. SOC.7, 220-221 (1962); NSA 19, NO. 3535 (1965). S. G . Margolis and S. Kaplan. Non-linear effects on spatial power distribution transients and oscillations with xenon reactivity feedback. Westinghouse Atomic Power Division, Bettis Field, Pennsylvania; WAPD-T-1156, June 1960; NSA 15, No. 4722 (1961). J. D. McCullen. Reactivity loss due to xenon build-up. E.443/N-14, Jan. 1954; NSA 12, No. 6826 (1958). K. R. Mercky. Fission product yield of inert gases. Hw-60431, May 1959; NSA 18, No. 18869 (1964). J. Miller. Xenon poisoning in molten salt reactors. CF-61-5-62, May 1961; NSA 15, No. 20320 (1961). K. Mochizuki and A. Takeda. An analysis of neutron flux spatial oscillation due to xenon build-up in a large power reactor core. Nucl. Sci. Eng. 7,336344 (1960). 0. Norinder. Numerical investigations of xenon stability in slab geometry. AB Atomenergi RFR-199, R4-157, June 1962. 0. Norinder. Two-group analysis of xenon stability in slab geometry by modal expansion. AE-108, May 1963; NSA 17, No. 28416 (1963). R. M. Pearce. Analog simulation of xenon instability. CRRP-998 (1961); (AECL1185); NSA 15, No. 13912 (1961). R. M. Pearce. 
Analog simulation of xenon spatial instability. Trans. Am. Nucl. SOC.4 (1) 83-84 (1961); NSA 15, NO. 21777 (1961). R. M. Pearce. Method of studying xenon spatial instability with an analog computer. Nucl. Sci. Eng. 11, 328-337 (1961); NSA 16, No. 3906 (1962). R. M. Pearce. Xenon oscillation in finite reactors. Nucl. Sci. Eng. 16, 336-367 (1963); NSA 17, No. 31699 (1963). G . N. Plass and M. Ginsburg. Intermittent operation of a pile. CP-3112, July 1945; NSA 10, No. 9881 (1956). J. Randall and D. S. St. John. Xenon spatial oscillations. Nucleonics 16, 3, 82-86 (1958); NSA 12, No. 6867 (1958). C. S. Robertson. Xenon build-up and decay for a homogeneous reactor region. DC-59-8-130, July 1959; NSA 16, NO. 14229 (1962).
M. T. Robinson. Xenon poisoning kinetics in gas-sparged, molten fluoride fueled nuclear reactors. Nucl. Sci. Eng. 4,270-287 (1958); C A 53 (1959). A. Roch. Xenon poisoning of a reactor and its representation on a phase plane. Neue Tech. 4, 191-197 (1962) (French); NSA 16, No. 21741 (1962). C. Roderick. Poison transient after power cutback and its effect on reactivity. AECD-4041(1952) (decl. Jan. 1956); NSA 10, No. 5362 (1956). Z. Rosztoczy and L. Weaver. Optimum Reactor shutdown program for minimum xenon buildup. Trans. Am. Nucl. SOC.7, 1-2 (1964); NSA 18, No. 28962 (1964). H. H. Rubin. ZAP - A program for the analysis of space time kinetics with xenon feedback. KAPL-3061, June 1964; NSA 18, No. 40781 (1964). 0. Ryden and C. A. Bergquist. A new xenon predictor for high flux reactors. ASEA Res. 7,203-210, (1962); NSA 17, No. 4261 (1963). J. B. Sampson, et al. Poisoning in thermal reactors due to stable fission products. KAPL-1226, Oct. 1954. G. M. Sandquist. Xenon-135 induced instability of the neutron flux in nuclear reactors. IDO-16985, Oct. 1964; NSA 19, No. 5431 (1965). Y. Shinohara, and J. Valat. Optimization of xenon poisoning by minimization of the xenon peak. Compt. Rend. 259,1623-1626 (1964) (French); NSA 19, No. 3522 (1965). L. M. Shotkin. Reactor stability against xenon oscillations. TID-7662, 548-581 (1958); NSA 18, No. 23028 (1964). L. M. Shotkin and F. H. Abernathy. Linear stability of the thermal flux in a reflected core containing xenon and temperature reactivity feedback. Nucl. Sci. Eng. 15, 197-212 (1963). H. Smets. The effect of burnable fission products in power reactor kinetics. Nucl. Sci. Eng. 11, 133-161 (1961); NSA 16, No. 1260 (1962). S . Spetz. Advanced test reactor. Study S-R-122. Axial xenon stability. IDO-24456, Nov. 1962; NSA 18, No. 31034 (1964). C. Steinert. The xenon concentration in operation of a power reactor. Aromkernenergie 8, 9-12 (1962) (German); NSA 17 No. 15478 (1963). D. St. John. Reactor stability. 
Dupont Atomic Power Division, Savannah River Facility, Georgia; DP-517, 1960. B. Stevenson. Xenon poisoning in the fireball. FXM-1471, Dec. 1955; NSA 16, No. 15899 (1962). J. C. Stewart. A generalised criterion for xenon instability. KAPL-M-55-4, 1955. R. S. Stone. Xenon instability. Nucl. Safety 1 (2), 35 (1959). J. Valat. Etude analogique preliminaire a I’optirnalisation de I’empoisonement xenon dans un reacteur nucleaire. CPLA 6-5, 1961. W. M. Vaunoy. ATR xenon stability study interim report. B. W. Internal Memorandum File 59, 3063-3080, Feb. 1962. A. Ward. Universal curve for the prediction of Xe poison after a reactor shutdown. Atomic Energy of Canada Ltd., Chalk River Project 411, 1957; CA 51, No. 78771. (1957).
A. G. Ward. The problem of flux instability in large power reactors. CRRP-657, (1956). G. N. Watson and R. B. Evens. Xenon diffusion in graphite; effects of xenon absorption in molten salt reactors containing graphite. ORNL-CF-61-2-59, Feb. 1961. J. Webster and G. Cazier. Xenon behavior with linearly decreasing power. IDO16282, Sept. 1955; N S A 10,No. 10951 (1956). A. Weinberg. Resonance absorption. CF-50-8-116, Aug. 1959 (decl. 15, 1955); N S A 10, No. 4383 (1956). A. M. Weinberg and E. P. Wigner. “The Physical Theory of Neutron Chain Reactors,” pp. 60M03. Univ. Chicago Press, Chicago, Illinois, 1958. J. J. Went. Reactor kinetics. Chem. Weekblad 54, 193-198 (1958). P. Wenzel. Effective xenon poisoning after shut-down of a thermal reactor. Kernenergie 4, 926934 (1961) (German); N S A 16,No. 15803 (1962). P. Wenzel. Effective steady state xenon poisoning in thermal reactors. Kernenergie 4, 775-780 (1961) (German); N S A 16,No. 15798 (1962). A. B. Whiteley, et al. Calculation of reactor power and temperature distribution in three spatial dimensions. 3rd United Nations Conference Peaceful Uses Atomic Energy, Geneva, 1964; A/Conf. 28/P/169, May 1964. D. M. Wiberg. Optimal feedback control of spatial xenon oscillations. Trans. Am. N u c ~SOC. . 7 , 219-220 (1964); N S A 19,NO. 3534 (1965). D. M. Wiberg. Optimal feedback control of spatial xenon oscillations in a nuclear reactor (thesis). TID-21273, June 1964; N S A 19, No. 3654 (1965). R. S. Wick. Space- and time-dependent flux oscillations and instability in thermal reactors due to nonuniform formation and depletion of xenon. WAPD-TM-138, Aug. 1958; NSA 12,No. 16723 (1958). G. Young. Critical mass needed to override Xe. Mon P-457, Dec. 1947; N S A 10, No. 3728 (1956). G. B. Zorzdi. Xenon oscillations in nuclear reactors. Energia Nucl. Milan, 6 393-398 (1959); C A 54, No. 1092i (1960).
Index
In the entries for the authors, numbers in parentheses are reference numbers and
indicate that an author's work is referred to although his name is not cited in the text. Numbers in italic show the page on which the complete reference is listed.
A
Absolute value criterion, 47 Absorption cross section, 1, 6 Arithmetic mean, 30 Artificial xenon override, 84 Ash, M., 8 (2), 14 ( 9 , 17 ( 3 , 72 (14), 74 (17, 18), 88 (18), 112 (14), 152 (2), 155, 156 Averaged control criterion, 27
B Bang-bang control, 59, 65, 67, 124 Bang-off-bang control, 63 Barn, 1 Bellman, R., 12 (4), 29 (4), 50 (lo), 52 (lo), 59 (lo), 74( 17), 77 (20), 86(24), 155, 156
Bellman's equation, 12, 75, 76, 148 β-Decay, 4 Bogus coin detection, 49 Boltyanskii, V. G., 45 (9), 73 (16), 84 (22), 85 (22), 152 (9), 155, 156 Breeder reactor, 14
C Cerenkov radiation, 103 Cesium-135, 4 Circulating fuel, 11 Closed-loop transfer function, 32 COAST code data format, 92 input, 93 listing, 101 Coasting phase, 123 Coasting trajectory, 123 Constraints, 33, 44 on flux magnitude, 45, 71 on inverse period, 45, 72 on xenon concentration, 46, 73, 90, 126 Control criterion functional, 26 Control flexibility, 2, 8 Control vector, 31
Core memory, see Fast access memory Cost functional, 18 Criterion functional, 18 Critical masses, 130 Critical reactor, 8 D
Decay products, see Fission product nuclei Depleted fuel, 3, 115 Dilemma in bang-bang control, 77 Discrete optimal control, 25 Dollar, 9 Dreyfus, S., 86 (24), 156 DYNPROG code data format, 91 description, 87 glossary, 93 input, 91 listing, 95-101 sample printout, 95
E Enriched fuel reactors, 10 Epithermal reactor, 11 Equilibrium flux, 35 samarium, 38, 106 state, 38, 39 xenon, 36 Extremal field, 127
F Fk Tables, 89 Fast access memory, 87 Fast reactor, 11 Feedback control, 30 Fertile isotope, 14 Fission, 1
Fissionable isotope, 14 Fission product nuclei, 1, 84 Flux, 8 Flux tilt, see Spatial xenon oscillations Fuel costs, 22, 128, 140 cycles, 21 savings, 127, 128, 148 G
Gamkrelidze, R. V., 45 (9), 73 (16), 84 (22), 85 (22), 152 (9), 155, 156 Geometric mean, 30 Glasstone, S., 8 (I), 9 (l), 155, 156 H
Hamilton-Jacobi equation, 13 Hamiltonian, 146 Hamilton’s canonical equations, 13
I Immediate flux shutdown, 39 Interplanetary trajectories, 149 Inverse period, 45 Iodine-135, 4 IRR-1 reactor control rods, 103 calibration, 112 core configuration, 102, 111 depleted core, 115 description, 102, 118-120 scheduling, 129, 130 shielding, 103
K Kahan, R. S., 102 (25), 118 (25), 156 Kalaba,R.,74(17), 78 (21), 143 (21), 156
L Leitmann, G., 52 (11), I55 Lellouche, G. S., 24 (8), 155 Linearized control theory, 30, 31 Long-term irradiation, 21
M MacRae, W., 109 (27), 156 Magnetic tape storage, 87 Manhattan Project, 2 Maximum power limit, 41 principle, 13, 55 Minimum criterion, 71 energy trajectories, 149 mean squared control, 53 time control criterion, 53, 142 Minimax criteria, 48 Mishchenko, E. F., 45 (9), 73 (16), 84 (22), 85 (22), 152 (9), 155, 156 Moderator coolant, 9 MTR fuel elements, 103 Multiplication factor, 8, 9 Multi-stage decision theory, 12
N Neodymium-149, 6 Network routing, 150-152 Neutron flux, 8 population, 9 Non-equilibrium xenon, peak, 111 time, 111 Nuclear resonance, 6
0
Open-loop transfer function, 32 Optimal control variable, 18 investment policy, 18 routing, 150 Optimality principle versus maximum principle, 80 Orbital guidance, 33, 123 Override, see Xenon override
P Parr, R., 109 (27), 156 Period, 45 Periodic startup and shutdown, 110 Plutonium, 2, 11 Pu-239, 11 Poison factor, 10 kinetics, 34 reactivity, 36 Polaris submarine, 3 Pontryagin, L. S., 45 (9), 73 (16), 84 (22), 85, 152 (9), 155, 156 Pontryagin’s principle, 52, 59 Post-shutdown period, 10 Principle of optimality, 13 Promethium-149, 5, 6
R Randall, D., 24 (7), 155 Reactor core, 9 Rendezvous control, 63 Roberts, J. J., 150 (30), 156 Roberts, S. M., 22 (6), 48 (6), 63 (13), 64 (13), 69 (13), 155 Rosonoer, L. I., 75 (19), 81, 156 Rosztoczy, Z. R., 84 (23), I56 Routing problems, 150
Running control cost, see Averaged control criterion S
Samarium-149, 1, 6 Samarium-150, 38 Samarium asymptotic concentration, 43 kinetics equation, 38 poison, 6 Sawtooth extremal, 125 Sesonske, A., 8 (l), 9 (l), 155 Shinohara, Y., 150 (31), 156 Shutdown control phase, phase, period, see shutdown duration, 71 effectiveness, 137 Simplified xenon shutdown, 64 Smith, H. P., Jr., 150(30), 156 Spatial xenon oscillations, 3, 24 State vector, 31 Step function shutdown, see Immediate flux shutdown St. John, D. S., 24 (7), 155 Subcritical reactor, 9 Sumner, H. M., 7, I56 Supercritical reactor, 9, 46 Swimming pool reactor, see IRR-1 reactor
T Tadmor, J., 118 (28), 156 Takahashi, Y.,59 (12), 61 (12), 63 (12), I55 Tellurium-135, 4 Terminal control criterion, 18, 27 surfaces, 76 Termination state vector, 122 Th-232, 11 Time optimal control, see Minimum
time control criterion Transversality condition, 63, 64 Traveling salesman problem, 150 Two-point boundary values, 56, 67
U U-233, 11 U-235, 14 U-238, 11 Uranium fuel, 2, 11
v Valat, J., 150 (31), 156 Valentine, F. A., 73 (15), I55 W
Ward, A. G., 109 (26), 156 Weaver, L. E., 84 (23), 156 Weierstrass necessary condition, 13 Weinberg, A. M., 8 (3), 155 Wigner, E. P., 8 (3), 155
X Xenon-135, 1, 109 Xenon-136, 23, 109 absorption cross section, 6 computers, 12 constrained extremals, 124 criterion functional, 74 decay, 8 functional equation, 75 iodine phase space, 125 kinetics equations, 38-39 maximum, 40 minimax functional, 44, 71, 121 minimum functional, 44, 71
optimal control, 77 override boundary curve, 149 complete, 8 partial, 3 peak shift, 129
peak-, see Maximum poison, 1 Z
Zero flux, 39