calculating the degrees of membership (μr) and non-membership (νr) of the result records. For this purpose we define the following translation rules for the FROM and WHERE clauses.

We calculate μf and νf (respectively the degrees of membership and non-membership obtained after the application of the Cartesian product operation) for all relations in the FROM clause. Let R1, R2, ..., Rn be the relations in the FROM clause. An element of the result set has the form x = <x1, x2, ..., xn>, where xk (k = 1, ..., n) is the corresponding tuple (set of attributes) from the relation Rk. Then for each tuple x from the result set we compute μf(μR1(x1), μR2(x2), ..., μRn(xn)) and νf(νR1(x1), νR2(x2), ..., νRn(xn)).

We calculate μy and νy (respectively the degrees of truth and falsity used for applying the selection operation) for all the predicates in the WHERE clause. The individual logical operations are translated the following way:

For operation OR (p1 OR p2 OR ... OR pn): μOR(p1, p2, ..., pn) = max(μ(p1), μ(p2), ..., μ(pn)), νOR(p1, p2, ..., pn) = min(ν(p1), ν(p2), ..., ν(pn)).

For operation AND (p1 AND p2 AND ... AND pn): μAND(p1, p2, ..., pn) = min(μ(p1), μ(p2), ..., μ(pn)), νAND(p1, p2, ..., pn) = max(ν(p1), ν(p2), ..., ν(pn)).

For operation NOT (NOT p): μNOT(p) = ν(p), νNOT(p) = μ(p).

For all the modifiers there should be a function defined in the database. For example, the modifier VERY should have a corresponding function VERY which returns the modified intuitionistic fuzzy Boolean value.

Finally, μr = min(μf, μy) and νr = max(νf, νy), since the entire WHERE clause of the intuitionistic fuzzy SQL statement acts as a modifier to the degrees of membership and non-membership of the result rows. If the statement doesn't contain level operators, the WHERE clause in the translated SQL statement should contain only the condition _nmship < 1, in order to filter those records which have no degree of membership or indefiniteness; otherwise the translated SQL statement should contain in its WHERE clause the corresponding conditions satisfying the level operators.

The realization of the functions min and max with a variable count of arguments in SQL uses the aggregate functions min and max and is the following (using Oracle PL/SQL syntax): min(x1, x2, ..., xn) is represented as "(SELECT min(_x_) FROM (SELECT x1 AS _x_ FROM dual UNION ALL SELECT x2 FROM dual UNION ALL ... UNION ALL SELECT xn FROM dual))" and max(x1, x2, ..., xn) is represented as "(SELECT max(_x_) FROM (SELECT x1 AS _x_ FROM dual UNION ALL SELECT x2 FROM dual UNION ALL ... UNION ALL SELECT xn FROM dual))".
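To make the translation rules concrete, here is a minimal Python sketch of the intuitionistic fuzzy connectives and the final combination of FROM- and WHERE-clause degrees. The VERY modifier is modelled here as the classical concentration hedge (squaring); the paper only requires that some corresponding function be defined in the database, so this particular choice is an illustrative assumption.

    # Each value is a (mu, nu) pair: degree of membership and non-membership.

    def if_or(*pairs):
        """OR: max of memberships, min of non-memberships."""
        return max(p[0] for p in pairs), min(p[1] for p in pairs)

    def if_and(*pairs):
        """AND: min of memberships, max of non-memberships."""
        return min(p[0] for p in pairs), max(p[1] for p in pairs)

    def if_not(pair):
        """NOT: swap membership and non-membership."""
        mu, nu = pair
        return nu, mu

    def result_degrees(mu_f, nu_f, mu_y, nu_y):
        """Combine FROM-clause and WHERE-clause degrees:
        mu_r = min(mu_f, mu_y), nu_r = max(nu_f, nu_y)."""
        return min(mu_f, mu_y), max(nu_f, nu_y)

    # Example: VERY P OR Q with P = (0.8, 0.1) and Q = (0.4, 0.5),
    # VERY modelled here (assumption) as the squaring hedge.
    very_p = (0.8 ** 2, 1 - (1 - 0.1) ** 2)
    print(if_or(very_p, (0.4, 0.5)))   # -> (0.64, 0.19)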
Example: Let us have the intuitionistic fuzzy relations A and B. Both of them contain columns M and N for storing the degrees of membership and non-membership respectively. Let the intuitionistic fuzzy predicates P and Q be defined over the columns A.X and B.Y.

intuitionistic fuzzy SQL statement:

SELECT A.X, B.Y, _mship, _nmship FROM A JOIN B ON A.id = B.id
WHERE VERY P(A.X, B.Y) OR Q(A.X, B.Y) MODIF N[0.3, 0.7]

translated SQL statement (Oracle PL/SQL syntax):

SELECT A.X, B.Y,
  (SELECT min(_x_) FROM (SELECT A.M AS _x_ FROM dual UNION ALL SELECT B.M FROM dual UNION ALL
    SELECT (SELECT max(_x_) FROM (SELECT VERY(P(A.X, B.Y)).m AS _x_ FROM dual UNION ALL
      SELECT Q(A.X, B.Y).m FROM dual)) FROM dual)) AS _mship,
  (SELECT max(_x_) FROM (SELECT A.N AS _x_ FROM dual UNION ALL SELECT B.N FROM dual UNION ALL
    SELECT (SELECT min(_x_) FROM (SELECT VERY(P(A.X, B.Y)).n AS _x_ FROM dual UNION ALL
      SELECT Q(A.X, B.Y).n FROM dual)) FROM dual)) AS _nmship
FROM A JOIN B ON A.id = B.id
WHERE _mship >= 0.3 AND _nmship <= 0.7;

An example of applying a modifier to the stored data of a whole relation is the following UPDATE statement:

intuitionistic fuzzy SQL statement: UPDATE A MODIF D[a];

translated SQL statement (Oracle PL/SQL syntax): UPDATE A SET m = m + a*(1-m-n), n = n + (1-a)*(1-m-n);
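The UPDATE translation above matches Atanassov's D_α operator, which distributes the indeterminacy π = 1 − m − n between the two degrees. A small sketch, assuming m + n ≤ 1:

    def modif_d(m, n, a):
        """D[a] modifier: distribute indeterminacy pi = 1 - m - n."""
        pi = 1.0 - m - n
        return m + a * pi, n + (1.0 - a) * pi

    print(modif_d(0.5, 0.3, 0.7))   # -> (0.64, 0.36); m + n becomes 1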
5. Conclusions, Application and Future Work
This model gives the opportunity to handle relations which have a particular degree of reliability encapsulated, regardless of the source (a table which contains data with an intuitionistic fuzzy degree of reliability, or a query which uses intuitionistic fuzzy predicates representing imprecise terms). The IFRDB model has two basic goals: first, to be able to store data with a particular degree of reliability (degree of membership of the elements) and second, to be able to process queries which contain intuitionistic fuzzy predicates. An example of using the IFRDB as an intuitionistic fuzzy storage is the case when the data in the database need to be updated at certain intervals of time. In this case we should use the extension of temporal IFRDB. This would be very useful when using distributed database systems. Another example could be a database which is filled by experts, each of them having a reliability coefficient, which may not be fixed, but belong to a
certain interval. Then the entered data obtain a degree of reliability, and the degree of indefiniteness corresponds to the length of the aforementioned interval. The second goal covers the capability to handle imprecise information. The IFRDB should be able to retrieve information corresponding to natural language statements such as the example below. As future work we will develop packages for Oracle and MS SQL Server which will implement the algorithm for translation of the IFSQL. We will also develop an IFRDB system for storing and predicting game results (e.g. from football championships). The system uses the IFS theory as it initializes each team's rating with full indefiniteness; after each match the indefiniteness decreases and the rating increases or decreases according to the result. The system uses the "UPDATE ... MODIF" statement to modify the rating and could give answers to questions such as "How often was Liverpool approximately as good as Manchester Utd.?"
EFFICIENT CLUSTERING WITH FUZZY ANTS
S. SCHOCKAERT, M. DE COCK, C. CORNELIS AND E. E. KERRE
Fuzziness and Uncertainty Modelling Research Unit, Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 (S9), 9000 Ghent, Belgium
E-mail: Steven.Schockaert@UGent.be
http://fuzzy.UGent.be
In the past decade, various clustering algorithms based on the behaviour of real ants were proposed. The main advantage of these algorithms lies in the fact that no additional information, such as an initial partitioning of the data or the number of clusters, is needed. In this paper we show how the combination of the ant-based approach with fuzzy rules leads to an algorithm which is conceptually simpler, more efficient and more robust than previous approaches.
1. Introduction
While the behaviour of individual ants is very primitive, the resulting behaviour on the colony level can be quite complex. A particularly interesting example is the clustering of dead nestmates, as observed with several ant species under laboratory conditions [1]. By exhibiting only simple basic actions and without negotiating about where to gather the corpses, ants manage to cluster all corpses into 1 or 2 piles. The conceptual simplicity of this phenomenon, together with the lack of centralized control and a priori information, are the main motivations for designing a clustering algorithm inspired by this behaviour. Real ants are, because of their very limited brain capacity, often assumed to reason only by means of rules of thumb [2]. Therefore in this paper we propose a clustering approach in which the behaviour of the artificial ants (and more precisely, their stimuli for picking up and dropping items) is governed by fuzzy IF-THEN rules. The resulting algorithm is efficient, robust and easy to use thanks to the observed dataset independence of the parameter values involved.
2. Related work
Deneubourg et al. [1] proposed an agent-based model to explain how ants manage to cluster the corpses of their dead nestmates. Artificial ants (or agents) are moving randomly on a square grid of cells on which some items are scattered. Each cell can only contain a single item. Whenever an unloaded ant encounters an item, this item is picked up with a probability which depends on an estimation of the density of items of the same type in the neighbourhood. If this density is high, the probability of picking up the item will be low. When a loaded ant encounters a free cell on the grid, the probability that this item is dropped also depends on an estimation of the local density of items of the same type. However, when this density is high, the probability of dropping the load will be high. Simulations show that eventually all objects of the same type are clustered together. Lumer and Faieta [3] extended the model of Deneubourg et al., using a dissimilarity-based evaluation of the local density, in order to make it suitable for data clustering. Unfortunately, the resulting number of clusters is often too high and convergence is slow. Therefore, a number of modifications were proposed, by Lumer and Faieta themselves as well as by others (e.g. [4, 5]). Since two different clusters can be adjacent on the grid, heuristics are necessary to determine which items belong to the same cluster. Monmarché [6] proposed an algorithm in which several items are allowed to be on the same cell. Each cell with a nonzero number of items corresponds to a cluster. Each ant a is endowed with a certain capacity c(a). Instead of carrying one item at a time, an ant a can carry a heap of c(a) items. Probabilities for picking up at most c(a) items from a heap and for dropping the load on a heap are based on characteristics of the heap, such as the average dissimilarity between items of the heap. When an ant decides to pick up items, the c(a) items whose dissimilarity to the centre of the heap under consideration is highest are chosen. Two particularly interesting values for the capacity of an ant a are c(a) = 1 and c(a) = ∞. Monmarché proposes to apply this algorithm twice. The first time, the capacity of all ants is 1, which results in a high number of tight clusters. Subsequently the algorithm is repeated with the clusters of the first pass as atomic objects and ants with infinite capacity. After each pass k-means clustering is applied for handling small classification errors. In a similar way, in [7] an ant-based clustering algorithm is combined with the fuzzy c-means algorithm. Although some work has been done on combining fuzzy rules with ant-based algorithms for optimization problems [8],
to our knowledge until now fuzzy rules have not yet been used to control the behaviour of artificial ants in a clustering algorithm.
3. Fuzzy Ants

Our algorithm is in many ways inspired by the algorithm of Monmarché. We will consider however only one ant, since the use of multiple ants in a non-parallel implementation has no advantages. Instead of introducing several passes, our ant can pick up one item from a heap or an entire heap. Which case applies is governed by a model of division of labour in social insects by Bonabeau et al. [9]. In this model, a certain stimulus and a response threshold value are associated with each task a (real) ant can perform. The response threshold value is fixed, but the stimulus can change and represents the need for someone to perform the task. The probability that an ant starts performing a task with stimulus s and response threshold value θ is given by

T_θ(s) = s^n / (s^n + θ^n)
where n is a positive integer. Let us now apply this model to the problem at hand. A loaded ant can only perform one task: dropping its load. Let s_drop be the stimulus associated with this task and θ_drop the response threshold value. The probability of dropping the load is then given by

P_drop = s_drop^{n_i} / (s_drop^{n_i} + θ_drop^{n_i})    (1)
where i ∈ {1, 2} and n_1, n_2 are positive integers. When the ant is carrying only one item n_1 is used, otherwise n_2 is used. An unloaded ant can perform two tasks: picking up one item and picking up all the items. Let s_one and s_all be the respective stimuli and θ_one and θ_all the respective response threshold values. The probabilities for picking up one item and picking up all the items are given by

P_one = s_one^{m_1} / (s_one^{m_1} + θ_one^{m_1})    (2)
P_all = s_all^{m_2} / (s_all^{m_2} + θ_all^{m_2})    (3)
where m_1 and m_2 are positive integers. The values of the stimuli are calculated by evaluating fuzzy if-then rules as explained below. First we introduce some notations. Let E be a fuzzy relation in X, i.e. a fuzzy set in X², which is reflexive and T_W-transitive
(i.e. T_W(E(x, y), E(y, z)) ≤ E(x, z), for all x, y and z in X), where X is the set of items to be clustered and T_W the Lukasiewicz triangular norm defined by T_W(x, y) = max(0, x + y − 1), for all x and y in [0, 1]. For x and y in X, E(x, y) denotes the degree of similarity between the items x and y. For a heap H ⊆ X with centre c in X, we define avg(H) = (1/|H|) Σ_{h∈H} E(h, c) and min(H) = min_{h∈H} E(h, c). Let E*(H_1, H_2) be the similarity between the centres of the heap H_1 and the heap H_2. Because of the limited space we do not go into detail about how to define and/or compute the centre of a heap, as this can be dependent on the kind of data that needs to be clustered.
Dropping items. The stimulus for a loaded ant to drop its load L on a cell which already contains a heap H is based on the average similarity A = avg(H) and an estimation of the average similarity between the centre of H and items of L. This estimation is calculated as B = T_W(E*(L, H), avg(L)), which is a lower bound due to our assumption about the T_W-transitivity of E and can be implemented much more efficiently than the exact value. If B is smaller than A, the stimulus for dropping the load should be low; if B is greater than A, the stimulus should be high. Since heaps should be able to grow, we should also allow the load to be dropped when A is approximately equal to B. Our ant will perceive the values of A and B to be Very High, High, Medium, Low or Very Low. The stimulus will be perceived as Very Very High, Very High, High, Rather High, Medium, Rather Low, Low, Very Low or Very Very Low. These linguistic terms can be represented by triangular fuzzy sets. The rules for dropping the load L onto an existing heap H are summarized in Table 1.

Picking up items. An unloaded ant should pick up the most dissimilar item from a heap if the similarity between this item and the centre of the heap is far less than the average similarity of the heap. This means that by taking the item away, the heap will become more homogeneous. An unloaded ant should only pick up an entire heap if the heap is already homogeneous. Thus, the stimulus for an unloaded ant to pick up a single item from a heap H and the stimulus to pick up all items from that heap are based on the average similarity A = avg(H) and the minimal similarity M = min(H) and can be inferred using fuzzy rules. Because of the limited space, we omit the corresponding rule bases. For evaluating the fuzzy rules, we used a Mamdani inference system with centre of gravity (COG) as defuzzification method.
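As a small illustration of how an inferred stimulus is turned into a probability, the following Python sketch applies the response threshold rule from the beginning of this section; the numeric values are placeholders, not the paper's settings.

    def task_probability(stimulus, theta, n):
        """Response threshold rule: probability of engaging in a task."""
        return stimulus ** n / (stimulus ** n + theta ** n)

    # Illustrative values only; a loaded ant carrying several items uses n2.
    p_drop = task_probability(stimulus=0.7, theta=0.5, n=4)   # ~0.79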
Table 1. Stimulus for dropping the load.

              A is V. High   A is High   A is Medium   A is Low   A is V. Low
B is V. High  RH             H           VH            VVH        VVH
B is High     L              RH          H             VH         VVH
B is Medium   VVL            L           RH            H          VH
B is Low      VVL            VVL         L             RH         H
B is V. Low   VVL            VVL         VVL           L          RH
The algorithm. During the execution of the algorithm, we maintain a list of all heaps. Initially there is a heap, consisting of a single element, for every item in the dataset. Picking up an entire heap corresponds to removing a heap from the list. At each iteration our ant randomly chooses one heap H from the list and acts as follows (see the sketch below). If the ant is unloaded and H consists of a single element, the element is picked up with a fixed probability. Depending on the definition of the centre of a heap, comparing the minimal and average similarity of a heap consisting of two elements may not be meaningful. If H consists of two elements a and b, one of them is picked up with a probability (1 − E(a, b))^{k_1}, where k_1 is a small positive integer (e.g. 2); otherwise both elements are picked up with a fixed probability. If H consists of more than two elements, the stimuli for picking up a single element and for picking up all elements are inferred using the fuzzy rule bases and the corresponding probabilities are given by Eqns. (2)-(3). If the ant is loaded with a heap L, a new heap containing the load L is added to the list of heaps with a fixed probability. Else, if H consists of a single element a and L consists of a single element b, L is merged with H with a probability E(a, b)^{k_2}, where k_2 is a small positive integer (e.g. 2). Else, if H consists of more than one element, the stimulus for dropping the load is calculated and the probability that H and L are merged is given by Eq. (1). The most important parameters of the algorithm are n_1, n_2, m_1, m_2 in Eqns. (1), (2) and (3). Good results were found within a wide range of values satisfying m_1 = m_2 < n_1 < n_2. Moreover, the values of the parameters seem to be independent of the dataset, but are dependent on the definition of the similarity measure E that is used. All response threshold values were set to the modal value of the fuzzy set representing the linguistic term "medium" for the stimulus.
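The following Python sketch outlines one iteration of this loop. The fuzzy inference itself is abstracted behind the probability callbacks p_drop, p_pick_one and p_pick_all, and the fixed probabilities and the choice of which element to remove are illustrative placeholders, not the authors' settings.

    import random

    def ant_step(load, heaps, E, p_drop, p_pick_one, p_pick_all, k1=2, k2=2):
        """One iteration for the single ant; returns the (possibly new) load."""
        H = random.choice(heaps)
        if load is None:                                  # unloaded ant
            if len(H) == 1 and random.random() < 0.5:     # fixed probability
                heaps.remove(H)
                return H
            if len(H) == 2:
                a, b = H
                if random.random() < (1 - E(a, b)) ** k1:
                    return [H.pop(0)]                     # take one element away
                if random.random() < 0.5:                 # fixed probability
                    heaps.remove(H)
                    return H
                return None
            if len(H) > 2:
                if random.random() < p_pick_one(H):       # Eq. (2)
                    return [H.pop(0)]                     # placeholder for the
                                                          # most dissimilar item
                if random.random() < p_pick_all(H):       # Eq. (3)
                    heaps.remove(H)
                    return H
            return None
        if random.random() < 0.1:                         # fixed probability
            heaps.append(load)                            # drop load as new heap
            return None
        if len(H) == 1 and len(load) == 1:
            if random.random() < E(H[0], load[0]) ** k2:  # merge single items
                H.extend(load)
                return None
        elif len(H) > 1 and random.random() < p_drop(H, load):   # Eq. (1)
            H.extend(load)
            return None
        return load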
4. Concluding remarks
We have presented a clustering algorithm, inspired by the behaviour of real ants simulated by means of fuzzy IF-THEN rules. Like all ant-based clustering algorithms, no initial partitioning of the data is needed, nor should the number of clusters be known in advance. Initial experimental results indicate good scalability to large datasets. Outliers in noisy data are left apart and hence do not influence the result, and the parameter values appear to be dataset-independent which makes the algorithm robust.
Acknowledgments Martine De Cock and Chris Cornelis would like to thank the Fund for Scientific Research - Flanders for funding their research.
References
1. J. L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, L. Chrétien. The Dynamics of Collective Sorting: Robot-Like Ants and Ant-Like Robots. From Animals to Animats: Proc. of the 1st Int. Conf. on Simulation of Adaptive Behaviour. 356-363 (1990).
2. B. Hölldobler, E. O. Wilson. The Ants. Springer-Verlag, Heidelberg (1990).
3. E. D. Lumer, B. Faieta. Diversity and Adaptation in Populations of Clustering Ants. From Animals to Animats 3: Proc. of the 3rd Int. Conf. on the Simulation of Adaptive Behaviour. 501-508 (1994).
4. J. Handl, B. Meyer. Improved Ant-Based Clustering and Sorting in a Document Retrieval Interface. Proc. of the 7th Int. Conf. on Parallel Problem Solving from Nature. 913-923 (2002).
5. V. Ramos, F. Muge, P. Pina. Self-Organized Data and Image Retrieval as a Consequence of Inter-Dynamic Synergistic Relationships in Artificial Ant Colonies. Soft Computing Systems: Design, Management and Applications. 87, 500-509 (2002).
6. N. Monmarché. Algorithmes de Fourmis Artificielles: Applications à la Classification et à l'Optimisation. PhD thesis, Université François Rabelais (2000).
7. P. M. Kanade, L. O. Hall. Fuzzy Ants as a Clustering Concept. Proc. of the 22nd Int. Conf. of the North American Fuzzy Information Processing Society. 227-232 (2003).
8. P. Lučić. Modelling Transportation Systems using Concepts of Swarm Intelligence and Soft Computing. PhD thesis, Virginia Tech (2002).
9. E. Bonabeau, A. Sobkowski, G. Theraulaz, J. L. Deneubourg. Adaptive Task Allocation Inspired by a Model of Division of Labor in Social Insects. Working Paper 98-01-004, available at http://ideas.repec.org/p/wop/safiwp/98-01004.html (1998).
ENHANCED RBF NETWORK USING FUZZY CONTROL METHOD

KWANG-BAEK KIM
Department of Computer Engineering, Silla University, San 1-1, Gwaebop-dong, Sasang-gu, Busan 617-736, South Korea

JUNG-WOOK MOON
Department of Computer Engineering, Busan University, Jangjun-Dong, Busan, 609-735, South Korea

JAE-HYUN NAM
Division of Computer and Information Engineering, Silla University, San 1-1, Gwaebop-dong, Sasang-gu, Busan 617-736, South Korea
This paper proposes an enhanced RBF network that enhances the learning algorithms between the input layer and the middle layer and between the middle layer and the output layer individually, for improving the efficiency of learning. The proposed network applies the ART2 network as the learning structure between the input layer and the middle layer. An auto-tuning method for the learning rate and momentum is proposed and applied to the learning between the middle layer and the output layer: the learning rate and momentum are arbitrated dynamically by a fuzzy control system for the arbitration of the connection weights between the middle layer and the output layer. The experiment on the classification of number patterns extracted from citizen registration cards shows that, compared with conventional neural networks and the ART2-based RBF network, the proposed method achieves an improvement of performance in terms of learning speed and convergence.
1. Introduction

The RBF network is a feed-forward neural network that consists of three layers: input layer, middle layer and output layer. In the RBF network, because the operations required between layers are different, the learning algorithms between layers can be mutually different, so the optimum organization between layers can be constructed separately. The selection of the organization of the middle layer determines the overall efficiency of the RBF network [1]. Therefore this paper proposes and evaluates an enhanced RBF network that uses ART2 to organize the middle layer efficiently and applies an auto-tuning method, adjusting the learning rate and momentum using a fuzzy control system, for the arbitration of the connection weights between the middle layer and the output layer.
2. Enhanced RBF Network
The learning of the ART2-based RBF network is divided into two stages. In the first stage, competitive learning is applied as the learning structure between the input layer and the middle layer. Then supervised learning is accomplished between the middle layer and the output layer [2]. The enhanced RBF network applies ART2 to the learning structure between the input layer and the middle layer and proposes an auto-tuning method for arbitrating the learning rate for the adjustment of the connection weights between the middle layer and the output layer. When the absolute value of the difference between the output vector and the target vector for a pattern is below 0.1, the pattern is classified as accurate, and otherwise as inaccurate. The learning rate and momentum are arbitrated dynamically by applying the numbers of accurate and inaccurate classifications to the input of the fuzzy control system. Figure 1(a) shows the membership function to which the accuracy belongs, whereas Figure 1(b) shows the membership function to which the inaccuracy belongs.
Figure 1. The membership functions: (a) accuracy, over the number of correct classifications; (b) inaccuracy, over the number of incorrect classifications.
The values of C_low and C_high are calculated by Eq. (1) and Eq. (2):

C_low = log_2(N_i + N_p)    (1)
C_high = C_h + C_low    (2)

where N_i is the number of input nodes and N_p is the number of patterns.
In Figure 1, F, A and T are the membership functions indicating false, average and true, respectively. The fuzzy control rules for arbitrating the learning rate, expressed in if-then form, are as follows:

R1: If correct is F and incorrect is F Then α is B
R2: If correct is F and incorrect is A Then α is B
R3: If correct is F and incorrect is T Then α is B
R4: If correct is A and incorrect is F Then α is M
R5: If correct is A and incorrect is A Then α is M
R6: If correct is A and incorrect is T Then α is M
R7: If correct is T and incorrect is A Then α is S
R8: If correct is T and incorrect is F Then α is S
R9: If correct is T and incorrect is T Then α is S
Figure 2 shows the output membership function for calculating the learning rate which is going to be applied to learning.
Figure 2. The membership function of the learning rate.
In Figure 2, S, M and B are the membership functions indicating small, medium and big, respectively. When the accuracy and inaccuracy counts are given as input values of the fuzzy control system, the membership degrees of accuracy and inaccuracy for each membership function are calculated. After the calculation of the membership degree for each membership function, the fuzzy control rules are applied and the inference is accomplished by means of the Max-Min method. The learning rate is calculated by a defuzzification [3] method after the fuzzy inference. The momentum is calculated by Eq. (3):

μ = ξ − α    (3)
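A compact Python sketch of this arbitration is given below. Since rules R1-R9 assign the output solely according to the "correct" term (F gives B, A gives M, T gives S), the rule base collapses to three rules; the triangular breakpoints, the singleton output values standing in for COG defuzzification, the choice C_high = 2·C_low, and ξ = 1 in Eq. (3) are all assumptions made only for illustration.

    import math

    def tri(x, a, b, c):
        """Triangular membership function with peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def learning_rate(n_correct, n_inputs, n_patterns):
        c_low = math.log2(n_inputs + n_patterns)       # Eq. (1)
        c_high = 2 * c_low                             # assumption
        c_mid = (c_low + c_high) / 2
        f = tri(n_correct, -c_low, c_low, c_mid)       # "false": few correct
        a = tri(n_correct, c_low, c_mid, c_high)       # "average"
        t = tri(n_correct, c_mid, c_high, 2 * c_high)  # "true": many correct
        big, med, small = 0.9, 0.5, 0.1                # singleton outputs B, M, S
        w = f + a + t
        return (f * big + a * med + t * small) / w if w else med

    alpha = learning_rate(n_correct=12, n_inputs=100, n_patterns=136)
    mu = 1.0 - alpha        # momentum via Eq. (3) with xi = 1 (assumption)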
3. Experiment and Performance Evaluation

We analyzed the number of epochs and the convergence by applying 136 number patterns of size 10x10, extracted from citizen registration cards, to the conventional delta-bar-delta method [4], the ART2-based RBF network and the learning algorithm proposed in this paper. Table 1 shows the parameters of each algorithm that were used for the experiment.

Table 1. Parameters used for learning.
In Table 1, α indicates the learning rate, ρ the vigilance parameter of ART2, ξ the parameter for the calculation of momentum, and β, κ, γ the parameters fixed by the delta-bar-delta algorithm. The experiment was executed 10 times under the criterion of classifying an input pattern as accurate when the absolute value of the difference between the output vector of the input pattern and the target vector is below ε (ε ≤ 0.1), within 10000 epoch executions.
Figure 3 shows the graph of the change rate of the TSS (Total Sum of Squares) of the error according to the number of epochs. As shown in Fig. 3, the proposed method has a faster speed of initial convergence and a smaller TSS of error than the conventional methods.

Figure 3. Graph of the total sum of squares of the error versus the number of epochs (0-2500), comparing delta-bar-delta, the RBF network on the basis of ART2, and the proposed method.
4. Conclusions

An enhanced RBF network is proposed in this paper, which uses the ART2 algorithm between the input layer and the middle layer to enhance the learning efficiency of the conventional ART2-based RBF network, and applies an auto-tuning method that arbitrates the learning rate and momentum automatically by means of a fuzzy control system, to arbitrate the weight values efficiently between the middle layer and the output layer. The experiments of applying the proposed method to the classification of number patterns extracted from citizen registration cards show two results related to performance: first, the proposed method did not react sensitively in the number of learning epochs and the convergence, whereas conventional methods did; and second, the total sum of squares of the error decreased more remarkably than with conventional methods.
References
1. C. Panchapakesan, D. Ralph and M. Palaniswami (1998), "Effects of Moving the Centers in an RBF Network," Proceedings of IJCNN, Vol. 2, pp. 1256-1260.
2. K. B. Kim, S. W. Jang and C. K. Kim (2003), "Recognition of Car License Plate by Using Dynamical Thresholding Method and Enhanced Neural Networks," Lecture Notes in Computer Science, LNCS 2756, pp. 309-319.
3. M. Jamshidi, N. Vadiee and T. J. Ross (1993), Fuzzy Logic and Control, Prentice-Hall.
4. R. A. Jacobs (1988), "Increased rates of convergence through learning rate adaptation," IEEE Transactions on Neural Networks, Vol. 1, No. 4, pp. 295-308.
PATTERN RECOGNITION WITH SPIKING NEURAL NETWORKS AND DYNAMIC SYNAPSES

A. BELATRECHE, L.P. MAGUIRE, T.M. MCGINNITY
Intelligent Systems Engineering Laboratory, School of Computing and Intelligent Systems, Faculty of Engineering, University of Ulster, Magee campus, Northland Road, Derry, BT48 7JL, Northern Ireland, United Kingdom
{a.belatreche, lp.maguire, tm.mcginnity}@ulster.ac.uk
Spiking neural networks represent a more plausible model of real biological neurons in which time is considered as an important feature for information representation and processing in the human brain. In this paper, we apply spiking neural networks with dynamic synapses for pattern recognition in multidimensional data. The neurons are based on the integrate-and-fire model, and are connected using a biologically plausible model of dynamic synapses. Unlike the conventional synapse employed in artificial neural networks, which is considered as a static entity with a fixed weight, the dynamic synapse (weightless synapse) efficacy changes upon the arrival of input spikes, and depends on the temporal structure of the impinging spike train. The training of the free parameters of the spiking network is performed using an evolutionary strategy (ES) where real values are used to encode the dynamic synapse parameters, which underlie the learning process. The results show that spiking neurons with dynamic synapses are capable of pattern recognition by means of spatio-temporal encoding.
1. Introduction
Spiking neural networks (SNN) have been the subject of significant recent research, reflecting the view that spikes have a key role in biological information processing [1][2]. It is believed that real neurons use more information than the average firing rate to perform computation, as differences in firing times could convey information about the input stimuli, and the relative order of firing times could be used as an alternative to rate coding [5][6][7]. Most simulations of neural networks share the assumption that synaptic efficacy (weight) is static during the reception of afferent spike trains. However, recent experimental studies of real biological neurons show that the synaptic efficacy generating the postsynaptic potential is a variable (dynamic) quantity which depends on the presynaptic activity, i.e. the temporal structure of the presynaptic spike train [8][10]. In this work, activity-dependent synapses (dynamic synapses) are used to connect integrate-and-fire neurons, and their computation capability is evaluated when applied to pattern recognition of non-linearly separable data. The inputs are represented in the form of spike trains in order to be handled by the spiking network, and the timing of the maximum response of the output
neurons reflects the detection of a particular pattern. The STDP (spike-time-dependent plasticity) algorithm has been used to train the network in an unsupervised way; however, the resulting performance proved unsuccessful. The free parameters of the network are therefore tuned in a supervised way, using an evolutionary strategy (ES) where real-value encoding is used to encode the synaptic parameters, which are optimised to minimize the error between the output neurons' actual maximum response times and the desired ones.
2. Network Architecture

The integrate-and-fire neuron model is used to model spiking neurons. A neuron is represented by a voltage across its cell membrane and a threshold. The status of the neuron is determined by the integration of its excitatory and inhibitory postsynaptic potentials (EPSP, IPSP). When its membrane potential reaches a certain threshold, the neuron generates (fires) a spike or action potential. The neuron dynamics are modelled by the following equation:

τ_m dV/dt = −V(t) + R_m I_syn(t)
where τ_m is the time constant of the neuron membrane, R_m is its resistance and I_syn represents the total synaptic input (τ_m = 40 ms, R_m = 100 MΩ). A feed-forward fully connected spiking network is used, where the layers are labelled I, H, O for the input, hidden and output layer respectively, as shown in Figure 1. The spiking neurons are connected via dynamic synapses.
Figure 1. Network architecture (layers I, H, O): input neurons are sources of spike trains, connected through dynamic synapses to postsynaptic neurons in the next layers. Each dynamic synapse causes a change in the postsynaptic potential and the receiving neuron merely integrates the changes caused by the different dynamic synapses. The input spike trains are transformed into a spatio-temporal structure to be learned by the output neuron using the time of its maximum response.
3. Model of Dynamic Synapse
We consider the dynamic synapse (DS) model introduced in [9]. The model assumes that a DS is represented by a finite amount of resources called neurotransmitters. Each presynaptic spike (arriving at time t_sp) activates a
fraction (U_SE, utilization of synaptic efficacy) of resources, which then quickly inactivate with a time constant (τ_in) and recover with a time constant (τ_rec). The synapse dynamics are given by the following equations:

dx/dt = z/τ_rec − U_SE · x · Ap(t − t_sp)
dy/dt = −y/τ_in + U_SE · x · Ap(t − t_sp)
dz/dt = y/τ_in − z/τ_rec

where x, y and z are the fractions of resources in the recovered, active and inactive states, respectively. The postsynaptic current is taken to be proportional to the fraction of resources in the active state, I_syn = A_SE · y(t). A_SE is the maximum strength of the synapse and Ap(t − t_sp) is the action potential received at time t_sp; see the illustration of Matlab simulations in Figure 2.
Figure 2. Left: time course of the three states of a dynamic synapse (X, Y, Z) and the response of a neuron (V) connected to this depressing synapse. The dynamic synapse is injected with a regular spike train of 20 Hz (that is, an inter-spike interval ISI = 50 ms), with the first spike at time t = 200 ms and the last spike at t = 700 ms. Right: time course of a neuron (V) connected to three different dynamic synapses (DS) represented by their states (X1, X2, X3); each DS is injected with a 20 Hz spike train starting at a different onset time (200, 350, 400 ms respectively). Parameter values are: U_SE = 0.8, A_SE = 250 pA, τ_m = 40 ms, R_m = 100 MΩ, τ_rec = 800 ms, τ_in = 3 ms.
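The following forward-Euler sketch in Python integrates the synapse equations of section 3 together with the integrate-and-fire equation of section 2, treating each presynaptic spike as a discrete event; the parameter values follow Figure 2, while the time step is an arbitrary choice.

    import numpy as np

    dt, T = 0.1, 1000.0                  # time step and duration (ms)
    tau_in, tau_rec = 3.0, 800.0         # inactivation / recovery (ms)
    U_SE, A_SE = 0.8, 250e-12            # utilisation, max synaptic strength (A)
    tau_m, R_m = 40.0, 100e6             # membrane time constant (ms), resistance

    spike_times = np.arange(200.0, 700.0 + dt, 50.0)   # regular 20 Hz train
    x, y, z, v = 1.0, 0.0, 0.0, 0.0
    for t in np.arange(0.0, T, dt):
        if np.any(np.abs(spike_times - t) < dt / 2):   # presynaptic spike arrives
            dr = U_SE * x                              # activated fraction
            x, y = x - dr, y + dr
        x += dt * (z / tau_rec)                        # recovery of resources
        z += dt * (y / tau_in - z / tau_rec)           # inactive pool
        y += dt * (-y / tau_in)                        # decay of active fraction
        i_syn = A_SE * y                               # postsynaptic current
        v += (dt / tau_m) * (-v + R_m * i_syn)         # membrane potential (V)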
4. The Training Algorithm
The aim of the training algorithm is to make the output neuron reach its maximum response at different times for different classes of patterns. For this purpose, an evolutionary strategy based supervised training is implemented to
optimize the free parameters of the dynamic synapses that underlie the learning of different patterns. The choice of evolutionary strategies is motivated by their suitability for treating continuous optimisation problems, without the complexity of binary encoding schemes [3][4]. Real-value encoding is used and a combination of Gaussian and Cauchy mutation is implemented to tune the synaptic efficacy by evolving the time constant τ_in, which makes the neuron reach its maximum response earlier (smaller value) or later (bigger values); see Figure 3. The spiking neural network is mapped to a vector of real values where the synaptic time constants are reordered with respect to the different layers. A set of such vectors (individuals) forms the population to be evolved. A set of temporal input patterns is defined and denoted by {P(t_1, ..., t_m)}, where P(t_1, ..., t_m) represents a single input pattern such that the components t_1, ..., t_m define the firing times of each input neuron in I. For these input patterns a set of target output times, denoted by {t_o^t}, is assigned at the output neurons o ∈ O. These output times represent the expected timings of the output neurons' maximum response. The ES aims to minimise the following objective function:

E = Σ_{p=1}^{T} Σ_{o∈O} (t_o^a − t_o^t)²    (4)
where t_o^a and t_o^t denote the actual and target timings respectively, and T is the total number of patterns in the training set. This error function is used to rank the individuals, thus no fitness function is calculated.
Figure 3. Different timings of the maximum response of a neuron with two different values of τ_in.
The implementation of the self-adaptive evolutionary strategy is as follows:

1. Generate an initial population of μ individuals, and set g = 1. Each individual is taken as a pair of real-valued vectors, (x_i, η_i), ∀ i ∈ {1, ..., μ}, where the x_i's are objective variables representing the synaptic time constants, and the η_i's are standard deviations for mutations. The vector x_i is the phenotype representation of a spiking neural network.
2. Evaluate the error (Eq. (4)) that each individual (x_i, η_i), i = 1, ..., μ, of the population generates.
3. Each parent (x_i, η_i) generates a single offspring (x_i', η_i') by, for j = 1, ..., n, where n is the size of x_i and η_i:

η_i'(j) = η_i(j) · exp(τ' · N(0, 1) + τ · N_j(0, 1))
x_i'(j) = x_i(j) + η_i'(j) · δ_j

where x_i(j), x_i'(j), η_i(j) and η_i'(j) denote the j-th components of the vectors x_i, x_i', η_i and η_i' respectively. N(0, 1) denotes a normally distributed random number with mean 0 and standard deviation 1. N_j(0, 1) indicates that the random number is generated anew for each value of j. The factors τ and τ' are set to 1/(2n^{1/2})^{1/2} and 1/(2n)^{1/2}. δ_j is a Cauchy random variable with a scale of 1, and is generated anew for each value of j.
4. Evaluate each offspring (x_i', η_i'), i = 1, ..., μ.
5. Generate a new population P^(g+1) using tournament selection and elitism to keep track of the best individual at each generation.
6. Stop if a maximum number of generations is reached (g = g_max) or a target error is met; otherwise, set g = g + 1 and go to step 3.
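A Python sketch of the mutation in step 3, using numpy; the η update uses the lognormal self-adaptation and the object variables are perturbed with scale-1 Cauchy numbers, as described above.

    import numpy as np

    def mutate(x, eta, rng):
        """Self-adaptive mutation of one parent (x, eta) into one offspring."""
        n = x.size
        tau_prime = 1.0 / np.sqrt(2.0 * n)
        tau = 1.0 / np.sqrt(2.0 * np.sqrt(n))
        eta_new = eta * np.exp(tau_prime * rng.standard_normal()
                               + tau * rng.standard_normal(n))
        delta = rng.standard_cauchy(n)        # generated anew for each j
        return x + eta_new * delta, eta_new

    rng = np.random.default_rng(0)
    x, eta = np.full(40, 5.0), np.full(40, 0.5)   # 40 synaptic time constants
    child_x, child_eta = mutate(x, eta, rng)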
5. Results and Discussions
The performance of dynamic-synapse-based neural networks in discriminating between patterns of different classes is evaluated on the non-linearly separable XOR problem and two benchmark data sets, namely the IRIS data and the Breast Cancer data obtained from University of Wisconsin Hospitals [11]. The implementation of these neuronal models and the training approach is carried out using Matlab R13. As spiking neural networks operate on temporally encoded inputs or spike trains, the real-valued features are first temporally encoded before being fed to the spiking net (XOR problem: see Table 1; IRIS and Cancer data sets: real-valued features are linearly transformed into firing times as in [13]). The ES-based supervised training was able to find the dynamic synapse time constants that learned the XOR problem perfectly, by making the output neuron reach its maximum response earlier (10 ms) for the inputs {0,1} and {1,0}, and later (40 ms) for the inputs {0,0} and {1,1}. Also for the IRIS and Breast Cancer data sets, the performance of the obtained networks after ES-based training was comparable to classical artificial neural networks trained with the Matlab BP and LM methods (backpropagation and Levenberg-Marquardt), SpikeProp [12] (a gradient-based training of a spiking net without DS) and an ES-trained spiking net without DS [13]; see Table 2.
handling and discriminating between different patterns of different classes. It is not the aim of this work, though, to claim that these detailed models outperform the classical methods when applied to classification tasks; it is rather to further our understanding of the way real neurons represent and communicate information. Future work will apply these models to inherently temporal data such as speech signals and time series, and assess their performance in extracting relevant features from these signals and performing recognition tasks.

Table 1. Temporal encoding of the XOR problem: a logic value of '0' is assigned a late firing time (40 ms), while a logic value of '1' is assigned an early firing time (10 ms).

Logic values       Temporal codes (ms)
x1  x2  XOR        t1  t2  t_out
0   0   0          0   0   40
0   1   1          0   6   10
1   0   1          6   0   10
1   1   0          6   6   40
Table 2. Comparison of different network performances, network architecture, training and test classification accuracy.
Method                 IRIS                          CANCER
                       Net       Train    Test      Net        Train    Test
SNN with DS            4x10x1    96%      97.3%     9x6x1      97.2%    97.3%
SNN without DS [13]    4x10x1    98.6%    97.3%     9x6x1      97.2%    98.2%
Matlab BP              50x10x3   97.3%    94.6%     64x15x2    98.5%    96.9%
Matlab LM              50x10x3   98.6%    96%       64x15x3    98.0%    97.3%
SpikeProp [12]         50x10x3   97.5%    96.2%     64x15x3    97.8%    97.6%
References
1. W. Gerstner, Phys. Rev. E 51(1): pp. 738-758 (1995).
2. W. Gerstner, W. Kistler, Cambridge University Press, (2002).
3. X. Yao, Y. Liu, G. Lin, IEEE Trans. Evol. Comput., Vol. 3, No. 2, pp. 82-102, (1999).
4. T. Bäck, H. P. Schwefel, Evol. Comput., Vol. 1, No. 1, pp. 1-23, (1993).
5. J.J. Hopfield, Nature, vol. 376, pp. 33-36, (1995).
6. W. Maass, Proc. 7th ACNN, pp. 1-10, (1996).
7. S.J. Thorpe, D. Fize and C. Marlot, Nature, 381:520-522, (1996).
8. G. Fuhrmann, I. Segev, H. Markram and M. Tsodyks, J. Neurophysiol., 87(1): 140-148, (2002).
9. M. Tsodyks, K. Pawelzik and H. Markram, Neural Computation, 10, 821-835 (1998).
10. H. Markram and M. Tsodyks, Nature, 382, 807-810 (1996).
11. O.L. Mangasarian, W. H. Wolberg, SIAM News, Vol. 23, No. 5, pp. 1-18, (1990).
12. S.M. Bohte, H. La Poutré, J.N. Kok, Proc. 8th ESANN, pp. 419-425, (2000).
13. A. Belatreche, L.P. Maguire, T.M. McGinnity, Q.X. Wu, Proc. 7th JCIS, pp. 1524-1527, (2003).
EVOLVING AUTOREGRESSIVE NEURAL NETWORKS FOR THE PROBLEM OF MODELING NONLINEAR HETEROGENEOUS TIME SERIES*

NIMA REYHANI AND MAHMOOD KHARRAT†
Info Society Department, Iran Telecom Research Center, North Karegar Ave., Tehran, Iran
This paper addresses the problem of modeling heterogeneous time series through a community of neural net-based autoregressive models which evolves based on the dynamics detected from the system's observed behavior. The system assigns a NN-based autoregressive model to each time window, where the windows are defined based upon the estimated values of the noise variance. Thus, the system can learn more whenever a new data point is present. In other words, the VC-dimension‡ of the learning machine is usually enough to learn the whole series.
1. Introduction

Heterogeneous time series are those which are generated by time-varying dynamical systems, or are under the effect of multiple or varying noise sources, which are not linear in nature. For example, a time-ordered series gathered by observing a weighted addition of two different chaotic attractors with different coefficients which generally vary along time is called heterogeneous. Some approaches have been proposed for modeling in such cases, as this paper concerns, which use multiple learning machines as done for homogeneous ones [1]. All of them consider the whole series as arising from an integrated system. The basic idea behind such approaches is what has already been proved by Takens [2, 3]. This theorem shows that the state of the system can be reconstructed, to any desired accuracy, from a set of finite windows of the time series, i.e. regressors as illustrated in expression (1):

w = [y(t), y(t − 1), ..., y(t − d + 1), y(t − d)]^T    (1)
The theorem does not say anything about the regressors. Here, we present an approach to modeling heterogeneous time series through defining an evolving neural net community. The paper is organized as follows: sections 2 and 3 provide brief descriptions of the Gamma test and other works on time series
Email: {nreyhani, kharrat}@itrc.ac.ir, Tel:+98 21 8497357
' Vapnik-Chervonenkis dimension
21 1
212
Modeling. In section 4 we introduce our approach to heterogeneous time series modeling. And the paper is finished with a conclusion on the proposed approach in section 5.
2. GammaTest Gamma Test is a non parametric data analysis approach applied for estimating the noise variance of a finite data set [4]. This algorithm determines the least Minimum Square Error (MSE) that can be achieved from a finite training set. Suppose the input data be as ( (xi,yi)llli< M}, where the inputs x of R", and the corresponding output y of R. The system calculate the following formula
where YN[i,k] is the k-th nearest neighbor of q , and 15 k l p. Then system finds regression line of the pairs (6M(k), vM(k)):l< k l p where the longitude from origin of the regression line can be considered as estimated value of noise variance.
3.
Neural Network Autoregressive Models
ARMAX structure is a more general than of (Auto Regressive with external input) ARX model which can be formalized as
where the optimal predictor is as follows
Regression and parameters vector are defined as:
[ ( t , e ) =[& - 1) y ( t - 2 ) y~( t - +(t ~ ( e) t , ~ (- t1, e ) A
e=b,
a , GP, ~
- d ) ~ (- td - i ) ~~ (- td - m),
~ (- tk , e)]'
P,A P,?Y~ Y, y k I T
Note that, in the situation where C(.) is equal to 1, the model called (Auto regressive with external input) ARX or (Finite Impulse Response)FIR where
213 both C(.) & A(.) are equal to 1. Based on the regression vector, i.e. regressors, as input to a multilayer neural network, the whole system called NNARMAX, NNARX, and NNFIR respectively [61. 4.
The Proposed Approach
Here, a description of the approach is drawn based on the assumption that the observing system variation occurs slowly in course of time. As mentioned in chapter 1, there is no note on how one can determine the windows sizes or regressors in Takens theorem. Also, in case of applying neural nets as approximators, the corresponding VC-dimension leads to become zero after considerable number of epochs [ 5 ] . To circumvent these, we propose to apply a community of neural nets with unknown population and prefixed input features with one approximator at the starting point. The training process leads to inserting a new local approximator, e.g. NN-ARX, based on the differences of noise variance estimation of the previously seen time data and ones which consists of additional data points. Notice that the element of regression vector or lag space is determined using a feature selection process using a nonlinear component analysis (NPCA) [ 6 ] , Lipschitz quotients based [ 6 ] , or even Gamma Test based approaches [4]. The whole algorithm is as following.
procedure k + 1 // t 1 // Chunkk
Learning (Ordered-Training-Set as OTS, E ) as index of Windows as index of each training data 8 / / Corresponding training set of windowk rreCent r of first L points of OTS //L=some things while any data point left in TS DataSetk + (L number of left data point from OTS) U Chunkk Compute r value of Chunkk as rrecent if 1 rrecent - rprevious I E then Chunkk+l+ 0 define a new window wk Regressor, + Feature-Selection (ChUnkk) "Amk NNARX (Regressork , Chunkk) K + k+l else t
t
rprevious
t
end if end while end procedure
rrecent
214
As it is seen from the algorithm, the Gamma Test detects situations at where the achieved MSE of a window is less than one that can be achieved through using the new subset of data point. Such inequality means that the new subset belongs to a new variation or the system has been varying at that stage. At this stage the training procedure performing in the last approximator stops, and new approximator inserts into the community. By the way, the new data points adds to the currently defined chunk and considered as a training set for defining and training new approximator. So, the community of neural nets evolves through defining a new neural-based approximator to the neural-community.
5. Concluding Remarks In this paper, we proposed an approach to modeling time series data which are not homogeneous in their nature, based on estimating the least square error that can be achieved using the data set in order to determine where new windows should be defined. These make the advantages of having windows with dynamic sizes. Also, from sensitivity analysis viewpoint, another problem of using singular approximators is its low sensitivity when the system facing with a long term ones. By defining new windows the system make the opportunity of having enough information chunks for learning new dependencies so that the whole learning performance only depends on the sampling data and the original system changes. Experimental results show the capability of the approach to circumvent the mentioned problems with respect to both heterogeneous and long term time series. References 1. N. Reyhani, K. Badie, and M. Kharrat, “A New Approach to Heterogeneous time series Analysis Using Hybrid Case Based Reasoning and Additive Fuzzy Systems”, WSEAS Trans. On Systems, Issue 4, Vol. 2, Oct. ’03. 2. F. Takens, “Detecting Strange Attractors in Turbulence”, proc. of Dynamical Systems and Turbulence, Vol. 898, Lecture Notes on Math., 336-381, Springer-Verlag, 1981. 3. J. McNames, “Local averaging optimization for chaotic time series prediction”, Neuro computing, Vol. 48, No.1-4, pp. 279-297, Oct. 2002. 4. D. Evans, Antonia J. Jones, “A proof of the Gamma test” Proc. Roy. SOC. Series A 458(2027), 2759-2799,2002 5. Y. Bengio, et al. “Learning Long-Term Dependencies with Gradient Descent is Difficult”, IEEE Trans. on NN, Vol. 5 , Issue 2, Mar’94 6. M. Norgaard, et al., “Neural Networks for Modeling and Control of Dynamic Systems”, Springer, 2001.
RECOGNITION OF IDENTIFIERS FROM SHIPPING CONTAINER IMAGES USING FUZZY BINARIZATION AND NEURAL NETWORK WITH ENHANCED LEARNING ALGORITHM

KWANG-BAEK KIM
Department of Computer Engineering, Silla University, San 1-1, Gwaebop-Dong, Sasang-Gu, Busan, 617-736, Korea
In this paper, we propose and evaluate a novel recognition algorithm for container identifiers that effectively deals with various distortions and recognizes identifiers from container images captured in various environments. The proposed algorithm, first, extracts the area containing only the identifiers from container images using CANNY masking and bi-directional histogram method. The extracted identifier area is binarized by the proposed fuzzy binarization method. Then a contour tracking method is applied to the binarized area in order to extract the container identifiers, which are the target for recognition. In this paper we also propose and apply an enhanced ART1-based RBF network for recognition of container identifiers. The results of experiment for performance evaluation on the real container images showed that the proposed algorithm performs better for extraction and recognition of container identifiers compared to conventional algorithms.
Keywords : Container Recognition, Fuzzy Binarization, Contour Tracking, Enhanced ART1-based RBF Network
1. Introduction
Recently, the quantity of goods transported by sea has increased steadily, since the cost of transportation by sea is lower than that of other transportation methods. Various automation methods are used for the speedy and accurate processing of transport containers in the harbor. The automation systems for transport container flow processing are classified into two types: the barcode processing system and the automatic recognition system of container identifiers based on image processing. However, these days the identifier recognition system based on images is more widely used in harbors. The identifiers of transport containers are given in accordance with the terms of the ISO standard and consist of 4 code groups: shipping company codes, container serial codes, check digit codes and container type codes [1][2]. The ISO standard prescribes only the code types of container identifiers, while it doesn't define other features such as size, position and interval of identifier characters. Other features, such as the foreground and background colors of containers, the font type, and the size of identifiers, vary from one container to another. These variations in the features of
container identifiers make the process of extraction and recognition of identifiers quite difficult [3]. Since the identifiers are printed on the surface of the containers, the shapes of identifiers are often impaired by environmental factors during transportation by sea. Damage to the surface of the container may change the shapes of identifier characters in container images. Therefore, after preprocessing the container images, an additional procedure must be applied in order to decide whether the results are truly the edges of identifiers or just noise from the background.
2. Container Identifier Extraction

Figure 1 shows examples of container images representing two different types of identifier arrangement on the surface of containers.
(a) Horizontal arrangement of Identifiers
(b) Vertical arrangement of Identifiers
Figure 1. Examples of Container Images
2.1. Extraction of Container Identifier Areas
Figure 2 shows the various steps in the algorithm for identifier area extraction.
Figure 2. Extraction algorithm of identifier areas
Since container images include noise caused by the distortion of the outer surface and shape of containers in the upper and lower areas, calculating the vertical coordinates of identifier areas ahead of the horizontal coordinates can generate more accurate results. Hence, we calculated the vertical coordinates of identifier areas by applying a vertical histogram to the edge maps, and applied a horizontal histogram to the block corresponding to the vertical coordinates, calculating the horizontal coordinates.
2.2. Extraction of Individual Identifiers

We extracted container identifiers from identifier areas by binarizing the areas and applying a contour tracking algorithm to the binarized areas. Container images include diverse colors, globally changing intensity and various types of noise, so that the selection of a threshold value for image binarization is difficult using traditional methods, which use distance measures [4]. Therefore, we propose a novel fuzzy binarization algorithm to separate the background and identifiers for the extraction of container identifiers. The proposed fuzzy binarization algorithm defines I_Mid as the mean intensity value of the identifier area for the selection of the interval of the membership function. I_Mid is calculated by Eq. (1):
I_Mid = (Σ_{i=0}^{H−1} Σ_{j=0}^{W−1} I_ij) / (H × W)    (1)
where I_ij is the intensity of pixel (i, j) of the identifier area, and H and W are the pixel heights and widths of the identifier area respectively. I_Min and I_Max denote the minimum and the maximum intensity value in the identifier area respectively. The algorithm determining the interval [I_Min^New, I_Max^New] of the membership function in the proposed fuzzy binarization is as follows:

Step 1: I_MinF = I_Mid − I_Min, I_MaxF = I_Max − I_Mid
Step 2: If I_Mid > 128 Then I_MidF = 255 − I_Mid Else I_MidF = I_Mid
Step 3: σ = min(I_MinF, I_MidF, I_MaxF)
Step 4: Calculate the normalized I_Min^New and I_Max^New:
        I_Min^New = I_Mid − σ
        I_Max^New = I_Mid + σ
In most cases, individual identifiers are embossed in the identifier area, and the noise between identifier codes and the background is caused by shadows. We used the fuzzy binarization algorithm to remove the noise from the shadows. The membership function of the proposed fuzzy binarization is shown in Figure 3.
Figure 3. Proposed fuzzy membership function
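Under the interval computation reconstructed above, a Python sketch of the whole binarization step might look as follows; the linear rising membership over [I_Min^New, I_Max^New] and the 0.5 cut are assumptions standing in for the membership function of Figure 3.

    import numpy as np

    def fuzzy_binarize(gray):
        """gray: 2-D uint8 array of the identifier area; returns a 0/1 mask."""
        i_mid = float(gray.mean())
        i_min, i_max = float(gray.min()), float(gray.max())
        i_min_f = i_mid - i_min                             # Step 1
        i_max_f = i_max - i_mid
        i_mid_f = 255.0 - i_mid if i_mid > 128 else i_mid   # Step 2
        sigma = min(i_min_f, i_mid_f, i_max_f)              # Step 3
        lo, hi = i_mid - sigma, i_mid + sigma               # Step 4: new interval
        mu = np.clip((gray - lo) / max(hi - lo, 1e-9), 0.0, 1.0)
        return (mu >= 0.5).astype(np.uint8)                 # 1 = identifier pixel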
Next, we extracted the container identifiers from the binarized identifier area by using the contour tracking method. In this paper, the 4-directional contour tracking method using a 2x2 mask was applied, considering the whole preprocessing time of container images [5].

Figure 4. Two types of identifier extraction algorithms: (a) identifier extraction in a vertical identifier area, (b) identifier extraction in a horizontal identifier area. Both transform the identifier area to a grayscale image and classify identifiers into groups sequentially (in the vertical direction in case (a)).
In this paper, the extracted identifiers are arranged in a single row by using Euclidean distances between identifiers, and classified into three code groups. The Euclidean distance is calculated by measuring the distance between the start pixel of the first identifier and the start pixel of another identifier having a vertical offset from the first identifier. The vertical offset must be less than one half of the vertical size of the first identifier. Then, by combining the identifier sequences in every row, one row of identifiers is created. Finally, the identifiers in the row are classified sequentially into code groups according to the ISO standard [1]. Figure 4(a) shows the procedure for identifier extraction in an identifier area with vertical arrangement and Figure 4(b) shows the extraction procedure in an area with horizontal arrangement.
3. Identifier Recognition using an Enhanced RBF Network
For the improvement of the success rate of recognition, this paper proposes an enhanced RBF network that adapts the ART1 network to the learning structure between the input layer and the middle layer and applies the output layer of the ART1 network to the middle layer. In the ART1 network, the vigilance parameter determines the allowable degree of mismatch between any input pattern and saved patterns [6]. Moreover, since many applications of image recognition based on the ART1 network assign an empirical value to the vigilance parameter, a reduction of the success rate of recognition may occur. To correct this defect, this paper enhances the ART1 network by adjusting the vigilance parameter dynamically according to the homogeneity between patterns by using Yager's intersection operator, which is one of the fuzzy connection operators [7]. Eq. (2) shows the equation applied to the ART1 network for refinement in this paper, which dynamically adjusts the vigilance parameter ρ by using Yager's intersection operator:

ρ(n + 1) = 1 − min(1, √((1 − ρ(n))² + (1 − ρ(n − 1))²))    (2)
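Eq. (2) is a one-liner in code; the sketch below assumes ρ is tracked over two consecutive learning steps:

```python
import math

def update_vigilance(rho_n, rho_prev):
    """Dynamic vigilance update of Eq. (2), built on Yager's intersection
    (a parametric t-norm with w = 2)."""
    return 1.0 - min(1.0, math.hypot(1.0 - rho_n, 1.0 - rho_prev))
```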
4. Performance Evaluation
A total of 100 container images of 754×504 pixel size and 256 colors were used in the experiment. By using the proposed extraction algorithm, the identifier areas were successfully extracted from all 100 images.
Table 1. Performance comparison of identifier extraction
Applying the identifier extraction algorithms proposed in this paper and the histogram-based algorithm [3] to the extracted identifier areas, the experimental results were summarized and compared in Table 1. Our algorithm first distinguished the background and container identifiers by using the proposed fuzzy binarization, and then extracted identifiers by using contour tracking. Table 2 compares learning performances in the experiment that applied the conventional ART1-based RBF network algorithm and the enhanced ART1-based RBF network to container identifiers extracted by the proposed algorithm mentioned above. As shown in Table 2, the number of clusters created in the learning process of the proposed ART1 network was much lower than for the conventional ART1 network, which means that it is efficient to use the proposed ART1 network in the construction of the middle layer of the enhanced ART1-based RBF network.
Table 2. Comparison of learning performance
Table 3 compares the recognition performances of the two algorithms by the number of recognition successes in the experiment. As shown in Table 3, the recognition rate of the enhanced ART1-based RBF network was higher than that of the conventional ART1-based RBF network.
Table 3. Comparison of recognition performance
5. Conclusions
In this paper, we have proposed and evaluated a novel recognition algorithm of container identifiers for the automatic recognition of transport containers. Container images demonstrate certain characteristics, such as irregular size and position of identifiers, diverse colors of background and identifiers, and the impaired shape of identifiers caused by container damage and the bent surface of containers, making identifier recognition by image processing difficult. Hence, we proposed a fuzzy binarization algorithm to separate the background clearly. For identifier recognition, we also proposed an enhanced ART1-based RBF network that organizes the middle layer effectively using the enhanced ART1 neural network. The proposed network adjusts the vigilance parameter dynamically according to the homogeneity between patterns. Results of the recognition experiment, obtained by applying the conventional ART1-based RBF network and the enhanced RBF network to the 1054 extracted identifiers, show that the enhanced ART1-based RBF network has a higher rate of recognition compared to the conventional ART1-based RBF network.
References
1. ISO-6346, Freight Containers - Coding, Identification and Marking, 1995.
2. N. B. Kim, "Character Segmentation from Shipping Container Image using Morphological Operation," Journal of Korea Multimedia Society, Vol.2, No.4, pp.390-399, 1999.
3. M. Y. Nam, E. K. Lim, N. S. Heo and K. B. Kim, "A Study on Character Recognition of Container Image using Brightness Variation and Canny Edge," Proceedings of Korea Multimedia Society, Vol.4, No.1, pp.111-115, 2001.
4. Liane C. Ramac and Pramod K. Varshney, "Image Thresholding Based on Ali-Silvey Distance Measures," Pattern Recognition, Vol.30, No.7, pp.1161-1173, 1997.
5. K. B. Kim, S. W. Jang and C. K. Kim, "Recognition of Car License Plate by Using Dynamical Thresholding Method and Enhanced Neural Networks," Lecture Notes in Computer Science, LNCS 2756, pp.309-319, 2003.
6. Grossberg, S., "Adaptive pattern classification and universal recoding: parallel development and coding of neural feature detectors," Biol. Cybern., Vol.23, pp.187-202, 1976.
7. H. J. Zimmermann, Fuzzy Set Theory and its Applications, Kluwer Academic Publishers, 1991.
FUZZINESS-DRIVEN EDGE DETECTION BASED ON RENYI'S α-ORDER FUZZY ENTROPY
I. K. VLACHOS AND G. D. SERGIADIS
Aristotle University of Thessaloniki, Faculty of Technology, Department of Electrical & Computer Engineering, Telecommunications Laboratory, University Campus, GR-54124, Thessaloniki, Greece. E-mail: {ivla,sergiadi}@auth.gr
This paper presents an algorithm for edge detection in images based on fuzzy sets theory. In order to perform edge detection we exploit the fuzziness present at edge locations in an image. The proposed parametric scheme is based on the notion of the α-order entropy of a fuzzy set. The parameter α controls the sensitivity of the algorithm to detect various types of edges. A detailed comparison with the algorithm proposed in [1] is carried out.
1. Introduction
Edge detection is a fundamental task in pattern recognition and machine vision. An edge is defined as the boundary between two regions with relatively distinct gray-level properties. Most edge detection techniques are based on the computation of a local derivative operator. Fuzzy sets theory [2] has been successfully applied to many image processing and pattern recognition problems. The extensive use of fuzzy logic in digital image processing is mainly favored by the ability of fuzzy sets to cope with "qualitative" measures, such as the edgeness of a region, by modelling the ambiguity and vagueness often present in digital images. In addition, fuzzy sets theory provides us with a solid mathematical framework for incorporating expert knowledge into digital image processing systems. In this paper we present an efficient algorithm for edge detection in images based on fuzzy sets theory, as a modification of the algorithm introduced in [1]. The method exploits the intrinsic fuzziness present at areas along an edge. The fuzziness is measured using Renyi's α-order entropy of a fuzzy set. An intuitive membership function is also introduced in order to describe the "degree of edgeness" of a pixel in its vicinity.
Figure 1. (a) e_α curves for α → 1 (solid line), α = 0.3 (short-dashed line), and α = 3.0 (dashed line). (b) Sugeno-based parametric index of fuzziness for λ = 80.0 (solid line), and λ = −0.9877 (short-dashed line).
2. Fuzzy Sets and Fuzzy Entropies
2.1. Image Representation in the Setting of Fuzzy Sets Theory
Let us consider an image X of size M × N pixels, having L gray levels g ranging from 0 to L − 1. The image X can be regarded as an array of fuzzy singletons [3]-[5]. Each element of the array denotes the membership value μ_X(g_ij) of the gray level g_ij, corresponding to the (i, j)-th pixel, with regard to a predefined image property such as brightness, edgeness, etc. Using the fuzzy sets notation, image X can be represented as:
X = {μ_X(g_ij)/g_ij | i = 0, 1, ..., M − 1; j = 0, 1, ..., N − 1}.    (1)
2.2. The α-Order Entropy of a Fuzzy Set
As an extension of Shannon's entropy, Renyi in [6] defined the α-order entropy H_α of a probability distribution (p_1, p_2, ..., p_n). Bhandari and Pal in [7] introduced the α-order fuzzy entropy, which in the case of an image X defined as in Eq. 1 is given by:

H_α(X) = (1/(M N)) Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} e_α(μ_X(g_ij)),    (2)
where α (≠ 1) is a positive real parameter and e_α(μ_X(g_ij)) is defined as:

e_α(μ) = (1/(1 − α)) log₂(μ^α + (1 − μ)^α).    (3)
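A minimal numerical sketch of Eqs. (2)-(3), assuming NumPy; the clipping constant is only an implementation guard against log₂(0):

```python
import numpy as np

def alpha_order_fuzzy_entropy(mu, alpha):
    """Bhandari-Pal alpha-order fuzzy entropy of a membership array mu
    (values in [0, 1]); alpha > 0 and alpha != 1."""
    mu = np.clip(mu, 1e-12, 1.0 - 1e-12)                 # avoid log2(0)
    e = np.log2(mu**alpha + (1.0 - mu)**alpha) / (1.0 - alpha)
    return e.mean()                                      # (1 / MN) * sum e_alpha
```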
It should be mentioned that the α-order fuzzy entropy constitutes a one-parameter generalization of De Luca and Termini's entropy H_LT defined in [8], since lim_{α→1} H_α = H_LT. Fig. 1(a) illustrates Eq. 3 for various values of the parameter α.
3. Fuzziness-Based Edge Detection
3.1. Fast Fuzzy Edge Detection [1]
An edge is defined as the boundary between two regions with relatively distinct intensity properties. According to this definition, edgy pixels belong to both regions. Therefore, it is expected that these pixels should exhibit high fuzziness values. In [1] the linear index of fuzziness was used to measure the degree of fuzziness of pixels inside a square window W. Let us consider an "optimal" edge in a 3 × 3 neighborhood W:

W = [0 0 0; 50 50 50; 100 100 100].    (4)

The corresponding membership values can be easily calculated by intensity normalization:

μ_W = [0.0 0.0 0.0; 0.5 0.5 0.5; 1.0 1.0 1.0].    (5)
Definition 3.1. The degree of edgeness of the (m, n)-th pixel is given by

μ_Edgeness(g_mn) = min{1, (2/w) Σ_{i=0}^{w−1} Σ_{j=0}^{w−1} min(μ_X(g_ij), 1 − μ_X(g_ij))},    (6)

where w is the size of the sliding square window. Eq. 6 can be rewritten for the window W centered at the (m, n)-th pixel as:

μ_Edgeness(g_mn) = min{1, w γ_l(W)},    (7)

where γ_l is the linear index of fuzziness defined in [9], which in the case of an image X is given by the following formula:

γ_l(X) = (2/(M N)) Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} min(μ_X(g_ij), 1 − μ_X(g_ij)).    (8)
The index of fuzziness considers the intersection of a fuzzy set and its complement. This means that the index of fuzziness measures the lack of distinction between fuzzy sets and their complements.
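In code, the linear index of fuzziness and the edgeness of Eq. (7) amount to the following sketch (NumPy assumed):

```python
import numpy as np

def linear_index_of_fuzziness(mu):
    """Linear index of fuzziness, Eq. (8): the scaled intersection of a
    fuzzy set with its complement."""
    return 2.0 * np.minimum(mu, 1.0 - mu).mean()

def edgeness(window_mu):
    """Degree of edgeness of the pixel at the centre of a w x w membership
    window, Eq. (7): min{1, w * gamma_l(W)}."""
    w = window_mu.shape[0]
    return min(1.0, w * linear_index_of_fuzziness(window_mu))
```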
In [1] the membership function of Eq. 7 was modified in order for the algorithm to become more robust, since the spatial calculation of the membership values is noise sensitive. The modified function is given by the following formula:

μ'_Edgeness(g_mn) = min{1, w γ_l(W) (max_spatial(g_ij) − min_spatial(g_ij))/g_max},    (9)

where g_max is the maximum gray level of the image. Furthermore, in [1] the parametric Sugeno fuzzy complement was also used in order to control the sensitivity of the algorithm to edges. The Sugeno fuzzy complement is given by the following equation:

c_λ(μ) = (1 − μ)/(1 + λμ),    (10)

with λ ∈ (−1, ∞).
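The Sugeno complement itself is straightforward; the sketch below is a direct transcription of Eq. (10):

```python
def sugeno_complement(mu, lam):
    """Sugeno fuzzy complement of Eq. (10); lam in (-1, inf).
    lam = 0 recovers the standard complement 1 - mu."""
    return (1.0 - mu) / (1.0 + lam * mu)
```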
3.2. Proposed Method
The fuzziness-based approach to edge detection proposed in [1] exhibits some drawbacks. Let us consider the following situation where the pixels inside the sliding window have only two different intensity values, that is

W = [0 0 0; 80 80 80; 80 80 80].    (11)

The corresponding membership values, computed by intensity normalization, are either 0 or 1, thus assigning to the central pixel a degree of edginess equal to zero, according to Eq. 9. Therefore, Eq. 9 simply fails to extract this type of edges. Another issue arises from the non-symmetric behavior of the parametric index of fuzziness based on the Sugeno fuzzy complement, shown in Fig. 1(b), with product as the intersection operator. Let us consider the case of the sliding window being located at two different types of edges, one "bright" and one "dark", with corresponding membership values:
& I
0.0 0.0 0.0 0.9 0.9 0.9 [l.O 1.0
,I
&/ =
0.0 0.0 0.0 0.1 0.1 0.1 [LO 1.0
,.,I
.
(12)
Both edges are "equivalent" in the sense that the strength of the edge in both cases, calculated as the gray-level difference, equals 0.9. Thus, it is expected that both edges should be treated similarly, that means assigning the
same degree of edgeness to the central pixel. However, this is not the case if we consider the parametric approach based on the Sugeno fuzzy complement with product as the intersection operator. Due to its non-symmetric nature, the Sugeno-based index of fuzziness assigns different weights for the same departure from the values 0 and 1. For example, in the case of the edges mentioned above, if we calculate the degree of edgeness for λ = −0.9877, in the case of the "bright" edge the membership value is 1.0, while for the "dark" we have 0.2465. This shows clearly that even though the edges are "equivalent" they are not treated the same way. In order to overcome this drawback we propose the following algorithm for fuzziness-driven edge detection based on Renyi's α-order entropy of a fuzzy set. Fuzzy image processing consists of three stages: (a) fuzzification, (b) suitable modification of the membership values, and (c) defuzzification. In the fuzzification stage of the proposed algorithm, the membership function μ_X is initialized according to:

μ_X(g_ij) = (g_ij − g_min)/(g_max − g_min),    (13)
for all i ∈ {0, 1, ..., M − 1} and j ∈ {0, 1, ..., N − 1}, where g_min and g_max stand for the minimum and maximum gray levels of the image respectively. Using Eq. 13 for initializing the membership values has the advantage of stretching out the membership function over the entire gray-level range and normalizing the intensity values in order to lie in the unit interval [0, 1]. Fuzzy entropy is a measure of the fuzziness of a fuzzy set, arising from the inherent ambiguity carried by the fuzzy set itself. Moreover, entropy is also a measure of the amount of information contained in a system. This approach provides us with another way of looking at the problem of edge detection, since it is expected that edge points will carry more information than non-edge ones, because object contours are the primary sources of stimulation of the human visual system. In order to ensure that the proposed method successfully retrieves all existing edges we modify Eq. 9 as follows:
where δ is given by the following formula:
Figure 2. (a) Computer-generated image. Results obtained using the Sugeno-based parametric index of fuzziness for (b) λ = −0.9877 and (c) λ = 80.0. (d) Proposed edge detection scheme for α = 0.3.
and μ̄_X(g_ij) is defined as

μ̄_X(g_ij) = g_ij / max_{i ∈ {0,...,M−1}, j ∈ {0,...,N−1}} (g_ij).    (16)
Parameter α of the α-order fuzzy entropy controls the sensitivity of the proposed method to edges. As α → 0 the measure becomes more sensitive in extracting edges from the image under processing. It should be mentioned that instead of Renyi's α-order entropy, we can use any parametric index of fuzziness derived from parametric t-norms, such as Dombi's t-norm, or the t-norms proposed by Hamacher, Schweizer-Sklar etc. These parametric fuzzy t-norms can be used to implement symmetric indices of fuzziness in the general form of

γ_T(X) = A Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} T_p(μ_X(g_ij), 1 − μ_X(g_ij)),    (17)
where T(·) is the implemented t-norm, p denotes the control parameter of the t-norm, and A is a normalization factor ensuring that the index of fuzziness takes values in the [0, 1] interval. This is an essential requirement for the index in order to be qualified as a sensible measure of fuzziness.
4. Experimental Results
In order to evaluate the performance of the proposed edge detection scheme, we applied the proposed algorithm to various real-world and synthetic images of different types. We have considered gray-scale images of size 256 × 256 pixels with 8 bits-per-pixel gray-tone resolution. Fig. 2(a) demonstrates how the proposed method overcomes the drawbacks stated in Sec. 3.2. The synthetic computer-generated image consists of the two "optimal" edges described by Eq. 12.
Figure 3. (a) Test image, and edge images obtained with the parameter set to (b) α = 3.0, and (c) α = 0.3 using the proposed method. (d) Image obtained using Sobel operator.
Figure 4. (a) Test image, and edge images obtained with the parameter set to (b) α = 3.0, and (c) α = 0.3 using the proposed method. (d) Image obtained using Prewitt operator.
The horizontal edges of the rectangle correspond to the "bright" edges, while the vertical ones to the "dark". Figs. 2(b) and 2(c) depict the results obtained using the Sugeno-based index of fuzziness with product as the intersection operator and the value of the parameter λ set to 80.0 and −0.9877 respectively. The result derived using the proposed approach is illustrated in Fig. 2(d) for α = 0.3. By comparing the produced edge maps we can observe that our method successfully treats both edges equivalently, by assigning them the same degree of edgeness, while at the same time it overcomes the drawback of the sliding window being located at an area consisting of only two different intensity values. The proposed method was also tested using real-world images. The results obtained using our approach are compared to those derived using different edge-detection techniques, such as the Sobel and Prewitt operators. Figs. 3(b) and 3(c) show the edge maps of the test image illustrated in Fig. 3(a) for different values of the parameter α. For the image of Fig. 3(b) we set α = 3.0, while for the one in Fig. 3(c) the parameter was
set to 0.3. Fig. 3(d) shows the result obtained after applying the Sobel operator. From the images one can observe that as the parameter α decreases more edges emerge, since the α-order fuzzy entropy assigns higher weights even to small gray-level differences. Compared to the Sobel operator, the proposed method extracts significantly different types of edges and has the advantage that its sensitivity can be easily controlled by suitably tuning the parameter α of Renyi's fuzzy entropy. The efficiency of our algorithm is mainly favored by the ability of fuzzy sets to model the ambiguity and vagueness present in digital images. Another result is also depicted in Fig. 4, from which similar conclusions can be drawn.
5. Conclusions
This paper presents an efficient algorithm for edge detection in images based on Renyi's α-order entropy of a fuzzy set. The algorithm measures the degree of edgeness in image regions in terms of fuzziness and informational content, using a sliding-window approach. The sensitivity of the proposed method can be adjusted in order to retrieve edges of different types. Finally, our future work involves a detailed investigation of the behavior of various indices of fuzziness derived from parametric t-norms in order to perform fuzziness-driven edge detection.
References
1. H. R. Tizhoosh, Fast and robust fuzzy edge detection, in Fuzzy Filters for Image Processing, M. Nachtegael, D. Van der Weken, D. Van De Ville, E. E. Kerre (Eds.), Springer-Verlag, Heidelberg, (2003) pp. 178-194.
2. L. A. Zadeh, Fuzzy sets, Inf. Contr. 8, (1965) pp. 338-353.
3. S. K. Pal and R. A. King, Image enhancement using fuzzy sets, Electronics Letters 16, (1980) pp. 376-378.
4. S. K. Pal and R. A. King, Image enhancement using smoothing with fuzzy sets, IEEE Trans. Syst., Man, and Cybern. SMC-11, (1981) pp. 494-501.
5. S. K. Pal, A note on the quantitative measure of image enhancement through fuzziness, IEEE Trans. Pattern Anal. Machine Intell. PAMI-4, (1982) pp. 204-208.
6. A. Renyi, On measures of entropy and information, Proc. Fourth Berkeley Symposium on Mathematical Statistics and Probability, (1960).
7. D. Bhandari and N. R. Pal, Some new information measures on fuzzy sets, Information Sciences 67, (1993) pp. 209-228.
8. A. De Luca, S. Termini, Definition of a nonprobabilistic entropy in the setting of fuzzy set theory, Inf. Contr. 20, (1972) pp. 301-312.
9. A. Kaufmann, Introduction to the Theory of Fuzzy Subsets, Academic Press, New York, (1975).
STATISTICAL ANALYSIS OF ECG SIGNALS WITH WAVELET TECHNIQUES AND METHODS OF NON-LINEAR DYNAMICS
IMRE PAZSIT
Chalmers University of Technology, Department of Reactor Physics, SE-412 96 Goteborg, Sweden. E-mail: [email protected]
This paper demonstrates the use of some methods of signal analysis, performed on ECG and in some cases blood pressure signals, for the identification of the heart health status. Spectral analysis, continuous wavelet transform and fractal analysis were tested. The analysis was made on data from mice and rats. A correlation was found between the health status of the mice and the rats and some of the statistical descriptors, most notably the phase of the cross-spectra between ECG and blood pressure, and the fractal properties and dimensions of the interbeat series (RR-interval fluctuations).
1. Introduction
Methods of time and frequency domain analysis of random processes [1], used earlier mostly for non-living systems, are being increasingly used in medicine and biology [2]. As a consequence, one can note papers even at nuclear conferences that report on analysis of heartbeat data or some other medical diagnostics signal. The purpose of the present paper is, in line with this development, to report on some spectral, wavelet and fractal analyses of heartbeat data, i.e. ECG signals, with the goal of identifying features that indicate unhealthy status. In contrast to ordinary stationary processes most often observed in engineering applications, an ECG signal is not stationary, rather quasi-periodic. It consists of the periodic repetition of the so-called PQRST complex, which has a "spiky", strongly non-harmonic character. In most cases so far, the analysis was done on the interbeat intervals, i.e. the times between the R-R peaks, also called HRV (for a review see Ref. [2]). This series is more like a stationary random signal, and it has interesting fractal properties.
One subject of the present contribution is the test of spectral, wavelet and fractal techniques on data from mice and rats.
2. The experimental data
The data used in this analysis consist of two groups. One is taken from mice, and in these measurements only ECG data were recorded. There are 4 data files available, which will be labelled as Cases 1, 2, 3 and 5. In the second group, blood pressure and ECG data were recorded simultaneously from rats. There are two measurements of this kind, labelled as bphr1 and bphr2. One illustrative case of mice data is shown in Fig. 1 below. The data appear quite regular, with a normal heart frequency of about 10 Hz. This frequency is clearly seen in the wavelet transform of the time signal, represented by the horizontal line just below mid-height of the periodogram (lower part of Fig. 1). Each QRS complex induces a conical shape on the periodogram, whereas the interbeat interval fluctuations appear as deviations of the horizontal HRV line from being a straight line. Such a representation is, however, quite ineffective to quantify or even sensitively detect heart rate variations. Such anomalies can be better seen in the analysis of the R-R interbeat intervals.
Figure 1. Time series from measurement with mouse, Case 1, together with its continuous Morlet wavelet transform.
3. Analysis of the data
Spectral analysis was only made on the raw ECG signals. The wavelet transform was performed on both the raw signals and on the time series represented by the R-R interbeat intervals as functions of the beat number (these will be called HRV signals for simplicity). The fractal analysis was only performed on the HRV signal. These will be described below.
3.1. Spectral analysis of the raw ECG signals
For brevity, we cite here only the interesting finding. It is related to the cross-spectra and correlations between blood pressure and ECG data. There are two such measurements for rats. For both cases, auto-spectra of the blood pressure and the ECG signal were calculated, as well as the coherence and the phase of the cross-spectra, as functions of the frequency. The results with the interesting finding are shown in Fig. 2.
Figure 2. APSD, coherence and phase of the blood pressure and ECG signal for rats, case bphr2. The linear dependence of the phase curve breaks down for low frequencies.
The autospectra (APSD) show the fundamental frequency and higher-harmonic peaks of the periodic heartbeats. The fundamental frequency is under 5 Hz, both for the BP and the ECG signals, which is characteristic for rats. The interesting part lies in the coherence and the phase relationships between the two signals. Despite the low coherence, there is a clear linear dependence of the phase of the cross-spectrum as a function of the frequency,
up to about 100 Hz. Such a linear relationship is due to a constant time delay between the first and the second signal. The slope is proportional to the time delay between the two signals. Both measurements show a clear time delay effect, but there is a significant difference between the two cases. For the case bphr1 (not shown here), the linear phase stretches from zero frequency up to high frequencies. In the case bphr2, Fig. 2, the phase is constant and equal to −180° from zero frequency up to 7-8 Hz, i.e. the two signals are out of phase in this region, varying in opposite phase without time delay. This indicates that for the case bphr2, at low frequencies there is another process that interferes with the transport process. In industrial diagnostics, the occurrence of a new process interfering with the pure transport is often a sign of an anomaly. Our finding is consistent with such an expectation, since, due to the difference in their physical exercise, the rat bphr1 can be taken as healthier than bphr2.
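For illustration, the delay can be read off the phase slope as in the following sketch, which assumes SciPy's cross-spectral density estimator and a least-squares fit of the unwrapped phase; the segment length and frequency range are free choices:

```python
import numpy as np
from scipy.signal import csd

def delay_from_phase(x, y, fs, fmax=100.0):
    """Estimate the time delay between two signals from the slope of the
    cross-spectrum phase (a pure delay gives phase = -2*pi*f*tau)."""
    f, pxy = csd(x, y, fs=fs, nperseg=1024)
    sel = (f > 0) & (f < fmax)
    phase = np.unwrap(np.angle(pxy[sel]))
    slope = np.polyfit(f[sel], phase, 1)[0]   # radians per Hz
    return -slope / (2.0 * np.pi)             # delay in seconds
```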
3.2. Wavelet and fractal analysis of the HRV data
There are standard methods of extracting RR-interval data from raw ECG signals, but these are usually based on the identification of the QRS complex in the ECG waveform. For our data these methods were not applicable due to the data quality. Instead, the R-R peaks were identified as the maxima of the negative derivative of the original series. The wavelet transform has been used for the analysis of both ECG and HRV data for a long time [4]. There have been primarily two strategies in use so far. One is to search for characteristic patterns either in the ECG or in the HRV data that could indicate anomalous behaviour [4]. The other avenue is to look for self-similarity properties of the HRV signal on the wavelet transform [4], or with other methods [3], and to determine its fractal dimension. A search for the qualitative self-similarity properties of the data available to us was made on the RR-interval series derived in the way described above. The interval series as functions of the beat number, and their wavelet transform, for the HRV data of one case of mice are shown in Fig. 3. The wavelet transform was calculated with the wavelet toolbox of the Interactive Data Language (IDL). The wavelet transform clearly shows a self-similar fractal structure in the cardiac dynamics, in that large structures which can be observed at large scales occur in a similar but smaller form at smaller scales, over the whole range of scales considered. For the quantitative investigation of the fractal properties of the inter-
Figure 3. RR-interval series as a function of beat number for mouse Case 1 (top part) and its continuous wavelet transform (bottom part).
beat intervals, the method suggested by Higuchi [5] was used, in the form described in Ref. [3]. This consists of the calculation of the average length L(k) of the RR-interval curve v(n), n = 1, 2, 3, ..., N, at various length scales k as the average of the quantity

L_m(k) = (1/k) [ Σ_{i=1}^{⌊(N−m)/k⌋} |v(m + ik) − v(m + (i − 1)k)| ] (N − 1)/(⌊(N−m)/k⌋ k)    (1)

with respect to m = 1, 2, 3, ..., k. The negative slope of log(L(k)) plotted against log k gives the fractal dimension of the curve.
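A compact sketch of Higuchi's method as given by Eq. (1), assuming NumPy; kmax is a free choice of the largest scale:

```python
import numpy as np

def higuchi_fd(v, kmax=16):
    """Higuchi fractal dimension of a series v (e.g. RR intervals), Eq. (1).
    Returns D as the negative slope of log L(k) versus log k."""
    v = np.asarray(v, dtype=float)
    N = len(v)
    ks, Ls = [], []
    for k in range(1, kmax + 1):
        Lm = []
        for m in range(1, k + 1):
            idx = np.arange(m - 1, N, k)          # samples m, m+k, m+2k, ...
            n = len(idx) - 1                      # number of increments
            if n < 1:
                continue
            dist = np.abs(np.diff(v[idx])).sum()
            norm = (N - 1) / (n * k)              # length normalization
            Lm.append(dist * norm / k)
        ks.append(np.log(k))
        Ls.append(np.log(np.mean(Lm)))
    return -np.polyfit(ks, Ls, 1)[0]
```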
Plots of log(L(k)) vs log k are shown for all four cases of mice RR-interval series in Fig. 4. It is seen that all cases have a fractal dimension D larger than unity, and in particular 1.5 ≤ D ≤ 1.9 for cases 1, 3 and 5. These cases have a quite definite single value of the fractal dimension. However, Case 2 shows two segments of the curve, representing a different fractal dimension for shorter and longer scales. These are approximately equal to D_S = 1.6 and D_L = 3.5. The tendency (D_S < D_L) corresponds to what in Ref. [3] was found for healthy but elderly patients. It is interesting to note that the HRV data showing two fractal dimensions in our case belongs to an animal with an anomaly (excess in growth hormones). Hence a correlation is found between health status and fractal properties of the interbeat intervals.
Figure 4. Plots of log(L(k)) vs log k for all four cases of mice RR-interval series.
4. Conclusions
These preliminary investigations have demonstrated or reproduced the use of some spectral, wavelet transform and fractal dimension analysis in the classification of the status of the cardiac dynamics of animals. More detailed analysis will be performed on an annotated database of human ECG signals along the wavelet-based methods indicated in the paper.
Acknowledgments
The measured data from mice and rats were obtained from Dr. Goran Bergstrom, Department of Physiology, Goteborg University, which is acknowledged with thanks. This work was financially supported by the science foundation Adlerbertska forskningsstiftelsen and Kristina Stenborgs stiftelse.
References
1. I. Pazsit, Int. J. for Real-Time Systems 27, 97-113 (2004)
2. M. Malik, J. T. Bigger, A. J. Camm, R. E. Kleiger, A. Malliani et al., European Heart Journal, 17, 354-381 (1996)
3. L. Guzman-Vargas, E. Calleja-Quevedo and F. Angulo-Brown, Fluct. Noise Letters, 3, L83-L89 (2003)
4. B. Suki, A. M. Alencar, U. Frey, P. Ch. Ivanov, S. V. Buldyrev, A. Majumdar, H. E. Stanley et al., Fluct. Noise Letters, 3, R1-R25 (2003)
5. T. Higuchi, Physica D, 31, 277-283 (1988)
CLASSIFICATION OF TWO-PHASE FLOW REGIMES VIA IMAGE ANALYSIS BY A NEURO-WAVELET APPROACH*
C. SUNDE, S. AVDIC† AND I. PAZSIT
Department of Reactor Physics, Chalmers University of Technology, SE-412 96 Goteborg, Sweden; Faculty of Sciences, Department of Physics, University of Tuzla, 75000 Tuzla, Bosnia-Herzegovina
A non-intrusive method of two-phase flow identification is investigated in this paper. It is based on image processing of data obtained from dynamic neutron radiography recordings. Classification of the flow regime types is performed by an artificial neural network (ANN) algorithm. The input data to the ANN are some statistical functions (mean and variance) of the wavelet transform coefficients of the pixel intensity data. The investigations show that bubbly and annular flows can be identified with high confidence, but slug and churn-turbulent flows are more often mixed up between themselves.
1. Introduction
Two-phase flow patterns are usually classified into four classical so-called flow regime types. These are 1) bubbly, 2) slug, 3) churn-turbulent, and 4) annular flow regimes (see Fig. 1). Recognition and, possibly, control of the flow regime types is essential in numerous practical applications. Although the flow classification can be done reliably in fully instrumented channels in which thermocouples, pressure transducers, flow-meters etc. are available, a more challenging alternative would be to use non-intrusive methods. In this field, the availability of methods is not as wide. Non-intrusive methods so far have been based on radiation attenuation measurements, such as X-rays [1] or gamma-rays. These methods are usually based on the detection of collimated rays penetrating the flow, and the processing of the variation of the intensity, modulated by the flow, by various statistical methods (probability distributions, auto- and cross-correlations and spectra). A qualitatively different approach, which will be pursued in this paper,
* This work is supported by the Swedish Centre for Nuclear Technology
Work partially supported by contract 14.10-030515/200310002 of the Swedish Nuclear Power Inspectorate
is to use image analysis. After all, the concept of flow regime arises from an intuitive judgement of the topology of the flow, based on visual observation. Such images of two-phase flow can be produced in transparent pipes easily with visible light, but in metal pipes neither X-rays nor gamma-rays are applicable. X-rays do not penetrate the wall, and gamma-rays cannot, in general, be produced with the desired intensity such that an image with good contrast and dynamics can be achieved. However, dynamic neutron radiography has been developed to produce images of two-phase flow in metal pipes at the Kyoto University Research Reactor Institute [2]. Some of these measurements were made available to us, and were used in the present analysis. Some sample images are shown in Fig. 1.
Fig. 1. Neutron radiography images of the four basic flow regimes (bubbly, slug, churn and annular flow).
The classification algorithm
Our objective in this work was to find an algorithrmc identification method. Artificial neural networks (ANNs) were selected for this purpose. Static images for raining and test data were provided as follows. IN the recording at our disposal, the heating was gradually increased starting with cold water. During the measurement, all four flow regimes occurred in sequence, with a smooth transition between the different regimes. The sections of the four regies were identified. From each sequence, 200 tiff images were extracted and used in the classification. Each image consisted of about 60 000 pixels (4 11* 153 pixels). In this work a simple feedANN forward network with backpropagation was used consisting of Tonsig an input layer, an output layer and one hidden layer (Fig. 2). As the figure shows, 40 nodes were used in the hidden layer, two different types Fig. 2. The layout of the ANN used. of transfer functions between the layers, and four output nodes, representing the four different regimes. Before the output nodes, thresholding was used; values about the threshold were converted
to unity and values below to zero. An identification was considered definite (but obviously not necessarily correct) if and only if one of the output nodes showed unity and the others zero. The regime type was defined by which output node was "fired".
3. Generation of the input data with wavelet transform
The information content in the images, represented by the roughly 60 000 pixels, obviously needs to be reduced in dimension before using it as input. The wavelet transform seems to be a very effective tool to achieve this goal ([3], [4]). Wavelet transform coefficients are often better (more sensitive) features in pattern recognition tasks than the original data. As regards two-phase flow, the four flow regimes have structures that show spatial variations at different scales. Hence, wavelet coefficients from a multi-resolution analysis seem to be suitable input data. To further reduce the number of input data, at each resolution scale one can condense the coefficients further, i.e. use their first two statistical moments (mean and variance). As it turned out, the radiography recordings had a relatively poor quality in terms of quantitative usage for the present purpose, such that a full 2-D multiscale resolution analysis was not practical. After the step of splitting the data into approximations and details, no useful information content remained in the details D1. Thus, only the data of the first-level approximation were used. These were further condensed into two parameters, the mean and the variance; a sketch of this feature extraction is given below. Hence the network used in this work had two input nodes. This will be improved significantly in the future, as work is going on with better quality input data from new measurements.
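The feature extraction described above reduces each image to two numbers; a minimal sketch assuming the PyWavelets package (the choice of a Daubechies wavelet is illustrative):

```python
import pywt

def image_features(img):
    """Condense a flow image into the two ANN inputs used here: the mean
    and variance of the level-1 wavelet approximation coefficients."""
    a1, _ = pywt.dwt2(img.astype(float), "db2")   # (approximation, details)
    return a1.mean(), a1.var()
```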
4. Results
Fig. 3 shows the mean of the wavelet coefficients corresponding to the first approximation of the multiresolution analysis for various wavelets, as functions of the sample number. The figure shows that this parameter has a relatively good discrimination power. The variance of the same values, not shown here, shows more overlapping, and hence less discrimination. The results are shown in Fig. 4, where tests were made with 200 samples for each regime. The results show that the identification was quite successful. In particular, the success ratio was 100% for annular flow, and very close to 100%
" mm
Fig. 3. The mean of the wavelet coefficients for the different flow regimes as funtions of the sample number
Fig. 4. The results of the identification procedure with different wavelets.
also for bubbly flow. Slug and churn-turbulent flow were mistaken for each other in a few percent of the cases. This pilot study shows the possibilities of the method, even if the quality of the input data did not make the full power of the wavelet pre-processing visible. Further work is going on to test and develop the method further with better quality input data.
Acknowledgments
We thank Prof. Kaichiro Mishima, Kyoto University Reactor Research Institute, for providing us with the flow images. The visit of Senada Avdic to Chalmers was supported by the Swedish Nuclear Power Inspectorate (SKI).
References
1. M. A. Vince and R. T. Lahey, Int. J. Multiphase Flow 8, 93 (1982).
2. K. Mishima, S. Fujine, K. Yoneda, K. Yonebayashi, K. Kanda and H. Nishihara, Proc. 3rd Japan-US Seminar on Two-Phase Flow Dynamics, Ohtsu, Japan (1988).
3. N. Hazarika, J. Z. Chen, A. C. Tsoi, A. Sergejew, Signal Processing 59, 61 (1997).
4. P. S. Addison, The Illustrated Wavelet Transform Handbook, IoP (2002).
PERFORMING AN ANALYSIS OF FUZZY FUSION FOR SPERMATOZOA IN FERTILE HUMAN
JIA LU
Computer Science and Information System, University of Phoenix, 5050 NW 125 Avenue, Coral Springs, FL 33076, U.S.A. cluiia(ii.email.uophx.edu
YUNXIA HU
Department of Health Science, Nova Southeastern University, U.S.A. szmizvh!&ivfloridu.coin
To evaluate the sperm quality of fertile men, the sperm morphology was examined based on fuzzy data fusion. All sperms were also observed in infertile patients. The abnormal spermatozoa of the fertile men were classified into different types: head, neck, and tail. Detailed study of each sperm track was possible, as the image sequence retained the individual sperm movement traces for easy inspection. The criteria defining the types of the classification could be set in order to track the details of the fuzzy fusion analysis. A complete morphology assessment was performed in the evaluation. The numbers of normal and abnormal sperm were identified, and the type of defect was recorded for each abnormal spermatozoon. The results in this paper show that fuzzy fusion morphology provides a unified and consistent framework to express different shapes of spermatozoa in fertile humans.
1. Introduction
Human spermatozoa morphology is assessed routinely as part of standard laboratory analysis in the diagnosis of human male infertility. This practice has its origins in the work of (Seracchioli 1995), which showed that sperm morphology is significantly different in fertile compared to infertile men. The evaluation of human sperm morphology has been a difficult and inconsistent science, since it is based on individual sperm parameters. There are many approaches to human sperm morphology recognition available, and some of them have been applied to real-world tasks with great success (Swanson 1985). However, these evaluations of human sperm morphology are normally hard to establish and human knowledge is hard to incorporate into the precision levels. Human semen evaluation continues to be influenced by the subjectiveness of the investigator, and a lack of objective measurements for sperm morphology continues to be a problem.
Fuzzy data fusion is especially suited to provide methods to deal with and to fuse uncertain and ambiguous data arising in computer vision (Karen 1989). Fuzzy logic theory has already turned out to be a valuable tool for the solution of various single tasks in image understanding (Kopp 1996). These successful applications of fuzzy notions stimulate the idea that the integration of single vision modules using fuzzy methods will result in a powerful fuzzy data fusion system (Rusan 2002). In this paper, we present a general approach in which the processes and representations of 3-CCD camera image recognition for individual sperm are implemented using the theory of fuzzy data fusion.
2. Spermatozoa
Human morphological slides were prepared using smearing and staining techniques in order to create an imaging analysis based on fuzzy fusion in the Windows programming environment. The morphological abnormalities normally relate to the main regions of the spermatozoon (i.e. head, neck/mid-piece, and tail). Among the neck and mid-piece abnormalities there were bent or asymmetrical tail insertions, and thick, irregular or thin mid-pieces. Fuzzy data fusion was performed based on various sperm head, neck, and tail defects. Sperm with a small acrosomal area or double heads, and free tails, were not computed. A high frequency of coiled tails indicated that the sperm had been subjected to hypo-osmotic stress. Tail coiling relates to sperm aging. The frequency of coiled tails was computed in fuzzy data fusion.
3. Fuzzy fusion behavior
The most obvious illustration of fusion is the use of various sensors, typically to detect human sperm images. Fuzzy data fusion was used for recognition of the human sperm properties. The processes for the experiment are often categorized as low, intermediate and high level fusion, depending on the processing stage at which fusion takes place. It combines several sources of sperm morphology data to produce new data that is more informative and synthetic than the inputs. Typically, images presenting several spectral bands of the same scene are fused to produce a new image that ideally contains in a single channel all of the information available in the spectral smear. An operator (or an image processing algorithm) could then use this single image instead of the original images. This is important when the number of available spectral bands becomes so large that it is impossible to look at the images separately. This kind of fusion requires a precise pixel-level registration of the available images. Those features may come from several raw data sources or from the same
data sources. The objective is to find relevant features among the available features that might come from several feature extraction methods. The human sperm morphology information to be fused was captured from multi-smear images. Each individual sperm information source corresponds to different microscope sequences of the sperm sample. According to the head size, neck, and tail quantization levels in different slides of the selection path during the process of image capture, the data alignment transformed the multiple source data into a common coordinate system. The human sperm data was modeled by fuzzy modeling corresponding to multiple sources of feature fusion, fuzzy data fusion, and the fusion decision in fuzzy fusion behavior.
4. Fuzzy recognition
4.1. Human sperm as fuzzy integral and fuzzy sets
Definition 1: Let S be a fuzzy location with the elements of S denoted by s, in this case S = {s}. S is the set of human sperm volume and s is the coordinate of shape, s = (a, b, c).
Definition 2: Let the fuzzy set A be a fuzzy subset on S, A = {(s, μ_A(s)) | s ∈ S}. A is a fuzzy set for human sperm of S.
Definition 3: Let X be a universal set for the sperm image, where D = {d} is the fuzzy index of multi-smear. X provides H (head) when d = 1, N (neck) when d = 2, T (tail) when d = 3, and X is the set of signal intensities of the image. The element of X is denoted by x_d.
Definition 4: Let the fuzzy set AX be a fuzzy set of the important human sperm features, defined as AX = {((s, d), μ_AX(s, d)) | s ∈ S, x_d ∈ X}.
For the fuzzy integral measurement of human reproductive sperm, we need to observe the image by the following definitions. Let Z be a non-empty finite set.
Definition 5: An objective set function s: 2^Z → [0, 1] is a fuzzy measure if s(∅) = 0 and s(Z) = 1; if I, J ⊆ 2^Z and I ⊆ J then s(I) ≤ s(J); and if I_n ⊆ 2^Z for 1 ≤ n < ∞ and the sequence {I_n} is monotone in the sense of inclusion, then lim_{n→∞} s(I_n) = s(lim_{n→∞} I_n).
Definition 6: Let s be a fuzzy measure on Z. The fuzzy integral of a function h: Z → R with respect to s is defined as in Eq. (1):

∫ h ds = Σ_{i=1}^{n} [h(z_i) − h(z_{i−1})] s(I_i)    (1)

where the indices have been ordered so that 0 ≤ h(z_1) ≤ ... ≤ h(z_n) ≤ 1, I_i = {z_i, ..., z_n}, and h(z_0) = 0.
4.2. Fuzzy model
Fuzzy models were proposed for the fuzzy information. The membership functions of human sperm corresponding to H, N, and T, namely μ_H, μ_N, and μ_T, were defined as in Eq. (2) (Lu and Hu 2004):

μ_d(x; s, h) = 1/2 + 1/2 sin( π (x − (s + h)/2) / (h − s) )    (2)
The membership functions of the fuzzy models were used to analyze the feature of the parameter increase of the operator. Membership functions present the relation of the degrees of the sperm shape levels. They project the fuzzy features of the human sperm image onto the corresponding fuzzy sets of H, N, and T. Knowledge rules were used for the fuzzy models. A combination of the sperm features gives the membership function with high degree of H: H × N × T → [0, 1], μ_H(s) if and only if s ∈ H, where d = 1, 2, 3. The analysis of these features was based on a fuzzy intersection operator for fusing these features. Fuzzy processing starts with the fuzzification of the given image, yielding a fuzzy image defined as in Eq. (3):

F_H = {μ_H(H_dn); d = 0, 1, ..., D − 1, n = 0, 1, ..., N − 1, 0 ≤ μ_H ≤ 1}    (3)

where μ_H(H_dn) is the membership function that denotes the degree of brightness relative to the value of a pixel H_dn situated at the given position of the image. The membership functions were used to model the ambiguity in the image. Although there exist different types of membership functions, the S-type function has been widely applied to image problems. This function was defined as in Eq. (4):

S(x; a, b, c) = 0 for x ≤ a; 2((x − a)/(c − a))² for a < x ≤ b; 1 − 2((x − c)/(c − a))² for b < x ≤ c; 1 for x > c    (4)

where b = (a + c)/2 is the crossover point. The bandwidth of the membership function was defined as Δb = b − a = c − b. It determines the fuzziness in μ_H.
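As an illustration, here is a minimal sketch of the S-function just described, assuming the standard form with crossover b = (a + c)/2:

```python
def s_function(x, a, c):
    """S-type membership function (the form assumed for Eq. (4)); maps a
    gray level x to a membership degree in [0, 1], with crossover at b."""
    b = (a + c) / 2.0
    if x <= a:
        return 0.0
    if x <= b:
        return 2.0 * ((x - a) / (c - a)) ** 2
    if x <= c:
        return 1.0 - 2.0 * ((x - c) / (c - a)) ** 2
    return 1.0
```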
4.3. Fuzzy fuzzification
A fuzzy fuzzification was proposed for several fuzzy subsets by using fuzzy sets. A fuzzy relation of human sperm was applied to interior region connections within a subset and exterior regions between subsets. After fuzzification, a fuzzy measure was calculated to determine the average amount of ambiguity using a linear or quadratic index of fuzziness, a fuzzy entropy or an index of non-fuzziness measure. The index reveals the ambiguity in the given images (H)(N)(T) by measuring the shapes between the fuzzy properties μ_H, μ_N, μ_T and their nearest binary versions. Since the aim of the fuzzification is to determine the object from the properties of human sperm, the optimal images can be determined by minimizing the ambiguity in the given image. To determine the minimum ambiguity, the crossover point and the bandwidth of the membership functions vary along all shape levels of human sperm. We simulated the average percentage normal and percentage abnormal and selected the larger of those two groups for duplicate comparison. We computed the difference between the two assessments in that group. If the difference was smaller than the value obtained from the fuzzy membership function computation, the assessments could be accepted. If the difference was larger than the value, two new assessments should be simulated. For the soft computing, we needed to check whether the smear was difficult to read from the computation; the smear kept in reserve should be stained with fresh solutions and assessed.
4.4. Human sperm digital images
A fuzzy set S in a universe X is characterized by an X → [0, 1] mapping χ_S, which associates with every sperm shape element x in X a degree of membership χ_S(x) of x in the fuzzy set S. In the following, we denote the degree of membership by S(x). Note that a digital image can be identified with a fuzzy set that takes values on the grid points (x, y, z), with x, y, z ∈ N, 0 < x ≤ M, 0 < y ≤ N and 0 ≤ z ≤ W (M, N, W ∈ N). Therefore, for head, neck, and tail H, N, T, we had that H, N, T ∈ F(X), with X = {(x, y, z) | 0 ≤ x ≤ M, 0 ≤ y ≤ N, 0 ≤ z ≤ W}.
5. Results
The recognition of the 3-CCD camera image uses the results of recognition of human sperm to illustrate the rough shape and localization; its fuzzy dilation accounts for variability between the model and the image. This knowledge was combined with information extracted from the image itself, which leads to a successful recognition of normal and abnormal sperm in fuzzy data fusion, so that it can be compared with traditional microscope calculations. Each image from a set of human sperm properties including the shapes was computed, and then subjected to evaluation and further analysis. The images were labeled for object recognition. The image results demonstrate how these findings facilitate approaches on the 100 semen samples. The average probability of morphology recognition analysis was equal to 95% and the average probability of unknown parameters was equal to 4.5%. Fuzzy data fusion shows that we were successful in our recognition of human sperm into different shape groups. The percent increase for each of the categories, such as head, neck, and tail area, was quite similar, except for a larger percent increase in the mean head, neck, and tail area in their shape properties.
6. Conclusions
In analyzing the fuzzy fusion results, it was confirmed that fuzzy fusion is a viable solution for reducing and controlling both the variability and the subjectivity of the classifications of human sperm morphology data. This technique predicted the shape of human sperm with 95% accuracy on a test set of 1300 images. We have also shown the usefulness of fuzzy fusion morphology in this context. This work opens new perspectives for spatial reasoning under imprecision in image interpretation. The present data show very clearly that human sperm shape is preserved not only visually but also by objective morphometry after the digitizing image process.
References
1. R. Seracchioli, The diagnosis of man infertility by semen quality. Hum. Reprod. 10, 1039-1041 (1995).
2. R. J. Swanson, Male factor evaluation in in vitro fertilization: Norfolk experience. Fertil Steril; 44: 375-83 (1985).
3. G. H. Karen, Developmentally regulated mitochondrial fusion mediated by a conserved, novel, predicted GTPase. Cell, Vol. 90, 121-129, July 11 (1989).
4. H. Kopp-Borotschnig, A new concept for active fusion in image understanding applying fuzzy set theory. Fuzzy IEEE, New Orleans (1996).
5. S. Rusan, Fuzzy markovian segmentation in application of MRI, Computer Vision and Image Understanding 85, pp. 54-69 (2002).
6. Jia Lu and Yunxia Hu, Soft Computing for Spermatozoa Morphology. Proc. SCI, Orlando, U.S.A. (2004).
LINGUISTIC SUMMARIES OF IMAGE PATTERNS
HEMA NAIR
Faculty of Engineering and Technology, Multimedia University, Melaka 75450, Malaysia
In this paper, an approach that utilises fuzzy logic is presented in order to develop linguistic summaries of some patterns in remote-sensed images. Techniques such as clustering and genetic algorithms are used to mine images and optimise the linguistic summary to the most suitable one for each pattern/object in the database.
1. Introduction
This paper proposes an approach utilising fuzzy logic [1], [2] to describe some patterns in remote-sensed images. Clustering and genetic algorithms are used to develop the most suitable linguistic summary of each pattern/object stored in a table. This paper is organised as follows: section 2 describes the system architecture, section 3 describes the approach, section 4 discusses the implementation issues, and section 5 discusses the conclusions and future work.
2. System Architecture
The system architecture is shown in Figure 1. The input image is analysed and some feature descriptors are extracted by the image analysis component. These descriptors are thereafter stored in a relational table in the database. The knowledge base uses geographic facts to define feature descriptors using fuzzy sets. It interacts with a built-in library of linguistic labels, which also interacts with the summariser as it supplies the necessary labels to it. The summariser receives input from these components and performs a comparison between the actual feature descriptors of the image stored in the database and the feature definitions stored in the knowledge base. From among these summaries, the most suitable one is selected by interaction with the engine (genetic algorithm).
3. Approach
Area, Length, Location in image, and Additional Information are the attributes of the patterns/objects that are used to develop their linguistic summaries. Area, length and location (X, Y co-ordinates in image) are extracted by the image analysis component in Figure 1. Additional information is calculated using the k-means clustering technique. If Y = {y_1, y_2, ..., y_p}, then truth(y_i is F) = μ_F(y_i), i = 1, 2, ..., p.
Figure 1. System Architecture.
μ_F(y_i) is the degree of membership of y_i in the fuzzy set F. For each object y_i (island or area of land etc.), the degree of membership of its feature descriptor, such as area or length, in the corresponding fuzzy sets is calculated. Triangular fuzzy sets for area are large, considerably large, moderately large, fairly large, and small, and fuzzy sets for length are long, considerably long, relatively long, fairly long and short. The linguistic description is calculated as follows: T_j = m_1j ∧ m_2j ∧ m_3j ∧ ... ∧ m_nj, where m_ij is the matching degree [1] of the ith attribute in the jth tuple. m_ij ∈ [0, 1] is a measure of the degree of membership of the ith attribute value in a fuzzy set denoted by a fuzzy label. The logical AND (∧) of matching degrees is calculated as the minimum of the matching degrees [1].
T = Σ_{j=1}^{k} T_j    (1)

T in equation (1) is a numeric value that represents the truth of the overall summary of the k objects in the database. The GA evolves the most suitable linguistic summary for all the objects by maximising T; a sketch of this computation follows.
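The truth computation of Eq. (1) and the AND of matching degrees can be coded as below; the data layout (a list of attribute values per tuple and one membership function per attribute) is an assumption for illustration:

```python
def summary_truth(tuples, fuzzy_sets):
    """Truth of one candidate summary over k objects: T_j is the AND
    (minimum) of the attribute matching degrees of tuple j, and T sums
    the T_j, as in Eq. (1).

    fuzzy_sets: list of membership functions, one per attribute.
    tuples: list of attribute-value lists, one per object."""
    total = 0.0
    for tup in tuples:
        degrees = [fuzzy_sets[i](v) for i, v in enumerate(tup)]
        total += min(degrees)     # logical AND of matching degrees
    return total
```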
4. Implementation Issues
This section explains the genetic algorithm approach and then discusses the results from applying this approach to mining images.
4.1. GA Approach
A genetic algorithm emulates biological evolutionary theories as it attempts to solve optimisation problems [3]. The evaluation function for the linguistic summaries or descriptions of all objects in the table is f = max(T), where T is evaluated as shown in the previous section and f is the maximum fitness value of a particular set of linguistic summaries that have evolved over several generations of the GA.
4.2. Results
The fuzzy sets that quantify area or length are defined based on geographic facts such as: the largest continent is Asia with an area of 44579000 km², the largest freshwater lake is Lake Superior with an area of 82103 km², and the smallest continent is Australia/Oceania with an area of 7687000 km². Only one of the triangular fuzzy sets formulated is shown here due to space limitation. The fuzzy set for a considerably large expanse of water is defined in equation (2):

μ_considerably large expanse of water(x) = 1 − (55068.66 − x)/27034.33, for 28034.33 ≤ x ≤ 55068.66
                                         = 1 − (x − 55068.66)/27034.33, for 55068.66 < x ≤ 82103
                                         = 0, for x < 28034.33
                                         = 0, for x > 82103    (2)

An example SPOT Multispectral image with 20 m resolution to be analysed is shown in Figure 2. Table 1 shows a small sample data set of feature descriptors from some of the patterns in the image. Table 2 lists the data extracted from the image for clustering. Additionally, an object is classified using some rules that hold for SPOT 20 m resolution and similar images. For example, a uniform band ratio is checked for the water envelope surrounding an island and also for classifying any water body. This analysis uses only unsupervised classification algorithms such as k-means. In order to extract patterns such as urban areas, supervised classification is necessary, for which ground data is required for training.
Table 1. Feature descriptors from Figure 2. The Additional Information attribute denotes numbers as follows: 1 = Other Water Body, 3 = Land. Location indicates X, Y pixel co-ordinates of the centroid of the pattern/object.
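As a direct transcription of the fuzzy set in Eq. (2) above, the membership function can be coded as:

```python
def mu_considerably_large_water(x):
    """Triangular fuzzy set of Eq. (2): 'considerably large expanse of
    water', peaking at 55068.66 km^2 with half-width 27034.33 km^2."""
    if x < 28034.33 or x > 82103.0:
        return 0.0
    if x <= 55068.66:
        return 1.0 - (55068.66 - x) / 27034.33
    return 1.0 - (x - 55068.66) / 27034.33
```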
Figure 2. Image of an area in peninsular Malaysia on March 6, 1998. Geographic co-ordinates of the image are approximately 3°17'U-3°48'U latitude and 100°58'T-101°38'T longitude. Approximate scale 1:0.0003764.
The GA is run with the following input parameter values, obtained after several trial runs: number of bits in a chromosome string of the population = 10, generations per cycle = 12, population size = 200 strings, probability of cross-over = 0.53, probability of mutation = 0.001. After 96 generations, the linguistic summaries generated for the data in Table 1 are:
- A small area of land at the centre.
- A small expanse of water at the lower left.
- A small expanse of water at the centre.
- A small expanse of water at the lower right.
5. Conclusions and Future Work

A new approach to develop linguistic summaries of some image patterns has been presented. As future work, the development of a user-friendly tool with a graphical interface to ease the task of extracting and calculating feature descriptors could be considered.

Table 2. Data recorded from Figure 2 for clustering. The columns represent, from left to right: X-object, Y-object, X-envelope, Y-envelope, R-object, G-object, B-object, R-envelope, G-envelope, B-envelope.
References
1. J. Kacprzyk and A. Ziolkowski. Database queries with fuzzy linguistic quantifiers. In IEEE Transactions on Systems, Man and Cybernetics, (1986).
2. H. Nair. Developing linguistic summaries of patterns from mined images. In Proceedings of the International Conference on Advances in Pattern Recognition (2003).
3. R.E. Smith, D.E. Goldberg, and J.A. Earickson. SGA-C: A C-language implementation of a Simple Genetic Algorithm. TCGA Report No. 91002, (1994).
IMAGE RETRIEVAL USING LINGUISTIC EXPRESSIONS OF COLORS
A. AIT YOUNES, I. TRUCK, H. AKDAG AND Y. REMION
LERI, Université de Reims Champagne-Ardenne, rue des Crayères, BP 1035, 51687 Reims cedex 2
E-mail: [email protected]
This paper proposes a software for image retrieval using linguistic expressions of colors. The retrieval requires a classification of images according to their dominant colors. Pixels of the images are considered in HLS space and membership degrees to classes are then computed. The classes correspond to the linguistic expressions proposed in the software: classes for hues and classes for color qualifiers. These data are stored in a database to facilitate the user queries.
1. Introduction

Image retrieval is an important problem that is useful in many fields 3,7. In medical applications, it is important to retrieve images in order to help medical expert forecasts, for example. Another example lies in web content detection: Hammami et al. classify images by determining whether they contain a lot of skin texture or not in order to detect adult and sexual contents 4. In this article, we propose a classification of images according to their dominant color. Classes for hues and classes for qualifiers (e.g. dark) are built and each image is assigned to certain classes. This assignment is a function of the quantity of pixels that belong to the classes. The paper is organized as follows: section 2 is devoted to the problem of color representation, where fuzzy membership functions are used. In section 3 we focus on the profile determination for each new entry (image) in the database. Finally in section 4 we present the software developed and the associated database. Screen captures are also shown, while section 5 concludes this article.
2. Color representation with fuzzy subsets
RGB space is the space usually used to represent the color on a screen. A color is expressed through a combination of the three components Red,
Green and Blue. Although this space is easy to use, it is not appropriate for our problem because the modification of the hue, the lightness, etc. of the color is not trivial with these three dimensions (R, G and B). To facilitate the color modification we choose a space that allows us to characterize a color with only one dimension: its hue. Indeed hue is enough to recognize the color, except when the color is very pale or very somber. This space is called HLS (Hue, Lightness, Saturation): H is defined as an angle but can also be represented in the interval [0, 255] like the other components L and S. The difference between H and the other components is that its definition interval loops, which means that 0 and 256 are the same point. For this problem, we limit ourselves to the nine fundamental colors defined by the set I, representing a good sample of colors (dimension H):
I = {red, orange, yellow, green, cyan, blue, purple, magenta, pink}

This set corresponds to the seven colors of Newton to which we have added the colors pink and cyan. Of course, this choice is not restrictive; the set of colors can be modified as desired. Thus HLS space is convenient for our problem, as we have seen, but it has a drawback: it is a non-UCS (uniform color scale) space. Indeed there is a lack of uniformity, since our eyes do not perceive small variations of hue when the color is green (h ≈ 85) or blue (h ≈ 170), while they perceive them very well with orange (h ≈ 21), for example. To deal with non-uniformly distributed scales, authors such as Herrera and Martinez propose to use fuzzy linguistic hierarchies with more or fewer labels, depending on the desired granularity 5. Another approach, from Truck et al., is to represent the hues with trapezoidal or triangular fuzzy subsets thanks to color definitions from www.pourpre.com 10. This technique is more empirical but fits the human perception better, which is why we also use this approach. For each color of I they built a membership function varying from 0 to 1 (f_t with t ∈ I) 10. If this function is equal to 1, the corresponding color is a "pure color" (cf. figure 1). For each fundamental color, the associated interval is defined according to linguistic names of colors: for example, to construct f_yellow, we can use the color "mustard", whose hue is equal to 55 and whose membership to f_yellow is equal to 0.5. For some colors, the result gives a wide interval: for example, the colors "green" and "blue", which are represented by trapezoidal fuzzy subsets. For the construction of these functions, in this article we suppose that two functions representing two successive colors have their intersection
point value equal to 1/2. It means that when h corresponds to an intersection point it can be assigned to both colors with the same weight.
Figure 1. The dimension H (hue axis with the tick values 0, 21, 43, 85, 128, 170, 191, 213, 234, 255).
As usual we denote by (a, b, α, β) a trapezoidal fuzzy subset with [a, b] the kernel and [a − α, b + β] the support.
Now we can define the membership function of any color t:

∀t ∈ I,  f_t(h) = 1 if a ≤ h ≤ b; 0 if (h < a − α) ∨ (h > b + β); (h − (a − α))/α if (a − α < h < a); ((b + β) − h)/β if (b < h < b + β)

For example, for t = orange we have a triangular subset with (a = 21, b = 21, α = 21, β = 22):

f_orange(h) = h/21 if h < 21; (43 − h)/22 if 21 ≤ h < 43; 0 if h ≥ 43
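The general trapezoidal form above can be sketched in Python as follows; this minimal version ignores the circular wrap of the hue axis, which would matter for red:

```python
def trapezoid(h, a, b, alpha, beta):
    """Membership of hue h in the fuzzy subset (a, b, alpha, beta):
    kernel [a, b], support [a - alpha, b + beta]."""
    if a <= h <= b:
        return 1.0
    if h <= a - alpha or h >= b + beta:
        return 0.0
    if h < a:
        return (h - (a - alpha)) / alpha   # rising side
    return ((b + beta) - h) / beta         # falling side

# f_orange as a triangular case: (a = 21, b = 21, alpha = 21, beta = 22)
f_orange = lambda h: trapezoid(h, 21, 21, 21, 22)
print(f_orange(21))   # 1.0
print(f_orange(32))   # (43 - 32) / 22 = 0.5
```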
Moreover, if we want to complete the modeling, it is necessary to take into account the two other dimensions (L, S). A scale representing the colorimetric qualifiers is associated to each dimension. These two intervals are divided into three: the first subinterval corresponds to a low value, the second to an average value and the last to a strong value. This division gives for saturation S: "dull", "moderately dull" and "saturated"; and for lightness L: "gloomy", "heavy" and "pallid". These two scales are then aggregated to give nine qualifiers for colors 11, defined by the following set Q (cf. figure 2):
Q = {somber, dark, deep, gray, medium, bright, pale, light, luminous}

Figure 2. Fundamental color qualifiers (L and S axes with tick values 0, 85, 170, 255; labels somber, medium, bright, light appear along the scales).

Each element of the set Q is associated to a membership function
varying between 0 and 1 (denoted f_q with q ∈ Q). For these functions the intersection point value is also supposed to be equal to 1/2, and every function is represented through the set (a, b, c, d, α, β, γ, δ) (cf. figure 3). The membership function of any qualifier q combines a trapezoidal function on the S axis, with kernel [a, b] and support [a − α, b + β], and a trapezoidal function on the L axis, with kernel [c, d] and support [c − γ, d + δ], the degree at a point (l, s) being the smaller of the two:

∀q ∈ Q,  f_q(l, s) = min{f_q^S(s), f_q^L(l)}

where f_q^S is the trapezoidal membership function (a, b, α, β) over S and f_q^L is the trapezoidal membership function (c, d, γ, δ) over L. For example, for q = somber we have (a = 0, α = 0, b = 43, β = 84, c = 0, γ = 0, d = 43, δ = 84):

f_somber(l, s) = 1 if (s ≤ 43) ∧ (l ≤ 43); 0 if (s ≥ 127) ∨ (l ≥ 127); (127 − l)/84 if (43 < l < 127) ∧ (l > s); (127 − s)/84 if (43 < s < 127) ∧ (l ≤ s)
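Reusing the trapezoid helper from the previous sketch, the qualifier membership can be illustrated as below; the min-combination follows the definition above:

```python
def qualifier(l, s, params):
    """Membership of (lightness l, saturation s) in a qualifier class.
    params = (a, b, alpha, beta, c, d, gamma, delta): trapezoid
    (a, b, alpha, beta) on the S axis and (c, d, gamma, delta) on L."""
    a, b, alpha, beta, c, d, gamma, delta = params
    return min(trapezoid(s, a, b, alpha, beta),
               trapezoid(l, c, d, gamma, delta))

# 'somber': kernel [0, 43] and support [0, 127] on both axes
f_somber = lambda l, s: qualifier(l, s, (0, 43, 0, 84, 0, 43, 0, 84))
print(f_somber(43, 43))   # 1.0
print(f_somber(85, 43))   # (127 - 85) / 84 = 0.5, the L ramp dominates
```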
3. Determining image profiles

After the formal definitions of colors and qualifiers, the next step is to build the image profile. A profile is defined according to the image's membership to the various categories: the nine fundamental colors and the nine color qualifiers. For each pixel of the image we can determine the values taken by the various membership functions to the categories. For each category, the value obtained corresponds to the ratio between, on the one hand, the sum over all the pixels of the image of the membership function values to the category and, on the other hand, the number of pixels, which gives a quantity between 0 and 1. This quantity is the membership degree of an
image to the given class.

Figure 3. Dimensions L and S (axes with tick values 0, 43, 127, 212, 255).

The membership degree of an image to a certain class is defined as follows. Let I be an image and let P be the set representing the pixels of I. Each element p of the set P is defined by its color coordinates (h_p, l_p, s_p); p can be one pixel or a set of pixels. We can calculate the functions f_t(h_p) and f_q(l_p, s_p), for t ∈ I and q ∈ Q.
Let F_t and F_{t,q} be the following functions, representing the membership degree of I to the classes t and (t, q): F_t(I) is the average over all pixels p ∈ P of f_t(h_p), and F_{t,q}(I) is, analogously, the average over all pixels of the joint degree of membership of p to the hue class t and the qualifier class q.

Every image is defined by a profile of 90 elements (|I| + |I × Q| = 9 + 81). A profile can be presented as follows: [F_t(I), F_{t,q}(I)]. An image can be assigned to several classes among the 90 existing ones. There are 9 principal classes denoted C_t with t ∈ I, and 81 subclasses which express a refinement of the search: C_{t,q} with (t, q) ∈ I × Q.
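A minimal sketch of the profile computation, assuming the joint degree of a pixel in a class (t, q) is taken as the minimum of its hue and qualifier degrees (one plausible choice consistent with the definitions above); hue_funcs and qual_funcs are placeholder dictionaries of membership functions:

```python
def image_profile(hls_pixels, hue_funcs, qual_funcs):
    """Build the 90-element profile of an image.

    hls_pixels: iterable of (h, l, s) triples.
    hue_funcs:  dict {hue_name: f_t(h)}          (9 entries)
    qual_funcs: dict {qualifier_name: f_q(l, s)} (9 entries)
    Returns F_t per hue and F_{t,q} per (hue, qualifier), each the
    average of per-pixel degrees.
    """
    n = 0
    F_t = {t: 0.0 for t in hue_funcs}
    F_tq = {(t, q): 0.0 for t in hue_funcs for q in qual_funcs}
    for h, l, s in hls_pixels:
        n += 1
        for t, ft in hue_funcs.items():
            dt = ft(h)
            F_t[t] += dt
            for q, fq in qual_funcs.items():
                F_tq[(t, q)] += min(dt, fq(l, s))
    return ({t: v / n for t, v in F_t.items()},
            {tq: v / n for tq, v in F_tq.items()})
```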
Figure 4. Profile of an image.
As shown in figure 4, the classes can be represented through a tree with a father-son relationship: the classes C_t can be considered as fathers and the classes C_{t,q} as their sons. For example, the father of class C_{red,somber} is C_{red}.
Let us denote by F*(I) the maximum of the F_t(I) and by F_t*(I) the maximum of the F_{t,q}(I) over q. An image I is assigned to:
- the classes C_t if F_t(I) ≥ F*(I) − λ, with λ a tolerance threshold;
- the classes C_{t,q} if F_t(I) ≥ F*(I) − λ and F_{t,q}(I) ≥ F_t*(I) − λ.
An image is assigned to a subclass only if it is also assigned to its father class.

4. Presentation of the software

In the software, a database is used to store images with their profiles (cf. figure 5). That helps us to optimise the exploitation of this information. The software is divided into two sections: the first one corresponds to the treatment and the insertion of the images in the database, the second one to the exploitation of this database through requests with linguistic terms. In the first section, the profile of the new image is built and stored: to insert an image in the database we introduce a new record into the database after determining the values taken by the functions representing the membership degrees of the image to the various classes. In the second section, the database can be exploited according to two levels of precision. The first one corresponds to the nine fundamental colors (dimension H), the second one to the nine color qualifiers.
Figure 5. Database schema: tables Image (image ID, name, size, image), Hue (hue ID, hue name) and Qualifier (qualifier ID, qualifier name), linked through the association tables ImageHue (image ID, hue ID, F_t(I)) and ImageQualifier (image ID, hue ID, qualifier ID, F_{t,q}(I)), with 1..n cardinalities.

Figure 6. Query with only a color and query with a color and a qualifier.
Once the hue is selected, the user has the possibility to refine his request by specifying a color qualifier. For that purpose, he can choose one of those proposed in the list, or click in the corresponding zone in the image. For example, figure 6 shows images whose dominant color is "blue" and images whose dominant color is "luminous blue". Two other kinds of requests are handled: the first one allows us to retrieve B&W and gray-level images (in this case, H is not considered), and the second one allows us to retrieve images with more than one dominant color. One-color requests can be successively added (composed) to obtain a multi-color request.
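A sketch of how such requests could be answered from the stored profiles, assuming the class-assignment rule of section 3 with a hypothetical tolerance threshold lam:

```python
def query(profiles, hue, qualifier=None, lam=0.1):
    """Retrieve images whose dominant color matches the request.

    profiles: dict {image_id: (F_t, F_tq)} as built above.
    An image qualifies when its F_t is within lam of its best hue
    degree F*, and, if a qualifier is given, when F_{t,q} is within
    lam of the best qualifier degree for that hue.
    """
    hits = []
    for img, (F_t, F_tq) in profiles.items():
        best = max(F_t.values())
        if F_t[hue] < best - lam:
            continue
        if qualifier is not None:
            best_q = max(v for (t, q), v in F_tq.items() if t == hue)
            if F_tq[(hue, qualifier)] < best_q - lam:
                continue
        hits.append(img)
    return hits
```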
5. Conclusion
In this work we have developed an approach that permits to classify images according to their dominant color(s). We limit ourselves to nine fundamental colors and nine color qualifiers; these can be widened without modifying the approach, only a few modifications having to be performed in the software. As future work, this software will be adapted for medical applications: it will help medical forecasts and analysis, like tumor detection. Instead of working on the whole image, the software will consider zones of images. It will retrieve images which contain zones with a strongly dominant color (for example red), which can correspond to cancerous cells.
References
1. B. Bouchon-Meunier (1995). La Logique Floue et ses Applications, Addison-Wesley, 1995.
2. Y. Chen and J. Z. Wang (2002). A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, 1252-1267, 2002.
3. L. Foulloy (1990). Du contrôle symbolique des processus : démarche, outils, exemples, PhD Thesis, Université Paris XI, September 1990.
4. M. Hammami, L. Chen, D. Zighed, Q. Song (2002). Définition d'un modèle de peau et son utilisation pour la classification des images. In Proceedings of MediaNet 2002, 187-198, Sousse, Tunisie, June 2002.
5. F. Herrera, L. Martinez (2001). A model based on linguistic two-tuples for dealing with multigranularity hierarchical linguistic contexts in multiexpert decision-making. IEEE Transactions on Systems, Man and Cybernetics, Part B, 31(2), 227-234.
6. P. Hong, T. Qi and T. S. Huang (2000). Incorporate support vector machines to content-based image retrieval with relevance feedback. IEEE International Conference on Image Processing, Vancouver, Canada, 2000.
7. B. Le Saux (2003). Classification non exclusive et personnalisation par apprentissage : Application à la navigation dans les bases d'images, PhD Thesis, INRIA, France, July 2003.
8. J. Roire (2000). Les noms des couleurs. In Pour la science, Hors série, no. 27.
9. I. Truck (2002). Approches symbolique et floue des modificateurs linguistiques et leur lien avec l'agrégation. PhD Thesis, Université de Reims Champagne-Ardenne, France, December 2001.
10. I. Truck, H. Akdag, A. Borgi (2001). A Symbolic Approach for Colorimetric Alterations. In Proceedings of EUSFLAT 2001, 105-108, Leicester, England, September 2001.
11. I. Truck, H. Akdag, A. Borgi (2001). Using Fuzzy Modifiers in Colorimetry. In Proceedings of the 5th World Multiconference on Systemics, Cybernetics and Informatics, SCI 2001, 472-477, Orlando, Florida, USA, 2001.
A COLORING ALGORITHM FOR IMAGE CLASSIFICATION
D. GÓMEZ, J. MONTERO, J. YÁÑEZ AND C. POIDOMANI
Department of Statistics and O.R., Complutense University, Madrid, Spain
E-mail: [email protected]
In this paper we present a pixel coloring algorithm, to be considered as a tool in fuzzy classification. Such an algorithm is based upon a sequential application of a divisive binary procedure on a fuzzy graph associated to the image to be classified, taking into account surrounding pixels. Each color will suggest a possible class, if homogeneous, and the hierarchical structure of colors will allow gradation between classes.
1. Introduction
Classification in remote sensing images quite often suggests techniques based upon fuzzy models. This is mainly the case when there are no objects to be classified. Objects, at least in a standard sense, tend to present clear borders, and classification can be developed just based upon a boundary analysis and a previous knowledge of the shapes of the different objects under consideration. On the contrary, many classification problems about earth land use, for example, refer to classes showing gradation from one class to the next class. There are no clear boundaries, and each class defines a fuzzy set with no particular shape (see Bezdek and Harris 6). In fact, there is increasing research on Fuzzy Sets Theory applied to remote sensing classification problems (see, e.g., Foody 7). Many different approaches can be found in the remote sensing classification literature. In Amo et al. 3,5, for example, some of the authors proposed a classification model based upon a modified outranking model, basically taken from Pearman et al. 12. But the output information appeared to be difficult to manage by non-qualified decision makers. A main need was to develop fuzzy representation techniques. In particular, some kind of coloring tool was missing, allowing a consistent and informative picture
showing possible regions and gradation of membership to possible classes. In this paper we propose an unsupervised crisp coloring methodology, to be considered within a more elaborated fuzzy classification system, as defined in Amo et al. 1,4. The coloring procedure we present here is defined by means of a divisive crisp binary coloring process, which seems promising as a helpful tool in order to find out consistent regions and postulate possible fuzzy classes. In section 2 we introduce the basic pixels fuzzy graph associated to an image, and in section 3 we present a crude coloring algorithm. A final comments section shows some particular improvements actually under development (see Gómez et al. 9).
2. The image and its associated pixels fuzzy graph
Let us consider an image as a bidimensional map of pixels, each one of them being characterized by a fixed number of measurable attributes. These attributes can be, for example, the values of the three bands of the visible spectrum (red, green and blue), the whole family of spectrum band intensities, or any other family of physical measures. Our main objective is to determine a family of pixels suggesting to define a class. This information should be taken into account in a later supervised analysis where additional information may exist. The image I under consideration is therefore divided into pixels (information units), and the whole information is summarized as a vector of b measures for each pixel: I = {(x¹_{i,j}, ..., x^b_{i,j}) / (i,j) ∈ P}, where P represents the associated set of pixels in which the image is divided, P = {(i,j) / i ∈ {1,...,r}, j ∈ {1,...,s}}, meaning that we are dealing with an image of size r × s, each pixel being characterized by b numerical measures. Given such an image I, a standard crisp classification problem pursues a partition in crisp regions, each one being a subset of pixels, to be considered a candidate for a new class, in case such a region is homogeneous enough. In this way, a crisp classification approach looks for a family of subsets of pixels {A_1, ..., A_c} such that P = ∪_{l=1}^{c} A_l, but A_i ∩ A_j = ∅, ∀i ≠ j. Our approach in this paper pursues to obtain an approximate gradation by splitting each subset under consideration into two crisp classes every time. The key tool will be a distance between the measured properties of pixels, d : P × P → [0, ∞), which at a first stage can be based upon the Euclidean distance in R^b. Of course, any other ad hoc distance can be taken into account in future research. Obviously, the classification process will
be strongly dependent on the selection of the appropriate distance, to be chosen taking into account all features of the image under consideration, together with our particular classification objectives. Hence, our set of pixels P is being modeled as a planar fuzzy graph (see, e.g., Kóczy 10, Mordeson and Nair 8, or Rosenfeld 13) whose nodes are pixels, described by means of their Cartesian coordinates i ∈ {1,...,r} and j ∈ {1,...,s}. The graph will be planar in the sense that two pixels (i,j) and (i′,j′) cannot be linked if |i − i′| + |j − j′| > 1. Consequently, two pixels can be adjacent only if they share one coordinate, the other one being contiguous. Let G̃ = (V, Ẽ) be a fuzzy graph, where V is the node set and the fuzzy edge set Ẽ is characterized by the matrix μ = (μ_{ij})_{i,j∈V}, where μ_{ij} = μ_Ẽ({i,j}), ∀i,j ∈ V, and μ_Ẽ : V × V → M is the associated membership function. Each element μ_{ij} ∈ M represents the intensity level of the edge {i,j} for any i,j ∈ V. The set M is linearly ordered (μ_{i,j} ≼ μ_{i′,j′} means that the intensity level of edge {i,j} is lower than the intensity level of edge {i′,j′}). Hence, the set M allows the literal graduation of the edge sets; for example, if M = {n, l, h} the edges can be graduated as null (n), low (l) or high (h). We can then denote by G̃(I) = (P, Ẽ) the graph associated to our image I, where M = [0, ∞) is the domain of the distance function d:
Definition 2.1. Given the image I and a distance d between measured properties of pixels, the pixels fuzzy graph is defined as the pair G̃(I) = (P, Ẽ).
Notice that our pixels fuzzy graph G̃(I) can also be characterized by the set P plus two r × s matrices, D¹ and D², where D¹_{i,j} = d((i,j), (i+1,j)), ∀(i,j) ∈ {1,...,r−1} × {1,...,s}, and D²_{i,j} = d((i,j), (i,j+1)), ∀(i,j) ∈ {1,...,r} × {1,...,s−1}. Since our coloring procedure will be based upon this alternative representation, from now on we shall denote our pixels fuzzy graph G̃(I) by (r, s, D¹, D²). The key coloring algorithm proposed in the next section will take advantage of the above alternative representation, which shows the relation between adjacent pixels in the pixels fuzzy graph G̃(I).
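A minimal sketch of building the (r, s, D¹, D²) representation with the Euclidean distance suggested above:

```python
import numpy as np

def pixel_distance_matrices(img):
    """Characterize the pixels fuzzy graph by (r, s, D1, D2).

    img: array of shape (r, s, b) with b measures per pixel.
    D1[i, j] = distance between pixels (i, j) and (i+1, j);
    D2[i, j] = distance between pixels (i, j) and (i, j+1).
    """
    D1 = np.linalg.norm(img[1:, :] - img[:-1, :], axis=2)   # (r-1, s)
    D2 = np.linalg.norm(img[:, 1:] - img[:, :-1], axis=2)   # (r, s-1)
    return img.shape[0], img.shape[1], D1, D2
```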
3. A crude coloring algorithm

A c-coloring of a graph G = (V, E) (see, e.g., Pardalos et al. 11) is a mapping C : V → {1,...,c} verifying C(v) ≠ C(v′) if {v, v′} ∈ E. Any c-coloring induces a crisp classification of the node set V, each class being associated to one color: V_C(k) = {v ∈ V / C(v) = k}, k ∈ {1,...,c}. Our objective is to obtain a classification of pixels through a c-coloring C of the pixels fuzzy graph G̃(I): the pixel (i,j) ∈ P will be classified as k ∈ {1,...,c} if its color is C(i,j) = k. In order to color a fuzzy graph, we consider G_α, the crisp graph defined by the α-cut edge set E_α = {{e, e′} / μ_{e,e′} ≥ α}. The values of this parameter α will be selected in such a way that a successive binary coloring process can be applied to some fuzzy subgraphs of G̃(I). The first binary coloring analyzes the pixel set P, classifying each pixel as 0 or 1. The second binary coloring is applied separately to the subgraph generated by those pixels colored as 0, to obtain the classes 00 and 01, and to the subgraph generated by those pixels colored as 1, to obtain the classes 10 and 11. This hierarchical process of binary coloring is repeated in the crude coloring process. In this way, a c-coloring C will be defined on G̃(I): if C(i,j) = k, with k = 6 for instance, then the binary representation of k − 1 = 5 is 101, i.e. the pixel (i,j) will be binary colored three times (1, 0 and 1, respectively).
3.1. The basic binary coloring procedure
A natural way of introducing the basic binary coloring procedure is to classify two adjacent pixels as 0 and 1 if and only if the distance between them is greater than or equal to a prescribed threshold α. Notice that, in this way, only adjacent pixels are classified as distinct (if the distance between them is high), while a standard approach classifies two arbitrary pixels in the same class if that distance is low (no matter whether they are adjacent or not). Formally, and in order to define the first binary coloring procedure, given a value α, let G_α denote the α-cut of the fuzzy graph G̃(I): G_α = (P, E_α), where E_α = {{p, p′} / p, p′ adjacent, d(p, p′) ≥ α}. The set E_α is thus the set of all pairs of adjacent pixels with a distance d greater than or equal to α. Let col : P → {0, 1} be a binary coloring of G_α. The first binary coloring can be obtained by assigning an arbitrary color ("0" or "1") to a certain
pixel and fixing an order in which pixels will be colored. That first pixel to be colored could be, for example, the pixel (1,1) in the top left corner of the image; pixels are then colored from left to right and top to bottom depending on a fixed threshold α. In general, given a colored pixel (i,j), the adjacent pixels (i+1,j) and (i,j+1) will be subsequently colored. Since pixel (i+1,j+1) can be alternatively colored either from pixel (i+1,j) or from pixel (i,j+1), a natural constraint is that both colors must be the same; otherwise, the coloring will be denoted as inconsistent.
Definition 3.1. Given a pixel set P, a square is a subset of four pixels sq(i,j) = {(i,j); (i+1,j); (i,j+1); (i+1,j+1)}, with i ∈ {1,...,r−1} and j ∈ {1,...,s−1}. We shall then denote by PS the set of all squares, PS = {sq(i,j) / i ∈ {1,...,r−1}, j ∈ {1,...,s−1}}.
Definition 3.2. Given a pixels fuzzy graph (r, s, D¹, D²), a square sq(i,j) ∈ PS is consistent at level α if, given an arbitrary color col(i,j), the above binary coloring procedure assigns the same color to pixel (i+1,j+1) no matter whether it is done from pixel (i,j+1) or from pixel (i+1,j). Otherwise, the pixel square is inconsistent.
Consequently, the above binary coloring of the pixels fuzzy graph depends on the chosen threshold value α, and we have two extreme cases: ᾱ = max_{(i,j)∈P}{d¹_{i,j}, d²_{i,j}} (if we fix a threshold α > ᾱ, then the whole picture is considered as a unique class, col(i,j) = col(1,1) ∀(i,j) ∈ P); and α̲ = min_{(i,j)∈P}{d¹_{i,j}, d²_{i,j}} (in case α < α̲, the picture looks like a chess board, all adjacent pixels being alternatively classified as "0" and "1"). Only the interval [α̲, ᾱ] should be properly considered. Indeed, determining an appropriate intermediate α level is not a trivial task. But once a level α is given, the inconsistent squares can be detected with the binary function
inconsis_α(i, j, D¹, D²) = 1 if sq(i,j) is inconsistent at level α, and 0 otherwise,

where each value inconsis_α(i, j, D¹, D²) depends on the square values d¹_{i,j}, d¹_{i,j+1}, d²_{i,j} and d²_{i+1,j}. Pseudocode for the computations relative to listing consistent and inconsistent squares, inconsis(i, j, α, D¹, D²), has been developed by the authors in terms of four additional 0-1 variables (see Gómez et al. 9).
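Assuming a colour flip occurs across an edge whose distance is at least α (the rule of section 3.1), the consistency test for one square can be sketched as:

```python
def inconsistent(i, j, alpha, D1, D2):
    """inconsis_alpha(i, j, D1, D2): 1 if square sq(i, j) is
    inconsistent at level alpha, else 0. The square is consistent
    when the two colouring paths to pixel (i+1, j+1) agree."""
    flip = lambda d: bool(d >= alpha)
    # (i, j) -> (i, j+1) -> (i+1, j+1)
    via_right = flip(D2[i, j]) ^ flip(D1[i, j + 1])
    # (i, j) -> (i+1, j) -> (i+1, j+1)
    via_down = flip(D1[i, j]) ^ flip(D2[i + 1, j])
    return int(via_right != via_down)
```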
Definition 3.3. Given a value α, the pixels fuzzy graph (r, s, D¹, D²) is consistent at level α if all squares sq(i,j) ∈ PS are consistent at level α.

Definition 3.4. Given a pixels fuzzy graph (r, s, D¹, D²), its consistency level, denoted as α*, is the maximum value α ∈ [α̲, ᾱ] for which the fuzzy graph is consistent.
Existence of such a consistency level α* is always assured, at least while our image contains a finite number of pixels. If some inconsistency is detected for a given α level, a decreasing procedure can be introduced in order to find a lower level α* assuring consistency. Such a procedure will be initialized with α* = ᾱ, and then we search among the inconsistent squares sq(i,j) by means of a new function newalpha (see Gómez et al. 9). We can therefore look for a value α* assuring consistency. Pixels are being classified either into a class "0" or a class "1", and in the next step we proceed to get a more precise color for both classes (class "0" will switch either into "00" or "01"). This will be done by alternatively activating only one of the classes already colored in a previous stage. Analogously, such a binary coloring process is applied in subsequent stages to those activated pixels under consideration (a subset of pixels P′ ⊂ P). This subset of pixels P′ getting a more precise color at each stage can also be characterized by a matrix act such that act(i,j) = 1, ∀(i,j) ∈ P′, and act(i,j) = 0, ∀(i,j) ∉ P′. We can compute the interval [α̲, ᾱ] for the activated pixels by means of a procedure initalpha(r, s, D¹, D², act). It may be the case that two adjacent pixels are not activated, and therefore the process should stop. This situation can be easily detected in the associated pseudocode initalpha, where the lowest distance between activated pixels α̲ is initialized as a very big value (and the greatest distance between activated pixels is initialized as 0). Again, notice that a given square can be consistent for a value α but inconsistent for another value α′ < α. Hence, a decreasing procedure must be repeated for the overall set PS until we find a new level α* assuring that all squares are made consistent in the new coloring environment. Then, a function called consislevel is the core of our algorithm: it will iteratively compute the consistency level α* for the family of pixels currently activated (the initializing value will be ᾱ, which is obtained from the procedure initalpha). The input arguments of consislevel are the pixels fuzzy graph (r, s, D¹, D²) and the r × s matrix act. The interval [α̲, ᾱ] is computed after the procedure initalpha is called, α* being the returned value. The associated consislevel(r, s, D¹, D², act) pseudocode computes
the consistency level α*(act) for a given subset P′ ⊂ P of activated pixels. Following the standard order in the activated pixels (at every stage), the level α*(act) assures a valid binary coloring procedure col. In order to perform these computations, a procedure bincol_α(r, s, D¹, D², act, col) will compute the binary coloring of the activated pixels at level α, the first call to this procedure taking as initialization col(i,j) = 0, ∀(i,j) ∈ P (see Gómez et al. 9).
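A sketch of the binary coloring of a whole (consistent) image at a level α, following the left-to-right, top-down order described above:

```python
import numpy as np

def binary_coloring(r, s, alpha, D1, D2):
    """bincol_alpha: colour all pixels 0/1, starting from
    col[0, 0] = 0 and flipping across edges with distance >= alpha
    (assumes the level alpha is consistent)."""
    col = np.zeros((r, s), dtype=int)
    for i in range(r):
        for j in range(s):
            if i == 0 and j == 0:
                continue
            if j > 0:      # colour from the pixel on the left
                col[i, j] = col[i, j - 1] ^ int(D2[i, j - 1] >= alpha)
            else:          # first column: colour from the pixel above
                col[i, j] = col[i - 1, j] ^ int(D1[i - 1, j] >= alpha)
    return col
```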
4. Final comments

The final objective of the algorithm we propose in this paper is to show the decision maker several possible pictures of the image, each one obtained by means of an automatic coloring procedure of each pixel based upon a particular distance. Such a coloring procedure takes into account the behavior of each pixel with respect to its surrounding pixels, and each color will suggest a possible class. Our coloring process is based upon a basic binary procedure, which is applied again and again, leading to a hierarchical structure of colors (i.e., possible classes). This basic binary procedure evaluates the distance between the measurable descriptions of adjacent pixels, assigning a color depending on whether such a distance is lower or higher than a previously chosen threshold. Each colored picture can be analyzed by decision makers in a posterior classification procedure: certain homogeneous regions can be identified, and a subsequent comparison may lead to a fuzzy classification, if we are able to evaluate the degree of concordance of each pixel to each one of those identified regions (see Amo et al. 4). Due to space limitations, pseudocodes have not been included in this paper, but they can be obtained from the authors upon request, together with additional details (see Gómez et al. 9). Of course, the classification process induced by the previous binary coloring can be refined, and an appropriate relaxed coloring algorithm should be tried in order to bypass the computational inefficiency of the above crude coloring algorithm (see Gómez et al. 9 for details). Once our basic binary coloring process has been successively applied t times, we shall be able to distinguish 2^t classes. Our complete coloring process is therefore equivalent to a hierarchical classification procedure, obtaining as output a set of nested clusters, to be properly analyzed.
Acknowledgments
This research has been partially supported by CICYT grant BFM2002-0281, Spain.
References
1. A. Amo, D. Gómez, J. Montero and G. Biging: Relevance and redundancy in fuzzy classification systems. Mathware and Soft Computing 8, 203-216 (2001).
2. A. Amo, J. Montero and V. Cutello: On the principles of fuzzy classification. Proc. N.A.F.I.P.S. Conference, 675-679 (1999).
3. A. Amo, J. Montero and G. Biging: Classifying pixels by means of fuzzy relations. Int. J. General Systems 29, 605-621 (2000).
4. A. Amo, J. Montero, G. Biging and V. Cutello: Fuzzy classification systems. European Journal of Operational Research (to appear).
5. A. Amo, J. Montero, A. Fernández, M. López, J. Tordesillas and G. Biging: Spectral fuzzy classification: an application. IEEE Trans. Syst. Man and Cyb. (C) 32, 42-48 (2002).
6. J.C. Bezdek and J.D. Harris: Fuzzy partitions and relations: an axiomatic basis for clustering. Fuzzy Sets and Systems 1, 111-127 (1978).
7. G.M. Foody: The continuum of classification fuzziness in thematic mapping. Photogrammetric Engineering and Remote Sensing 65, 443-451 (1999).
8. J.N. Mordeson and S. Nair: Fuzzy Graphs and Fuzzy Hypergraphs (Physica-Verlag, Heidelberg, 2000).
9. D. Gómez, J. Montero, J. Yáñez and C. Poidomani: A fuzzy graph coloring algorithm for image classification. Technical Report (Dept. Statistics and O.R., Complutense University, Madrid, Spain).
10. L. Kóczy: Fuzzy graphs in the evaluation and optimization of networks. Fuzzy Sets and Systems 46: 307-319 (1992).
11. P.M. Pardalos, T. Mavridou and J. Xue: The Graph Coloring Problem: A Bibliographic Survey. In: D.Z. Du and P.M. Pardalos (Eds.): Handbook of Combinatorial Optimization, vol. 2 (Kluwer Academic Publishers, Boston, 1998); 331-395.
12. J. Montero, A. Pearman and J. Tejada: Fuzzy multicriteria decision support for budget allocation in the transport sector. TOP 3, 47-68 (1995).
13. A. Rosenfeld: Fuzzy graphs. In: L.A. Zadeh, K.S. Fu and M. Shimura (Eds.): Fuzzy sets and their applications to cognitive and decision processes (Academic Press, New York, 1975); 77-95.
FUZZY MODELS TO DEAL WITH HETEROGENEOUS INFORMATION IN DECISION MAKING PROBLEMS IN ENGINEERING PROCESSES*
L. MARTÍNEZ
Dept. of Computer Science, University of Jaén, 23071 - Jaén, Spain
e-mail: [email protected]
J. LIU
Manchester School of Management, UMIST, PO Box 88, Manchester, UK, M60 1QD
e-mail: [email protected]
D. RUAN
Belgian Nuclear Research Centre (SCK·CEN), Boeretang 200, 2400 Mol, Belgium
e-mail: [email protected]
Before implementing an engineering system, different proposals are studied in the design process to evaluate and rank them. In this evaluation process several experts assess different aspects and criteria according to their knowledge of and preference for them. These criteria may have a different nature (quantitative, qualitative), and the experts may belong to different areas and have different knowledge of each criterion, so the assessments used to express the value of each criterion may be given with different types of information (numerical, linguistic, interval-valued). In such a case, to select the best proposal we must deal with this heterogeneous information to evaluate and rank the different proposals. In this contribution we show different fuzzy approaches for dealing with heterogeneous information.
*This work has been partially supported by the research project TIC 2002-03348 and by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant no. GR/R30624.

1. Introduction

In the design of traditional engineering systems the main objective for selecting a design option is to minimize cost. In recent years, however, the design selection has increased in complexity due to the need to take into account aspects or criteria such as safety, cost and technical performance simultaneously. In the future, all solicitations involving source selection should be structured using safety, cost and technical performance considerations 9. Also, the decision to implement a design in an engineering system will depend on whether the design can satisfy technical and economical constraints. Therefore, Multi-Criteria Decision Making (MCDM) techniques 2,10 could be applied for ranking the different design options. In these ME-MCDM (multi-expert MCDM) problems the preferences provided by the experts for the different criteria may be expressed with different types of information depending on the knowledge of the experts and on the nature of the criteria (quantitative and qualitative). When the experts do not have precise knowledge about the criteria, probability theory can be useful to deal with vague information, but it is not too difficult to find many aspects of uncertainty that do not have a probabilistic character, since they are related to imprecision and vagueness of meanings. In addition, qualitative aspects are difficult to assess by means of precise numbers. To rank engineering designs using ME-MCDM problems we deal with criteria such as safety, cost and technical performance. In these problems intrinsically vague information appears, which can be assessed by means of numerical information (probabilistic) or interval values, and in those cases where the nature of the criterion is qualitative the use of linguistic information 12 is common and suitable. Therefore, it is not a seldom situation to deal with numerical, interval-valued and linguistic information in the evaluation process of engineering designs 8,11. We shall call this type of information heterogeneous information. The decision model to rank the different designs assessed by means of heterogeneous information, taking into account the criteria of safety, cost and technical performance, will use a framework such as the following one 8,11:
- Safety assessments will be synthesized for each design. Cost and technical assessments will be provided by the experts.
- These assessments will be the input information for a ME-MCDM problem that we shall solve to rank the different designs.
In this ME-MCDM problem the assessments for the criteria will be combined to obtain a degree of suitability of each design option. The main difficulty in solving this problem is that the values used to assess the criteria are expressed in different utility spaces (heterogeneous information), and we shall need to unify the different utility spaces to combine the input information in order to obtain the degree of suitability of each design option. In this contribution we shall show an approach to unify this
heterogeneous information dealing with fuzzy sets, and afterwards how this approach can be improved using the linguistic 2-tuple model 4. This contribution is structured as follows. In section 2 we show how to unify heterogeneous information dealing with fuzzy sets. In section 3 we review the linguistic 2-tuple model and its application to deal with heterogeneous information. Finally some conclusions are pointed out.

2. Using fuzzy sets to deal with Heterogeneous Information

We must keep in mind that we are dealing with heterogeneous contexts composed of numerical, interval-valued and linguistic information. Our aim is to rank the different proposals characterized with this type of information, so we need to unify the heterogeneous information into a common utility space to operate on it easily. Here we show how to unify numerical, interval-valued and linguistic information into a common utility space consisting of fuzzy sets on a linguistic term set S_T. The common utility space S_T may be chosen depending on the specific problem, according to the conditions shown in 6. Afterwards, each numerical, interval-valued and linguistic evaluation is transformed into a fuzzy set on S_T, F(S_T), using the following transformation functions:

(1) Transforming numerical values v ∈ [0, 1] into F(S_T):

τ : [0, 1] → F(S_T), τ(v) = {(s_0, γ_0), ..., (s_g, γ_g)}, s_i ∈ S_T and γ_i ∈ [0, 1]
Remark: We consider that the membership functions μ_{s_i}(·) of the linguistic labels s_i ∈ S_T are represented by a parametric function (a_i, b_i, d_i, c_i), γ_i being the degree of membership of the number in the linguistic terms of S_T: γ_i = μ_{s_i}(v).

(2) Transforming linguistic terms s_j ∈ S into F(S_T):

τ_{SS_T} : S → F(S_T), τ_{SS_T}(s_j) = {(c_k, γ_k^j) / k ∈ {0, ..., g}}, s_j ∈ S,

γ_k^j = max_y min{μ_{s_j}(y), μ_{c_k}(y)}

where μ_{s_j}(·) and μ_{c_k}(·) are the membership functions of the fuzzy sets associated with the terms s_j and c_k, respectively.
(3) Transforming interval values I = [i̲, ī] in [0, 1] into F(S_T). We assume that the interval value has a representation inspired by the membership function of fuzzy sets 7:

μ_I(v) = 0 if v < i̲; 1 if i̲ ≤ v ≤ ī; 0 if v > ī.

The transformation function is:

τ_{IS_T} : I → F(S_T), τ_{IS_T}(I) = {(c_k, γ_k^I) / k ∈ {0, ..., g}},

γ_k^I = max_y min{μ_I(y), μ_{c_k}(y)}

where μ_I(·) is the membership function associated with the interval value I. At this moment all the input information (heterogeneous information) is expressed in a common utility space, and we can operate with this information easily to obtain a ranking of the alternatives. This method has been applied successfully in the process of safety synthesis in 8,11.
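A minimal numeric sketch of these transformation functions, assuming a term set S_T of five uniformly spread triangular labels (the term shapes are an assumption for the example):

```python
import numpy as np

G = 4                                    # S_T = {s_0, ..., s_4}
Y = np.linspace(0.0, 1.0, 1001)          # discretised [0, 1] domain

def term(k, y):
    """Triangular membership of the k-th term of S_T."""
    a, b, c = (k - 1) / G, k / G, (k + 1) / G
    return np.clip(np.minimum((y - a) / (b - a), (c - y) / (c - b)), 0.0, 1.0)

def unify_number(v):
    """tau: crisp v in [0, 1] -> degrees (gamma_0, ..., gamma_g)."""
    return [float(term(k, np.float64(v))) for k in range(G + 1)]

def unify_interval(lo, hi):
    """tau_I: interval [lo, hi] -> fuzzy set on S_T via max-min."""
    ind = ((Y >= lo) & (Y <= hi)).astype(float)
    return [float(np.max(np.minimum(ind, term(k, Y)))) for k in range(G + 1)]

print(unify_number(0.6))         # gamma_2 = 0.6, gamma_3 = 0.4
print(unify_interval(0.4, 0.7))
```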
3. Using 2-tuples to deal with heterogeneous information

The use of fuzzy sets allows us to unify the heterogeneous information, but the results used to rank the different proposals will be fuzzy sets, which are not straightforward to order and not easy to understand for all the experts. The use of the linguistic 2-tuple model, however, allows us to order the different proposals directly, and the results will be easily understandable by all the experts. We now briefly review the linguistic 2-tuple model and show how to convert the fuzzy sets obtained in section 2 into linguistic 2-tuples.
3.1. The 2-Tuple Fuzzy Linguistic Representation Model

The 2-tuple fuzzy linguistic representation model, presented in 4, will be used in this contribution to unify the heterogeneous information. This model is based on symbolic methods and takes as the basis of its representation the concept of Symbolic Translation.
Definition 1. The Symbolic Translation of a linguistic term s_i ∈ S = {s_0, ..., s_g} is a numerical value assessed in [−0.5, 0.5) that supports the "difference of information" between an amount of information β ∈ [0, g] and the closest value in {0, ..., g} that indicates the index of the closest linguistic term s_i ∈ S, [0, g] being the interval of granularity of S.
From this concept the linguistic information is represented by means of 2-tuples (r_i, α_i), r_i ∈ S and α_i ∈ [−0.5, 0.5). This model defines a set of functions between linguistic 2-tuples and numerical values.
Definition 2. Let S = {s_0, ..., s_g} be a linguistic term set and β ∈ [0, g] a value supporting the result of a symbolic aggregation operation; then the 2-tuple that expresses the equivalent information to β is obtained with the following function:

Δ : [0, g] → S × [−0.5, 0.5), Δ(β) = (s_i, α), with i = round(β), α = β − i, α ∈ [−0.5, 0.5)

where round(·) is the usual rounding operation, s_i has the closest index label to β, and α is the value of the symbolic translation.
Proposition 1. Let S = {s_0, ..., s_g} be a linguistic term set and (s_i, α) be a linguistic 2-tuple. There is always a Δ⁻¹ function such that, from a 2-tuple, it returns its equivalent numerical value β ∈ [0, g] in the interval of granularity of S.

Proof. It is trivial; we consider the function:

Δ⁻¹ : S × [−0.5, 0.5) → [0, g], Δ⁻¹(s_i, α) = i + α = β
A linguistic computational model for 2-tuples was introduced in 5.

3.2. Transforming fuzzy sets in S_T into linguistic 2-tuples

In section 2 the heterogeneous information was unified by means of fuzzy sets on the common utility space S_T; now we shall transform them into linguistic 2-tuples in S_T. This transformation is carried out using the function χ and the Δ function (Def. 2):

χ : F(S_T) → [0, g], χ(τ(·)) = χ({(s_j, γ_j), j = 0, ..., g}) = Σ_{j=0}^{g} j γ_j / Σ_{j=0}^{g} γ_j = β

β is a numerical value in the granularity interval of S_T, i.e., S_T = {s_0, ..., s_g}, β ∈ [0, g]. Then, to obtain the linguistic 2-tuple from β, we use the Δ function presented in Definition 2: Δ(β) = (s_i, α). Now all the input information is expressed in the common utility space S_T by means of linguistic 2-tuples, so we can use all the linguistic 2-tuple operators to obtain the results we are looking for. This model has been used to deal with heterogeneous information in evaluation and decision processes in 3,6.
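A compact sketch of Δ, Δ⁻¹ and χ; the five-label term set is a made-up example:

```python
def delta(beta, S):
    """Delta: beta in [0, g] -> 2-tuple (s_i, alpha)."""
    i = round(beta)
    return S[i], beta - i            # alpha in [-0.5, 0.5)

def delta_inv(i, alpha):
    """Delta^{-1}: 2-tuple (s_i, alpha) -> beta = i + alpha."""
    return i + alpha

def chi(gammas):
    """chi: fuzzy set {(s_j, gamma_j)} on S_T -> beta in [0, g]."""
    return sum(j * g for j, g in enumerate(gammas)) / sum(gammas)

S = ["none", "low", "medium", "high", "perfect"]   # g = 4
beta = chi([0.0, 0.0, 0.6, 0.4, 0.0])              # = 2.4
print(delta(beta, S))                              # ('medium', 0.4)
```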
4. Conclusions
In engineering we can face problems involving decision processes dealing with information assessed in different utility spaces. In this contribution we have shown two fuzzy approaches to deal easily with heterogeneous information composed of numerical, interval-valued and linguistic values. In the future we shall apply these approaches to the whole decision process in the engineering problem.

References
1. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Kluwer Academic, New York, 1980.
2. L. Gin-Shuh and J. Wang Mao-Jiun. Personnel selection using fuzzy MCDM algorithm. European Journal of Operational Research, 78(1):22-33, 1994.
3. F. Herrera, E. Herrera-Viedma, L. Martínez, and P.J. Sánchez. A linguistic decision process for evaluating the installation of an ERP system. In 9th International Conference on Fuzzy Theory and Technology, Cary (North Carolina), USA, 2003.
4. F. Herrera and L. Martínez. A 2-tuple fuzzy linguistic representation model for computing with words. IEEE Transactions on Fuzzy Systems, 8(6):746-752, 2000.
5. F. Herrera and L. Martínez. The 2-tuple linguistic computational model. Advantages of its linguistic description, accuracy and consistency. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(Suppl.):33-49, 2001.
6. F. Herrera, L. Martínez, and P.J. Sánchez. Managing non-homogeneous information in group decision making. European Journal of Operational Research, to appear, 2004.
7. D. Kuchta. Fuzzy capital budgeting. Fuzzy Sets and Systems, 111:367-385, 2000.
8. J. Liu, J.B. Yang, J. Wang, H.S. Sii, and Y.W. Wang. Fuzzy rule-based evidential reasoning approach for safety analysis. International Journal of General Systems, in press, 2004.
9. H.S. Sii and J. Wang. A subjective design for safety framework for offshore engineering products. In Workshops on Reliability and Risk Based Inspection Planning and ESRA Technical Committee on Offshore Safety, Zurich, Switzerland, 2000.
10. R.R. Yager. Non-numeric multi-criteria multi-person decision making. Group Decision and Negotiation, 2:81-93, 1993.
11. J.B. Yang, J. Liu, J. Wang, and H.S. Sii. A generic knowledge-base inference methodology using the evidential reasoning approach - RIMER. IEEE Transactions on Systems, Man, and Cybernetics, to appear, 2004.
12. L.A. Zadeh. The concept of a linguistic variable and its applications to approximate reasoning. Information Sciences, Parts I, II, III, 8:199-249, 8:301-357, 9:43-80, 1975.
SELF-TUNING METHOD FOR FUZZY RULE BASE WITH BELIEF STRUCTURE

JUN LIU¹, LUIS MARTÍNEZ², JIAN-BO YANG¹, JIN WANG³
¹ Manchester School of Management, UMIST, PO Box 88, Manchester M60 1QD, UK
² Dept. of Computer Science, University of Jaén, E-23071 Jaén, Spain
³ School of Engineering, Liverpool John Moores University, Liverpool, UK
A framework for modelling the safety of an engineering system using a fuzzy rule-based evidential reasoning (FURBER) approach has been proposed recently, where a fuzzy rule base designed on the basis of a belief structure (called a belief rule expression matrix) forms a basis of the inference mechanism of FURBER. In this paper, a learning method for optimally training the elements of the belief rule expression matrix and other knowledge representation parameters in FURBER is proposed. This process is formulated as a nonlinear objective function to minimize the differences between the output of a belief rule base and given data. The optimization problem is solved using the optimization tool provided in MATLAB. A numerical example is provided to demonstrate how the method can be implemented.
1. Introduction

A framework for modelling the safety of an engineering system using a fuzzy rule-based evidential reasoning (FURBER) approach was recently proposed [2]. In the framework, a fuzzy rule base designed on the basis of a belief structure is used to capture uncertainty and non-linear relationships between the parameters, and the inference of the rule-based system is implemented using the evidential reasoning algorithm [3]. A belief rule expression matrix forms a basis of the inference mechanism of FURBER. It is a framework for representing expert knowledge, but it is difficult to determine its elements entirely subjectively, in particular for a large-scale rule base with hundreds of rules. Also, a change in a rule weight or an attribute weight may lead to significant changes in the performance of a belief rule base. As such, there is a need to develop a method that can generate an optimal rule expression matrix using expert judgments as well as statistical data. In this paper, a learning method for optimally training the elements of the belief rule expression matrix and other knowledge representation parameters in FURBER is proposed. This process is formulated as a nonlinear objective function to minimize the differences between the output of a belief rule base and given data, and is solved using the optimization tool provided in MATLAB.
2. Fuzzy Rule-Based Evidential Reasoning (FURBER) Approach

This section reviews the FURBER framework [2]. To take into account belief degrees, attribute weights and rule weights in a rule, a belief rule base is given by R = {R_1, R_2, ..., R_L}. The kth rule can be represented as follows:
R_k: IF U is A^k THEN D with belief degrees β̄_k, with a rule weight θ_k and attribute weights δ_1, δ_2, ..., δ_{T_k}, where U represents the antecedent attribute vector (U_1, ..., U_{T_k}); A^k the packet antecedents {A_1^k, ..., A_{T_k}^k}, where A_i^k (i = 1, ..., T_k) is the linguistic value of the ith antecedent attribute in the kth rule; T_k the number of antecedent attributes used in the kth rule; D the consequent vector (D_1, ..., D_N); and β̄_k the vector of belief degrees (β_{1k}, ..., β_{Nk}), for k ∈ {1, ..., L}. This is the vector form of a belief rule; β_{ik} measures the degree to which D_i is the consequent if the input activates the antecedent A^k in the kth rule, for i = 1, ..., N, k = 1, ..., L. L is the number of rules in the rule base and N is the number of possible consequents. The rule base can be summarized using the belief rule expression matrix shown in Table 1.

Table 1: A belief rule expression matrix
In the matrix, w_k is the activation weight of A^k, which measures the degree to which the kth rule is weighted and activated. The degree of activation of the kth rule, w_k, is calculated as:

w_k = θ_k Π_{i=1}^{T_k} (α_i^k)^{δ̄_i} / Σ_{l=1}^{L} θ_l Π_{i=1}^{T_l} (α_i^l)^{δ̄_i},  with δ̄_i = δ_i / max_i{δ_i}  (1)
where α_i^k (i = 1, ..., T_k) is the degree of belief to which the input for U_i belongs to A_i^k of the ith individual antecedent in the kth rule. In this paper we assume that the α_i^k (i = 1, ..., T_k) are already given. Based on the above belief rule expression matrix, we can use the evidential reasoning (ER) approach [3] to combine rules and generate final conclusions. Using the overall ER algorithm [5], the combined degree of belief β_j in D_j is generated as follows:
β_j = μ [ Π_{k=1}^{L} (w_k β_{j,k} + 1 − w_k Σ_{i=1}^{N} β_{i,k}) − Π_{k=1}^{L} (1 − w_k Σ_{i=1}^{N} β_{i,k}) ] / [ 1 − μ Π_{k=1}^{L} (1 − w_k) ],  j = 1, ..., N  (2)

where μ = [ Σ_{j=1}^{N} Π_{k=1}^{L} (w_k β_{j,k} + 1 − w_k Σ_{i=1}^{N} β_{i,k}) − (N − 1) Π_{k=1}^{L} (1 − w_k Σ_{i=1}^{N} β_{i,k}) ]⁻¹. The
constraint conditions are: (a) 0 ≤ u(D_j) ≤ 1, j = 1, ..., N; (b) 0 ≤ β_{jk} ≤ 1, j = 1, ..., N, k = 1, ..., L; (c) 0 ≤ Σ_{j=1}^{N} β_{jk} ≤ 1; (d) 0 ≤ δ_i ≤ 1, i = 1, ..., T_k; (e) 0 ≤ θ_k ≤ 1, k = 1, ..., L. Notice that it is the beliefs used in the belief structure and the activation weights that determine the actual performance of inference.
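Equation (2) can be transcribed directly, e.g. in Python/NumPy; w and beta are assumed to be already-computed activation weights and beliefs:

```python
import numpy as np

def er_combine(w, beta):
    """Analytical ER combination (equation (2)).

    w:    activation weights w_k of the L rules, shape (L,)
    beta: belief matrix beta_{j,k}, shape (N, L)
    Returns the combined beliefs beta_j, shape (N,).
    """
    N, L = beta.shape
    col = w * beta.sum(axis=0)                 # w_k * sum_i beta_ik
    a = np.prod(w * beta + (1 - col), axis=1)  # one product per consequent j
    b = np.prod(1 - col)
    mu = 1.0 / (a.sum() - (N - 1) * b)
    return mu * (a - b) / (1 - mu * np.prod(1 - w))
```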
3. The Optimization Algorithm for FURBER

Depending on the output form, the optimization formulations are given in different forms. The output may be given by expert judgment in one of two forms: (1) a confidence score (numerical judgment); (2) a distributed linguistic assessment with beliefs (subjective judgment).

For case (1): the objective of the tuning method is to determine the belief rule matrix, i.e. the parameter estimates (beliefs, weights, and utilities of the output linguistic terms), that minimize the mean square error criterion

MIN ξ, where ξ = (1/M) Σ_{m=1}^{M} (y_m − ŷ_m)²  (3)

and y_m = Σ_{j=1}^{N} u(D_j) β_j(m). M is the number of points in the training set, ŷ_m is the expected confidence score (the actual output), and (y_m − ŷ_m) is the residual at the mth point. This optimization problem is solved using the optimization function FMINCON provided in MATLAB [1].

For case (2): this is to solve a problem with multiple objective functions using FMINIMAX in MATLAB. The multi-objective function is

MIN (ξ_1, ξ_2, ..., ξ_N)  (4)

where ξ_j = (1/M) Σ_{m=1}^{M} (β_j(m) − β̂_j(m))², j = 1, ..., N. β_j(m) is given by Eq. (2) for the mth input in the training set, M is the number of points in the training set, β̂_j(m) is the expected belief corresponding to the individual consequent D_j, and (β_j(m) − β̂_j(m)) is the residual at the mth point. The tuning parameters are the beliefs and weights, without the utilities.
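A sketch of the case (1) objective, reusing the er_combine function above; here w is a list of per-sample activation weight vectors and u the utility vector, and scipy's minimize plays the role of FMINCON (bound constraints only; the column-sum constraint (c) would need to be added explicitly):

```python
import numpy as np
from scipy.optimize import minimize

def mse_objective(flat_beta, w, y_hat, u, N, L):
    """Equation (3): mean squared error between the model output
    y_m = sum_j u(D_j) * beta_j(m) and the expected scores y_hat."""
    beta = flat_beta.reshape(N, L)
    y = np.array([u @ er_combine(w_m, beta) for w_m in w])
    return np.mean((y - y_hat) ** 2)

# minimize(mse_objective, beta0.ravel(), args=(w, y_hat, u, N, L),
#          bounds=[(0, 1)] * (N * L))
```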
4. A Numerical Example

The same example of an exploratory expert system as in [4] is used here, which aims to determine a confidence degree to which the system believes that a container may contain graphite. To tune the belief rule base, a set of 12 training data is used. The initial beliefs and initial weights are based on the following cases: (1) the belief expression matrix is given by experts; (2) the belief expression matrix is not given by experts. Due to the page limitation, only part of the results is shown here. The test result comparison is illustrated in Fig. 1: C1-C4 are the comparisons between the
275
expected output and four kinds of outputs respectively, i.e., the output based on the trained belief expression matrix, the output based on the belief expression matrix given by experts, the output based on the belief expression matrix given by experts without assigning the belief, the output based on the randomly generated belief expression matrix. Especially, after the parameter optimization, the performance of the system is perfectly achieved (i.e., Cl).
9
10
11 12
, 0.9
E o.a f 0.7
number of test
Fig.l Test results comparison
5. Conclusion A self-tuning method for fuzzy rule base with the belief structure was proposed. This tuning method provided a practical and reliable support for the proposed FURBER approach, which has been shown a reasonable and flexible rule -based inference approach in [2]. References [1] Coleman T., Branch M.A, and Grace A. (1999), Optimization Toolbox—for Use with Matlab, The Mathworks Inc. [2] Liu J., Yang J.B., Wang L, Sii H.S., Wang Y.M. (2004), Fuzzy rule-based evidential reasoning approach for safety analysis, International Journal of General Systems, 33 (2-3), 183-204. [3] Yang J.B. and Xu D.L. (2002), On the evidential reasoning algorithm for multiple attribute decision analysis under uncertainty. IEEE Transactions on Systems, Man, and Cybernetics-Part A, 32 (3), 289-304. [4] Yang J.B., Liu J., Wang J., Sii H.S. (2003), A generic rule-base inference methodology using the evidential reasoning approach- RIMER, accepted by IEEE Transactions on Systems, Man, and Cybernetics. [5] Wang Y.M, Yang J.B. and Xu D.L. (2003), Environmental impact assessment using the evidential reasoning approach., submitted to European Journal of Operational Research.
AGGREGATION OF FUZZY OPINIONS WITH THE MEAN DISTANCE AND A SIMILARITY MEASURE UNDER GROUP DECISION MAKING *
JIBIN LAN
School of Economics and Management, South West Jiaotong University; School of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, P.R. China, 530004
E-mail: [email protected]
Abstract: Under the circumstance that opinions of experts of group decision making are assumed as L - R fuzzy numbers, the membership functions are moved toward the left according to the sizes of their expectations. The similarity degree of two experts’ opinions is calculated for an alternative where their expectations equal zero and the mean distance of two members’ opinions is represented as their opinion differences. Using simple additive weighting (SAW) method , a new aggregation for fuzzy opinions under group decision making is proposed in this paper. For each expert, the degree of similarity between his and the others is averaged. The weight of specific expert’s opinion in aggregation is proportional to the degree of average agreement of that expert. Keywords : fuzzy individual opinions; fuzzy opinion aggregate; fuzzy number; group consensus opinion; the mean distance.
1. Introduction
*This work is supported by the Chinese NSF grant 60074014.

Several methods have been proposed for drawing consensus from the opinions of experts; the methods in 2,6,7,9,10 considered the fuzzy preference relation of each expert. Ishikawa et al. 5 and Xu et al. 11 represented the individual opinions with interval values and obtained the consensus from a cumulative frequency distribution. Hsu proposed a method called a similarity aggregation method
(SAM) to aggregate the individual opinions of experts based on measuring a nearness degree. Hsu suggested that individual opinions should be represented by positive trapezoidal fuzzy numbers. The similarities between fuzzy numbers are then computed pairwise and the average agreement degree of each expert is used. The weight of a specific expert's opinion in the aggregation is proportional to the degree of average agreement of that expert. However, Hsu assumed that the fuzzy numbers representing the opinions of all experts should intersect; otherwise the degree of similarity is zero. In order to avoid this disjoint situation, Hsu suggested that the Delphi method could be employed to modify each expert's opinions. This way is plausible, but we do not think it always works in reality. In this paper, we employ a similarity measure 12 and the mean distance, avoiding the flaws of the cases where the supports of the fuzzy numbers are disjoint. A method using the similarity measure and the mean distance for the aggregation weights is introduced.

2. Preliminaries

Fuzzy numbers are the natural generalization of real, crisp numbers. A fuzzy number is a normal fuzzy subset of the real line with a right semicontinuous and quasi-concave membership function. This definition implies that the α-level set R̃_α = {x : μ_R̃(x) ≥ α} of such a fuzzy subset R̃ is a closed interval [a_α⁻, a_α⁺] for any α ∈ (0, 1], and the support of a fuzzy number R̃ is the crisp set supp R̃ = cl({x : μ_R̃(x) > 0}) = [a_0⁻, a_0⁺], where μ_R̃(x) is the membership function of R̃ and cl is the closure operator. The normality of the fuzzy number implies that the 1-level set R̃_1 = {x : μ_R̃(x) = 1} is not empty. Let us denote a_1 = a_0⁻, a_2 = a_1⁻, a_3 = a_1⁺ and a_4 = a_0⁺. A fuzzy number R̃ is called an L-R fuzzy number if its membership function can be represented as:
-
-
4x1=
{
L ( Y ) ,z < a2; 1, a1 I x I a,; R ( F ) ,z > a3
(1)
where the function R is nonincreasing, left-continuous function of real line satisfying the condition R(0) = 1, the function L is nondecreasing, rightcontinuous function of real line satisfying the condition L(0) = 1, spreads al = a2 - a l , a, = a4 - a3. The functions L and R is called the lek- and right-hand side.
The trapezoidal fuzzy number is the simplest form of L-R fuzzy number. The sides of this fuzzy number are linear; we denote a trapezoidal fuzzy number by $(a_1, a_2, a_3, a_4)$.1,3 When $a_2 = a_3$ we obtain a triangular fuzzy number, the conditions $a_1 = a_2$ and $a_3 = a_4$ yield a closed interval, and in the case $a_1 = a_2 = a_3 = a_4$ we obtain a crisp real number.
Definition 2.1. Let $\tilde X$ be an L-R fuzzy number with membership function $\mu_{\tilde X}(x)$, $x \in R$. If the integral $\int_{-\infty}^{\infty} |x|\,\mu_{\tilde X}(x)\,dx < \infty$, then

$$E(\tilde X) = \frac{\int_{-\infty}^{\infty} x\,\mu_{\tilde X}(x)\,dx}{\int_{-\infty}^{\infty} \mu_{\tilde X}(x)\,dx} \qquad (2)$$

is defined as the expectation of $\tilde X$.
Now assume that $\tilde R_i$ is the L-R fuzzy number representing the $i$th expert's subjective estimate of the rating of an alternative under a given criterion. Let $\tilde R = F(\tilde R_1, \tilde R_2, \ldots, \tilde R_n)$ be the consensus of opinions. How to construct $F$ by combining these evaluated ratings $\tilde R_i$ $(i = 1, 2, \ldots, n)$ is an important issue. In this paper, the simple additive weighting (SAW) method is employed. For example, if the opinions of four experts are represented by the positive trapezoidal fuzzy numbers

$$\tilde R_1 = (1, 2, 3, 4),\quad \tilde R_2 = (1, 3, 3.5, 4),\quad \tilde R_3 = (2, 2.5, 3, 5),\quad \tilde R_4 = (1.5, 2, 3, 4)$$

and their weights are 0.3, 0.4, 0.2 and 0.1, then the result of the aggregation is $\tilde R = 0.3 \odot \tilde R_1 + 0.4 \odot \tilde R_2 + 0.2 \odot \tilde R_3 + 0.1 \odot \tilde R_4 = (1.25, 2.5, 3.2, 4.2)$.
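As a quick check of this arithmetic, here is a minimal sketch (ours, not from the paper) that applies the weighted sum component-wise to the trapezoids:

def saw_aggregate(opinions, weights):
    """Weighted sum of trapezoidal fuzzy numbers given as 4-tuples (a1, a2, a3, a4)."""
    return tuple(
        sum(w * r[k] for w, r in zip(weights, opinions))
        for k in range(4)
    )

opinions = [(1, 2, 3, 4), (1, 3, 3.5, 4), (2, 2.5, 3, 5), (1.5, 2, 3, 4)]
weights = [0.3, 0.4, 0.2, 0.1]
print(saw_aggregate(opinions, weights))  # -> (1.25, 2.5, 3.2, 4.2)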
3. Aggregation Method

In this section we give a new method for measuring the aggregated opinion. Let $\tilde R_1, \tilde R_2, \ldots, \tilde R_n$ be L-R fuzzy numbers representing the experts' opinions. Their membership functions are $\mu_{\tilde R_i}(x)$ $(i = 1, 2, \ldots, n)$ and their expectations $E(\tilde R_i)$ $(i = 1, 2, \ldots, n)$ exist. We move the membership functions $\mu_{\tilde R_i}(x)$ towards the left according to the size of $E(\tilde R_i)$ and obtain new fuzzy numbers $\tilde K_1, \tilde K_2, \ldots, \tilde K_n$ with membership functions $\mu_{\tilde K_i}(x) = \mu_{\tilde R_i}(x + E(\tilde R_i))$ $(i = 1, 2, \ldots, n)$. A new similarity degree between $\tilde R_i$ and $\tilde R_j$ is then defined as

$$S(\tilde R_i, \tilde R_j) = \frac{\int_X \min\{\mu_{\tilde K_i}(x), \mu_{\tilde K_j}(x)\}\,dx}{\int_X \max\{\mu_{\tilde K_i}(x), \mu_{\tilde K_j}(x)\}\,dx}. \qquad (3)$$

The similarity degree $S(\tilde R_i, \tilde R_j)$ between expert $E_i$ and expert $E_j$ is determined by the proportion of the consistent area $\int_X \min\{\mu_{\tilde K_i}(x), \mu_{\tilde K_j}(x)\}\,dx$ to the total area $\int_X \max\{\mu_{\tilde K_i}(x), \mu_{\tilde K_j}(x)\}\,dx$, and it is an index of how similar the opinions of expert $E_i$ and expert $E_j$ are. It represents the similarity degree of expert $E_i$ and expert $E_j$ in the case where their opinion expectations equal zero. Since in many cases experts' opinions do not intersect, it is essential to choose a base point at which to measure the similarity degree of experts' opinions; in this paper, the base point is where the experts' opinion expectations equal zero. Obviously
$$0 \le S(\tilde R_i, \tilde R_j) \le 1, \qquad S(\tilde R_i, \tilde R_j) = S(\tilde R_j, \tilde R_i). \qquad (4)$$

Definition 3.1. Let $\tilde R_i, \tilde R_j$ be two L-R (trapezoidal) fuzzy numbers. The mean distance of $\tilde R_i, \tilde R_j$ is defined by

$$d_m(\tilde R_i, \tilde R_j) = |E(\tilde R_i) - E(\tilde R_j)|. \qquad (5)$$
Since $E(\tilde R_i)$ and $E(\tilde R_j)$ are the centroids of the opinions of expert $E_i$ and expert $E_j$, $d_m(\tilde R_i, \tilde R_j)$ is used as an index of the opinion difference of expert $E_i$ and expert $E_j$. Considering that $S(\tilde R_i, \tilde R_j)$ may be equal to zero, a new agreement measure is defined as

$$T(\tilde R_i, \tilde R_j) = \frac{0.5\,S(\tilde R_i, \tilde R_j)}{1 + d_m(\tilde R_i, \tilde R_j)} + \frac{0.5}{1 + d_m(\tilde R_i, \tilde R_j)}. \qquad (6)$$

$T(\tilde R_i, \tilde R_j)$ considers not only the similarity of the opinions but also the opinion difference of expert $E_i$ and expert $E_j$: the larger $d_m(\tilde R_i, \tilde R_j)$, the smaller the agreement degree; conversely, the larger $S(\tilde R_i, \tilde R_j)$, the larger the similarity degree. The average agreement degree of expert $E_i$ $(i = 1, 2, \ldots, n)$ is obtained by averaging his degrees of agreement with the other experts:

$$A(E_i) = \frac{1}{n-1} \sum_{j=1,\, j \ne i}^{n} T(\tilde R_i, \tilde R_j). \qquad (7)$$
Without considering the $i$th expert's degree of importance, the aggregation weight of $E_i$ is then given by

$$W(E_i) = \frac{A(E_i)}{\sum_{k=1}^{n} A(E_k)}. \qquad (8)$$
Having obtained the weights of the experts' opinions, we can combine all the experts' opinions into a consensus opinion by

$$\tilde R = W(E_1) \odot \tilde R_1 + W(E_2) \odot \tilde R_2 + \cdots + W(E_n) \odot \tilde R_n. \qquad (9)$$
We summarize the method discussed above and propose an algorithm to combine all experts' opinions into a consensus opinion.

Algorithm
Initial step: For the given criterion and an alternative under a group decision making environment, each expert $E_i$ $(i = 1, 2, \ldots, n)$ proposes his opinion as an L-R fuzzy number denoted by $\tilde R_i$ with membership function $\mu_{\tilde R_i}(x)$.
Step 1: Calculate the expectation $E(\tilde R_i)$ of each fuzzy number $\tilde R_i$; then move its membership function towards the left according to the size of its expectation. New fuzzy numbers $\tilde K_1, \tilde K_2, \ldots, \tilde K_n$ with membership functions $\mu_{\tilde K_i}(x) = \mu_{\tilde R_i}(x + E(\tilde R_i))$ $(i = 1, 2, \ldots, n)$ are obtained.
Step 2: Calculate the similarity degrees $S(\tilde R_i, \tilde R_j)$ $(i, j = 1, 2, \ldots, n,\ i \ne j)$ between $\tilde R_i$ and $\tilde R_j$.
Step 3: Calculate the mean distances $d_m(\tilde R_i, \tilde R_j)$ $(i, j = 1, 2, \ldots, n,\ i \ne j)$ between $\tilde R_i$ and $\tilde R_j$.
Step 4: Calculate the agreement degrees $T(\tilde R_i, \tilde R_j)$ $(i, j = 1, 2, \ldots, n,\ i \ne j)$.
Step 5: Calculate the average agreement degree $A(E_i)$ $(i = 1, 2, \ldots, n)$ of each expert $E_i$.
Step 6: Determine the weights $W(E_i)$.
Step 7: Combine all experts' opinions into a consensus opinion $\tilde R$.
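The following is a compact sketch (ours, not the paper's) of the whole algorithm for trapezoidal opinions, using numerical integration on a grid; the expectation is computed as the centroid of Definition 2.1, and the agreement degree follows Eq. (6) as reconstructed above:

import numpy as np

def mu_trap(x, a):
    """Membership of the trapezoid a = (a1, a2, a3, a4) on grid x."""
    a1, a2, a3, a4 = a
    return np.interp(x, [a1, a2, a3, a4], [0.0, 1.0, 1.0, 0.0])

def consensus(opinions):
    grid = np.linspace(-30.0, 30.0, 20001)
    dx = grid[1] - grid[0]
    integral = lambda f: f.sum() * dx
    # Step 1: expectations (centroids) and left-shifted membership functions
    E, shifted = [], []
    for a in opinions:
        m = mu_trap(grid, a)
        e = integral(grid * m) / integral(m)
        E.append(e)
        shifted.append(mu_trap(grid + e, a))    # mu_K(x) = mu_R(x + E(R))
    n = len(opinions)
    # Steps 2-4: similarity, mean distance and agreement degrees
    T = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                S = integral(np.minimum(shifted[i], shifted[j])) / \
                    integral(np.maximum(shifted[i], shifted[j]))
                T[i, j] = (0.5 * S + 0.5) / (1.0 + abs(E[i] - E[j]))
    A = T.sum(axis=1) / (n - 1)                 # Step 5: average agreement
    W = A / A.sum()                             # Step 6: weights, Eq. (8)
    R = tuple(float(sum(W[i] * opinions[i][k] for i in range(n))) for k in range(4))
    return W, R                                 # Step 7: consensus opinion

W, R = consensus([(1, 2, 3, 4), (1, 3, 3.5, 4), (2, 2.5, 3, 5), (1.5, 2, 3, 4)])
print(W, R)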
4. Conclusions
In this paper, an aggregation method is proposed to aggregate individual opinions into a group consensus opinion under group decision making when the opinions are represented by L-R fuzzy numbers. The method extends some of the previous methods in that it can deal not only with the situation where the fuzzy numbers intersect, but also with the situation where they are disjoint. Using the mean distance may also help to avoid the loss of information. It is certainly interesting and useful in group decision making.

References
1. Ph. Diamond, Least squares methods in fuzzy data analysis, Proc. IFSA'91, Brussels, Management & System Science, 1991, pp. 60-64.
2. M. Fedrizzi and J. Kacprzyk, On measuring consensus in the setting of fuzzy preference relations, in: J. Kacprzyk and M. Roubens, Eds., Non-conventional Preference Relations in Decision Making (Springer, Berlin, 1988) 129-141.
3. S. Heilpern, Using a distance between fuzzy numbers in socio-economic systems, in: R. Trappl (Ed.), Cybernetics and Systems '94, World Scientific, Singapore, 1994, pp. 279-286.
4. H. M. Hsu and C. T. Chen, Aggregation of fuzzy opinions under group decision making, Fuzzy Sets and Systems 79 (1996) 279-285.
5. A. Ishikawa, M. Amagasa, T. Shiga, G. Tomizawa, R. Tatsuta and H. Mieno, The max-min Delphi method and fuzzy Delphi method via fuzzy integration, Fuzzy Sets and Systems 55 (1993) 241-253.
6. J. Kacprzyk and M. Fedrizzi, A 'soft' measure of consensus in the setting of partial (fuzzy) preferences, European J. Oper. Res. 34 (1988) 315-325.
7. J. Kacprzyk, M. Fedrizzi and H. Nurmi, Group decision making and consensus under fuzzy preferences and fuzzy majority, Fuzzy Sets and Systems 49 (1992) 21-31.
8. G. Munda et al., Qualitative multicriteria methods for fuzzy evaluation problems: An illustration of economic-ecological evaluation, European J. Oper. Res. 82 (1995) 79-97.
9. H. Nurmi, Approaches to collective decision making with fuzzy preference relations, Fuzzy Sets and Systems 6 (1981) 249-259.
10. T. Tanino, On group decision making under fuzzy preferences, in: J. Kacprzyk and M. Fedrizzi, Eds., Multiperson Decision Making Using Fuzzy Sets and Possibility Theory (Kluwer Academic Publishers, Dordrecht, 1990) 172-185.
11. R. N. Xu and X. Y. Zhai, Extensions of the analytic hierarchy process in fuzzy environment, Fuzzy Sets and Systems 52 (1992) 251-257.
12. R. Zwick, E. Carlstein and D. V. Budescu, Measures of similarity among fuzzy concepts: a comparative analysis, Internat. J. Approx. Reason. 1 (1987) 221-242.
A NEW METHOD WITH PROJECTION TECHNIQUE FOR FUZZY MULTI-ATTRIBUTE DECISION MAKING *
JIBIN LAN
School of Economics and Management, Southwest Jiaotong University; School of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, P.R. China 530004. E-mail: [email protected]
YANG XU AND JIAZHONG LIU
Center of Intelligent Control and Development, Southwest Jiaotong University, Chengdu, Sichuan, P.R. China 610031. E-mail: [email protected]
Abstract: A concept of the left and right projection of one fuzzy vector onto another is introduced in this paper. The purpose is to propose a new method for selecting an optimal alternative in a fuzzy multi-attribute decision making environment. Using the concept of the left and right projection, the difference of each alternative from the fuzzy ideal solution or the fuzzy negative ideal solution is projected onto the fuzzy weight vector. The size of the combination projection coefficient, which combines the left and right projection coefficients, is used as the standard by which each alternative is judged. The decision making criterion is that the larger the combination projection coefficient, the more superior the alternative.
Keywords: alternative; fuzzy weight vector; fuzzy ideal solution; fuzzy vector projection; combination projection coefficient; fuzzy multi-attribute decision making.
1. Introduction

Many different methods have been employed to deal with fuzzy multi-attribute decision making.1,2,3,4,5,6,8,9,10,11 The technique for ordering preference by similarity to ideal solution (TOPSIS) is one of them. The basic idea of TOPSIS is that both the fuzzy positive ideal solution and the fuzzy negative ideal solution serve as frames of reference, and the Hamming distance is employed to measure the differences between each alternative and the fuzzy positive
*This work is supported by the Chinese NSF grant 60074014.
ideal solution or the fuzzy negative ideal solution. The principle of decision making is that the smaller the Hamming distance between an alternative and the fuzzy positive ideal solution, the more superior the alternative; or the larger the Hamming distance between an alternative and the fuzzy negative ideal solution, the more superior the alternative. In this paper, we introduce a concept of the left and right projection of two fuzzy vectors. We consider that the weight vector expresses the decision maker's predilection for every attribute, and that the weight vector should therefore become a standard scale by which to measure whether each alternative is superior or inferior.

2. Preliminaries
A fuzzy number $\tilde R$ is called an L-R fuzzy number if its membership function can be represented by

$$\mu(x) = \begin{cases} L\big(\tfrac{x-a_l}{\alpha_l}\big), & x < a_l; \\ 1, & a_l \le x \le a_r; \\ R\big(\tfrac{x-a_r}{\alpha_r}\big), & x > a_r, \end{cases} \qquad (1)$$

where $R$ is a nonincreasing, left-continuous function on the real line satisfying $R(0)=1$, $L$ is a nondecreasing, right-continuous function satisfying $L(0)=1$, and $\alpha_l, \alpha_r$ are the spreads. The functions $L$ and $R$ are called the left- and right-hand sides. In many applications connected with data analysis we need not full fuzzy data but a simpler form, closed intervals or crisp numbers. We therefore introduce the left and right expected values7 of the fuzzy number $\tilde R$, defined by

$$E_*(\tilde R) = a_l - \alpha_l \int_{-\infty}^{0} L(z)\,dz, \qquad E^*(\tilde R) = a_r + \alpha_r \int_{0}^{\infty} R(z)\,dz.$$
Definition 2.1. Let $\tilde R_1, \tilde R_2, \ldots, \tilde R_n$ be fuzzy numbers with membership functions $\mu_{\tilde R_1}(x), \mu_{\tilde R_2}(x), \ldots, \mu_{\tilde R_n}(x)$. The minimum and maximum fuzzy sets of $\tilde R_1, \tilde R_2, \ldots, \tilde R_n$ are denoted by $\tilde x^-$ and $\tilde x^+$; by the extension principle their membership functions are

$$\mu_{\tilde x^-}(x) = \sup\Big\{\min_i \mu_{\tilde R_i}(x_i) : \min(x_1, \ldots, x_n) = x\Big\}, \qquad \mu_{\tilde x^+}(x) = \sup\Big\{\min_i \mu_{\tilde R_i}(x_i) : \max(x_1, \ldots, x_n) = x\Big\}.$$
3. The new method

The basic model of fuzzy multi-attribute decision making can be described as follows: a given alternative set $A = \{A_1, A_2, \ldots, A_m\}$ and, for each alternative, an attribute set $C = \{C_1, C_2, \ldots, C_n\}$; the attribute set has a counterpart weight vector $\tilde w = (\tilde w_1, \tilde w_2, \ldots, \tilde w_n)$ which expresses the comparative degree of importance of each attribute. There are many forms for expressing the weight vector, but the most common are: (a) a utility preference function; (b) the analytic hierarchy process; and (c) a fuzzy version of the classical linear weighted average. Since subjectivity, vagueness and imprecision enter into the assessments of decision makers, we assume that the attribute indices and the weight values are L-R fuzzy numbers. Each alternative $A_k$ $(k = 1, 2, \ldots, m)$ can then be represented as

$$A_k = (\tilde x_{k1}, \tilde x_{k2}, \ldots, \tilde x_{kn}). \qquad (1)$$
Definition 3.1. Let $v_1 = (v_{11}, v_{12}, \ldots, v_{1n})$ be a fuzzy vector, where $v_{1j}$, $j = 1, 2, \ldots, n$, are L-R fuzzy numbers. The left and right expected vectors are defined as

$$E_*(v_1) = (E_*(v_{11}), E_*(v_{12}), \ldots, E_*(v_{1n})), \qquad E^*(v_1) = (E^*(v_{11}), E^*(v_{12}), \ldots, E^*(v_{1n})). \qquad (2)$$
Definition 3.2. Let $v_1 = (v_{11}, v_{12}, \ldots, v_{1n})$ and $v_2 = (v_{21}, v_{22}, \ldots, v_{2n})$ be two fuzzy vectors. The cosines of the left and right expected angles of $v_1$ and $v_2$ are defined as

$$\cos(E_*(v_1), E_*(v_2)) = \frac{\sum_{j=1}^{n} E_*(v_{1j})\,E_*(v_{2j})}{\sqrt{\sum_{j=1}^{n} E_*(v_{1j})^2 \sum_{j=1}^{n} E_*(v_{2j})^2}}, \qquad \cos(E^*(v_1), E^*(v_2)) = \frac{\sum_{j=1}^{n} E^*(v_{1j})\,E^*(v_{2j})}{\sqrt{\sum_{j=1}^{n} E^*(v_{1j})^2 \sum_{j=1}^{n} E^*(v_{2j})^2}}. \qquad (3)$$
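For trapezoidal fuzzy numbers $(a_1, a_2, a_3, a_4)$ the expected values reduce to $E_*(\tilde a) = (a_1+a_2)/2$ and $E^*(\tilde a) = (a_3+a_4)/2$, so Definitions 3.1-3.2 are easy to compute. A minimal sketch (ours; the two fuzzy vectors are hypothetical examples):

import math

def expected_pair(a):
    a1, a2, a3, a4 = a
    return (a1 + a2) / 2, (a3 + a4) / 2   # (E_*, E^*) of a trapezoid

def cosine(u, v):
    num = sum(x * y for x, y in zip(u, v))
    return num / math.sqrt(sum(x * x for x in u) * sum(y * y for y in v))

v1 = [(1, 2, 3, 4), (0, 1, 1, 2)]          # fuzzy vector of trapezoids
v2 = [(2, 3, 3, 4), (1, 2, 2, 3)]
E1_lo, E1_hi = zip(*[expected_pair(a) for a in v1])
E2_lo, E2_hi = zip(*[expected_pair(a) for a in v2])
print(cosine(E1_lo, E2_lo), cosine(E1_hi, E2_hi))  # left/right expected angles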
Now let us consider the fuzzy multi-attribute decision making problem. Given the fuzzy weight vector $\tilde w = (\tilde w_1, \tilde w_2, \ldots, \tilde w_n)$, its left and right normalized expected vectors are $E_*(\tilde w) = (E_*(\tilde w_1), E_*(\tilde w_2), \ldots, E_*(\tilde w_n))$ and $E^*(\tilde w) = (E^*(\tilde w_1), E^*(\tilde w_2), \ldots, E^*(\tilde w_n))$. The positive ideal solution $P^+$ is defined as

$$P^+ = (P_1^+, P_2^+, \ldots, P_n^+) \qquad (4)$$

where $P_j^+$ equals the maximum fuzzy set of $\tilde x_{ij}$ $(i = 1, 2, \ldots, m)$ if and only if attribute $C_j$ is a proceeds (benefit) attribute index, or $P_j^+$ equals the minimum fuzzy set of $\tilde x_{ij}$ $(i = 1, 2, \ldots, m)$ if and only if attribute $C_j$ is a cost attribute index. The negative ideal solution $N^-$ is defined as

$$N^- = (N_1^-, N_2^-, \ldots, N_n^-) \qquad (5)$$

where $N_j^-$ equals the minimum fuzzy set of $\tilde x_{ij}$ $(i = 1, 2, \ldots, m)$ if and only if attribute $C_j$ is a proceeds attribute index, or $N_j^-$ equals the maximum fuzzy set of $\tilde x_{ij}$ $(i = 1, 2, \ldots, m)$ if and only if attribute $C_j$ is a cost attribute index. We construct the difference between alternative $A_i$ and the positive ideal solution $P^+$ as

$$D_i^+ = (d_{i1}^+, d_{i2}^+, \ldots, d_{in}^+) \qquad (6)$$

where $d_{ij}^+ = P_j^+ - \tilde x_{ij}$ if attribute $C_j$ is a proceeds attribute index, or $d_{ij}^+ = \tilde x_{ij} - P_j^+$ if attribute $C_j$ is a cost attribute index. The difference between alternative $A_i$ and the negative ideal solution $N^-$ is constructed as

$$D_i^- = (d_{i1}^-, d_{i2}^-, \ldots, d_{in}^-) \qquad (7)$$

where $d_{ij}^- = \tilde x_{ij} - N_j^-$ if attribute $C_j$ is a proceeds attribute index, or $d_{ij}^- = N_j^- - \tilde x_{ij}$ if attribute $C_j$ is a cost attribute index. In order to remove the differences of units among the attribute indices, $D_i^+$ and $D_i^-$ should be normalized. The normalized left and right expected values are denoted by
$$E_*(\hat d_{ij}^+) = \frac{E_*(d_{ij}^+)}{\max_{1 \le k \le m} E_*(d_{kj}^+)},\quad E^*(\hat d_{ij}^+) = \frac{E^*(d_{ij}^+)}{\max_{1 \le k \le m} E^*(d_{kj}^+)},\quad E_*(\hat d_{ij}^-) = \frac{E_*(d_{ij}^-)}{\max_{1 \le k \le m} E_*(d_{kj}^-)},\quad E^*(\hat d_{ij}^-) = \frac{E^*(d_{ij}^-)}{\max_{1 \le k \le m} E^*(d_{kj}^-)}. \qquad (8)$$

The normalized left expected vector of $\hat D_i^-$ is denoted by $E_*(\hat D_i^-) = (E_*(\hat d_{i1}^-), E_*(\hat d_{i2}^-), \ldots, E_*(\hat d_{in}^-))$ and the normalized right expected vector by $E^*(\hat D_i^-) = (E^*(\hat d_{i1}^-), E^*(\hat d_{i2}^-), \ldots, E^*(\hat d_{in}^-))$; similarly for $\hat D_i^+$. Let the normalized $\hat D_i^-$ and $\hat D_i^+$ project on the weight vector $\tilde w$. We obtain the normalized left projection coefficients as
$$\rho_{*\hat D_i^-} = \Big(\frac{\sum_{j=1}^{n} E_*(\hat d_{ij}^-)^2}{\sum_{j=1}^{n} E_*(\tilde w_j)^2}\Big)^{1/2} \cos(E_*(\hat D_i^-), E_*(\tilde w)), \qquad \rho_{*\hat D_i^+} = \Big(\frac{\sum_{j=1}^{n} E_*(\hat d_{ij}^+)^2}{\sum_{j=1}^{n} E_*(\tilde w_j)^2}\Big)^{1/2} \cos(E_*(\hat D_i^+), E_*(\tilde w)) \qquad (9)$$

and the normalized right projection coefficients as

$$\rho^*_{\hat D_i^-} = \Big(\frac{\sum_{j=1}^{n} E^*(\hat d_{ij}^-)^2}{\sum_{j=1}^{n} E^*(\tilde w_j)^2}\Big)^{1/2} \cos(E^*(\hat D_i^-), E^*(\tilde w)), \qquad \rho^*_{\hat D_i^+} = \Big(\frac{\sum_{j=1}^{n} E^*(\hat d_{ij}^+)^2}{\sum_{j=1}^{n} E^*(\tilde w_j)^2}\Big)^{1/2} \cos(E^*(\hat D_i^+), E^*(\tilde w)) \qquad (10)$$

where $\cos(E_*(\hat D_i^-), E_*(\tilde w))$, $\cos(E_*(\hat D_i^+), E_*(\tilde w))$, $\cos(E^*(\hat D_i^-), E^*(\tilde w))$ and $\cos(E^*(\hat D_i^+), E^*(\tilde w))$ are the cosines of the left and right expected angles of $\hat D_i^-$ and $\hat D_i^+$ with $\tilde w$. We construct the combination projection coefficients as
$$p_i^-(\lambda) = \lambda\,\rho_{*\hat D_i^-} + (1 - \lambda)\,\rho^*_{\hat D_i^-}, \qquad p_i^+(\lambda) = \lambda\,\rho_{*\hat D_i^+} + (1 - \lambda)\,\rho^*_{\hat D_i^+} \qquad (11)$$
where $\lambda \in [0, 1]$. Now we can construct a measure function as

$$f_i(\lambda) = \frac{p_i^-(\lambda)}{p_i^-(\lambda) + p_i^+(\lambda)}. \qquad (12)$$
When $\lambda \in [0, 1]$ is given, we get a new decision making method. The combination projection coefficient $p_i^-(\lambda)$ is a function of $\lambda$ and can be interpreted as the total difference of alternative $A_i$ from the negative ideal $N^-$; the combination projection coefficient $p_i^+(\lambda)$ is a function of $\lambda$ and can be interpreted as the total difference of alternative $A_i$ from the positive ideal $P^+$. Obviously, the larger $p_i^+(\lambda)$, the more inferior the alternative $A_i$; conversely, the larger $p_i^-(\lambda)$, the more superior the alternative $A_i$, and hence the larger $f_i(\lambda)$, the more superior the alternative $A_i$.

Algorithm (PISM)
Step 1: According to the decision maker's preference, determine the weight vector $\tilde w$, then calculate the left and right normalized expected vectors $E_*(\tilde w)$, $E^*(\tilde w)$.
Step 2: According to the fuzzy indices, calculate the fuzzy positive ideal solution $P^+ = (P_1^+, P_2^+, \ldots, P_n^+)$ and the fuzzy negative ideal solution $N^- = (N_1^-, N_2^-, \ldots, N_n^-)$.
Step 3: Calculate the differences $D_i^+$ between alternative $A_i$ and the positive ideal solution $P^+$ and the differences $D_i^-$ between alternative $A_i$ and the negative ideal solution $N^-$.
Step 4: Normalize the left and right expectations of $D_i^+$, $D_i^-$.
Step 5: Calculate the left and right projection coefficients $\rho_{*\hat D_i^+}$, $\rho_{*\hat D_i^-}$, $\rho^*_{\hat D_i^+}$, $\rho^*_{\hat D_i^-}$ of $\hat D_i^+$, $\hat D_i^-$.
Step 6: Given $\lambda \in [0, 1]$, calculate the combination projection coefficients $p_i^+(\lambda)$, $p_i^-(\lambda)$ and the measure function $f_i(\lambda)$.
Step 7: Rank the alternatives according to the size of $f_i(\lambda)$: the larger $f_i(\lambda)$, the more superior the alternative $A_i$.
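The PISM steps can be sketched compactly for trapezoidal data, for which the left and right expected values reduce to $(a_1+a_2)/2$ and $(a_3+a_4)/2$. The sketch below is ours and simplifies the paper's construction: the fuzzy ideal solutions are replaced by element-wise extrema of the expected values, all attributes are treated as benefit attributes, and the measure uses the closeness ratio assumed in Eq. (12). Note that the ratio form of the projection coefficient, $(\sum d^2/\sum w^2)^{1/2}\cos(d,w)$, equals $\langle d, w\rangle/\langle w, w\rangle$, which is what the code computes.

import numpy as np

def expected(a):                        # (E_*, E^*) of a trapezoid
    a1, a2, a3, a4 = a
    return (a1 + a2) / 2, (a3 + a4) / 2

def proj_coeff(d, w):                   # (|d|/|w|) cos(d, w) = <d, w> / <w, w>
    return float(np.dot(d, w) / np.dot(w, w))

def pism(ratings, weights, lam=0.5):
    # ratings: m x n trapezoids; weights: n trapezoids; benefit attributes only
    scores = []
    for side, coef in ((0, lam), (1, 1 - lam)):
        E = np.array([[expected(a)[side] for a in row] for row in ratings])
        w = np.array([expected(v)[side] for v in weights])
        d_plus = E.max(axis=0) - E          # difference from the ideal, per column
        d_minus = E - E.min(axis=0)         # difference from the negative ideal
        d_plus = d_plus / np.maximum(d_plus.max(axis=0), 1e-12)    # Eq. (8)
        d_minus = d_minus / np.maximum(d_minus.max(axis=0), 1e-12)
        scores.append((coef * np.array([proj_coeff(r, w) for r in d_plus]),
                       coef * np.array([proj_coeff(r, w) for r in d_minus])))
    p_plus = scores[0][0] + scores[1][0]    # combination coefficients, Eq. (11)
    p_minus = scores[0][1] + scores[1][1]
    return p_minus / (p_minus + p_plus)     # larger f -> more superior alternative

ratings = [[(1, 2, 3, 4), (2, 3, 4, 5)], [(2, 3, 4, 5), (1, 1.5, 2, 3)], [(0, 1, 2, 3), (3, 4, 4, 5)]]
weights = [(0.5, 1, 1, 1.5), (1, 2, 2, 3)]
print(pism(ratings, weights, lam=0.5))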
4. Conclusion
In this paper, the method (PISM) with a projection technique for fuzzy multi-attribute decision making is introduced. The method extends TOPSIS. It is certainly interesting and useful in multi-attribute decision making.

References
1. S. M. Baas and H. Kwakernaak, Rating and ranking of multiple-aspect alternatives using fuzzy sets, Automatica 13 (1977) 47-58.
2. J. J. Buckley, The multiple judge, multiple criteria ranking problem: A fuzzy set approach, Fuzzy Sets and Systems 13 (1984) 25-38.
3. Ph. Diamond, Least squares methods in fuzzy data analysis, Proc. IFSA'91, Brussels, Management & System Science, 1991, pp. 60-64.
4. D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, New York, 1980).
5. J. Efstathiou and V. Rajkovic, Multi-attribute decision making using a fuzzy heuristic approach, IEEE Trans. Systems, Man, and Cybernetics 9 (1979) 326-333.
6. P. Guo, Self-organizing fuzzy aggregation model to rank the objects with multiple attributes, IEEE Transactions on SMC: Part A 30 (2000) 573-580.
7. S. Heilpern, The expected value of a fuzzy number, Fuzzy Sets and Systems 47 (1992) 81-87.
8. H. Kwakernaak, An algorithm for rating multiple-aspect alternatives using fuzzy sets, Automatica 15 (1979) 615-616.
9. K. Nakamura, Preference relation on a set of fuzzy utilities as a basis for decision making, Fuzzy Sets and Systems 20 (1986) 147-162.
10. E. Takeda and T. Nishida, Multiple criteria decision problems with fuzzy domination structures, Fuzzy Sets and Systems 2 (1980) 123-136.
11. R. R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. Systems, Man, and Cybernetics 18 (1988) 183-190.
A NEW CRITERION FOR FUZZY MULTI-ATTRIBUTE DECISION MAKING *
JIBIN LAN
School of Economics and Management, Southwest Jiaotong University, Chengdu, Sichuan, P.R. China 610031. E-mail: [email protected]
YANG XU AND JIAZHONG LIU
Center of Intelligent Control and Development, Southwest Jiaotong University, Chengdu, Sichuan, P.R. China 610031. E-mail: [email protected]
Abstract: A concept of the left and right projection of one fuzzy vector onto another is introduced in this paper. Using the concept of the left and right projection, each alternative is projected onto the fuzzy weight vector. The size of the combination projection coefficient, which combines the left projection coefficient with the right one, is used as the standard by which each alternative is judged.
Keywords: alternative; fuzzy weight vector; fuzzy vector projection; combination projection coefficient; fuzzy multi-attribute decision making.
1. Introduction
Several methods have been proposed to deal with fuzzy multi-attribute decision making (MADM). Laarhoven and Pedrycz's method7 is based on hierarchical aggregation and is similar to Saaty's method8, with the difference that it fuzzifies the pairwise comparisons of the criteria and alternative ratings. Bellman and Zadeh2 proposed the max-min principle to deal with multiple attribute problems; the approach states that one can look at a decision as a goal and attributes which, combined, form the decision space. The decision space is a fuzzy set whose membership function is the degree to which each alternative is a solution. The combination operation uses the intersection to express the 'and' connective between the goal and the attributes, and the optimal alternative is the one whose membership achieves the maximum. Yager10,11 employed the max-min principle of the Bellman and Zadeh approach to deal with MADM; the main difference is that the importances of the attributes are represented as exponential scalars. The weighted average method1,4,5 is the most commonly used algorithm; the objective function for aggregating the preference and attribute values is

$$u(A_i) = \frac{\sum_{j=1}^{n} \tilde w_j\,\tilde x_{ij}}{\sum_{j=1}^{n} \tilde w_j} \qquad (1)$$

where $A_i$, $\tilde w_j$ and $\tilde x_{ij}$ represent alternative $i$, the weight of attribute $j$ and the relative merit of attribute $j$ for alternative $i$, respectively.

*This work is supported by the Chinese NSF grant 60074014.

2. The new aggregation rule
A fuzzy number $\tilde R$ is called an L-R fuzzy number if its membership function can be represented by

$$\mu(x) = \begin{cases} L\big(\tfrac{x-a_l}{\alpha_l}\big), & x < a_l; \\ 1, & a_l \le x \le a_r; \\ R\big(\tfrac{x-a_r}{\alpha_r}\big), & x > a_r, \end{cases} \qquad (2)$$

where $R$ is a nonincreasing, left-continuous function on the real line satisfying $R(0)=1$, $L$ is a nondecreasing, right-continuous function satisfying $L(0)=1$, and $\alpha_l, \alpha_r$ are the spreads. The functions $L$ and $R$ are called the left- and right-hand sides. The left and right expected values of $\tilde R$ can be defined by6

$$E_*(\tilde R) = a_l - \alpha_l \int_{-\infty}^{0} L(z)\,dz, \qquad E^*(\tilde R) = a_r + \alpha_r \int_{0}^{\infty} R(z)\,dz. \qquad (3)$$
Definition 2.1. Let $v_1 = (v_{11}, v_{12}, \ldots, v_{1n})$ and $v_2 = (v_{21}, v_{22}, \ldots, v_{2n})$ be two fuzzy vectors. The cosines of the left and right expected angles of $v_1$ and $v_2$ are defined as

$$\cos(v_{1*}, v_{2*}) = \frac{\sum_{j=1}^{n} E_*(v_{1j})\,E_*(v_{2j})}{\sqrt{\sum_{j=1}^{n} E_*(v_{1j})^2 \sum_{j=1}^{n} E_*(v_{2j})^2}}, \qquad \cos(v_1^*, v_2^*) = \frac{\sum_{j=1}^{n} E^*(v_{1j})\,E^*(v_{2j})}{\sqrt{\sum_{j=1}^{n} E^*(v_{1j})^2 \sum_{j=1}^{n} E^*(v_{2j})^2}}. \qquad (4)$$
Now let us consider the fuzzy multi-attribute decision making problem. Given the fuzzy weight vector $\tilde W = (\tilde W_1, \tilde W_2, \ldots, \tilde W_n)$, its left and right expected vectors are $E_*(\tilde W) = (E_*(\tilde W_1), E_*(\tilde W_2), \ldots, E_*(\tilde W_n))$ and $E^*(\tilde W) = (E^*(\tilde W_1), E^*(\tilde W_2), \ldots, E^*(\tilde W_n))$. In order to remove the differences of units among the attribute indices, the left and right expected values should be normalized. For proceeds (benefit) attribute indices, the normalized left and right expected values are

$$E_*(\hat x_{kj}) = \frac{E_*(\tilde x_{kj}) - \min_{1 \le i \le m} E_*(\tilde x_{ij})}{\max_{1 \le i \le m} E_*(\tilde x_{ij}) - \min_{1 \le i \le m} E_*(\tilde x_{ij})}, \qquad E^*(\hat x_{kj}) = \frac{E^*(\tilde x_{kj}) - \min_{1 \le i \le m} E^*(\tilde x_{ij})}{\max_{1 \le i \le m} E^*(\tilde x_{ij}) - \min_{1 \le i \le m} E^*(\tilde x_{ij})}. \qquad (5)$$

For cost attribute indices, the normalized left and right expected values are

$$E_*(\hat x_{kj}) = \frac{\max_{1 \le i \le m} E_*(\tilde x_{ij}) - E_*(\tilde x_{kj})}{\max_{1 \le i \le m} E_*(\tilde x_{ij}) - \min_{1 \le i \le m} E_*(\tilde x_{ij})}, \qquad E^*(\hat x_{kj}) = \frac{\max_{1 \le i \le m} E^*(\tilde x_{ij}) - E^*(\tilde x_{kj})}{\max_{1 \le i \le m} E^*(\tilde x_{ij}) - \min_{1 \le i \le m} E^*(\tilde x_{ij})}. \qquad (6)$$

Let alternative $A_k$ project on $\tilde W$; we obtain the normalized left and right projection coefficients as

$$\rho_{*A_k} = \Big(\frac{\sum_{j=1}^{n} E_*(\hat x_{kj})^2}{\sum_{j=1}^{n} E_*(\tilde W_j)^2}\Big)^{1/2} \cos(\hat A_{k*}, \tilde W_*), \qquad \rho^*_{A_k} = \Big(\frac{\sum_{j=1}^{n} E^*(\hat x_{kj})^2}{\sum_{j=1}^{n} E^*(\tilde W_j)^2}\Big)^{1/2} \cos(\hat A_k^*, \tilde W^*) \qquad (7)$$
where $\cos(\hat A_{k*}, \tilde W_*)$ and $\cos(\hat A_k^*, \tilde W^*)$ are the cosines of the left and right expected angles of $\hat A_k$ with $\tilde W$. We construct the combination projection coefficient as

$$p_{\tilde W A_k}(\lambda) = \lambda\,\rho_{*A_k} + (1 - \lambda)\,\rho^*_{A_k} \qquad (8)$$

where $\lambda \in [0, 1]$. The left and right expected values are the centroids of the left and right sides of an L-R fuzzy number, so the left and right expected vectors $E_*(A_k)$, $E^*(A_k)$ are the left and right centroids of alternative $A_k$. The left and right projection coefficients $\rho_{*A_k}$, $\rho^*_{A_k}$ denote the sizes of the projections of the normalized left and right centroids of $A_k$ on the left and right centroids of $\tilde W$. Since the fuzzy weight vector $\tilde W$ represents the decision maker's preference, the decision making criterion is supposed to accord with that preference, and the fuzzy weight vector $\tilde W$ ought to become a standard scale by which to measure each alternative. The combination projection coefficient $p_{\tilde W A_k}(\lambda)$ can become a scale by which to measure whether each alternative is superior or inferior, so it is reasonable to take it as the utility of $A_k$. For a given $\lambda \in [0, 1]$, the larger $p_{\tilde W A_k}(\lambda)$, the more superior the alternative.
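A minimal sketch (ours) of this ranking rule for trapezoidal data: normalize the left and right expected values per attribute, project each alternative on the weight vector, and combine with $\lambda$. All attributes are treated as benefit attributes, and the ratio form of the projection coefficient is computed as $\langle \hat A_k, W\rangle/\langle W, W\rangle$:

import numpy as np

def expected(a):
    a1, a2, a3, a4 = a
    return (a1 + a2) / 2, (a3 + a4) / 2

def rank(ratings, weights, lam=0.5):
    scores = np.zeros(len(ratings))
    for side, coef in ((0, lam), (1, 1 - lam)):
        E = np.array([[expected(a)[side] for a in row] for row in ratings])
        w = np.array([expected(v)[side] for v in weights])
        lo, hi = E.min(axis=0), E.max(axis=0)
        En = (E - lo) / np.maximum(hi - lo, 1e-12)   # benefit normalization, Eq. (5)
        scores += coef * En @ w / (w @ w)            # projection coefficient on W
    return scores                                    # larger -> more superior

ratings = [[(1, 2, 3, 4), (2, 3, 4, 5)], [(2, 3, 4, 5), (1, 2, 2, 3)]]
weights = [(0.5, 1, 1, 1.5), (1, 2, 2, 3)]
print(rank(ratings, weights))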
3. Conclusion
In this paper, an aggregation rule for fuzzy multi-attribute decision making is introduced. It is certainly interesting and useful in multi-attribute decision making.

References
1. S. M. Baas and H. Kwakernaak, Rating and ranking of multiple-aspect alternatives using fuzzy sets, Automatica 13 (1977) 47-58.
2. R. E. Bellman and L. A. Zadeh, Decision-making in a fuzzy environment, Management Science 17(4) (1970) B141-B164.
3. D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, New York, 1980).
4. W. M. Dong, H. C. Shah and F. S. Wong, Fuzzy computations in risk and decision analysis, Civil Engrg. Systems 2 (1985) 201-208.
5. W. M. Dong and F. S. Wong, Fuzzy weighted averages and implementation of the extension principle, Fuzzy Sets and Systems 21 (1987) 183-199.
6. S. Heilpern, The expected value of a fuzzy number, Fuzzy Sets and Systems 47 (1992) 81-87.
7. P. J. M. van Laarhoven and W. Pedrycz, A fuzzy extension of Saaty's priority theory, Fuzzy Sets and Systems 11 (1983) 229-241.
8. T. L. Saaty, Modelling unstructured decision problems: the theory of analytical hierarchies, Math. Comput. Simulation 20 (1978) 147-158.
9. Yingming Wang, A projection method for multi-indices decision making, Statistics Research (4) (1998) 66-69 (in Chinese).
10. R. R. Yager, Fuzzy decision making including unequal objectives, Fuzzy Sets and Systems 1 (1978) 87-95.
11. R. R. Yager, On ordered weighted averaging aggregation operators in multicriteria decision making, IEEE Trans. Systems, Man, and Cybernetics 18(1) (1988) 183-190.
A SYSTEMS ANALYSIS OF IMPROVING THE REASONABILITY OF APPRAISEMENT SYSTEM OF HUMAN RESOURCE*
XIAOHONG LIU
College of Management, Southwest University for Nationalities, Sichuan, Chengdu, 610041, China. [email protected]
SHUWEI CHEN
Intelligent Control Development Center, Southwest Jiaotong University, Sichuan, Chengdu, 610031, China. [email protected]
YANG XU
Intelligent Control Development Center, Southwest Jiaotong University, Sichuan, Chengdu, 610031, China. [email protected]

The appraisement of enterprise human resource is an important aspect of modern human resource management. Improving the rationality of human resource appraisement is very important for meeting the requirements of the information society and for strengthening an organization's cohesive affinity. The appraisement system of human resource should possess a certain level of rationality, which has important meaning for every organization. This paper puts forward a model of the working process of the appraisement system of human resource and, further, analyzes the influence of the appraisement object, the appraisement quantity table, the appraisement corpus, the synthetic method for appraisement results, the application and feedback of appraisement results, the appraisement policymaker, environmental factors, etc.
1. Put forward the problem
The reasonability of an appraisement system of human resource is a very important aspect of the modern management of human resource; it concerns an organization's development and survival. Especially in the information society, an appraisement system of human resource that lacks sufficient reasonability will produce a very notable negative influence on the organization and may even endanger it. Therefore, studying the reasonability of the appraisement system of human resource has theoretical meaning and practical application value. The theoretical achievements of studies on the reasonability of the appraisement system of human resource are unsatisfactory up to now; moreover, this is a problem that urgently needs to be solved in practice. The authors think that the appraisement of human resource is a systematic project, and that it is necessary to analyze one by one the major factors which influence the reasonability of an appraisement of human resource. This paper establishes a model of the working process of the appraisement system of human resource, suggests the major factors that form an appraisement system, and analyzes one by one the methods to improve the reasonability of the appraisement of human resource.

* This work is supported by National Social Science Foundation of China (Grant No. 03CJL004) and National Natural Science Foundation of China (Grant No. 60074014).

2. The model of the working process of the appraisement system of human resource
The normal course of operation of the appraisement system of human resource is as follows: under certain environmental conditions, the appraisement policymaker invites field experts to give the appraisement quantity table, selects the appraisement corpus and asks it to give, according to the appraisement quantity table, appraisement results for the appraisement object, composes the results of the appraisement, and finally carries out their application and feedback. The working process of the appraisement system of human resource is shown in Fig. 1.
[Fig. 1. The working process of the appraisement system of human resource: the appraisement policymaker, under given environment conditions, provides the appraisement quantity table; the appraisement corpus appraises the appraisement object; the results pass through the composing method to application and feedback, which produces the output.]

3. Systems analysis of improving the reasonability of the system of appraisement of human resource
It is very important for each organization to improve the reasonability of its system of appraisement of human resource. The appraisement of human resource is a systematic project, and we should analyze the essential factors that form the system of appraisement of human resource.
3.1. Appraisement object
One of the major purposes of the appraisement of human resource is to improve the basic quality of the appraisement object. The appraisement object is the person who is evaluated in the appraisement system of human resource. Factors such as ideological ideas, value concepts, professional morals, working style and the degree of participation in the evaluation have an important influence on the appraisement system of human resource. Therefore, we can improve the basic quality of the appraisement object by building input and output functions, reinforcing training, and so on.
3.2. Appraisement quantity table

The appraisement quantity table is the major objective basis of the appraisement of human resource. It is necessary that the appraisement quantity table be objective and quantified so that it can be operated in practice.
3.3. Appraisement corpus

The appraisement corpus should comprise various people, such as the direct leader, colleagues, subordinates, and those upstream and downstream of the working business flow. Moreover, it is necessary to consider the relative importance of the different members of the appraisement corpus, and the method is to design weights for the appraisement corpus.

3.4. The synthetic method of the appraisement result
It is universal that the opinions of the appraisement corpus conflict. The D-S evidence-theoretic synthetic formula is a comparatively effective and proper method for the composition. The authors have suggested a kind of evidence composition formula that considers the different importance (weight) of each piece of evidence, which is used in the composition of appraisement results to resolve the conflicts among the opinions of the appraisement corpus.

3.5. The application and feedback of the appraisement

Another major purpose of the appraisement system of human resource is the application and feedback of the appraisement results. Through application and feedback, on the one hand we guarantee that the organization's goal is realized (this is the final goal of the management of human resource, also called the indirect goal); on the other hand, the organization's human resource management policy is fulfilled (this is the direct goal of the management of human resource).
3.6. Appraisement policymaker

Appraisement policymakers conduct the appraisement and are the major decision makers of the appraisement system of human resource. Their attitude, policy level, knowledge, ability and fairness have an important influence on the reasonability of the appraisement system of human resource.

3.7. Environmental suitability

The appraisement system of human resource is carried out under certain environmental conditions. On the basis of forecasting the environmental factors, we should establish countermeasures corresponding to changes of the environment.

4. Conclusion
This paper has built a model of the working process of the appraisement system of human resource, and has analyzed the methods to improve the reasonability of the appraisement system of human resource. This work is useful for building the theory of the appraisement of human resource and for guiding the practice of human resource management.
A MODEL OF EVALUATION OF AN APPRAISEMENT SYSTEM OF HUMAN RESOURCE*
XIAOHONG LIU
College of Management, Southwest University for Nationalities, Sichuan, Chengdu, 610041, China. [email protected]
SHUWEI CHEN
Intelligent Control Development Center, Southwest Jiaotong University, Sichuan, Chengdu, 610031, China. [email protected]
YANG XU
Intelligent Control Development Center, Southwest Jiaotong University, Sichuan, Chengdu, 610031, China. [email protected]

It is necessary that the appraisement system of human resource possess sufficient reasonability. An organization must strengthen its key competitive ability in the information society and therefore needs to strengthen its cohesive affinity by various effective means; a reasonable appraisement system of human resource is one important means of strengthening an organization's cohesive affinity. This paper puts forward a model for evaluating an appraisement system of human resource.
1. About the background of the research
According to the basic conditions of human resource, such as psychology, interest, knowledge, ability, body and performance, the appraisement of enterprise human resource is the course of estimating and judging the value of human resource (including both existing value and latent value). The appraisement of enterprise human resource is used to offer a scientific and objective basis for the choice of future occupations of human resource and for job adjustment; its purpose is to dispose human resource reasonably and to improve the efficiency of resource use. The appraisement system of human resource is a very important factor affecting working satisfaction, and it is one important aspect of the modern management of human resource. Along with the rising informatization of society, and especially the wide use of the Internet, the speed at which every kind of information spreads is greater and greater and the space of its spread is wider and wider, which has a notable influence on the appraisement system of human resource. It is necessary that the appraisement system of human resource possess sufficient reasonability. An organization must strengthen its key competitive ability in the information society and therefore needs to strengthen its cohesive affinity by various effective means; a reasonable appraisement system of human resource is one important means of strengthening an organization's cohesive affinity.

* This work is supported by National Social Science Foundation of China (Grant No. 03CJL004) and National Natural Science Foundation of China (Grant No. 60074014).
2. The factors of the appraisement system of human resource

The appraisement of human resource is a systematic project. An appraisement system of human resource should have a special function, and it is composed of the appraisement object, the appraisement quantity table, the appraisement corpus, the composition of appraisement results, the application and feedback of appraisement results, the appraisement policymaker and the environment. Let $X_0, X_1, X_2, X_3, X_4, X_5$ and $X_6$ denote the appraisement object, appraisement quantity table, appraisement corpus, appraisement result composition, appraisement result application and feedback, appraisement policymaker and environment; then the appraisement system of human resource can be expressed as the set

$$Y_s = \{X_0, X_1, \ldots, X_6\}. \qquad (1)$$

In Eq. (1), if there exists $i \in \{0, 1, 2, \ldots, 6\}$ such that $X_i \ne X_i'$, then the two appraisement systems are not identical. There are different appraisement systems of human resource serving different persons, such as administrators, technicians and workers; the discrepancy is especially obvious in the appraisement quantity table. Therefore, there are many appraisement systems of human resource in an organization.

3. A model of evaluation of the reasonability of the appraisement system of human resource
3.1. Index of the model of evaluation

This paper puts forward an index system for the model of evaluation of the reasonability of the appraisement system of human resource, as Table 1 shows. The index system is built according to the factors of the appraisement of human resource, and it includes seven A-level indices and twenty-two B-level indices.
Table 1. Index of the model of evaluation of the reasonability of the appraisement system of human resource

A-level index | B-level index | Mark
1. Appraisement object's quality (T1) | 1.1 Ideological idea | T11
 | 1.2 Worth system | T12
 | 1.3 Professional morals | T13
 | 1.4 Work style | T14
 | 1.5 Participation level | T15
2. Appraisement quantity table reasonability (T2) | 2.1 Effect degree | T21
 | 2.2 Letter degree | T22
3. Appraisement corpus' quality (T3) | 3.1 Influence | T31
 | 3.2 Justice | T32
 | 3.3 Knowledge | T33
 | 3.4 Participation level | T34
4. Appraisement result composition's reasonability (T4) | 4.1 Mathematics basis | T41
 | 4.2 Conflict handling | T42
 | 4.3 Result effectiveness | T43
5. Appraisement result application and feedback's reasonability (T5) | 5.1 Application | T51
 | 5.2 Feedback | T52
6. Appraisement policymaker's quality (T6) | 6.1 Justice | T61
 | 6.2 Policy | T62
 | 6.3 Knowledge | T63
 | 6.4 Ability | T64
7. Environmental suitability (T7) | 7.1 Outside environment | T71
 | 7.2 Inside environment | T72
3.2. Model of evaluation of the reasonability of the appraisement system of human resource

1. Define the evaluation set as $F = \{k_1, k_2, k_3, k_4, k_5\}$, in which $k_1, k_2, k_3, k_4$ and $k_5$ denote the grades of the appraisement result, such as "very good", "good", "general", "bad" and "very bad".

2. Define the A-level fuzzy judgment factor set as $T = T_1 \cup T_2 \cup T_3 \cup T_4 \cup T_5 \cup T_6 \cup T_7$, in which $T_1, T_2, \ldots, T_7$ are the A-level indices of Table 1. Let $\lambda_i$ denote the weight of $T_i$, $i = 1, 2, \ldots, 7$, with $\sum_{i=1}^{7} \lambda_i = 1$; then the vector of weights of the A-level indices is $A = (\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, \lambda_7)$.
3. Define the B-level fuzzy judgment factor sets as $T_i = \{T_{i1}, T_{i2}, \ldots, T_{ik}\}$, $i = 1, 2, \ldots, 7$. Let $w_{ij}$ denote the weight of $T_{ij}$, with $\sum_{j=1}^{k} w_{ij} = 1$; then the vector of weights of the B-level indices is $A_i = (w_{i1}, w_{i2}, \ldots, w_{ik})$. \qquad (2)

According to Table 1, the appraisement policymakers invite the appraisement corpus to score the appraisement objects in $[0, 1]$; according to the maximum-membership law or the fuzzy-distribution law algorithm, etc., we can get the B-level index judgment matrix $\tilde R_i$. According to the weights of the B-level indices, we can get the A-level index judgment matrix $\tilde R$, whose $i$th row is

$$\tilde B_i = A_i \circ \tilde R_i, \quad i = 1, 2, \ldots, 7. \qquad (3)$$

According to the weights of the A-level indices, we get the fuzzy synthetic judgment formula

$$\tilde B = A \circ \tilde R. \qquad (4)$$
The result of Eq. (4) is

$$\tilde Y_r = \tilde B = (a_1, a_2, a_3, a_4, a_5). \qquad (5)$$

In Eq. (5), $a_1, a_2, a_3, a_4$ and $a_5$ express the degrees of membership in the evaluation grades $k_1, k_2, k_3, k_4$ and $k_5$ respectively, with $a_i \in [0, 1]$, $i = 1, 2, \ldots, 5$. Interpreted against reality, the result of Eq. (5) is the final result of the evaluation of the reasonability of an appraisement system of human resource.
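A small sketch (ours) of this two-level fuzzy synthetic evaluation, reading the composition operator in Eq. (4) as max-min, one common choice; a weighted-average operator could be substituted without changing the structure. The weights and judgment matrices below are hypothetical:

import numpy as np

def compose(weights, R):                     # B = A o R, max-min composition
    weights, R = np.asarray(weights), np.asarray(R)
    return np.array([np.max(np.minimum(weights, R[:, j])) for j in range(R.shape[1])])

# Two hypothetical A-level indices with two and three B-level sub-indices:
R1 = [[0.6, 0.3, 0.1, 0.0, 0.0],             # B-level judgments over 5 grades
      [0.2, 0.5, 0.2, 0.1, 0.0]]
R2 = [[0.4, 0.4, 0.2, 0.0, 0.0],
      [0.1, 0.6, 0.2, 0.1, 0.0],
      [0.3, 0.3, 0.3, 0.1, 0.0]]
B = [compose([0.6, 0.4], R1), compose([0.3, 0.4, 0.3], R2)]  # rows of R-tilde
Y = compose([0.5, 0.5], np.vstack(B))        # final evaluation over k1..k5
print(Y)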
AN ALGORITHM FOR LINEAR BILEVEL PROGRAMMING PROBLEMS
CHENGGEN SHI, GUANGQUAN ZHANG AND JIE LU
Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123 Broadway, Sydney, NSW 2007, Australia
For linear bilevel programming problems, the branch and bound algorithm is the most successful algorithm for dealing with the complementarity constraints arising from the Kuhn-Tucker conditions. This paper proposes a new branch and bound algorithm for linear bilevel programming problems. Based on this result, a web-based bilevel decision support system is developed.
1. Introduction

The game theory of Von Stackelberg [1] has motivated bilevel programming (BLP). The majority of research on BLP has centered on the linear version of the problem. Nearly two dozen algorithms [2,3,4,5,6,7,8] have been proposed for solving linear BLP problems since the field caught the attention of researchers in the mid-1970s. A popular way to solve a linear BLP problem is to transfer the bilevel programming problem into a nonlinear programming problem by using the Kuhn-Tucker conditions. The reformulation of the linear BLP problem is a standard mathematical program and relatively easy to solve because all but one constraint is linear. Omitting or relaxing that constraint leaves a standard linear program that can be solved using the simplex algorithm. This is the case for the algorithms proposed by Bard and Falk [5] and by Fortuny-Amat and McCarl [9]. The algorithm of Fortuny-Amat and McCarl is quite similar to that proposed by Bard and Falk; both require the addition of $q + m$ variables ($q$ and $m$ being the number of the follower's constraints and the number of the follower's variables, respectively) and the explicit satisfaction of complementary slackness, albeit in different ways. In a later study, Bard and Moore [10] developed an implicit approach to satisfying the nonlinear complementarity constraint. This algorithm is called the branch and bound algorithm. Its characteristic is that it iterates to solve a linear program that has $n + q + 2m$ variables and $p + q + m$ constraint functions. Bard and Moore [10] investigated the efficiency of their branch and bound approach and found that its CPU time grew exponentially with the size of the problem. This paper develops a more efficient branch and bound algorithm by dividing a large non-linear problem into two small sub-problems and then solving them separately. Following the introduction, this paper overviews linear BLP in Section 2. Section 3 presents the new branch and bound algorithm. A conclusion and future work are given in Section 4.
2. Linear Bilevel Programming

For $x \in X \subset R^n$, $y \in Y \subset R^m$, $F: X \times Y \to R^1$ and $f: X \times Y \to R^1$, a linear BLP problem is given by Bard [2]:

$$\min_{x \in X}\ F(x, y) = c_1 x + d_1 y \qquad (1a)$$
$$\text{subject to } A_1 x + B_1 y \le b_1 \qquad (1b)$$
$$\min_{y \in Y}\ f(x, y) = c_2 x + d_2 y \qquad (1c)$$
$$\text{subject to } A_2 x + B_2 y \le b_2, \qquad (1d)$$

where $c_1, c_2 \in R^n$, $d_1, d_2 \in R^m$, $b_1 \in R^p$, $b_2 \in R^q$, $A_1 \in R^{p \times n}$, $B_1 \in R^{p \times m}$, $A_2 \in R^{q \times n}$, $B_2 \in R^{q \times m}$. Let $u \in R^q$ and $v \in R^m$ be the dual variables associated with constraint (1d) and $y \ge 0$, respectively. Bard [2] gave the following proposition.

Proposition 1. A necessary condition that $(x^*, y^*)$ solves the linear BLP problem (1) is that there exist (row) vectors $u^*$ and $v^*$ such that $(x^*, y^*, u^*, v^*)$ solves:
$$\min\ (c_1 x + d_1 y) \qquad (2a)$$
$$\text{subject to } A_1 x + B_1 y \le b_1 \qquad (2b)$$
$$u B_2 - v = -d_2 \qquad (2c)$$
$$u (b_2 - A_2 x - B_2 y) + v y = 0 \qquad (2d)$$
$$A_2 x + B_2 y \le b_2 \qquad (2e)$$
$$x \ge 0,\ y \ge 0,\ u \ge 0,\ v \ge 0. \qquad (2f)$$
3. A New Branch and Bound Algorithm

Our main idea is to divide the non-linear programming problem (2) into the following three sub-problems. The first problem is

$$\min\ (c_1 x + d_1 y) \quad \text{subject to } A_1 x + B_1 y \le b_1,\ A_2 x + B_2 y \le b_2,\ x \ge 0,\ y \ge 0. \qquad (3)$$

The second problem is

$$u B_2 - v = -d_2,\ u \ge 0,\ v \ge 0. \qquad (4)$$

The third problem is

$$v y = 0,\quad u (b_2 - A_2 x - B_2 y) = 0. \qquad (5)$$

Problem (3) is a standard linear program with only $n + m$ variables and $p + q$ constraint functions. Problem (4) is also a linear program with only $q + m$ variables and $m$ constraint functions. Solving (1) is equivalent to finding an optimal solution by iterating: solve (3), solve (4), and check whether (5) is satisfied. The following notation is employed in the description of the algorithm. Let $W = \{1, \ldots, q + m\}$ be the index set for the complementarity terms in (2d), and let $\bar F$ be the incumbent upper bound on the leader's objective function. At the $k$th level of the search tree we define a subset of indices $W_k \subset W$ and a path $P_k$ corresponding to an assignment of either $u_i = 0$ or $g_i = 0$ for $i \in W_k$. Now let $S_k^+ = \{i : i \in W_k,\ u_i = 0\}$, $S_k^- = \{i : i \in W_k,\ g_i = 0\}$, $S_k^0 = \{i : i \notin W_k\}$. For $i \in S_k^0$, the variables $u_i$ or $g_i$ are free to assume any nonnegative value in the solution of (3) and (4), so complementary slackness will not necessarily be satisfied.

Step 0 (Initialization). Set $k = 0$, $S_k^+ = \emptyset$, $S_k^- = \emptyset$, $S_k^0 = \{1, \ldots, q + m\}$, and $\bar F = \infty$.
Step 1 (Iteration $k$). Set $g_i = 0$ for $i \in S_k^-$. Attempt to solve (3). If the resultant problem is infeasible, go to Step 5; otherwise, set $u_i = 0$ for $i \in S_k^+$ and attempt to solve (4). If the resultant problem is infeasible, go to Step 5; otherwise, put $k \leftarrow k + 1$ and label the solution $(x^k, y^k, u^k)$.
Step 2 (Fathoming). If $F(x^k, y^k) \ge \bar F$, go to Step 5.
Step 3 (Branching). If $u_i^k g_i(x^k, y^k) = 0$ for $i = 1, \ldots, q + m$, go to Step 4. Otherwise, select the index for which $u_i^k g_i(x^k, y^k) > 0$ is largest and label it $i_1$. Put $S_k^+ \leftarrow S_k^+ \cup \{i_1\}$, $S_k^0 \leftarrow S_k^0 \setminus \{i_1\}$, $S_k^- \leftarrow S_k^-$, append $i_1$ to $P_k$, and go to Step 1.
Step 4 (Updating). $\bar F \leftarrow F(x^k, y^k)$. Go to Step 5.
Step 5 (Backtracking). If no live node exists, go to Step 6. Otherwise branch to the newest live vertex, update $S_k^+$, $S_k^-$, $S_k^0$ and $P_k$ accordingly, and go to Step 1.
Step 6 (Termination). If $\bar F = \infty$, there is no feasible solution to (1). Otherwise, declare the feasible point associated with $\bar F$ the optimal solution to (1).

Based on this algorithm, a web-based decision support system for linear BLP problems has been developed (the system will be introduced in another paper).
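The following is a sketch (ours, using scipy.optimize.linprog) of the two linear sub-problems (3) and (4) for one fixed index assignment in Step 1; the search-tree management of Steps 2-6 is omitted, and the function and variable names are ours:

import numpy as np
from scipy.optimize import linprog

def solve_3(c1, d1, A1, B1, b1, A2, B2, b2, S_minus):
    """Sub-problem (3) in (x, y) with g_i = 0 forced for i in S_minus."""
    n, m, q = A1.shape[1], B1.shape[1], A2.shape[0]
    A_ub = np.hstack([np.vstack([A1, A2]), np.vstack([B1, B2])])
    b_ub = np.concatenate([b1, b2])
    # g_i: the first q slacks are b2 - A2 x - B2 y, the last m are the y_i
    A_eq, b_eq = [], []
    for i in S_minus:
        if i < q:
            A_eq.append(np.concatenate([A2[i], B2[i]])); b_eq.append(b2[i])
        else:
            row = np.zeros(n + m); row[n + i - q] = 1.0
            A_eq.append(row); b_eq.append(0.0)
    return linprog(np.concatenate([c1, d1]), A_ub=A_ub, b_ub=b_ub,
                   A_eq=np.array(A_eq) if A_eq else None,
                   b_eq=np.array(b_eq) if b_eq else None, bounds=(0, None))

def solve_4(d2, B2, S_plus):
    """Sub-problem (4) in (u, v): u B2 - v = -d2 with u_i = 0 for i in S_plus."""
    q, m = B2.shape
    A_eq = np.hstack([B2.T, -np.eye(m)])        # columns: u (q) then v (m)
    b_eq = -np.asarray(d2, dtype=float)
    rows = []
    for i in S_plus:
        row = np.zeros(q + m); row[i] = 1.0; rows.append(row)
    if rows:
        A_eq = np.vstack([A_eq] + rows)
        b_eq = np.concatenate([b_eq, np.zeros(len(rows))])
    return linprog(np.zeros(q + m), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))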
4. Conclusion and Future Work

This paper presents a more efficient branch and bound algorithm that divides a large problem into small sub-problems. Based on this algorithm, a web-based decision support system for linear BLP problems has been developed. The algorithm and its system will be tested for scalability.
References
1. H. Von Stackelberg, The Theory of the Market Economy. Oxford University Press, Oxford (1952).
2. J. Bard, Practical Bilevel Optimization: Algorithms and Applications. Kluwer Academic Publishers, USA (1998).
3. W. Candler and R. Townsley, A linear two-level programming problem. Computers and Operations Research 9:59-76 (1982).
4. W. Bialas and M. Karwan, Two-level linear programming. Management Science 30:1004-1020 (1984).
5. J. Bard and J. Falk, An explicit solution to the multi-level programming problem. Computers and Operations Research 9:77-100 (1982).
6. W. Bialas and M. Karwan, Multilevel linear programming. Technical Report 78-1, State University of New York at Buffalo, Operations Research Program (1978).
7. E. Aiyoshi and K. Shimizu, Hierarchical decentralized systems and its new solution by a barrier method. IEEE Transactions on Systems, Man, and Cybernetics 11:444-449 (1981).
8. D. White and G. Anandalingam, A penalty function approach for solving bi-level linear programs. Journal of Global Optimization 3:397-419 (1993).
9. J. Fortuny-Amat and B. McCarl, A representation and economic interpretation of a two-level programming problem. Journal of the Operational Research Society 32:783-792 (1981).
10. J. Bard and J. T. Moore, A branch and bound algorithm for the bilevel programming problem. SIAM Journal on Scientific and Statistical Computing 11:281-292 (1990).
A FUZZY GOAL APPROXIMATE ALGORITHM FOR SOLVING MULTIPLE OBJECTIVE LINEAR PROGRAMMING PROBLEMS WITH FUZZY PARAMETERS
FENGJIE WU, GUANGQUAN ZHANG AND JIE LU
Faculty of Information Technology, University of Technology, Sydney, PO Box 123, Broadway, NSW 2007, Australia. E-mail: {fengjiew, zhangg, jielu}@it.uts.edu.au

Many business decisions can be modeled as multiple objective linear programming (MOLP) problems. When formulating a MOLP problem, the objective functions and constraints involve many parameters whose possible values are assigned by experts and are often imprecisely or ambiguously known. It would therefore be more appropriate for these parameters to be represented by fuzzy numbers. In this paper, a new fuzzy goal approximate algorithm is developed for solving fuzzy multiple objective linear programming (FMOLP) problems with fuzzy parameters when fuzzy goals for the objective functions need to be achieved. An illustrative example is also given to demonstrate the algorithm.
1. Introduction

Multiple objective linear programming (MOLP) is one of the popular methods for dealing with complex and ill-structured decision making. When formulating a MOLP problem, various factors of the real world system should be reflected in the description of the objective functions and the constraints. Naturally, these objective functions and constraints involve many parameters whose possible values are assigned by experts and are often imprecisely or ambiguously known.1 With this observation, it would be more appropriate to interpret the experts' understanding of the parameters as fuzzy numerical data, which can be represented by fuzzy numbers. FMOLP problems involving fuzzy parameters can thus be viewed as a more realistic version of the conventional ones.1 Various kinds of FMOLP models, methods and approaches have been proposed to deal with different decision-making situations which involve fuzzy values in the objective function parameters, the constraint parameters, or the goals. Zhang et al.3,4 proposed a method to solve a fuzzy linear programming (FLP) problem by transforming it into a corresponding four-objective constrained optimization problem, and another method to formulate linear programming problems with fuzzy equality and inequality constraints. Wu et al.2 developed an approximate algorithm for solving FMOLP problems with fuzzy parameters, in any form of membership function, in both the objective functions and the constraints. Based on that, in this paper a new fuzzy goal approximate algorithm is developed for solving FMOLP problems with fuzzy parameters when fuzzy goals for the objective functions need to be achieved.
2. Fuzzy Multiple Objective Linear Programming and Related Definitions and Theorems
In this paper we consider the situation in which all coefficients of the objective functions and constraints are represented by fuzzy numbers with any form of membership function. Such FMOLP problems can be formulated as follows:

$$(\text{FMOLP}) \quad \max\ \tilde f(x) = \tilde C x \quad \text{s.t. } x \in X = \{x \in R^n \mid \tilde A x \preceq \tilde b,\ x \ge 0\} \qquad (1)$$

where $\tilde C$ is a $k \times n$ matrix of fuzzy numbers, $\tilde A$ is an $m \times n$ matrix of fuzzy numbers, $\tilde b$ is an $m$-vector of fuzzy numbers, and $x \in R^n$ is an $n$-vector of decision variables. We have the following definitions for FMOLP problems.

Definition 2.1. $x^*$ is said to be a complete optimal solution if and only if there exists $x^* \in X$ such that $\tilde f_i(x^*) \succeq \tilde f_i(x)$, $i = 1, \ldots, k$, for all $x \in X$.

Definition 2.2. $x^*$ is said to be a Pareto optimal solution if and only if there does not exist another $x \in X$ such that $\tilde f_i(x) \succeq \tilde f_i(x^*)$ for all $i$, with strict dominance for at least one $i$.

Definition 2.3. $x^*$ is said to be a weak Pareto optimal solution if and only if there does not exist another $x \in X$ such that $\tilde f_i(x) \succ \tilde f_i(x^*)$ for all $i$.

Associated with the FMOLP problem, let us consider the following multiple objective linear programming (MOLP$_\lambda$) problem:

$$(\text{MOLP}_\lambda) \quad \max\ \big(\langle c_\lambda^L, x\rangle, \langle c_\lambda^R, x\rangle\big),\ \forall \lambda \in [0,1] \quad \text{s.t. } x \in X = \{x \in R^n \mid A_\lambda^L x \le b_\lambda^L,\ A_\lambda^R x \le b_\lambda^R,\ x \ge 0,\ \forall \lambda \in [0,1]\} \qquad (2)$$

where $c_\lambda^L$, $c_\lambda^R$, $A_\lambda^L$, $A_\lambda^R$, $b_\lambda^L$ and $b_\lambda^R$ denote the left and right endpoints of the $\lambda$-cuts of the corresponding fuzzy coefficients.

For the crisp MOLP$_\lambda$ problem we also have the following definitions.

Definition 2.4. $x^*$ is said to be a complete optimal solution if and only if there exists $x^* \in X$ such that $f_i(x^*) \ge f_i(x)$, $i = 1, \ldots, k$, for all $x \in X$.

Definition 2.5. $x^*$ is said to be a Pareto optimal solution if and only if there does not exist another $x \in X$ such that $f_i(x) \ge f_i(x^*)$ for all $i$ and $f_j(x) \ne f_j(x^*)$ for at least one $j$.

Definition 2.6. $x^*$ is said to be a weak Pareto optimal solution if and only if there does not exist another $x \in X$ such that $f_i(x) > f_i(x^*)$, $i = 1, \ldots, k$.

The following theorem shows the relationship between the FMOLP problem and the MOLP$_\lambda$ problem.

Theorem 2.1. Let $x^* \in X$ be a solution to the MOLP$_\lambda$ problem. Then $x^*$ is also a solution to the FMOLP problem.
A Fuzzy Goal Approximation Algorithm for Solving FMOLP Problem
(x)r
Considering the FMOLP problem, for each of the fuzzy multiple objective functions f"(x) = ()., ..., , assume that the DM can specify some fuzzy goals g" =(g",,g2,...,~,)T which reflects the desired values of the objective functions of the DM. Based on the definition of FMOLP problem and MOLPl problem and Theorem 2.1, we can make the conclusion that the solution of MOLPh problem is equally the solution of FMOLP problem. From the definition of MOLPk problem, when the DM , the corresponding Pareto optimal sets up some fuzzy goals =(i,,i2,...,f,)T
6 x(.),
solution, which is, in the minimax sense, the nearest to the fuzzy goals or better than that if the fuzzy goals is attainable, is obtained by solving the following minimax problem:
i
( M O L P ~ ) Min m x
(3)
~ ~ k , x ) - g ~ , ( c : , x ) - g : r , v n t [OJI
x E X = {x E R" I Akx S 6k,A;x 5 6 2 , x t 0,VL E [O,l]}
s.t.
where g:=[g,",,g,L,,,..,g:P) g: =[g::.g,q,r,g;J For the simplicity in presentation, we define = {x E R" I A;X I b;, A;X I b; ,x 2 0} A [0,1~ The main steps of algorithm are described as follows: Let the interval [0,1] be decomposed into 1 mean sub-intervals with (1+1) nodes ~ ( =i0, . . . , I ) which arearranged in the order of o=;l, <;3, < . . . < A = 1 , then
x,
define $X^l = \bigcap_{i=0}^{l} X_{\lambda_i}$ and denote

$$(\text{MOLP}_{\tilde g})_l\quad \min_x \max_i \big( \langle c_{\lambda_i}^L, x \rangle - g_{\lambda_i}^L,\ \langle c_{\lambda_i}^R, x \rangle - g_{\lambda_i}^R \big),\quad i = 1, \ldots, k;\ 0 = \lambda_0 < \cdots < \lambda_l = 1, \quad \text{s.t. } x \in X^l. \qquad (4)$$
Step 1: Set $l = 1$ and solve $(\text{MOLP}_{\tilde g})_l$, obtaining the solution $(x)_l = (x_1, x_2, \ldots, x_n)_l$ subject to the constraint $x \in X^l$.
Step 2: Solve $(\text{MOLP}_{\tilde g})_{2l}$, obtaining $(x)_{2l}$.
Step 3: If $\|(x)_{2l} - (x)_l\| < \varepsilon$, the solution $x^*$ of the MOLP$_\lambda$ problem is $(x)_{2l}$; otherwise, update $l$ to $2l$ and go to Step 2.
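A sketch (ours) of how one level of this approximation can be set up: at the $l+1$ nodes, the minimax problem (4) becomes an ordinary LP in $(x, t)$, minimize $t$ subject to $\langle c, x\rangle - g \le t$ for every node and objective plus the $\lambda$-cut constraints. The callables objs and cons are assumed helpers (not from the paper) returning the $\lambda$-cut data, and the warm start of Step 2 is omitted for brevity:

import numpy as np
from scipy.optimize import linprog

def solve_level(l, n, objs, cons):
    # objs(lam) -> list of (coeff_vector, goal) for every objective, left and right cut
    # cons(lam) -> (A, b) with the left and right lambda-cut constraints stacked
    rows, rhs = [], []
    for lam in np.linspace(0.0, 1.0, l + 1):
        for c, g in objs(lam):
            rows.append(np.append(c, -1.0)); rhs.append(g)   # <c, x> - t <= g
        A, b = cons(lam)
        rows.extend(np.hstack([A, np.zeros((len(b), 1))]))
        rhs.extend(b)
    cost = np.zeros(n + 1); cost[-1] = 1.0                   # minimize t
    res = linprog(cost, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n]                                         # assumes feasibility

def approximate(n, objs, cons, eps=1e-6):
    l, x = 1, None
    while True:                                              # Steps 1-3: doubling loop
        x_new = solve_level(l, n, objs, cons)
        if x is not None and np.linalg.norm(x_new - x) < eps:
            return x_new
        x, l = x_new, 2 * l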
4. An illustrative example
Let us consider the following FMOLP problem with two objective functions:

$$\max\ \tilde f_1(x) = \tilde c_{11} x_1 + \tilde c_{12} x_2, \qquad \max\ \tilde f_2(x) = \tilde c_{21} x_1 + \tilde c_{22} x_2$$
$$\text{s.t. } \tilde a_{21} x_1 + \tilde a_{22} x_2 \ge 6,\quad \tilde a_{31} x_1 + \tilde a_{32} x_2 \le 5,\quad \tilde a_{41} x_1 + \tilde a_{42} x_2 \ge 4.$$
The membership functions of the coefficients of the objectives and constraints are piecewise functions; for example,

$$\mu(x) = \begin{cases} (x - 3.5)/3.5, & 3.5 \le x < 7; \\ 1, & x = 7; \\ (324 - x^2)/275, & 7 < x < 18; \\ 0, & \text{otherwise}, \end{cases} \qquad \mu(x) = \begin{cases} (x^2 - 16)/84, & 4 \le x < 10; \\ 1, & x = 10; \\ (20 - x)/10, & 10 < x \le 20; \\ 0, & \text{otherwise}. \end{cases}$$
Finally, the running results are that the decision variables are $x_1^* = 0.6306$ and $x_2^* = 5.4216$, and the two fuzzy objective functions are $\tilde f_1(x_1^*, x_2^*) = \tilde f_1(0.6306, 5.4216) = 0.6306\,\tilde c_{11} + 5.4216\,\tilde c_{12}$ and $\tilde f_2(x_1^*, x_2^*) = \tilde f_2(0.6306, 5.4216) = 0.6306\,\tilde c_{21} + 5.4216\,\tilde c_{22}$.
Acknowledgements

This research is supported by the Australian Research Council (ARC) under discovery grant DP0211701.
References
1. Sakawa, M., "Interactive multiobjective linear programming with fuzzy parameters," in Fuzzy Sets and Interactive Multiobjective Optimization. New York: Plenum Press, 1993.
2. Wu, F., Lu, J., and Zhang, G., "A New Approximate Algorithm for Solving Multiple Objective Linear Programming with Fuzzy Parameters," The Third International Conference on Electronic Business (ICEB 2003), Singapore, 2003.
3. Zhang, G. Q., Wu, Y. H., Remias, M., and Lu, J., Formulation of fuzzy linear programming problems as four-objective constrained optimization problems, Applied Mathematics and Computation 139, 383-399 (2003).
4. Zhang, G. Q., Wu, Y. H., Remias, M., and Lu, J., "An α-Fuzzy Max Order and Solution of Linear Constrained Fuzzy Optimization Problems," Proceedings of Computational Mathematics and Modelling, 2002.
A KIND OF FUZZY LEAST SQUARES SUPPORT VECTOR MACHINES FOR PATTERN CLASSIFICATION*
SHUWEI CHEN AND YANG XU Intelligent Control Development Center, Department of Mathematics, Southwest Jiaotong University, Chengdu 610031, Sichuan, P. R. China E-mail: [email protected], [email protected]
Support Vector Machine (SVM) is a new machine learning method, and the Least Squares Support Vector Machine (LS-SVM) is an SVM version that involves equality instead of inequality constraints and works with a least squares cost function. Both of these methods learn the decision surface from two distinct classes of input points, each point being assigned to one of the two classes. In many applications, however, some points cannot be assigned with certainty to one of the two classes. In this paper, we attach to each input point a fuzzy degree based on a membership function and reformulate the LS-SVM so that different input points can make different contributions to the learning of the decision function.
1. Introduction
Support Vector Machine (SVM) proposed by Vapnik [8, 2] is a new machine learning method based on Statistical Learning Theory. Statistical Learning Theory [8, 9] is mainly developed to resolve the overfitting problem experienced by most learning machines. SVMs have been successfully applied to pattern recognition and function estimation problems [8, 9, 5] because of their high generalization ability. In this method the original input space is mapped into a higher dimensional space called the feature space, and an optimal separating hyperplane is constructed in the feature space to maximize the generalization ability. Least Squares Support Vector Machines (LS-SVMs) proposed by Suykens and Vandewalle [6] are trained by solving a set of linear equations. In this way, the solution follows from

*We gratefully acknowledge the support of the National Nature Science Foundation of P. R. China (Grant No. 60074014).
a linear Karush-Kuhn-Tucker condition system instead of a quadratic programming problem. In many applications, some points cannot be assigned to one of the two classes with certainty, i.e., there exists some fuzziness in the input points. For instance, the degree of an input point belonging to some class may be 0.8. To resolve this problem, Lin and Wang [4] proposed a method that assigns to each input point a membership degree, which indicates the degree of the input point belonging to a class. The membership degree is then applied to decide the weight of each input point, i.e., the contribution of each point to the learning of the decision function. In this paper, we apply this method to the case of LS-SVMs. Note that Tsujinishi and Abe [7] have proposed a kind of fuzzy LS-SVM, which is an extension of the method proposed by Inoue and Abe [3, 1] to the LS-SVMs. But this method is developed mainly to resolve the unclassifiable regions that exist in multiclass problems. By fuzzifying the class function, the points in the unclassifiable regions are assigned to the class with the maximum membership degree. The paper is organized as follows. In Section 2, the architectures of the SVMs and the LS-SVMs are described. Section 3 presents a kind of fuzzy LS-SVM that takes into account the different contributions of input points with different fuzzy degrees to the learning of the decision function. Section 4 comes to the conclusion.
2. Least Squares Support Vector Machines
2.1. Architecture of SVMs
Let m-dimensional training data $x_i$ ($i = 1, 2, \ldots, n$) belong to class I or class II and the associated labels be $y_i = 1$ for class I and $y_i = -1$ for class II. If these data are linearly separable in the feature space, the decision function can be determined as:

$$D(x) = w^T \varphi(x) + b, \tag{1}$$

where $\varphi(x)$ is a mapping function that maps $x$ into an $l$-dimensional space, $w$ is an $l$-dimensional vector and $b$ is a scalar. To separate the data linearly, the decision function satisfies the following conditions:

$$y_i(w^T \varphi(x_i) + b) \ge 1 \quad \text{for } i = 1, 2, \ldots, n. \tag{2}$$
If the problem is linearly separable in the feature space, there are an infinite number of decision functions that satisfy (2). Among them, the hyperplane is required to have the largest margin between the two classes. Here the margin is the minimum distance from the separating hyperplane to the input data, given by $|D(x)|/\|w\|$, and the separating hyperplane with the maximum margin is called the optimal separating hyperplane. Then, in order to obtain the optimal separating hyperplane with the maximum margin, we must find $w$ with the minimum $\|w\|$. This leads to solving the following optimization problem:

$$\text{minimize } \tfrac{1}{2}w^T w \quad \text{subject to } y_i(w^T \varphi(x_i) + b) \ge 1 \quad \text{for } i = 1, 2, \ldots, n. \tag{3}$$
When the training data are not linearly separable, slack variables $\xi_i$ are introduced into (2) as follows:

$$y_i(w^T \varphi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0 \quad \text{for } i = 1, 2, \ldots, n. \tag{4}$$

The optimal separating hyperplane is determined so that the maximization of the margin and the minimization of the training error are achieved:
$$\text{minimize } \tfrac{1}{2}w^T w + \tfrac{C}{p}\sum_{i=1}^{n}\xi_i^p \quad \text{subject to } y_i(w^T \varphi(x_i) + b) \ge 1 - \xi_i,\; \xi_i \ge 0 \quad \text{for } i = 1, 2, \ldots, n, \tag{5}$$

where $C$ is a parameter that determines the tradeoff between the maximum margin and the minimum classification error, and $p$ is 1 or 2. When $p = 1$, the SVM is called the L1 soft margin SVM (L1-SVM), and when $p = 2$, the L2 soft margin SVM (L2-SVM). In the conventional SVM, the optimal separating hyperplane is obtained by solving the above quadratic programming problem.

2.2. Architecture of LS-SVMs

In contrast to the SVM, the LS-SVM is trained by

$$\text{minimize } \tfrac{1}{2}w^T w + \tfrac{C}{2}\sum_{i=1}^{n}\xi_i^2 \quad \text{subject to } y_i(w^T \varphi(x_i) + b) = 1 - \xi_i \quad \text{for } i = 1, 2, \ldots, n. \tag{6}$$
In the LS-SVM, equality constraints are used instead of the inequality ones employed in the conventional SVM. Therefore the optimal solution can be obtained by solving a set of linear equations instead of solving a quadratic programming problem. To derive the dual problem of (6), the Lagrange multipliers are introduced as follows:
where $\alpha = (\alpha_1, \ldots, \alpha_n)$ is a set of Lagrange multipliers, which can be positive or negative in the LS-SVM formulation. The conditions for optimality are derived by differentiating the above equation with respect to $w$, $\xi$, $b$ and $\alpha_i$, and equating the resulting equations to zero, which can be expressed in matrix form by
Hence the decision function (1) can be found by solving the linear set of equations (7) instead of a quadratic programming problem.

2.3. Kernel functions
One of the characteristics of the SVM is that it uses the technique called the kernel trick. In (8), defining

$$K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j), \tag{9}$$

where $K(x_i, x_j)$ is a kernel function, we can avoid treating variables in the feature space directly. In the following, we use the kernel functions below:

- dot kernels: $K(x_i, x_j) = x_i^T x_j$;
- polynomial kernels: $K(x_i, x_j) = (x_i^T x_j + 1)^d$, where $d$ is a positive integer;
- RBF kernels: $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, where $\gamma$ is a positive parameter.
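As a quick reference, the three kernels above can be written directly as functions. This is a minimal sketch in plain NumPy; the function names are ours, not from the paper:

```python
import numpy as np

def dot_kernel(xi, xj):
    # dot kernel: K(x_i, x_j) = x_i^T x_j
    return xi @ xj

def poly_kernel(xi, xj, d=3):
    # polynomial kernel: K(x_i, x_j) = (x_i^T x_j + 1)^d, d a positive integer
    return (xi @ xj + 1.0) ** d

def rbf_kernel(xi, xj, gamma=1.0):
    # RBF kernel: K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), gamma > 0
    return np.exp(-gamma * np.sum((xi - xj) ** 2))
```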
3. Fuzzy Least Squares Support Vector Machines

In this section, we introduce a kind of fuzzy LS-SVM to overcome the influence of noise in the input points on the decision function. To do this, we assign a fuzzy membership degree $s_i$ to each input point, and the input data pairs are rewritten as $(x_i, y_i, s_i)$, $i = 1, 2, \ldots, n$, where $y_i \in \{-1, 1\}$, $\sigma \le s_i \le 1$ and $\sigma > 0$ is sufficiently small.
Because the fuzzy membership degree $s_i$ indicates the degree of the input sample $x_i$ belonging to one class, and the parameter $\xi_i$ is a measure of error, the term $s_i \xi_i^2$ can be regarded as a measure of error with different weighting according to the importance of the input sample to the learning of the decision function. This leads to the optimization problem:

where $C$ is a constant. It is noted that a smaller $s_i$ reduces the effect of the parameter $\xi_i$ in the above problem, so that the corresponding $x_i$ is treated as less important. To solve this optimization problem we construct the Lagrangian

and find the saddle point of $Q(w, b, \alpha, \xi)$. The parameters must satisfy the following conditions:
These conditions can be written immediately as the solution to the following set of linear equations
Hence the decision function (1) can be found by solving the linear set of equations (12) instead of quadratic programming.
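To make the training step concrete, the sketch below assembles and solves a fuzzy-weighted linear system of the standard LS-SVM KKT form. It is a minimal sketch under our own reading, not the paper's code: placing $1/(C s_i)$ on the regularization diagonal is our assumed way of realizing the $s_i \xi_i^2$ weighting, and all function names are ours.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    # Gram matrix of the RBF kernel over the training set
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def train_fuzzy_lssvm(X, y, s, C=10.0, gamma=1.0):
    """Solve an LS-SVM-style KKT system with fuzzy weights s_i (assumption:
    s_i enters as 1/(C*s_i) on the regularization diagonal)."""
    n = len(y)
    Omega = (y[:, None] * y[None, :]) * rbf_gram(X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                      # first row:  [0  y^T]
    A[1:, 0] = y                      # first column
    A[1:, 1:] = Omega + np.diag(1.0 / (C * s))
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)     # one linear solve, no QP
    b, alpha = sol[0], sol[1:]
    return alpha, b
```

The decision function is then $D(x) = \sum_i \alpha_i y_i K(x_i, x) + b$; setting all $s_i = 1$ recovers the plain LS-SVM.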
The only parameter $C$ in the SVM (LS-SVM) controls the tradeoff between the maximization of the margin and the minimization of the amount of misclassifications. A larger $C$ makes the trained SVM allow fewer misclassifications and produces a narrower margin, while decreasing $C$ makes the SVM ignore more training points and obtain a wider margin. In the fuzzy LS-SVM, we can set $C$ to a sufficiently large value. If we set all $s_i = 1$, the system will, just as the LS-SVM, obtain a narrower margin and allow fewer misclassifications. With different values of $s_i$, we can control the tradeoff for each respective training point $x_i$ in the system. A smaller value of $s_i$ makes the corresponding $x_i$ less important in the training. There is only one free parameter in the SVM (LS-SVM), while the number of free parameters in the fuzzy LS-SVM is equal to the number of training points.

4. Conclusion
In this paper, we proposed a kind of fuzzy LS-SVM that fuzzifies each input point such that different input points have different contributions to the learning of the optimal separating hyperplane.
References
1. S. Abe and T. Inoue, Fuzzy support vector machines for multiclass problems. Proceedings of the Tenth European Symposium on Artificial Neural Networks (ESANN 2002), 113-118 (2002).
2. N. Cristianini and J. S. Taylor, An Introduction to Support Vector Machines. Cambridge University Press, (2000).
3. T. Inoue and S. Abe, Fuzzy support vector machines for pattern classification. Proceedings of the International Joint Conference on Neural Networks (IJCNN'01), 1449-1454 (2001).
4. C. F. Lin and S. D. Wang, Fuzzy support vector machines. IEEE Transactions on Neural Networks, 13(2), 464-471 (2002).
5. B. Scholkopf, C. J. C. Burges and A. J. Smola, Advances in Kernel Methods - Support Vector Learning. Cambridge, MA: MIT Press, (1999).
6. J. A. K. Suykens and J. Vandewalle, Least squares support vector machine classifiers. Neural Processing Letters, 9(3), 293-300 (1999).
7. D. Tsujinishi and S. Abe, Fuzzy least squares support vector machines for multiclass problems. Neural Networks, 16, 785-792 (2003).
8. V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, (1995).
9. V. Vapnik, Statistical Learning Theory. New York: Wiley, (1998).
ONLINE TRAINING EVALUATION IN VIRTUAL REALITY SIMULATORS USING EVOLVING FUZZY NEURAL NETWORKS.
L. S. MACHADO
Department of Computer Science, Federal University of Paraíba, João Pessoa, PB, Brazil
E-mail: [email protected]
R. M. DE MORAES
Department of Statistics, Federal University of Paraíba, João Pessoa, PB, Brazil
E-mail: [email protected]
A new approach to online evaluation of the quality of training in virtual reality worlds is proposed. This approach uses Evolving Fuzzy Neural Networks (EFuNNs) for modeling and classification of trainees into predefined classes of training.
1. Introduction
Nowadays, with recent advances in virtual reality systems, several kinds of training are performed in virtual reality environments. Very realistic virtual reality environments are constructed to immerse the user in a simulation of a real task. In these cases, an online evaluation can offer immediate feedback about the user's training performance. For online evaluation of training in virtual worlds, Moraes and Machado [4] recently proposed several approaches to this problem, presenting methods of online evaluation. Evaluation is obtained by attaching to the virtual reality system an evaluation tool that supervises the user during the simulation. This paper proposes the use of a recent class of fuzzy neural networks named Evolving Fuzzy Neural Networks (EFuNNs) [2] for an online training evaluator in virtual reality simulators.
2. Virtual Reality and Simulation
Virtual Reality (VR) refers to real-time systems modeled by computer graphics that allow user interaction and movements with three or more degrees of freedom [1]. VR worlds are 3D environments created by computer graphics techniques where one or more users are immersed totally or partially to interact with virtual elements. The quality of the user experience in a VR world is given by the graphics resolution and by the use of special devices for interaction. Basically, the devices stimulate the human senses of vision, audition and touch. There are many purposes for VR systems, but a very important one is the simulation of procedures for training. Training simulation provides significant benefits over other methods, mainly in critical procedures. Evaluation in simulations for training based on VR is necessary to measure the training quality and provide some feedback about the user performance [4]. User movements, applied forces, angles, position and torque can be collected from the devices and be used in an evaluation [4]. Spatial movements can be collected from a mouse, keyboard or any other tracking device. Robotic devices can provide tactile feedback to the user and can measure forces and torque applied during the interaction. Using these devices, VR systems can provide data to evaluate a simulation performed by a user. This way, an evaluation tool can supervise the user's movements during the manipulation of the object. Because VR systems deal with many threads (visual, haptic feedback, movement capture, stereoscopy) running simultaneously, an evaluation tool cannot interfere in this process. So, the evaluation tool must present low complexity in order not to compromise the system performance. At the same time, it must have high accuracy in order not to compromise the evaluation. This problem can be solved using Evolving Fuzzy Neural Networks (EFuNNs) [2]. EFuNNs are structures that evolve according to determined principles [2] and they have low complexity. To test the evaluation method, we are using a bone marrow harvest (BMH) simulator [3], a VR-based system for training. The BMH simulator recreates the pelvic region of the human body, the tissue layers and their physical properties, and integrates a robotic arm that provides force feedback during its manipulation.
3. Evolving Fuzzy Neural Networks (EFuNNs) and the Evaluation Tool
The EFuNNs are structures that evolve according to determined principles [2]: quick learning, an open structure for new features and new knowledge, representation of space and time, and analysis of their own errors. The EFuNN is a connectionist feed-forward network with five layers of neurons, but nodes and connections are created or connected as data examples are presented [2]. The input layer represents the input variables of the network as crisp values x. The second layer represents the fuzzy quantization of the input variables. Here, each neuron implements a fuzzy set and its membership function, such as a triangular membership, Gaussian membership or other. The third layer contains rule nodes (r_j) which evolve through training, and each one is defined by two connection vectors: W1(r_j) from the fuzzy input layer to the rule nodes and W2(r_j) from the rule nodes to the fuzzy output layer. These nodes are created during learning and they represent prototypes of the data mapping from the fuzzy input space to the fuzzy output space. In this layer we can use a linear activation function or a Gaussian function. The fourth layer represents the fuzzy quantization of the output variables from the third layer. A weighted sum of the input function and an activation function are used for the fuzzy quantization. The last layer uses an activation function to calculate the defuzzified values of the output variables y. In the third layer, each W1(r_j) represents the coordinates of the center of a hypersphere in the fuzzy input space and W2(r_j) represents the coordinates of the center of a hypersphere in the fuzzy output space. The radius of the hypersphere of a rule node r_j is defined as R_j = 1 - S_j, where S_j is the sensitivity threshold parameter for activation of r_j by a new example (x, y), where x is an input vector and y is an output vector. The pair of fuzzy data (x_f, y_f) will be allocated to r_j if x_f falls into the r_j input hypersphere and y_f falls into the r_j output hypersphere. For this to happen, two conditions must be satisfied: (a) the local normalized fuzzy distance between x_f and W1(r_j) must be smaller than R_j; (b) the normalized output error must be smaller than an error threshold E, where E is the error tolerance of the system for the fuzzy output. If conditions (a) or (b) are not satisfied, a new rule node can be created. The weights W1(r_j) and W2(r_j) of rule r_j are updated by the iterative process. When a new example is associated with a rule r_j, the parameters R_j and S_j are recalculated. If there exist temporal dependencies between consecutive data, the connection weight W3 can capture them. The connection W3 works as a Short-Term Memory and as a feedback connection from the rule nodes. The EFuNN learning algorithm starts with initial values for the parameters and the EFuNN is trained by examples until convergence [2].
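As an illustration of the rule-node creation test just described, the sketch below checks conditions (a) and (b) for a fuzzified example and spawns a new node when no existing node accommodates it. It is a minimal sketch of our own; the class and function names are ours, and per-node radius updating and the W3 temporal link are omitted:

```python
import numpy as np

def local_fuzzy_distance(xf, w1):
    # local normalized fuzzy distance between fuzzy input xf and centre W1(r_j)
    return np.abs(xf - w1).sum() / (xf.sum() + w1.sum())

class RuleLayer:
    def __init__(self, s_thr=0.9, err_thr=0.1):
        self.s_thr = s_thr          # sensitivity threshold S_j
        self.err_thr = err_thr      # output error tolerance E
        self.nodes = []             # list of (W1, W2) centre pairs

    def present(self, xf, yf):
        r = 1.0 - self.s_thr        # radius R_j = 1 - S_j
        for w1, w2 in self.nodes:
            in_sphere = local_fuzzy_distance(xf, w1) < r       # condition (a)
            small_err = np.abs(yf - w2).max() < self.err_thr   # condition (b)
            if in_sphere and small_err:
                return              # example absorbed by an existing rule node
        self.nodes.append((xf.copy(), yf.copy()))  # otherwise create a new node
```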
Based on the EFuNN, the proposed evaluation tool must supervise the user's movements and the parameters associated with them. In the VR simulator the trainee must extract bone marrow. In the first movement, he must feel the skin to find the best place to insert the needle. After that, he must feel the tissue layers (epidermis, dermis, subcutaneous tissue, periosteum and compact bone) trespassed by the needle and stop at the correct position to do the bone marrow extraction. If the quantity of bone marrow is sufficient the procedure stops; otherwise he must find another position to extract bone marrow again. In our system the trainee's movements are monitored through variables such as acceleration, applied force, spatial position, torque and angles of the needle. The system is calibrated by an expert who executes the correct procedure several times. When a trainee uses the system, his performance is compared with the expert performances and a comparison coefficient of performances is obtained. This coefficient is normalized and works as a mark for the trainee's performance. Several classes of performance are available to give the trainee a position about his training, such as: "you are well qualified", "you need some training yet", "you need more training" and "you are a novice".

4. Conclusions
In this paper we presented a new approach to online training evaluation in VR simulators. This approach has an elegant connectionist formulation given by the EFuNN and solves the main problems in evaluation procedures. It is welcome because it provides an appropriate connectionist fuzzy methodology for measurements in VR worlds. Systems based on this approach can be applied in VR simulators for several areas and can be used to classify the trainee into classes of learning, giving him a real position about his performance.
References
1. Burdea, G.; Coiffet, P., Virtual Reality Technology. Addison-Wesley, 2003.
2. Kasabov, N., Evolving Fuzzy Neural Networks for Supervised/Unsupervised On-line, Knowledge-based Learning, IEEE Trans. on Man, Machine and Cybernetics, v. 31, n. 6, 2001.
3. Machado, L. S., Mello, A. N., Lopes, R. D., Odone Filho, V. and Zuffo, M. K., A Virtual Reality Simulator for Bone Marrow Harvest for Pediatric Transplant. Studies in Health Technology and Informatics: MMVR, v. 81, pp. 293-297, Jan. 2001.
4. Moraes, R. M.; Machado, L. S., Hidden Markov Models for Learning Evaluation in Virtual Reality Simulators. International Journal of Computers & Applications, v. 25, n. 3, pp. 212-215, 2003.
THE PERSONALIZED PAGERANK BASED ON USER BEHAVIORS*
ZHANSHENG LI
Department of Computer Science & Engineering, Xihua University, Chengdu 610039, China
YAJUN DU, YANG XU
Department of Applied Mathematics, Southwest Jiaotong University, Chengdu 610031, China
YUEKUN WANG, DONGMEI QI
Department of Computer Science & Engineering, Xihua University, Chengdu 610039, China
The PageRank algorithm takes advantage of the link structure of the Web to improve the ranking of query results, independently of any particular user. Obviously, the PageRank score reflects a "democratic" importance that has no preference for any particular pages. In fact, users always browse those Web pages in which they have more interest. So, we propose a new personalized ranking vector--UserRank (UR)--based on user behaviors of browsing pages to estimate the importance of Web pages. In order to yield more accurate query results, we combine the UR vector with the PageRank vector to generate a new ranking vector--Personalized PageRank (P²R). Experimental results demonstrate that our P²R algorithm can more closely reflect user preference and substantially improve the precision of ranking results.
1. Introduction
Today, traditional information retrieval techniques have not met the needs of users on the Web. Recently, however, various link-based ranking strategies have largely improved the query results of Web search. The HITS [1] algorithm proposed by Kleinberg attempts to find two types of Web pages, hubs (pages that point to many pages of high quality) and authorities (pages of high quality). Contrastively, the PageRank [3] algorithm, introduced by Page et al., takes advantage of the link structure of the Web to generate a global "importance" rank of all of the pages on the Web. Figure 1 shows how the PageRank scheme works in a simple search engine [4]. A problem common to both HITS and PageRank is topic drift, and they do not consider the personal characteristics of users (such as user preference). In reality, a user may have a set of preferred pages. Jeh and Widom [6] make use of a
* This work is supported by the National Natural Science Foundation under grant No. 60074014.
user-specified set of initially-interesting pages to create "personalized views" of the Web, redefining importance according to user preference. Culliss [7] argues that utilizing user behaviors of browsing pages can greatly improve the precision of ranking results. Wang et al. [8] discuss in detail the distribution characteristics of user behaviors of browsing pages. From this we draw the conclusion that there are three important factors in user behaviors of browsing pages for personalizing a page: the sequence, the times, and the time spent browsing pages. So, we can create a log database to record user behaviors of browsing pages. Clearly, every user possesses privacy, and his behavior can't be forcibly gathered. If users permit us to record their behaviors, they will gain better search results. Considering those important factors for personalizing a page, we propose a new ranking vector--UserRank (UR)--to estimate the importance of those pages. The use of personalized PageRank to enable personalized Web search was first proposed in [3], and the parameter p [3] can be an important factor for personalizing pages. Regretfully, the parameter p in the standard PageRank is a uniform probability distribution over all pages, and it can't reflect the preference of particular users. So, it is reasonable for us to combine the UR vector with the standard PageRank vector to generate a new ranking vector named Personalized PageRank (P²R). An illustration of our P²R scheme used in a new search engine is given in Figure 2.
Fig 1. A simple search engine utilizing the PageRank scheme (crawl-time, offline and query-time stages with a repository and searcher).

Fig 2. A new search engine utilizing our Personalized PageRank scheme (crawl-time, offline and query-time stages with a query results log).
2. Review of PageRank
Page et al. propose the intuitive description of PageRank: if page i has a link to page j, then the author of page i is implicitly conferring some importance to page j [3]. Let $OD_i$ be the number of out-links from page i, and let L be the square matrix corresponding to the directed Web graph G, with matrix entry $l_{ij} = 1/OD_j$ if page j links to page i. Let n be the number of all nodes in the Web graph and let p be the n-dimensional column vector representing a uniform probability distribution over all nodes:

$$p = \frac{1}{n}[1]_{n \times 1}.$$

Then, let PR denote a vector that represents the importance (i.e., PageRank) of pages. So, the formula of PageRank can be written as:

$$PR = c \times L \times PR + (1 - c) \times p, \tag{1}$$

where $c \in (0,1)$ is the "teleportation" constant which limits the effect of rank sinks.
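To make Eq. (1) concrete, a power-iteration sketch follows. This is a minimal illustration of ours, assuming L is column-stochastic ($l_{ij} = 1/OD_j$ for each link j → i); the function name and the toy three-page graph are made up:

```python
import numpy as np

def pagerank(L, c=0.85, tol=1e-8, max_iter=100):
    """Power iteration for PR = c*L*PR + (1-c)*p with uniform p (Eq. (1))."""
    n = L.shape[0]
    p = np.full(n, 1.0 / n)
    pr = p.copy()
    for _ in range(max_iter):
        nxt = c * (L @ pr) + (1 - c) * p
        if np.abs(nxt - pr).sum() < tol:
            break
        pr = nxt
    return pr

# Tiny 3-page web: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0; column j holds 1/OD_j.
L = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
print(pagerank(L))
```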
3. Our New Ranking Vector Based on User Behaviors
3.1. User Behaviors of Browsing Pages
We think that user behaviors of browsing pages are very important for ranking pages and embodying the personalized characteristics of Web search. With the permission of users, we record user behaviors in a log database. Analyzing a great number of records in the log database, we deem that there are three key factors for personalizing a page: the sequence, the times, and the time spent browsing pages.
3.2. UserRank (UR)
Intuitively, we consider that the priority of browsing is given to those pages in which the user has the most interest. Furthermore, he probably visits those pages frequently and spends more time browsing them. Based on those observations and the conclusions in Section 3.1, we utilize those key factors for personalizing a page to propose the new ranking vector--UserRank (UR)--discussed in Section 1 to estimate the importance of preferred pages.
Definition 1: Information I = (U, Q, D, R, S, T) is a 6-tuple, where U denotes a particular user; Q denotes a query submitted by user U; D denotes the time of user U submitting query Q; R = {1, 2, ..., i, ..., n} represents the page set of query results for query Q submitted by user U, where i denotes a page and n = |R| denotes the number of pages in set R; S ⊆ R represents the sequence set of pages which have been browsed by user U in sequence; T = {t_j | j ∈ S} represents the set of times spent browsing page j, with τ₁ ≤ t_j ≤ τ₂, where τ₁ and τ₂ are given positive constants (0 < τ₁ < τ₂); and t = Σ_{j∈S} t_j denotes the total time spent in browsing all of the pages in set S.
Definition 2: UserRank (UR) is a ranking vector for personalizing pages based on user behaviors of browsing pages, and its individual entries, denoted by UR(i) (i ∈ R), are initialized to 1 the first time. According to Definition 1, if user U submits query Q at time D, producing information I in the process of browsing pages, then we compute each UR(j) (j ∈ S) in sequence according to Eq. (2).
The definition of UserRank above has another intuitive basis. If a user browses page i, we say that page i got a vote from the user. So, the UR value of page i will increase, and the UR values of the other pages in the query results will decrease. Furthermore, with the increase of the time of browsing page i, the UR values of pages will continue to increase or decrease. But the increment of the UR value of page i always equals the total decrement of the others.
3.3. Personalized PageRank (P²R)
An important component of the PageRank calculation is p [3], which turns out to be a powerful parameter to adjust the page ranks. Intuitively, the p vector corresponds to the distribution of Web pages that a random surfer periodically jumps to. So, the key to creating the P²R vector discussed in Section 1 is that we can bias the computation to increase the effect of particular user behaviors via a nonuniform n × 1 personalization vector. In fact, query results ranked by the standard PageRank largely influence user behaviors of browsing pages. So, it is necessary for us to combine the UR vector with the PageRank vector to generate the new ranking vector P²R. For a given UR, the P²R equation can be written as:

$$P^2R = c \times L \times P^2R + (1 - c) \times p \times UR. \tag{3}$$
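Reusing the pagerank sketch above, Eq. (3) only changes the jump vector: the uniform p is scaled elementwise by UR. A minimal sketch of ours (the elementwise product is our reading of p × UR):

```python
import numpy as np

def personalized_pagerank(L, ur, c=0.85, tol=1e-8, max_iter=100):
    """Power iteration for Eq. (3): P2R = c*L*P2R + (1-c)*(p*UR)."""
    n = L.shape[0]
    p = np.full(n, 1.0 / n) * ur      # jump vector biased by UserRank
    pr = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = c * (L @ pr) + (1 - c) * p
        if np.abs(nxt - pr).sum() < tol:
            break
        pr = nxt
    return pr
```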
4. Experiments
In order to embody the personalized characterization of our ranking system, we found three test volunteers from three different realms and selected two test queries--"blues" and "architecture"--from Bharat and Henzinger [2]. For the queries "blues" and "architecture", Google returned 11400000 and 16300000 items of query results, respectively. Table 1 shows the top 20 query results. When the volunteers were browsing Web pages, we recorded their behaviors and created a log database to save them. Table 2 records the three volunteers' behaviors of browsing pages for the queries "blues" and "architecture", respectively. The basic computer hardware, the configuration of the network environment, etc. may influence the speed of computing the UR(i) values, but have little effect on the precision of the experiment data. Before computing the UR(i) values, where i ∈ R, we let UR(i) = 1. According to the sequence of records ordered by query time in Table 2, we computed the UR values in terms of equation (2). For a more intuitive
representation of the change of UR values, we produced a chart. Chart 1 shows the change of the UR values of the top 20 pages for the queries "blues" and "architecture", respectively. Analyzing Chart 1, we can draw the conclusion that the more times and the more time a user spends browsing the same page, the higher the UR value given; and the earlier the page is browsed, the lower the UR value given. For the given PageRank values, we computed the P²R values in terms of equation (3), reordering the query results as shown in Table 3. From Table 3, we can see that the pages in which the volunteers have interest are ranked at the front of the query results. So, the P²R vector can better improve the ranking of query results and reflect the preference of particular users.

Table 1. The top 20 query results returned by Google on Dec. 9, 2003

Query: blues | Query: architecture
1. www.hob.com/ | 1. www.architecture.com/
2. www.bluesnews.com/ | 2. www.architecturemag.com/
3. www.blues.org/ | 3. www.architecture.org/
4. www.stlouisblues.com/ | 4. www.cs.wisc.edu/~arch/www/
5. www.bluesworld.com/ | 5. library.nevada.edu/arch/rsrce/webrsrce/
6. www.island.net/~blues/ | 6. www.lib.virginia.edu/dic/colls/arhl02/
7. www.bluestraveler.com/ | 7. www.greatbuildings.com/gbc.html
8. www.thebluehighway.com/ | 8. adam.ac.uk/
9. www.pbs.org/theblues/ | 9. www.clr.utoronto.ca/virtuallib/
10. dmoz.org/Arts/Music/Styles/Blues/ | 10. www.designarchitecture.com/
11. www.blueflamecafe.com/ | 11. www.iab.org/
12. www.mnblues.com/ | 12. rubens.anu.edu.au/
13. www.bluesaccess.com/ | 13. cca.qc.ca/
14. www.bluesaccess.com/ba home.html | 14. www.greatbuildings.com/
15. www.electricblues.com/ | 15. www.alsa-project.org/
16. www.moodyblues.co.uk/ | 16. www.islamicart.com/
17. www.bluesrevue.com/ | 17. www.pitt.edu/~medart/
18. www.nickjr.com/ | 18. www.getty.edu/research/tools/
19. bluesnet.hub.org/ | 19. www.cyburbia.org/
20. www.deltabluesmuseum.org/ | 20. www.architectureweek.com/

Table 2. Recording the three volunteers' behaviors of browsing pages, corresponding to Information I in Definition 1, on Dec. 9, 2003

Query (Q) | User (U) | Query time (D) | Query results (R) | Sequence of browsed pages (S) | Time spent browsing pages (T)
blues | A | 2003-12-9 20:00 | 11400000 | 10, 18, 20, 9 | 141.1, 84.1, 83.8, 69.5
blues | B | 2003-12-9 20:10 | 11400000 | 6, 16, 3 | 67.4, 78.8, 90.1
blues | C | 2003-12-9 20:20 | 11400000 | 1, 3, 11 | 130.0, 141.2, 77.1
architecture | A | 2003-12-9 20:30 | 16300000 | 1, 3, 20 | 98.2, 281.1, 101.2
architecture | B | 2003-12-9 20:40 | 16300000 | 3, 9, 13, 17 | 162.7, 72.6, 47.0, 29.0
architecture | C | 2003-12-9 20:50 | 16300000 | 8, 13, 16, 18 | 69.7, 46.1, 32.7, 32.9
Table 3. The top 20 query results reordered by P²R values (only the sequence numbers are listed)

Query: blues | 3, 16, 20, 11, 1, 10, 9, 18, 6, 2, 4, 5, 7, 8, 12, 13, 14, 15, 17, 19
Query: architecture | 3, 13, 8, 20, 18, 9, 16, 1, 17, 2, 4, 5, 6, 7, 10, 11, 12, 14, 15, 19
Chart 1. The change of the UR values of the top 20 pages for the queries "blues" and "architecture", respectively.
References
1. J. M. Kleinberg. Authoritative sources in a hyperlinked environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 1998.
2. K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. Proceedings of the Twenty-First Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998.
3. L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University, Stanford, CA, 1998.
4. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Proceedings of the Seventh International World Wide Web Conference, 1998.
5. T. H. Haveliwala. Topic-sensitive PageRank. In Proceedings of the Eleventh International World Wide Web Conference, Honolulu, Hawaii, May 2002.
6. G. Jeh and J. Widom. Scaling Personalized Web Search. Online at http://dbpubs.stanford.edu/pub/2002-12.
7. G. Culliss. User Popularity Ranked Search Engine. Online at http://www.infonortics.com/searchengines/boston1999/culliss.
8. J. Wang, X. Li, S. Shan and Z. Xie. Massive Web Search Engine: User Behaviors' Distribution Characteristics and their Implications. Science in China (Series E), Vol. 31, No. 4, Aug. 2001.
9. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, 1996.
10. M. Richardson and P. Domingos. The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank, volume 14. MIT Press, Cambridge, MA, 2002.
11. S. Dominich. PageRank: Quantitative Model of Interaction Information Retrieval. Online at http://www.research.att.com/~rjana/Dominich.pdf.
12. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan, and S. Rajagopalan. Automatic resource compilation by analyzing hyperlink structure and associated text. Proceedings of the Seventh International World Wide Web Conference, 1998.
13. C. Ridings and M. Shishigin. PageRank Uncovered. Online at http://www.linking101.com/articles/PageRank.pdf.
14. S. Kamvar, T. Haveliwala and G. Golub. Adaptive Methods for the Computation of PageRank. Online at http://www.stanford.edu/~sdkamvar/papers/adaptive.pdf.
IMPLEMENTATION OF AN INTELLIGENT SPIDER'S ALGORITHM FOR SEARCH ENGINE*
YAJUN DU¹, YANG XU¹, ZHANSHENG LI², DONGMEI QI²
1. Department of Applied Mathematics, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
2. College of Computers & Mathematical-Physical Science, Xihua University, Chengdu 610039, Sichuan, China
Recent research on crawler algorithms has become a hot spot in the field of search engines. By analyzing crawler algorithms and their programs, an intelligent spider's algorithm has three characteristics: (1) it is intelligently guided by the user's interests; (2) it works on a user search space much smaller than the space of the Internet; (3) it operates online with the keywords. An implementation of an intelligent spider's search engine is presented in this paper.
1. Introduction
In 1993, Colorado University developed the first generation spider, named the World Wide Web Worm. In 1994, Yang and Filo developed a program named Wander to catch user resource locators on the Internet. At the same time, DeBra and Post developed an information retrieval system named TueMosaic [1]. In 1994, Pinkerton reported a high-speed web crawler [2]. In 1995, Washington University successfully developed a WebCrawler which can download web pages. In 1997, Chen et al. put forward an individuality spider [3] based on the context of web pages. In 1998, Yang and Yen put forward a simulated annealing spider to solve the many problems of crawling all Internet web pages [4]. In 2001, Thomas first introduced artificial intelligence into spider research and put forward the AI spider [5]. In 2002, Pant and Menczer put forward "Your Own Intelligent Web Crawlers" [6]. In this paper we present the concept of the interest rank of a web page. Meanwhile, we study the process of a user searching for information on the Internet, and put forward the principle that "the anchor texts in a Web page take priority, the title takes second place, and the content of the web page takes last place," which is used to match the keywords. Finally, we design a real-time algorithm for crawling web pages in the user search space, which reflects the user's keywords and history interests.
* The project was supported by the National Natural Science Foundation, China (Grant No. 60074014).
2. An Intelligent Spider
The data structure of the user web log is: (User-id, URL, Date, Start-time, End-time, Keywords, Interest-rank), where URL is the user resource locator of the web page.
2.1. User Search Space
The search space of an actual search engine is apparently made up of all web pages on the Internet. Importantly, some factors must be considered in a spider in order to improve its speed: 1. The query is related to the user's history knowledge, and the spider should make full use of it. 2. The search result should meet the user's requirements. It is impossible for users to obtain a great number of Web pages in a short time; they want to get information with the characteristics of "fewness, almightiness, exactness, real-time". 3. A keyword of the user is important; it reflects a special meaning and a special space, and a special search space for these keywords should be a fraction of the Internet. In a word, we consider the search space of a keyword and interest to be a tree, and so a spider can only crawl on the tree. A spider starts to crawl from some initialized URLs which reflect the user's history knowledge.
2.2. Construct an Initialized URL List
An initialized URL list is used to start from these URLs; its structure is made up of three parts: 1. When the user uses our search engine for the first time, we offer a client interface program to collect the user's interests, which are expressed by (URL, Web-Page). 2. It is noticed that the user is usually interested in some URLs among the user's resource locators of web pages. We pick up some keywords of these URLs and write them into the Web user log database. 3. The Web user log database records the history of the user visiting Web pages; we can rank some interest URLs and compute interest according to the sequence of URLs clicked by users. When the users submit keywords, we use those Web pages matched by the keywords to create the initialized URLs.
2.3. Computing Interest-Rank
The Interest-Rank is the degree to which a user is interested in a web page. Some determining factors of the Interest-Rank of a web page A are listed below:
1. The frequency and the time with which the user browses the web page A;
2. The frequency and the time with which the user clicks and browses other web pages relating to the web page A;
3. For a sequence url_A-url_B-url_C, the sequence indicates that the user is interested first in the web page A, then the web page B, and then the web page C. The interest rank of the web page A should be allocated to the web page B and the web page C.
3. The Crawling Algorithm of the Spider
3.1. The Crawling Strategy
The crawling strategy is a principle for matching Web pages by keywords: "the anchor texts take priority, the title takes second place, and the content of the web page takes last place." The crawling strategy can be described as below:

If url has an anchor text then
    If matching(keyword, anchor text) then crawl url
Else if url does not have an anchor text or not matching(keyword, anchor text) then
    If matching(keyword, title text) then crawl url
Else if url does not have a title text or not matching(keyword, title text) then
    If matching(keyword, web page text) then crawl url
Else stop crawling.

3.2. The Crawling Algorithm
1. Get_Document(url: string): snatch the web page document, parse out some useful URLs, and return a text and some URLs.
2. Analyzer(f: text): analyze the web page document, pick up some URLs and segment words, find the main keywords by the word frequency statistics method, and save (url, keyword) to the database which the spider returns to the user.

Algorithm: IntelligentSpider
INPUT url_i;          // root node of the tree
Wait_F, Complete_F;   // queues
Result_Buff;
Document;             // text file
URLS;                 // array of strings
BEGIN
  Complete_F <- url_i; Wait_F <- url_i; URLS <- empty;
  Document <- Get_Document(url_i);
  WHILE NOT Empty(Wait_F) DO
  BEGIN
    URLS <- Analyzer(Document, Keywords);
    Result_Buff <- Keywords;
    IF (URLS1 <- MatchAnchor(URLS)) THEN Popqueue(URLS1, Wait_F)
    ELSE IF (URLSi <- MatchTopic(URLS)) THEN Popqueue(URLSi, Wait_F)
    ELSE IF (URLSi <- MatchContent(URLS)) THEN Popqueue(URLSi, Wait_F);
    url <- Pushqueue(Wait_F);
    IF url IN URLS THEN url <- Pushqueue(Wait_F)
    ELSE Document <- Get_Document(url);
  END WHILE;
  Output Result_Buff.
END.
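A compact way to read the strategy and algorithm above is as a priority test plus a queue-driven loop. The sketch below is our own minimal rendering (function names and the in-memory queue are ours; fetching and parsing are stubbed out behind an assumed `fetch` callable):

```python
from collections import deque

def should_crawl(keyword, anchor_text, title_text, page_text):
    # anchor text takes priority, then the title, then the page content
    if anchor_text and keyword in anchor_text:
        return True
    if title_text and keyword in title_text:
        return True
    return keyword in page_text

def intelligent_spider(root_url, keyword, fetch, max_pages=100):
    """fetch(url) -> (anchor, title, text, out_urls) is assumed to be supplied."""
    wait, seen, results = deque([root_url]), {root_url}, []
    while wait and len(results) < max_pages:
        url = wait.popleft()
        anchor, title, text, out_urls = fetch(url)
        if should_crawl(keyword, anchor, title, text):
            results.append(url)
            for u in out_urls:          # only expand pages that matched
                if u not in seen:
                    seen.add(u)
                    wait.append(u)
    return results
```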
4. Experiments
We designed a series of experiments to measure the search time, the search depth, and the search width for the user's keywords. 1. The average search time is about 82.4 to 3401.8 s. 2. The average search depth is about 4 to 52 web pages. 3. The average search width is about 1 to 102 web pages. In comparison with current spiders, these indices are greatly improved.
References
1. P. DeBra and R. Post. Information retrieval in the World Wide Web: making client-based searching feasible. In Proceedings of the First International World Wide Web Conference '94, Geneva, Switzerland (1994).
2. B. Pinkerton. Finding what people want: experiences with the WebCrawler. In Proceedings of the Second International World Wide Web Conference '94, Chicago, IL (1994).
3. H. Chen, Y.-M. Chung, M. Ramsey, C. C. Yang, P.-C. Ma and J. Yen. Intelligent spider for Internet searching. In Proceedings of the Thirtieth Hawaii International Conference on System Sciences, Volume 4, 7-10 Jan. (1997).
4. C. C. Yang, J. Yen and H. Chen. Intelligent Internet searching engine based on hybrid simulated annealing. Proceedings of the Thirty-First Hawaii International Conference (1998).
5. B. Thomas. Web Spidering with AI Methods. Christian Wolff (2001).
6. G. Pant and F. Menczer. MySpiders: Evolve Your Own Intelligent Web Crawlers (2002).
A MODEL FOR SEMANTIC OF LINGUISTIC INFORMATION WITH INCOMPARABILITY*

J. MA
School of Electric Engineering, Southwest Jiaotong University, Chengdu 610031, Sichuan, China, E-mail: [email protected]

D. RUAN
Belgian Nuclear Research Centre, B-2400 Mol, Belgium, E-mail: [email protected]

Y. XU
Department of Mathematics, Southwest Jiaotong University, Chengdu 610031, Sichuan, China, E-mail: [email protected]
In this paper, a model for the semantic of linguistic information on lattice implication algebra is presented and discussed, taking a decision making problem as an example. In this model, the semantic of linguistic information is treated as a triple and its processing is combined with formal inference.
1. Introduction
Nowadays, linguistic approaches have become one of the important issues in knowledge representation and inference, and have been applied to various areas, such as "education" [1], "clinical diagnosis" [5], "marketing" [8], and especially "decision making" [2,4,6,12,13]. To our best knowledge, most existing linguistic approaches are proposed in the framework of fuzzy set theory, and they can be divided into two types: (1) numeric computation methods, such as Zhang's method [13]; and (2) symbolic computation methods, such as Herrera's method [6]. Taking

*This work is partially supported by the National Natural Science Foundation of China with grant no. 60074014
a decision making problem for example, generally speaking, a common resolution scheme of a linguistic approach is composed of three stages, i.e., (1) turning linguistic information into a computing model; (2) carrying out computation in that model; and (3) evaluating the computed results. However, in such approaches two limitations are obvious: (1) the assumption that all linguistic terms must be totally ordered and placed symmetrically is unnatural; (2) the complete substitution of computation for linguistic information processing is insufficient. This paper aims at treating linguistic information accompanied by incomparability. A model for describing it in a lattice implication algebra is investigated and applied to an evaluation problem. Concretely, in Section 2, some basic concepts of lattice implication algebras are presented; in Section 3, the model for describing linguistic information is discussed on an example; in Section 4, some concluding remarks are presented.
2. Preliminary
In order to study the uncertain information processing with incomparability, Xu presented the logical algebra - lattice implication algebra - in 1993.
Definition 2.1. Let $(L, \vee, \wedge, O, I)$ be a bounded lattice with universal boundaries $O$ (the least element) and $I$ (the greatest element) respectively, and $'$ an order-reversing involution. If a mapping $\to: L \times L \to L$ satisfies, for any $x, y, z \in L$,

(I1) $x \to (y \to z) = y \to (x \to z)$;
(I2) $x \to x = I$;
(I3) $x \to y = y' \to x'$;
(I4) if $x \to y = y \to x = I$, then $x = y$;
(I5) $(x \to y) \to y = (y \to x) \to x$;
(I6) $(x \vee y) \to z = (x \to z) \wedge (y \to z)$;
(I7) $(x \wedge y) \to z = (x \to z) \vee (y \to z)$,

then $(L, \vee, \wedge, ', \to, O, I)$ is called a lattice implication algebra. By Definition 2.1, it is easy to show that Boolean algebras and the Łukasiewicz implication algebra are special cases of lattice implication algebras. More details on lattice implication algebras can be found in [7].
3. Main Results
We shall discuss our model on the following example of a decision making problem [4].
Example 3.1. A manager needs to decide which of four available candidates, $a_1$, $a_2$, $a_3$, $a_4$, to hire for an open job. The goal (D) is to hire a candidate with "long experience relevant to the job description (G)," under the constraints that "the required salary is low (C1)" and that the candidate is "not much older than 30 (C2)".
For this example, a conventional fuzzy-set-based approach is: (1) designing particular fuzzy sets for the goal and every constraint; (2) deciding how to aggregate the evaluations for each candidate and computing them according to the operation laws of fuzzy sets; and (3) selecting the best candidate(s) in terms of the computed evaluations. It can be found that some linguistic information has been dealt with in advance in the above processing. Taking G for instance, it is always described in linguistic terms such as "lacking," "rich," or others. These terms, in general, represent a range of user perception. Moreover, a particular number of a candidate's years of relevant experience may also be described as "short," "long," etc., which are also a range of personal preference. Hence, the description should be a range rather than a fixed point (Zadeh's theory of information granularity is the very idea [11]). For these reasons, Chen and other researchers have contributed interval-based methods for linguistic information [2,3]. However, interval-valued linguistic approaches are not sufficient to illustrate the incomparability in linguistic terms, for the reason that an interval can only illustrate the incomparability of the whole. In our opinion, linguistic information processing can be seen as a course of uncertainty reasoning. Hence, it is rational to handle linguistic terms in the framework of a logic system, especially a non-classical logic. For this reason, Xu's lattice-valued logic based on lattice implication algebra may be an alternative. In the following, we shall discuss how to describe linguistic information in a lattice implication algebra.

Let $S$ be the set of linguistic terms, whose elements are called labels, $L$ a lattice implication algebra, and $a^-, a^+ \in L$ with $a^- < a^+$. $[a^-, a^+] = \{x \in L \mid a^- \le x \le a^+\}$ is called an interval in $L$ (shortly $\bar{a}$) and the set of all intervals in $L$ will be denoted by $\mathrm{Int}(L)$. The order in $\mathrm{Int}(L)$ is defined as: for any $\bar{a}, \bar{b} \in \mathrm{Int}(L)$, $\bar{a} \le \bar{b}$ if and only if $a^- \le b^-$ and $a^+ \le b^+$. Also we define operations on $\mathrm{Int}(L)$ as:

$(\bar{a})' = [(a^+)', (a^-)']$,
$\bar{a} \wedge \bar{b} = [a^- \wedge b^-, a^+ \wedge b^+]$,
$\bar{a} \vee \bar{b} = [a^- \vee b^-, a^+ \vee b^+]$,
$\bar{a} \otimes \bar{b} = [a^- \otimes b^-, a^+ \otimes b^+]$,
$\bar{a} \oplus \bar{b} = [a^- \oplus b^-, a^+ \oplus b^+]$,
$\bar{a} \to \bar{b} = [a^+ \to b^-, a^- \to b^+]$,
$\bar{a} \cap \bar{b} = \{x \in L \mid x \in \bar{a} \text{ and } x \in \bar{b}\}$,
$\bar{a} \cup \bar{b} = \{x \in L \mid x \in \bar{a} \text{ or } x \in \bar{b}\}$.
By these definitions, we have the following properties.

Theorem 3.1. $(\mathrm{Int}(L), \wedge, \vee)$ is a distributive lattice and

(1) $\bar{a} \to (\bar{b} \to \bar{c}) = \bar{b} \to (\bar{a} \to \bar{c})$,
(2) $\bar{a} \to \bar{b} = (\bar{b})' \to (\bar{a})'$,
(3) $(\bar{a} \vee \bar{b}) \to \bar{c} = (\bar{a} \to \bar{c}) \wedge (\bar{b} \to \bar{c})$,

where $\bar{a}, \bar{b}, \bar{c} \in \mathrm{Int}(L)$.
Remark: It can be proved that the operation $'$ is an order-reversing involution and that for any $\bar{a} \in \mathrm{Int}(L)$, $[O, O] \le \bar{a} \le [I, I]$. However, in general, $(\mathrm{Int}(L), \vee, \wedge, \to, ', [O,O], [I,I])$ is not a lattice implication algebra. The semantic of linguistic information is treated as a triple $(L, S, M)$, where $L$ is a lattice implication algebra serving as the set of elementary semantical measurements, $S$ the set of linguistic terms serving as linguistic labels, and $M$ a function assigning labels in $S$ to $\mathrm{Int}(L)$. The semantic of an inference rule is a 2-tuple $(r, s)$, where $r$ reflects the relation between factors and goal, and $s$ is the label(s) assigned to $r$. For example, the relation between factor G and goal D is taken as the rule (relevant experience $\to$ goal, important).
Therefore, linguistic information processing is turned into solving the problem of how to get the linguistic label of D from $(G, T_0)$ and $(G \to D, S_0)$. For this problem, there are generally two methods to choose from. The first one is to compute on $T_0$ and $S_0$ directly, such as $T_0 \wedge S_0$, and the other one is to solve the following equation:

$$M(T_0) \to M(X) = M(S_0), \tag{16}$$

where $M(X)$ is the smallest interval in $\mathrm{Int}(L)$ including

$$\bigcap_{a \in M(T_0)} \bigcap_{b \in M(S_0)} \{ y \in L \mid a \to y = b \}. \tag{17}$$

Then $X$ will be the obtained label(s) for D. For Example 3.1, suppose the selected lattice implication algebra is the one shown in Fig. 1 and the rule is $(G \to D, u)$. For candidate $a_1$, because he (she) has worked in this field about eight years, the label "rich" ($[b, I]$) is assigned to him (her), so we have $(G(a_1), [b, I])$. By Eq. (17), we get for $a_1$

$$M(X) = [c, u], \tag{18}$$

for $\{c, u\} = \{c, u\} \cap \{O, c, u\}$. Suppose the label assigned to $[c, u]$ is "acceptable"; then it means that according to the factor of experience, the candidate $a_1$ is acceptable for the goal. We omit the other computations here since they can be carried out similarly.
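The set in Eq. (17) is easy to enumerate mechanically on a finite lattice once the implication is given as a table. The sketch below is our own illustration: `imp` is a hypothetical implication table (a dict of dicts, `imp[a][y]` = a → y), and M(T0), M(S0) are passed as explicit element sets; nothing here reproduces the concrete algebra of Fig. 1.

```python
def solutions(a, b, imp):
    # { y in L : a -> y = b } for one pair (a, b)
    return {y for y in imp[a] if imp[a][y] == b}

def eq17_set(m_t0, m_s0, imp):
    """Intersection over all a in M(T0) and b in M(S0) of the solution sets (Eq. (17))."""
    out = None
    for a in m_t0:
        for b in m_s0:
            s = solutions(a, b, imp)
            out = s if out is None else out & s
    return out or set()
```

M(X) is then the smallest interval of Int(L) containing this set, i.e., the interval bounded by its infimum and supremum under the lattice order.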
Figure 1. Selected lattice implication algebra as measure.
4. Conclusion
In this paper, we present a model for the semantic of linguistic information on a lattice implication algebra. The present work can deal with not only totally ordered linguistic information but also incomparable linguistic information. However, because of the complexity of linguistic information, there are some problems that need to be resolved, for instance, the combination of different universes for different factors, the aggregation approaches for various evaluations, and so on. Some work is on the way.
References
1. C. K. Law, Using Fuzzy Numbers in Educational Grading System, Fuzzy Sets and Systems, 83, 261-271 (1996).
2. C. W. Entemann, A fuzzy logic with interval values, Fuzzy Sets and Systems, 113, 161-183 (2001).
3. Q. Chen and S. Kawase, On fuzzy valued fuzzy reasoning, Fuzzy Sets and Systems, 113, 237-251 (2000).
4. G. Klir, Fuzzy Sets - An Overview of Fundamentals, Applications, and Personal Views, Beijing Normal University Press, Beijing (2000).
5. F. Herrera and E. Herrera-Viedma, Choice functions and mechanisms for linguistic preference relations, Technical Report #DECSAI-97134, Granada University, Spain (1997).
6. F. Herrera, E. Herrera-Viedma and L. Martinez, Representation Models for Aggregating Linguistic Information: Issues and Analysis. In: R. Mesiar and T. Calvo (Eds.), Aggregation Operators: New Trends and Applications, Physica-Verlag, Germany (2001).
7. Y. Xu, D. Ruan, K. Qin and J. Liu, Lattice-Valued Logic - An Alternative Approach to Treating Fuzziness and Uncertainties, Springer, Germany (2003).
8. R. R. Yager, A New Methodology for Ordinal Multiobjective Decisions Based on Fuzzy Sets, in: D. Dubois, H. Prade and R. R. Yager, eds., Fuzzy Sets for Intelligent Systems, Morgan Kaufmann Publishers, San Mateo, USA (1995).
9. R. R. Yager, An Approach to Ordinal Decision Making, Internat. J. of Approximate Reasoning, 12, 237-261 (1995).
10. L. A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, I, II, III, Information Sciences, 8, (3): 199-251, (4): 301-357, 9: 43-80 (1975). In: D. Ruan and C. Huang (Eds.), Fuzzy Sets and Fuzzy Information-Granulation Theory - Key Selected Papers by L. A. Zadeh, Beijing Normal University Press, Beijing (2000).
11. L. A. Zadeh, Fuzzy sets and information granularity, in: M. M. Gupta, R. K. Ragade and R. R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland, New York, 3-18 (1979). In: D. Ruan and C. Huang (Eds.), Fuzzy Sets and Fuzzy Information-Granulation Theory - Key Selected Papers by L. A. Zadeh, Beijing Normal University Press, Beijing (2000).
12. X. Zeng, Y. Ding and L. Koehl, A 2-Tuple Fuzzy Linguistic Model for Sensory Fabric Hand Evaluation, in: D. Ruan and X. Y. Zeng, eds., Intelligent Sensory Evaluation - Methodologies and Applications, Springer, Germany (2003).
13. G. Zhang and J. Liu, Using General Fuzzy Number to Handle Uncertainty and Imprecision in Group Decision-Making, in: D. Ruan and X. Y. Zeng, eds., Intelligent Sensory Evaluation - Methodologies and Applications, Springer, Germany (2003).
FUZZY ADAPTIVE CONTROL BASED ON L-R FUZZY NUMBER AND APPLICATION IN THE PENSION SYSTEMS
XIAOBEI LIANG, BINGYONG TANG AND YUAN XUE
Glorious Sun School of Business and Management, Donghua University, Shanghai, 200051, P. R. China
E-mail: [email protected]
ZHICHANG ZHU
Business School, The University of Hull, Hull, HU6 7M, UK
Fuzzy control algorithms have been extensively used in business and industry. In this paper, a fuzzy adaptive control technique based on L-R fuzzy number is presented and its application in the pension systems is discussed.
1. Introduction
Fuzzy control algorithms have been extensively used for non-linear systems in business and industry [1]. Among fuzzy control algorithms, adaptive fuzzy control provides a possible means to cope with some time-varying and even non-linear behavior of a system. This paper discusses a fuzzy adaptive control method based on L-R fuzzy numbers for a time-varying system, with applications in the pension systems.
2. Fuzzy State Equation Based on L-R Fuzzy Number
The discrete time-varying fuzzy linear system is considered here as follows. $\tilde{x}(k)$, $\tilde{u}(k)$ and $\tilde{y}(k)$ are the n-dimensional fuzzy state, r-dimensional fuzzy input and m-dimensional fuzzy output of a fuzzy system at time k, respectively. These fuzzy variables are represented by L-R fuzzy numbers:

$$\tilde{x} = (m, \alpha, \beta)_{LR}, \tag{1}$$

where $\alpha$ and $\beta$ are the left and right spreads, respectively. The fuzzy state equations can be obtained as follows:

$$\tilde{x}(k+1) = A(k)\tilde{x}(k) + B(k)\tilde{u}(k), \tag{2}$$
$$\tilde{y}(k) = C(k)\tilde{x}(k). \tag{3}$$

Then Eqs. (2) and (3) can be written componentwise as:

$$\tilde{x}_i(k+1) = \sum_{s=1}^{n} a_{is}(k)\tilde{x}_s(k) + \sum_{l=1}^{r} b_{il}(k)\tilde{u}_l(k), \quad i = 1, 2, \ldots, n, \tag{4}$$

$$\tilde{y}_j(k) = \sum_{q=1}^{n} c_{jq}(k)\tilde{x}_q(k), \quad j = 1, 2, \ldots, m. \tag{5}$$

Substituting Eq. (4) into Eq. (5) gives

$$\tilde{y}_j(k) = \sum_{q=1}^{n} c_{jq}(k)\left[\sum_{s=1}^{n} a_{qs}(k-1)\tilde{x}_s(k-1) + \sum_{l=1}^{r} b_{ql}(k-1)\tilde{u}_l(k-1)\right], \quad j = 1, 2, \ldots, m. \tag{6}$$

Substituting Eq. (4) into Eq. (6) repeatedly, when the start state $\tilde{x}(0)$ is known, $a'_i(k)$ is the time-varying parameter of the start state after proper combination and $\beta^j_{lp}(k)$ is the time-varying coefficient of $\tilde{u}_l(k-p)$, the fuzzy input variable with a time delay. If the time delay of the system input is determined as $d$ by a proper rank identification method, Eq. (6) can be written as a fuzzy input-output model:

$$\tilde{y}_j(k) = \sum_{i=1}^{n} a'_i(k)\tilde{x}_i(0) + \sum_{l=1}^{r}\sum_{p=1}^{d} \beta^j_{lp}(k)\tilde{u}_l(k-p) + \tilde{e}(k), \quad j = 1, 2, \ldots, m, \tag{7}$$

where $\tilde{e}(k)$ is the corresponding fuzzy stochastic noise.
3. Identification of Time-Varying Parameters
For Eq. (7), let

$$\varphi(k) = \left(\tilde{x}_1(0), \tilde{x}_2(0), \ldots, \tilde{x}_n(0), \tilde{u}_1(k-1), \tilde{u}_1(k-2), \ldots, \tilde{u}_1(k-d), \tilde{u}_2(k-1), \ldots, \tilde{u}_2(k-d), \ldots, \tilde{u}_r(k-1), \ldots, \tilde{u}_r(k-d)\right)^T,$$

$$\theta_j(k) = \left(a'_1(k), a'_2(k), \ldots, a'_n(k), \beta^j_{11}(k), \ldots, \beta^j_{1d}(k), \beta^j_{21}(k), \ldots, \beta^j_{2d}(k), \ldots, \beta^j_{r1}(k), \ldots, \beta^j_{rd}(k)\right)^T, \quad j = 1, 2, \ldots, m;$$

then Eq. (7) can be transformed into:

$$\tilde{y}_j(k) = \theta_j^T(k)\varphi(k) + \tilde{e}_j(k). \tag{8}$$

Thus, the parameter gradient recursive estimation formula [2], obtained from the known data of fuzzy input and output, can be written as:

$$j = 1, 2, \ldots, m, \tag{9}$$

where $\hat{\theta}_j(k)$ represents the estimate value of $\theta_j(k)$ and $y_j(k)$ is the main value of the L-R fuzzy number $\tilde{y}_j(k)$.
-I
To forecast the future time-varying parameter, the “past” and “present” estimate value series
{8,(I),
(2)
,a
*
., gJ ( N ) ) (N
is the present time), which
obtained by Eq. (9), is analyzed to find their regulation. With the proper mathematical method3, the forecasting formula of time-varying parameter B,(k) can be obtained and a series of forecasting value
s,’ ( N + 1) , - 8: ( N + h ) can be further obtained. Here, h is a forecasting*
a ,
step length. 4.
Adaptive Control Algorithm and Its Application
The fuzzy mathematical method of time-varying fuzzy linear system is as follows:
Y- ( k + 1) = b,T u(k)+ pT(k+ i ) e ( k + I), I
(10)
I
I
where y ( k ) is one-dimensional fuzzy output and I
fuzzy input (Y 2 1) . The vector vector
~- ( k is)
Y -dimensional
b, is the known non-zero constant. The fuzzy
p(k + 1) consists of fuzzy input and fuzzy output at one time of the
-
system. The vector
B(k - + 1) is composed of fuzzy parameters. From the above
fuzzy parameter estimate algorithm, we know that the recursive estimate formula for fuzzy B(k) in Eq. (10) is as follows:
-
(1 1)
In calculation, the estimation of vector estimations of its branch B - 1
B(k) can be separated into the -
( k ) . Further, the hzzy parameter estimate value
337
-
B*(k - + 1) can be obtained by the fuzzy parameter forecasting algorithm at the k + 1 time. The algorithm for Eq. (1 1) is
{
u(k) bO 2 "k - = u(k - 1) + I
where
y *( k +
-
P oI
1
+ 1) - b,T u(k - 1) - p T ( k ) B * ( k+ 1) , I
(12) 1) is anticipated fuzzy output at the k 1 time.
+
The proposed method has been applied to the adaptive to the adaptive control in ) the pension and the fizzy inputs the pension systems. Here y ( k ) and ~ ( kare
-
-
include social, non-technical factors. According to the historical data,
B(k) can -
be estimated by fuzzy parameter estimation formula (1 l), further, for example, if y* ( k )= (490,0.5,0.3) (k=23), then by (12), we have
-
u (23)={ (189.44,0.42,0.32),(128.51,0.29,0.36),(143.94,0.32,0.24),(158.68,0.37,
-
0.42),( 182.87,0.45,0.32),(171.44,0.42,0.35)}. Similarly, we can obtain u (24), u (25),. .., and so on. I
The proposed adaptive control algorithm based on fuzzy parameter identification has found applications in the pension system; its further investigation and its application in other areas will be considered in our future research.

References
1. Zadeh, L.A., 1996. Fuzzy control: A personal perspective. Control Engineering 43(10), 51-52.
2. Han, Z., 1984. The identification of time-varying parameters in dynamic systems. Acta Automatica Sinica 10(4), 381-387.
3. Tang, B., Xu, L., Wang, W., 2000. An adaptive control method for time-varying systems. European Journal of Operational Research 124, 342-352.
4. Han, Z., 1992. Adaptive control algorithms versus parameter estimations. Control Theory and Applications 9(6), 374-379.
STUDY ON INTELLIGENTIZED SPLIT-SPREAD AND INTELLIGENTIZED RENEW-COMPOUND OF MULTIPLEX INFORMATION-FLOW IN WORKING-FLOW MANAGEMENT * WEIHUA KOU AND YANG XU
Intelligent Control Development Center, Southwest Jiaotong University, Chengdu 610031, Sichuan, P. R. China
E-mail: [email protected], [email protected]
Based on the multiplex information-flow, intelligentized split-spread and intelligentized renew-compound of the multiplex information-flow in working-flow management are put forward. The essential idea is that the multiplex information-flow is disassembled, spread, reverted, and replicated. The following problems are solved: repeated spreading of overlapping data, and spreading of unamended data.
1. Introduction

The monotype information-flow in working-flow management is relatively simple, because it points to other working-flow administrative levels as a whole: the whole information-flow is spread when the monotype information-flow interacts with other administrative levels [1], and those levels also deal with the whole monotype information-flow. The multiplex information-flow, by contrast, is composed of many information logic units, and only the correlative information logic units are spread when the multiplex information-flow interacts with other administrative levels; that is, an information logic unit points to one or many other working-flow administrative levels [2]. The other administrative levels then deal only with the correlative information logic units of this multiplex information-flow, not with the whole multiplex information-flow.
* We gratefully acknowledge the support of the National Natural Science Foundation of P. R. China (Grant No. 60074014).
2. The problem about the multiplex information-flow
The existence of the multiplex information-flow in working-flow management gives rise to many problems, for example: how to spread the multiplex information-flow interactively by information logic unit; how to determine the spread direction of an information logic unit when interacting; how to renew an information logic unit within its multiplex information-flow after interaction; and how to compound a newly added information logic unit with the destination multiplex information-flow.

3. Description of intelligentized split-spread and intelligentized renew-compound
In order to solve the above problems, a method of intelligentized split-spread and intelligentized renew-compound is put forward. Its basic idea is described as follows. Given a multiplex information-flow A, A is split into n information logic units according to the rule of its working-flow management; let $A = \bigcup_{i=1}^{n} a_i$. In the working-flow net B there are m administrative levels, $B = \{b_j;\ 1 \le j \le m\}$. The split multiplex information-flow A is distributed and spread according to the rule of its working-flow management. Let $s_i$ be the flow direction of the information logic unit $a_i$, $s_i = \{(a_i, B_i');\ 1 \le i \le n\}$, where $B_i' \subseteq B$; the spread of $a_i$ to a level $b_j$ is denoted $c_{ij}$, $1 \le i \le n$, $1 \le j \le m$. The return information corresponding to $c_{ij}$ is denoted $s_{ij}'$, $s_{ij}' = (a_i', c_{ij}, b_j)$, where $a_i'$ is the framework of the information logic unit $a_i$; so all the return information which the multiplex information-flow A obtains is $S' = \bigcup_{1 \le i \le n,\ 1 \le j \le m} s_{ij}'$. Based on the above simple analysis, the basic idea of the intelligentized split-spread and intelligentized renew-compound of the multiplex information-flow in working-flow management is illustrated in Fig. 1.
Fig. 1. The intelligentized disposal method of the multiplex information-flow (working-flow management levels net B; split-spread set S; renew-compound; multiplex information-flow A; information logic unit $a_i$).

4. Analysis of the disassembly and reductive process

4.1. The disassembly process
The disassembly process of the once-spread replication method refers to the D-M algorithm of reference [2]. First, define a new set V' that is a subset of V and satisfies $\{t_i\} \subseteq W_i$: $V' = \{(R_1, W_1), (R_2, W_2), \ldots, (R_i, W_i), \ldots, (R_n, W_n)\}$, $1 \le i \le n$, $\{t_i\} \subseteq W_i$. Any $R_i$ of the set V' is the object spread to the site $t_i$. If for any $R_i$ and $R_j$, $R_i \cap R_j = \emptyset$, $i \ne j$, that is, the source copies $R_i$ and $R_j$ contain no overlapping data, then the set R is not disassembled. Otherwise, let $R_k = R_i \cap R_j$, $R_i = R_i - R_k$, $R_j = R_j - R_k$, $W_k = W_i \cup W_j$, where k > n. Handle every pair $R_i$, $R_j$ in this way until any $R_i$ and $R_j$ satisfy $R_i \cap R_j = \emptyset$. In order that the disassembled copy is integrally and correctly reverted at the site $t_i$, the disassembled $R_k$ should be labelled: define the integral set $Z_i$, $Z_i \subseteq \{1, 2, \ldots, n\}$, which records the undisassembled copies to which the disassembled $R_k$ belongs. The set V' of source copies spread to the site $t_i$ can then be expressed as $V' = \{(R_1, Z_1, W_1), (R_2, Z_2, W_2), \ldots, (R_i, Z_i, W_i), \ldots, (R_n, Z_n, W_n), \ldots, (R_y, Z_y, W_y)\}$, where any $R_i$ and $R_j$ satisfy $R_i \cap R_j = \emptyset$. After the site $t_i$ receives the set V', it reverts the disassembled copy by the integral set $Z_i$: $R_j = R_j \cup R_i$, where $j \in Z_i$, $(R_i, Z_i, W_i) \in V'$, $1 \le j \le n$. (A sketch of the pairwise splitting step appears below.)
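As an illustration only, the following sketch implements the pairwise splitting described above (it is not the D-M algorithm of reference [2] itself); the dictionary-based data layout and the integer copy ids are assumptions.

```python
from itertools import combinations

def disassemble(copies):
    """Split overlapping source copies into pairwise-disjoint pieces.

    copies: dict id -> (set R_i of data items, set W_i of sites).
    A new piece R_k = R_i & R_j gets the union of both site sets, W_k = W_i | W_j.
    """
    copies = dict(copies)
    next_id = max(copies) + 1
    changed = True
    while changed:
        changed = False
        for i, j in combinations(list(copies), 2):
            Ri, Wi = copies[i]
            Rj, Wj = copies[j]
            Rk = Ri & Rj
            if Rk:                                  # overlapping data found
                copies[i] = (Ri - Rk, Wi)
                copies[j] = (Rj - Rk, Wj)
                copies[next_id] = (Rk, Wi | Wj)     # shared piece, merged sites
                next_id += 1
                changed = True
                break
    return copies
```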
5. Conclusions

This paper discusses principally how data can be spread without repetition, how unamended data can be kept from being spread, and how the multiplex information-flow can be spread and replicated. It also shows that this method can solve the disposal and distributed-spread problems caused by the information logic units of the multiplex information-flow.

Acknowledgments

We gratefully acknowledge the support of the National Natural Science Foundation of P. R. China (Grant No. 60074014).

References
1. Nicola, M., Jarke, M. Performance modeling of distributed and replicated databases. IEEE Transactions on Knowledge & Data Engineering, 2000, 12(4): 645-672.
2. Zhe, J., Sun, Y.F. Optimized propagation algorithms for multi-replication definition. Journal of Software, 2003, 14(2): 230-236.
A FUZZY SET THEORETIC METHOD TO VALIDATE SIMULATION MODELS
MARTENS J. AND PUT F. Faculty of Economics and Applied Economics Catholic University of Leuven, Naamsestraat 69 Leuven 3000, Belgium
KERRE E. Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281/S9, Ghent 9000, Belgium

Validation is one of the most important steps in developing a reliable simulation model. It evaluates whether or not the model forms an accurate enough representation of the system of interest for the goals of the modelling study. Most of the methods that are currently available for validation are statistical techniques, and are as such binary in nature, in the sense that a model will turn out to be either fully valid or fully invalid. In this paper, we develop a new, fuzzy set theoretic method for validation that is continuous in nature, in the sense that it allows one to express degrees of model validity. We experiment with our method on a number of simulation models for a real-life airline network.
1. Introduction

In this paper, we present a new method for the validation of simulation model behaviour. By simulation model, we mean in fact any kind of model that imitates a system of interest in reality. Examples of simulation models are models of manufacturing processes, models of the functioning of call centres, airports and nuclear power plants, models of communication systems, material handling models, models of highway traffic, and so on. We define behaviour of a simulation model as a mathematical relation between statistics (means, variances and/or correlations) of stochastic input and output processes of the model. To give an example, for a model that simulates an airport, behaviour can be a relation between mean arrival delays and mean departure delays of aircraft at the airport. To give another example, for a model that simulates a manufacturing process, behaviour can be a relation between correlations among successive inter-arrival times of jobs and throughput time variances. We then conceive the process of validation as that of assessing the similarity between behaviour (or an estimate thereof) that is computed out of model generated data, and behaviour that is computed out of data of the real system that is simulated by the model.

The literature on modelling and simulation offers today a variety of different (behaviour) validation techniques. The most important techniques are statistical methods, including the method of confidence intervals, the method of regression meta-modelling, and methods based on a time series analysis; see Law and Kelton [5] for an overview of these and other techniques, and Balci and Sargent [2], Kleijnen and Sargent [4], and Naylor et al. [8] for details and example applications. Generally, these methods come down (one way or another) to the testing of the hypothesis that there is no difference between model and real system behaviour. The outcome of the test is binary, i.e. the validity of the model is either confirmed or rejected. Since it is however a commonly adopted point of view that model behaviour is bound to be different from real system behaviour, as not every detail of the system can be simulated by the model, it is our aim in this paper to develop a new method for behaviour validation that allows one to express degrees of model validity on a continuous scale.

The paper is organised as follows. First, in section 2, we review a concept that has recently been introduced in fuzzy set theory, and that is known as resemblance relations. Then, once we have indicated how a resemblance relation can be learned from applying a fuzzy inference algorithm to both simulation model and real system behaviour, we design our validation method in section 3. Finally, in section 4, we perform a number of experiments with our method in the context of a real-life airline network. The paper is ended by summarising the most important research findings, and by presenting some ideas for future work.
2. Rule base induced resemblance relations
Let K be a simulation model, the degree of validity of which needs to be determined in view of a real system A. Let K' be an ideal model for A; in the terminology of Zeigler [9], K' would be termed a base model for A. Denote the behaviour of K and K' by B(K) and B(K') respectively. Since the base model remains unknown, all that we have available on the real system side is a collection of data, out of which we can estimate the behaviour of K'. Denote this estimate by B̂(K'), and call it the behaviour of A.
We intend to judge upon the degree of validity of K by investigating whether the points in B(K) are reflected by approximately equal points in B̂(K'); thus, in case of the airport that we raised in the introduction, whether every pair of mean arrival and departure delays in B(K) has an approximately equal pair in B̂(K'). For that purpose, we make use of a concept that has recently been introduced in fuzzy set theory and that is known as resemblance relations; see De Cock and Kerre [3] for details and examples. For Ω a non-empty set, (X, d) a pseudo-metric space, and g a mapping from Ω into X, a fuzzy relation R on Ω is called a (g, d)-resemblance relation if and only if it holds for all ω, ω̃, ω' and ω̃' in Ω that (1) R(ω, ω) = 1, (2) R(ω, ω̃) = R(ω̃, ω), and (3) d(g(ω), g(ω̃)) ≤ d(g(ω'), g(ω̃')) implies R(ω, ω̃) ≥ R(ω', ω̃').

To give a typical example, assume that A₁, A₂, ..., A_k are fuzzy sets on Ω, and let g be the function that maps every point ω of Ω on the tuple (A₁(ω), A₂(ω), ..., A_k(ω)) of membership values. Then the fuzzy relation R on Ω, the membership function of which is defined in (ω, ω̃) by R(ω, ω̃) ≜ 1 − d(g(ω), g(ω̃)), is a resemblance relation. When d is the sup-pseudo-metric, defined by $d(x, x') \triangleq \max_{i=1,\ldots,k} |x_i - x_i'|$ for every x ≜ (x₁, x₂, ..., x_k) and x' ≜ (x₁', x₂', ..., x_k') in X, and each of the fuzzy sets A₁, A₂, ..., A_k has the interpretation of a particular characteristic, then the resemblance R(ω, ω̃) indicates whether ω and ω̃ share these characteristics to the same extent.

In order to measure the resemblance among behaviour points, we first apply a fuzzy inference algorithm to both the behaviour of K and the behaviour of A. Then, keeping everything as in the example above and putting Ω ≜ ℝ², we identify the fuzzy sets A₁, A₂, ..., A_k with cylindrical extensions into ℝ² of the antecedent and conclusion parts of all the rules that are learned by the algorithm (the antecedent and conclusion parts are fuzzy sets on ℝ). Because of the definition of the sup-pseudo-metric, the resemblance between two behaviour points will then be small if there is at least one rule for which the hit of one point to the cylindrical extension into ℝ² of the antecedent or conclusion part of the rule deviates significantly from the hit of the other point, whereas the resemblance will be large if and only if both points have similar hits to the cylindrical extensions into ℝ² of the antecedent and conclusion parts of all the rules. We call the resulting resemblance relation a rule base induced resemblance relation.
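As an illustration of the sup-pseudo-metric construction above, the short sketch below evaluates such a resemblance from a list of membership functions; the function names are illustrative.

```python
import numpy as np

def resemblance(w1, w2, fuzzy_sets):
    """(g, d)-resemblance of two points: R(w, w') = 1 - max_i |A_i(w) - A_i(w')|.

    fuzzy_sets: list of membership functions A_i mapping a point to [0, 1].
    """
    g1 = np.array([A(w1) for A in fuzzy_sets])
    g2 = np.array([A(w2) for A in fuzzy_sets])
    return 1.0 - np.abs(g1 - g2).max()
```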
3. Fuzzy sets of gradual validity propositions

In the line of the classic system theoretic notions of sub- and supersystems (see e.g. Zeigler [9]), we call K a subsystem (supersystem) of K' when B(K) ⊆ (⊇) B(K'). Introducing here a gradual version of the notion of sub- and supersystems, we say that K is a subsystem (supersystem) of K' with degree 1−q in case exactly 100(1−q)% of the points in B(K) belong to B(K') (or vice versa). We denote gradual sub- and supersystem statements by $B(K) \subseteq_{1-q} B(K')$ and $B(K) \supseteq_{1-q} B(K')$ respectively, with q ∈ [0, 1].

With the aim then to develop a useful notion of gradual validity, we label every gradual statement with a truth value (estimate), computed from a comparison of the points in B(K) to the points in B̂(K') as follows; for details, see Martens et al. [6]. For every behaviour point p of K, we look up whether there is a behaviour point of A that highly resembles p according to the rule base induced resemblance relation that we defined in section 2. We call the function that maps every behaviour point in B(K) to the highest resemblance value with regard to the behaviour points in B̂(K') a forward maximal resemblance function, and denote it by $\vec{m}_K$. In a similar way, we call the function that maps every behaviour point in B̂(K') to the highest resemblance value with regard to the behaviour points in B(K) a backward maximal resemblance function, and denote it by $\overleftarrow{m}_K$. In the following, we treat forward and backward maximal resemblance functions as real valued random variables.

Now, for p a certain behaviour point of K, the value of $\vec{m}_K$ at p can be interpreted as the truth value of the statement p ∈ B̂(K'). In case p remains unknown however, the qth quantile $x_q$ of $\vec{m}_K$ can still be looked upon as an (approximate) probabilistic lower bound on the truth value of the statement, for preferably small values of q in [0, 1]. In effect, it holds by definition of a quantile that the statement will be true with a truth value that exceeds $x_q$ with a probability of (approximately) 1−q. That being the case, an estimate of the truth value of $B(K) \subseteq_{1-q} B(K')$ should reasonably be in proportion to $x_q$ for relatively small values of q.

In the line of our reasoning above, the complement of the qth quantile of $\vec{m}_K$ can be looked upon as an (approximate) probabilistic lower bound on the truth value of the negation of the statement p ∈ B̂(K'), for preferably large values of q in [0, 1]. Indeed, applying again the definition of a quantile, the maximal resemblance of a random behaviour point of K will be lower than or equal to $x_q$ with (approximate) probability q, causing the truth value of the negation to have a probability of (approximately) q of being greater than or equal to $1 - x_q$. That being the case, an estimate of the truth value of $B(K) \not\subseteq_{q} B(K')$ should reasonably be in proportion to $1 - x_q$ for relatively large values of q. Since the statements $B(K) \subseteq_{1-q} B(K')$ and $B(K) \not\subseteq_{q} B(K')$ convey the same information, we define an estimate of their truth values, apart from a scaling factor, as a t-norm of the qth quantile $x_q$ of $\vec{m}_K$ and its complement $1 - x_q$. In particular, using Zadeh's t-norm, and denoting $a \triangleq \max_{q \in [0,1]}\{\min(x_q, 1 - x_q)\}$, we define the estimate by $\alpha_K(1-q) \triangleq \min(x_q, 1 - x_q)/a$, and this for every (1−q) ∈ [0, 1]. Clearly, $\{(1-q, \alpha_K(1-q))\}_{(1-q) \in [0,1]}$ constitutes a fuzzy set on [0, 1]. We call it a fuzzy set of gradual subsystem propositions.

In a completely similar way, replacing ⊆ by ⊇ and using the quantiles of $\overleftarrow{m}_K$, we define an estimate of the truth values of $B(K) \supseteq_{1-q} B(K')$ and $B(K) \not\supseteq_{q} B(K')$ by $\beta_K(1-q) \triangleq \min(x_q, 1 - x_q)/b$, with $b \triangleq \max_{q \in [0,1]}\{\min(x_q, 1 - x_q)\}$, and this for every (1−q) ∈ [0, 1]. Again, $\{(1-q, \beta_K(1-q))\}_{(1-q) \in [0,1]}$ constitutes a fuzzy set on [0, 1]. We call it a fuzzy set of gradual supersystem propositions.

Finally, we refer to a t-norm based intersection of the cylindrical extensions of $\alpha_K$ and $\beta_K$ into [0, 1]², and thus to the fuzzy set on [0, 1]² that is defined by $\gamma_K \triangleq \mathrm{cext}_{[0,1]^2}(\alpha_K) \cap \mathrm{cext}_{[0,1]^2}(\beta_K)$, as a fuzzy set of gradual validity propositions. For $\gamma_K$ a fuzzy set of gradual validity propositions, we call the fuzzy centroid of $\gamma_K$, i.e. $\int_0^1\!\int_0^1 \gamma_K(x, y)\min(x, y)\,dx\,dy \big/ \int_0^1\!\int_0^1 \gamma_K(x, y)\,dx\,dy$, an estimate of the average validity of K. We then say that K is valid to a degree that equals its average validity (estimate).
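A compact numerical sketch of this construction follows; the discretisation grid and the function names are our own illustrative choices, not part of the method's specification.

```python
import numpy as np

def average_validity(forward_mr, backward_mr, grid=101):
    """Estimate the average validity from samples of the forward and
    backward maximal resemblance functions (all values in [0, 1])."""
    t = np.linspace(0.0, 1.0, grid)                  # t stands for 1 - q

    def membership(samples):
        xq = np.quantile(samples, 1.0 - t)           # q-th quantiles, q = 1 - t
        m = np.minimum(xq, 1.0 - xq)                 # Zadeh t-norm of x_q, 1 - x_q
        return m / m.max()                           # scale factor a (resp. b)

    alpha = membership(np.asarray(forward_mr))       # gradual subsystem set
    beta = membership(np.asarray(backward_mr))       # gradual supersystem set

    gamma = np.minimum(alpha[:, None], beta[None, :])   # intersection on [0,1]^2
    weights = np.minimum(t[:, None], t[None, :])        # min(x, y) in the centroid
    return (gamma * weights).sum() / gamma.sum()        # fuzzy centroid
```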
4. Example
In this section, we compute the degree of validity of a simulation model that we built in the context of a punctuality study that we carried out for a (former) Belgian airline company; see Adem and Martens [1] for details on the model. The real system of interest is the central airport or hub in a network of flights. All flights are scheduled from the hub (Brussels) to a line station, or from a line station (Venice, Bordeaux, Amsterdam, etc.) back to the hub. When an aircraft arrives at the hub, it undergoes a process to be made ready for its next flight. This process is called rotation, and involves activities like unloading, cleaning, refuelling, etc. In this example, we select an important parameter that is used to determine the delay at departure of flights at the hub, i.e. the target rotation
time or TRT_hub, as a key parameter of the model that has to be decided upon before using the model for decision making.

Figure 1. Fuzzy sets (smoothed): (a) gradual sub- and supersystem propositions; (b) gradual validity propositions.

The decision to try out different values for TRT_hub comes from the fact that we are unsure which
of the values yields the most valid simulation model. We define the values of interest to be 40, 45, 50, 55 and 60 minutes. Each of these values hence points to a fully calibrated simulation model. We denote these models by K40 through K60. In this example, we define the behaviour of a simulation model as a relation between means of stochastic input processes, realisations of which are sequences of average daily arrival delays at the hub, and means of stochastic output processes, realisations of which are sequences of daily relative frequencies of on time departures at the hub. Every stochastic input or output process comes about under one particular setup of the parameters (different from TRT_hub) of the model. As such, each model has roughly 50 behaviour points. For the purpose of 'validating' our experimental results, we let the simulation model K40 play the role of a base model K' for the hub, and treat an estimate of its behaviour as the behaviour of the hub.

In order to learn a rule base induced resemblance relation, we employed a neuro-fuzzy inference algorithm, called NEFPROX; see Nauck et al. [7] for details on the algorithm. In particular, for every one of the simulation models K45, K50, K55 and K60, we estimated a rule base induced resemblance relation in the manner that we outlined in section 2, from estimates of the behaviour of the model and the base model. We then estimated the cumulative distribution of the forward and backward maximal resemblance functions of the models, and estimated from there the membership functions of all fuzzy sets of gradual sub- and supersystem propositions, as well as of all fuzzy sets of gradual validity propositions, according to our reasoning in section 3. We depicted some typical membership function estimates that we obtained in figure 1. The estimates of the average validity of the models K45, K50, K55 and K60 equalled respectively 0.62, 0.41, 0.27 and 0.20.

Comparing the membership function estimates that we showed in figure 1 for the models K45 and K50 with one another, and doing the same for the models K55 and K60, the results agree with what we expect to find from an intuitive point of view. For, the more the value of TRT_hub in a simulation model deviates from the ideal value (being 40 minutes), the more an estimate of a membership function that goes with the model should exhibit a sharp rise near the begin point of the unit interval or near the south west corner of the unit plane. In that respect, if we would now have to rank the simulation models in decreasing order of validity, taking into account the estimated validity degrees that we computed, then we would set up the intuitively correct ranking K45 ≻ K50 ≻ K55 ≻ K60, where ≻ reads as "is estimated to be more valid than".
5. Conclusion
In this paper, we designed a fuzzy set theoretic method that allows one to express degrees of model validity. We experimented with the method in the context of a real-life airline network, and showed how it can effectively be used to discriminate more from less valid simulation models. In the future, we intend to increase the scope of our experiment, and to generate computational results for additional modelling and validation problem cases. Also, we plan to investigate the sensitivity of the experimental results to the particular fuzzy inference algorithm that we used.

References
1. J. Adem and J. Martens. Analysing and improving network punctuality at a Belgian airline company using simulation. K.U.Leuven DTEW Research Report 0140, Katholieke Universiteit Leuven, 2001.
2. O. Balci and R.G. Sargent. Validation of simulation models via simultaneous confidence intervals. American Journal of Mathematical and Management Sciences, 4(3-4):375-406, 1984.
3. M. De Cock and E.E. Kerre. On (un)suitable fuzzy relations to model approximate equality. Fuzzy Sets and Systems, 133(2):137-153, 2003.
4. J.P.C. Kleijnen and R.G. Sargent. A methodology for the fitting and validation of metamodels in simulation. European Journal of Operational Research, 120(1):14-29, 2000.
5. A.M. Law and W.D. Kelton. Simulation Modeling and Analysis. McGraw-Hill, 2000.
6. J. Martens, F. Put, and E.E. Kerre. A fuzzy set and resemblance relation approach to the validation of simulation models. K.U.Leuven DTEW Research Report 0355, Katholieke Universiteit Leuven, 2003.
7. D. Nauck, F. Klawonn, and R. Kruse. Neuro-Fuzzy Systems. Wiley, 1997.
8. T.H. Naylor, K. Wertz, and T.H. Wonnacott. Spectral analysis of data generated by simulation experiments with econometric models. Econometrica, 37(2):333-352, 1969.
9. B.P. Zeigler. Theory of Modeling and Simulation. Academic Press, 1976.
APPLICATION IN EVALUATION OF LOESS COLLAPSIBILITY WITH INFORMATION DIFFUSION TECHNIQUE *

WANLI XIE†
Geological Department, Northwest University, No. 229 Taibai North Road, Xi'an 710069, People's Republic of China
Key Laboratory of the Continental Dynamics, The Ministry of Education of China, No. 229 Taibai North Road, Xi'an 710069, People's Republic of China

JIADING WANG, JINHU YUAN, RUI'E LI
Geological Department, Northwest University, No. 229 Taibai North Road, Xi'an 710069, People's Republic of China
In this paper, we introduce the method of information diffusion to evaluate the coefficient of loess collapsibility, based on 68 sets of collapsing-loess data from geo-technical tests in the Loess Plateau, China, in which moisture content, pore ratio and compression coefficient are chosen as the major affecting factors. Using the fuzzy information diffusion method, first and second fuzzy approximate reasoning, and information integration, we establish the fuzzy relationship matrices between each evaluating factor and the collapse coefficient. The model is verified by comparing the calculation with the experimental results of another ten data sets of collapsing loess: the method is simple, fast, practical and able to incorporate more factors to assess loess collapsibility objectively. It can be applied in disaster prevention planning for inhabited areas, in the compilation of urban geological maps in loess areas, and in the regional analysis of loess collapsibility.
1. Introduction
Loess collapsibility is a common phenomenon on the Loess Plateau of China. Research into loess collapsibility has two facets: the first is the mechanism of loess collapsibility; the second is the evaluation method of loess collapsibility. Many scholars have proposed various theories concerning the first facet, such as the theory of fusible rock of loess collapsibility, the theory of insufficient colloid of loess collapsibility, the mechanical theory of loess collapsibility, and the pulsatile theory of loess collapsibility [1, 2]. As to the second facet, scholars conventionally use the coefficient of loess collapsibility, together with other loess physical-mechanical indexes (such as moisture content, original pressure,

* This work is supported by the Key Natural Science Foundation of Shaanxi Province, China, No. 2001D05. † Corresponding author. Fax: +86-29-48304789. E-mail: [email protected].
etc.) to evaluate the degree of loess collapsibility [3]. Along with the increasing exploration and development of western China, disaster prevention and preparation in urban areas, along with the creation of urban geological maps, becomes increasingly important. It is impractical and uneconomic to obtain collapse coefficients of loess and physical-mechanical coefficients through geotechnical tests alone. Thus, geologists need to set up corresponding mathematical models to assess loess collapsibility based on available materials and information. The new approach should be prompt and trend-oriented. In fact, processing this kind of problem with fuzzy information optimization [4-7] is most appropriate, since the categorization of the quantity of collapsibility is itself an ambiguous and fuzzy concept. Guan [8] set up a loess collapsibility evaluation method with a synthetical fuzzy method. Wang [9] used the information distribution method to investigate the relationship between moisture content and collapsibility coefficient. The basic property of the information diffusion method is that we can change an observation into a fuzzy subset that naturally fills up the information gaps caused by incomplete data. In this paper, moisture content, pore ratio and compression coefficient are used as evaluating criteria; a loess collapsibility evaluation method using information diffusion principles is then set up, based on 68 sets of loess collapsibility data.

2. The Theory of Fuzzy Information Optimization
The basic characteristic of fuzzy information optimization is fuzzy pattern recognition of necessary or statistical rules, based on primary information. This information, in the form of data, usually shows no obvious rules. The advantage of fuzzy information optimization is not to study the measurement of information, but to analyze the content and structure of the information, from which we may discover useful natural laws in the form of an information matrix. This is a more flexible way of processing data than functional relationships or fuzzy relationships. The information matrix is built with the information diffusion method, which avoids artificial factors disturbing the objectivity of the primary information data. Because the information diffusion method focuses on the analysis and processing of the form and structure of the primary data, it moves directly from primary information to the system model, and the interim process of constructing membership functions is eliminated.
2.1. Information diffusion

The fuzzy relation R describes a group of observed events and the rules between the causal set U and the result set V. R is determined by fuzzy identification through fuzzy approximate reasoning. In this paper, the information diffusion method is used to determine the fuzzy relationship. The model is as follows. Suppose a universe Ω which contains the range of the model's basic variable to be processed. Any piece of primary information is dispersed to all points in Ω under certain rules, so each point in Ω receives information from multiple primary information sources. For each point in Ω, we combine all the information received and obtain a primary information matrix. After standardization [10], we reach the fuzzy relationship R. The process of information diffusion is shown in figure 1.
Figure 1. Sketch map of information diffusion
2.2. Fuzzy approximate reasoning

For the independent variable field U and the dependent variable field V we have $U = \{u_1, u_2, \ldots, u_{m_1}\}$ and $V = \{v_1, v_2, \ldots, v_{n_1}\}$. Suppose $A_i$ and $B_i$ are fuzzy subsets of U and V respectively, and the fuzzy relation is R; then the following model can be presented [10]:

$B_i = A_i \circ R$,  (1)

where $\circ$ denotes the matrix multiplication method.
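A minimal sketch of Eq. (1) follows, taking the composition as the ordinary matrix product (which the worked example in section 3.2.2 is consistent with); the names are illustrative.

```python
import numpy as np

def approximate_reasoning(A, R):
    """Fuzzy approximate reasoning B = A o R with o = matrix multiplication.

    A: membership vector over U (length m1);  R: m1 x n1 fuzzy relation matrix.
    Returns the induced membership vector B over V (length n1).
    """
    return np.asarray(A) @ np.asarray(R)
```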
2.3. Information integration

A particular data set itself may not be fuzzy; however, after information diffusion processing it inherits fuzzy characteristics. In order to offset the influence of information diffusion and to get the optimal forecast value, an information integration method is utilized, as in formula (2) [11]:

$\delta' = \sum_{i=1}^{n} (B_i')^K v_i \Big/ \sum_{i=1}^{n} (B_i')^K$,  (2)

where $\delta'$ is the final result of the induction, $B_i'$ is the possibility distribution obtained from the secondary fuzzy approximate induction, $v_i$ is the stratified value of the induction, and K = 2.
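The integration step of Eq. (2) is a squared-membership weighted average; a one-function sketch (names illustrative):

```python
import numpy as np

def integrate(B_prime, v, K=2):
    """Information integration, Eq. (2): weight each level v_i by (B'_i)^K."""
    w = np.asarray(B_prime) ** K
    return (w * np.asarray(v)).sum() / w.sum()
```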
3. Information Diffusion Method in the Evaluation of Loess Collapsibility

There are many factors influencing the collapsibility of loess, such as moisture content, pore ratio, compression coefficient, initial pressure, level of rainfall infiltration, depth of the loess deposit, the micro-structure of the loess, etc. In this paper, three factors were selected: moisture content (w), pore ratio (e) and compression coefficient (a₁₋₂); and the fuzzy relationships between these indicators and the loess collapsibility coefficient (δs) were established. We gathered 68 sets of loess collapsibility geo-technical test data in the Loess Plateau area of China (including Shaanxi, Gansu, Ningxia, Shanxi, Henan, et al.), as shown in Table 1.
Tab. 1 The loess collapse data in the Loess Plateau, China

| w    | e     | a₁₋₂  | δs     | w    | e     | a₁₋₂  | δs     | w    | e     | a₁₋₂  | δs     |
|------|-------|-------|--------|------|-------|-------|--------|------|-------|-------|--------|
| 7.0  | 0.850 | 0.038 | 0.0320 | 15.0 | 1.000 | 0.361 | 0.0570 | 20.4 | 1.109 | 0.633 | 0.0813 |
| 7.2  | 0.860 | 0.084 | 0.0140 | 15.2 | 1.006 | 0.380 | 0.0400 | 20.8 | 1.120 | 0.636 | 0.0835 |
| 7.3  | 0.867 | 0.130 | 0.0350 | 15.3 | 1.012 | 0.390 | 0.0710 | 21.0 | 1.125 | 0.676 | 0.0720 |
| 7.6  | 0.875 | 0.136 | 0.0370 | 15.6 | 1.020 | 0.395 | 0.0335 | 21.4 | 1.127 | 0.680 | 0.0325 |
| 7.9  | 0.894 | 0.176 | 0.0375 | 16.5 | 1.024 | 0.400 | 0.1065 | 22.0 | 1.138 | 0.682 | 0.0630 |
| 8.0  | 0.900 | 0.208 | 0.0458 | 16.8 | 1.025 | 0.432 | 0.0610 | 22.2 | 1.150 | 0.694 | 0.0380 |
| 8.1  | 0.901 | 0.210 | 0.0207 | 17.0 | 1.040 | 0.444 | 0.0270 | 22.8 | 1.175 | 0.735 | 0.0714 |
| 8.5  | 0.902 | 0.212 | 0.0460 | 17.1 | 1.042 | 0.452 | 0.0837 | 23.0 | 1.200 | 0.750 | 0.1330 |
| 9.0  | 0.925 | 0.215 | 0.0270 | 17.5 | 1.044 | 0.470 | 0.1135 | 23.2 | 1.210 | 0.860 | 0.0680 |
| 9.4  | 0.940 | 0.222 | 0.0520 | 17.6 | 1.054 | 0.478 | 0.0465 | 23.3 | 1.253 | 0.870 | 0.0970 |
| 9.9  | 0.950 | 0.230 | 0.0332 | 17.9 | 1.056 | 0.480 | 0.0500 | 23.5 | 1.396 | 0.890 | 0.1400 |
| 10.0 | 0.951 | 0.240 | 0.0600 | 18.0 | 1.058 | 0.502 | 0.0480 | 24.0 | 1.400 | 0.900 | 0.1030 |
| 10.8 | 0.955 | 0.254 | 0.0410 | 18.2 | 1.073 | 0.530 | 0.0280 | 7.50 | 0.967 | 0.082 | 0.0353 |
| 11.0 | 0.966 | 0.256 | 0.0300 | 18.5 | 1.074 | 0.545 | 0.0345 | 8.20 | 1.068 | 0.370 | 0.0480 |
| 11.7 | 0.971 | 0.260 | 0.0330 | 18.9 | 1.075 | 0.560 | 0.0963 | 11.8 | 0.956 | 0.303 | 0.0340 |
| 12.0 | 0.975 | 0.264 | 0.0346 | 19.0 | 1.078 | 0.573 | 0.0700 | 12.6 | 0.835 | 0.213 | 0.0354 |
| 12.4 | 0.980 | 0.277 | 0.0390 | 19.2 | 1.080 | 0.575 | 0.0750 | 14.2 | 1.004 | 0.492 | 0.0468 |
| 12.8 | 0.982 | 0.295 | 0.0398 | 19.4 | 1.084 | 0.593 | 0.0730 | 15.8 | 1.036 | 0.618 | 0.0540 |
| 13.5 | 0.984 | 0.300 | 0.0590 | 19.5 | 1.090 | 0.590 | 0.0420 | 17.4 | 1.121 | 0.548 | 0.0790 |
| 13.8 | 0.990 | 0.320 | 0.0450 | 19.6 | 1.096 | 0.600 | 0.0560 | 21.5 | 0.970 | 0.790 | 0.0594 |
| 14.0 | 0.990 | 0.332 | 0.0620 | 19.8 | 1.100 | 0.610 | 0.1032 | 22.5 | 1.003 | 0.930 | 0.0380 |
| 14.5 | 0.991 | 0.350 | 0.0580 | 20.0 | 1.102 | 0.620 | 0.0780 | 23.6 | 1.145 | 0.767 | 0.1020 |
| 14.6 | 0.999 | 0.360 | 0.0310 | 20.3 | 1.103 | 0.632 | 0.0681 |      |       |       |        |
3.1. Setting up the fuzzy relationship matrix between the collapsibility coefficient and its affecting factors

We first analyze the fuzzy relation between the loess collapsibility coefficient and the water content. From Table 1 we construct two discussion sets:

U_w = {w₁, w₂, ..., w₁₁} = {7, 8.7, 10.4, 12.1, 13.8, 15.5, 17.2, 18.9, 20.6, 22.3, 24},
V_δ = {v₁, v₂, ..., v₇} = {0.014, 0.035, 0.056, 0.077, 0.098, 0.119, 0.14},

where U_w is the discussion set of loess water content, with step size 1.7, and V_δ is the discussion set of the loess collapsibility coefficient, with step size 0.021.
We set up the fuzzy relationship matrix with the two-dimensional normal information diffusion formula [11], which diffuses each observation $(u_j, v_j)$ onto every point $(u, v)$ of the grid $U \times V$:

$q(u, v) = \sum_{j=1}^{m} \exp\Big(-\dfrac{(u' - u_j')^2 + (v' - v_j')^2}{2h^2}\Big)$,  (3)

where $u' = (u - a_1)/(b_1 - a_1)$, $v' = (v - a_2)/(b_2 - a_2)$, $u_j' = (u_j - a_1)/(b_1 - a_1)$, $v_j' = (v_j - a_2)/(b_2 - a_2)$; $a_1 = \min_j\{u_j\}$, $b_1 = \max_j\{u_j\}$, $a_2 = \min_j\{v_j\}$, $b_2 = \max_j\{v_j\}$; and $h = 1.4208/(m - 1)$, where m is the sample number. According to formula (3), we set up the fuzzy relationship matrix with 58 of the data sets in Table 1; the remaining 10 sets were used to verify R. Thus m = 58, a₁ = 7.0, b₁ = 24.0, a₂ = 0.014, b₂ = 0.14, u ∈ U_w, v ∈ V_δ, and u_j, v_j are the water content and collapsibility of the 58 data sets, respectively. In the same way, we set up the fuzzy relationships of e and a₁₋₂ with δs; thus we derived the primary information distribution matrices Q_{w,δ} (11×7), Q_{e,δ} (11×7) and Q_{a₁₋₂,δ} (11×7). After standardization of Q_{w,δ}, Q_{e,δ} and Q_{a₁₋₂,δ} on a row basis, we get the fuzzy relationship matrices R_{w,δ}, R_{e,δ} and R_{a₁₋₂,δ} between the loess collapsibility coefficient and moisture content, pore ratio and compression coefficient, respectively.
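A sketch of the diffusion step follows. The exact expression of formula (3) is partially garbled in our source, so the Gaussian kernel below (the standard two-dimensional normal diffusion of reference [11]) should be read as an assumption; the row standardization mirrors the text.

```python
import numpy as np

def diffusion_matrix(samples_u, samples_v, grid_u, grid_v):
    """Two-dimensional normal information diffusion onto a grid, followed by
    row standardization to obtain the fuzzy relationship matrix R."""
    u, v = np.asarray(samples_u, float), np.asarray(samples_v, float)
    h = 1.4208 / (len(u) - 1)                         # diffusion coefficient

    def norm(x, lo, hi):                              # map coordinates to [0, 1]
        return (x - lo) / (hi - lo)

    un, vn = norm(u, u.min(), u.max()), norm(v, v.min(), v.max())
    gu = norm(np.asarray(grid_u, float), u.min(), u.max())
    gv = norm(np.asarray(grid_v, float), v.min(), v.max())

    Q = np.zeros((len(gu), len(gv)))
    for uk, vk in zip(un, vn):                        # spread each observation
        Q += np.exp(-((gu[:, None] - uk) ** 2 + (gv[None, :] - vk) ** 2)
                    / (2 * h ** 2))
    return Q / Q.max(axis=1, keepdims=True)           # standardize each row
```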
[R_{w,δ}, an 11 × 7 row-standardized matrix, is displayed here in the original, together with R_{e,δ} and R_{a₁₋₂,δ}.]
3.2. Fuzzy approximate reasoning

We verify with the 10 data sets of loess collapsibility in Table 1, using fuzzy approximate reasoning. There are various methods of deriving $A_i$ in formula (1); we use the one Wang [11] proposed:
(i) when $a \le a_{\min}$, $a_{\min} \in A_i$: $A_i = [1, 0, \ldots, 0]$;
(ii) when $a \ge a_{\max}$, $a_{\max} \in A_i$: $A_i = [0, 0, \ldots, 1]$;
(iii) when $a_{\min} < a < a_{\max}$: $A_i = [\max(0,\ 1 - |a - a_i|/\Delta)]$, i = 1, 2, ..., m,
where Δ is the step size, $\Delta = a_{i+1} - a_i$.

3.2.1. First fuzzy approximate reasoning

We select the 10 data sets in the latter part of Table 1 to illustrate the process of fuzzy approximate reasoning. With the primary data (w, e, a₁₋₂, δs) = (15.8, 1.036, 0.618, 0.054), we use w = 15.8, e = 1.036 and a₁₋₂ = 0.618 respectively to induce the value of δs. According to the formulas above, when w = 15.8 we get

A_w = [0, 0, 0, 0, 0, 0.824, 0.176, 0, 0, 0, 0],
B_w = A_w ∘ R_{w,δ} = [0, 0.8618, 0.5592, 0.1796, 0.0059, 0.1260, 0].

In the same way, when e = 1.036 and a₁₋₂ = 0.618 we obtain

B_e = [0, 0.7257, 0.6063, 0.2251, 0, 0, 0],
B_{a₁₋₂} = [0, 0.0103, 0.2060, 0.8075, 0.7293, 0, 0].

Since B_w, B_e and B_{a₁₋₂} derived above are the results of single affecting factors, and each factor affects loess collapsibility differently, a secondary fuzzy approximate induction must be conducted to be more objective.

3.2.2. Secondary fuzzy approximate reasoning

The result of the secondary fuzzy approximate induction comes from the combination of the weight array A' and the fuzzy relation matrix R'. From the first fuzzy approximate induction we obtained the fuzzy vectors B_w, B_e and B_{a₁₋₂} as factors that influence loess collapsibility, which form a 3 × 7 matrix used as the fuzzy relationship matrix of the secondary induction. Using the induction once more, we finally get the induction result for loess collapsibility. The secondary fuzzy induction formula is

B' = A' ∘ R'.  (4)
Here A' holds the weight of each single indicator. In this paper the weights were standardized, giving weights for w, e and a₁₋₂ of 0.26, 0.46 and 0.28 respectively, in the form of the array A' = [0.26, 0.46, 0.28]. R' is the combination of B_w, B_e and B_{a₁₋₂}:

R' = [B_w; B_e; B_{a₁₋₂}] =
[0  0.8618  0.5592  0.1796  0.0059  0.1260  0]
[0  0.7257  0.6063  0.2251  0       0       0]
[0  0.0103  0.2060  0.8075  0.7293  0       0]

B' = A' ∘ R' = [0, 0.5608, 0.4820, 0.3763, 0.2057, 0.0328, 0].

B' cannot be taken as the final result; we process it with the information integration method and take that result as final.
3.3. Information integration

Deriving the final result with the information integration formula (2):

δs' = (0²×0.014 + 0.5608²×0.035 + 0.4820²×0.056 + 0.3763²×0.077 + 0.2057²×0.098 + 0.0328²×0.119 + 0²×0.140) / (0² + 0.5608² + 0.4820² + 0.3763² + 0.2057² + 0.0328² + 0²) = 0.05356.

The derived δs' (0.05356), computed from w = 15.8, e = 1.036 and a₁₋₂ = 0.618, is very close to the actual value δs (0.054). In the same way, we verified the latter 10 sets of data in Table 1; the results are shown in Table 2.
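The whole chain for one record, from the first-induction vectors to the integrated coefficient, can be checked numerically; a sketch reproducing the worked example above (the vectors are copied from the text):

```python
import numpy as np

# First-induction results for w = 15.8, e = 1.036, a_{1-2} = 0.618
R_prime = np.array([
    [0, 0.8618, 0.5592, 0.1796, 0.0059, 0.1260, 0],   # B_w
    [0, 0.7257, 0.6063, 0.2251, 0,      0,      0],   # B_e
    [0, 0.0103, 0.2060, 0.8075, 0.7293, 0,      0],   # B_{a1-2}
])
A_prime = np.array([0.26, 0.46, 0.28])                 # indicator weights
V = np.array([0.014, 0.035, 0.056, 0.077, 0.098, 0.119, 0.140])

B_prime = A_prime @ R_prime                            # secondary induction, Eq. (4)
w2 = B_prime ** 2
print(B_prime.round(4))              # [0, 0.5608, 0.482, 0.3763, 0.2057, 0.0328, 0]
print((w2 @ V) / w2.sum())           # ~0.0536, close to the measured 0.054
```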
Tab. 2 Computed and original values of the collapsibility coefficient

| w    | e     | a₁₋₂  | δs' (computed) | δs (original) |
|------|-------|-------|----------------|---------------|
| 7.50 | 0.967 | 0.082 | 0.0343         | 0.0353        |
| 8.20 | 1.068 | 0.370 | 0.0509         | 0.0480        |
| 11.8 | 0.956 | 0.303 | 0.0354         | 0.0340        |
| 12.6 | 0.835 | 0.213 | 0.0311         | 0.0354        |
| 14.2 | 1.004 | 0.492 | 0.0488         | 0.0468        |
| 15.8 | 1.036 | 0.618 | 0.0536         | 0.0540        |
| 17.4 | 1.121 | 0.548 | n/a            | 0.0790        |
| 21.5 | 0.970 | 0.790 | n/a            | 0.0594        |
| 22.5 | 1.003 | 0.930 | n/a            | 0.0380        |
| 23.6 | 1.145 | 0.767 | n/a            | 0.1020        |
3.4. Error analysis

The error analysis in this paper is based on the mean square deviation:

$S = \frac{1}{n}\sum_{i=1}^{n}\big[P(v_i) - \hat{P}(v_i)\big]^2$,  (5)

where $P(v_i)$ is the actual collapsibility coefficient δs and $\hat{P}(v_i)$ is the computed
Conclusion
The verification by data set shows that forecasting the loess collapsibility coefficient with the fuzzy information optimization method results in more accuracy and smaller variance. This method is applicable in town disaster prevention, geology engineering graphic series drawing, and regional loess collapsibility analysis. It is a method that is simple, easy to use, and can incorporate more variables to objectively evaluate loess collapsibility.
Acknowledgments
The author wishes to express his most sincere appreciation to Prof. Chongfu Huang ,who read the manuscript carefully and gave valuable advice. References 1. ED. Derbyshire And T. A. Djikstra, “Failure mechanism in loess and the effects of moisture content changes on remoulded strength ”, Quaternary International 24, 5- 15 (1994). 2. J. D. Wang, Zh. Y. Zhang and B. X. Li, “The Mechanism of Pulsatile Liquefaction on loess collapsibility caused by dead load ”, Geographic Science 19, 271-275 (1999). 3 . The planning committee of Shannxi Province, The building code in collapsible loess region (GBJ 25 -90)(The Planning Press of China, Beijng) (1991). 4. C. F. Huang, “Information matrix and application”, General Systems 30, 603-622 (2001). 5. C. F. Huang, “Principle of information diffusion”, Fuzzy Sets and Systems 91, 69-90 (1997). 6. C. F. Huang and L. Yee, “Estimating the relationship between isoseismal area and earthquake magnitude by hybrid fuzzy-neural-network method”, Fuzzy Sets andSystems 107, 131-146 (1999). 7. C. F. Huang and Y. Shi, Towards Efficient Fuzzy Information Processing -- Using the Principle of Information Diffusion (Physica-Verlag (Springer), Heidelberg, Germany) (2002). 8. W. Zh. Guan, New Edition on Engineering Properq of Loess Collapsibility (Xi’an Jiaotong University Press, Xi’an , China) (1990). 9. J. D. Wang, “Geography research by the method of h z z y information optimization processing”, Geography and Territorial Research 15, 75-80 (1999). 10. C. F. Huang and D. Ruan, “Information difhsion principle and application in fizzy neuron”, Fuzzy Logic Foundations and Industrial Applications (Kluwer Academic Publishers, Massachusetts) 165-189 (1996). 11. C. F. Huang and J. D. Wang, Technology of Fuzzy Information Optimization Processing and Applications (Beijing University of Aeronautics and Astronautics Press, Beijing, China)( 1995).
MODE OF SOFT RISK MAP MADE BY USING INFORMATION DIFFUSION TECHNIQUE *
JUNXIANG ZHANG, CHONGFU HUANG
College of Resources Science & Technology, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing 100875, China. E-mail: [email protected]
SEN QIAO Seismological Bureau of Yunnan Province, Kunming 650204, China
There is no believable risk map, because of the tremendous imprecision of natural disaster risk assessment. To improve the spatial representation of risk information in cartographic form, an alternative mode of risk maps is required. A fuzzy risk represented by a possibility-probability distribution, calculated by employing the interior-outer-set model as an information diffusion technique, can represent the imprecision. In this paper, we introduce fuzzy risk into the domain of risk maps and investigate the appropriate way of transforming numeric fuzzy risks into risk maps. A mode of soft risk map is defined which is more representative of risk information, with an example of a soft risk map with respect to strong earthquakes in Yunnan province. A comparison between the traditional risk map and the soft risk map is presented.
1. Introduction

Risk information that is portrayed spatially in cartographic form is conventionally represented by risk maps. In general, the risk is considered as the possibility of the occurrence of natural disasters, and probability is usually employed to measure this possibility [1]. When we study a risk system using a probabilistic method, it is usually difficult to judge whether a hypothesized probability distribution is suitable, and sometimes we meet the small sample problem [6], where the data are too scanty to support a decision in any classical approach. It means that it is impossible to precisely estimate the
358
359 probability with respect to the risk within a specified error range. Furthermore, the scheme of representation relating to traditional risk maps has great difficulty dealing with situations in which risk can not be precisely described by a single value. As a result, there is no believable risk map because of the tremendous imprecision of natural disaster risk estimation. However, until now, there is few studies on how to make risk maps under conditions of the tremendous imprecision of the risk assessment. Therefore, it is still an interesting question. As this is not often possible for showing the imprecision of risk estimation in terms of probabilities in risk maps, we offer the ”fuzzy (soft) risk map” by employing the information diffusion technique5. In this paper, the theory of fuzzy risk3 with incomplete data set4 is introduced into the domain of risk maps and a new approach t o risk map by using the information diffusion technique is presented.
2. Fuzzy Risk with Respect to Natural Disasters Risk assessments are typical issues with imprecision, uncertainty and partial truth. A fuzzy risk represented by a possibility-probability-distribution3, which is calculated by employing the interior-outer-set model7 as an information diffusion technique, can represent the imprecision. A natural way to improve the quality of risk maps is to introduce the concept of fuzzy risk to overcome difficulties resulted from incomplete knowledge. In terms of fuzzy risk theory, we suppose that we cannot obtain a probability-risk p ( r n ) , but we can obtain possibility-probabilitydistribution(PPD) as a fuzzy risk
Where M is the universe of the risk event and P is the universe of the probability of occurrence. A fuzzy risk with respect to a kind of natural disaster is not one value but a fuzzy set7. That is, a fuzzy risk is a multi-valued risk, which can offer more information. Consequently, Fuzzy risks are more complex to represent spatially in cartographic form, however they provide more expressive and accurate results. The scheme of representation relation to traditional risk maps for the cartographic representation of fuzzy risks are inadequate because they do not tolerate imprecision with a single risk values. For the representation of multi-valued risk information in cartographic form, an alternative mode of risk maps is needed.
360 3. Soft Risk Map of Strong Earthquakes in Yunnan
Province The purpose of this section is to show how imprecision of risk estimation resulted from small sample may be incorporated into the domain of risk maps for producing risk maps of high accuracy and confidence. In this section, We take the risk map of strong earthquakes in the 7 seismic zones or belts in Yunnan province, China(Fig.1) as an example to illustrate the procedure of transforming numeric fuzzy risks into cartographic form with respect to natural disasters.
Figure 1. Distribution map of seismic zones or belts in Yunnan province.
There are 6-10 seismic records observed in each of the seismic zones or belts from 1900 t o 1979 with magnitudes in the range of 5.0-7.7. The sizes of these samples are so small that it is impossible to precisely estimate the underlying probability distributions. We employ IOSM to calculate fuzzy risks of strong earthquakes in these zones. The method of IOSM has been detailed in the reference 3, so it is not necessary to rehearse the technique in detail once again. For a fuzzy risk ~ ( mof )strong earthquakes in one of the zones in Fig.1, b'a E [0,1], let
361
F,(m) is called the maximum probability in a-cut with respect t o m. Next, let
p (m)and &(m) are the normalization of ~ ( mand ) Fa(m>, respec-a tively. Then, let
E,(m) and Fa(m)are the fuzzy expected risk values relating t o the minimum probabilities and the maximum probabilities, respectively, in the a-cut of ~ ( mwith ) respect to m. Finally, we use the fuzzy expected risk values, &(m) and F,(m) as parameters for transforming numeric fuzzy risks into cartographic form. Thus, by using the a-cut method7, we can provide a series of risk maps. They are produced by employing the information diffusion technique, which is regarded as one of the soft computing8 approaches, these maps are called soft risk maps. A fuzzy risk represents the imprecision. A a-cut represents some confidence in the estimate. In this paper, we provide the soft risk map with respect to 0.25-cut of fuzzy risks of strong earthquakes in Yunnan province, China(Fig.2). The soft risk map includes two specific maps: (1)conservative risk map, which is produced by using the fuzzy expected values with respect to the minimum probabilities in a-cut of fuzzy risks; (2)risky risk map, which is produced by using the fuzzy expected values of the maximum probabilities in a-cut of fuzzy risks. From Fig.3, we know that soft risk map is a kind of risk map in which each area are assigned two risk values. The double risk values represent the imprecision. The 0.25-cut represents some confidence in the estimate.
4. Comparison Between Soft Risk Map and Traditional Risk Map The form of traditional risk maps is given in Fig.3. Traditional risk maps are resulted Gom the probabilistic* or deterministic approach'. Traditional methods for representing risk information are inadequate because they do not tolerate imprecision with a single risk value. Fig.3 indicates that,in traditional risk map, each area is assigned a crisp risk value. This assignment is
362 c
a. conservative risk map
Figure 2.
Figure 3.
b. risky risk map
Soft risk map of strong earthquakes in Yunnan province.
Traditional risk map of strong earthquakes in Yunnan province.
unique for that area. However, in practical situations, risks with respect to natural disasters are impossible to estimate with precision. The expressive inadequacy is largely due to the traditional approaches used t o estimate risks. Expressive inadequacy may lead to loss of valuable information and reduction of accuracy of analysis. Apparently, the scheme of representation relevent to soft risk maps is superior t o that of traditional risk maps. Soft risk map is resulted from fuzzy risks obtained by employing the technique of information diffusion. Fuzzy risk is a multi-valued risk which can represent the imprecision of risk estimations. A a-cut represents some confidence in the estimate. In contrast, soft risk map can provide a better representation for risk information
363 leading to more satisfying results. In this respect, soft risk map do meet the demands of risk information users. Soft risk maps can provide more risk information for a variety of businessmen who invest in dangerous projects(e.g., nuclear power plants) or non-dangerous projects(e.g., flower shops). A soft risk map with Q >- 0 might be useful for the investors in a nuclear power plant, An owner of a flower shop might e interested in the soft map with Q = 1 when he chooses an insurance company to buy a property insurance policy for his shop. 5 . Conclusion
A fuzzy risk is a multi-valued risk. The benefit of this is that one can easily understand the imprecision of risk assessment of natural disasters in case of lack of data. Soft risk map resulted from fuzzy risk method can tolerate imprecision and provide more information for map users. Soft risk map is the normal update of traditional risk map.
Acknowledgement The work on this paper was done in Key Laboratory of Environmental Change and Natural Disaster, The Ministry of Education of China. References 1. Bazzuro, Paolo, and C.A. Cornell (1999). Disaggregation of seismic hazard, Bull. Seism. SOC.Am., 89(2), 501-520. 2. C. A. Cornell(l968), Engineering seismic risk analysis, Bull. Seism. SOC.Am., 5 8 , 1583-1606. 3. C.F. Huang(1998), Concepts and methods of fuzzy risk analysis. Risk Research and Management in Asian Perspective (edited by Beijing Normal University and et al., International Academic Publishers, Beijing), 12-23. 4. C.F. Huang(1997), Principle of information diffusion, Fuzzy Sets and Systems, 91(1),69-90. 5. C.F. Huang, D. Ruan( 1996). Information diffusion principle and application in fuzzy neuron, Fuzzy Logic Foundations and Industrial Applications (edited by Da Ruan, Kluwer Academic Publishers, Massachusetts), 165-189. 6. C.F. Huang, Peijun Shi(1999), Fuzzy risk and calculation, Proceedings of 18th International Conference of the North American Fuzzy Information Processing Society, New York, June, 90-94. 7. C.F. Huang, Y. Shi(2002), Towards efficient fuzzy information processing - Using the principle of information diffusion, Physica-Verlag (Springer), Heidelberg, Germany. 8. L.A. Zadeh(1994), Soft computing and fuzzy logic, I E E E Software, 11(6), 48-56.
BENEFIT OF SOFT RISK MAP MADE BY USING INFORMATION DIFFUSION TECHNIQUE *
CHONGFU HUANG
Institute of Resources Technology and Engineering, College of Resources Science and Technology, Beijing Normal University, Beijing 100875, China. E-mail: [email protected]
HIROSHI INOUE School of Management, Tokyo University of Science Kuki, Saitama 346 Japan
In this paper, we discuss the benefit of the soft risk map calculated by the interior-outer-set model. We suppose that a company will invest in three projects in a district consisting of four zones with different flood risks. The company chooses zones based on the risks, besides water resource and cost. The result shows that the soft risk map is better than the traditional risk map.
1. Introduction

Due to the complexity of natural disaster systems, it is impossible to accurately estimate the risks of natural disasters within a specified error range. As a result, there is no believable risk map. In this case, fuzzy probabilities can be used to represent fuzzy risks and give a new kind of risk map, called a soft risk map. With given samples, we can employ the interior-outer-set model (IOSM) [1] as an information diffusion technique [2] to calculate the fuzzy probability distributions serving the map. In this paper, we discuss the benefit of the soft risk map from the IOSM with respect to the optimization of resource allocation. In this research we suppose that, in a map, the real risks, in terms of their underlying probability distributions, are known, and some samples from the populations are given. Using the histogram method to analyze the samples, we produce a traditional risk map, and using the interior-outer-set model, a soft risk map. Then, we suppose that a company will invest in three projects in the district shown in the map. To promote the optimization of resource allocation, we choose zones for the projects based on a risk map, besides other restrictions.

* Project supported by the National Natural Science Foundation of China, No. 40371002.
There is no loss in generality when we suppose that zones Al,Az,A3,Ad have parameters X = 9,15,100,25, respectively. In this case, real risks are 0.111,0.0667,0.01,0.04,respectively, shown in Fig. 1. Using the generato? of random numbers obeying an exponential distribution to these parameters, with seed number 574985 and size n = 10, we obtain the following four samples regarded as data of flood events occurred
366 in the last 10 years in zones A1, A2, As, A4, respectively.
X i = {0.093,0.228,0.115,0.081,0.079,0.005,0.347,0.066,0.037,0.721}, X , = {0.056,0.137,0.069,0.048,0.047,0.003,0.208,0.039,0.022,0.433}, X3 = { 0.008,0.020,0.010,0.007,0.007,0.000,0.031,0.005,0.003,0.064}, X4 = {0.033,0.082,0.041,0.029,0.028,0.002,0.125,0.023,0.013,0.260}. Practically impossible to prove if the probability distribution of the flood index is an exponential distribution or other theory distribution, such as normal distribution. Hence, if only given samples are available, we use the histogram model(HM):
1 Ij(z E I j ) = -(number of zi in the same bin as x).
(2)
n
to estimate flood risk for making a traditional risk map. With respect to HM, Otness and Encysin3 proved that the asymptotic optimum estimate of the bin number m is m = 1.87(n - l)2/5.
(3)
Hence, for the given samples, we obtain: m = 1.87(10 - 1)2/5 M 4 to be the number of the intervals I j , j = 1 , 2 , . . . , m, for HM. From X I ,X2, X3, X4, we obtain HM risks shown in Fig.1. Using IOSM, with the same intervals in HM, from X I , X ~ , X ~ we ,X~, obtain four possibility-probability distributions. For example, from X I , we PO PI PZ P3 P4 P5 P6 P7 P8 P9 PI0 0.00 0.07 0.08 0.11 0.16 0.32 0.50 1.00 0.26 0.00 0.00 0.26 0.41 1.00 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.41 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
,
(4)
where I_j = [0.005 + (j − 1)h, 0.005 + jh), h = 0.179, j = 1, 2, 3, 4, and p_k = k/10, k = 0, 1, 2, ..., 10. From Eq.(4) we know that the probability of a flood event occurring in interval I1 would be in [p6, p8] with possibility more than 0.26. The probability interval with possibilities can be represented as in Eq.(5):
PI_j = [π_j/a_j, π′_j/b_j],  (5)
where π_j, π′_j are possibilities and a_j, b_j are probabilities. For example, from Eq.(4) we obtain
PI_1 = [0.50/p6, 0.26/p8],  PI_2 = [0.41/p1, 0.11/p3].
Let
π = (1/m) Σ_{j=1..m} π_j,   π′ = (1/m) Σ_{j=1..m} π′_j,
u_a = (Σ_{j=1..m} a_j u_j) / (Σ_{j=1..m} a_j),   u_b = (Σ_{j=1..m} b_j u_j) / (Σ_{j=1..m} b_j),
where u_j is the midpoint of interval I_j. Then we obtain a risk interval [u_a, u_b] with possibilities π and π′, respectively, represented by Eq.(6). Defuzzifying the interval by Eq.(7), we obtain u to be the IOSM risk.
R_A = [π/u_a, π′/u_b],  (6)
u = (π u_a + π′ u_b) / (π + π′).  (7)
For example, for zone A1 we have
π = (0.5 + 0.41 + 1 + 0.5)/4 = 0.602,
π′ = (0.26 + 0.11 + 0.41 + 1)/4 = 0.445,
R_A1 = [0.602/0.121, 0.445/0.205],
u_A1 = (0.602 × 0.121 + 0.445 × 0.205) / (0.602 + 0.445) = 0.157.
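The computation above can be reproduced with a short sketch (Python; the probability-weighted-midpoint formulas for u_a and u_b are our reading of the garbled sums, verified against the numbers quoted in the text up to rounding):

```python
# Interval endpoints read off the rows of Eq.(4) for zone A1:
# (pi_j, a_j) is the left endpoint of PI_j, (pi2_j, b_j) the right one.
PI = [(0.50, 0.6, 0.26, 0.8),   # PI_1 = [0.50/p6, 0.26/p8]
      (0.41, 0.1, 0.11, 0.3),   # PI_2 = [0.41/p1, 0.11/p3]
      (1.00, 0.0, 0.41, 0.1),
      (0.50, 0.0, 1.00, 0.1)]
u = [0.005 + (j + 0.5) * 0.179 for j in range(4)]   # interval midpoints u_j

m = len(PI)
pi  = sum(r[0] for r in PI) / m                                       # 0.602
pi2 = sum(r[2] for r in PI) / m                                       # 0.445
ua = sum(r[1] * uj for r, uj in zip(PI, u)) / sum(r[1] for r in PI)   # ~0.121
ub = sum(r[3] * uj for r, uj in zip(PI, u)) / sum(r[3] for r in PI)   # ~0.205
u_iosm = (pi * ua + pi2 * ub) / (pi + pi2)                            # Eq.(7): ~0.157
print(pi, pi2, ua, ub, u_iosm)
```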
Similarly, u_A2 = 0.094, u_A3 = 0.0144, u_A4 = 0.0571, shown in Fig. 1 as IOSM risks.
[Map labels of Figure 1: zone A1 - real risk 0.111, HM risk 0.185, IOSM risk 0.157; zone A2 - IOSM risk 0.094; zone A3 - real risk 0.01, HM risk 0.0167, IOSM risk 0.0144; zone A4 (basin zone) - real risk 0.04, HM risk 0.0665, IOSM risk 0.0571.]
Figure 1. The real flood risks, and estimated risks by the histogram model (HM) and the interior-outer-set model (IOSM).
3. Projects, water resource and cost
There is no loss of generality when we suppose that the natural environment and transportation condition of the district are relatively stable. Furthermore, we suppose that, in the district, a company will invest in three projects o1, o2, o3 whose profit functions, over the next 10 years, are
g_o1(w, s, x) = 10 + 0.6w − 0.1s − 3.1x,
g_o2(w, s, x) = 9 + 1.1w − 0.1s − 3.1x,
g_o3(w, s, x) = 5 + 1.1w − 0.1s − 5.6x,
where w is an index of water resource, s is the cost of raw materials, labour force and transportation, and x is flood risk. The profit is measured in billion RMB Yuan. According to the natural environment and transportation condition of the district D, we suppose that
w_A1 = 0.8,  w_A2 = 0.6,  w_A3 = 0.2,  w_A4 = 0.4,
4. Zones for the projects
If project o_i is located at zone A_j, then the profit over the next 10 years will be g_oi(w_Aj, s_Aj, x_Aj), where x_Aj is the flood risk in zone A_j. Table 1 shows the profit of the projects in the district D and the names of the zones producing maximum profit, from which we know that the IOSM risks lead to the same chosen zones as the real risks.

Table 1. Profit (billion RMB Yuan) of projects o1, o2, o3 located at zones A1-A4.

  Project  Risk   Zone A1  Zone A2  Zone A3  Zone A4  Chosen Zone
  o1       Real   10.106   10.143   10.029   10.096   A2
  o1       HM      9.877   10.007   10.010   10.014   A4
  o1       IOSM    9.963   10.059   10.015   10.043   A2
  o2       Real    9.506    9.443    9.129    9.296   A1
  o2       HM      9.276    9.307    9.110    9.214   A2
  o2       IOSM    9.363    9.359    9.115    9.243   A1
  o3       Real    5.228    5.276    5.104    5.196   A2
  o3       HM      4.814    5.030    5.070    5.048   A3
  o3       IOSM    4.971    5.124    5.079    5.100   A2
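The "Chosen Zone" column can be reproduced by a simple argmax over the profit rows; a minimal sketch in Python, using the values transcribed from Table 1:

```python
profits = {  # (project, risk estimate) -> profit in zones A1..A4
    ("o1", "Real"): [10.106, 10.143, 10.029, 10.096],
    ("o1", "HM"):   [9.877, 10.007, 10.010, 10.014],
    ("o1", "IOSM"): [9.963, 10.059, 10.015, 10.043],
    ("o2", "Real"): [9.506, 9.443, 9.129, 9.296],
    ("o2", "HM"):   [9.276, 9.307, 9.110, 9.214],
    ("o2", "IOSM"): [9.363, 9.359, 9.115, 9.243],
    ("o3", "Real"): [5.228, 5.276, 5.104, 5.196],
    ("o3", "HM"):   [4.814, 5.030, 5.070, 5.048],
    ("o3", "IOSM"): [4.971, 5.124, 5.079, 5.100],
}
zones = ["A1", "A2", "A3", "A4"]
for key, row in profits.items():
    print(key, "->", zones[row.index(max(row))])
# IOSM picks A2, A1, A2 -- the same zones as the real risks; HM picks A4, A2, A3.
```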
5. Conclusion and discussion
The soft risk map is better than the traditional risk map for choosing zones for projects. In this paper, we merely use the weighted midpoint of the fuzzy risk interval R_A in Eq.(6). There might be some model that uses all the information in the interval to reach a better result.
Acknowledgement
The work on this paper was done in the Key Laboratory of Environmental Change and Natural Disaster, The Ministry of Education of China.
References
1. C.F. Huang, An application of calculated fuzzy risk, Information Sciences, 142(1), (2002), 37-56.
2. C.F. Huang and Y. Shi, Towards Efficient Fuzzy Information Processing - Using the Principle of Information Diffusion, Heidelberg, Germany: Physica-Verlag (Springer), 2002.
3. R.K. Otnes and L. Enochson, Digital Time Series Analysis, New York: John Wiley, 1972.
A RISK ASSESSMENT MODEL OF WATER SHORTAGE AND ITS APPLICATION FOR ANALYZING BEARING CAPACITY OF WATER RESOURCES*
LIHUA FENG†
Department of Geography, Zhejiang Normal University, Jinhua 321004, China
CHONGFU HUANG
Institute of Resources Technology and Engineering, College of Resources Science and Technology, Beijing Normal University, Beijing 100875, China
In this paper, we suggest a risk assessment model of water shortage based on information diffusion theory and use it to analyze the bearing capacity of water resources in Jinhua City. From the application we know that the carrying capacity of water resources is the maximum support capability for human activities under a certain stage of social development and sound circulation in the ecosystem. For Jinhua City, we discover that the water shortage is not of the type caused by lacking water sources, but by water quality and water conservancy projects. Therefore, only the middle scheme, giving attention to both economic development and environmental protection, is the first choice.
1. Introduction
Due to the complexity of a water system with natural and societal attributes, it is impossible to accurately estimate the risk of water shortage for a city. In order to simplify the analysis, we assume that the uncertainty in water shortage is only related to the uncertainty in natural rainfall. The core of the risk of water shortage is to estimate the possibility distribution of rainfall in the study region. We also assume that the natural rainfall in the study period is a stable Markov process; therefore there is no change in the possibility distribution over the period of study.
* The Project Supported by Zhejiang Provincial Natural Science Foundation of China (402034)
† Corresponding author: Tel.: +86-579-2306806, E-mail: [email protected]
Even then, the estimation of the risk of water shortage must be very rough when the size n of a sample including observations of natural rainfall for n years is small, such as n < 30. A small sample provides incomplete information to estimate water shortage. In this case, the information diffusion techniques [4] can help us to improve the estimation. It has been proven [3] that the estimation from these techniques is at least 28% more efficient than one from the classical histogram when the given sample is small. Hence, with the result of risk assessment from the techniques, it is possible to reasonably calculate the carrying capacity of water resources of a city for human activities under a certain stage of social development and sound circulation in the ecosystem, and to discover the types of the water shortage. In this paper, we employ the normal diffusion model [5] to construct a risk model for estimating the risk of water shortage with a small sample of observations of natural rainfall.
2. Risk assessment model for water shortage
Information diffusion is a kind of set-valued numerical method that processes samples by fuzzy mathematics [2]; it transfers a single-valued sample into a set-valued sample. The simplest model is the normal diffusion model. Given a sample X = {x1, x2, ..., xn} in a universe of water shortage U = {u1, u2, ..., um}, we can diffuse the information of a single-valued observation x_i to all points of U by using Eq. (1):
f_i(u_j) = (1/(h√(2π))) exp(−(x_i − u_j)² / (2h²)),  (1)
where h is the diffusion coefficient, calculated from the maximum and minimum values as well as the size of the given sample [5]. Let
C_i = Σ_{j=1..m} f_i(u_j).  (2)
Then observation x_i is changed into a normalized fuzzy set with the following membership function:
μ_{x_i}(u_j) = f_i(u_j) / C_i.  (3)
With μ_{x_i}(u_j), i = 1, 2, ..., n, as fuzzy observations, we assign the information gain at a monitoring point u_j as
q(u_j) = Σ_{i=1..n} μ_{x_i}(u_j).  (4)
Its physical meaning is to extrapolate from X, through information diffusion, that if the observed value of water shortage could only be chosen from u1, u2, ..., um, then, deeming x_i a representative of the sample, the number of samples with observed value u_j would be q(u_j), which is often not a positive integer, but surely not a number below zero. Let
Q = Σ_{j=1..m} q(u_j)  (5)
and
p(u_j) = q(u_j) / Q.  (6)
Then we obtain the probability of exceeding u_j as
P(u_j) = Σ_{k=j..m} p(u_k),  (7)
which is just the required estimation for risk assessment.
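A compact sketch of Eqs.(1)-(7) in Python follows. The paper only says that h is computed from the maximum, minimum and sample size [5]; the particular constant below is one of Huang's published choices and is therefore an assumption here:

```python
import numpy as np

def exceed_probability(sample, u, h=None):
    """Risk estimate from Eqs.(1)-(7): returns P(u_j) for every monitoring point."""
    x = np.asarray(sample, float)
    if h is None:
        h = 2.6851 * (x.max() - x.min()) / (len(x) - 1)  # assumed diffusion coefficient
    f = np.exp(-(x[:, None] - np.asarray(u, float)[None, :]) ** 2
               / (2 * h ** 2)) / (h * np.sqrt(2 * np.pi))   # Eq.(1)
    mu = f / f.sum(axis=1, keepdims=True)                   # Eqs.(2)-(3)
    q = mu.sum(axis=0)                                      # Eq.(4): information gains
    p = q / q.sum()                                         # Eqs.(5)-(6)
    return p[::-1].cumsum()[::-1]                           # Eq.(7): exceedance probability
```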
3. Risk assessment of water shortage in Jinhua City
Jinhua City is located in the middle of Zhejiang Province, China, with a total area of 10918 km² and a total population of 4.492 million. The city has a sub-tropical monsoon climate with balanced rainfall and heat, where light and temperature supplement each other; it is a comprehensive agricultural development zone with farming, forestry and fishery, and is called the "second barn" of Zhejiang Province. In recent years, the problem of water shortage has emerged quietly. The drought in 1996 caused drinking-water problems for 160 thousand people and 80 thousand livestock in Jinhua, with a direct economic loss of RMB 170 million; a high temperature not experienced in 50 years occurred in the summer of 2003, and an autumn drought followed the summer drought, the precipitation from July to October being only 124 mm, just 30% of a normal year. Due to the uneven distribution of water in time and space, plus river pollution and the lack of water storage projects, Jinhua City was forced in 1998 to give up taking as drinking water the river water it had taken for generations, and instead spent huge sums to bring water from the Shafan reservoir to the city; on November 24, 2000, Yiwu invested RMB 200 million to purchase 49.99 million m³ of fresh water from the Hengjing reservoir of Dongyang City, the first transaction on water rights after the theory of "water right and water market" was put forward by the Ministry
of Water Resources, China. Therefore, the lack of water has become an important restraint on Jinhua City's building a well-off society in an all-round way by 2020.
For the 23 years of measured rainfall data at Jinhua Station in 1980-2002, we take the interval [0, 2000] as the universe of discourse and transfer the continuous universe into the discrete universe U = {u1, u2, ..., um} = {0, 20, 40, ..., 2000}. By using the risk model consisting of Eqs.(1)-(7), we obtain the risk assessment of water shortage in Jinhua City shown in Table 1.

Table 1. Risk assessment for water shortage in Jinhua City.

  Annual rainfall (mm)    900     1000    1100    1200    1300    1400
  Exceedance probability  1       0.9965  0.9469  0.8597  0.7782  0.6581
  Annual rainfall (mm)    1500    1600    1700    1800    1900    2000
  Exceedance probability  0.4934  0.3221  0.1919  0.0906  0.0308  0.0014
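The return-period rainfalls quoted below follow by linear interpolation in this table. A minimal sketch (Python; interpreting an N-year-return-period low-water year as one whose rainfall is exceeded with probability 1 − 1/N, which is our reading of the text):

```python
import numpy as np

rain = np.arange(900, 2001, 100)
p_exceed = np.array([1, 0.9965, 0.9469, 0.8597, 0.7782, 0.6581,
                     0.4934, 0.3221, 0.1919, 0.0906, 0.0308, 0.0014])

def low_water_rainfall(return_period_years):
    target = 1 - 1 / return_period_years          # exceedance probability of the low-water year
    # np.interp needs increasing x, so interpolate on the reversed table
    return float(np.interp(target, p_exceed[::-1], rain[::-1]))

print(low_water_rainfall(5), low_water_rainfall(10))  # ~1273 and ~1154 mm,
# close to the 1276 mm and 1151 mm reported in the paper
```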
As it is calculated by year, the line for 1400 in the table means that the probability of annual rainfall > 1400 mm in Jinhua in the future is p = 0.6581. In other words, a low-water year of this risk level in Jinhua has a return period of about 3 years. The impact of annual rainfall on the utilization of water resources in Jinhua is quite high. Based on the 23 years of measured rainfall data at Jinhua Station in 1980-2002 and the above risk assessment, the rainfall of the future 5-year-return-period low-water year in Jinhua is only 1276 mm by interpolation, and only 1151 mm for the 10-year-return-period low-water year.
4. Analysis of the bearing capacity of water resources in Jinhua City
The annual average rainfall in Jinhua City is 1503 mm, which belongs to a humid area. Compared with the water shortage in dry areas [7], the current water shortage in Jinhua is not of the type caused by lacking water sources, but by water quality and water conservancy projects, namely water shortage due to water pollution and the lack of storage projects. According to the index system for the study of the bearing capacity of water resources in Jinhua, it can be divided into 5 subsystems, namely the population, agriculture, industry, environment protection and water resources subsystems. With the guidance of ecological concepts, based on the actual situation
and the SD model [6] requirements in Jinhua, field investigations were carried out along the main streams, such as the Jinhua River, Dongyang River and Wuyi River, and water resources and related data on the social-economic system since 1980 were fully collected. Then system dynamics equations for the population, agriculture, industry, environment protection and water resources subsystems were established based on the features of water resources in Jinhua. In total, more than 100 variables and parameters were selected into the model, with 9 level equations, 9 rate equations and a large number of auxiliary equations; 3 step functions, 3 table functions, 4 ramp functions and 8 clip functions were used. Analysis of parameter error and sensitivity shows that the model is reasonable in structure and reflects the actual features of the bearing capacity of water resources in Jinhua; therefore, it can be used to predict the dynamic development process after implementing future policy parameters. In order to discuss the way of harmonious development between the bearing capacity of water resources and the social economy of the region in the next 20 years, based on historical data and the standard for a well-off-in-all-rounds society, the agriculture, industry, GDP and investment growth rates, the irrigation quota, the water consumption per ten thousand yuan of output value, and the sewage disposal volume are selected as policy parameters; by using the SD model, 3 kinds of development schemes are simulated (shown in Table 2), where the high scheme is designed to realize an average GDP per capita > USD 3000 in 2010, the middle scheme fully considers the relationship between population, resources, environment and economic development in Jinhua, and the low scheme takes environmental protection as the main objective. From Table 2 we know that the bearing capacity of water resources would be unable to satisfy the total water demanded in the high scheme, while the objective of quadrupling GDP will not be realized by 2020 if we choose the low scheme. Only the middle scheme, giving attention to both economic development and environmental protection, is the first choice.
5. Conclusion
The information diffusion techniques can be used to effectively estimate the risk of water shortage, which helps us analyze the bearing capacity of water resources and determine which development scheme is suitable for a city.
Table 2. Comparison of major indices in three schemes on the bearing capacity of water resources in Jinhua City.

  Scheme  Year   APV    IPV     GDP     IEP   SDV     PPRL  TWD   BCWR
  High    2000    8.16  126.00   54.44   2.1  130.30  17.3  2.47  2.49
          2005   12.55  193.87   83.76   2.8  154.40  20.4  2.62  2.70
          2010   19.31  298.29  128.88   3.8  175.80  22.9  2.91  2.77
          2015   29.71  458.95  198.29   5.1  229.80  29.9  3.24  2.94
          2020   45.71  706.15  305.10   6.8  334.00  43.7  3.72  3.01
  Middle  2000    8.16  126.00   54.44   2.1  130.30  17.3  2.47  2.49
          2005   11.71  180.89   78.16   3.0  130.90  16.9  2.57  2.70
          2010   16.81  259.69  112.20   4.4   97.41  11.5  2.79  2.92
          2015   24.13  372.82  161.08   6.3   58.84   5.2  3.01  3.24
          2020   34.65  535.23  231.25   9.0   48.87   2.4  3.29  3.48
  Low     2000    8.16  126.00   54.44   2.1  130.30  17.3  2.47  2.49
          2005   10.91  168.62   72.85   3.3  123.80  15.8  2.53  2.70
          2010   14.61  225.65   97.49   5.0   77.66   8.4  2.69  3.07
          2015   19.55  301.96  130.47   7.7   17.75   0    2.81  3.55
          2020   26.16  404.10  174.59  11.9    0      0    2.96  3.94

APV - Agriculture Production Value; IPV - Industry Production Value; GDP - Gross Domestic Product; IEP - Investment on Environment Protection; SDV - Sewage Disposal Volume; PPRL - Polluted Percentage of River Length; TWD - Total Water Demanded; BCWR - Bearing Capacity of Water Resources.
References
1. E.A. Chatman, Diffusion theory: A review and test of a conceptual model in information diffusion, Journal of the American Society for Information Science, 37(6), (1986), 377-386.
2. C.F. Huang, Principle of information diffusion, Fuzzy Sets and Systems, 91(1), (1997), 69-90.
3. C.F. Huang, Demonstration of benefit of information distribution for probability estimation, Signal Processing, 80(6), (2000), 1037-1048.
4. C.F. Huang, Information diffusion techniques and small sample problem, International Journal of Information Technology and Decision Making, 1(2), (2002), 229-249.
5. C.F. Huang and Y. Shi, Towards Efficient Fuzzy Information Processing - Using the Principle of Information Diffusion, Heidelberg, Germany: Physica-Verlag (Springer), 2002.
6. Y. Motohashi and S. Nishi, Prediction of end-stage renal disease patient population in Japan by system dynamics model, International Journal of Epidemiology, 20(4), (1991), 1032-1036.
7. D. Verschuren, K.R. Laird and B.F. Cumming, Rainfall and drought in equatorial east Africa during the past 1100 years, Nature, 403, (2000), 410-414.
AN EARTHQUAKE RISK ASSESSMENT METHOD BASED ON FUZZY PROBABILITY
IMAN KARIMI
Department of Structural Statics and Dynamics, RWTH Aachen University, Aachen, 52056, Germany
EYKE HÜLLERMEIER
Department of Mathematics and Computer Science, University of Marburg, Marburg, 35032, Germany
KONSTANTIN MESKOURIS
Department of Structural Statics and Dynamics, RWTH Aachen University, Aachen, 52056, Germany
This paper presents an outline of a proposed risk assessment system for evaluating the expected damage of structures and the consequent financial losses and casualties due to a likely earthquake. Uncertainties caused by insufficient knowledge about the correlation of parameters have been considered by fuzzy relations, while the uncertainty in eliciting the likelihood of earthquake magnitude from scarce past events has been expressed by fuzzy probability instead of conventional probability. The approach represents fuzzy probability by constructing a possibility distribution over probability out of a small data set, and a sample case calculation is presented.
1. Introduction: Seismic hazard and earthquake risk assessment
An earthquake occurs when two sides of a fault suddenly slide along or on each other. The released energy propagates from this ruptured zone, which is called the focus, in every direction and causes a strong ground motion in the region around the focus. The intensity of this motion depends on the released energy (earthquake magnitude), the distance from the focus, the soil properties and the topography of the site. This strong ground motion causes damage to structures and consequently leads to financial losses and casualties. Because of various uncertainties, assessing the seismic risk of a structure and the consequent losses is highly complicated. These uncertainties can be divided into two categories:
1. Uncertainties about the correlation among the parameters of the hazard, damage and loss. These uncertainties are due to the lack of subjective knowledge or the lack of abundant data sets for objective determination of the correlations.
2. Uncertainties concerning the likelihood of occurrence and magnitude of the seismic hazard.
In this study we have tried to consider the uncertainties of the first category by constructing fuzzy relations, employing different data mining and machine learning methods as well as utilizing all possible data sources, such as observed data, expert opinion and numerical models. On the other hand, the uncertainties of the second category have been modeled with a two-dimensional uncertainty pattern, instead of the conventional probabilistic representation. The latter issue is the main focus of this paper and will be dealt with in the following sections. The main features of the proposed risk assessment system can therefore be summarized as follows: constructing fuzzy relations among the parameters of magnitude, distance, soil properties, topography, intensity, damage to each type of structure and subsequent loss (casualties); and constructing a hazard pattern of the magnitude, which will be applied to the above-mentioned fuzzy relations in order to obtain the risk pattern of the loss.
2. Assessing the probability of natural hazards
There are at least two ways to define or interpret probabilities: deduction from knowledge of the particular constitution of the events under which they occur; or derivation from long-continued observation of a past series of events. In the case of natural hazards, particularly earthquakes, the available physical knowledge of their genesis and pattern is not sufficient for deducing their probability. Therefore, one has to resort to objective assessment, and since in most regions earthquakes are scarce events, objective assessments based on such small data sets will not yield reliable values of probabilities either. Hence, a framework capable of expressing imprecise probability is required. Various approaches have been proposed for elaborating this concept, which from the viewpoint of their output format might be divided into two categories: interval-valued probabilities and fuzzy probabilities. The concept of introducing a probability interval instead of a single value for representing the likelihood of an event has been implemented based on different theories by defining upper and lower probabilities, between which the unknown precise probability will fall. For instance, Dempster-Shafer theory represents this interval by defining belief and plausibility functions [1,2], while in possibility theory it is expressed by the necessity and possibility of an event [3,4].
It can also be argued that if a complete set of observations leads us to a clear, objectively assessed probability, then an incomplete data set, which might contain imprecise observations, delivers a fuzzy image, and thus the probability of an event would be a fuzzy number instead of a crisp one. We can consider this concept as an extension of the first category as well, for instance by generalizing the Dempster-Shafer theory, i.e. defining fuzzy-valued belief and plausibility functions [5]. This fuzzy probability could also be achieved by using the theory of fuzzy random variables [6]. Another concept in this category is defining a possibility distribution for the probability of an event [7]. In the next sections we present our approach of eliciting fuzzy probability from a scarce data set by constructing a possibility distribution of the probability.
3. Possibility-probability distribution
Consider a real-valued random variable X (e.g. the magnitude of an earthquake) whose range is partitioned into several intervals (bins) A1, A2, ..., Am. We are interested in characterizing the probabilities Pr(X ∈ Ai) that X takes a value in the different intervals. Let {x1, x2, ..., xn} be a given sample, p^(i) = Pr(X ∈ Ai), and n_i the number of observations x_j such that x_j ∈ Ai. The standard (point) estimation of the probability is then given by the relative frequency:
p̂^(i) = n_i / n.  (1)
This estimation converges stochastically toward the true probability p, i.e. lim_{n→∞} p̂ = p.
However, for small values of n, the estimated probability of Eq. (1) will obviously be afflicted with a high degree of uncertainty. Therefore, in classical statistics the point estimation is usually endowed with (or even replaced by) a confidence interval. A confidence interval C_α, also referred to as a credible set in Bayesian statistics, is constructed so that Pr(p ∈ C_α) = α, where α is the so-called confidence level. Commonly used confidence levels are values such as 0.90, 0.95 and 0.99, although the final choice remains arbitrary to some extent. An alternative approach to characterize the uncertainty associated with the estimation is to consider the complete family of confidence intervals C_α, 0 < α ≤ 1. This approach avoids a particular choice of α and obviously includes more information about the estimated quantity. In the following, we will present an approach in which the
information provided by a family of confidence intervals is represented in terms of a possibility distribution.
We employ a Bayesian approach and assume prior knowledge about the (unknown) probability degree p to be represented in terms of a prior probability distribution (density function) p_i over [0, 1]. That is, p_i(p) is the probability (density) of the probability degree p. This distribution allows one to incorporate expert knowledge into the inference procedure. Even though this point will not be explored further in this paper, let us note that the incorporation of background knowledge is crucial in the context of our application, where data is usually sparse but expert knowledge more or less available. In the case where no prior knowledge is available, p_i is simply specified by the uniform distribution. Now, recall that we have a sample in which n_i among n data points fall into the interval Ai. The posterior probability p_i′ is then given by
p_i′(p) = c · p^{n_i} (1 − p)^{n − n_i} · p_i(p),  (4)
where c is a normalization constant.
For computational reasons, p_i and p_i′ will not be defined as density functions over the complete unit interval [0, 1], but rather as discrete probability distributions on a finite subset {p_0, p_1, ..., p_m} such that 0 = p_0 < p_1 < ... < p_m = 1.
Dubois et al. proposed and theoretically justified the following approach for deriving a possibility distribution from a probability function [8]: for a probability measure P with a unimodal (continuous) density function p, a level-cut of the form I_λ = {y | p(y) ≥ λ} defines the shortest (maximally precise) interval I such that P(I) = P(I_λ). The most specific possibility distribution π that contains P (in the sense that P ≤ Π) satisfies π(inf I_λ) = π(sup I_λ) = 1 − P(I_λ). In other words, an α-cut of π corresponds to the confidence interval around the mode of p with confidence level 1 − α. Applying this approach, we construct confidence intervals of the form C_α = [l_α, u_α], where l_α, u_α ∈ {p_0, p_1, ..., p_m} and p_i′(p_j) ≥ p_i′(p_k) for all p_j ∈ C_α and p_k ∉ C_α. The possibility distribution π_i associated with these intervals is given by
π_i(p_j) = max{1 − α | p_j ∈ C_α} + max{p_i′(p_j) | j = 0, 1, ..., m}.  (5)
The second term on the right side of Eq. (5) is due to the discretization and guarantees that π_i is normalized.
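A minimal sketch of this construction in Python (uniform prior; building C_α from the most plausible grid points, which is our reading of the mode-centred confidence intervals; the total sample size n = 12 used below is read off as the sum of the frequencies in Table 1, an assumption):

```python
import numpy as np

def ppd(n_i, n, m=400):
    """Possibility distribution over Pr(X in A_i), following Eqs.(4)-(5)."""
    p = np.arange(m + 1) / m                    # grid p_j = j/m
    post = p ** n_i * (1 - p) ** (n - n_i)      # binomial likelihood x uniform prior, Eq.(4)
    post /= post.sum()                          # discrete posterior p_i'
    order = np.argsort(post)[::-1]              # grid points, most plausible first
    alpha = np.empty_like(post)                 # smallest alpha with p_j inside C_alpha
    alpha[order] = np.cumsum(post[order])
    return p, (1 - alpha) + post.max()          # Eq.(5); the posterior mode gets possibility 1

# e.g. interval A6 of Table 1: n_i = 5
grid, pi6 = ppd(5, 12)
```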
What we have thus obtained is a characterization of the probability of each interval Pr(X ∈ Ai) in terms of a possibility distribution, called a possibility-probability distribution (PPD) of the random variable X. We should mention that this distribution is an approximate model, because we have characterized each probability Pr(X ∈ Ai) independently of the other intervals, which is not the case. Extending this approach in order to consider the interdependence of the probabilities is an important topic of ongoing work.
3.1. Case study: extracting the hazard pattern of magnitudes
Table 1 shows the data set used as the case study as well as the employed intervals. Using Eqs. (4) and (5) with p_j = j/400 (j = 0, ..., 400), the results are shown in Figs. 1-5. As expected, p̂^(i) has a possibility value of unity for each interval Ai.

Table 1. Earthquake magnitudes (1500-present) in the North Anatolian fault in the Marmara region and the corresponding intervals.

  Magnitude  Frequency  Interval
  M = 6.8    2          [6.75, 6.85[
  M = 6.9    1          [6.85, 6.95[
  M = 7.0    0          [6.95, 7.05[
  M = 7.1    2          [7.05, 7.15[
  M = 7.2    1          [7.15, 7.25[
  M = 7.3    5          [7.25, 7.35[
  M = 7.4    1          [7.35, 7.45[
Figure 1. PPD of A1 and A4 (n_i = 2).
Figure 2. PPD of A2 (n_i = 1).
Figure 3. PPD of A3 (n_i = 0).
Figure 4. PPD of A5 and A7 (n_i = 1).
Figure 5. PPD of A6 (n_i = 5).
Figure 6. PPD of A6 (n_i = 5) by Huang's method.
4. Comparing the results with another approach and conclusions
Applying Huang's approach [7] to this sample, the so-called information gains and losses are obtained as
q− = q+ = 0 for all x_j and A_i.  (6)
Therefore all the possibility distributions of the probability for each interval will be unit spikes at p̂^(i) (e.g. Figure 6); i.e. in this case Huang's PPD presents the point probability estimation as the precise probability, which is not the case. This real case clearly shows the advantage of the proposed approach for constructing a reasonable possibility-probability distribution of the magnitudes based on a small data set. Obviously, by increasing the number of observations in the sample, the slenderness of the possibility distribution increases as well, i.e. the fuzziness of the probability decreases, as it logically should.
References
1. A. Dempster, Ann. Math. Stat., 38, 325-339 (1967)
2. G. Shafer, A Mathematical Theory of Evidence, Princeton Univ. Press (1976)
3. D. Dubois and H. Prade, Possibility Theory, NY: Plenum Press (1998)
4. G. Cooman, D. Ruan, E. Kerre (Eds.), Advances in Fuzzy Systems, 8 (1995)
5. C. Lucas and B. Araabi, IEEE Trans. Fuzzy Systems, 7, 255-270 (1999)
6. B. Moller and M. Beer, Fuzzy Randomness, Berlin: Springer-Verlag
7. C.F. Huang and C. Moraga, Int. Jour. Uncertainty, Fuzziness and Knowledge-Based Systems, 4, 347-362 (2002)
8. D. Dubois, H. Prade and Ph. Smets, 2nd Int. Symposium on Imprecise Probabilities and Their Applications, Ithaca, New York (2001)
RELIABILITY THEORY USING HIGH CONDITIONAL PROBABILITY EVENTS
M. OUSSALAH
Electronics, Electrical & Computing, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
Conditional event algebra, put forward by Goodman, has been advocated as a mathematical framework that ensures coherence in defining logical implication in a probabilistic setting and allows logical deductions. On the other hand, Bamber proposed a nice logical setting for dealing with assertions of high conditional probability. This paper attempts to review the basic reliability models of serial and parallel systems in the light of Goodman's conditional event algebra. In particular, sufficient rules together with the underlying assumptions are investigated in order to provide a full characterization of serial/parallel systems.
1. Introduction
Standard reliability theory is based on the usual frequency and Bayesian probability theory [11], which defines the reliability of a given product as a probability distribution over time. The frequency and Bayesian views of reliability have been strengthened over the last decades in almost all areas of reliability analysis, including reliability prediction, testing and design, among others, due to the effect of mass production, especially for physical devices, where statistical data are preponderant. On the other hand, several other issues nowadays motivate the need to revise the standard approaches to reliability, a stream that appears even within the probabilistic community. In particular, as most devices integrate both software and hardware components, the notion of failure occurs differently. Besides, the operator often becomes an important part of the system itself, which renders the statistics, even when available, sometimes useless. On the other hand, several recent studies on critical systems, often referred to as zero-failure or highly reliable systems, question the standard approach to reliability; see for instance [8]. Indeed there is an ongoing debate on whether a high-reliability event is equivalent to zero failure. Several fact-scenarios, like the Chernobyl accident and the recent space shuttle blast, have raised serious questions about the above equivalence, which partly motivates the need to re-examine the interpretation and refinement of high-probability events. Recently, Bamber [2] has proposed a nice mathematical setting to handle assertions of high conditional probabilities (AHCP). The proposal extends the conditional event algebra (CEA) framework put forward by Goodman et al. [4]. Strictly speaking, as any unconditional probability can be written in the form of a conditional probability where the conditioning part coincides with the whole universal event, the idea of
extending Bamber's proposal to standard reliability models is tempting. On the other hand, as assertions in the high conditional setting are essentially represented as "if ... then ..." rules, any such extension would implicitly establish some links to rule-based system description. This paper examines system reliability from the viewpoint of CEA and AHCP. In particular, we focus on basic serial and parallel systems and attempt to provide another characterization of these systems in the framework of CEA and AHCP. Section 2 of this paper highlights the basis of high conditional probability logic. Section 3 investigates the application of such a description to serial and parallel system reliability.
2. Basis of high conditional logic probability
Loosely speaking, both the conditional event algebra (CEA) put forward by Goodman et al. [4] and Bamber's proposal of assertions of high conditional probabilities [3] are meant to represent "if ... then ..." rules and to ensure consistent deductions from the rule base. Specifically, each rule "if a then b", where a is the premise and b the conclusion part of the rule, can be represented by the conditional probability P(b|a). The value of P(b|a) stands for the reliability of the above rule, in the sense that the greater the value of P(b|a), the larger the truth of the assertion "if a then b". A possible representation of the fact that the above rule is true or held is by setting the value of P(b|a) very close to one. This can be represented as P(b|a) ≥ 1 − ε, ε > 0, where ε is close to zero. The latter is often referred to as a threshold representation of the AHCP in Bamber's theory [3]. On the other hand, representing the above rule using conditional probability raises further difficulties. First, the logical implication a → b, which is equivalent to a′ ∨ b, is not consistent with the conditional probability representation, in the sense that P(a′ ∨ b) = 1 − P(a) + P(b) − P(a′ ∧ b) ≠ P(b|a). Second, the deduction mechanism may sometimes be very incoherent and unrealistic. For instance, it is completely possible to find a probability measure P such that both P(b|a) and P(c|b) are close to one, while P(c|a) is close to zero, which renders the standard transitivity syllogism (if the rules "if a then b" and "if b then c" are valid, then so is the rule "if a then c") obsolete. To overcome this difficulty, Bamber [2] and Bamber and Goodman [3] have proposed the appealing framework of conditional event algebra, which extends Adams' earlier probabilistic logic [1]. The proposal agrees with Lewis's "triviality" theorem [6], which states essentially that "conditional events", i.e. events compatible with conditional probability evaluations, cannot exist in the same space as the unconditional ones making up their antecedents and consequents. Consequently, if ordinary events like a and b lie in some (sigma) algebra B, which is part of a probability space (Ω, B, P), quantities like (b|a) are considered as single events, namely conditional events, and lie in a larger (sigma) algebra, say B̂, a proper subalgebra of which can be identified with B.
They proposed to construct B̂ as an infinite product of B's. The newly constructed sigma algebra is also part of an extended probability space; that is, a new probability P̂ is constructed as the infinite product measure generated by P on (Ω̂, B̂), where Ω̂ stands for the infinite product Ω × Ω × ... The set of all events (a|Ω̂), where a ∈ B, is a subalgebra of B̂ that is isomorphic to B, and P̂(a|Ω̂) = P(a). Similarly, it has been shown that
P̂((b|a)) = P(b|a) = P(a ∧ b)/P(a).  (1)
That is, the probability of a conditional event is a conditional probability. Interestingly, a set of rules has been put forward for manipulating conditional events. In particular [4],
(b|a)′ = (b′|a),  (2)
(b|a) ∨ (d|c) = (b ∨ d | (b ∧ a) ∨ (d ∧ c) ∨ (a ∧ c)),  (3)
(b|a) ∧ (d|c) = (b ∧ d | (b′ ∧ a) ∨ (d′ ∧ c) ∨ (a ∧ c)).  (4)
Consequently, one can take the probability of both parts of the equalities in (2)-(4) and get equivalent results. This allows us to manipulate a set of "if ... then ..." rules in a very attractive way while keeping track of the probabilistic representation. The proposal also agrees with Lehmann's and Magidor's findings in conditional knowledge [5]; see also Snow's paper [9] on conditional logic.
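A small sketch of rule (4) in Python: given any joint distribution over the four atomic events, the probability of the conjoined conditional event is the conditional probability of b ∧ d given the antecedent of rule (4). The four-coin example is purely illustrative, not taken from the paper:

```python
from itertools import product

def p_cea_and(P):
    """P((b|a) ^ (d|c)) via rule (4): P(b^d | (b'^a) v (d'^c) v (a^c)).
    P maps worlds (a, b, c, d) of Booleans to probabilities."""
    num = den = 0.0
    for (a, b, c, d), pr in P.items():
        if ((not b) and a) or ((not d) and c) or (a and c):
            den += pr
            if b and d:
                num += pr
    return num / den if den else None

# Four independent fair "coins": the antecedent covers 8 of the 16 worlds,
# and only one of those satisfies b ^ d, so the result is 1/8.
P = {w: 1 / 16 for w in product([False, True], repeat=4)}
print(p_cea_and(P))
```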
3. Conditional event algebra and reliability
The aim of this section is to discuss the impact of the above formulation on some basic reliability models [11]. In particular, serial and parallel systems are considered. First, let us consider for simplicity a two-component system constituted of components A and B, and assume the following events:
SF: the overall system S failed;
AF: component A failed;
BF: component B failed.
3.1. Serial system reliability
Definition 1. Given the events SF, AF and BF, a serial system of components A and B is fully characterized by the following: for any probability measure P̂ on B̂,
∃ε > 0, P̂((SF|AF) ∧ (SF|BF)) ≥ 1 − ε,  (5)
or
∃ε₁, ε₂ > 0, P̂((SF|AF)) ≥ 1 − ε₁ & P̂((SF|BF)) ≥ 1 − ε₂,  (6)
where ε, ε₁ and ε₂ are as close to zero as possible.
This definition follows straightforwardly from the definition of a serial system and the materialization of the underlying rules defining it. Indeed, any serial system with components A and B must be such that the following two rules hold simultaneously: "if AF then SF" and "if BF then SF". These rules translate the fact that, in a serial system, the overall system fails if at least one of its components A or B fails. Putting these rules into conditional event form and noticing that a statement like "P̂((b|a)) ≥ 1 − ε for ε close to zero" means that the rule "if a then b" is very likely, expression (6) is straightforward. Expression (5) arises by gathering the conditional events (SF|AF) and (SF|BF), representing the above two rules, whose resulting (conditional) event must also be very likely with respect to any probability measure on B̂. Now let us look at a more mathematical characterization of the serial system in the light of the rule-base description.
Proposition 1. Provided that the events AF and BF are statistically independent, and (SF ∧ AF) and (SF ∧ BF) are positively dependent, expressions (5) and (6) are equivalent, and provide a full characterization of the serial system A-B.
Proof. Due to the paper size restriction, only the outline of the proof is given here. First, we use the conditional event algebra to expand the expression in (5) by means of the conjunction rule (4). Using the assumption of statistical independence of the events AF and BF, and after some manipulations and cancelling of empty events, the resulting probability is equivalent to
H = P(SF|AF ∧ BF) / [ (1 − P(SF|AF))/P(BF) + (1 − P(SF|BF))/P(AF) − 2(1 − P(SF|AF ∧ BF)) + 1 ].  (8)
Consequently, (5) is equivalent to
∃ε > 0, H ≥ 1 − ε.  (9)
i) Proof of (6) ⇒ (5). Let ε₁, ε₂ > 0 close to zero be such that P̂((SF|AF)) ≥ 1 − ε₁ and P̂((SF|BF)) ≥ 1 − ε₂; then, using (1), it holds that
P(SF|AF) ≥ 1 − ε₁ & P(SF|BF) ≥ 1 − ε₂.  (10)
On the other hand, using the Fréchet bounds and the positive dependence assumption, it holds that
P(SF ∧ AF)·P(SF ∧ BF) ≤ P(SF ∧ AF ∧ BF) ≤ min(P(SF ∧ AF), P(SF ∧ BF)),
or, equivalently, using the statistical independence of AF and BF,
P(SF|AF)·P(SF|BF) ≤ P(SF|AF ∧ BF) ≤ min( P(SF|AF)/P(BF), P(SF|BF)/P(AF) ).  (11)
Now, using the inequalities (10) and the lower bound in (11), and choosing an appropriate upper bound for P(SF|AF ∧ BF), which appears in the denominator of H, leads to
H ≥ (1 − ε₁)(1 − ε₂) / [ ε₁/P(BF) + ε₂/P(AF) + 1 ],  (12)
where clearly the bound tends to 1 as both ε₁ and ε₂ tend towards zero. So it is enough to take ε = 1 − (1 − ε₁)(1 − ε₂)/[ε₁/P(BF) + ε₂/P(AF) + 1].
ii) Proof of (5) ⇒ (6). Similar reasoning can be applied. The spirit is as follows: first start with inequalities (9) and (11) (use the lower bound of (11) for P(SF|AF ∧ BF) in the denominator of H and the trivial upper bound 1 for P(SF|AF ∧ BF) in the numerator part). Then transform the inequality into the form P(SF|AF) ≥ H′, where H′ is a function of P(SF|BF), P(AF), P(BF) and ε only. Using again the trivial upper bound 1, one ends up with a bound U for P(SF|AF), which tends to one as ε tends towards 0. Take 1 − U as a candidate for ε₁. The bound for P(SF|BF) can be obtained by symmetry, regarding the analytical expression of H.
Finally, statements (5)-(6) together with Definition 1 fully characterize the serial system A-B. Clearly, Proposition 1 lays bare the extent to which high conditional logic can be used as a framework to fully characterize the serial system. In particular, we pointed out the need for the statistical independence and positive dependence assumptions as a prerequisite to ensure the proof of the equivalence of statements (5) and (6). Although the statistical independence of component failures is very common, the positive dependence seems less trivial.
3.2. Parallel system
Similarly to the serial system, the following definition will be used to define the parallel system A-B using high conditional logic.
Definition 2. Given the above events, a parallel system of components A and B is fully characterized by the following: for any probability measure P̂ on B̂,
∃ε > 0, P̂((SF|(AF ∧ BF)) ∧ ((AF ∧ BF)|SF)) ≥ 1 − ε,  (13)
or
∃ε₁, ε₂ > 0, P̂((SF|(AF ∧ BF))) ≥ 1 − ε₁ & P̂(((AF ∧ BF)|SF)) ≥ 1 − ε₂,  (14)
where ε, ε₁ and ε₂ are as close to zero as possible. Notice that this definition translates the statement, held for every parallel system A-B, that the overall system fails if and only if both components A and B fail (i.e. the implication holds in both directions). Similarly to the serial system, one can establish conditions which ensure the equivalence of statements (13) and (14).
Proposition 2. Statements (13) and (14) are equivalent regardless of the independence of the events AF and BF.
The proof of the preceding is omitted; we shall just mention that, using (1), (4), Bayes' rule and further simplification, (13) is equivalent to ∃ε > 0, G ≥ 1 − ε, with
G = P(SF|(AF ∧ BF))·P(AF ∧ BF) / [ P(SF) + P(AF ∧ BF) − P((AF ∧ BF)|SF)·P(SF) ].  (15)
Putting P(SF) = P(SF|(AF ∧ BF))·P(AF ∧ BF) / P((AF ∧ BF)|SF) leads to
G = P(SF|(AF ∧ BF)) / [ P(SF|(AF ∧ BF))/P((AF ∧ BF)|SF) + 1 − P(SF|(AF ∧ BF)) ].  (16)
From (16), one can easily establish the implications (13) ⇒ (14) and (14) ⇒ (13); the details are omitted. The preceding shows that one can define a rule-base system that fully defines and characterizes the parallel system, without recourse to the statistical independence assumption, in the framework of CEA and AHCP. Alternatively, one may also define the parallel system using different rules. Consider the following statement:
∃ε > 0, P̂((SF′ | AF′ ∨ BF′)) ≥ 1 − ε,  (17)
where the prime denotes complementation. Statement (17) translates the logical behaviour of the parallel system A-B in the sense that the overall system is functioning if at least one of its two components is functioning.
Proposition 3. Under the assumption of statistical independence of the events AF and BF, statements (13) and (17) are equivalent.
Again, the complete proof is omitted; the hint is to transform expression (15) by applying the total probability theorem to P(SF) as
P(SF) = P(SF|AF ∧ BF′)·P(AF ∧ BF′) + P(SF|BF ∧ AF′)·P(BF ∧ AF′) + P(SF|AF ∧ BF)·P(AF ∧ BF) + P(SF|AF′ ∧ BF′)·P(AF′ ∧ BF′).
Then apply the statistical independence assumption and basic probability calculus.
4. Conclusion
This paper provided sufficient conditions that characterize serial and parallel system reliability using Goodman's conditional event algebra and Bamber's high conditional assertions. The paper has not yet investigated the interesting question of the contraposition of the rules provided in Definitions 1-3, and whether additional prerequisites are necessary to ensure the validation of the underlying rules in the spirit of high conditional events. Another ongoing research direction concerns the
deduction mechanism that allows us to ensure basic compositions of serial and parallel systems using high conditional assertions. From this perspective, an appealing hint is to reason not with respect to a specific probability P but in average; that is, the statement is not deemed to be true for every choice of probability P but on average over all possible probability measures fulfilling the problem constraints. This extends the imprecise probability framework [10] to AHCP and CEA. Some preliminary results are reported in [3].
References
1. E.W. Adams, The Logic of Conditionals, D. Reidel, (1975).
2. D. Bamber, Entailment with near surety of scaled assertions of high conditional probability, Journal of Philosophical Logic, 29, 1-74, (2000).
3. D. Bamber and I.R. Goodman, New use of second order probability techniques in estimating critical probabilities in command and control decision making, Proceedings of the Command & Control Research & Technology Symposium, Naval Postgraduate School, Monterey, CA, 1-53, (2000).
4. I.R. Goodman, H.T. Nguyen and E.A. Walker, Conditional Inference and Logic for Intelligent Systems: A Theory of Measure-Free Conditioning, North-Holland, (1991).
5. D. Lehmann and M. Magidor, What does a conditional knowledge base entail? Artificial Intelligence, 55, 1-60, (1992).
6. D. Lewis, Probability of conditionals and conditional probabilities, Philosophical Review, 85, 297-315, (1976).
7. D.V. Lindley, Bayesian Statistics, A Review, Society for Industrial and Applied Mathematics, (1971).
8. K.H. Roberts and D.M. Rousseau, Research in nearly failure-free, high-reliability organizations: having the bubble, IEEE Trans. on Engineering Management, 36(2), 132-139, (1989).
9. P. Snow, Diverse confidence levels in a probabilistic semantics for conditional logics, Artificial Intelligence, 112, 269-279, (1999).
10. P. Walley, Statistical Reasoning with Imprecise Probabilities, Chapman and Hall, (1991).
11. L.C. Wolstenholme, Reliability Modelling: A Statistical Approach, Chapman & Hall/CRC, London, (1999).
Part 3 Applied Research and Nuclear Applications
A FUZZY IMPULSE NOISE DETECTION AND SUPPRESSION FILTER FOR COLOR IMAGES
Y. TSIFTZIS AND I. ANDREADIS
Department of Electrical and Computer Engineering, Democritus University of Thrace, GR-67100 Xanthi, Greece
This paper describes the function of a noise detection and suppression filter that uses fuzzy logic techniques and aims at both improved performance and reduced computational complexity. It is applicable to color images and outperforms other similar reported techniques in terms of quality indexes and image detail preservation. The filter specializes in the detection and elimination of random and fixed valued impulse noise. The improved performance of the algorithm is confirmed by experimental results.
1. Introduction
The presence of noise in images leads to unexpected results and misleads most image processing techniques. In order to obtain objective results from any of those techniques, we must remove the noise that has affected the image and acquire a noise-free image. The proposed algorithm is an impulse noise removal method that acts in two steps: first, it detects impulse-noise-contaminated pixels by means of fuzzy logic techniques, assigning them a fuzzy flag that indicates the degree of noise presence. In the following step, which is the noise elimination part, all pixels that look like impulse pixels according to their fuzzy flag are subject to the noise-removing algorithm that produces the final noise-free output image. The filter is applicable to color images, and the VMF [1] is used in order to compare the performance of the proposed filter in terms of both noise removing capability and computational complexity.
2. Noise detection
In order to detect noise-contaminated pixels, we apply the following algorithm to each one of the principal color image components. Let x(i,j) be the pixel value at position (i,j) of the noise-corrupted image and u(i,j) the median value in a (2Nd+1)×(2Nd+1) window around the pixel at position (i,j):
u_ij = median{x_{i−Nd,j−Nd}, ..., x_ij, ..., x_{i+Nd,j+Nd}}.  (1)
The absolute difference |x(i,j) − u(i,j)| is used as an effective measure of impulse noise presence. A large value probably indicates noise presence, whereas a small value implies a noise-free pixel. All values in between can be considered fuzzy. We obtain the fuzzy set 'impulse noise pixel', which contains all pixels of the image and is associated with the sigmoid membership function in the following way:
f(i,j) = sigmoid[ |x_ij − u_ij| ].  (2)
This membership function assigns each pixel of the noisy image a value between 0 and 1. Pixels of the image that have a membership function value close to 1 or close to 0 are considered to be noise pixels or noise-free pixels, respectively, whereas all values in between cannot provide a direct measurement of noise presence. In order to make all these values more distinguishable, we apply the intensity modification operator μ [2]. By selecting a proper threshold Td between 0 and 1, we can decide whether a pixel whose fuzzy value is over the threshold will be subject to the noise cancellation procedure in the second step, or will be left intact if it has a value below the threshold. The threshold value depends on the application.
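A minimal sketch of the detection step in Python (scipy/numpy): the sigmoid slope k, the rescaling onto [0, 1), and the Zadeh-style intensification standing in for the operator μ of [2] (whose exact formula is not given in the extracted text) are all assumptions, not values from the paper:

```python
import numpy as np
from scipy.ndimage import median_filter

def detect_impulses(channel, Nd=1, Td=0.4, k=0.1):
    """Fuzzy impulse-noise detection for one colour channel, Eqs.(1)-(2)."""
    u = median_filter(channel.astype(float), size=2 * Nd + 1)  # Eq.(1): local medians
    f = 1.0 / (1.0 + np.exp(-k * np.abs(channel - u)))         # Eq.(2): sigmoid of |x - u|
    f = 2.0 * f - 1.0              # sigmoid of |.| starts at 0.5; map onto [0, 1) (assumed)
    f = np.where(f <= 0.5, 2 * f ** 2, 1 - 2 * (1 - f) ** 2)   # assumed intensification operator
    return f > Td, u               # Boolean noise mask and the medians
```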
3. Noise suppression
The noise cancellation procedure is applied only to the pixels that satisfy f(i,j) > Td, whereas for f(i,j) ≤ Td the corresponding pixels are left intact. The basic idea of the algorithm is to subtract from, or add to, the impulse noise pixel a proper value in order to restore it to its original form. This proper value is calculated by the following algorithm. Having subtracted the median value u_ij from x_ij in the noise detection part, the result can either be a positive number, if x_ij has an impulse value close to 255, or a negative number, if x_ij has an impulse value close to 0. In the first case the median value u_ij is subtracted from the noisy value, leading to a smaller value closer to the original, whereas in the second case the median value u_ij is added to the noisy pixel value, leading to a larger value closer to the original one. Let y_ij be the pixel value at position (i,j) at the output of the noise cancellation algorithm:
y_ij = x_ij − u_ij  if x_ij > u_ij;   y_ij = x_ij + u_ij  otherwise.  (4)
The algorithm can handle both random and fixed valued (salt & pepper) noise, and the experimental results on color images exhibit its efficiency.
4. Computational complexity
In this section, the computational complexities of the proposed filter and the VMF are examined. The number of operations calculated below refers to one of the three image principal components; consequently, the total amount of operations is triple. During the noise detection part, the proposed method uses a median filter that requires approximately 2N_c² log N_c compare/pixel operations in the optimal case, where N_c×N_c are the median filter dimensions, using a Quicksort algorithm [3]. Moreover, one subtraction/pixel is required to calculate x_ij − u_ij, and another compare-to-threshold Td operation is needed. In the next step of the
algorithm, where the noise cancellation scheme is applied, assuming that the noise percentage is p, p×1 subtractions/pixel are required in order to apply Eq. (4). To calculate the total amount of operations we must multiply the above figures by the image dimensions. As a result, M×N×(2N_c² log N_c + 1) compares/pixel and M×N×(p+1) subtractions/pixel are required by the noise detecting and removing algorithm described (assuming that M×N are the image dimensions). The classic VMF requires O(n³) additions, O(n³) multiplications, O(n³) square roots and O(n²) comparisons (n×n are the dimensions of the neighborhood used by the VMF) [1]. The improvement in terms of computational complexity that the proposed filter offers is obvious.
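For concreteness, the formulas above can be evaluated for the test setup of the next section; base-2 logarithms are assumed for the Quicksort comparison count:

```python
import math

M = N = 256      # dimensions of the Lena test image
Nc = 3           # 3 x 3 median window
p = 0.4          # noise fraction
compares = M * N * (2 * Nc ** 2 * math.log2(Nc) + 1)   # detection, per channel
subtractions = M * N * (p + 1)                         # detection + cancellation, per channel
print(f"{compares:.0f} compares, {subtractions:.0f} subtractions per channel")
```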
5. Experimental results
The efficiency of the method described above has been verified by many thorough experiments concerning color images of various sizes. A typical presentation of the function of the algorithm is shown in Fig. 1 for the Lena image, of size 256×256×3 pixels. The noise models used are salt & pepper (fixed impulse values of 0 and 255 with equal probabilities) and random-valued impulse noise (random values uniformly distributed between 0 and 255). As a comparison, the VMF [1] is used, and the performances of both are illustrated in Fig. 2. To obtain an objective quality measure, we use the Peak Signal to Noise Ratio (PSNR). The threshold in the noise detection step was Td = 0.4 and the dimensions of the median filter used were 3×3 (Nd = 1). In all cases the VMF used the Euclidean distance as a similarity measurement in a 3×3 neighborhood.
Figure 1. A comparative presentation of the performances of the proposed filter and the VMF: fixed valued impulse noise p = 0.4, p_R = p_G = p_B = 0.3; proposed method, PSNR = 29.34; VMF, PSNR = 26.97.
Figure 2. PSNRs of the proposed filter and the VMF with random and fixed valued impulse noise (noise percentages from 5 to 60).
As can be seen from Fig. 2, the proposed noise removal algorithm outperforms the VMF over a wide range of noise percentages in the cases of both random and fixed valued impulse noise.
6. Conclusion
The impulse noise detecting and removing filter described in this paper is an effort to meet satisfactory image quality criteria with the minimum amount of computational complexity. The experimental results that have been presented prove the fulfillment of the above goals, and the computational complexity analysis of the proposed method clearly depicts the significantly lower amount of operations required by the algorithm compared to the classic, efficient Vector Median Filter.
References
1. J. Astola, P. Haavisto, Y. Neuvo, Vector median filters, Proceedings of the IEEE, 78, pg. 678, (1990).
2. H. Haussecker and H.R. Tizhoosh, Fuzzy Image Processing, in Handbook of Computer Vision Applications, Edited by B. Jahne, H. Haussecker, and P. Geisster, Academic Press, 1999.
3. S. Zhang, M.A. Karim, A new impulse detector for switching median filters, IEEE Signal Processing Letters, 9, pg. 360, (2002).
I ESCET-URJC, Campus de Mdstoles, 28933 Madrid, Spain [email protected], [email protected] DTF-FI-UPM, Campus de Montegancedo, 28860 Madrid, Spain Felipe.FernandezBes. bosch.com,jgr@dtflfi. upm.es, gtrivino@d$j?. upm.es, juanc-crespo@ieci. es, [email protected] This paper presents a new fuzzy motion adaptive video deinterlacer that is adaptive at pixel and frame level. It is mainly based on a decomposition of the corresponding fuzzy motion detector into three main modules: a linear spatio-temporal low-pass filter described by two separable ID FIR filters and two fuzzy modules described by linear saturation functions. Moreover, the involved saturation parameters are on-line adjusted taking into account the motion quantity of each frame. Experimental results with several video benchmarks demonstrate the robustness and high-quality reconstruction of the presented algorithm with relatively low computational load.
1. Introduction
Deinterlacing is today a key technology in consumer TV that converts ordinary interlaced formats into progressive ones by reconstructing the missing lines. Some typical defects in video deinterlacing will cause uncomfortable visual artifacts and critical distortions in the output frames. The most common deinterlacing methods are frequently grouped in two main categories: motion compensated and non-motion compensated methods. Motion compensated algorithms provide the highest reconstruction quality. They are computationally more expensive because they require the estimation of twodimensional motion vectors and pixel shifting calculations. On the other hand, non-motion compensated methods are cheaper and can achieve a good compromise between performance and cost. A deep and general review of deinterlacing technology is made by Gerard de Haan in [ 11. The rest of the paper shortly reviews the motion adaptive deinterlacing methods, presents the proposed adaptive hzzy motion detector and shows the experimental deinterlacing results obtained using standard benchmark videos.
* This research has been supported by CICYT TIC2000-1420
397
398
2. Motion Adaptive Deinterlacing Methods
Non-motion compensated deinterlacing methods are also clustered in three main classes: temporal or inter-field, spatial or intra-field and spatio-temporal methods. Temporal or inter-field techniques work better than spatial or intra-field techniques for static scenes, because when there is no motion, the missing lines of a fiame are the same as the known previous ones. On the other hand, when there is motion in the scene, the previous lines contain information that is not coincident with the present data and ghosting, tearing or combing artifacts appear in the moving regions. For these moving areas, spatial interpolation gives better results. Many spatio-temporal hybrid-deinterlacing techniques have been proposed to exploit the spatial and temporal correlation of video pictures and to overcome the artifacts associated with simple deinterlacers. The corresponding techniques called motion-adaptive (MA) algorithms, typically compute a motion-weighted combination of a temporal interpolation function It(ij,f)=(Z(i,j,t-])+Z(i,j,t+l))/2and a spatial interpolation function Is(ij,t)=(I(ij-f,t)+Z(ij+ I , t))/2:
Its(i,j,t) = (1 - a)Zt(i,j,t)+ a Zs(i,j,t)
(1)
where Its(i,j , t) is the obtained luminance on the column i, linej and time t of the corresponding field, and c i ~ ( 0 , l is ) the involved motion value. To detect motion areas, it is necessary to estimate this weighting parameter ci over each missing pixel. Most of these techniques are based on the computation of the absolute difference between the luminance of the two adjacent frames A(.):
Δ(i,j,t) = |I(i,j,t+1) − I(i,j,t−1)|    (2)
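As a concrete reading of equations (1) and (2), the following NumPy sketch interpolates one missing pixel from its temporal and spatial neighbours; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def frame_difference(frame_prev, frame_next):
    """Raw motion measure of eq. (2): |I(i,j,t+1) - I(i,j,t-1)| per pixel."""
    return np.abs(frame_next.astype(np.float32) - frame_prev.astype(np.float32))

def ma_interpolate(I_prev, I_next, I_up, I_down, alpha):
    """Motion-adaptive interpolation of a missing pixel, eq. (1).

    I_prev, I_next: luminance at (i, j) in the fields at t-1 and t+1.
    I_up, I_down:   luminance at (i, j-1) and (i, j+1) in the current field.
    alpha:          motion value in (0, 1) from the motion detector.
    """
    It = 0.5 * (I_prev + I_next)   # temporal (inter-field) average
    Is = 0.5 * (I_up + I_down)     # spatial (intra-field) average
    return (1.0 - alpha) * It + alpha * Is
```

For alpha close to 0 the output falls back to the temporal average (best for still areas); for alpha close to 1 it becomes the spatial average (best for moving areas).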
Unfortunately, due to several noise sources, the luminance difference does not become zero in all picture parts without motion. This implies that the corresponding motion detector should include some kind of additional low-pass spatio-temporal filtering in order to avoid undesirable noise effects. The considered low-pass fuzzy motion detector has been designed taking into account that the color carrier does not contain significant motion information and that moving objects are frequently larger than the pixel size.
3. Bilevel Adaptive Fuzzy Filter
The most important part of a reliable MA deinterlacer is the spatio-temporal motion filter. Motion detection failure can result in the use of inter-field data in moving parts of the video, where it can cause the appearance of combing artifacts. Alternatively, oversensitive motion detection can cause the motion
detector to be triggered by noise, generating a spatial data interpolation in still parts of the picture. This can lead to a noticeable loss of resolution in the video. Therefore, there is a need for a good balance between motion sensitivity and robustness in different video conditions. To carry out this task, a fuzzy video motion detector based on a set of 5 fuzzy rules (FMD1) was developed by D. Van de Ville [5,6]. The corresponding ASIC was a 2003 European IST Prize Nominee. An improved version of this approach is shown in [7]. This paper uses an analogous philosophy but proposes a simpler and more robust adaptive fuzzy motion detector (FMD3) that simplifies the corresponding computation and provides an outstanding picture quality in both moving and still image areas. The associated low computational cost gives the chance to easily accomplish the real-time algorithm in software. We have implemented FMD3 in real time on a Pentium III 1.2 GHz with 256 MB SDRAM, using MS DirectShow and the visual tool MS GraphEdit, obtaining a throughput of 30 fps without employing MMX or SSE instructions. Figure 2 describes the proposed fuzzy motion detector FMD3 with pixel and frame adaptation levels. The pixel-adaptation controller is mainly based on the serial composition of two linear 1D FIR low-pass filters and two nonlinear saturation functions.
Figure 2. General block diagram of the fuzzy motion detector FMD3
The corresponding saturation parameters (a, b, c, d) are computed on-line by the frame-adaptation controller, considering the normalized average motion β of each frame.
The basic structure of the uniform recurrence equations (URE's) of the FMD3 algorithm, for the missing pixels in odd and even fields, is shown in Figure 3. The saturation functions are used to specify diverse nonlinearities of the corresponding fuzzy filters [3]. This paper only considers simple piecewise-linear saturation functions S_{x1,x2}(x) specified by the set of fuzzy rules:

{If (x is LOW) Then s = 0; If (x is HIGH) Then s = 1}    (3)
where the fuzzy labels LOW and HIGH belong to the corresponding trapezoidal fuzzy partition [2] defined by the coordinates (xmin, x1, x2, xmax). General saturation or squashing functions are considered in this approach as powerful fuzzy primitives. Verbal labels such as "approximately linear with a little saturation on the left side" or "fairly crisp" have been utilized during the structural design of the corresponding saturation functions. We claim here that, as we have accepted verbal labels and linguistic hedges [2] to describe membership functions, we can also accept an analogous fuzzy functional framework to directly specify basic nonlinear functions. During the design process, these fuzzy functional primitives proved to be a very useful complementary fuzzy tool.
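A minimal sketch of the piecewise-linear saturation implied by the rule pair in (3): zero below the threshold x1, a linear gain region between x1 and x2, saturated at one above x2 (illustrative code, not from the paper).

```python
def saturation(x, x1, x2):
    """Piecewise-linear saturation S_{x1,x2}(x) from the rules
    {If (x is LOW) Then s=0; If (x is HIGH) Then s=1}."""
    if x <= x1:
        return 0.0
    if x >= x2:
        return 1.0
    return (x - x1) / (x2 - x1)  # linear transition between the two labels
```

In FMD3 the pair (a, b) parameterizes one of the two saturation functions and (c, d) the other.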
Figure 3. URE’s of fuzzy motion detector FMD3
Saturation functions have been compiled into conditional piecewise functions. Parameters x1 and x2 simultaneously specify the threshold, gain and saturating regions of the corresponding variables. In the considered fuzzy filters, the utilization of two saturation functions, at the input and at the output of the filter, gives more flexibility to remove undesirable noise from the corresponding motion detector under different video conditions.
Figure 4. Comparison between non-adaptive and adaptive saturation methods: MSE per frame of the Suzie video sequence (frames #5-#150), for the low-motion parameter set (a0=1, b0=20, c0=2, d0=50), the high-motion parameter set (a1=2, b1=10, c1=2, d1=100) and the adaptive combination (MSE with low-motion parameters = 7.68, with high-motion parameters = 12.70, with adaptive low/high parameters = 8.0).
4. Experimental Results
The interlaced test sequences used were generated from standard progressive video benchmarks of size 176x144 pixels [4]. Figure 4 shows the MSE results for each frame of the Suzie sequence from #5 to #150. Three types of algorithms were analyzed: I) deinterlacing with fixed saturation parameters suitable for low-motion frames: (a0, b0, c0, d0); II) deinterlacing with fixed saturation parameters suitable for high-motion frames: (a1, b1, c1, d1); and III) deinterlacing suitable for low- and high-motion frames (algorithm FMD3), using adaptive saturation parameters: (a, b, c, d) = (1−β)·(a0, b0, c0, d0) + β·(a1, b1, c1, d1). The MSE values were obtained excluding border pixels because their contribution is not computed with the uniform filter mask defined.
The proposed method FMD3 is much simpler and more robust than the referred algorithm FMD1 [5,6], which gives lower video quality results, e.g. for the Suzie sequence: MSE-FMD1 = 7.61 versus MSE-FMD3 = 6.94. Moreover, algorithm FMD1 is less flexible, since it has only one non-adaptive saturation function, and it is clearly more computationally demanding. The deinterlaced frames #8 (with low motion) and #48 (with high motion) of the Suzie sequence, using the described algorithm FMD3, are shown in Figure 5.
Figure 5. Deinterlaced frames #8 and #48 of the Suzie sequence using the adaptive fuzzy filter considered
5. Conclusions
The obtained results show that the presented fuzzy motion detector algorithm
FMD3, with bilevel adaptive structure, works very well for different kinds of interlaced video sequences and can be a reasonable trade-off for deinterlacing.
References
1. E. B. Bellers and G. de Haan, De-interlacing: A Key Technology for Scan Rate Conversion, Elsevier (2000).
2. G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Prentice Hall (1995).
3. M. Nachtegael et al. (Eds.), Fuzzy Filters for Image Processing, Springer (2003).
4. QCIF 176x144 test video sequences, http://thanglong.ece.jhu.edu/~cjtdlink.html.
5. D. Van de Ville et al., Fuzzy-Based Motion Detection and its Applications to Deinterlacing, in Fuzzy Techniques in Image Processing, E. E. Kerre and M. Nachtegael (Eds.), Chap. 13, pp. 337-369, Physica-Verlag (2000).
6. D. Van de Ville et al., Motion Adaptive De-interlacing using Fuzzy Logic, in Proceedings of IPMU'2002, pp. 1989-1996, Annecy, France, July (2002).
7. J. Gutiérrez, F. Fernández, J. C. Crespo and G. Triviño, Motion Adaptive Fuzzy Video De-interlacing Method Based on Convolution Techniques, Proceedings of IPMU 2004, Perugia, Italy, July 4-9 (2004).
COLOR IMAGE ENHANCEMENT METHOD USING FUZZY SURFACES IN THE FRAMEWORK OF THE LOGARITHMIC MODELS
VASILE PATRASCU
Department of Informatics Technology, TAROM Company, Sos. Bucuresti-Ploiesti, Km 16.5, Bucuresti-Otopeni, Romania
E-mail: [email protected]
The paper presents a color enhancement method based on a transform defined in the framework of a bounded logarithmic model. The enhancement transform uses three fuzzy surfaces associated with the following image parameters: brightness, contrast and saturation.
1. Introduction
Simple and efficient enhancement methods can be obtained using a three-parameter affine transform within logarithmic models [1,4,5]. The three parameters have the following functions: the first determines a translation and is computed using the luminosity average; the second is a multiplication factor for the luminosity and is determined using the luminosity variance; the third is a multiplication factor only for the chromatic components and is determined by the average of the image saturation. In the necessary calculus, using real-number algebra operations may lead to results that are outside the domain of image values. A way to avoid this drawback is the utilization of algebraic structures defined on real and bounded sets [4,5]. Using only one transform for the entire image has the disadvantage that the global statistical properties of an image may not be close to those computed in the local context created by a neighborhood system [3,6]. Thus one can define fuzzy partitions on the image support and then process the pixels separately in each fuzzy window. The final image is obtained using a function similar to an affine transform, but the three statistical parameters are replaced with three fuzzy surfaces [2]. The paper has the following structure: section 2 comprises a short presentation of the logarithmic model; section 3 presents the color image enhancement method using the affine transforms; section 4 presents the fuzzification of the image support; the enhancement method using fuzzy surfaces and experimental results are presented in section 5. Finally, section 6 comprises a few conclusions.
2. The logarithmic model
Let us consider the set V = (0,1) as the space of intensity values. The addition (+) and the multiplication (x) by a real scalar will be defined on the set V. Then, defining a scalar product <.|.>V and a norm ||.||V, a Euclidean space will be defined. We will consider the addition:

v1 (+) v2 = v1·v2 / (v1·v2 + (1−v1)·(1−v2))    (2.1)

The neutral element for the addition is θ = 0.5. Each element v ∈ V has an opposite w = 1 − v. The subtraction operation (−) is defined by:

v1 (−) v2 = v1·(1−v2) / (v1·(1−v2) + (1−v1)·v2)    (2.2)

The scalar multiplication is defined by:

λ (x) v = v^λ / (v^λ + (1−v)^λ), ∀v ∈ V, ∀λ ∈ R    (2.3)

The vector space of intensity values (V, (+), (x)) is isomorphic to the space of real numbers (R, +, ·) by the function φ : V → R, defined as:

φ(v) = ln(v / (1−v)), ∀v ∈ V    (2.4)

The isomorphism φ verifies:

φ(v1 (+) v2) = φ(v1) + φ(v2), ∀v1, v2 ∈ V; φ(λ (x) v) = λ·φ(v), ∀λ ∈ R, ∀v ∈ V

The scalar product <.|.>V : V × V → R is defined using the isomorphism φ as:

<v1 | v2>V = φ(v1)·φ(v2), ∀v1, v2 ∈ V    (2.5)

Based on the scalar product <.|.>V the vector space V becomes a Euclidean space. The norm ||.||V : V → R+ is defined via the scalar product:

||v||V = √(<v | v>V) = |φ(v)|    (2.6)

Practically, the closed interval [ε, 1−ε], where 0 < ε ≪ 1, was used instead of the open interval (0,1).
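A small Python sketch of these bounded operations; the addition (2.1) and isomorphism (2.4) are as reconstructed above from the surviving properties (neutral element 0.5, opposite 1−v, additivity of φ), and the clamping to [ε, 1−ε] follows the last remark. Names are illustrative.

```python
import math

EPS = 1e-6  # work on [eps, 1-eps] instead of the open interval (0, 1)

def clamp(v):
    return min(max(v, EPS), 1.0 - EPS)

def log_add(v1, v2):
    """v1 <+> v2, eq. (2.1); neutral element is 0.5."""
    v1, v2 = clamp(v1), clamp(v2)
    return v1 * v2 / (v1 * v2 + (1 - v1) * (1 - v2))

def log_scale(lam, v):
    """lam <x> v, eq. (2.3)."""
    v = clamp(v)
    return v**lam / (v**lam + (1 - v)**lam)

def phi(v):
    """Isomorphism of eq. (2.4): phi(v1 <+> v2) = phi(v1) + phi(v2)."""
    v = clamp(v)
    return math.log(v / (1 - v))
```

Because every operation returns a value inside (0, 1), no truncation of out-of-range results is ever needed, which is exactly the motivation given in the introduction.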
3. Enhancement method using a three-parameter affine transform
A color image is described in the RGB coordinate system by three real bounded functions R : Ω → V, G : Ω → V, B : Ω → V, where Ω ⊂ R² is a compact set that represents the image support and V = (0,1) is the set used for image representation. One can define the luminosity L and the saturation S; the luminosity is given by:

L = (1/3) (x) (R (+) G (+) B)    (3.1)

One can further define the mean μΩ and the variance σΩ² of the luminosity and the mean γΩ of the saturation. Having the statistical parameters μΩ, σΩ, γΩ, one defines an affine transform of the components X = R, G, B, in which Kσ and Kγ are two positive constants that establish the contrast and saturation level for the enhanced image.

4. The fuzzification of the image support
Let Ω = [a,b]×[c,d] be the image support. A fuzzy partition is built on the set Ω and its elements are called fuzzy windows [5]. Let there be m > 1, n > 1 and P = {W_ij, (i,j) ∈ [1,m]×[1,n]} a fuzzy partition of the support Ω. The membership degrees of a point (x,y) ∈ Ω to the fuzzy window W_ij are given by the functions w_ij : Ω → [0,1] defined by the relation:

w_ij(x,y) = (u_i(x) / Σ_{k=1}^{m} u_k(x)) · (v_j(y) / Σ_{l=1}^{n} v_l(y))    (4.1)

where u_i : [a,b] → [0,1] and v_j : [c,d] → [0,1] are the following Gaussian functions: u_i(x) = exp(−((x − x_i)/α)²) and v_j(y) = exp(−((y − y_j)/β)²), where x_i = a + ((i−1)/(m−1))·(b−a) and y_j = c + ((j−1)/(n−1))·(d−c). The parameters α, β > 0 control the fuzzification degree and offer more flexibility for the partition P. For each window W_ij, the fuzzy cardinality card(W_ij), the fuzzy mean of luminosity μ_ij, the fuzzy variance of luminosity σ_ij² and the fuzzy mean of saturation γ_ij are defined by:

card(W_ij) = Σ_{(x,y)∈Ω} w_ij(x,y)    (4.4)
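The window memberships of (4.1) are straightforward to compute; the sketch below builds the whole m×n membership stack for one pixel (illustrative NumPy code; the remaining fuzzy statistics are then w_ij-weighted sums over the pixels, normalized by card(W_ij)).

```python
import numpy as np

def window_memberships(x, y, a, b, c, d, m, n, alpha, beta):
    """w_ij(x, y) for all fuzzy windows, eq. (4.1); returns an (m, n)
    array whose entries sum to 1 for any pixel (x, y)."""
    xi = a + np.arange(m) * (b - a) / (m - 1)       # window centers x_i
    yj = c + np.arange(n) * (d - c) / (n - 1)       # window centers y_j
    u = np.exp(-(((x - xi) / alpha) ** 2))          # u_i(x)
    v = np.exp(-(((y - yj) / beta) ** 2))           # v_j(y)
    return np.outer(u / u.sum(), v / v.sum())       # normalized product
```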
5. The enhancement method using fuzzy surfaces
Let a color image be described by its three scalar functions R : Ω → V, G : Ω → V, B : Ω → V, and let L, S be its luminosity and saturation computed with relations (3.1, 3.2). The fuzzy window W_ij will supply a triple of parameters (μ_ij, σ_ij, γ_ij), which reflects the statistics of the pixels belonging (in the fuzzy meaning) to this window. Then we can define the following fuzzy surfaces [2]:

μ(x,y) = Σ_{i=1}^{m} Σ_{j=1}^{n} w_ij(x,y)·μ_ij    (5.1)
σ(x,y) = Σ_{i=1}^{m} Σ_{j=1}^{n} w_ij(x,y)·σ_ij    (5.2)
γ(x,y) = Σ_{i=1}^{m} Σ_{j=1}^{n} w_ij(x,y)·γ_ij    (5.3)

The enhanced image components will be calculated using the following function (5.4): the affine transform of section 3 in which the statistical parameters μΩ, σΩ, γΩ are replaced, at each point (x,y), by the fuzzy surface values μ(x,y), σ(x,y), γ(x,y),
where X = R, G, B. The method presented in this section was used for the images "kidsat1" and "kidsat4" shown in figures 1a and 1c [7]. The enhanced images can be seen in figures 1b and 1d.
Figure 1. a), c) The original images "kidsat1", "kidsat4"; b), d) the enhanced images.
6. Conclusions
This paper presents an enhancement method for color images. After splitting the image support into fuzzy windows, three statistical parameters are computed for each of them: the mean of the luminosity, the variance of the luminosity and the mean of the saturation. Then, using three fuzzy surfaces, an enhancement function is built. Due to the support fuzzification, this function is adjusted to the diverse areas of the processed image. The truncation operations were eliminated by using logarithmic operations defined on bounded sets.
References
1. M. Jourlin and J. C. Pinoli, Image dynamic range enhancement and stabilization in the context of the logarithmic image processing model, Signal Processing, Vol. 41, no. 2, 225 (1995).
2. Y. Lin, G. A. Cunningham and S. V. Coggeshall, Input variable identification - Fuzzy curves and fuzzy surfaces, Fuzzy Sets and Systems, 82, 65 (1996).
3. J. J. Koenderink and A. J. van Doorn, The Structure of Locally Orderless Images, International Journal of Computer Vision, 31(2/3), 159 (1999).
4. V. Patrascu and V. Buzuloiu, Modelling of Histogram Equalisation with Logarithmic Affine Transforms, in Recent Trends in Multimedia Information Processing, World Scientific Press, Proceedings of the 9th International Workshop on Systems, Signals and Image Processing, IWSSIP'02, 312, Manchester, UK (2002).
5. V. Patrascu, Color Image Enhancement Using the Support Fuzzification, in Fuzzy Sets and Systems - IFSA'03, LNAI 2715, Springer-Verlag, Proceedings of the 10th International Fuzzy Systems Association World Congress, 412, Istanbul, Turkey (2003).
6. Z. Rahman, D. J. Jobson and G. A. Woodell, Retinex processing for automatic image enhancement, Human Vision and Electronic Imaging VII (Eds. B. Rogowitz, T. Pappas), Proc. SPIE 4662, 390 (2002).
7. http://dragon.larc.nasa.gov/retinex/
GIS AND SATELLITE IMAGE PROCESSING FOR THE STUDY OF HUMAN IMPACT ASSESSMENT ON STEPPE DEVELOPMENT IN UZBEKISTAN
ISKANDAR MUMINOV
National University of Uzbekistan, Institute of Applied Physics, Tashkent 700174
E-mail: muminof@mbcc.com.uz
JOSEF BENEDIKT
GEOLOGIC Dr. Benedikt, Lerchengasse 34/3, A-1080 Vienna
E-mail: [email protected]
The availability of satellite imagery and GIS databases enhances the evaluation of human impact on changing ecosystems. The study area was not selected by accident: it has been cultivated by people for a very long time, starting from the Palaeolithic age, and it is a trans-border zone between settled and nomadic cultures. The impact of extensive human activities on changes in land use is described by landscape features in a GIS layer-based environment using several physical characteristics (vegetation, soils, agricultural use). MapModels will be used to develop scenarios of anthropogenic impact on the process of increasing steppe development. The use of fuzzy set based logical generalizations in modeling geographical phenomena is argued to improve human impact assessment as well as to use GIS as an advanced decision support tool to promote regional development. A knowledge-driven approach is suggested as a working environment to further enhance environmental risk evaluation. Necessary future steps are listed.
1 Introduction
In addition to the physical conditions, a lot of the land use development in Uzbekistan is due to human impact on the environment. The paper gives a geographical description of the general problems of human impact on the degradation of land in the Nuratau area and adjacent regions in Uzbekistan, quantifies the changes in the environment and describes a possibility of assessing human impact on nature in a future MapModels application.
1.1 Geographic Description Of Historical Development
To study human impact on the environment thoroughly, historical and archaeological data have to be considered. We would like to emphasize that the historic data on the Study Area are of a general nature; they are by far incomplete and need correction and completion, which will be investigated in the future [1]. "Agricultural activity of people gradually creates artificial ecosystems, so called agrocoenoses, which live by their own laws: to be sustainable they need permanent targeted human labor: they are unable to exist without intervention" (Moiseev, quoted in [2]). During the last decades of the 20th century many ecosystems of the region became destroyed because of extensive agriculture, industrial development and the growth of large cities, and the population of the area began to grow rapidly as well. Rich pastures attracted more and more cattle breeders, but excessive pasturing led to degradation of the grass cover and caused soil erosion. Almost all juniper forests and pistachio and almond bushes on the mountain slopes were cut off. Isolated specimens of juniper still remain only in the most godforsaken ravines among inaccessible rocks. Most factors threatening natural complexes are of human origin, namely fires, pasturing, fruit collection, harvesting of mulberry-tree leaves and uncontrolled melioration. Cows are pasturing constantly in planted forest belt areas, and sheep and goats continue to be driven massively into the mountain areas (gullies). Overgrazing, typical of the lower mountainous area and of territories close to villages, leads to sharp degradation of the vegetation and acute soil erosion. Harvesting of hay and cutting of trees in flood-lands are more local, and their negative impact is not so obvious [1]. Today, Nuratau and adjacent territories are being intensively used as pastures, as well as for growing forage and grain crops where possible; people grow gardens and keep vineyards. People annually sow wheat and partially barley on large areas both in the Northern and in the Southern piedmonts. Irrigated lands are mainly used for cotton; boghara lands are used for wheat growing, gardening and vineyards.
1.2 The Study Area
A Digital Elevation Model (DEM) was created using the topographic map at the scale of 1:500 000 to provide a better perspective of the Study Area. A satellite image (250 m resolution) was superimposed on the DEM (Fig. 1).
Fig. 1: Aidarkul and Tuzkan Lakes (1, 2); Farish Steppe (3); Nuratau Mountain Ridge (4); Nuratau Valley (5); Aktau Mountain Ridge (6)
The northern slopes of the Nurata Mountain ridge are directed towards the Kizilkum Desert. In watery periods, there are many gullies with spring waters. Due to the latter, settlements appeared in and close to the ravines, where people cultivate land and breed cattle. In water scarcity periods, many gullies get dry and there is hardly enough water to irrigate the gardens; water practically does not reach the valley. The width of small rivers in watery years is 2-3 meters as a rule, with depths of tens of centimeters in some places, whereas in years of water scarcity the width of small rivers is hardly 30-40 centimeters. For example, this situation was observed in the Hayat sai area in 2001, when there was a considerable drought. In 2002, a year with lots of rain, the landscape had a completely different look. By visual interpretation of satellite images made in 1989, 1998, 2002 and 2003, acreage indicators were obtained for the Farish Steppe of the Nurata State Reserve. It has to be noted that the GIS layers are inaccurate due to the general lack of metadata at this point, resulting among others in inaccuracies in spatial data referencing. The following quantities in land use changes were derived from visual interpretation of digital imagery and GIS layers using the Desktop GIS MapInfo: in 1989, the total acreage of agricultural lands in the study area, excluding pastures in the Farish Steppe, comprised approx. 36,000 hectares; in 1998 it was approx. 50,000 hectares; and in 2002 approx. 60,000 hectares were cultivated. The area of flood-land vegetation in the Nurata State Reserve comprised 2310 hectares in 1989 and 2148 hectares in 1998; during the last decade flood-land vegetation was thus reduced by 162 hectares. Changes in the defined water area of the Aidarkul and Tuzkan lakes were also calculated using satellite imagery analysis
made at different points of time in the period from 1989 to 2003. If we consider 1989 as a reference point, the largest defined area of water was in 2002, comprising approx. 3,000 km² in April. For the same period in 2003, the defined area of water of these lakes was approx. 2,900 km².
2 Human Impact on Steppe Development in Uzbekistan
The term "landscape" refers to a homogenous territory defined by its origin and history of development. A landscape is characterized by a homogenous geologic foundation, relief of the same kind, common climate, a common combination of hydrothermal conditions, soils and biocenoses, and consequently a similar set of simple geocomplexes. Every specific landscape is a genetically integrated, dynamic geosystem with vertical and horizontal links. Every landscape is in constant interaction with the surrounding landscapes through the exchange of substances and energy. Every landscape is unique in space as a geographic individual, at the same time being part of some typological whole. Major problems caused by human impact on the landscape in the study area include:
Detriment and halting of the renewal of forest and vegetation cover of mountain valleys, flood-lands and mountain slopes, and increased wash-out of soil, due to excessive and mismanaged pasturing of livestock in the areas of the Nuratau and Aktau Mountain Ranges;
Deforestation of mountain slopes as a result of a long-time practice of cutting trees and bushes for fuel in all mountain areas of the study area;
Pasturing of livestock in the territories of natural reserves, causing overgrazing and the possibility of infection transmission from domestic animals to wild animals, in particular Severtsev's ram, which is registered in the Red Book;
High pasture load near villages and uneven utilization due to the pasturing of thousands of sheep, goats and cattle. Almost the entire piedmont zone, and partially the slopes, are constantly under pasture load; within a radius of 1 km from settlements 80% of the vegetation has been destroyed. In the piedmont zone, where most of the population is concentrated, pastures are 7-8 times overloaded, even more in the Nuratau and Aktau areas;
Overgrazing, degradation of vegetation, baring and compacting of soil in the piedmont plain, especially near water (wells, artesian wells); in the Farish Steppe area this is well visible on the satellite image;
Acute change and transformation of the vegetation cover because of intensive cultivation and irrational expansion of pasturing.
A Geographic Information System seems suitable to address the multi-layered question of human impact assessment. Many Desktop GIS systems, however, do not take uncertainty into account other than by quantifying errors in data and information. In this work we attempt to include local geographical knowledge and to provide means, as GIS extensions based on fuzzy sets and logical extensions, to adequately model some of the issues described above.
2.1 Uncertainty and Human Impact Assessment
Humans act irrationally; that is, we are not aware of all the facts and data available when making decisions. Our evaluation is not scientific in the sense that it is not always possible to estimate probabilities before making decisions. Human actions do not follow the logic of a computer system or binary decision making: social, perceptional, linguistic and other elements have an impact on human decisions. Traditional computer-based models do not reflect the complexity of the human mind in, for example, shaping a landscape. A generalization of logical concepts may be more suitable for addressing the vast field of human decision making. Soft Computing techniques are tolerant of errors and focus on generalized formal concepts, in the way that Fuzzy Logic extends logical axioms. Throughout the so-called "age of information" knowledge itself has not improved, but the possibilities to handle uncertainty have gained tremendous importance.
3 MapModels - A GIS modeling language
MapModels is a flexible tool for explorative spatial data analysis and is deemed suitable for modeling complex decision making. It has been developed at the Regional Science Institute of the Vienna University of Technology with the intention to bridge the gap between spatial decision analysts and computer programmers [3]. MapModels is a visual programming language based on the widespread desktop GIS ArcView 3. It supports the development and implementation of analysis procedures based on flowchart representations in a very intuitive and user-friendly manner. It is particularly suited for extended decision making with fuzzy set modeling of geographical notions [4]. Flowcharts are used for the visualization of models and analysis processes in a wide range of applications. Normally this kind of graphic representation is simply focused on the illustration of the model structure and information flow, but does not directly control the underlying processes.
Within MapModels the nodes of a flowchart are in fact active elements of the model. They provide a visual encapsulation of real analysis procedures and data objects where input data and analysis operations are represented by labeled icons connected by edges which characterize the dataflow (Fig. 2).
Fig. 2. A simple spatial query: "find all relatively flat areas with an elevation higher than a given threshold" (center: MapModel; left: slope)
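MapModels expresses this query as a flowchart of active nodes; outside ArcView, the same logic can be written in a few lines. The sketch below is only an illustration of the Fig. 2 query and of the fuzzy generalization argued for in section 2.1 (all names and thresholds are made up).

```python
import numpy as np

def flat_high_crisp(elevation, slope, min_elev, max_slope):
    """Crisp version of the Fig. 2 query: a boolean mask of raster cells
    that are relatively flat and higher than the elevation threshold."""
    return (elevation > min_elev) & (slope < max_slope)

def flat_high_fuzzy(elevation, slope, e0, e1, s0, s1):
    """Fuzzy variant: 'high' and 'flat' become graded memberships with
    soft thresholds, combined with min as the AND operator."""
    high = np.clip((elevation - e0) / (e1 - e0), 0.0, 1.0)
    flat = np.clip((s1 - slope) / (s1 - s0), 0.0, 1.0)
    return np.minimum(high, flat)
```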
Since MapModels flowcharts contain executable code, the specification and the implementation of the analysis model are just one single step. It is a kind of drawing process where flowchart elements are inserted into the model environment and connected by means of drag-and-drop operations with the mouse [5].
4 Outlook
The text reports on work in progress and gives an idea of the situation and of a possible methodology for the data collected in this area. Although there are no results yet in assessing human impacts within a GIS-based Decision Support System, the authors intended to give the reader a description of the necessity of using such tools and of the promising methodology of generalizing classical methods for handling uncertainty and semantic knowledge in the field of land use development. The main topics to be addressed, which are of particular importance in working on land use development in Uzbekistan and map modeling, are:
Develop a model to assess human activity;
Create a database with elements of electronic maps on natural and human factors; this database should include numeric and descriptive data and knowledge about the condition of the environment, as well as the social and economic situation;
Study possibilities of using wind technologies and solar engineering as replacements for existing sources of power, in order to preserve the trees and bushes in the study area from being cut for fuel;
Sow perennial forage crops in the piedmont plain in cattle water-access and driving areas to reduce desertification;
Hold regional seminars to discuss existing issues of environment protection, its assessment, and ways to address these issues, and to develop practical recommendations to preserve and rationally use the natural environment.
Remark/Acknowledgment
This work is a report on parts of Mr. Muminov's thesis project, who provided field data, satellite imagery and local knowledge to be integrated within a GIS modeling environment.
References
1. The German Environmental Protection Union, Nurata Mountains and south-west part of the Kizilkum Desert, Nuratau Report (1996), pp. 125-126.
2. Muminov I., To a problem of compiling of a modern landscape map, Geography and Values, National University of Uzbekistan, Tashkent Workshop abstracts (2001) (Russian only).
3. Riedl L., Vacik H. and Kalasek R., MapModels: a new approach for spatial decision support in silvicultural decision making, Computers and Electronics in Agriculture 27 (2000), pp. 407-412.
4. Benedikt J., Reinberg S. and Riedl L., Vague Geographical Knowledge Management - A flow-chart based application to spatial information analysis, in: R. De Caluwe, G. De Tré, G. Bordogna (eds.), Flexible Querying And Reasoning In Spatio-Temporal Databases: Theory And Applications (2004, in press).
5. Benedikt J., Reinberg S. and Riedl L., A GIS application to enhance cell-based information modelling, Information Sciences 142 (2002), pp. 151-160.
THE HAAR WAVELETS IN A FUZZY SYSTEM FORM AND AN APPLICATION TO THE JOHNSON NOISE THERMOMETRY
B. S. MOON, I. K. HWANG, C. E. CHUNG, K. C. KWON Korea Atomic Energy Research Institute, 150 Dukjin-Dong, Yusong-Ku, Daejeon, 305-353, Korea, Rep. of E-mail: [email protected] We describe how the multi-resolution analysis using the Haar wavelets and the corresponding discrete wavelet transformation change when the approximate functions involved are represented by fuzzy systems. Using the scaling function of the Haar wavelet or the step function as the input fuzzy sets, we prove that the fuzzy system representation of the intermediate approximate functions differ from the corresponding functions in the MRA of the Haar wavelet by O(h) where h is the length of the subintervals. Thus, the fuzzy system representation of the functions are identical with those of the Haar wavelet when it is applied to the gray scale image or when the manipulations of the function values are performed in integers with a fixed number of bits. We prove that the approximation based on the Haar wavelet picks up the trend curve when it is applied to a band pass filtered sinusoidal signal. An example to determine the trend of the temperature change curve from the power of the Johnson noise signal is included.
1. Introduction
It is well known [1,2] that a continuous function can be approximated by a fuzzy system within an arbitrary accuracy on any bounded set. One of the advantages of using a fuzzy system representation is that in fuzzy systems not only the independent variables but also the dependent variable is approximated. The approximation can be done in such a way that no accuracy is lost beyond that of the corresponding crisp approximation. Note that gray scale images, or functions with A/D converted values, can be considered to be already in a fuzzy system form. For the gray scale images, the row number and the column number of each pixel may be taken as input fuzzy set numbers, and the gray scale value in each pixel, ranging from 0 to 255, can be taken as the output fuzzy set number. Thus, every gray scale image in its original form can be
Figure 1. The scaling function of the Haar wavelet φ(2^k·t − l)
considered as a fuzzy rule table. For functions obtained by sampling, with their values from an n-bit A/D converter, we take the sample numbers as the input fuzzy set numbers and the n-bit numbers in their integral form as the output fuzzy set numbers. Note that there will be 2^n output fuzzy sets in this case. In the following, we will consider the multi-resolution analysis (MRA) of a function based on a scaling function as a successive approximation of the function, and similarly we will consider the discrete wavelet transformation (DWT) of a function as a successive approximation based on the wavelet. Recall that all of the functions used in these approximations can be represented or approximated by the corresponding fuzzy systems, and hence we expect that the MRA and DWT in a fuzzy system form should be possible.
2. The Haar Wavelet in a Fuzzy System Form
The Haar wavelet decomposition of a continuous function f(t) on [0,1] is defined [3] as follows. Let x_i = ih, i = 0, 1, 2, ..., 2^n, where h = 2^−n, and let f_k(t) take the constant value f_{k,l} on the l-th subinterval, for l = 0, 1, 2, ..., 2^k − 1, so that the function f_k(t) is constant on every subinterval of length 2^−k starting from 0. Let the scaling function of the Haar wavelet be
φ(t) = 1 if 0 < t ≤ 1, and φ(t) = 0 otherwise;

then the function f_k(t) can be written as

f_k(t) = Σ_{l=0}^{2^k − 1} c(k,l)·φ(2^k·t − l)    (4)

with c(k,l) = f_{k,l}, and it is routine to check that the multiresolution relation [3] becomes

c(k−1, l) = (1/2)·(c(k, 2l) + c(k, 2l+1))    (5)

For the wavelet transformation, we define
g_{k−1}(t) = f_{k−1}(t) − f_k(t)    (6)

and the Haar wavelet ψ(t) by ψ(t) = φ(2t) − φ(2t−1). Then, we can write

g_{k−1}(t) = Σ_{l=0}^{2^{k−1} − 1} d(k−1, l)·ψ(2^{k−1}·t − l)    (7)

where the coefficients d(k,l) satisfy

d(k−1, l) = (1/2)·(c(k, 2l+1) − c(k, 2l))    (8)

To define the wavelet decomposition in a fuzzy system form, let C(n,l) = f_{n,l} for l = 0, 1, 2, ..., 2^n − 1, and for k = n−1, n−2, ..., 1, let

C(k−1, l) = [(1/2)·(C(k, 2l) + C(k, 2l+1))]    (9)

for l = 0, 1, 2, ..., 2^{k−1} − 1, where we used the Gauss bracket [·] to obtain integer values. Similarly, for the discrete wavelet transformation we define

D(k−1, l) = [(1/2)·(C(k, 2l+1) − C(k, 2l))]    (10)

Note that C(k,l) are integers for all k and l, and that the sequence {C(k,0), C(k,1), C(k,2), ..., C(k, 2^k − 1)} is the fuzzy rule table for a function f̃_k(t) which is an approximation of f_k(t). In the following, we will show that the error due to this approximation is not very large and that it is controllable.
Lemma 2.1. Let f_{n,l} be the integral values of a function as defined above and let c(n,l) = f_{n,l} for l = 0, 1, 2, ..., 2^n. If c(k,l) and C(k,l) are defined as in (5) and (9) respectively, then we have |C(k,l) − c(k,l)| ≤ (n−k)/2.

Theorem 2.1. Let f̃_k(t) = Σ_{l=0}^{2^k} C(k,l)·φ(2^k·t − l) and let f_k(t) be as defined in (4). Then we have |f̃_k(t) − f_k(t)| ≤ (n−k)/2.
Proof. Note that the sum in f̃_k(t) − f_k(t) = Σ_{l=0}^{2^k}(C(k,l) − c(k,l))·φ(2^k·t − l) is a direct sum of functions whose supports are disjoint. Hence, we have max|f̃_k(t) − f_k(t)| = max{|C(k,l) − c(k,l)| : l = 0, 1, ..., 2^k}, and the latter is less than or equal to (n−k)/2 by Lemma 2.1. Q.E.D.
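A compact sketch of the integer decomposition of equations (9)-(10) in Python; Python's floor division plays the role of the Gauss bracket, so every coefficient stays an integer fuzzy set number (names are illustrative).

```python
def fuzzy_haar_decompose(samples):
    """Integer Haar decomposition per eqs. (9)-(10).

    samples: 2**n integer values C(n, l), e.g. 8-bit pixel values.
    Returns the approximations C(k, .) for k = n, n-1, ..., 0 and the
    detail coefficients D(k, .) of each level."""
    C = list(samples)
    approximations, details = [C], []
    while len(C) > 1:
        D = [(C[2*l + 1] - C[2*l]) // 2 for l in range(len(C) // 2)]  # eq. (10)
        C = [(C[2*l] + C[2*l + 1]) // 2 for l in range(len(C) // 2)]  # eq. (9)
        details.append(D)
        approximations.append(C)
    return approximations, details
```

By Theorem 2.1, after n−k halving steps these integer coefficients differ from the exact MRA coefficients by at most (n−k)/2, which is negligible for 8-bit gray scale data.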
3. Application to the Johnson Noise Thermometry
Johnson noise is the temporally fluctuating electrical current I(t) in an electric circuit of resistance R and self-inductance L at absolute temperature T. The Johnson noise can be expressed by the following formula [4,5], based on the assumption that the many complex interactions between the conducting electrons and the thermally vibrating atomic lattice of the wire produce a thermal electromotive force v(t) in the circuit:

L·dI(t)/dt + R·I(t) = v(t)    (11)

This can be rewritten with τ = L/R as

dI(t)/dt + I(t)/τ = v(t)/L    (12)

Note that I(t) is an Ornstein-Uhlenbeck process and v(t) can be written as

v(t) = √c·Γ(t)    (13)

where Γ is a Gaussian white noise and c = 2kTR [4]. Therefore, we have ⟨v(t)²⟩ = 2kTR, from which we find that the temperature T is proportional to v(t)².
Figure 2. Comparison of the Haar approximation and the fuzzy Haar approximation
In the following, we prove that the approximation of a sinusoidal signal in a fixed frequency band by the 4th step of the multi-resolution analysis using the Haar wavelet reduces the amplitude of the oscillation to less than 15% of the original.
Lemma 3.1. Let m and N be positive integers such that N is divisible by 256, i.e. N = 256N1. If 3/16 ≤ m/N < 1/2 and if m = m0N1 + m1 where 0 ≤ m1 < N1, then we have 48 ≤ m0 ≤ 127.

Proof. From 3/16 ≤ m/N < 1/2, we have 3·2^4·N1 ≤ m0N1 + m1 < 2^7·N1. Hence, by dividing by N1, we get 3·2^4 ≤ m0 + m1/N1 < 2^7. Now, note that 0 ≤ m1/N1 < 1, and hence 3·2^4 ≤ m0 + m1/N1 if and only if 48 ≤ m0. Q.E.D.
Next, we define, for l = 0, 1, 2, ..., S(l) = (1/16)·Σ_{k=1}^{16} k^l·Sin(2πm0k/256) and C(l) = (1/16)·Σ_{k=1}^{16} k^l·Cos(2πm0k/256); then by direct calculations we have the following.

Lemma 3.2. If S(l) and C(l) are as defined above, then for all integers m0 with 48 ≤ m0 ≤ 127, we have |S(0)|, |C(0)| ≤ 0.1; |S(1)|, |C(1)| ≤ 1.0; |S(2)|, |C(2)| ≤ 15; |S(3)|, |C(3)| ≤ 250; and |S(4)|, |C(4)| ≤ 4000.

Lemma 3.3. If 0 ≤ x < 0.5, then Σ_{l=5}^{∞} x^l/l! ≤ (x^5/5!)·(1 + 3x).

Theorem 3.1. If m and N are positive integers with N = 256N1 and if 3/16 ≤ m/N < 1/2, then we have |(1/16)·Σ_{k=1}^{16} Sin(2πmk/N)| ≤ 0.15.

Figure 3. Transient Johnson noise power with wavelet approximation

Proof. Let m = m0N1 + m1 and write Sin(2πmk/N) as Sin(αk + βk), with α = 2πm0/256 and β = 2πm1/(256N1). Note that we have 48 ≤ m0 ≤ 127 by Lemma 3.1, and, since 0 ≤ m1 < N1, that β < 2π/256 and hence βk < π/8 ≤ 0.5 for k = 1, ..., 16. Now, Sin(αk + βk) = Sin(αk)·Cos(βk) + Cos(αk)·Sin(βk), which is equal to Sin(αk)·Σ_{l≥0} (−1)^l (βk)^{2l}/(2l)! + Cos(αk)·Σ_{l≥0} (−1)^l (βk)^{2l+1}/(2l+1)!. Therefore, we can write |(1/16)·Σ_{k=1}^{16} Sin(2πmk/N)| ≤ |S(0)| + β·|C(1)| + (β²/2!)·|S(2)| + (β³/3!)·|C(3)| + (β⁴/4!)·|S(4)| + R, where R collects the tail terms of the two series. Now, by using Lemma 3.2 and Lemma 3.3, it is routine to check that the last sum is less than or equal to 0.15. Q.E.D.

Corollary 3.1. If f(t) is a signal with mean 0, sampled at a rate of N samples per second and band-pass filtered with lower limit greater than (3/16)N and upper limit frequency below N/2, then the maximum amplitude of the 4th-step Haar approximation of f(t) is below 15% of the original amplitude.
Figure 3 shows an example Johnson noise signal power and its approximation, where the oscillating curve with the larger amplitude is the original and the curve in the middle is the approximation.
4. Conclusion
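Corollary 3.1 is easy to verify numerically: the 4th-step Haar approximation is simply the sequence of averages over blocks of 16 samples. The check below is illustrative code (not from the paper), with an arbitrary N1 = 4 and an arbitrary frequency stride.

```python
import numpy as np

# Sinusoids with frequency in [(3/16)N, N/2) keep less than 15% of their
# amplitude after the 4th-step Haar approximation (block averages of 16).
N = 256 * 4                                  # N = 256*N1 samples per second
t = np.arange(N)
for m in range(3 * N // 16, N // 2, 37):     # frequencies inside the band
    x = np.sin(2 * np.pi * m * t / N)
    approx = x.reshape(-1, 16).mean(axis=1)  # 4th-step Haar approximation
    assert np.abs(approx).max() <= 0.15
```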
We have shown that the multi-resolution analysis of functions and the corresponding discrete wavelet transformation based on the Haar wavelet can be represented by fuzzy systems. We have also shown that the Haar wavelet approximation can be used to determine the transient temperature from the power of the Johnson noise signal without time delay. For a further study, it is left to prove that Theorem 3.1 is true even when the sample rate N is not a multiple of 256.
References
1. B. S. Moon, A Practical Algorithm for Representing Polynomials of Two Variables by Fuzzy Systems with Accuracy O(h^4), Fuzzy Sets and Systems 119(2), 321 (2001).
2. J. L. Castro and M. Delgado, Fuzzy systems with defuzzification are universal approximators, IEEE Trans. Systems Man Cybernet. 26(1), 149 (1996).
3. Raghuveer M. Rao and Ajit S. Bopardikar, Wavelet Transforms: Introduction to Theory and Applications, Addison-Wesley Longman (1998).
4. D. T. Gillespie, A Mathematical Comparison of Simple Models of Johnson Noise and Shot Noise, J. Phys.: Condens. Matter 12, 4195-4205 (2000).
5. D. T. Gillespie, The Mathematics of Brownian Motion and Johnson Noise, Am. J. Phys. 64(3), 225-240 (1996).
NEURAL NETWORK BASED SONAR TARGET DIFFERENTIATION
W. S. LIM, M. V. C. RAO, C. K. LOO
Center of Robotics and Automation, Faculty of Engineering and Technology, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia. Email: [email protected]
In this paper, the stability-plasticity behaviors of the Minimal Resource Allocation Network (MRAN) and the Probabilistic Neural Network (PNN) in differentiating targets by means of sonar signals are investigated. In the experimental study, MRAN is shown to have lower network complexity but higher plasticity than PNN. In terms of on-line learning performance, MRAN also proves to be superior to PNN.
1. Introduction
Pattern classification has become an important topic for robotics research in many applications [1]. Such classifiers are capable of predicting the shape of objects or obstacles surrounding the robot by processing the input data received from numerous types of sensors or detectors on the robot. Generally, different objects with different curvatures reflect the signals at different angles and intensities. Sonar signals supply the distance between the target and the detector, which serves as the input to the neural networks. In this paper, we investigate the use of neural networks in processing the sonar signals reflected by different targets and in localization applications for indoor environments. It describes how the MRAN algorithm performs pattern classification of various targets, and compares its performance with the Probabilistic Neural Network (PNN). The robustness and plasticity of the MRAN neural network are tested and compared. The comparison of results shows that MRAN has a high level of plasticity that deteriorates its capability of maintaining the neuron weights of previously encountered patterns. Hence it is more suitable for online learning, where the weights change rapidly according to the online received data.
2. Target Classification with Neural Networks
The target differentiation algorithm used in earlier works like [2] is reviewed; it gives useful ideas on how to differentiate targets by means of their shapes and radii of curvature. In this work, two types of neural network algorithms are used, namely the Minimal Resource Allocation Network (MRAN) and the Probabilistic Neural Network (PNN), for classifying three primitive targets. The target primitives modeled in this study are wall, corner, and edge (Figure 1). The MRAN and PNN used in this work have one input, one hidden and one output layer. For MRAN, 6 input nodes and 3 output nodes are employed, whilst the number of hidden neurons varies with each training data set. For PNN, there are also 6 input nodes but only 1 output node. The number of hidden nodes in PNN depends on the number of training data [5].
Figure 1. Cross sections of the target primitives (wall, corner, edge) differentiated in this work.
Figure 2. Sensitivity region of an array of ultrasonic transducers in P2AT robot.
2.1. Minimal Resource Allocation Network (MRAN)
In MRAN, the network begins with no hidden neurons. As each training data pair (input and output) is received, the network builds itself up based on the two growth criteria, equations (1) and (2):

‖x_i − μ_nr‖ > ε_i    (1)
‖e_i‖ = ‖y_i − f(x_i)‖ > e_min    (2)

where μ_nr is the center (of the hidden unit) which is closest to x_i (the input received), e_i is the calculated error, i.e. the difference between the output received, y_i, and the network output f(x_i), and ε_i, e_min are thresholds to be selected appropriately. The algorithm adds new hidden neurons or adjusts the existing network parameters according to the training data received. The algorithm also incorporates a pruning strategy that is used to remove hidden neurons that do not contribute significantly to the output. The simulation results in [10] show that the algorithm is able to build up a network that can perform equalization better than several existing methods. Details on the classical PNN are not further explained here; more information can be found in [3].
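The growth test of equations (1)-(2) is easy to state in code; the sketch below only checks whether a new hidden unit should be allocated, leaving out the EKF parameter update and the pruning step of the full MRAN algorithm (names are illustrative).

```python
import numpy as np

def should_allocate(x, y, centers, predict, eps_i, e_min):
    """MRAN growth criteria, eqs. (1)-(2): grow only if the input is far
    from every existing center AND the prediction error is large."""
    far_from_centers = (len(centers) == 0 or
                        min(np.linalg.norm(x - c) for c in centers) > eps_i)
    large_error = np.linalg.norm(y - predict(x)) > e_min
    return far_from_centers and large_error
```

If only one of the two conditions holds, the existing weights and centers are adapted instead, which is the frequent updating that the paper later identifies as the source of MRAN's high plasticity.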
424
transducers on the front side of the robot were utilized as shown in Figure 2. Each transducer can operate both as transmitter and receiver and detects echo signals reflected from targets within its own sensitivity region. The echo signals, which indicate the distance of the target from the transducers, are collected at 200 locations for each target, by positioning the robot from r = 102.5 mm to r = 600 mm in 2.5 mm increments, which gave 600 training data (3 target types x 200 data each) in total for training purposes.
Figure 3. Amigobot simulation software
Three training methods are employed to judge the robustness and plasticity of both networks. Firstly, the networks are trained in a way that it received data collected from one target type, the second and then followed by the third target type in sequence. In the second method, they are trained by randomly mixed data of all three target types. In third training method, both networks are trained in a similar way to the first method, but it is extended further by repeating the first target type training again in sequence. After the training process, the networks are tested to investigate their robustness and stability for targets situated at new distances from the robot. For example, the robot was located at a distance of r = 105 mm to r = 602.5 mm in 2.5 mm increments. It is basically the space between two side-by-side original locations previously used in collecting the training data, thus providing 600 new data (3 target types x 200 data each) for testing the neural network performance. The same testing data used to test h4RAN was applied again to test the network performance in PNN. As stated earlier, there are 3 output nodes for the MRAN network. Each of these output nodes corresponds to each type of target. For simplification of calculation, the three outputs are scaled to yield only the value of 0 or 1. Then we employed the “winner takes all” method for selecting the largest value among the three outputs to be a “1” while the other two remain as “0’. The output node
425 with a “1” indicates the type of target detected. This in turn will give every test result in combination of 0 and 1 only (001,010 or 100).
3. Comparative Analysis and Discussion Both networks are compared on their accuracy of estimating the correct target, namely wall, corner and edge. As discussed in the previous section, the data used for testing the networks are different from the data used for training. This is to further ensure that the robustness and stability of the networks are tested. The numerical values in Table 1 show the percentages of correct target-type classification.
Table 1. Comparison of MRAN and PNN with the three training methods (percentage of correct target-type classification).

                      MRAN                        PNN
                wall    corner  edge        wall    corner  edge
Method 1        37.5%   55.3%   100%        98.0%   93.8%   97.5%
Method 2        65.8%   72.4%   48.0%       98.0%   93.8%   95.6%
Method 3        100%    52.1%   70.7%       98.0%   93.8%   95.6%
426
completely clearing the existing network memory and begin with a new training set including information about current changes.
3.1. MRANperformance study The results from MRAN show that perfect classification (100%) is obtained for the object / target-type that is trained last in the sequence, i.e. for Methods 1 and 3. This implies the network learns very quickly and able to classify accurately on what it has just learned. In Method 1, edge is the last batch of training data received by the network, and edge is the only target accurately classified whereas for wall and corner, the network failed to classify them accurately. The same case goes for Method 3, where wall data is repeated in the training process, and as expected, wall has the perfect classification percentage this time. In Method 2, the network classifies poorly in overall when the training data is randomly mixed. The number of hidden neurons generated in MRAN for each training is in the range of 15 to 20. Generally the algorithm starts from null state with zero number of neurons. While learning occurs, neurons are incrementally allocated into hidden layer according to the two criteria given by equation (1) and (2). If the two criteria are simultaneously satisfied, a neuron would be created and associated with the network to diminish the errors from unanticipated vector. If the two criteria are not satisfied, instead of using neuron allocation strategy to minimize the errors, MRAN will adapt the Extended Kalman Filter algorithm to optimize the estimation by gradually repairing the errors on the connection weights and neuron centers [4]. Hence the frequent updating of weights and centers would cause the network to be unstable and high in plasticity.
3.2. PNNperformance study On the other hand, the results from PNN show that the highest classification percentage is from wall, followed by edge and then corner. Generally, the classification percentage of all targets achieved the satisfactory level (above 90%).The number of hidden neurons generated in PNN for each of the training is depending on the number of training data [5]. In this case, the number of neurons generated is approximately 600 which is equivalent to the sample size used to train both networks. Hence, the training process time is much longer, and a large amount neurons generated also adds to the complexity of the network. However, PNN does not suffer from these unstable and plasticity problems because generally it creates a separate neuron for each training sample. These generated neurons are neither pruned nor adjusted in the training process, hence the weights of every neurons are maintained. For moderately sized databases this
427 is not a problem, but unfortunately it will be a major drawback for large databases and applications where it deteriorates the speed and consequently increases the complexity of the network. 4.
Concluding remarks
The percentage of the correct target type classification is high at 100% for the last target trained by the MRAN for all the three training methods. This shows that the target type which is trained most recently will give a high classification accuracy. MRAN can be said to have high plasticity and furthermore it is unstable. In other words, the weights carried by the hidden neurons during training process change too rapidly, making the network tend to “forget” what it learned previously. No doubt, it learns very quickly, but it only “remembers” what is trained most recently. Hence MRAN is not suitable in pattern classification, but it is extremely useful and efficient for online learning. The advantage of M R A N network is its low complexity due to small number of hidden neurons generated during the training. PNN shows to be a more stable network than MRAN but the disadvantages are, it requires longer time to be trained due to the large number of hidden neurons, high network complexity, and its learning speed is slower than MRAN. Hence PNN is more suitable as a pattern classifier than MRAN.
References 1. R. P. Lippman, IEEE ASSP Mag., pp. 4 - 22, Apr. 1987. 2. B. Barshan, B. Ayrulu, and S . W. Utete, IEEE Trans. Robotics and Automation, vol. 16, pp. 435 - 442, August 2000. 3. D.F. Specht, IEEE Conf., Neural Net., Vol. 3, no. l., pp. 109-118, 1990.M. 4. Chan, C. Fung, IJCNN ‘99. IEEE International Joint Conference, Volume: 3, 10-16,July 1999, pp. 1554 -1559 ~01.3. 5. Donald F. Specht, IEEE Conf, Volume:1,7-l1,June 1992,pp.761-768, vol. 1. 6. B. L. Pulito, T. R. Damarla, S . Nariani, Neural Networks, 1990 IJCNN International Joint Conference on 17-21 June 1990, pp. 825 -833, v01.2. 7. J. P. Albright, 1994. IEEE World Congress on Computational Intelligence., 1994 IEEE International Conference, Volume: I , 27 June-2 July 1994, pp. 498 -502, ~ 0 1 . 1 . 8. Rok Rape, Dusan Fefer, Janko Drnovsek. Time Series Prediction with Neural Networks, IMTC’94, IEEE, May 1994. 9. Ben Jacobsen. Time Series Properties of Stock Returns, Kluwer Bedrijfshformatie, 1997. __ . 10. Deng Jian Ping, Narasimhan Sundararajan, and P. Saratchandran, IEEE Trans. Neural Networks, vol. 13, pp. 687 - 696, May 2002.
SYSTEMATIC DESIGN OF A STABLE FUZZY CONTROLLER FOR A ROBOTIC MANIPULATOR USING DESCRIBING FUNCTION TECHNIQUE
EVREN GURKAN
Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA
E-mail: evreng@metu.edu.tr
Our aim in this paper is to develop a systematic method for the design of stable fuzzy controllers using the describing function technique. A given multi-input single-output fuzzy system is first reduced to a set of single-input single-output fuzzy systems using the additivity property. Then, analytical expressions for the describing functions of each single-input single-output system are evaluated. The describing function of a fuzzy system is an interval-valued function; therefore, robust control results are used for the stability analysis. We have applied the theoretical results to the single-axis control of the Phantom haptic interface robotic manipulator.
1. Introduction
The stability analysis of fuzzy systems becomes an important issue when they are used as controllers. Thus, finding systematic ways to perform the stability analysis is a major interest in the literature. The existing approaches are reviewed in Sugeno [12] and Kandel [8]. The methods can be grouped into Lyapunov methods [12, 16, 4], robust stability analysis [7, 11, 10], adaptive fuzzy controller design [15, 13, 14], and frequency domain methods [9, 5, 1]. The describing function method is used in [9], [5] and [1]. Kim et al. [9] derive analytical expressions for the describing functions of a fuzzy system with a single input, and of a fuzzy system with two inputs where the second input is the derivative of the first. The existence of a limit cycle of the fuzzy control system is predicted using the describing function analysis. In [1], the describing function method is used to analyze the behavior of PD and PI fuzzy logic controllers, and the existence of stable and unstable limit cycles is predicted. A describing function analysis of a T-S fuzzy system is done in [5],
where the describing function is evaluated experimentally, and the existence of multiple equilibria and of limit cycles is examined. In this paper, we develop a systematic design method for fuzzy controllers using the describing function technique. We first apply the additivity property of fuzzy systems to multi-input single-output fuzzy systems to decompose them into a sum of single-input single-output fuzzy systems. The analytical expressions of the describing functions of these systems can then be calculated using the methods in [9]. The describing function obtained in this way is interval-valued, so robust control results are used in the stability analysis. The mathematical overview of the calculation of the describing function, together with the additivity property and the stability analysis, is given in Section 2. The theoretical results are applied to the single-axis control of the Phantom haptic interface robotic manipulator in Section 3. Section 4 concludes the paper.
2. Mathematical Overview
In this section, we give a brief overview of the additivity property of fuzzy systems and of the analytical calculation of the describing function. The stability analysis based on the describing function technique is also discussed here.
2.1. Additive Fuzzy Systems
We extend the additivity property of fuzzy systems introduced in Cuesta et al. [5] in order to reduce multi-input single-output fuzzy systems into a sum of single-input single-output ones. The closed form of the fuzzy logic system with center average defuzzifier, product-inference rule and singleton fuzzifier is:

f(x) = Σ_i Σ_j Σ_l Σ_k μ̄_ijlk(x)·y_ijlk    (1)

where the normalized rule firing strengths are

μ̄_ijlk(x) = μ_1i(x1)·μ_2j(x2)·μ_3l(x3)·μ_4k(x4) / Σ_i Σ_j Σ_l Σ_k μ_1i(x1)·μ_2j(x2)·μ_3l(x3)·μ_4k(x4)    (2)
We decompose the fuzzy system in Equation 1 using the additivity property. For simplicity, we develop the theory for n = 4, and the extension of the theory to higher degrees is straightforward. For the fuzzy system to be
additively decomposable, it should satisfy the following property [5]:
f(x) = f(x1, x2, ..., xn) = f(x1, 0, ..., 0) + f(0, x2, ..., 0) + ... + f(0, 0, ..., xn)    (3)
. . ,xn)
The assumptions on the membership functions for the system to be decomposable are given in 5. We use triangular membership functions that satisfy these assumptions: 2 - 4qi-1 7
54i-1
I xq < 5bqi
4qi-1
where 4-qi = -&i. These type of membership functions also satisfy the assumptions in the calculation of the describing function, which is introduced in subsection 2.2. When 4 1 a 5 x 1 < & + I , two consequent rules are fired for x1 with memberships pla(xl) and p l a + l ( q ) . The same applies for the other inputs. As a total, there are 24 = 16 rules fired. The corresponding fuzzy system output is:
f(\mathbf{x}) = \mu_{1a}\mu_{2b}\mu_{3c}\mu_{4d}\, y_{abcd} + \mu_{1a}\mu_{2b}\mu_{3c}\mu_{4,d+1}\, y_{abc,d+1} + \cdots + \mu_{1,a+1}\mu_{2,b+1}\mu_{3,c+1}\mu_{4d}\, y_{a+1,b+1,c+1,d} + \mu_{1,a+1}\mu_{2,b+1}\mu_{3,c+1}\mu_{4,d+1}\, y_{a+1,b+1,c+1,d+1}    (5)
The decomposed system should have four single-input single-output systems of the form:
f(x_1, 0, 0, 0) = f_1(x_1) = \mu_{1a}\, y_{afgh} + \mu_{1,a+1}\, y_{a+1,fgh}
f(0, x_2, 0, 0) = f_2(x_2) = \mu_{2b}\, y_{ebgh} + \mu_{2,b+1}\, y_{e,b+1,gh}
f(0, 0, x_3, 0) = f_3(x_3) = \mu_{3c}\, y_{efch} + \mu_{3,c+1}\, y_{ef,c+1,h}
f(0, 0, 0, x_4) = f_4(x_4) = \mu_{4d}\, y_{efgd} + \mu_{4,d+1}\, y_{efg,d+1}    (6)
We derive the conditions under which f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) = f(\mathbf{x}) is satisfied [6]; in particular, we should choose y_{afgh} + y_{ebgh} + y_{efch} + y_{efgd} = y_{abcd}, etc. All the constraints and the proofs can be found in [6].
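To make the constraint concrete, the following sketch (ours, not the paper's code) builds a two-input fuzzy system whose consequents are chosen additively, y_{ab} = g_1(a) + g_2(b), and checks the decomposition of Equation 3 numerically; the centers and consequent values are hypothetical.

```python
# Numerical check of the additive decomposition for n = 2 (hypothetical
# centers and consequents; product inference, center average defuzzifier).
import numpy as np

centers = np.array([-1.0, 0.0, 1.0])            # phi_{-1}, phi_0, phi_1

def tri_memberships(x, c):
    """Memberships for a triangular partition with centers c (sum to 1)."""
    mu = np.zeros(len(c))
    i = np.clip(np.searchsorted(c, x) - 1, 0, len(c) - 2)
    w = (x - c[i]) / (c[i + 1] - c[i])
    mu[i], mu[i + 1] = 1.0 - w, w
    return mu

# Consequents chosen additively: y[a, b] = g1[a] + g2[b], with g(0) = 0
g1 = np.array([-0.5, 0.0, 0.5])
g2 = np.array([-0.2, 0.0, 0.2])
Y = g1[:, None] + g2[None, :]

def f(x1, x2):
    return float(tri_memberships(x1, centers) @ Y @ tri_memberships(x2, centers))

x1, x2 = 0.3, -0.7
print(f(x1, x2), f(x1, 0.0) + f(0.0, x2))        # both print approx. 0.01
```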
2.2. Describing Function
After reducing the fuzzy system to a sum of single-input single-output fuzzy systems, we use the analytical calculation of the describing function of fuzzy systems introduced in [9]. We review the describing function for a fuzzy system without giving the proofs (the basic assumptions and proofs can be found in [9]). The membership functions are in the form of Equation
4, and the closed form of the system is given by Equation 1 for n = 1. The describing function of a single-input single-output fuzzy system is then given as:

N(A, \omega) = N(A) = \frac{b_1}{A} = \frac{4}{A\pi} \sum_{i=0}^{d} \Big\{ \frac{\Delta u_i}{2\,\Delta\phi_i}\, A \big[ (\delta_{i+1} - \sin\delta_{i+1}\cos\delta_{i+1}) - (\delta_i - \sin\delta_i\cos\delta_i) \big] + \frac{1}{\Delta\phi_i} (\phi_i u_{i+1} - \phi_{i+1} u_i)(\cos\delta_{i+1} - \cos\delta_i) \Big\}    (7)
where d satisfies \phi_d \le A < \phi_{d+1}, d > 0, and varies with A; \{\delta_i\} are defined to be the angles where the input sinusoid x = A\sin\delta intersects the centers \{\phi_i\} of the membership functions. For \{\delta_i\}, we have:
\delta_0 = 0, \qquad \delta_i = \sin^{-1}\Big(\frac{\phi_i}{A}\Big) \;\; (i = 1, \ldots, d,\; 0 < \delta_i < \pi/2), \qquad \delta_{d+1} = \pi/2    (8)
2.3. Stability Analysis
We use the describing function method for the stability analysis of fuzzy systems. In this method, the describing function of the fuzzy system is considered in cascade with a linear plant with transfer function G(s) = n(s)/d(s),
which has a low-pass property. The characteristic equation of the feedback system, with the fuzzy controller replaced by the describing function N(A) in cascade with the linear plant G(s), is C(s) = 1 + N(A)G(s); clearing the denominator d(s), we work with the characteristic polynomial

C(s) = d(s) + N(A)\, n(s)    (9)
The C in the above equation is an interval polynomial, since N(A) is real and interval-valued, depending on A. For the stability analysis of this interval polynomial, we use Kharitonov's theorem for real polynomials [2]. For our system in Equation 9, we need to check the Kharitonov polynomials for the characteristic equation C, where N(A) \in [N_{min}, N_{max}]. If these polynomials are found to be Hurwitz, then we conclude that our system is stable.
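The Kharitonov test itself is mechanical; the sketch below (ours, with Hurwitzness checked numerically through polynomial roots rather than a Routh table) builds the four corner polynomials of a generic interval polynomial and applies the test to the characteristic polynomial of Section 3. The interval bounds are the ones quoted in Section 3, and the exponent of B_m, which is illegible in the source, is assumed to be 10^-5.

```python
# Kharitonov test for an interval polynomial (sketch; Hurwitz property is
# checked numerically via roots instead of a Routh table).
import numpy as np

def kharitonov_polys(lo, hi):
    """lo, hi: coefficient bounds ordered from s^0 upward; returns the four
    Kharitonov corner polynomials in the same ordering."""
    pats = ["llhh", "hhll", "hllh", "lhhl"]          # low/high patterns, period 4
    return [[(lo if p[i % 4] == "l" else hi)[i] for i in range(len(lo))]
            for p in pats]

def is_hurwitz(coeffs):                              # coeffs from s^0 upward
    return bool(np.all(np.roots(coeffs[::-1]).real < 0))

# Example: C(s) = s^2 + ((B_m + N2)/M_m) s + (K_m + N1)/M_m, with the N1, N2
# ranges of Section 3 (B_m exponent assumed to be 1e-5; illegible in source)
Mm, Bm, Km = 2.02e-5, 6.46e-5, 0.0
a1, b1, a2, b2 = 0.0667e-3, 0.1449e-3, 0.2e-4, 0.2521e-4
lo = [(Km + a1) / Mm, (Bm + a2) / Mm, 1.0]
hi = [(Km + b1) / Mm, (Bm + b2) / Mm, 1.0]
print(all(is_hurwitz(k) for k in kharitonov_polys(lo, hi)))   # True -> stable
```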
3. Application Example
We apply the theoretical results to the control of the Phantom haptic interface robotic manipulator. The state equations of the system are of the following form [3]:
where K_m = 0, B_m = 6.46 \times and M_m = 2.02 \times 10^{-5}. We use our fuzzy system as a state feedback controller with describing functions N_1 and N_2 (one for each decomposed subsystem), so the controller takes the form u = -[N_1\; N_2]x. Here, N_1 \in [a_1, b_1] and N_2 \in [a_2, b_2]. The characteristic equation for the closed-loop system becomes

C(s) = s^2 + \frac{B_m + N_2}{M_m}\, s + \frac{K_m + N_1}{M_m}

which is an interval-valued polynomial. For this second-order polynomial to satisfy Kharitonov's theorem, the lower bounds of N_1 and N_2 should be higher than zero, i.e. a_1 > 0 and a_2 > 0. We choose 5 rules, and the controller parameters after application of the additivity property (f(x) = f_1(x_1) + f_2(x_2)) are chosen as follows. For f_1, the centers of the membership functions are \phi_{-2} = -\phi_2 = -3, \phi_{-1} = -\phi_1 = -1.5, \phi_0 = 0, and y_{-2} = -y_2 = -0.0005, y_{-1} = -y_1 = -0.0001, y_0 = 0. The range of N_1 for these parameters is [0.0667 \times 10^{-3}, 0.1449 \times 10^{-3}], which is positive. We assign the same membership functions for f_2, with y_{-2} = -y_2 = -0.00008, y_{-1} = -y_1 = -0.00003, y_0 = 0, and the range of N_2 is [0.2 \times 10^{-4}, 0.2521 \times 10^{-4}]. The stable system states that result from using this fuzzy system to control the robot manipulator are shown in Fig. 1a. When we change the assignment for f_2 such that y_{-2} = -y_2 = 0.0008, y_{-1} = -y_1 = 0.0003, y_0 = 0, the range of N_2 changes to [-0.2521 \times 10^{-4}, -0.2 \times 10^{-4}], becoming negative. The resulting unstable system states are given in Fig. 1b.

Figure 1. (a) Stable and (b) unstable system states.
4. Conclusion
We have presented our stable fuzzy controller design approach using the describing function technique. We have applied the theoretical results to the control of the Phantom haptic interface robotic manipulator, where the simulation results agreed with the theory.
References
1. J. Aracil and F. Gordillo. Describing function method for stability analysis of PD and PI fuzzy controllers. Fuzzy Sets and Systems, in press, 2003.
2. S. P. Bhattacharyya, H. Chapellat, and L. H. Keel. Robust Control: The Parametric Approach. Prentice Hall, 1995.
3. M. C. Cavusoglu and F. Tendick. Kalman filter analysis for quantitative comparison of sensory schemes in bilateral teleoperation systems. In Proc. IEEE Int'l Conference on Robotics and Automation, May 2003.
4. S. Y. Chen, F. M. Yu, and H. Y. Chung. Decoupled fuzzy controller design with single-input fuzzy logic. Fuzzy Sets and Systems, 129:335-342, 2002.
5. F. Cuesta, F. Gordillo, J. Aracil, and A. Ollero. Stability analysis of nonlinear multivariable Takagi-Sugeno fuzzy control systems. IEEE Transactions on Fuzzy Systems, 7(5):508-520, 1999.
6. E. Gürkan. Uncertainty Modelling and Stability Analysis for 2-Way Fuzzy Adaptive Systems. PhD thesis, Middle East Technical University, 2003.
7. H. Han and C. Y. Su. Robust fuzzy control of nonlinear systems using shape-adaptive radial basis functions. Fuzzy Sets and Systems, 125:23-38, 2002.
8. A. Kandel, Y. Luo, and Y. Q. Zhang. Stability analysis of fuzzy control systems. Computers and Structures, 105:33-48, 1999.
9. E. Kim, H. Lee, and M. Park. Limit-cycle prediction of a fuzzy control system based on describing function method. IEEE Transactions on Fuzzy Systems, 8(1):11-22, 2000.
10. L. Luoh. New stability analysis of T-S fuzzy system with robust approach. Mathematics and Computers in Simulation, 59:335-340, 2002.
11. C. W. Park. LMI-based robust stability analysis for fuzzy feedback linearization regulators with its applications. Information Sciences, 152:287-301, 2003.
12. M. Sugeno. On stability of fuzzy systems expressed by fuzzy rules with singleton consequents. IEEE Transactions on Fuzzy Systems, 7(2):201-224, April 1999.
13. Y. Tang, N. Zhang, and Y. Li. Stable fuzzy adaptive control for a class of nonlinear systems. Fuzzy Sets and Systems, 104:279-288, 1999.
14. S. C. Tong, T. Wang, and J. T. Tang. Fuzzy adaptive output tracking control of nonlinear systems. Fuzzy Sets and Systems, 111:169-182, 2000.
15. L. X. Wang. Adaptive Fuzzy Systems and Control: Design and Stability Analysis. Prentice Hall, 1994.
16. S. Y. Yi and M. J. Chung. Systematic design and stability analysis of a fuzzy logic controller. Fuzzy Sets and Systems, 72:271-298, 1995.
AN INTELLIGENT ROBOT OVERVIEW FOR MEDICAL IN VITRO FERTILIZATION
JIA LU
Computer Science and Information System, University of Phoenix, 5050 NW 125 Avenue, Coral Springs, Florida, FL 33076, USA. E-mail: cluiia@email.uophx.edu
YUNXIA HU
Health Science, Nova Southeastern University, [email protected]
This paper proposes an intelligent robot for In Vitro Fertilization (IVF) clinic research. Surgeon consistency is an important issue for this research. The end effector of the robot was driven through right, left, up, and down motions at the target location. The robot was attachable to the bedside using a conventional scope holder with multiple joints. The control system was expected to enhance the overall performance of the system during the egg retrieval procedure. The auto-tracking control algorithms were demonstrated and computed for the potential assistant robot system.
1. Introduction
In 1994 Green developed a telesurgery system (Green 1994). He showed the possibility of a teleoperation robot's performance reaching that of on-site open surgery. In 1995, Funda developed a system including a remote motion robot that holds a laparoscopic camera and the instruments. This robot has degrees of freedom and rotation components for the instrument-mounted joystick (Funda 1995). In 1996, IBM Research and Johns Hopkins Medical Center developed a ceiling-mounted surgical robot system for laparoscopic camera navigation (Taylor 1996). Casals proposed a camera control strategy to track the instruments during a surgical procedure based on computer vision analysis of the laparoscopic image (Casals 1996). To alleviate these problems, Cavusoglu developed a compact laparoscopic endoscope manipulator, mounted on the patient's port, using McKibben artificial muscles (Cavusoglu 1998). A robotic arm was designed to hold the telescope with the goal of improving safety and reducing the need for a skilled camera operator (Zhang 1999). In order to enhance the performance of IVF egg retrieval surgery and to overcome the disadvantages of conventional systems, in this paper we propose an intelligent robot.
2. An Intelligent Robot
For the hardware, the workspace of an assistant robot must be designed so as not to affect tissues during the egg retrieval procedure. An optimized workspace reduces the possible damage in case of system failure. To observe and analyze the motions over their ranges, it is important for us to first simulate the egg retrieval. For the software, the programming commands are also critical for an assistant robot's movement functionality. In order to achieve high safety and adaptability over the optimal range of motion, the system followed two steps (Lu and Hu 2004): A potential assistant robot was used to minimize interference during the procedure. The robot was used with a conventional laparoscope holder for its fixation to the bedside. The robot was mounted and positioned at various locations for the necessary views at different angles. The robot senses the shape of the eggs in the human environment. The motion reconstruction algorithm was based on the calculation of any surface in Euclidean 3-space R^3 (Lu and Hu 2004). Theorem 1: For the approximated tangent vector, A is a regular rotation on Euclidean 3-space R^3 and A(p + \Delta p), where \Delta p has a small value in Eq. (1).
where \Delta\theta indicates the location length of rotation A. Theorem 2: For a proper definition of area in R^3, the coordinate patch x is performed as x: M -> N. Let \Delta P be a small coordinate rectangle in M with sides \Delta p and \Delta v, such that x distorts \Delta P into a small rotation region x(\Delta P) in N for different locations of the parameter rotation. If the location from x(p, v) to x(p + \Delta p, v) is linearly approximated by the vector \Delta p, the location from x(p, v) is approximated for x(p + \Delta p, v). Therefore, the region x(\Delta P) is approximated by the parallelogram in the tangent plane at x(p, v). We find that the parallelogram in the tangent plane coincides with the parallelogram at different points on any location. If different points on any location are determined, the points can be derived by the parallelism theorem. If the local area composed of different points is very small, the points will lie on a certain location. The algorithm was proposed to derive the motion of the eggs in the environment by using a degree-of-freedom sensor. In Euclidean 3-space R^3, N sweeps were executed with a range sensor (N >= 2). In this case, all locations of the human body must pass through B (the body common location). If there were a large number of points on the egg retrieval path, these points can be expressed in C as the
motion position vector. The performance of the proposed algorithm can be evaluated through several simulations. The important step for the robotic system was the motion tracking control. This control can use the powerful mixed-integer quadratic programming algorithm. The optimality criterion for the robotic system can use the following Eq. (2) (Yin 2003):

I = [v^T(t)\; v^T(t+1)\; \ldots\; v^T(t+N-1)]    (2)
where I is the index function for the optimal solution over an N-step horizon optimization. A larger N yields better control performance but increases the computation burden exponentially, so N should be restricted to a small value. The desired operation of the egg retrieval procedure may not be reasonably accomplished by only one round of N-step optimization; we therefore consider receding the horizon and repeating the optimization. The process continues until the desired manipulation is achieved. The time sequence of continuous and discrete logical solutions corresponds to the optimal motion. The desired model recedes the horizon to ease computation when computation times are large. The following approach (Yin 2003) is applicable to the control of the fast-varying dynamics of the robot system. The main controller ran on a PC in order to perform the egg retrieval. The board acquired the egg data from a 3-CCD camera on the tip of the robotic section. To control the positions of the motors, a PID controller was used; the control algorithms analyzed the images obtained from the retrieval board. The operating hands of the egg retrieval surgeon were occupied by surgical tools, so PC commands such as 'right', 'left', 'down', and 'up' were used for controlling the surgical robot. The locations of surgeons varied depending on the surgery type. The patient was at a fixed position, and the camera operator changed to different views during the procedure. Surgeons in actual surgery required an assistant robot capable of visualizing the entire patient environment during the procedure. We determined the ranges of motion to be from -45 to 45 degrees for each direction.
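The paper gives no controller gains; the sketch below is a generic PID position loop of the kind described, with a crude double-integrator motor model and hypothetical gains.

```python
# Generic PID position loop (hypothetical gains and plant model; the paper
# does not publish its controller parameters).
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, 0.0

    def step(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

dt = 0.01
pid, pos, vel = PID(kp=8.0, ki=0.5, kd=4.0, dt=dt), 0.0, 0.0
for _ in range(500):                 # 5 s of simulated time
    u = pid.step(45.0, pos)          # drive the joint to +45 degrees
    vel += u * dt                    # double-integrator motor model
    pos += vel * dt
print(round(pos, 2))                 # settles near 45
```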
3. Simulations
The result of the motion estimation process was capable of computing the motion among these positions. The motion compensation of the eggs maintained a stabilized retrieval sequence. According to the proposed algorithms, motion estimation and correction were performed for the egg retrieval sequence. For the real image test, we acquired image frames for implementation in the system. The image frame sequence had a size of 340 x 346 pixels and contained 45 frames. We chose the rotated region of the frame spanning pixels 200-300 x 200-300 as the active block for estimation. The eggs remained in the fallopian tube for a while, and the robot was propelled toward the uterus by its arm through different motions at the locations. The auto-tracking algorithm was used for the simulation. We used the desired rotated location from the robot to obtain the optimal trajectory for egg retrieval. We showed that simulating the manipulator kinematics graphically was more efficient.

4. Conclusions
Trajectory search planning is often based on kinematic analysis only, and the actual dynamics of the robot are usually ignored. The obstacle avoidance strategy for the end-effector and the manipulator's joints is the key for an assistant robot. Joint space can be considered for the manipulator dynamics based on the inputs and the output of the trajectory planning algorithm, in terms of a time sequence of the values attained by position, velocity, and acceleration. Therefore, we think this approach may be effective if deviations from the linearization point are small, or alternatively if different linearized models are used as the assistant robot moves along its trajectory.
References
1. P. S. Green, Telepresence surgery demonstration system. Proc. of the Conf. on Robotics and Automation, pp. 2302-2307 (1994).
2. J. Funda, A telerobotic assistant for laparoscopic surgery. IEEE EMBS Magazine, Special Issue on Robotics in Surgery, pp. 279-291 (1995).
3. R. H. Taylor, An overview of computer-integrated surgery at the IBM Watson Research Center. IBM, Vol. 40, No. 2, pp. 163-183 (1996).
4. J. A. Casals, Automatic guidance of an assistant robot in laparoscopic surgery. Proc. of the Conf. on Robotics and Automation, pp. 895-900 (1996).
5. M. C. Cavusoglu, Laparoscopic telesurgical workstation. Proc. of the SPIE International Symp. on Biological Optics, pp. 296-303 (1998).
6. J. M. Zhang, A Flexible New Technique for Camera Application in Internet. Technical Report MSR-TR (1999).
7. Y. J. Yin, On Novel Hybrid System Control Framework of Intelligent Robots. ICSSSE, Hong Kong, pp. 535-540 (2003).
8. J. Lu and Y. X. Hu, A potential assistant robot for IVF retrieval. Proc. IEEE SoutheastCon, NC, USA (2004).
SELF-ADAPTATION OF THE SYMBOLIC WORLD MODEL OF A MOBILE ROBOT. AN EVOLUTION-BASED APPROACH* CIPRIANO GALINDO JUAN-ANTONIO FERNANDEZ-MADRIGAL JAVIER GONZALEZ System Engineering and Automation Department University of Malaga, Spain
A world model adapted to changes in the robot’s environment as well as to changes in its operations permits a situated agent, i.e. a mobile robot, to properly plan and execute its tasks within its workspace. Such adaptation of the symbolic world model involves two problems: maintaining the world model coherent with the environment information, and optimizing the model to perform efficiently the robot tasks. This paper presents a first step towards the self-adaptation of the symbolic world model of a mobile robot with respect to both aspects. Then, the paper focuses on the optimization subproblem, proposing an evolution-based approach to keep the symbolic model of the robot optimized over time with respect to the current agent’s operational needs (tasks).
1. Introduction
A situated agent must account for a certain model of its workspace in order to perform deliberative actions, i.e., planning tasks. In particular, mobile robots that intend to perform autonomously usually work within highly dynamic and large-scale environments, possibly performing tasks that also vary over time. The success of the agent in these situations largely depends on maintaining its internal model updated and coherent with the real environment. We call this self-adaptation of the symbolic model of the agent, and distinguish two parts in that problem: firstly, achieving coherence between the model and the real environment just in order to plan and execute tasks correctly (the coherence subproblem); secondly, tuning that model in order to improve the efficiency of task planning within the particular environment and operational needs of the robot (the optimization subproblem). The coherence subproblem is in fact the problem of anchoring, profusely addressed in the recent literature ([1],[2]). Anchoring deals with connecting symbols of the world model of an agent with its sensory information. The solution of the anchoring process usually involves checking for new anchors, that is, relations between symbols and sensory information, and maintaining previously
This work was supported by the Spanish Government under research project DPI2002-01319. E-mails: {cipriano, jafma, jgonzalez}@ctima.uma.es
found anchors coherent with the new information gathered by the robot sensors. In this paper, the problem of anchoring is not addressed, although a general scheme is proposed where future work on that issue can fit. Concerning the optimization subproblem, we have found less work in the literature ([5],[9]). Elsewhere, we have proposed an approach based on the psychological idea that the human brain arranges environmental information in a way that improves his/her efficiency in performing tasks [4]. That approach, called the task-driven paradigm, is intended to improve over time the performance of the set of operations carried out by the robot. To our knowledge, considering both subproblems within a single framework has not been addressed. However, we believe that a situated agent must face both of them in order to obtain the greatest benefit in its operations. In this paper, we propose a general system called ELVIRA where both subproblems can be integrated. ELVIRA has been tested in simulation within large-scale environments modelled through graph structures and under variations in the tasks that the agent must carry out. It has yielded valuable results concerning the improvement of self-adaptation of the symbolic model of the robot. The optimization subproblem is approached by ELVIRA through an evolution-based method. In general, evolutionary approaches, like genetic algorithms, may not seem appropriate for facing dynamic systems. Nevertheless, in some particular situations, they can be adapted to any-time schemes, since they can provide approximate solutions that improve over time ([5],[9]). Our experiments clearly show that the evolutionary optimizer of ELVIRA achieves a high degree of adaptation of the symbolic model of the robot with respect to its tasks. Next, the symbolic world model used in this work is introduced. Section 3 gives a description of ELVIRA. In section 4 some discussion and experimental results are presented. Finally, conclusions and future work are outlined.

2. The Hierarchical World Model
Within the ELVIRA system, the symbolic model of the environment plays a central role. Assuming that our agent works in large-scale space, the model should be suitably arranged for coping with a potentially huge amount of information. As stated in the literature, humans widely use a mechanism called abstraction ([7],[8]) to model their environments. Abstraction works by establishing different levels of detail in the symbolic data, grouping symbols into more general ones and considering them as new symbols that can be abstracted again. The result is a hierarchy of abstraction that ends when all information is modelled by a single universal symbol. In order to manage abstractions we use a mathematical model called AH-graph [4]. An AH-graph is a graph representation that includes hierarchical information
organized in different layers (see fig. 1). These layers are called hierarchical levels, and are isolated from one another; they consist of flat graphs whose nodes represent elements of the environment, and whose arcs represent relations between them. The lowest hierarchical level is called the ground level and it represents the environment with the maximum amount of detail that is available. The highest hierarchical level is called the universal level. In our particular approach, the AH-graph represents the topology of space. Thus, nodes represent distinctive places for robot navigation and localization while arcs indicate the possibility of navigating between these places. The AH-graph model has demonstrated a significant computational improvement in route planning [3] and robot task-planning [6]. More details about the AH-graph model can be found in [4].
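A minimal sketch of such a structure is given below. The class names and API are ours (hypothetical); the actual AH-graph model of [4] also maintains explicit inter-level abstraction links and annotations.

```python
# Minimal AH-graph-like structure: flat graphs stacked into hierarchical
# levels, with nodes of one level clustered into supernodes of the next.
class FlatGraph:
    def __init__(self):
        self.nodes, self.arcs = set(), {}            # arcs: node -> set(nodes)

    def add_arc(self, a, b):
        self.nodes |= {a, b}
        self.arcs.setdefault(a, set()).add(b)
        self.arcs.setdefault(b, set()).add(a)

class AHGraph:
    def __init__(self, ground):
        self.levels = [ground]                       # levels[0] = ground level
        self.parent = {}                             # (level, node) -> supernode

    def abstract(self, clusters):
        """Build the next level by grouping current top-level nodes.
        clusters: supernode -> iterable of member nodes (must cover all)."""
        lvl, up = len(self.levels) - 1, FlatGraph()
        for sup, members in clusters.items():
            up.nodes.add(sup)
            for m in members:
                self.parent[(lvl, m)] = sup
        g = self.levels[lvl]
        for a in g.nodes:
            for b in g.arcs.get(a, ()):
                pa, pb = self.parent[(lvl, a)], self.parent[(lvl, b)]
                if pa != pb:
                    up.add_arc(pa, pb)               # arc between distinct clusters
        self.levels.append(up)
        return up
```

Abstracting repeatedly until a single node remains would yield the universal level.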
Fig. 1. An example of an AH-graph. (a) A schematic map of a real environment; distinctive places for robot navigation are marked with small rhombuses. (b) Ground-level topology of distinctive places. (c) Upper levels of the hierarchy.
3. ELVIRA
We have developed a system, called ELVIRA, for producing plans while adapting the symbolic world model of an agent to changes in both the world and the tasks that the agent must face. As shown in fig. 2, ELVIRA has two inputs: the task to be performed by the robot and the environment information gathered by its sensors. It outputs a plan that solves the currently requested task and an adapted hierarchical model of the current world. ELVIRA comprises four processes. Three of them (anchoring, planner, and optimizer) are periodically executed, possibly with time-variant periods. This sampling implies that the hierarchical world model may be temporarily inconsistent with the environment. However, in this work we do not deal with such inconsistency, since we assume that the environment remains unaltered while the robot plans and executes tasks. In the following, the components of ELVIRA are described in detail.
-Anchoring Keeper. It gathers world information captured by the robot sensors and updates the current world model (only the ground level). The anchoring
relations must be continuously updated during robot operation in such a way that each symbol unequivocally represents a world element. Such an anchoring process is out of the scope of this paper.
-Task Planner. It uses the current hierarchical world model to efficiently plan tasks [6]. From the number of times that each task has been requested up to the moment, this process maintains an estimate of the arrival probabilities of the tasks. These probabilities are provided to the Hierarchical Optimizer to guide the optimization search, as explained below.
-Hierarchical Optimizer. It is based on a genetic algorithm that improves the current model of the robot's environment to reduce the computational cost of planning tasks. The genetic population encodes the minimum number of parameters needed to generate a set of potential models (AH-graphs) upon the ground level maintained by the anchoring process. Genetic individuals are evaluated using a cost function that weights the cost involved in planning the tasks by the estimated probabilities given by the task planner; that is, we give more weight to the more frequent tasks. Thus, the symbolic world model adapts better to frequently requested tasks, to the detriment of other, infrequent ones.
-Hierarchy Constructor. From the information encoded in the best individual given by the optimizer and the current ground level information, this process constructs the current best hierarchical world model for the robot. Such construction is based on a graph clustering algorithm (see [4] for more detail).
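A sketch of such a frequency-weighted cost is shown below; the function and variable names are hypothetical, and plan_cost stands for the number of arcs explored when planning task t on the hierarchy encoded by the individual.

```python
# Frequency-weighted evaluation of a genetic individual (hypothetical names).
def fitness(individual, tasks, request_counts, plan_cost):
    total = sum(request_counts.values())
    weighted_cost = sum(
        (request_counts.get(t, 0) / total) *      # estimated task probability
        plan_cost(individual, t)                  # planning cost on this model
        for t in tasks)
    return -weighted_cost                         # lower weighted cost = fitter
```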
Fig. 2. The ELVIRA system. ELVIRA is fed with the information gathered by the robot sensors and the requested tasks. It yields the best known hierarchical world model, adapted to the agent tasks and its environment, and the resulting plan for the requested tasks.
4. Discussion of Experimental Results
In order to evaluate the suitability of ELVIRA for adapting the world model of a robot, we have conducted a variety of tests in large-scale environments and with different numbers of tasks to plan. This section presents an illustrative example of
planning five different navigation tasks within an environment of more than 500 distinctive places modeled by an AH-graph. A navigation task consists of planning a path between two distinctive places represented by nodes of the AH-graph. In this scenario, while ELVIRA is adapting the AH-graph through the genetic Hierarchical Optimizer, the robot is requested to plan a new navigation task (out of the five possible) every ten cycles of the optimizer (see fig. 3a). Fig. 3 shows the self-adaptation achieved by ELVIRA in the symbolic model of the world for this experiment. Here, the cost of planning a task is calculated by weighting the number of explored arcs of the AH-graph needed to find the shortest path by the frequency of the task. By doing so, we provide a measure that accounts for both the cost of task planning and the probability of dealing with a particular task (a very costly but infrequent task should not influence the model to a great extent). Fig. 3a shows the evolution of the planning cost along the optimization steps from the point of view of the robot (which may not know all possible tasks). On the other hand, Fig. 3b measures the same model adaptation using the ground-truth frequency to calculate the cost of task planning. Finally, fig. 3c indicates the evolution of the task frequencies calculated by the robot, which tend to the ground-truth frequencies as time tends to infinity.
Fig. 3. a) Evaluation of the symbolic model adapted by ELVIRA. Sharp increments are due to the arrival of new tasks not previously known by the agent; arrivals of tasks are denoted by a letter (from A to E). b) Evaluation of the adapted model considering the ground-truth frequency of tasks (A: 30%, B: 20%, C: 20%, D: 20%, and E: 10%). c) Quadratic error between the frequency estimated by the robot and the real frequency, over time.
In fig. 3a two stages of the self-adaptation process can be distinguished. At first, when the agent has no knowledge of all possible tasks, the fitness of its symbolic model oscillates substantially. As new tasks are requested, the agent
increases its knowledge of their frequency, so oscillations diminish until a steady state is reached, approximately at the 180th step of the optimizer cycle. Notice how dealing with a new task not previously considered by the robot may sharply increase the cost of planning, since the robot world model was not tested against the new task. This occurs frequently in the transitory state in fig. 3a (before step 200), as shown by the abrupt rises of the planning cost. However, observe that immediately after a new task arrives, the optimizer starts to improve the planning cost, which reflects the self-adaptation to the new situation.

5. Conclusions and Future Work
This paper has proposed a system to adapt the symbolic world model of a situated agent to changes in its environment and operations. Our work focuses on the evolutionary optimization of the world model of the agent in order to improve efficiency in planning tasks, which has exhibited promising results. In the future we plan to apply ELVIRA to a real mobile robot, which requires going into the anchoring process. We also intend to extend the symbolic world model of the robot to a multi-hierarchical arrangement to further improve the model adaptation (as shown in [3]).

References
1. Bonarini A., Matteucci M., Restelli M. Concepts for Anchoring in Robotics. AI*IA: Advances in Artificial Intelligence. Springer-Verlag, 2001.
2. Coradeschi S. and Saffiotti A. An Introduction to the Anchoring Problem. Robotics and Autonomous Systems, Vol. 43, No. 2-3, 2003.
3. Fernandez J.A. and Gonzalez J. Multihierarchical Graph Search. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, 2002.
4. Fernandez J.A. and Gonzalez J. Multi-Hierarchical Representation of Large-Scale Space. Kluwer Academic Publishers, 2001.
5. Floreano D. and Mondada F. Evolution of Homing Navigation in a Real Mobile Robot. IEEE Trans. on Systems, Man, and Cybernetics, Vol. 26, 1996.
6. Galindo C., Fernandez J.A., and Gonzalez J. Hierarchical Task Planning through World Abstraction. To appear in IEEE TRA.
7. Hirtle S.C. and Jonides J. Evidence of Hierarchies in Cognitive Maps. Memory and Cognition, Vol. 13, No. 3, 1985.
8. Kuipers B.J. The Spatial Semantic Hierarchy. AI, Vol. 119, 2000.
9. Nordin P., Banzhaf W., and Brameier M. Evolution of a World Model for a Miniature Robot using Genetic Programming. Robotics and Autonomous Systems, Vol. 25, 1998.
MULTIPLE OBJECTIVE GENETIC ALGORITHMS FOR AUTONOMOUS MOBILE ROBOT PATH PLANNING OPTIMIZATION
OSCAR CASTILLO, LEONARDO TRUJILLO, and PATRICIA MELIN
Dept. of Computer Science, Tijuana Institute of Technology, Tijuana, B.C., Mexico
[email protected]
This paper describes the use of a Genetic Algorithm (GA) for the problem of offline point-to-point autonomous mobile robot path planning. The problem consists of generating "valid" paths or trajectories for the robot to use to move from a starting position to a destination across a flat map of a terrain, represented by a two-dimensional grid, with obstacles and dangerous ground that the robot must evade. This means that the GA optimizes possible paths based on two criteria: length and difficulty. First, we decided to use a conventional GA to evaluate its ability to solve this problem (using only one criterion for optimization); then, because we want to optimize paths under the two criteria or objectives, we extended the conventional GA to implement the ideas of Pareto optimality, making it a Multiple Objective Genetic Algorithm (MOGA). We present useful performance measures and simulation results of the conventional GA and of the MOGA that show that both genetic algorithms are effective tools for solving the point-to-point robot path planning problem.
1. Introduction
The problem of mobile robot path planning is one that has intrigued researchers and has received much attention in robotics, since it is at the essence of what a mobile robot needs to be considered truly "autonomous". A mobile robot must be able to generate collision-free paths to move from one location to another, and in order to truly show a level of intelligence these paths must be optimized under some criteria that are important to the robot and the given terrain. Genetic algorithms and evolutionary methods have been used extensively to solve the path planning problem, such as in [1], [2], [3], and [4], but this paper uses as its basis for comparison and development the work done by Sugihara [5]. In that work, a grid representation of the terrain is used, and different values are assigned to the cells in the grid to represent different levels of difficulty for the robot to traverse a particular cell; it also presents a codification of monotone paths for the solution of the path planning problem. The conventions used in [5] are also used in this paper, but that is where the similarities end. The next sections show a comparison of the two methods and how they differ, but first we present a simplified version of the path-planning problem. This version only uses a binary representation of a terrain, intended to represent solid obstacles and clear cells for the robot, in which paths are only optimized under the criterion of length.
2. Genetic Algorithms
A Genetic Algorithm is an evolutionary optimization method used to solve, in theory, "any" possible optimization problem. A GA [11] is based on the idea that a solution to a particular optimization problem can be viewed as an individual and that this individual's characteristics can be coded into a finite set of parameters. These parameters are the genes or the genetic information that makes up the chromosome that represents the real-world structure of the individual, which in this case is a solution to a particular optimization problem. Because the GA is an evolutionary method, a repetitive loop or a series of generations is used in order to evolve a population S of p individuals to find the fittest individual to solve a particular problem. The fitness of each individual is determined by a given fitness function that evaluates the level of aptitude that a particular individual has to solve the given optimization problem. Each generation in the genetic search process produces a new set of individuals using genetic operators: crossover and mutation, operations that are governed by the crossover rate γ and the mutation rate μ respectively. These operators produce new child chromosomes with the intention of improving the overall fitness of the population while maintaining a global search space. Individuals are selected for genetic operations using a selection method that is intended to select the fittest individuals for the role of parent chromosomes in the crossover and mutation operations. Finally, these newly generated child chromosomes are reinserted into the population using a replacement method. This process is repeated for k generations. The Simple or Conventional GA [11] is known to have the following set of common characteristics:
- A constant number of p individuals in the genetic search population.
- A constant-length binary string representation for the chromosome.
- One or two point crossover operator and single bit mutation operator, with constant values for μ and γ.
- Roulette Wheel (SSR) selection method.
- Complete or generational replacement method, or generational combined with an elitist strategy.
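A compact sketch of this conventional GA loop (roulette-wheel selection, one-point crossover, single-bit mutation, generational replacement) follows; the fitness function and parameter values are illustrative, not the paper's.

```python
# Conventional GA sketch: binary chromosomes, roulette-wheel selection,
# one-point crossover, single-bit mutation, generational replacement.
import random

def roulette(pop, fit):
    r, acc = random.uniform(0, sum(fit)), 0.0
    for ind, f in zip(pop, fit):
        acc += f
        if acc >= r:
            return ind
    return pop[-1]

def ga(fitness, length, p=100, gens=500, gamma=0.9, mu=0.01):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(p)]
    for _ in range(gens):
        fit = [fitness(ind) for ind in pop]
        new = []
        while len(new) < p:
            a, b = roulette(pop, fit), roulette(pop, fit)
            if random.random() < gamma:              # one-point crossover
                cut = random.randrange(1, length)
                a = a[:cut] + b[cut:]
            new.append([1 - g if random.random() < mu else g for g in a])
        pop = new                                    # generational replacement
    return max(pop, key=fitness)

print(ga(sum, length=20))                            # maximize number of ones
```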
2.1 Multiple Objective Genetic Algorithms
Real-world problem solving will commonly involve [12] the optimization of two or more objectives at once; a consequence of this is that it is not always possible to reach an optimal solution with respect to all of the objectives evaluated individually. Historically, a common method used to solve multi-objective problems is a linear combination of the objectives, in this way creating a single objective function to optimize [7], or converting the objectives into restrictions imposed on the optimization problem. In regards to
evolutionary computation, [13] proposed the first implementation of a multi-objective evolutionary search. The methods proposed in [9] and [10] all center around the concept of Pareto optimality and the Pareto optimal set; using these concepts of optimality for individuals evaluated under a multi-objective problem, they each propose a fitness assignment to each individual in the current population during an evolutionary search, based upon the Pareto concepts of dominance and non-dominance, where the definition of dominance is stated as follows:

Definition 2.1: For an optimization problem with n objectives, solution u is said to be dominated by solution v iff:

\forall i = 1, 2, \ldots, n: \; f_i(u) \ge f_i(v),    (1)

and

\exists j \in \{1, 2, \ldots, n\}: \; f_j(u) > f_j(v).    (2)
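Definition 2.1 translates directly into code; the sketch below (minimization assumed, as in the reconstruction above) filters a set of objective vectors down to its Pareto optimal set.

```python
# Pareto dominance (Definition 2.1, minimization) and the Pareto optimal set.
def dominates(v, u):
    return all(fv <= fu for fv, fu in zip(v, u)) and \
           any(fv < fu for fv, fu in zip(v, u))

def pareto_front(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

pts = [(3, 5), (4, 4), (2, 7), (5, 3), (4, 6)]       # (length, difficulty)
print(pareto_front(pts))                              # only (4, 6) is dominated
```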
2.2 Triggered Hypermutation
In order to improve the performance of a GA, there are several techniques available, such as expanding the memory of the GA in order to create a repertoire to respond to unexpected changes in the environment [11]. Another technique, used to improve the overall speed of convergence of a GA, is the use of a Triggered Hypermutation Mechanism [8], which consists of using mutation as a control parameter in order to improve performance in a dynamic environment. The GA is modified by adding a mechanism by which the value of μ is changed as a result of a dip in the fitness produced by the best solution in each generation of the genetic search. In this way, μ is increased to a high hypermutation value each time the top fitness value of the population at generation k dips below some lower limit set beforehand; this causes the search space to be explored at a higher rate thanks to the higher mutation rate. Conversely, μ is set back to a more conventional lower value once the search is closing in on an appropriate optimal solution.
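The triggering rule itself is a one-line test; a sketch with illustrative threshold and rates (not the paper's values) is given below.

```python
# Triggered hypermutation: raise the mutation rate when the best fitness dips.
MU_BASE, MU_HYPER = 0.01, 0.25          # illustrative rates
TRIGGER = 0.9                           # dip threshold (fraction of best so far)

def next_mutation_rate(best_now, best_so_far):
    if best_now < TRIGGER * best_so_far:
        return MU_HYPER                 # widen the search after a fitness dip
    return MU_BASE                      # otherwise keep the conventional rate
```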
3. Conventional GA with One Optimization Criterion
The GA used to solve the path planning problem in a binary representation of the terrain is the Simple or Conventional GA, as it is known in the evolutionary computing area (one point crossover, Roulette Wheel selection, one bit binary mutation, complete replacement). Due to experimental results, some modifications were made; the one important modification made to the algorithm was the inclusion of a Triggered Hypermutation Mechanism [8], which drastically improves the overall performance of the GA. Table 1 synthesizes the simulation results of the best configuration of the GA with
terrains represented by n×n grids with n = 16, 24, 32, each with 100 runs, 500 generations per search, and randomly generated maps.
Table 1. Simulation results for the Conventional GA

Replacement    Population   n    Mutation        % Ideal Fitness   Valid solutions
Elitism        100          16   Hypermutation   98%               98%
Generational   100          24   Hypermutation   98.3%             96%
Generational   100          32   Hypermutation   97.2%             88%
Note: The Ideal Fitness column expresses the percentage of the ideal solution value for a grid configuration (the value of a map with zero obstacles) which a particular best solution of a genetic search reaches.

4. MOGA with Two Optimization Criteria
The complete solution we want for the path planning problem includes a terrain with not only free spaces and solid obstacles, but also difficult terrain that a robot should avoid when possible, making it a multiple objective optimization problem [9], [10]. A direct comparison between the MOGA used here and the GA proposed by [5] is made in Table 2.

Table 2. Sugihara and MOGA methods

                     Sugihara                      MOGA
Paths                Monotone                      Monotone
Aptitude             Linear combination            Pareto Optimality
Repair Mechanism     Out of bounds                 Out of bounds, collisions and incomplete paths
Genetic Operators    One point Crossover and       One point Crossover and
                     single bit Binary Mutation    single bit Binary Mutation
Selection Method     Roulette Wheel                Roulette Wheel with Tournament
Replacement Method   Generational                  Generational, and Elitist strategy
Termination          Max. Generations              Max. Generations
Using the benchmark test of Figure 1 presented in [5], [6] and [7], and the performance measure of probability of optimality L_opt(k), we compare the two methods and show the results in Table 3.

Table 3. Simulation results for MOGA

                                      Sugihara   MOGA Generational   MOGA Elitism
Population                            30         60                  200
No. of Generations k                  1000       500                 150
Mutation Rate μ                       0.04       0.05                0.09
Crossover Rate γ                      0.8        0.8                 0.9
Win Probability w                     0.95       Not Applicable      Not Applicable
Probability of Optimality L_opt(k)    45%        44%                 76%
Figure 1. Benchmark test.

5. Conclusions
We describe in this paper the application of a MOGA approach for autonomous robot path planning. We have compared our simulation results with other
existing evolutionary approaches to verify the performance of our method and the implementation. Simulation results show the feasibility of using genetic algorithms for optimizing the paths that can be used by autonomous mobile robots.

Acknowledgments
We would like to thank the Research Grant Committee of COSNET for the financial support given to this research project (under grant 424.03-P). We would also like to thank CONACYT for the scholarship given to the student who participated in this research (Leonardo Trujillo).

References
1. Xiao, Michalewicz, "An Evolutionary Computation Approach to Robot Planning and Navigation", Soft Computing in Mechatronics, pp. 117-128, 2000.
2. Ali, Babu, Varghese, "Offline Path Planning of Cooperative Manipulators Using Co-Evolutionary Genetic Algorithm", 2002.
3. Farritor, Dubowsky, "A Genetic Planning Method and its Application to Planetary Exploration", pp. 1-3, 2002.
4. Sauter, Mathews, "Evolving Adaptive Pheromone Path Planning Mechanisms", First International Conference on Autonomous Agents and Multi-Agent Systems, pp. 1-2, 2002.
5. Sugihara, Kazuo, "Genetic Algorithms for Adaptive Planning of Path and Trajectory of a Mobile Robot in 2D Terrains", IEICE Trans. Inf. & Syst., Vol. E82-D, pp. 309-313, 1999.
6. Sugihara, Kazuo, "A Case Study on Tuning of Genetic Algorithms by Using Performance Evaluation Based on Experimental Design", 1997.
7. Sugihara, Kazuo, "Measures for performance evaluation of genetic algorithms", Proc. 3rd Joint Conference on Information Sciences, Research Triangle Park, NC, Vol. I, pp. 172-175, 1997.
8. Cobb, "An Investigation into the Use of Hypermutation as an Adaptive Operator in Genetic Algorithms Having Continuous, Time-Dependent Nonstationary Environments", 1990.
9. Fonseca, Fleming, "Genetic algorithms for multiobjective optimization: formulation, discussion and generalization", 5th Int. Conf. on Genetic Algorithms, pp. 416-423, 1993.
10. Srinivas, Deb, "Multiobjective optimization using non-dominated sorting in genetic algorithms", Evolutionary Computation, pp. 221-248, 1994.
11. Castillo, Oscar and Melin, Patricia, Soft Computing for Control of Non-Linear Dynamical Systems, Springer-Verlag, Heidelberg, Germany, 2001.
12. Castillo, Oscar and Melin, Patricia, Soft Computing and Fractal Theory for Intelligent Manufacturing, Springer-Verlag, Heidelberg, Germany, 2003.
FINE TUNING FOR AUTONOMOUS VEHICLE STEERING FUZZY CONTROL
J. EUGENIO NARANJO, CARLOS GONZALEZ, RICARDO GARCÍA, TERESA DE PEDRO, JAVIER REVUELTO
Instituto de Automática Industrial. Consejo Superior de Investigaciones Científicas. Ctra. Campo Real Km. 0,200. 28500 La Poveda. Arganda del Rey. Madrid, Spain. Phone: +34 918711900. Fax: +34 918717050.
The application of fuzzy logic based control is a successfully proven way of modeling human behavior. These techniques are very useful, particularly in the automated vehicle field, where the elements to be controlled are mostly very complex and cannot be described by a linear model. This paper presents a control system for vehicle steering. We have developed a set of controllers, which have been fine-tuned according to human experience and a study of human driving. The comprehension and interaction of the input control variables are very important factors during control adjustment. These controllers have been used and tested on human-like automatic driving routes, producing a robust and repetitive system.
1. Introduction
Our work addresses the intelligent transportation systems field, one of whose research topics is automatic driving systems. Our research work focuses on this topic. We design and implement control systems to automate vehicle actuators, that is, steering wheel, throttle and brake pedal, and test these actuators on real vehicles and real roads in route tracking experiments [1]. Techniques based on classical control [2] as well as AI-based control, like neural networks [3] or fuzzy logic [4], are used for car steering wheel control. In this case, we use fuzzy logic-based controllers, which were demonstrated by M. Sugeno to be very well suited for mobile robot and industrial automation applications [5] in the early 1990s. Another reason for choosing fuzzy logic is that the steering of a car does not have a clearly linear mathematical model. Next, we describe part of the work done within the Autopia program [6], a set of intelligent transportation systems research projects in which we develop our work.
2. Building the Fuzzy Controller
The basic application of an automatic steering control is to keep a vehicle in the right lane of a road, without sharp unnecessary turns and overactuations that could cause a hazard. In our case, we have built a private driving zone, laid out to resemble a built-up area, where we run the experiments. Each street of the
circuit has been modeled using high precision RTK-DGPS positions. We use these data to represent the streets as a series of straight segments. Two mass-produced Citroën Berlingo vans have been automated and equipped with another GPS receiver and an onboard computer that houses the control system. Map matching techniques are used to get the car to automatically travel along defined routes, from which the control system calculates the necessary input information. Two variables have been defined to control the steering: lateral and angular errors. Lateral error is the distance from the reference line to the car, and angular error represents the deviation of the car direction vector from the same reference line. Human drivers also use these inputs, and they can be modeled as fuzzy variables in our automatic system. Another important consideration is that driving along a straight road is a far cry from driving along a road with bends. Steering wheel movements should be slow and very slight along straight stretches of road, where we should avoid sharp turns that could make us leave the lane. When driving around bends, however, the movements of the steering wheel should be wide and fast; but if we turn the steering wheel as far as we can very quickly on a bend, we are likely to lose control of the vehicle and have an accident. This behavior also has to be considered at control system design time.
2.1. Defining the Steering Fuzzy Controller
The elements that make up a fuzzy controller are inputs, outputs and rules. Two controllers have been defined, one for straight road driving and the other for bend tracking, which are selected alternately depending on the features of the road and the status of the vehicle.

2.1.1. Input
The fuzzy control input is composed of the system variables. The fuzzification method transforms their crisp values into granulated fuzzy values. For this purpose, we define membership functions for the two input variables, as shown in Figure 1. Note that there are two fuzzy sets for each variable, corresponding to the straight and bend controls. Although the names of the labels for each variable are the same, their meaning is not. Right and Left are the linguistic labels for each membership function. They indicate whether the car is pointing to the left or to the right of the reference line. This information is used to take the appropriate control actions. Figure 1 also shows that the slopes of the straight membership functions (b, d) are steeper than the bend slopes (a, c). This definition represents human behavior in the two driving modes, which is very fast and reactive in straight lane driving and slower when driving around bends.
Figure 1. Input membership functions. a) Angular error in bend tracking. b) Angular error in straight tracking. c) Lateral error in bend tracking. d) Lateral error in straight tracking.
2.1.2. Rules
Only four rules are needed for the lateral control of a vehicle. We have chosen Mamdani [7] operators, and the inference rules are built like Takagi-Sugeno [8] rules, forming very intuitive but powerful sentences. The fuzzy variables are represented in bold and the labels for these variables in italics.
IF Angular-Error Left THEN Steering Right
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Left THEN Steering Right
IF Lateral-Error Right THEN Steering Left

2.1.3. Output
The output value is generated using Sugeno's singletons. Two labels (Left and Right) have been defined for managing the steering wheel turn. There are also separate definitions for bend and straight line driving; unlimited turning of the steering wheel is allowed for tracking bends, the output being defined between 0 and 1. In straight road tracking, steering wheel turns must be very limited, and we emulate this behavior by confining the possible values of the Steering variable to the [0, 0.025] interval. This means that the car can turn at most 2.5% of the total steering wheel rotation. The defuzzification method is the weighted average (1), where O is the output value and S·W_i is the steering value inferred by each rule i:

O = \frac{\sum_i S \cdot W_i}{\sum_i W_i}    (1)

Specifically, S = 1 for driving around bends and S = 0.025 for straight driving. Finally, W_i is the weight of rule i, that is, the extent to which the current input values satisfy the conditions of rule i.
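A sketch of this four-rule controller with weighted-average defuzzification is given below. The sign convention (negative errors point Left) is our assumption, and the membership vertices are the pre-tuning values read from Figure 1.

```python
# Four-rule lateral controller with weighted-average defuzzification
# (sign convention and membership saturation are our assumptions).
def ramp(x, vertex):
    """Membership degree of an error of magnitude |x| (vertex = full degree)."""
    return min(abs(x) / vertex, 1.0)

def steering(angular_err, lateral_err, straight=True):
    S = 0.025 if straight else 1.0                    # output scaling
    av, lv = (2.0, 0.8) if straight else (63.0, 3.0)  # vertices from Figure 1
    # Left-pointing errors (negative) fire the 'Steering Right' rules
    w_right = (ramp(angular_err, av) if angular_err < 0 else 0.0) \
            + (ramp(lateral_err, lv) if lateral_err < 0 else 0.0)
    w_left  = (ramp(angular_err, av) if angular_err > 0 else 0.0) \
            + (ramp(lateral_err, lv) if lateral_err > 0 else 0.0)
    w = w_right + w_left
    return 0.0 if w == 0 else S * (w_right - w_left) / w

print(steering(angular_err=-1.0, lateral_err=0.2))    # slight net right turn
```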
2.1.4. Controller Tests
This version of the fuzzy controller has been installed in the test-bed vehicle to test its performance. We plotted an automatic route in which straight segments of road alternate with tight bends. The results are shown in Figure 2. The reference trajectory map used for defining the desired route is formed from the representative GPS positions of the target route. The car drives on the right, as on the continent, and left and right turns are included to test the performance of the controller under all circumstances. Figure 2 shows that the intuitively adjusted controller performs as desired. Looking closely, however, we find that the controller bend mode is not finely tuned. Two problems appear when taking right-hand bends. First, the car starts turning too late, leading to a massive left-hand lane invasion, corrected by turning the steering wheel as far as it will go. This maximum steering wheel turn leads to the second problem: a costly adaptation maneuver and trajectory overshoot until the vehicle recovers the correct positioning. Around left-hand bends, the car adapts correctly to the map shape, but the steering turn decreases a little at the end of these bends, and the car goes off the side of the road, forcing the control system to sharply correct the trajectory until the car is positioned correctly again. At right turns, the left lane very often has to be invaded in human driving, because the presence of a curb prevents the car from cutting the corner. However, the car should get as close as possible to the corner.
2.1.5. Fine Tuning the Steering Fuzzy Controller
After examining system performance, the membership functions of the bend controller have to be fine-tuned. The driving for left or right turns is different when the vehicle is in the right-hand lane from when it is traveling down the center of the road. That is, unlike the straight control, the input membership functions for the bend control cannot be symmetric. The first problem concerning right-hand bends happens because there are overactuations on the steering. The two rules involved in this situation are:
IF Angular-Error Left THEN Steering Right
IF Lateral-Error Right THEN Steering Left
The reason behind this problem is that the second rule carries too much weight in the control system, because the slope of Lateral-Error Right (Figure 1) forces the car to go too far into the other lane before it loses strength. To correct this, the slope of that label needs to be smoothed out. After some tests, the optimum value for the vertex of the function was found to be -5.9 meters.
Figure 2. Automatic driving route through the driving zone. The figure shows the edges of the road, centerline, reference trajectory map and actual car trajectory map.
The second problem with right-hand bends is that steering overactuations occur when the car tries to get back into the right lane after taking the bend. In this case, it is the rules complementary to the ones described for the above problem that come into play:
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Left THEN Steering Right
This problem is caused by the first rule and happens after the car invades the left-hand lane, thus activating the second rule: the correction to take the car back to the correct path is too sharp, and there is a little oscillation. The solution is to smooth the shape of Lateral-Error Left. The newly found value is 3.75 m. The left turns involve the following rules, in which the tracking error occurs:
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Right THEN Steering Left
The reason behind the problem is that, when the angular error is too low and the lateral error is near zero, the steering correction to the left is insufficient. To correct this, the strength of Angular-Error Right needs to be increased to amplify the actuation of the steering wheel. So, we set the new value at -53°. After modifying the controller, we repeated the experiment to compare the results. The new route is shown in Figure 3. As we can see, the problems that appeared in the first version of the controller have been corrected.
3. Conclusion
An artificial drive-by-wire steering system can be implemented using simple and intuitive fuzzy rules and input and output variables, which can be defined and fine-tuned based on human experience and knowledge. The experiments show a human-like behavior that is as good in straight lane driving as in driving around bends, demonstrating the power of the developed controllers.
Figure 3. Automatic route tracking controlled with the fine-tuned fuzzy system.
Acknowledgments
The authors wish to thank the Spanish Ministries MFOM and MCYT, which support the COPOS and ISAAC projects, and Citroën España SA.

References
1. R. Garcia et al., Integration of Fuzzy Techniques and Perception Systems for ITS. Computational Intelligent Systems for Applied Research, FLINS 2002, World Scientific, Gent, Belgium, pp. 314-320 (2002).
2. A. Broggi et al., Automatic Vehicle Guidance: the Experience of the ARGO Autonomous Vehicle, World Scientific (1999).
3. D. Pomerleau, ALVINN: An Autonomous Land Vehicle In a Neural Network, Advances in Neural Information Processing Systems 1, Morgan Kaufmann (1989).
4. M. Hitchings et al., Fuzzy Control, Intelligent Vehicle Technologies, Vlacic, Parent, Harashima eds., pp. 289-327, SAE International (2001).
5. M. Sugeno, Industrial Applications of Fuzzy Control, North Holland (1985).
6. J. E. Naranjo et al., Adaptive Fuzzy Control for Inter-Vehicle Gap Keeping, IEEE Trans. on ITS, Vol. 4, No. 3, Sep. (2003).
7. E. H. Mamdani, Application of Fuzzy Algorithms for Control of a Simple Dynamic Plant, Proc. IEE, 121, 12, pp. 1585-1588 (1974).
8. T. Takagi and M. Sugeno, Fuzzy Identification of Systems and Its Applications to Modeling and Control, IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-15, No. 1, pp. 116-132 (1985).
A NEW SONAR LANDMARK FOR PLACE RECOGNITION
A. PONCELA*, C. URDIALES, C. TRAZEGNIES AND F. SANDOVAL
Departamento de Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071, Málaga, Spain
*Email: [email protected]
This paper presents a new sonar based landmark to represent significant places in an environment for localization purposes. This landmark is based on extracting the contour free of obstacles around the robot from a local evidence grid. This contour is represented by its curvature, calculated by a noise-resistant function which adapts to the natural scale of the contour at each point. Then, curvature is reduced to a short feature vector by using Principal Component Analysis. The landmark calculation method has been successfully tested in a medium scale real environment using a Pioneer robot with Polaroid sonar sensors.
1. Introduction

One of the main concerns of autonomous navigation is localization, defined as the problem of correlating the coordinate system of a robot with that of the external world. The most classic approach, odometry, consists of equipping the wheels of the vehicle with encoders to estimate displacements. Unfortunately, odometry is highly affected by noise due to deformation of the wheel radius, wheel slippage, vibrations and other errors. To achieve reliable localization, most methods rely on acquiring and processing external information to compare it to an internal environment model, either available a priori or acquired on-line. Commercial applications (e.g. [4]) typically rely on artificial beacons to reduce the problem to measurement, correlation and triangulation. This is fast and accurate [3] but depends on knowing the beacon layout a priori. A second choice is to use natural landmarks in the environment. In this case, the robot acquires landmarks on-line while its position is still known and then uses the learnt ones to correct odometry. These algorithms depend on the kind of sensors available, typically video-cameras, optical range finders and sonar sensors. While vision-guided localization results in superior accuracy with reduced speed
and increased cost, optical range finders present a low performance when compared to the large beamwidth of non-expensive sonar sensors [1]. Thus, this paper focuses on a method to extract significant landmarks from sonar sensors. Sonar sensors provide the distance to obstacles around the robot so that structures like corridors, corners or intersections can be perceived. The main problem of these sensors is that they are affected by errors, specially angular inaccuracy and multiple echoes. Most sonar error correcting techniques rely on fusing sensor information in time and space, usually by means of evidence grids [5]. These grids are fast and easy to construct and they efficiently combine any number of sensor readings. However, grids usually involve a huge data volume. Also, they are affected by the robot heading and relative position, so they are not easy to compare. To avoid this problem, some methods prefer to work with raw range data [6][8], which is corrected by using statistical methods like Principal Component Analysis (PCA). Other methods [9] rely on matching techniques to evaluate the similarity between grids. In this work, we propose a new method to combine both approaches: we rely on grids to achieve more stable sensor readings, but grids are reduced to a short feature vector by means of PCA. However, prior to PCA, we extract the contour of the area free of obstacles around the robot and represent it by means of a curvature function very stable against noise and transformations. Thus, the feature vector returned by PCA is resistant against noise and transformations too. This paper presents these vectors and shows how they can be used to recognize significant places in the environment.
2. Contour representation

Sonar landmarks are extracted from local grids built around the robot [5]. A local grid is a two-dimensional tesselation of the space around the robot into cells presenting a probability of occupation ranging from 0 to 1. We propose a new landmark extraction algorithm to extract a short feature vector from a local grid so that grids can be easily compared and stored. Our method initially relies on representing the contour of the region free of obstacles (RFO) in each grid by means of its curvature. First, the grid is thresholded: cells whose occupancy value is below a threshold U are considered free space and the rest are occupied or non-explored. Then, the thresholded grid is stored in polar coordinates to easily look for the closest non-free cell Oφ to the robot in each direction φ. Then, we use a median
filter to partially remove noise from the grid. The contour C is given by Oφ, ∀φ ∈ [-180°, 180°]. Fig. 1 shows the different stages of the contour extraction algorithm. In the grid in Fig. 1.a, obstacles, free space and unexplored areas are printed in white, black and gray, respectively. Fig. 1.b shows the thresholded local map after obtaining the closest obstacles Oφ around the robot, which is noisy because of sonar errors. The filtered map is presented in Fig. 1.c and, after all regions in the grid not centered around the robot are removed, the RFO appears in Fig. 1.d. Finally, Fig. 1.e presents the contour of the grid in Fig. 1.a.
Figure 1. Contour extraction: (a) local map; (b) thresholded local map; (c) thresholded local map after filtering; (d) region of interest; (e) contour of the local map in (a).
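The contour extraction stages just described can be summarized in the following sketch. The grid size, the threshold value and the median filter width are illustrative assumptions; only the threshold/polar-scan/median-filter sequence follows the text.

```python
import numpy as np
from scipy.signal import medfilt

def extract_contour(grid, threshold=0.5, n_angles=360, max_range=None):
    """Contour of the region free of obstacles around a robot centered in `grid`.

    grid: 2-D array of occupancy probabilities in [0, 1].
    Returns, for each direction phi, the distance to the closest non-free cell.
    """
    cy, cx = grid.shape[0] // 2, grid.shape[1] // 2
    if max_range is None:
        max_range = min(cy, cx)
    contour = np.full(n_angles, float(max_range))
    for i, phi in enumerate(np.linspace(-np.pi, np.pi, n_angles, endpoint=False)):
        # Walk outwards from the robot; stop at the first occupied/unexplored cell.
        for r in range(1, max_range):
            y = cy + int(round(r * np.sin(phi)))
            x = cx + int(round(r * np.cos(phi)))
            if grid[y, x] >= threshold:  # not free space
                contour[i] = r
                break
    return medfilt(contour, kernel_size=5)  # median filter against sonar noise
```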
Contour C is represented by its adaptively estimated curvature function, calculated as proposed in [7]:
1. Calculation of the incremental chain code of C: for each pixel n, (Δx(n), Δy(n)), the difference in x and y between pixels n and n+1.
2. Calculation of the maximum contour length free of discontinuities around n, k(n). k(n) is obtained by comparing the Euclidean distance d(n-k(n), n+k(n)) to the real length of the contour l_max(k(n)). k(n) is the largest value that satisfies:
d(n-k(n), n+k(n)) ≥ l_max(k(n)) - U_k    (1)
3. Calculation of the incremental adaptive chain code (Δx(n)_k, Δy(n)_k) associated to n.
4. Calculation of the curve slope Ang(n) at every point n of C, equal to the arctangent of the adaptive chain code increments, Ang(n) = arctan(Δy(n)_k / Δx(n)_k).
5. Calculation of the curvature at every point n, CF_n. This value is locally approximated by Ang(n+1) - Ang(n).
The main advantage of this curvature function is that it is very resistant against noise [7]. However, it shifts depending on the robot orientation
even for the same position. Thus, we use the modulus of its FFT. Figs. 2.a and c present two different real contours and the clearly different |FFT| of their curvature (Figs. 2.b and d). 256-point |FFT|s are still complicated to compare and store, so we use Principal Component Analysis (PCA) to compact them. PCA is used to reduce the dimensionality of a data set of correlated variables while retaining most of their original variation. Given a set of |FFT|s of N arbitrarily chosen local maps, we can find a P-dimensional subspace, P ≤ N, which all |FFT|s belong to. Thus, any |FFT| can be represented with only P components on a basis of P ≤ N vectors. PCA allows calculation of this P-dimensional basis. Fig. 3.a shows a real test environment and the positions where 250 grids are gathered to calculate our basis. The basis calculation is performed off-line and only once. Then, |FFT|s can be projected onto it on-line in a fast way. Figs. 3.b and c show how much information is explained by different sized bases. It can be noted that most information is explained by the first component, roughly the function average. The variance percentage of the input set explained by the first 10 components is equal to 85%. It has been tested that a basis of 10 vectors is enough to separate different places unless differences are quite subtle.
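A minimal sketch of this landmark pipeline is given below. It assumes the contour curvature is already available as a sequence, and uses an SVD-based PCA rather than any specific library routine; the random training data is a placeholder.

```python
import numpy as np

def fft_signature(curvature):
    """Orientation-invariant descriptor: modulus of the FFT of the curvature."""
    return np.abs(np.fft.fft(curvature, n=256))

def pca_basis(signatures, p=10):
    """Off-line step: P-dimensional basis from N training |FFT|s (one per row)."""
    X = np.asarray(signatures)
    mean = X.mean(axis=0)
    # Principal directions are the right singular vectors of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:p]

def landmark(curvature, mean, basis):
    """On-line step: project a new |FFT| onto the stored basis."""
    return basis @ (fft_signature(curvature) - mean)

# Usage: basis from 250 training maps, then 10-component vectors for new contours.
train = [np.random.rand(360) for _ in range(250)]   # placeholder curvatures
mean, basis = pca_basis([fft_signature(c) for c in train], p=10)
vec = landmark(np.random.rand(360), mean, basis)
```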
3. Tests and results

Once a basis is available, the robot can gather local grids while it is moving and project their curvature |FFT| onto the basis to acquire a landmark at its current position. If the grids used to calculate the basis were captured at places different enough, the resulting feature vectors are representative whether the current local map was used to obtain the basis or not. To test the validity of the proposed landmark extraction method, we have performed several experiments in a real environment using a Pioneer robot equipped with frontal Polaroid sonar sensors. To guarantee that a landmark
Figure 2. (a) Contour of local map; (b) |FFT| of the curvature function of contour in (a); (c) contour of local map; (d) |FFT| of the curvature function of contour in (c).
is representative enough, grids corresponding to mostly non-explored areas are discarded. This is important because in mostly non-explored areas grids depend strongly on the orientation of the robot and may change significantly from one sensor reading to the next. A landmark is representative as long as similar places yield similar landmarks and different ones do not. Thus, a first test has consisted of extracting landmarks at close positions while the robot is moving ahead. If those positions are close enough, the layout of the environment around the robot is bound to remain mostly the same in absence of dynamic obstacles. Thus, landmarks should be very similar. Fig. 4.a shows nine consecutive RFO contours captured in the test environment in Fig. 3.a and their feature vectors (Fig. 4.b). Some of these nine contours were used to obtain the basis and some of them were purposefully not. Nevertheless, despite minor differences between these contours, it can be noted that the feature vectors are very similar. Fig. 5 shows a second test of landmark similarity. In this case, all landmarks have been captured at different locations (Fig. 5.a). However, it can be observed that landmarks 1 and 3 are captured at similar locations, both having a wall on the right. Landmark 2, however, is captured in front of a wall, near a corner, while landmark 4 is captured at an obstacle-free location. Fig. 5.b shows the feature vectors of landmarks 1 and 3. It can be observed that both vectors are quite similar, as expected. Fig. 5.c shows landmarks 2 and 4, which are clearly different between themselves and also from landmarks 1 and 3. In order to establish a distance between landmarks for matching, we have evaluated different metrics and chosen the
Figure 3. (a) Landmarks for basis calculation in the real test environment; (b) percentage of variance explained by the principal components.
Figure 4. (a) Consecutively captured grid contours; (b) vectors for contours 1-9; (c) landmark locations.
Tanimoto distance [2], which yielded the best results. Table 1 shows the distance between each pair of landmarks. It can be appreciated that landmarks 1 and 3 are the closest ones. However, landmarks 1 and 2 are not too far. This is reasonable because they correspond to near positions of the robot. However, the wall appearing in front of landmark 2 makes them different enough to discriminate between both places. Finally, landmark 4 is quite different from all the rest.
Table 1. Tanimoto distances

            Landmark1  Landmark2  Landmark3  Landmark4
Landmark1   0          0.1014     0.059      0.2768
Landmark2   0.1014     0          0.1572     0.3362
Landmark3   0.059      0.1572     0          0.1793
Landmark4   0.2768     0.3362     0.1793     0
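For completeness, a small sketch of the Tanimoto distance between two feature vectors is given below. It uses the common continuous form 1 - x·y/(‖x‖² + ‖y‖² - x·y); whether [2] uses exactly this variant is an assumption.

```python
import numpy as np

def tanimoto_distance(x, y):
    """Tanimoto distance between two feature vectors (continuous form, assumed)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    dot = x @ y
    sim = dot / (x @ x + y @ y - dot)  # Tanimoto similarity
    return 1.0 - sim

print(tanimoto_distance([1.0, 2.0, 3.0], [1.0, 2.5, 2.5]))
```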
Figure 5. (a) Landmark locations; (b) vectors for landmarks 1 and 3; (c) vectors for landmarks 2 and 4.
4. Conclusions and future work

This paper has presented a new sonar-based landmark for localization. Landmarks are extracted from the curvature of the region free of obstacles around the robot in local grids. The moduli of the FFT of the curvatures are processed by Principal Component Analysis to extract a short feature vector. These vectors have been successfully evaluated in a real environment using a Pioneer robot equipped with 8 frontal Polaroid sensors. Landmarks are useful to distinguish different places but, since similar places at different locations yield similar vectors, future work will focus on statistically combining information from consecutive landmarks for localization purposes.

Acknowledgments

This work has been partially supported by the Spanish Ministerio de Ciencia y Tecnologia (MCYT) and FEDER funds, project No. TIC2001-1758.

References

1. G.C. Anousaki and K.J. Kyriakopoulos, Simultaneous localization and map building for mobile robot navigation, IEEE Robotics & Automation Magazine, pp. 42-53, September, (1999).
2. J.D. Holliday, C.Y. Hu and P. Willet, Grouping of Coefficients for the Calculation of Inter-Molecular Similarity and Dissimilarity using 2D Fragment Bit-Strings, Combinatorial Chemistry & High Throughput Screening, 5 (2), pp. 155-166, (2002).
3. L. Kleeman, Optimal estimation of position and heading for mobile robots using ultrasonic beacons and dead-reckoning, Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA '92), Nice, France, pp. 2582-2587, (1992).
4. T.S. Levitt and D.T. Lawton, Qualitative navigation for mobile robots, Artificial Intelligence, 44 (3), pp. 305-360, (1990).
5. H.P. Moravec, Sensor fusion in certainty grids for mobile robots, AI Magazine, 9, pp. 61-74, (1988).
6. C. Urdiales, A. Bandera, R. Ron and F. Sandoval, Real time position estimation for mobile robots by means of sonar sensors, Proc. of the 1999 IEEE Int. Conf. on Robotics and Automation (ICRA '99), pp. 1650-1655, (USA), (1999).
7. C. Urdiales, C. Trazegnies, A. Bandera and F. Sandoval, Corner detection based on adaptively filtered curvature function, Electronics Letters, 39 (5), pp. 426-428, (2003).
8. N. Vlassis and B. Kröse, Robot Environment Modeling via Principal Component Regression, Proceedings of the 1999 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS 1999), pp. 677-682, Kyongju, Korea, (1999).
9. B. Yamauchi and R. Beer, Spatial learning for navigation in dynamic environments, IEEE Trans. on Syst., Man and Cyb., 6 (26), pp. 496-505, (1996).
AUTOMATIC PARKING WITH COLLISION AVOIDANCE
D. MARAVALL AND J. DE LOPE
Department of Artificial Intelligence, Faculty of Computer Science, Universidad Politécnica de Madrid, Campus de Montegancedo, 28660 Madrid, Spain
E-mail: {dmaravall, jdlope}@fi.upm.es
M. A. PATRICIO
Department of Computer Science, Universidad Carlos III de Madrid, Campus de Colmenarejo, Madrid, Spain
E-mail: [email protected]
Automatic parking of a car-like robot in the presence of a priori unknown obstacles is the topic of the paper. This problem is solved by means of a bio-inspired approach in which the robot controller does not need to know the car kinematics and dynamics, neither does it call for a priori knowledge of the environment map. The key point in the proposed approach is the definition of performance indexes that for automatic parking happen to be functions of the strategic orientations to be injected, in real time, to the robot controller. This solution leads to a dynamical multicriteria optimization problem, which is extremely hard to deal with analytically. A genetic algorithm is therefore applied. The results of computer simulations are finally discussed.
1. Introduction

In this paper we address the problem of automatically parking a back-wheel-drive vehicle with the additional difficulty created by the presence of a priori unknown obstacles, so that the car controller has to autonomously perform, in real time, two different tasks: parking and collision avoidance. To solve this double-sided problem we put forward a solution based on a biomimetic approach that we have recently proposed [1, 2].
2. Sensory-Motor Coordination in Automated Vehicles Using a Biomimetic Approach

The idea underpinning the method is to optimize specific measurable behavior indexes of the robot using appropriate sensors. The optimization is solved by means of heuristic techniques, which makes the robot controller highly flexible and very simple. In the manner that living beings solve their physical control problems, like manipulation and locomotion, the robot develops a behavior strategy based on the perception of its environment, embodied as behavior indexes and aimed at improving (optimizing) the evolution of the above-mentioned behavior indexes. It does all this following the known perception-decision-action cycle. The robot vehicle considered in this paper is a conventional back-wheel-drive car, whose discrete dynamics can be modeled, for the low-speed range typical of parking maneuvers, by the standard kinematic equations of a car-like vehicle,
where (x, y) are the coordinates of the point of application of the force of traction on the vehicle; θ is the heading of the vehicle on the plane on which it is moving; v is its speed; L is the distance between the front and back axles; and the variable φ is the direction of the driving wheels with respect to the vehicle heading θ. Obviously, (v, φ) are the robot control variables and (x, y, θ) are its state variables. Finally, φ_max is the maximum angle that can be applied to the direction of the driving wheels. In the case of automatic parking without obstacles, there are two behavior indexes of interest: J1 and J2. These two indexes quantify the goal that the robot should park in the final position (x_d, y_d) and the goal that the robot should park in line with the parking space direction, θ_d, respectively.
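The model equations are not reproduced here, so the sketch below uses the standard discrete kinematic bicycle model that matches the variables just defined; the time step, the parameter values and the quadratic form of the two indexes are illustrative assumptions.

```python
import math

def step(x, y, theta, v, phi, L=2.5, dt=0.1, phi_max=math.radians(30)):
    """One step of the standard discrete kinematic model of a back-wheel-drive car.

    (v, phi) are the control variables, (x, y, theta) the state variables;
    L, dt and phi_max are illustrative values, not taken from the paper.
    """
    phi = max(-phi_max, min(phi_max, phi))   # steering is mechanically bounded
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / L) * math.tan(phi) * dt    # heading change from wheel angle
    return x, y, theta

# Behavior indexes for parking at (x_d, y_d) with direction theta_d
# (simple quadratic forms assumed; the paper's exact expressions are not shown):
J1 = lambda x, y, xd, yd: (x - xd) ** 2 + (y - yd) ** 2
J2 = lambda theta, thetad: (theta - thetad) ** 2
```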
Set out in the terms described above, automatic parking can be considered as a standard multicriteria optimization problem, where the agent control actions (in this case, the direction of the steering wheel, because the robot is moving at constant speed) should simultaneously minimize the two indexes introduced above. As any driver will have found
in practice, the dynamic coordination of these indexes in a parking maneuver is an extremely complex control problem, as, in nonholonomic vehicles, any slip in the combination of the actions suggested by the approach and heading indexes may be disastrous for the parking maneuver. The presence of arbitrary and a priori unknown obstacles introduces an additional goal (i.e., collision avoidance) and, consequently, an additional behavior or performance index, which is the topic of the next section.
3. Collision Avoidance

We have previously proposed [3, 4] a method for autonomous navigation of mobile robots, based on the coordination of some strategic orientations of the mobile robot. These orientations correspond, roughly speaking, to the two fundamental tasks of any mobile robot, including car-like robots: (1) collision avoidance and (2) goal reaching, which in automatic parking is, obviously, the parking space. As regards the general task of collision avoidance, we introduced [3] a generalization of the well-known artificial potential field (APF) theory [5]. More specifically, we added to the customary normal orientation the tangential orientation, making it possible for the robot to perform smoother and more efficient trajectories for collision avoidance. We suppose that the robot only knows the parking space coordinates (x_d, y_d) and its direction θ_d. To perform the parking maneuver, the car is equipped exclusively with four sets of ultrasound sensors placed at each of its four edges. Roughly speaking, the normal orientation represents the objective or goal of flying from the obstacle, whereas the two tangential orientations, right and left, are meant for the robot to round up the obstacles, either by following the right or the left direction. Obviously, the other important orientation for the robot is the one that provides the trajectory towards the parking space. It is straightforward to show [4] that the normal orientation φ^n is given by the direction of the gradient of the classic repulsive potential field U_r(x, y) [5],
in which ρ(x, y) is the shortest distance from the car to the nearest obstacle and ρ_0 is a threshold distance beyond which the obstacles do not influence the car movements. The tangential orientations are obtained by rotating the normal orientation by ±90°,
where the opposing signs correspond to the right and left tangential orientations. One important drawback of these expressions is that we need to compute, at each robot position, the repulsive potential field to obtain the two orientations for collision avoidance. We have applied a simpler method for the computation of both orientations φ^n and φ^τ by means of a ring of ultrasound sensors, so that we shall suppose that the normal and tangential orientations are available at each robot position without computing the exact potential field function.
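As a concrete illustration, the sketch below computes the normal and tangential orientations from the classic repulsive potential of Khatib [5] by numerical differentiation. The exact form and gain of the potential are assumptions, since the paper's expressions are not reproduced here.

```python
import math

def repulsive_potential(rho, rho0=3.0, eta=1.0):
    """Classic APF repulsive potential [5] (form assumed): active within rho0."""
    if rho >= rho0:
        return 0.0
    return 0.5 * eta * (1.0 / rho - 1.0 / rho0) ** 2

def orientations(x, y, rho, h=1e-3):
    """Normal (flee) and tangential (round up) orientations at (x, y).

    rho(x, y) returns the distance to the nearest obstacle.
    """
    dUx = (repulsive_potential(rho(x + h, y)) - repulsive_potential(rho(x - h, y))) / (2 * h)
    dUy = (repulsive_potential(rho(x, y + h)) - repulsive_potential(rho(x, y - h))) / (2 * h)
    phi_n = math.atan2(dUy, dUx)          # normal: along the repulsive gradient
    phi_t_right = phi_n - math.pi / 2      # tangential orientations: +/- 90 deg
    phi_t_left = phi_n + math.pi / 2
    return phi_n, phi_t_right, phi_t_left

# Example with a single point obstacle at (5, 0):
rho = lambda x, y: math.hypot(x - 5.0, y - 0.0)
print(orientations(3.0, 0.5, rho))
```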
4. Dynamic Multicriteria Optimization with Genetic Algorithms

Summarizing, we have at each instant all the necessary orientations to guide the car towards the parking space without colliding with the existing obstacles: (1) the orientation φ^1 that optimizes the performance index J1, which drives the robot towards the final position (x_d, y_d); (2) the orientation φ^2 that optimizes the performance index J2, responsible for making the car park in line with the parking space direction θ_d; (3) the normal orientation φ^n that precludes the robot from being too close to the obstacles, and the tangential orientation φ^τ that constrains the robot to follow the optimum trajectory to circumvent the obstacles. The next and crucial step is to coordinate all these competitive and opposing control orientations:
φ^d(t) = f[φ^1(t), φ^2(t), φ^n(t), φ^τ(t)]    (6)
where by φ^d(t) we refer to the desired and final orientation to be applied to the car. The simplest, although by no means trivial, coordination of the competitive orientations is a linear one:
φ^d(t) = w_1(t)φ^1(t) + w_2(t)φ^2(t) + w_n(t)φ^n(t) + w_τ(t)φ^τ(t)    (7)
Therefore, at each instant the car controller computes its final orientation φ^d(t) as a function of the four basic angles, so that the key point is
to select the suitable set of instantaneous weights w_1(t), w_2(t), w_n(t) and w_τ(t). This is a rather difficult optimization problem, due to the dynamical nature of the four competitive objectives, as they depend on the relative position and shape of the existing obstacles, which are highly uncertain and a priori unknown by the designer of the robot controller. To make this multicriteria optimization problem more manageable, let us first merge the two orientations φ^1(t) and φ^2(t) into a single goal orientation:

φ^g(t) = w_1(t)φ^1(t) + w_2(t)φ^2(t)    (8)

so that the final car orientation can be rewritten as
φ^d(t) = w_g(t)φ^g(t) + w_n(t)φ^n(t) + w_τ(t)φ^τ(t)    (9)
Now, let us consider the following rule-based reasoning. (1) If the nearest obstacle is very close to the robot, then give maximum priority to the normal orientation. (2) If the nearest obstacle is at an intermediate distance, then the tangential orientation has maximum priority. (3) If the nearest obstacle is far from the robot, then give maximum priority to the goal orientation. This reasoning process, expressed as linguistic rules, can be implemented either by means of a fuzzy logic-based controller or by a process based on the dynamical coordination of multiple performance indices [1, 4]. More specifically, these rules can be formalized, as far as the numerical values of the weights w_g, w_n and w_τ are concerned, in the way shown in Fig. 1, where we have represented the distribution of the coordination parameters as a function of the distance of the robot to the nearest obstacle.
Figure 1. Distribution of the coordination parameters w_g, w_n and w_τ as a function of distance d.
Fig. 1 is self-explanatory. Thus, d_m is a critical distance in the sense that it acts as a security threshold beyond which the robot behavior is
dominated by the objective of flying from the nearest obstacle by following the normal orientation. In a similar way, d_M is another crucial distance beyond which the robot behavior is dominated by the objective of going straightforwardly to the goal. Finally, the third parameter Δd determines the region of influence of the tangential navigation, i.e. when the robot's main objective is to round up the nearest obstacle by following the tangential orientation. Thanks to the introduction of this dependence on distance of the coordination weights w_g(t), w_n(t) and w_τ(t), the search space has been dramatically simplified and reduced. Now, the multiobjective optimization problem exclusively depends on the three critical distances d_m, Δd and d_M, which unlike the parameters w_g(t), w_n(t) and w_τ(t) are not time-dependent, making the optimum search one order of magnitude simpler, as compared with the direct optimization based on the coordination weights. As regards the coordination of the two subgoal orientations φ^1(t) and φ^2(t), we shall make use of the results obtained in our previous work on automatic parking without obstacles. Hence, we introduce the following linguistic rules: (1) If the vehicle is far from the parking space, then priority should be given to the approach subgoal (index J1 and orientation φ^1). (2) If the vehicle is near to the target, then priority should be given to the heading subgoal (index J2 and orientation φ^2). Note the similarity of these rules with the preceding ones, although now there are only two coordination weights, w_1(t) and w_2(t), and, consequently, two optimization parameters, d'_m and d'_M, for the coordination of the subgoal orientations. In summary, there are two embedded optimization processes. The first one concerns the computation of the goal orientation φ^g(t) as a result of the coordination of the two subgoals for the parking task, i.e. the approximation orientation φ^1(t) and the alignment orientation φ^2(t). Once φ^g(t) has been obtained, the next optimization process affects the coordination of this goal orientation and the two orientations for collision avoidance, i.e. the normal and tangential orientations. The optimum search is very hard to solve by means of analytical methods or by means of gradient-based techniques, due to the highly uncertain and a priori unknown spatial distribution and shape of the obstacles. Therefore, we have applied a genetic algorithm to solve this twofold multicriteria optimization problem.
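A sketch of this distance-based scheduling of the coordination weights is given below. The piecewise-linear ramps are an assumption consistent with the description of Fig. 1; the three critical distances are the parameters that the genetic algorithm evolves.

```python
def coordination_weights(d, d_m, delta_d, d_M):
    """Distance-based scheduling of (w_g, w_n, w_tau); ramp shapes assumed linear.

    d < d_m          -> normal orientation dominates (flee the obstacle)
    d_m .. d_M band  -> tangential orientation dominates (round up the obstacle)
    d > d_M          -> goal orientation dominates (go to the parking space)
    """
    if d <= d_m:
        return 0.0, 1.0, 0.0                      # (w_g, w_n, w_tau)
    if d >= d_M:
        return 1.0, 0.0, 0.0
    if d <= d_m + delta_d:                        # tangential influence builds up
        a = (d - d_m) / delta_d
        return 0.0, 1.0 - a, a                    # blend normal -> tangential
    a = (d - d_m - delta_d) / max(d_M - d_m - delta_d, 1e-9)
    return a, 0.0, 1.0 - a                        # blend tangential -> goal

print(coordination_weights(3.0, d_m=2.0, delta_d=1.5, d_M=6.0))
```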
The experiments were conducted using the University of Sheffield's Genetic Algorithm Toolbox for Matlab [7]. For all cases, a 20-bit resolution binary coding was used for the parameters processed; the parameter ranges depend on the variables to be optimized. The stochastic sampling method was used to select individuals. The crossover probability used is 0.7; the mutation probability is set proportionally to the size of the population and is never over 0.02. Additionally, elitism from generation to generation is used. Hence, 90% of the individuals of each new population are created by means of the selection and crossover operators and 10% of the best individuals of each generation are added directly to the new population. As regards the fitness evaluation, quality is determined by rewarding the individuals that minimize the final position error: in each experiment, the closer an individual is to the position defined as the target at the end of the path, the better this individual is. Additionally, individuals who manage to reach the target along a shorter path are also considered better, although the weighting of this factor is lower. Finally, a penalty is added every time the robot collides with an obstacle. The experiments were actually designed by defining a set of initial and final vehicle positions that would cover the different relative situations between the starting position of the robot and the final goal. Each individual generated in the evolutionary process was simulated with these initial and final conditions to thus determine its problem-solving quality. As mentioned above, the five parameters evolved were d_m, d_M, Δd, d'_m and d'_M. The number of individuals in each population was 40 and convergence to the best solutions was reached in the 60th generation. Fig. 2 shows the path achieved by the fittest individual of two different generations.
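The following sketch mirrors that setup with a generic generational GA over the five distances; the fitness weighting, the decoding, the toy simulation stub and the operators are simplified assumptions rather than the toolbox's actual implementation.

```python
import random

N_BITS, POP, GENS = 20, 40, 60      # 20-bit coding, 40 individuals, 60 generations
RANGES = [(0.0, 10.0)] * 5          # assumed ranges for d_m, d_M, delta_d, d'_m, d'_M

def decode(bits):
    vals = []
    for i, (lo, hi) in enumerate(RANGES):
        chunk = bits[i * N_BITS:(i + 1) * N_BITS]
        frac = int("".join(map(str, chunk)), 2) / (2 ** N_BITS - 1)
        vals.append(lo + frac * (hi - lo))
    return vals

def simulate(params):
    """Hypothetical stand-in for the parking simulations (not from the paper)."""
    d_m, d_M, delta_d, dm2, dM2 = params
    return abs(d_M - d_m - 3.0), d_M, 0   # (position error, path length, collisions)

def fitness(params):
    # Position error dominates, path length weighs less, collisions are penalized.
    pos_error, path_len, collisions = simulate(params)
    return -(pos_error + 0.1 * path_len + 10.0 * collisions)

pop = [[random.randint(0, 1) for _ in range(5 * N_BITS)] for _ in range(POP)]
for _ in range(GENS):
    ranked = sorted(pop, key=lambda b: fitness(decode(b)), reverse=True)
    elite = ranked[: POP // 10]                      # 10% elitism
    children = []
    while len(children) < POP - len(elite):
        a, b = random.sample(ranked[: POP // 2], 2)  # select among the better half
        cut = random.randrange(1, len(a))
        child = a[:cut] + b[cut:] if random.random() < 0.7 else a[:]
        children.append([g ^ (random.random() < 0.02) for g in child])
    pop = elite + children
print(decode(max(pop, key=lambda b: fitness(decode(b)))))
```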
5. Concluding Remarks
Automatic parking of a car-like robot in the presence of unpredictable obstacles has been solved by means of a bio-inspired method, in which the robot controller optimizes, in real time, its trajectory towards the parking space without colliding with the existing obstacles. The coordination of the two main tasks of the robot, i.e. parking and collision avoidance, leads to a dynamical multicriteria optimization problem which has been solved using a genetic algorithm. The results of computer simulations have shown the viability of the proposed method.
Figure 2. Paths achieved by the fittest individual. Left: 20th generation, d_m = 5.113, d_M = 2.068, Δd = 1.702, d'_m = 7.982, d'_M = 18.868. Right: 60th generation, d_m = 2.559, d_M = 5.998, Δd = 2.746, d'_m = 2.199, d'_M = 4.576.
Acknowledgments

This work has been partially funded by the Spanish Ministry of Science and Technology, project DPI2002-04064-C05-05.
References

1. D. Maravall, J. de Lope, "A Bio-Inspired Robotic Mechanism for Autonomous Locomotion in Unconventional Environments", C. Zhou, D. Maravall, D. Ruan (eds.), Autonomous Robotic Systems: Soft Computing and Hard Computing Methodologies and Applications, Physica-Verlag, Heidelberg, 2003, pp. 263-292.
2. D. Maravall, J. de Lope, "A Reinforcement Learning Method for Dynamic Obstacle Avoidance in Robotic Mechanisms", D. Ruan, P. D'Hondt, E.E. Kerre (eds.), Computational Intelligent Systems for Applied Research, World Scientific, Singapore, 2002, pp. 485-494.
3. J. de Lope, D. Maravall, "Integration of Reactive Utilitarian Navigation and Topological Modeling", C. Zhou, D. Maravall, D. Ruan (eds.), Autonomous Robotic Systems: Soft Computing and Hard Computing Methodologies and Applications, Physica-Verlag, Heidelberg, 2003, pp. 103-139.
4. D. Maravall, J. de Lope, "Integration of Potential Field Theory and Sensory-based Search in Autonomous Navigation", 15th IFAC World Congress, International Federation of Automatic Control, Barcelona, 2002.
5. O. Khatib, "Real-time obstacle avoidance for manipulators and mobile robots", Int. J. of Robotics Research, 5(1), 1986, pp. 90-98.
6. D. Maravall, J. de Lope, M.A. Patricio, "Competitive Goal Coordination in Automatic Parking", Proc. 1st European Workshop on Evolutionary Algorithms in Stochastic and Dynamic Environments, Coimbra, (2004).
7. A. Chipperfield, P. Fleming, H. Pohlheim, C. Fonseca, Genetic Algorithm Toolbox for Matlab, Department of Automatic Control and Systems Engineering, University of Sheffield, 1994.
MODELING THE RELATIONSHIP BETWEEN NONWOVEN STRUCTURAL PARAMETERS AND THEIR PROPERTIES FROM FEW NUMBER OF DATA

P. VROMAN, L. KOEHL, X. ZENG
Ecole Nationale Supérieure des Arts & Industries Textiles, 9 rue de l'Ermitage, Roubaix 59100, France

T. CHEN
College of Textiles, Donghua University, Shanghai 200051, P.R. China
This paper deals with the modeling of the relations between the functional properties (outputs) and the structural parameters (inputs) of nonwoven products using soft computing techniques. To reduce the complexity of such a model, a selection of the most relevant input parameters is required. A new selection method based on a ranking criterion is first presented. Several models, taking into account the specificities of the nonwoven families, are then defined using multilayer feed-forward neural networks. The interest of the selection method is also tested and discussed.
1. Introduction
Nonwoven products have been more and more widely used nowadays because of their numerous and interesting functional properties (e.g. insulation, protection, filtration, breathability). The number of end-uses designed with nonwoven materials has significantly grown in the last decades while the production in Western Europe rose by 8% [1]. Consequently, great attention has been paid to exploring the relationships between the structural parameters of nonwovens (thickness, basis weight, raw material, ...) and their functional properties. Such an approach enables manufacturers to obtain a better understanding of the influence of the nonwoven structure and the related process parameters on the product quality. Thus, we develop, based on soft computing techniques, several mathematical models for characterizing the relations between the structural parameters (input variables) and the functional properties (output variables) in order to design new nonwoven products [2]. On top of the nonlinear relations between input and output variables, the complexity of the designed model is related to the huge number of structural parameters, their interdependencies and also the critical lack of available data. A new selection procedure of the input variables (structural parameters) based on a ranking criterion is first presented [3]. In order to solve the difficulty that only a
few measured data are available for parameter selection and modeling, due to the limitations of the production lines, our method has been developed by properly integrating both human knowledge on processes and products and measured data. Similar work has been done using neural networks [4]. After selecting the relevant structural parameters, the model characterizing the relation between the structural parameters and each property is developed using a multilayer feed-forward neural network [5]. This modeling procedure has been successfully applied to the prediction of the filtration property of nonwoven products.
2. Selection Procedure of Relevant Structural Parameters
As it is quite difficult to produce large numbers of samples for studying the influence of each structural parameter on the functional properties selected from the final product specifications, small-scale ANN models will be built from a limited number of learning data, and the most relevant structural parameters should be selected before the modeling procedure. A criterion is defined to rank the structural parameters considering both the human knowledge and their sensitivity to the properties [3]. Accordingly, the ranking criterion is the linear combination of two elements. The first element (H) represents the human knowledge on products. The second element (S_k) represents the sensitivity of the measured data of nonwoven properties, based on a distance-based method, which is defined according to the following two rules:
1. IF a small variation of an input variable corresponds to a big variation of the output variable, THEN this input is considered as a sensitive variable.
2. IF a big variation of an input variable corresponds to a small variation of the output variable, THEN this input is considered as an insensitive variable.
Therefore, according to this criterion (F_k), an input variable is considered relevant if its small variation induces a great variation of an output. The ranking criterion is formulated as follows. Denote X_s = (x_s1, x_s2, ..., x_sk, ..., x_sn)^T the input vector of all the structural parameters and Y_s = (y_s1, y_s2, ..., y_sj, ..., y_sm)^T the output vector of properties that correspond to the sample s (s ∈ {1, ..., z}). All the recorded data have been normalized to eliminate the scale effects and the learning data set contains z samples. To rank the relevant inputs for a given output y_j, a criterion variable F_k is defined as follows:

F_k = g_1 · H(x_k, y_j) + g_2 · S_k    (1)
where d_k⊥(X_i, X_l) = √(d²(X_i, X_l) - d_k²(X_i, X_l)), k ∈ {1, ..., n}, j ∈ {1, ..., m}, and g_1 and g_2 are two positive coefficients. The criterion is designed for searching the best compromise between both the human knowledge and the measured data. d(X_i, X_l) is the Euclidean distance between X_i and X_l, d_k(X_i, X_l) is the projection of d(X_i, X_l) on the axis x_k, and d(y_ij, y_lj) is the Euclidean distance between y_ij and y_lj. The larger F_k is, the more relevant the input x_k will be to this y_j. The first element H of the ranking criterion characterizes the degree of coherence between the human knowledge, expressed as seen in Table 1, and the variation of the measured data. Its principle is as follows. If a variable x_k has the same variation trend in the learning data set as in the human knowledge, it will be considered as relevant. Otherwise, it will be considered as irrelevant. The universe of discourse of y_j is divided into t equivalent intervals C_p. The set A_kp is constructed with the set of input data x_k which belongs to the output interval C_p of y_j. I_kp is generated by the overlap between A_kp and A_kp+1 (p ∈ {1, ..., t-1}). The human knowledge is expressed with linguistic sentences, such as: IF x_1 is increasing THEN y_1 is increasing (see Table 1): R(x_1, y_1) = +1. Then H can be calculated from the lengths |I_kp| and |U_kp| of the intervals which, as shown in Figure 1, correspond to the intersection and union generated by A_kp and A_kp+1, respectively.
Table 1. Extract of the human knowledge table

Structural Parameters   Hydraulic Functional Properties (Output space)
(Input space)           liquid strike through time   filtration level   water permeability
thickness               -1                            +1                 -1
basis weight            +1                            -1                 +1
Figure 1. Relationship between the input and output spaces.
After calculating H and S_k, the value of the relevancy criterion F_k for each input x_k and the given output y_j can be determined. Then all the F_k's can be ranked in descending order. Namely, the input corresponding to the highest value of F_k will be the most relevant input to this output, and so on. According to this procedure, the most relevant structural parameters of nonwoven products are obtained and will be used for the further modeling procedure. These parameters take into account both the conformity of the human knowledge on process technology and the sensitivity of the measured data to the physical properties.
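A compact sketch of this ranking procedure is given below. Since the exact expression for S_k is not reproduced here, the sensitivity term is a simplified distance-ratio surrogate, and H is a simple sign-agreement surrogate; only the combination F_k = g1·H + g2·S_k follows the text.

```python
import numpy as np

def sensitivity(X, y, k):
    """Simplified surrogate for S_k: average output change per unit change of x_k.

    Small input variation with large output variation -> sensitive (rules 1-2).
    """
    num, den = 0.0, 0.0
    for i in range(len(X)):
        for l in range(i + 1, len(X)):
            num += abs(y[i] - y[l])
            den += abs(X[i, k] - X[l, k]) + 1e-12
    return num / den

def rank_inputs(X, y, R, g1=0.5, g2=0.5):
    """Rank inputs by F_k = g1*H + g2*S_k; R[k] is the +1/-1/0 expert trend."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    scores = []
    for k in range(X.shape[1]):
        trend = np.sign(np.corrcoef(X[:, k], y)[0, 1])  # observed trend of x_k vs y
        H = 1.0 if trend == R[k] else 0.0               # coherence with knowledge
        scores.append(g1 * H + g2 * sensitivity(X, y, k))
    return np.argsort(scores)[::-1]                     # most relevant first
```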
3. Artificial Neural Network Modeling
In our work, since there are several families of nonwoven products and each family has some different structural parameters, all the structural parameters are divided into two groups. One group includes the common structural parameters for all the families and the other group includes the specific structural parameters of each family. Accordingly, two kinds of neural network models are established. A general model makes use of all common structural parameters as its inputs. A specific model makes use of the common and specific structural parameters of each family as the model inputs (Figure 2). The Levenberg-Marquardt fast learning procedure [5] is then used for determining the parameters of the neural networks from the public and the specific data groups.
Figure 2. Feed forward neural network general model structure.
4. Results and Discussion
In our work, 18 samples have been used for studying the water permeability of several families of nonwovens. At first, the structural parameters are selected using the ranking method described above. The same weight of 0.5 is assigned to both the human knowledge criterion (H) and the data sensitivity (S_k). Table 2 gives the 7 most relevant structural parameters of nonwoven fabrics. For the general neural network model, the first 5 most relevant structural parameters are used as public input variables. Next, a specific model is built for only one family of nonwoven, manufactured with a chemical binder. In this model, the binding rate is added to the set of input variables. The general model is trained with 15 data and validated on 3 new data. The specific model is trained with 5 data and validated with only one datum.
Table 2. Ranking of structural parameters according to their relevancy to water permeability

                          Data Sensitivity    Human Knowledge        Ranking
Input                     S_k      Rank       R      H               F_k      Rank
Basis weight              0.1000   7          -1     0.6177          0.3588   1
Thickness                 0.1006   6          -1     0.5212          0.3109   2
Fiber density             0.1086   1          -1     0.4574          0.2830   3
Total pore volume         0.1036   5          +1     0.4248          0.2642   4
Basis weight uniformity   0.1052   2          +1     0.3333          0.2193   5
Fiber count               0.1052   2          -1     0               0.0526   6
Fiber length              0.1045   4          0      0               0.0523   7
Two points are worth noting about these models:
1. The general model makes use of samples from several families, which differ from each other in many aspects, while the specific model only makes use of samples of one family, which have quite a few similarities.
2. The specific model uses the same structure, learning procedure and final weights and biases as the general model, and adjusts the weights connecting the specific input and the hidden neurons.
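The sketch below illustrates this general/specific arrangement with a tiny NumPy MLP. The network size, the plain gradient-descent training (in place of Levenberg-Marquardt), the freezing strategy and the placeholder data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(W1, b1, W2, b2, X):
    h = np.tanh(X @ W1 + b1)            # one hidden layer, as in Figure 2
    return h @ W2 + b2, h

def train(W1, b1, W2, b2, X, y, lr=0.05, epochs=500, mask=None):
    """Plain gradient descent (stand-in for Levenberg-Marquardt)."""
    for _ in range(epochs):
        out, h = forward(W1, b1, W2, b2, X)
        err = out - y[:, None]
        gW2 = h.T @ err; gb2 = err.sum(0)
        gh = (err @ W2.T) * (1 - h ** 2)
        gW1 = X.T @ gh
        if mask is not None:
            gW1 = gW1 * mask            # adjust only the specific-input weights
        W1 -= lr * gW1 / len(X)
        if mask is None:                # general model: update everything
            b1 -= lr * gh.sum(0) / len(X)
            W2 -= lr * gW2 / len(X); b2 -= lr * gb2 / len(X)

n_pub, n_spec, n_hid = 5, 1, 4
# 1) General model: 5 public inputs (thickness, basis weight, ...).
W1 = rng.normal(0, 0.5, (n_pub, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, 1)); b2 = np.zeros(1)
Xg, yg = rng.random((15, n_pub)), rng.random(15)      # placeholder data
train(W1, b1, W2, b2, Xg, yg)

# 2) Specific model: reuse the trained weights, extend with the binder-rate
#    input, and adjust only the weights from that new input to the hidden layer.
W1s = np.vstack([W1, rng.normal(0, 0.5, (n_spec, n_hid))])
mask = np.vstack([np.zeros((n_pub, n_hid)), np.ones((n_spec, n_hid))])
Xs, ys = rng.random((5, n_pub + n_spec)), rng.random(5)
train(W1s, b1, W2, b2, Xs, ys, mask=mask)
```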
5. Conclusion
In this paper, the relationship between the structural parameters and the functional properties of nonwoven products is modeled using neural networks. In order to reduce the complexity and solve the difficulty of insufficient available data, we select the most relevant structural parameters as input variables of the model according to the data sensitivity and the conformity of human knowledge. In the modeling procedure, a general model is first designed for all families of nonwovens. It is built from the set of public input variables. Then, a specific model is built for one family of nonwoven by adding specific structural parameters to the set of public input variables. In the learning procedure, the general model is learnt from all available nonwoven samples while the specific model adjusts its weights and biases considering the specific input. The simulation results show low prediction error on water permeability using both the general and specific models.

References
1. Western Europe: nonwoven production +8%. Technical Textiles, 46(3): E100 (2003).
2. L.A. Zadeh, The Roles of Fuzzy Logic and Soft-Computing in the Conception, Design and Deployment of Intelligent Systems, BT Technol. J. 14(4), 32-36 (1996).
3. L. Koehl, P. Vroman, T. Chen and X. Zeng, Selection of Nonwoven Relevant Structural Parameters by Integrating Measured Data and Human Knowledge, Multiconference Congress CESA 2003, 9-11 July, Lille, France.
4. S.J. Raudys and A.K. Jain, Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners, IEEE Trans. on Pattern Analysis and Machine Intel. 13(3) (1991).
5. D.W. Patterson, Artificial Neural Networks - Theory and Applications, Prentice Hall (1996).
A NEW FRAMEWORK FOR FUZZY MULTI-ATTRIBUTE OUTRANKING DECISION MAKING

OZGUR KABAK
Istanbul Technical University, Department of Industrial Engineering, 34367 Macka, Istanbul, Turkey
FUSUN ULENGIN†
Istanbul Technical University, Department of Industrial Engineering, 34367 Macka, Istanbul, Turkey
†Corresponding author: [email protected]
In this paper, a new fuzzy multi-attribute decision making framework for group decision making is proposed. The proposed framework combines both the concept of fuzzy outranking and fuzzy attributes to provide a more flexible way of comparing the alternatives. It consists of three steps. In the first step, the preferences of the decision makers are gathered and then aggregated. In the second step, the attributes are normalized by considering the threshold and the target values of each attribute. At the last step, the results of the previous steps are used as input to a fuzzy multi-attribute decision making approach in order to rank and/or select alternatives. A fuzzy logic based computation is applied in each step for the imprecise or ambiguous data.
1. Introduction
In many real life situations, one of the basic problems is the uncertainty contained in the information about the situation. Additionally, whenever the aim is to make a comparison of an alternative set based on different indicators, it is very hard to find exact quantitative values for all of them. For example, it is very difficult to arrive at straightforward values to measure ecological sustainability. In general, we set a number of criteria for the sustainability of a system and we call this system sustainable if its dynamics never drive it outside the boundaries of acceptable values for these criteria. Therefore, in addition to knowledge about the current situation, it is important to formulate targets for assessing progress toward goals. However, real world systems are complex systems characterized by subjectivity, incompleteness and imprecision. Therefore, the specification of clearly defined targets is rather unrealistic. Fuzzy logic has the ability to deal with complex concepts which are not amenable to a straightforward quantification and contain ambiguities. The starting point of decision models in a fuzzy environment was the model introduced in [4]. Traditional approaches to discrete multicriteria evaluation in a
fuzzy environment are provided by [13], [7], [2], [3], and [5]. For a detailed overview of these methods see [10]. In the proposed framework, a fuzzy outranking approach is used in order to take into account the non-compensatory nature of some specific problems. [11] and [12] used outranking relations effectively by introducing fuzzy concordance relations and fuzzy discordance relations.
2. The Framework of the Proposed Model
The proposed model to evaluate the introduced problem is composed of 3 steps. Details of the steps of the model will be given in the following sections.
2.1. Determination of Weights

In the first step, the weights of the attributes are determined according to experts' opinions based on a 1-6 scale. The experts are asked to evaluate the contribution of each attribute to the evaluation. They have the opportunity to indicate their preferences as very high, high, fairly high, fairly low, low and very low. Then these preferences are represented as triangular fuzzy numbers (TFNs) according to Chen and Hwang's fifth scale [6]. Subsequently, by means of fuzzy arithmetic operations, the logic of which depends on [9], the average weight of each indicator is first calculated and normalization is then applied to obtain normalized weights. The mathematical formulation of the weights is given below:
S_j = (1/e) · Σ_{k=1..e} P_kj    (1)

w_j = S_j / Σ_{j=1..m} S_j    (2)

where the preference of the kth expert for the jth indicator is P_kj = (p_kj^L; p_kj^M; p_kj^R), the average weight of the jth indicator is S_j = (S_j^L; S_j^M; S_j^R), e is the number of experts, the normalized weight of the jth indicator is w_j = (w_j^L; w_j^M; w_j^R), and m is the number of indicators.
=rs:,
s,=(s,L;sy;sP), e is
), average
the normalized weight of the jth indicator is
,=I
w ,= (,$;
w y ;wp ), and m is the number of indicators.
2.2. Measurement of Performance Scores In the second step, performance scores are measured. Instead of using the data for each attribute directly, normalization is made to obtain a common scale to allow aggregation and to facilitate fuzzy computation. For this purpose, the
479
revised version of the fuzzy normalization proposed by [8] is used. The basic difference of the revised approach is that not only the target values but also the thresholds of each attribute are taken into account. This permits the possibility of avoid trade-off among attributes. Additionally, due to the vague and uncertain characteristics of these two values, they are represented by TFNs. The corresponding normalization formulations are given below: For utility attribute (the higher, the better), the following normalization is used:
min(max(x,,;LBf)-LB,!;TP -LB~).min(max(x,,,LB:)-LB~;T~ -LB:) TP - LB: Ti" - LB," min(max(x,; LB:)- LB,L;T; - LB;) T: - LB;
1
(3)
For the cost type of attribute (the lower, the better), however, the normalization is made by: min(UBf -rnax(x,;UBF);UB; -T:) UBj" -qL
min(UBy -max(x,,UBy);UBj" - T y ) . UBj" -T,M
min(UBy -max(x,;UBy),UBf -T,!) U B f -T,!
1
(4)
Here, Nu is a TFN and it represents the normalized value of a alternative i with respect to attributej; xii is the entry value of alternative i with respect to attribute j ; LBj is the lower bound of the utility attribute j and it is a TFN characterized by LB, = (LBR; LB!' l ;LBI.); UBj is the upper bound of the cost attribute j and it is a I l and T j is the target value of TFN characterized by UBj = (uBR;uBM;uB~.), J J J attribute j and it is characterized by Tj = (T,!;T; ; T ; ) . Here; subtraction and division operations for TFNs are based on the definitions in [6].
2.3. Fuzzy Outranking Approach for the Comparison of Alternatives In this study the revised version of the methodology suggested by Aouam et.al. [ 11 is proposed for the comparison of alternatives. Herein details of methodology will not given but changes made with respect to [ 11 will be emphasized. The original method can take both crisp and fuzzy inputs. An outranking intensity is introduced to determine the degree of overall outranking between competing alternatives, which are represented by fuzzy numbers. The comparison of these degrees is made through the concept of overall existence ranking index (I(a)). In the methodology first of all fuzzy concordance (dC(a,b)) and disconcordance functions (dD(a,b)) are calculated. Then outranking intensity (df(a,b)) is measured that is used to get the overall outranking intensity (I(a)). In
480
order to get fuzzy concordance function, the partial outranking number (dj(a,b)) and the fuzzy concordance number (C(a,b)) are calculated respectively. Maximum non-significant threshold (sj) and criteria weights (pj) are inserted to the model at this level. Additionally, in order form fuzzy disconcordance function, fuzzy disconcordance numbers (Dj(a,b)) are used in which fuzzy veto thresholds (vj) are included. The main drawbacks encountered in Aouam et.aZ.'s methodology are related to fuzzy operations. Although it is claimed that the model can take fuzzy inputs, there are some problems on applying the model for fuzzy inputs. Therefore in this study some revisions are made on the model. The first revision is made on the computation of partial outranlung fuzzy number, dj(a,b). If attributej is fuzzy then in the proposed revision dj(a,b) is defined as fallows:
d;(ab> = ( (giwL- g;(b>R>+, (gi(aIM- gi(b>M>+,
- gj(b>L>+ + s;
)I
(5)
where gj(a)L; gj(a)M,and gj(a)Rare the left, central and right values, respectively, of the triangular fuzzy membership function of gj(a). The second revision is made on the calculation of fuzzy concordance number as fallows:
where ('p:)L, (pi)M, and ('p:)R are left, central and right values, respectively, of the triangular fuzzy membership function of pic and
If gj(a) and g,(b) are fuzzy numbers then the comparison of them in eq.(7) will require fuzzy ranktng method. For this purpose in revised version we use Baldwin and Guild fuzzy ranking approach [6] to decide whether gj(a) 2 gj(b) or not. According to this approach if we want to compare A and B fuzzy numbers, and if the membership functions ~AA(xA)and ~B(xB)are piecewise linear functions then the preferability of A over B is
where 6 and y are right and central values, respectively, of the TFN of A; a and p are left and central values, respectively, of the TlW of B. As a result the fuzzy number with higher preferability will be bigger. If we apply this approach to eq.(7) , we get
48 1
Another revision made on [l] is concerned with the calculations of fuzzy disconcordance number for fuzzy attribute j , that is as fallows:
The revised method is also different from the original one on the calculation of fuzzy disconcordance function. During the calculation of fuzzy disconcordance numbers (DJ(a,b)), the comparison of two fuzzy numbers is required. Baldwin and Guild’s ranking approach is suggested to solve this problem. Finally fuzzy outranlung function and overall intensity index is defined similarly to that used in [ 13. Overall intensity index I(A) is used to evaluate alternatives.
3.
Conclusions and Further suggestions
In this paper, a fuzzy outranking approach which is non-compensatory in nature and which takes into account the fuzziness of the weights of the indicators as well as those of the threshold values is proposed. The proposed method is applied to the evaluation and ranking of 36 countries using 47 attributes. Details can be found in Kabak and Ulengin [14]. The proposed method improves the one given in Aouam et al. [ 11 by using more accurate fuzzy attributes as input. In the original paper if inputs are TFNs there is a possibility of getting TFNs such that a > b or b > c or a>c (TFN is considered as (a; b; c)), which is, in fact, in conflict with the logic behind TFNs (for a TFN a I b
482
Finally if the fuzzy ranking approach used in the proposed fuzzy outrankmg method, Baldwin Guild’s method, is substituted by another method the solution may change. The effects of applying different fuzzy ranking approaches can also be studied. Similarly the robustness of the method to the application in different areas should also be investigated.
References
1. Aouam, T., Chang S.I. and Lee E.S., “Fuzzy MADM: An outranking method”, European Journal of Operational Research, 145, 3 17-328 (2003) 2. Baas, S.M. and Kwakernaak, H., “Rating and ranking of multiple aspect alternative using fuzzy sets”, Automatica, 13,47-58 (1977). 3. Baldwin, J.F. and Guild, N.C., “Comparison of fuzzy sets on the same decision space”, Fuzzy Sets and Systems, 2 , 2 13-231 (1979) 4. Bellman, R. and Zadeh L.A., “Decision making in fuzzy environment”, Management Science, 17, 141-164 (1970). 5 . Bonissone, P.P., “A fuzzy set based linguistic approach: Theory and applications”, M.M. Gupta and E. Sanches (eds.), Approximate Reasoning in Decision Analysis, 329-339 (1982). 6. Chen, S. and Hwang, C., “Fuzzy multiple attribute decision making: Methods and applications”, , Springer-Verlag, Germany (1992) 7. Laarhoven P.J.M. and Perdrycz W., “A fuzzy extension of Saaty’s priority theory”, Fuzzy Sets and Systems, 11,229-241 (1983) 8. Phillis, Y.A. and Andriantiantsaholiniaina, L.A.,“Sustainability: An illdefined concept and its assesment using fuzzy logic”, Ecological Economies, 37,435-456 (2001). 9. Raj P.A., Kumar D.N., “Ranking alternatives with fuzzy weights using maximizing set and minimizing set”, Fuzzy SetsandSys., 105, 365-375, (1999) 10.Ribeiro, R.A., “Fuzzy multiple attribute decision making: A review and new preference elicitation techniques”, Fuzzy Sets and Sys., 78, 155-181 (1996) 11.Roy, B., “Partial preference analysis and decision-aids: The fuzzy outranking relation concept”, D. E. Bell, R. L. Keeney and H. Raiffa (eds.), Conflicting Objectives in Decisions, Newyork, 40-75 (1977). 12. Siskos, J.L., Lochard, J., and Lombard, J., “A Multicriteria decision making methodology under fuzziness: Application to the evaluation of radiological protection in nuclear power plants”, H.J. Zimmerman (ed.), TZMS/Studies in Management Science, 20, Elsevier SP, B.V., North-Holland, 261-283 (1984) 13. Yager, R.R., “Fuzzy decision making including unequal objectives”, Fuzzy Sets and Systems, 1, 87-95 (1978) 14.Kabak O., Ulengin F., “A new methodology for Sustainable Development Assessment: Fuzzy Logic Approach”, TUB-ZTU Workshop Proceedings, 83121 (2004)
FAULT DIAGNOSIS IN AIR-HANDLING UNIT SYSTEM USING DYNAMIC FUZZY NEURAL NETWORK
JUAN DU AND MENG JOO ER
School of EEE, Nanyang Technological University, Singapore 639798
e-mail: [email protected], [email protected]
In this paper, an efficient fault diagnosis method for the Air-Handling Unit (AHU) using dynamic fuzzy neural networks (DFNNs) is presented. The proposed fault diagnosis method has the following salient features: (1) structure identification and parameter estimation are performed automatically and simultaneously without partitioning the input space and selecting initial parameters a priori; (2) fuzzy rules can be recruited or deleted dynamically; (3) fuzzy rules can be generated quickly without resorting to back-propagation (BP) iteration learning, a common approach adopted by many existing methods. Simulation results demonstrate that fast training and diagnosis speed and a high diagnosis rate can be achieved.
1. Introduction
It is generally accepted that the performance of Heating, Ventilating, and Air-Conditioning (HVAC) systems often falls short of expectations. The acknowledgment of the presence of faults in an HVAC system, and the fact that they can lead to occupant discomfort, increased energy use and shorter equipment life, have resulted in considerable efforts being devoted to the development of fault detection and diagnostic methods for HVAC systems. Fault diagnosis can be thought of as pattern recognition, and neural networks are well suited to this task. They estimate a function without requiring a mathematical description of how the output functionally depends on the input; they learn from examples. In [2], the Back-Propagation (BP) network was utilized to diagnose the faults of an Air-Handling Unit (AHU) system. Fuzzy set theory has long been considered a suitable framework for pattern recognition. It can be used in diagnosis systems because fuzzy systems can articulate their knowledge into IF-THEN rules as a kind of expert knowledge. Fuzzy logic and neural networks are complementary technologies. A combination of them will have the advantages of both neural networks and fuzzy systems. In this paper, we introduce a dynamic-fuzzy-neural-network (DFNN) based fault diagnosis system to deal with the problem of fault diagnosis in data generated by an AHU simulation model.
’.
483
484
DFNN has the salent features of no predetermination of fuzzy rules and data space clustering and automatic and simultaneous structure and parameters learning automatically and simultaneously by online hierarchical learning algorithm. This paper is organized as follows. Section 2 gives a brief description of the AHU and the residuals used in the fault diagnosis. The seven faults and domain residuals are then described. The description of DFNN and its learning algorithm are presented in Section 3. Section 4 shows the simulation results and some comparative studies with other learning algorithms. Lastly, conclusions are drawn in Section 5 . 2. System and Model Description
2.1. Air-Handling Unit
A schematic diagram of the variable-air-volume (VAV) AHU utilized for this study is shown in Figure 1. The AHU consists of fans, dampers, coils, sensors, and controllers. The static pressure in the main supply duct is maintained at a constant set point of 249 Pa (1.0 in. of water) by controlling the rotational speed of the supply fan. The supply air temperature is maintained at a constant set point of 14.5 °C. The airflow rate difference between the supply and return airstreams is maintained at a constant set point of 0.472 m3/s (1000 ft3/min) by controlling the variable-speed return fan. A simulation model of the VAV AHU is used to generate the data used in this study. The model is based on steady-state characteristic equations and approximate first-order dynamics.
Figure 1. Schematic diagram of an AHU.
Figure 2. Fault situations.
2.2. Residuals
Residuals are defined as the differences between the actual and expected values of a variable or parameter 4. An expected value could be a set point or a model prediction. Lee et al. (1996) identified patterns of residuals to use as signatures for various faults; the approach is the same in this study. The six residuals which are needed to identify the seven faults are described here:
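In code form, each residual is simply the actual value minus the expected value (a set point or a model prediction). The sketch below assembles the six-element residual vector; the field names and the expected-value inputs (Ucc_ev, Ns_ev, Nr_ev) are hypothetical stand-ins, while the set points are those quoted in Section 2.1.

```python
from dataclasses import dataclass

# Minimal sketch of the residual computation described in Section 2.2.
# Field names and the expected-value inputs are illustrative assumptions.

@dataclass
class AhuSample:
    Ts: float       # supply air temperature (deg C)
    Ps: float       # supply duct static pressure (Pa)
    dQ: float       # supply/return airflow difference (m3/s)
    Ucc: float      # actual cooling coil valve control signal
    Ucc_ev: float   # expected cooling coil valve control signal
    Ns: float       # measured supply fan speed
    Ns_ev: float    # expected supply fan speed from Us
    Nr: float       # measured return fan speed
    Nr_ev: float    # expected return fan speed from Ur

TS_SET, PS_SET, DQ_SET = 14.5, 249.0, 0.472  # set points from Section 2.1

def residual_vector(s: AhuSample) -> list[float]:
    """Residual = actual value minus expected value (set point or model)."""
    return [
        s.Ts - TS_SET,       # R_Ts
        s.Ps - PS_SET,       # R_Ps
        s.dQ - DQ_SET,       # R_dQ
        s.Ucc - s.Ucc_ev,    # R_U
        s.Ns - s.Ns_ev,      # R_Ns
        s.Nr - s.Nr_ev,      # R_Nr
    ]
```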
Here R denotes a residual value, Ts is the supply air temperature, Ts,set is the set-point value of Ts, Ps is the supply air pressure, Ps,set is the set-point value of Ps, ΔQ is the airflow rate difference between the supply and return ducts, and ΔQ,set is the set-point value of ΔQ. A residual, R_U, is defined for the operation of the cooling coil valve: Ucc denotes the actual control signal to the cooling coil valve and Ucc,EV denotes the expected value of Ucc. R_Ns and R_Nr are the residuals for the operation of the supply and return fans; Ns and Nr are the measured speeds of the supply and return fans, and Us and Ur are the control signals for the supply and return fans, respectively.
2.3. Fault Description
There are two types of faults, namely complete faults (or abrupt failures) and performance degradations. Complete failures are severe and abrupt faults; performance degradations are gradually evolving faults. Although there are many kinds of potential faults in an AHU, the seven different equipment and instrumentation faults shown in Figure 2 are considered in this study, based on experimental testing 2, 5. The faults are introduced when the system is operating at normal, steady-state conditions, and the dominant symptoms correspond to steady-state conditions after a fault has occurred 2, 6. All the faults are simulated in MATLAB.
3. Dynamic-Fuzzy-Neural-Network-Based Fault Diagnosis
3.1. Structure of DFNN
In this study, we choose the DFNN whose structure is based on the ellipsoidal basis function (EBF) neural network; functionally, it is equivalent to a Takagi-Sugeno-Kang (TSK) model-based fuzzy system. The architecture of the DFNN is shown in Figure 3. Basically, it is a multi-input multi-output (MIMO) system which has r inputs and s outputs. The function of the nodes in each of the four layers is described here:
Layer 1: Each node in layer 1 represents an input linguistic variable.
Figure 3. Architecture of the DFNN.
Figure 4. Learning algorithm for the DFNN (rule generation, estimation of the premise parameters, pruning strategy, and determination of the consequent parameters).
Layer 2: Each node in layer 2 represents a membership function (MF), which is in the form of a Gaussian function:

$\mu_{ij}(x_i) = \exp\left[-\frac{(x_i - c_{ij})^2}{\sigma_{ij}^2}\right]$, $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, u$   (1)

where $\mu_{ij}$ is the $j$th membership function of the $i$th input variable $x_i$, and $c_{ij}$ and $\sigma_{ij}$ are the center and width of the $j$th Gaussian membership function of $x_i$, respectively.

Layer 3: Each node in layer 3 represents a possible IF-part of a fuzzy rule. For the $j$th rule, its output is

$\phi_j(x_1, x_2, \ldots, x_r) = \exp\left[-\sum_{i=1}^{r}\frac{(x_i - c_{ij})^2}{\sigma_{ij}^2}\right]$, $j = 1, 2, \ldots, u$   (2)

Hence, the firing strength of each rule in Equation (2) can be regarded as a function of the regularized Mahalanobis distance (M-distance),

$\phi_j = \exp(-\mathrm{md}^2(j))$

where $\mathrm{md}(j)$ is the M-distance, given by $\mathrm{md}^2(j) = (X - C_j)^T \Sigma_j^{-1} (X - C_j)$ with $X = (x_1, \ldots, x_r)^T$, $C_j = (c_{1j}, \ldots, c_{rj})^T$, and $\Sigma_j^{-1} = \mathrm{diag}(1/\sigma_{1j}^2, \ldots, 1/\sigma_{rj}^2)$.

Layer 4: Each node in layer 4 represents an output variable as a weighted summation of the incoming signals:

$y(x_1, x_2, \ldots, x_r) = \sum_{j=1}^{u} w_j \, \phi_j$
where $y$ is the value of an output variable and $w_j$ is the THEN-part (consequent parameter), or connection weight, of the $j$th rule.
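Read together, the four layers define a compact forward pass. The sketch below implements it directly; the centers, widths, and consequent weights are illustrative placeholders, since in the DFNN they are produced by the learning algorithm of the next subsection rather than fixed by hand.

```python
import numpy as np

# Minimal sketch of the DFNN forward pass defined by the layer equations above.
# All parameter values here are illustrative placeholders.

def dfnn_forward(x, centers, widths, weights):
    """x: (r,) input; centers, widths: (u, r); weights: (u, s).
    Returns the (s,) output vector y."""
    # Layers 2-3: rule firing strengths via the regularized M-distance,
    # phi_j = exp(-sum_i (x_i - c_ij)^2 / sigma_ij^2)
    md2 = np.sum(((x - centers) / widths) ** 2, axis=1)  # (u,)
    phi = np.exp(-md2)                                   # (u,)
    # Layer 4: weighted summation of the incoming signals
    return phi @ weights                                 # (s,)

rng = np.random.default_rng(0)
u, r, s = 4, 6, 7   # e.g. 4 EBF rules, 6 residual inputs, 7 fault outputs
y = dfnn_forward(rng.normal(size=r),
                 rng.normal(size=(u, r)),                # centers c_ij
                 np.abs(rng.normal(size=(u, r))) + 0.5,  # widths sigma_ij > 0
                 rng.normal(size=(u, s)))                # consequent weights
print(y.shape)  # (7,)
```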
3.2. Learning Algorithm
The flowchart of the DFNN learning algorithm is shown in Figure 4. The unique feature of the DFNN learning algorithm is that the system starts with no hidden units. Fuzzy rules can be recruited and deleted dynamically according to their Error Reduction Ratio (ERR) contribution to the system performance and the complexity of the mapped system. The weight matrix is calculated based on the Linear Least Squares (LLS) method. In the proposed system, not only can the parameters be adjusted, but the structure can also be self-adaptive.
4. Simulation Results
Faults in the AHU system are diagnosed by inputting residual vectors to the trained DFNN, and the DFNN was tested by using a jack-knife method 8. The residual vectors are obtained by introducing faults in a MATLAB® simulation of the corresponding AHU system and recording the subsequent response of the system. Residuals are calculated using system variables measured 50 seconds after the system begins running. The BPNN used in this study has a multilayer feedforward network structure and is trained using the BP learning rule. In our research, we choose the BPNN architecture as 7 (input) × 10 (hidden neurons) × 9 (output). A log-sigmoid activation function is used for both the hidden and output layers. The radial basis function network (RBFN) used in this study consists of the six-dimensional input being passed directly to a hidden layer with 15 neurons. It employs the gradient-descent learning algorithm with Gaussian RBFs. Detailed information on the fault diagnosis performance of the BPNN, RBFN and DFNN is shown in Table 1.
Table 1. Fault diagnosis performance of the BPNN, RBFN and DFNN.

                                   BPNN       RBFN      DFNN
Number of hidden layer neurons     10         15        4
Computing time (s)                 679.453    390.156   98.578
Average RMSE for testing data      0.258452   0.0469    0.0339
Accuracy of testing data (%)       88.17      98.10     98.53
From the above simulation results, we can see that: (1) The RBFN-based and DFNN-based fault diagnosis methods perform better than the BP-based method in terms of learning speed, average RMSE and accuracy. (2) Although its testing accuracy is comparable to that of the RBFN-based method, the DFNN is the fastest fault diagnosis method because of its smallest computing time; moreover, the DFNN-based method needs only 4 EBF neurons, which means it covers the different kinds of faults with a very compact structure. (3) The BPNN-based and RBFN-based methods need to predetermine the number of hidden layer neurons or RBF neurons, while the DFNN can decide the system structure automatically without any predetermination.
5. Conclusions
In summary, the proposed adaptive fault diagnosis method has the following attractive properties: (1) No predetermination of the initial number of EBF neurons (fuzzy rules) or of the input-output space clustering is required. (2) Learning is fast and efficient because the structure and the parameters are trained simultaneously. (3) The parameters of the consequent parts are computed by the LLS algorithm, which is simple and reliable enough to be used in real-time applications. The DFNN-based method can be extended in a straightforward manner to consider additional faults. As the complexity of the system and the number of faults considered grow, it may be desirable to improve the DFNN to be more compact and effective.
References
1. C. M. Bishop, Neural Networks for Pattern Recognition. Oxford, UK: Clarendon Press, 1995.
2. W. Y. Lee, J. M. House, C. Park, and G. E. Kelly, "Fault diagnosis of an air-handling unit using artificial neural networks," ASHRAE Transactions, vol. 102, pp. 540-549, 1996.
3. M. Meneganti, F. Saviello, and R. Tagliaferri, "Fuzzy neural networks for classification and detection of anomalies," IEEE Transactions on Neural Networks, vol. 9, pp. 848-861, 1998.
4. J. Gertler, Fault Detection and Diagnosis in Engineering Systems, New York: Marcel Dekker, 1998.
5. J. M. House, W. Y. Lee, and D. R. Shin, "Classification techniques for fault detection and diagnosis of an air-handling unit," ASHRAE Transactions, vol. 105, pp. 1987-1997, 1999.
6. W. Y. Lee, C. Park, and G. E. Kelly, "Fault detection of an air-handling unit using residual and recursive parameter identification methods," ASHRAE Transactions, vol. 102, pp. 528-539, 1996.
7. M. J. Er, S. Q. Wu, and Y. Gao, Dynamic Fuzzy Neural Networks: Architectures, Algorithms and Applications. New York: McGraw-Hill, 2003.
8. J. K. Kim and H. W. Park, "Statistical textural features for detection of microcalcification in digitized mammograms," IEEE Transactions on Medical Imaging, vol. 18, pp. 231-238, 1999.
PRIORITIZING DESIGN REQUIREMENTS BASED ON FUZZY OUTRANKING METHODS
CENGIZ KAHRAMAN*,1, TIJEN ERTAY2, CAFER ERHAN BOZDAG1
1 Istanbul Technical University, Department of Industrial Engineering, 34367 Maçka, Istanbul, Turkey
2 Department of Managerial Engineering, Istanbul Technical University, Maçka, 34367, Istanbul, Turkey
To increase customer satisfaction, Quality Function Deployment (QFD) is used to translate customer needs (CNs) into technical design requirements (DRs). The determination of the DRs for product development is very important, since these requirements are the vital keys to successful products. The methods used to evaluate DRs in the literature can be categorized into multi-criteria evaluation methods such as scoring methods, the analytic hierarchy process (AHP), the analytic network process (ANP), etc. There is a limited number of papers using multi-attribute outranking methods to evaluate DRs. This paper aims at comparing the results of three different fuzzy outranking methods for evaluating the DRs. A numerical example is presented to illustrate the use of these methods. A sensitivity analysis with respect to changing thresholds is also made by using software.
1. Introduction
Product design and development methodology is quickly becoming the new standard for the rapid creation of competitively priced, high-quality products. The methodology is a continuous process whereby a product's cost, performance, features, value, and time-to-market lead to a company's increased profitability and market share. A successful product development results in a product that can be produced and sold profitably. Product development should be a coordinated interdisciplinary effort with a clear mission; the development time depends on both the design requirements and the customer needs. There are many techniques to use for developing a product. Techniques like Quality Function Deployment (QFD), axiomatic design, and the theory of constraints constitute the basis for Product Life Cycle Development (PLCD). A single-criterion approach may obscure and bias the solution of a design problem. Consideration must be given to several criteria, which leads to multi-criteria optimization, where several objective functions are simultaneously optimized. Among the multi-criteria decision making methods, outranking methods constitute a class of ordinal ranking algorithms. As shown in the literature, multi-criteria decision-making procedures are widely applied today to ensure a rigorous analysis. For these reasons, outranking methods are quite suitable for decisions about which design requirements should be selected for product design. The outranking approach consists of building, on a set of alternatives, a relation (called the outranking relation) to represent the preferences of the decision-maker.
* Corresponding author. E-mail: [email protected], fax: +90 (212) 240 72 60
This relation is then used to make a choice or to rank the alternatives. The main objective is to provide the decision-maker with a simple method using realistic preference modeling for the selection of good alternative(s) and for ranking the alternatives. Some methods have been developed to solve this type of problem. ELECTRE is a multicriteria algorithm which reduces the size of the set of nondominated solutions; extensions to ELECTRE I are available: ELECTRE II, ELECTRE III, and ELECTRE IV. Besides, M. Roubens proposed a multicriteria decision-making method, ORESTE, in 1979, as an improvement over the ELECTRE method. PROMETHEE (Preference Ranking Organization METHod for Enrichment Evaluations) is a newer outranking method developed by J.P. Brans et al. (1984). This method features simplicity, clarity, and stability. There are four versions of PROMETHEE. The methods above were originally developed under certainty conditions, but it is not surprising that uncertainty always exists in the human world. Research that attempts to model uncertainty in decision analysis is done basically through probability theory and/or fuzzy set theory: the former represents the stochastic nature of decision analysis, while the latter captures the subjectivity of human behavior. Fuzzy set theory is a perfect means for modeling the uncertainty (or imprecision) arising from mental phenomena, which are neither random nor stochastic. There are two main characteristics of fuzzy systems that give them better performance for specific applications: fuzzy systems are suitable for uncertain or approximate reasoning, especially for systems whose mathematical model is difficult to derive, and fuzzy logic allows decision making with estimated values under incomplete or uncertain information. Büyüközkan et al. (2004) employed ANP to prioritize DRs by taking into account the degree of interdependence between the customer needs (CNs) and DRs, and the inner dependence among them. The paper is organized as follows: In Section 2, we briefly describe the fuzzy outranking methods used in QFD. Section 3 presents the steps of the algorithm related to the fuzzy outranking methods in QFD and the flow chart of FOuR, the developed software. Section 4 presents an illustrative example comparing the fuzzy outranking methods for prioritizing design requirements, and Section 5 gives the conclusions.
2. Fuzzy Outranking Methods in QFD
In the literature, there are five different fuzzy outranking approaches: Roy's (1977) approach, Takeda's (1982) approach, Siskos et al.'s (1984) approach, Brans et al.'s (1984) approach, and Martel et al.'s (1986) approach. From the point of view of simplicity and originality, we prefer Roy's, Brans et al.'s, and Siskos et al.'s approaches and will compare their results for the problem of prioritizing DRs. The details of the methods can be found in Chen and Hwang (1992).
2.1. Roy's Approach
Roy (1977) proposed the use of the degree of concordance and the degree of discordance to construct fuzzy outranking relations. There are three thresholds to be specified: the indifference, preference, and veto thresholds. After the fuzzy outranking relation is obtained, three possible ranking schemes can be used to compare alternatives: dichotomy, trichotomy, and rank ordering.
2.2. Siskos et al.'s Approach
Siskos et al. (1984) present a fuzzy outranking method that is similar to Roy's algorithm. There are two major differences between these algorithms: the formulas used in deriving the concordance relations, cj, and the discordance relations, dj, are different. After the outranking relation is constructed, Siskos et al. (1984) build a fuzzy dominance relation and subsequently a fuzzy nondominance relation. The alternative with the highest degree of nondominance is said to be the best.
2.3. Brans et al.'s Approach
Brans et al. (1984) proposed a family of outranking methods called Preference Ranking Organization METHods for Enrichment Evaluations (PROMETHEE). The PROMETHEE family is composed of PROMETHEE I, II, III, and IV. In Brans et al.'s (1984) approach, the first step is the initialization. The second is to calculate the differences between pairs of alternatives. Then the outranking relation matrix is constructed; in this step, preference functions H(d) are used to determine the preference of one alternative over another. Later, the degree of optimality is determined: to calculate the degrees of optimality, the degree of dominance (F-) is subtracted from the degree of outranking (F+). Finally, the ranking order is obtained, using one of the methods PROMETHEE I, PROMETHEE II, or PROMETHEE III.
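To make the net-flow step concrete, the following is a minimal PROMETHEE II sketch, assuming a linear preference function with indifference threshold q and preference threshold p; the score matrix, weights, and threshold values are invented for illustration and are not the paper's data.

```python
import numpy as np

# Minimal PROMETHEE II sketch: linear preference function, net flow
# F = F+ - F-. All numbers below are illustrative, not the paper's data.

def linear_pref(d, q=0.05, p=0.30):
    """H(d): 0 below the indifference threshold q, 1 above the
    preference threshold p, linear in between."""
    return np.clip((d - q) / (p - q), 0.0, 1.0)

def promethee_ii(scores, weights, q=0.05, p=0.30):
    """scores: (n alternatives, m criteria), all criteria to maximize."""
    n = scores.shape[0]
    pi = np.zeros((n, n))  # aggregated preference of a over b
    for a in range(n):
        for b in range(n):
            if a != b:
                d = scores[a] - scores[b]
                pi[a, b] = np.dot(weights, linear_pref(d, q, p))
    f_plus = pi.sum(axis=1) / (n - 1)   # degree of outranking, F+
    f_minus = pi.sum(axis=0) / (n - 1)  # degree of dominance, F-
    return f_plus - f_minus             # net flow; sort descending to rank

scores = np.array([[0.32, 0.00, 0.10, 0.68, 0.00],
                   [0.00, 0.05, 0.50, 0.00, 0.50],
                   [0.00, 0.41, 0.27, 0.00, 0.32]])
weights = np.array([0.20, 0.25, 0.30, 0.10, 0.15])
print(promethee_ii(scores, weights))
```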
3. Steps of Fuzzy Outranking Methods in QFD
The steps of the algorithm used in the developed software FOuR are given as follows:
Step 1. Obtain the importance weights of the CNs using fuzzy AHP (Büyüközkan et al., 2004).
Step 2. Obtain the matrices of paired comparisons of the DRs with respect to each CN.
Step 3. Combine the column eigenvectors obtained in Step 2, assuming that there is no dependence among the DRs.
Step 4. Select one of the three fuzzy outranking methods to determine the ranking order of the DRs, and select the inputs with respect to the selected outranking method.
Step 5. Repeat Step 4 for the other two fuzzy outranking methods and obtain the ranking orders.
Step 6. Make sensitivity analyses by decreasing or increasing the related thresholds, and observe how the ranking orders change within each method and among the methods.
Step 7. Summarize the ranking orders and propose the most similar ranking order among the three fuzzy outranking methods.
4. An Illustrative Example
The values of the product-planning matrix for a hypothetical writing instrument are given in Table 1. Based on the results of a market survey, the important customer needs (CNs) are listed as easy to hold, does not smear, point lasts, and does not roll. The engineering design requirements are taken as length of pencil (LP), time between sharpenings (TBS), lead dust generated (LDG), hexagonality (H), and minimal erasure residue (MER).
Table 1. The values of the product-planning matrix for a hypothetical writing instrument.

Attribute          LP     TBS    LDG    H      MER
Easy to hold       0.32   0.00   0.00   0.68   0.00
Does not smear     0.00   0.00   0.50   0.00   0.50
Point lasts        0.00   0.41   0.27   0.00   0.32
Does not roll      0.00   0.00   0.00   1.00   0.00
The weights of the attributes are taken as 0.20, 0.25, 0.30, 0.10, and 0.15, respectively. The thresholds for Roy's and Siskos et al.'s approaches are the indifference threshold t_i = 0.5, the preference threshold t_p = 2, and the veto threshold t_v = 5. In Table 2, the preference functions and parameter values for Brans et al.'s approach are given.
Table 2. The preference functions and parameter values for Brans et al.'s approach.

In Table 3, the rank orders are given with respect to various alpha values to make a sensitivity analysis. The ranking order obtained using Siskos et al.'s approach is {1,2}{4}{3}. Using Brans et al.'s method, three different ranking orders are obtained with respect to PROMETHEE I, PROMETHEE II, and PROMETHEE III: {2,3}{1,4} in PROMETHEE I, {3}{2}{1,4} in PROMETHEE II, and {2,3}{1,4} in PROMETHEE III. In Table 4, the results of the sensitivity analysis obtained by increasing the veto threshold 2 by 2, the preference threshold 1 by 1, and the indifference threshold 0.1 by 0.1 are given. The results in Table 4 are composed of the ranking orders whenever the threshold values change them meaningfully. In Table 5, the results of the sensitivity analysis for Brans et al.'s method are given.
Table 3. The rank orders with respect to the various alpha values.
Table 4. Sensitivity analysis of the ranking orders with respect to the veto, preference, and indifference thresholds.

Table 5. Sensitivity analysis for Brans et al.'s (1984) method.

Thresholds: v1=2, v2=2, v3=2, (u4=2, v4=3), (u5=2, v5=5), (u6=3, v6=5)
Ranking orders: PI = {2}{1}{3,6}{5,4}; PII = {2}{1}{3,6}{5}{4}; PIII = {2}{1}{3,6}{5,4}

Thresholds: v1=2, v2=2, v3=2, (u4=2, v4=5), (u5=2, v5=3), (u6=3, v6=5)
Ranking orders: PI = {1}{2}{6}{3}{4}{5}; PII = {1}{2}{6}{3}{4}{5}; PIII = {1,2}{3,6}{5,4}
5. Conclusion
In this paper, we aimed at comparing the results of three different fuzzy outranking methods for evaluating DRs in an illustrative example. The first of these methods is Roy's approach, in which the degrees of concordance and discordance are used to construct fuzzy outranking relations. In the second method, Siskos et al.'s method, after the outranking relation is constructed, a fuzzy dominance relation and subsequently a fuzzy nondominance relation are built. The third method, Brans et al.'s method, is related to a family of outranking methods called Preference Ranking Organization METHods for Enrichment Evaluations (PROMETHEE). Finally, a sensitivity analysis is made by using FOuR, a software tool based on a Visual Basic application for Excel.
References
1. Brans, J.P., Mareschal, B., Vincke, P., (1984), PROMETHEE: A new family of outranking methods in multicriteria analysis, In: J.P. Brans (ed.), Operational Research '84, North-Holland, Amsterdam (Proceedings of the 10th IFORS International Conference on Operational Research, Washington, D.C.), 477-490.
2. Büyüközkan, G., Ertay, T., Kahraman, C., Ruan, D., (2004), Determining the importance weights for the design requirements in the house of quality using the fuzzy analytic network approach, International Journal of Intelligent Systems, forthcoming.
3. Chen, S-J., Hwang, C-L., (1992), Fuzzy Multiple Attribute Decision Making: Methods and Applications, Springer-Verlag, Berlin.
4. Martel, J.M., D'Avignon, G.R., and Couillard, J., (1986), A fuzzy outranking relation in multi-criteria decision-making, European Journal of Operational Research, Vol. 25, No. 2, May, 258-271.
5. Roy, B., (1977), Partial preference analysis and decision-aid: The fuzzy outranking relation concept, In: Conflicting Objectives in Decisions, D.E. Bell, R.L. Keeney, and H. Raiffa (eds.), Wiley, New York, 40-75.
6. Siskos, J.L., Lochard, J., Lombard, J., (1984), A multi-criteria decision making methodology under fuzziness: Application to the evaluation of radiological protection in nuclear power plants, In: TIMS/Studies in the Management Sciences, Vol. 20, H.J. Zimmermann (ed.), Elsevier Science Publishers, North-Holland, 261-283.
7. Takeda, E., (1982), Interactive identification of fuzzy outranking relations in a multi-criteria decision problem, In: Fuzzy Information and Decision Processes, M.M. Gupta and E. Sanchez (eds.), North-Holland, 301-307.
AN ANALYTIC STRATEGIC PLANNING FRAMEWORK FOR E-BUSINESS PROJECTS
GÜLÇİN BÜYÜKÖZKAN*
Galatasaray University, Department of Industrial Engineering, Çırağan Cad. No: 36 Ortaköy, 34357 Istanbul, Turkey
This paper proposes an analytic strategic planning framework that can assist managers in putting their e-business projects into practice in an effective and efficient way. The suggested framework combines two widely used strategic management tools, the SWOT (Strengths, Weaknesses, Opportunities and Threats) matrix and the balanced scorecard, identifying the four critical success perspectives for strategy development in e-business applications. A decision analysis method, the fuzzy analytic hierarchy process, is applied to the framework to improve the quantitative information basis of the strategic planning process and to provide more effective and comprehensive decision support under uncertain conditions. The paper also includes a case study in Turkey in which the proposed framework is used for e-health service strategy development.
1. Introduction
The rapid growth of the Internet is driven by the use of this infrastructure to implement new business models, called e-business (electronic business), and to reinforce competitiveness in the market. Despite the great interest in and demand for e-business, the number of successful applications is quite small. For this reason, the significance of a well-organized, strategic plan for applying e-business has been emphasized in both empirical and prescriptive research studies recently [1]. Taking all this into consideration, this paper focuses on an analytic strategic planning framework that can assist managers in putting their e-business projects into practice in an effective and efficient way. The proposed framework integrates two widely used strategic management tools, the SWOT (Strengths, Weaknesses, Opportunities and Threats) matrix and the balanced scorecard (BSC). The first step of the proposed framework is to have the SWOT matrix identify the critical factors of the situation and then to build the BSC by identifying the different critical perspectives for success and excellent performance in e-business strategy development. By doing so, a more structured approach to setting up the foundation of the BSC is applied, instead of simply identifying the "key performance indicators" via gut feeling or by brainstorming. The second step of the framework is to apply the fuzzy analytic hierarchy process (AHP), a decision analysis method, to the identified factors, to improve the quantitative information basis of the strategic planning process and to provide more effective and comprehensive decision
* E-mail: [email protected]
support under uncertain conditions. A case study is also introduced, in which the proposed framework is used for the strategy formulation of e-health services development in a Turkish hospital.
2. The fuzzy AHP methodology for the SWOT-BSC framework
The SWOT analysis is the process of analyzing an organization and its environment to attain a systematic approach and support for a strategic decision situation [2]. The final goal of the strategic planning process, of which SWOT is an early stage, is to develop and adopt a strategy resulting in a good fit between the internal and external factors and the goals of the owners. Kaplan and Norton [3] defined the BSC as "translating an organization's mission and strategy into a comprehensive set of performance measures" that "provides a multidimensional framework for strategic measurement and management". Despite the fact that Kaplan and Norton outlined the four perspectives (financial, customer, internal processes, learning and growth) as the key elements of the organizational strategies that must be measured, the BSC remains a means of effectively measuring the results produced by a strategy rather than a means of deciding strategy [4]. This is the main reason why the SWOT analysis matrix serves as a great "stepping stone" for building the key performance indicators of the BSC [5]. Another important issue is the obligation to think in a broader way when we want to apply the BSC perspectives to the e-business world [6]. To illustrate, the financial perspective can be perceived as the value of the business, because e-business is not an explicit generator of income but a service or a channel which uses technology for the operation of the business. Similarly, we can broaden the consumer perspective to end users [6]. Applications for gaining extra value from SWOT analysis in further strategic planning processes have been presented [2]. However, none of these approaches presented a systematic technique for determining the factors' importance. Recently, some studies have applied AHP [7] and its eigenvalue calculation technique with SWOT analysis [8, 9]. The use of AHP with SWOT yields analytically determined priorities for the factors included in the SWOT analysis and makes them commensurable. However, both SWOT analysis and the BSC involve various inputs in the form of linguistic data, e.g., human perception, judgment, and evaluation of the importance of criteria, which are usually subjective and uncertain. Hence, crisp evaluation seems inadequate to explicitly capture the importance assessment of the strategic planning factors. For this reason, we use linguistic variables in AHP. Although SWOT-AHP is an established method in the strategic planning literature, to our knowledge this is the first study to integrate the SWOT and BSC methods and to apply the fuzzy AHP methodology to them. Figure 1 summarizes our proposed model.
Figure 1. Overview of the proposed modeling approach (strategic SWOT factors; building the balanced scorecard on the SWOT analysis; identifying critical strategies).
Many fuzzy AHP methods have been proposed by various authors (see [10] for a detailed discussion of this subject). In this paper, we use Chang's extent analysis method on fuzzy AHP [11] because of its computational simplicity and effectiveness, and we also integrate the improvement proposed by Zhu et al. [12] into the methodology.
3. Application of the proposed framework for e-health service development
E-health is a new and risky issue. For this reason, the success and positive results of e-health applications depend on the development of efficient strategies and on their application. From this perspective, the suggested approach is used in the ABC private hospital in Turkey to formulate an e-health service development strategy. There are four phases of implementation.
Phase 1: Conduct the SWOT analysis. Through group discussion and brainstorming, a SWOT analysis is carried out. The relevant factors of the external and internal environment are identified.
Phase 2: Apply fuzzy AHP to the SWOT factors and groups. Pairwise comparisons with linguistic terms between the SWOT factors are carried out separately within each SWOT group. When making the comparisons, the issue at stake is which of the two factors compared is more important, and how much more important. Throughout this study, it is assumed that the decision
makers use the linguistic weighting set given in Table 1 to evaluate the importance of the SWOT groups, factors and developed strategies. The information technology managing director of the hospital and an expert on the methodology involved made the comparisons together, with the expert acting as a consultant explaining the situation to the managing director, who made the comparisons. By using the fuzzy values of Table 1 and applying the fuzzy AHP methodology given in [11, 12], we obtain the weights of the SWOT groups as given in Tables 2 and 3. Similarly, the mutual importances of the SWOT groups are also ascertained. The subjective weight values of the SWOT factor groups help determine the strategic factors of the related groups. Table 4 shows each SWOT group's identified strategic factors, ranked with their priority degrees.
Table 1. The linguistic weighting set.
(Columns: linguistic scale for importance; corresponding triangular fuzzy scale; reciprocal triangular fuzzy scale.)

Table 2. The linguistic pairwise comparison of the SWOT matrix groups.
              Strength        Weakness        Opportunity     Threat          Weights
Strength      (1, 1, 1)       (1/2, 1, 3/2)   (1/2, 2/3, 1)   (1/2, 2/3, 1)   0.20
Weakness      (2/3, 1, 2)     (1, 1, 1)       (1/2, 2/3, 1)   (1/2, 2/3, 1)   0.20
Opportunity   (1, 3/2, 2)     (1, 3/2, 2)     (1, 1, 1)       (1/2, 1, 3/2)   0.30
Threat        (1, 3/2, 2)     (1, 3/2, 2)     (2/3, 1, 2)     (1, 1, 1)       0.30
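To show how such group weights can be derived, the sketch below applies Chang's extent analysis [11] to the comparison matrix of Table 2. It implements only the basic method (omitting the refinement of Zhu et al. [12] that the study also integrates), so its normalized possibility degrees, approximately (0.20, 0.21, 0.30, 0.30), match the published weights only up to rounding.

```python
import numpy as np

# Sketch of Chang's extent analysis on the triangular fuzzy comparison
# matrix of Table 2. Basic method of [11] only; weights are approximate.

M = np.array([  # (l, m, u) triangular fuzzy judgments, row vs. column
    [[1, 1, 1], [1/2, 1, 3/2], [1/2, 2/3, 1], [1/2, 2/3, 1]],
    [[2/3, 1, 2], [1, 1, 1], [1/2, 2/3, 1], [1/2, 2/3, 1]],
    [[1, 3/2, 2], [1, 3/2, 2], [1, 1, 1], [1/2, 1, 3/2]],
    [[1, 3/2, 2], [1, 3/2, 2], [2/3, 1, 2], [1, 1, 1]],
])

row = M.sum(axis=1)      # fuzzy row sums (l, m, u)
tot = row.sum(axis=0)    # overall fuzzy sum
S = row / tot[::-1]      # synthetic extents: (l/U_tot, m/M_tot, u/L_tot)

def v(a, b):
    """Degree of possibility V(a >= b) for triangular fuzzy numbers."""
    la, ma, ua = a
    lb, mb, ub = b
    if ma >= mb:
        return 1.0
    if lb >= ua:
        return 0.0
    return (lb - ua) / ((ma - ua) - (mb - lb))

d = np.array([min(v(S[i], S[j]) for j in range(4) if j != i) for i in range(4)])
print(np.round(d / d.sum(), 2))  # -> approx [0.20, 0.21, 0.30, 0.30]
```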
Phase 3: Conjoin the SWOT matrix with the BSC. Specific internal and external factors of the SWOT analysis are matched to create a strategic matrix which makes sense. This matrix and the four key critical BSC perspectives are adopted. For each of the four combinations of the strategic matrix, the possible strategies are generated and tied to the related BSC perspective. Table 4 shows the obtained results.
Table 4. The SWOT matrix of e-health service development with attributes of the BSC.
Notes: V = business value perspective, U = end-user perspective, I = internal process perspective, L = learning and growth perspective.

Strengths: S1. Openness to new ideas and the support of the managers. S2. Enough capital. S3. Hard-working, qualified people.

Weaknesses: W1. The problems of being known and demanded. W2. The lack of knowledge about computers. W3. The weakness of the organization's information system for an effective e-service.

Opportunities: O1. The lack of rivals and the low quality of the existing services. O2. The ease of procuring the technological equipment and software. O3. Easy delivery of services at a reduced cost, increased effectiveness, and/or improved quality. O4. The possibility of customization of services.

Threats: T1. The diminutiveness of the e-health market. T2. Lack of expertise in the e-business world. T3. Lack of the resources needed to fully exploit the e-health service. T4. The threats concerning security.

S-O strategies: S1.S2.O1.O2. Introduce the best possible structure by studying the existing e-health services abroad and be a leader in the sector, gaining speed and first-mover advantages (V-1). S2.O3. Re-structure the business processes (I-1). S3.O1. Expand the service and surpass the rivals by differentiation (V-2). S2.S3.O4. Enhanced customer relationships (U-1).

W-O strategies: W1.O1. A good promotion and marketing campaign for best-quality services (V-3). W1.O3.O4. Reach especially the target customers and become better known by providing them with special services (U-2). W2.O2. Train the staff in the advanced technology available in the market (L-1). W3.O2. Rebuild the present system by means of the advanced technology available in the market (I-2).

S-T strategies: S1.S2.S3.T2. Work cooperatively with a company which has a good command of e-business, taking advantage of that company's support and consulting services (I-3). S1.S2.S3.T3. Train the staff in providing the consumer with e-health services (L-2). S2.T1. Decide on the niche consumer market, focus on it, and try to expand the market step by step (V-4). S2.T4. Structure the best security system (U-3). S2.S3.T4. Make the staff knowledgeable about protecting the security of the users (U-4).

W-T strategies: W1.T1. Increase the number of consumers by working on advertising and marketing to expand the market (V-5). W2.T2.T3. Train the staff in e-health services (L-2). W2.T2. Inform the staff about the e-business world, accustom them to it, and motivate them by improving the intranet and the e-mail system in the hospital (L-3). W3.T3. Set up the e-service system process equipped with preventive security measures (I-4).
Phase 4: Apply fuzzy AHP to the strategies developed through the BSC perspectives. After the strategies related to each BSC perspective are included in the integration analysis (see Table 4), the strategies' priorities within each BSC perspective are estimated by linguistic pairwise comparisons following the steps presented in [11]. We obtained the following results: V1=0.26; V2=0.11; V3=0.19; V4=0.25; V5=0.19; I1=0.19; I2=0.22; I3=0.35; I4=0.24; U1=0.29; U2=0.33; U3=0.19; U4=0.19; L1=0.08; L2=0.46; L3=0.46. In this way, the related strategies providing strategic success in each perspective of the business enterprise are determined in a balanced way. The finally obtained critical strategies, those with high priority degrees, are V1 and V4; I3; U2 and U1; L2 and L3.
4. Concluding remarks
This study had two main objectives: to integrate SWOT analysis and the BSC, two significant tools of strategic management, and to add an analytical dimension to these tools in order to improve the quality of decisions by making more effective strategic plans. With this aim, a framework with four phases was suggested and applied to e-health service development. According to the experience of this study, the results of the combined use of SWOT analysis and the BSC are promising. The suggested framework is suitable for many kinds of strategic planning situations in e-business-related projects.
Acknowledgements
The author acknowledges the financial support of Galatasaray University under its Scientific Research Project program.
References
1. P. Auger, A. Barnir, and J.M. Gallaugher, Strategic orientation, competition and Internet-based electronic commerce, Information Technology and Management, 4, 139-164, (2003).
2. T.L. Wheelen and J.D. Hunger, Strategic Management and Business Policy, 5th edition, Addison Wesley, Reading, MA, (1995).
3. R.S. Kaplan and D.P. Norton, The Balanced Scorecard: Translating Strategy into Action, Harvard Business School Press, (1996).
4. R. McAdam and E. O'Neill, Taking a critical perspective to the European business excellence model using a balanced scorecard approach: a case study in the service sector, Managing Service Quality, 9, 3, (1999).
5. S.F. Lee and K.K. Lo, e-Enterprise and management course development using strategy formulation framework for vocational education, Journal of Materials Processing Technology, 139, 604-612, (2003).
6. H. Hasan and H.R. Tibbits, Strategic management of electronic commerce: an adaptation of the balanced scorecard, Internet Research, 10, 5, 439-450, (2000).
7. T.L. Saaty, The Analytic Hierarchy Process, McGraw-Hill, New York, (1980).
8. M. Kurttila, M. Pesonen, J. Kangas, and M. Kajanus, Utilizing the analytic hierarchy process AHP in SWOT analysis: a hybrid method and its application to a forest-certification case, Forest Policy and Economics, 1, 41-52, (2000).
9. R.A. Stewart, S. Mohamed, and R. Daet, Strategic implementation of IT/IS projects in construction: a case study, Automation in Construction, 11, 681-694, (2002).
10. G. Büyüközkan, C. Kahraman, and D. Ruan, A fuzzy multi-criteria decision approach for software development strategy selection, International Journal of General Systems, 33, 2-3, 259-280, (2004).
11. D-Y. Chang, Applications of the extent analysis method on fuzzy AHP, European Journal of Operational Research, 95, 3, 649-655, (1996).
12. K-J. Zhu, Y. Jing, and D-Y. Chang, A discussion on extent analysis method and applications of fuzzy AHP, European Journal of Operational Research, 116, 2, 450-456, (1999).
URBAN LAND DEVELOPMENT WITH POSSIBILISTIC INFORMATION
PEIJUN GUO
Faculty of Economics, Kagawa University, Takamatsu, Kagawa 760-8523, Japan
[email protected]
In this paper, the decision problem of whether a landowner should begin to construct a building at present for sale in the future is considered. The uncertainty of the building price in the future is characterized by a possibility distribution, to reflect the potential of how high the building price may be. Two focus points, called the active focus point and the passive focus point, are introduced as sorts of equilibriums balancing utility and possibility, to show which values should be considered when making decisions under possibilistic information. Based on these two focus points, a general model is developed to analyze the investment behavior of the landowner. The optimal building size based on the active focus point is larger than the one based on the passive focus point. Increasing the uncertainty of the price makes an active investor increase the investment scale and a careful investor decrease it. The proposed method seems more reasonable for such one-shot decision problems than maximizing expected utility in a probabilistic framework.
1. Introduction
There are many underutilized and vacant urban lots throughout the world, held by private investors who presumably wish to maximize their wealth. Generally, there are two main kinds of methods for urban land valuation. One is the net present value (NPV) based method: appraisers determine the most probable use of the land, appraise the property according to that use, and then discount this future value, such as the rental rate for a condominium or the capital gain in the real estate market, to the present [4]. By this method, the land will be developed when the NPV of the land is positive. When the uncertainty of the future price is considered and characterized by a probability distribution, the expected net present value is used for making the decision. Because land development is a kind of irreversible action, expected utility theory seems unsuitable for such a one-shot decision problem. The other method is real options. Much research has been done on land valuation by real options [1, 2, 5], where the vacant land is considered as a derivative security, an option, whose value is completely determined by an exogenously priced asset. In this paper, a new decision model for urban land development is considered, in which the landowner must decide whether to begin constructing a building at present for sale at the end of the construction period, date 1. Any delay of the sale is not permitted because of the huge interest cost. The
uncertainty of the price of the building at date 1 is characterized by a possibility distribution, to reflect the potential of how high the building price may be. Two focus points, called the active focus point and the passive focus point, are introduced as sorts of equilibriums balancing utility and possibility, to show which price should be considered when making decisions with possibilistic information. Based on these two focus points, models are developed to obtain the optimal building sizes. Whether to construct a building at present is judged by possibility and necessity measures. The proposed method seems more reasonable for such one-shot decision problems than maximizing expected utility in a probabilistic framework.

2. The Optimal Building Size

A building, in this paper, is characterized by its size, or the number of units, $q$. The cost of constructing a building on a given piece of land, $C$, is an increasing and convex differentiable function of the number of units $q$, that is,

$dC/dq > 0$,   (1)
$d^2C/dq^2 > 0$.   (2)

The rationale for the second assumption is that as the number of floors in a building increases, the foundation of the building must be stronger. The profit $r$ that a landowner can obtain by constructing a $q$-size building is as follows:

$r = pq - C(q)$,   (3)

where $p$ is the market price per unit of building size at date 1. The building size that maximizes the wealth of the landowner satisfies the following maximization problem:

$R(p) = \max_q r(p, q) = \max_q \,[pq - C(q)]$.   (4)

Differentiating (4) with respect to $q$, it follows that the solution to this maximization problem is to choose a building size which satisfies the following condition, considering assumption (2):

$dC(q)/dq = p$.   (5)

Denoting the solution of (5) as $q_v$, the maximal profit of the landowner is as follows:

$R(p) = \max_q r(q) = p q_v - C(q_v)$.   (6)

Theorem 1. $R(p)$ is a strictly increasing and convex function of $p$.
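As a quick numerical illustration of (1)-(6), the sketch below assumes a quadratic cost function C(q) = c0·q², one convenient convex choice that is not taken from the paper; the first-order condition (5) then gives q_v = p/(2·c0), and R(p) = p²/(4·c0) is indeed increasing and convex in p.

```python
import numpy as np

# Illustration of Equations (1)-(6) under an assumed quadratic cost
# C(q) = c0 * q**2, which satisfies dC/dq > 0 (for q > 0) and d2C/dq2 > 0.

c0 = 0.5  # hypothetical cost coefficient

def q_v(p):
    return p / (2 * c0)          # optimal building size from Eq. (5)

def R(p):
    q = q_v(p)
    return p * q - c0 * q ** 2   # maximal profit from Eq. (6)

prices = np.array([1.0, 2.0, 3.0, 4.0])
profits = R(prices)
print(profits)                   # strictly increasing in p
print(np.diff(profits, 2) > 0)   # convex: second differences are positive
```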
3. Decision Analysis Based on Active and Passive Focus Points

Decision-making is generally a procedure to choose one action from all the alternatives so as to maximize some utility or to reach some equilibrium among multiple players. In fact, one chosen action can lead to many possible outcomes corresponding to the different states, which can only be known after the action has really been taken. How to evaluate an action in advance is a critical problem for decision-making. Denote the possibility distribution as $\pi(x)$, $x \in S$, where $S$ is the set of states. The function $u(x, a)$ characterizes the utility obtained by taking the decision $a \in A$ when the state is $x$, where $A$ is a set of decisions. Different from probability theory based methods, which consider expected utility, the possibility theory based methods consider each possible case individually. For example, an action can be evaluated simply by the state with possibility grade 1. It is usual that an action yields high utility when a favorable state appears, but when a disfavored state appears an unsatisfactory result has to be accepted. However, one and only one state will appear. Which state should be considered? The following two states, $x^*$ and $x_*$, are valuable for evaluating the action:

$x^*(a) = \arg \sup_{x \in S} \min(\pi(x), u(x, a))$,   (7)
$x_*(a) = \arg \inf_{x \in S} \max(1 - \pi(x), u(x, a))$.   (8)

It is obvious that no other $(\pi(x), u(x, a))$ can strongly dominate $(\pi(x^*), u(x^*, a))$; that is, $\pi(x) > \pi(x^*)$ and $u(x, a) > u(x^*, a)$ cannot hold simultaneously. Likewise, there is no other $(\pi(x), u(x, a))$ simultaneously satisfying $1 - \pi(x) < 1 - \pi(x_*)$ and $u(x, a) < u(x_*, a)$, which is equivalent to $\pi(x) > \pi(x_*)$ and $u(x, a) < u(x_*, a)$. $x^*$ and $x_*$ can be regarded as two equilibriums: it is impossible to increase utility and possibility simultaneously from $(\pi(x^*), u(x^*, a))$, and from $(\pi(x_*), u(x_*, a))$ increasing possibility will not decrease utility and decreasing utility will not increase possibility. However, there is no guarantee that from $(\pi(x_*), u(x_*, a))$ decreasing possibility will decrease utility or that increasing utility will increase possibility. It can be seen that the focus point $x^*$ gives a more active evaluation, and $x_*$ a more passive evaluation, of the action $a$. $x^*$ and $x_*$ are called the active and passive focus points of states, respectively, and $u(x^*, a)$ and $u(x_*, a)$ are called the active and passive values of the decision $a$, respectively. The formulas $\sup_{x \in S} \min(\pi(x), u(x, a))$ and $\inf_{x \in S} \max(1 - \pi(x), u(x, a))$ were initially proposed by Yager [7] and Whalen [6], respectively, and were axiomatized in the style of Savage by Dubois, Prade and Sabbadin [3], based on a commensurability assumption between plausibility and preference which, in fact, is not easily acceptable. The new explanation by focus points does not need such an assumption. As a result, the optimal decision is the one which maximizes $u(x^*(a), a)$ or $u(x_*(a), a)$, that is,

$a^* = \arg\max_{a \in A} u(x^*(a), a)$  or  $a_* = \arg\max_{a \in A} u(x_*(a), a)$.   (9)
Possibilistic Model for Building Constructing
The uncertainty of price p at date 1 is characterized by a possibility distribution to reflect the plausible information on price in the future. The possibility distribution of p , denoted as n, is given by the following continuous function
n* : [ p , p , 1 + [OJI 9
(10)
7
where n , ( p , ) = O , n D ( p , ) = O and 3 p , E [ p , , ~ , I , n p ( ~ , ) = . lED increases within [ p , , p , ] and decreases within [ p , , p , ] . p , and p, are the lower and upper bounds of prices, respectively, p , is the most possible price. It is straightforward that the landowner will maximize his wealth by choose the optimal building size according to (5) for each building price p in the future. With considering Theorem 1 and the region [ p , , p , ] , the profit region of landowner is [ R ( p , ) , R ( p , ) ] . The utility function of landowner is U ( R ( p ) )where U(.)is defined as follows. Definition 1. The function u(.) is defined by the following strictly increasing function, u :" p , ) ,
W P ,)I +to711
(11)
7
where u ( R ( p , ) )= 0 , u ( R ( p , ) )= 1 . It should be noted that R ( p , ) and W p , ) maybe are minus. Considering (7), the active focus point of price, denoted as p * is given as follows,
Considering (8), the passive focus point of price, denoted as p . is given as follows, P.
=
=g
inf PEI
P I . P"
1
m a ( 1 - z p ( p ) ,u ( R ( q V ( p ) , PI))
.
(13)
505 It can be seen that decision a in (7) and (8) are the optimal building sizes for each building price, that is, q v ( p ) in (12) and (13). q v ( p * ) and q v ( p . ) , denoted as q' and 9, , are called active optimal building size and passive optimal building size, respectively. Theorem 2. The active optimal building size q' is the solution of the following equations 4 P q - C(q))= 0)
(14)
7
where p > p , . Theorem 3. The passive optimal building size following equations 4 p q -C(q)) = 1- N P >
9
9.
is the solution of the
(16)
where p < p , . Theorem 4. The active optimal building size is larger than the passive optimal building size, namely, 4' > q* . Definition 2. Suppose that there are two possibility distributions, x Aand nB. If for any arbitrary x nA(x) 1 nB(x) holds, then nEis said more informed than nA, which is denoted as nB> nA. Theorem 5. Denote the active optimal building size based on possibility distributions A and B as q i and qi , respectively, the passive optimal building size based on possibility distributions A and B as qrA and q . B ,respectively. If * * nE> nA, then qA 2 qe and qtAIqeEhold. Theorem 6 means that increasing the uncertainty of price can make investor increase investment scale C ( q ) if he considers the active focus point and decrease investment scale if he considers the passive focus point. Given the possibility distribution of the building price p in the future as np( p ), according to the extension principle, the possibility distribution of profit with the building size q , is obtained as follows,
For a given possibility distribution, two optimal building sizes q' and q . , can be obtained from (14)-(17). Two possibility distributions of profits with q* and q* ,denoted as n * ( r ) and n.(r),respectively, can be obtained.
506
Theorem 6. The possibility measure of R 2 0 based on possibility distributions z * ( r ) and n~(t-0are denoted as P o s * ( R2 0) and Pos.(R 2 0) , respectively. The necessity measure of R 2 0 based on possibility distributions Z*( r ) and ~ ( r are ) denoted as Nec*(R20) and Nec,(R 20) , respectively. The followings hold, (1) Pos*(R 2 O ) = sup m i n ( n p ( p ) , u ( R ( q v ( p ) , p ) ) ) P E [ P , .P"
1
(2) N e c * ( R 2 0 ) = 0 (3) Pos,(R 2 0 ) = 1 (4) Nec,(R 2 0) = inf PEI P , P" 3
I
max( l - n t p ( p ) , u ( R ( g V ( p )p, ) ) )
Decision rule for urban land investment development If Nec, ( R L 0) 2 a or POS * ( R 2 0 ) 2 p ,then the build should be constructed at present for selling at date 1, or else, not constructed, where 0 I a I 1 and 0 5 f l < 1 are predetermined thresholds for making decision. References 1. M. Amram and K.Kulatilaka, Real Option: Managing Strategic Investment in an Uncertain World, Harvard Business School Press, Boston, 1999. 2. A. K. Dixit and P.S. Pindyck, Investment under Uncertainty, Princeton University Press, New Jersey, 1994. 3. D. Dubois, H. Prade and R. Sabbadin, Decision-theoretic foundations of possibilty theory, European Journal of Operational Research 128 (2001) 459-478. 4. R. Richard, Valuation for Real Estate Decision, Democrat Press, Santa Cruz, 1972. 5. S. Titman, Urban Land Prices under Uncertainty, The American Economic Review 75( 1985) 505-5 14. 6. T. Whalen, Decision making under uncertainty with various assumptions about available information, IEEE Transaction on Systems, Man and Cybernetics 14 (1984) 888-900. 7. R. R. Yager, Possibilistic decision making, IEEE Transaction on Systems, Man and Cybernetics 9 (1979) 388-392.
AN APPLICATION OF FUZZY AHP-DEA METHODOLOGY FOR THE FACILITY LAYOUT DESIGN IN THE PRESENCE OF BOTH QUANTITATIVE AND QUALITATIVE DATA
UMUT RIFAT TUZKAYA
Department of Industrial Engineering, Yildiz Technical University, Barbaros Street, Yildiz, Istanbul, 34349, Turkey
TIJEN ERTAY*
Department of Managerial Engineering, Istanbul Technical University, Maçka, 34367, Istanbul, Turkey
An effective facility layout evaluation procedure necessitates the consideration of qualitative criteria, e.g. flexibility in volume and variety and quality related to the product and production, as well as quantitative criteria such as material handling cost, adjacency score, shape ratio, material handling vehicle utilization, etc., in the decision process. This paper presents a decision-making methodology based on data envelopment analysis (DEA), which uses both quantitative and qualitative criteria, for evaluating Facility Layout Designs (FLDs). A computer-aided layout-planning tool, VisFactory, is adopted to facilitate the layout alternative design process as well as to collect the quantitative data. Fuzzy AHP is then applied to collect the qualitative data related to quality and flexibility. The DEA methodology is used to solve the layout design problem by simultaneously considering both the quantitative and qualitative data. The proposed integrated procedure is applied to a real data set consisting of twelve FLDs for a plastic profile production system.
1. Introduction
Facility layout is one of the main fields in industrial engineering, where much research effort has been spent and numerous approaches have been developed. A layout problem can surface in the design and allocation of space in a new building or in the reassignment of space in an existing building [1]. Various constructive and improvement algorithms have been built for new layouts and rearrangements, respectively. Especially in the last decade, commercial products based on some of the original algorithms have become available. These computer-aided approaches provide fast calculations and constitute integrated structures. In this study, too, a software package that uses many algorithms at the same time is applied to generate alternative layouts. In addition, some qualitative
* Corresponding Author: [email protected]
criteria are determined by fuzzy AHP to choose the most desirable one among these applicable alternatives. At the last stage, all of these data are embedded into DEA to conclude the evaluation procedure.
2. Fuzzy AHP
Fuzzy set theory was introduced by Zadeh [2] to deal with vague, imprecise and uncertain problems. This theory has been used as a modeling tool for complex systems that are hard to define precisely but can be controlled and operated by humans [3]. More detailed discussions of fuzzy sets, fuzzy relations, and fuzzy operations can be found in Ross [4], Chen [5], and Zadeh [6]. By embedding the AHP method into fuzzy sets, another application area of fuzzy logic is revealed. Various types of fuzzy AHP methods can be seen in the literature, such as those of Laarhoven and Pedrycz (1983), Buckley (1985), Chang (1996), and Leung and Cao (2000). Decision makers usually feel more confident giving interval judgments than fixed-value judgments, because a decision maker is usually unable to be explicit about his/her preferences due to the fuzzy nature of the comparison process [7]. In this study, Chang's [8] extent analysis method is preferred, since the steps of this approach are relatively easier than those of the other fuzzy AHP approaches and similar to those of the crisp AHP.
3. Data Envelopment Analysis
DEA is a non-parametric technique for measuring and evaluating the relative efficiencies of a set of entities with common crisp inputs and outputs. DEA, developed by Charnes et al. [9], converts multiple inputs and outputs into a scalar measure of efficiency. There are some multiple criteria approaches to DEA problems that are formulated based on different DEA models. Here, two-stage DEA is applied to get realistic results. Firstly, an ordinary linear program, transformed from a fractional program, is used. A deviation variable d_o is defined for DMU_o and minimized; d_o represents the inefficiency rate of DMU_o. Usually, however, DEA's discrimination power is not sufficient, and the DEA results can show that most of the DMUs are efficient, with d_o values of zero. For this reason, in the second stage, the minimax efficiency method is preferred. Minimax efficiency is a good method to prevent weak discriminating power. It is a practical method to alleviate the problem of multiple relatively efficient DMUs, and it is more restrictive than the efficiency defined in classical DEA. The mentioned DEA models are not given here, but detailed and explicit information can be found in Li and Reeves [10].
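For concreteness, the two stages can be written as linear programs of the following form. This is a sketch following the multiple-criteria DEA models of Li and Reeves [10]; x_ij and y_rj denote the i-th input and r-th output of DMU_j, and the exact formulation should be checked against [10]:

    min  d_o
    s.t. Σ_i v_i x_io = 1
         Σ_r u_r y_rj − Σ_i v_i x_ij + d_j = 0,   j = 1, ..., n
         u_r, v_i, d_j ≥ 0

DMU_o is efficient in the classical sense if d_o = 0 at the optimum. In the second stage, the minimax objective replaces min d_o with min M under the additional constraints M ≥ d_j for all j, which tightens the efficiency requirement and improves the discriminating power.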
4. Methodology
The layout problem is considered in an integrated framework of fuzzy AHP and DEA. Qualitative and quantitative performance measures are used together to find the most desirable layout. This integrated framework is illustrated in Figure 1. The methodology consists of three stages: generation of alternative layouts, gathering of the data concerning these alternatives for DEA, and use of DEA to choose the most desirable layout alternative.
Figure 1. The framework for choosing the most desirable layout design: layout alternative generation; the criteria flow distance, handling cost, material handling vehicle utilization, and shape ratio; DEA methodology to measure layout design performances; and selection of the final layout design solution.
4.1. Alternative Layout Generation
The VisFactory software package is used to generate layout alternatives. This software has three main parts: FactoryPLAN, FactoryFLOW and FactoryOPT. These parts can be used for various aims, such as measuring the efficiency of current layouts, determining the density between activities, and calculating the total cost or time of the material handling concerned. In this study, the data about the main products produced, their average weights, the components used in the products, the material handling vehicles and the material flows between departments are given to FactoryFLOW. The data about the departments' space requirements and the relationships between departments are
given to FactoryPLAN. At the last step, FactoryOPT uses the data obtained from FactoryFLOW and FactoryPLAN to generate layout alternatives. In this alternative generation process, FactoryOPT uses different algorithms by considering parameters which can be changed by the user.
4.2. Gathering data for DEA
Two types of criteria constitute the DEA inputs and outputs. Handling costs and adjacency scores are quantitative criteria and inputs for DEA. Material handling vehicle utilization, flow distances, and shape ratios are also quantitative criteria, and are outputs for DEA. These criteria are obtained from the VisFactory program's outputs for all alternative layouts. The second type of criteria, flexibility and quality, are qualitative and are used as outputs in DEA too. These qualitative data must be converted to a form convenient for DEA. To achieve this and to deal with the vagueness and imprecision of the qualitative data set, the fuzzy AHP method is used.
4.3. DEA methodology for choosing the most desirable layout alternative
DEA is applied to choose the most desirable layout alternative by using the data gathered from VisFactory and fuzzy AHP. Through the DEA methodology, the performances of all alternative layouts are measured. A decision is made according to the DEA results, such as that the current layout is efficient or that it must be changed to an improved one.

5. Case Study
The proposed framework is applied to a plastic profile producer. The company produces various extrusion products. Some of the production characteristics cause problems relevant to productivity; therefore, layout improvements are required. By using VisFactory, eleven layout alternatives are generated, and the data about the alternatives and the current layout, shown in Table 1, are obtained as described in Section 4.2.
Table 1. Quantitative criteria values for the alternative layouts
[The entries of Table 1 are not legible in the source scan.]
As the final criteria, volume flexibility, variety flexibility, production quality, and product quality are combined by using Chang's fuzzy AHP methodology. The fuzzy evaluation matrix relevant to the goals is given in Table 2.

Table 2. Fuzzy evaluation matrix relevant to the goals (Volume Flexibility | Variety Flexibility | Production Quality | Product Quality)
[The matrix entries are not legible in the source scan.]

The vectors shown below are obtained from Table 2 by using the expressions presented in Chang's fuzzy AHP methodology:

S_VoF = (5.29, 6.33, 7.40) ⊗ (20.77, 24.08, 27.55) = (109.76, 152.53, 203.89)
S_VaF = (8.50, 10.00, 11.50) ⊗ (20.77, 24.08, 27.55) = (176.50, 240.83, 316.85)
S_PnQ = (5.19, 5.83, 6.57) ⊗ (20.77, 24.08, 27.55) = (107.68, 140.49, 180.93)
S_PQ = (1.79, 1.92, 2.09) ⊗ (20.77, 24.08, 27.55) = (37.52, 46.16, 57.47)

By using these vectors, the degrees of possibility are obtained as below:

V(S_VoF ≥ S_VaF) = 0.237, V(S_VoF ≥ S_PnQ) = 1.0, V(S_VoF ≥ S_PQ) = 1.0
V(S_VaF ≥ S_VoF) = 1.0, V(S_VaF ≥ S_PnQ) = 1.0, V(S_VaF ≥ S_PQ) = 1.0
V(S_PnQ ≥ S_VoF) = 0.855, V(S_PnQ ≥ S_VaF) = 0.042, V(S_PnQ ≥ S_PQ) = 1.0
V(S_PQ ≥ S_VoF) = 0.0, V(S_PQ ≥ S_VaF) = 0.0, V(S_PQ ≥ S_PnQ) = 0.0

Thus the weight vector of the goals is calculated as W_G = (0.19, 0.78, 0.03, 0.0)^T. Pair-wise comparison matrices express the importance of one criterion over another, or of one alternative over another. The degrees of importance are given as triangular fuzzy numbers: equal importance (1, 1, 1), weak importance (2/3, 1, 3/2), fairly strong (3/2, 2, 5/2), very strong (5/2, 3, 7/2), absolute (7/2, 4, 9/2). As a result of the evaluation of the alternative layouts with respect to the goals using these importance degrees, a weight vector is acquired for each goal. By combining the priority weights of the goals and the alternatives, the final alternative priority weights are obtained as 0.003, 0.077, 0.078, 0.035, 0.136, 0.120, 0.036, 0.000, 0.110, 0.136, 0.135, and 0.134 for alternatives one to eleven and the current layout, respectively.
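The comparison step above follows Chang's degree-of-possibility formula for triangular fuzzy numbers: V(M2 ≥ M1) = 1 if m2 ≥ m1, 0 if l1 ≥ u2, and (l1 − u2)/((m2 − u2) − (m1 − l1)) otherwise. A minimal Python sketch using the S vectors quoted above (the variable names are ours, not the paper's) reproduces the weight vector W_G:

    S = {
        "VoF": (109.76, 152.53, 203.89),   # volume flexibility
        "VaF": (176.50, 240.83, 316.85),   # variety flexibility
        "PnQ": (107.68, 140.49, 180.93),   # production quality
        "PQ":  (37.52, 46.16, 57.47),      # product quality
    }

    def V(M2, M1):
        """Degree of possibility V(M2 >= M1) for TFNs M = (l, m, u)."""
        l1, m1, u1 = M1
        l2, m2, u2 = M2
        if m2 >= m1:
            return 1.0
        if l1 >= u2:
            return 0.0
        return (l1 - u2) / ((m2 - u2) - (m1 - l1))

    # d(g_i) = min over k != i of V(S_i >= S_k); normalizing gives the weights
    d = {i: min(V(S[i], S[k]) for k in S if k != i) for i in S}
    total = sum(d.values())
    print({i: round(d[i] / total, 2) for i in S})
    # -> {'VoF': 0.19, 'VaF': 0.78, 'PnQ': 0.03, 'PQ': 0.0}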
At this point, all criteria values for all alternative layouts are ready for DEA. Firstly, classical DEA is applied, but the results are not distinguishable, because seven alternatives and the current layout seem efficient. Therefore, the minimax efficiency method is applied to increase the discriminating power. The result was remarkable: only the second and tenth alternative layouts are efficient. Moreover, the current layout's efficiency score is 0.85, so it needs improvement. The efficiency scores are:

Layout:                     1     2     3     4     5     6     7     8     9     10    11    Current
Classical method:           1.00  1.00  0.99  0.98  1.00  1.00  1.00  0.86  0.86  1.00  1.00  1.00
Minimax efficiency method:  0.95  1.00  0.98  0.90  0.84  0.94  0.82  0.82  0.82  1.00  0.96  0.85
6. Conclusion
This study addresses the evaluation of facility layout alternatives by developing an integrated framework based on a fuzzy AHP-DEA methodology together with a software tool. The proposed framework is applied to a real data set consisting of twelve alternatives. As a result of this application, two alternatives are determined to be relatively efficient.
References
1. R. S. Liggett, Automated facilities layout: past, present and future, Automation in Construction, 9, 197-215 (2000).
2. L. A. Zadeh, Fuzzy sets, Information and Control, 8, 338-353 (1965).
3. F. Dweiri, Fuzzy development of crisp activity relationship charts for facilities layout, Computers and Industrial Engineering, 36 (1999).
4. T. J. Ross, Fuzzy Logic with Engineering Applications (1995).
5. S. Chen, C. Hwang, and F. P. Hwang, Fuzzy Multiple Attribute Decision Making, Springer-Verlag (1992).
6. L. A. Zadeh, Probability measures of fuzzy events, Journal of Mathematical Analysis and Applications, 23, 421-427 (1968).
7. C. Kahraman, U. Cebeci, and D. Ruan, Multi-attribute comparison of catering service companies using fuzzy AHP: The case of Turkey, International Journal of Production Economics, 87, 171-184 (2004).
8. K. Zhu, Y. Jing, and D. Chang, A discussion on Extent Analysis Method and applications of fuzzy AHP, European Journal of Operational Research, 116, 450-456 (1999).
9. A. Charnes, W. W. Cooper, and E. Rhodes, Measuring the efficiency of decision-making units, European Journal of Operational Research, 2, 429-444 (1978).
10. X. Li and G. R. Reeves, A multiple criteria approach to data envelopment analysis, European Journal of Operational Research, 115, 507-517 (1999).
AN INTELLIGENT HYBRID APPROACH FOR INDUSTRIAL QUALITY CONTROL COMBINING NEURAL NETWORKS, FUZZY LOGIC AND FRACTAL THEORY
PATRICIA MELIN AND OSCAR CASTILLO
Department of Computer Science, Tijuana Institute of Technology, P.O. Box 4207, Chula Vista CA 91909, USA. E-mail: [email protected]
We describe in this paper a new hybrid intelligent approach for industrial quality control combining neural networks, fuzzy logic and fractal theory. We also describe the application of the neuro-fuzzy-fractal approach to the problem of quality control in the manufacturing of sound speakers in a real plant. The quality control of the speakers was previously done by manually checking the quality of sound achieved after production [4]. A human expert evaluates the quality of sound of the speakers to decide if production quality was achieved. Of course, this manual checking of the speakers is time consuming and occasionally was the cause of error in quality evaluation [8]. For this reason, it was necessary to consider automating the quality control of the sound speakers.
1. Introduction
The quality control of the speakers was done before by manually checking the quality of sound achieved after production [4]. A human expert evaluates the quality of sound of the speakers to decide if production quality was achieved. Of course, this manual checking of the speakers is time consuming and occasionally was the cause of error in quality evaluation [8]. For this reason, it was necessary to consider automating the quality control of the sound speakers. The problem of measuring the quality of the sound speakers is as follows:
1. First, we need to extract the real sound signal of the speaker during the testing period after production.
2. Second, we need to compare the real sound signal to the desired sound signal of the speaker, and measure the difference in some way.
3. Third, we need to decide on the quality of the speaker based on the difference found in step 2. If the difference is small enough, then the speaker can be considered of good quality; if not, then it is of bad quality.
The first part of the problem was solved by using a multimedia kit that enabled us to extract the sound signal as a file, which basically contains 108,000 points over a time period of 3 seconds (this is the time required for testing). We can say that the sound signal is measured as a time series of data points [3], which has the basic characteristics of the speaker. The second part of the problem was solved by using a neuro-fuzzy approach to train a fuzzy model with the data from the good quality speakers [9]. We used a neural network [6] to obtain a Sugeno fuzzy system [14, 15] with the time series of the ideal speakers. In this case, a neural network [5, 11, 13] is used to adapt the parameters of the fuzzy system with real data of the problem. With this fuzzy model, the time series of other speakers can be used as checking data to evaluate the total error between the real speaker and the desired one. The third part of the problem was solved by
using another set of fuzzy rules [20], which basically are fuzzy expert rules to decide on the quality of the speakers based on the total checking error obtained in the previous step. Of course, in this case we needed to define membership functions for the error and the quality of the product, and the Mamdani reasoning approach was used. We also use as an input variable of the fuzzy system the fractal dimension of the sound signal. The fractal dimension [9] is a measure of the geometrical complexity of an object (in this case, the time series). We tested our fuzzy-fractal approach for automated quality control during production with real sound speakers, with excellent results. Of course, to measure the efficiency of our intelligent system, we compared the results of the fuzzy-fractal approach to those of real human experts.

2. Basic Concepts of Sound Speakers
In any sound system, ultimate quality depends on the speakers [4]. The best recording, encoded on the most advanced storage device and played by a top-of-the-line deck and amplifier, will sound awful if the system is hooked up to poor speakers. A system's speaker is the component that takes the electronic signal stored on things like CDs, tapes and DVDs and turns it back into actual sound that we can hear.

2.1 Sound Basics
To understand how speakers work, the first thing you need to do is understand how sound works. Inside your ear is a very thin piece of skin called the eardrum. When your eardrum vibrates, your brain interprets the vibrations as sound. Rapid changes in air pressure are the most common thing to vibrate your eardrum. An object produces sound when it vibrates in air (sound can also travel through liquids and solids, but air is the transmission medium when we listen to speakers). When something vibrates, it moves the air particles around it. Those air particles in turn move the air particles around them, carrying the pulse of the vibration through the air as more and more particles are pushed farther from the source of the vibration. In this way, a vibrating object sends a wave of pressure fluctuation through the atmosphere. When the fluctuation wave reaches your ear, it vibrates the eardrum back and forth. Our brain interprets this motion as sound. We hear different sounds from different vibrating objects because of variations in:
- sound wave frequency: a higher wave frequency simply means that the air pressure fluctuates faster, which we hear as a higher pitch. When there are fewer fluctuations in a period of time, the pitch is lower.
- air pressure level (the wave's amplitude): this determines how loud the sound is. Sound waves with greater amplitudes move our eardrums more, and we register this sensation as a higher volume.
A speaker is a device that is optimized to produce accurate fluctuations in air pressure. Figure 1 shows a typical speaker driver.
Fig.1. A typical speaker driver, with a metal basket, heavy permanent magnet and paper diaphragm
3. Description of the Problem
The basic problem consists of the identification of sound signal quality. Of course, this requires a comparison between the real measured sound signal and the ideal good sound signal. We need to be able to accept speakers whose sound signal does not differ much from the ideal signals. We show in Figure 2 the form of the sound signal for a good speaker (of a specific type). The measured signal contains about 108,000 points over about 3 seconds. We need to compare any other measured signal with the good one and calculate the total difference between the two; if the difference is small, then we can accept the speaker as a good one. On the other hand, if the difference is large, then we reject the speaker as a bad one. We show in Figure 3 the sound signal for a speaker of bad quality. We can clearly see the difference between the geometrical form of this signal and that of the one shown in Figure 2. In this case, the difference between the figures is sufficiently large, and we easily determine that the speaker is of bad quality. We also show in Figure 4 another sound signal for a bad quality speaker.
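The paper's actual comparison is carried out by the neuro-fuzzy model described later; as a crude stand-in, the total difference between two equally sampled signals can be sketched as follows (the threshold is purely illustrative, and the function names are ours):

    import numpy as np

    def total_difference(real: np.ndarray, reference: np.ndarray) -> float:
        """Mean absolute difference between measured and reference signals."""
        assert real.shape == reference.shape   # e.g. 108,000 samples over 3 s
        return float(np.mean(np.abs(real - reference)))

    def accept_speaker(real, reference, threshold=0.05):
        # small difference -> accept as good quality; large difference -> reject
        return total_difference(real, reference) <= threshold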
Fig. 2. Sound signal of a Good Speaker
Fig. 3. Sound Signal of Bad Speaker (Case 1)
Fig. 4. Sound Signal of Bad Speaker (Case 2)
4. Basic Concepts of Fractal Dimension
Recently, considerable progress has been made in understanding the complexity of an object through the application of fractal concepts [8] and dynamic scaling theory [11]. For example, financial time series show scaling properties suggesting a fractal structure [1, 2, 3]. The fractal dimension of a geometrical object can be defined as follows:

d = lim_{r→0} [ln N(r)] / [ln(1/r)]    (1)

where N(r) is the number of boxes covering the object and r is the size of the box. An approximation to the fractal dimension can be obtained by counting the number of boxes covering the boundary of the object for different r sizes and then performing a logarithmic regression to obtain d (box counting algorithm). In Figure 5, we illustrate the box counting algorithm for a hypothetical curve C. Counting the number of boxes for different sizes of r and performing a logarithmic linear regression, we can estimate the box dimension of a geometrical object with the following equation:

ln N(r) = ln p − d ln r

This algorithm is illustrated in Figure 6.
Fig. 5. Box counting algorithm for a curve C
Fig.6. Logarithmic regression to find dimension
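A minimal Python sketch of the box counting procedure just described, counting only the boxes that contain sampled points of the curve (a rough but common approximation):

    import numpy as np

    def box_counting_dimension(y, grid_sizes=(4, 8, 16, 32, 64, 128)):
        """Estimate d by fitting ln N(r) = ln p - d ln r, with r = 1/g."""
        t = np.linspace(0.0, 1.0, len(y))
        y = (y - y.min()) / (y.max() - y.min() + 1e-12)   # curve in the unit square
        counts = []
        for g in grid_sizes:
            ix = np.clip((t * g).astype(int), 0, g - 1)
            iy = np.clip((y * g).astype(int), 0, g - 1)
            counts.append(len(set(zip(ix.tolist(), iy.tolist()))))
        # slope of ln N(r) against ln g = ln(1/r) is the dimension d
        d, _ = np.polyfit(np.log(grid_sizes), np.log(counts), 1)
        return d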
We developed a computer program for calculating the fractal dimension of a sound signal. The computer program uses as input the figure of the signal and counts the number of boxes covering the object for different grid sizes. For example, the fractal dimension for the sound signal of Figure 2 is 1.6479, which is a low value because it corresponds to a good speaker. On the other hand, the fractal dimension for Figure 3 is 1.7843, which is a high value (bad speaker). Also, for the case of Figure 4 the dimension is 1.8030, which is even higher (again, a bad speaker).

5. Experimental Results
We describe in this section the experimental results obtained with the intelligent system for automated quality control. The intelligent system uses a fuzzy rule base to determine automatically the quality of sound in speakers. We used a neural network to adapt the parameters of the fuzzy system using real data from the problem. We used the time series of 108,000 points measured from a good sound speaker (in a period of 3 seconds) as training data in the neural network. We then use the measured data of any other speaker as checking data, to compare the form of the sound signals. A neural network is used to adapt a fuzzy system with training data of good sound speakers. The approximation is very good considering the complexity of the problem. Once the training was done, we used the fuzzy system for measuring the total difference between a given signal and the good ones. This difference is used to decide on the final quality of the speaker using another set of fuzzy rules with the Mamdani approach. The fuzzy rules are as follows:

IF Difference is small AND Fractal Dimension is small THEN Quality is Excellent
IF Difference is regular AND Fractal Dimension is small THEN Quality is Good
IF Difference is regular AND Fractal Dimension is high THEN Quality is Medium
IF Difference is medium AND Fractal Dimension is small THEN Quality is Medium
IF Difference is medium AND Fractal Dimension is high THEN Quality is Bad
IF Difference is large AND Fractal Dimension is small THEN Quality is Medium
IF Difference is large AND Fractal Dimension is high THEN Quality is Bad
IF Difference is small AND Fractal Dimension is high THEN Quality is Medium
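Read as a rule table, the base above can be evaluated with standard max-min (Mamdani) inference; a small sketch with placeholder membership degrees (the plant's actual membership functions are not given in the paper):

    RULES = [
        ("small",  "small", "excellent"), ("regular", "small", "good"),
        ("regular", "high", "medium"),    ("medium",  "small", "medium"),
        ("medium",  "high", "bad"),       ("large",   "small", "medium"),
        ("large",   "high", "bad"),       ("small",   "high", "medium"),
    ]

    def fire(diff_mu, dim_mu):
        """Firing strength of each quality label under max-min inference."""
        out = {}
        for diff_label, dim_label, quality in RULES:
            strength = min(diff_mu.get(diff_label, 0.0), dim_mu.get(dim_label, 0.0))
            out[quality] = max(out.get(quality, 0.0), strength)
        return out

    # e.g. fire({"small": 0.7, "regular": 0.3}, {"small": 0.6, "high": 0.4})
    # -> {'excellent': 0.6, 'good': 0.3, 'medium': 0.4, 'bad': 0.0}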
We show in Figure 7 the non-linear surface of a type-I fuzzy system for quality control of the sound speakers. We also show in Table 1 the results of the fuzzy systems for 12 specific situations.
Fig.7. Non-linear surface of type-I fuzzy system
Table 1. Outputs of the fuzzy systems for 12 specific situations
[The entries of Table 1 are not legible in the source scan.]

6. Conclusions
We described in this paper the application of fuzzy logic to the problem of automating the quality control of sound speakers during manufacturing in a real plant. We have implemented an intelligent system for quality control in the plant using the new approach. We also use the fractal dimension as a measure of the geometrical complexity of the sound signals. The intelligent system performs rather well considering the complexity of the problem. The intelligent system has been tested in a real manufacturing plant with very good results.
References
[1] Castillo, O. and Melin, P. (2001). "Soft Computing for Control of Non-Linear Dynamical Systems", Springer-Verlag, Heidelberg, Germany.
[2] Castillo, O. and Melin, P. (2002). "Hybrid Intelligent Systems for Time Series Prediction Using Neural Networks, Fuzzy Logic and Fractal Theory", IEEE Transactions on Neural Networks, vol. 13, pp. 1395-1408.
[3] Castillo, O. and Melin, P. (2003). "Soft Computing and Fractal Theory for Intelligent Manufacturing", Springer-Verlag, Heidelberg, Germany.
[4] Dickason, V. (1997). The Loudspeaker Cookbook, McGraw Hill.
[5] Haykin, S. (1996). "Adaptive Filter Theory", Third Edition, Prentice Hall.
[6] Jang, J.R., Sun, C.T. and Mizutani, E. (1997). Neuro-Fuzzy and Soft Computing, Prentice Hall.
[7] Karnik, N. N. and Mendel, J. M. (1998). "An Introduction to Type-2 Fuzzy Logic Systems", Technical Report, University of Southern California.
[8] Loctite Co. (1999). Speaker Assembly Adhesives Guide.
[9] Mandelbrot, B. (1987). "The Fractal Geometry of Nature", W.H. Freeman and Company.
[10] Melin, P. and Castillo, O. (2002). "Modelling, Simulation and Control of Non-Linear Dynamical Systems", Taylor and Francis, London, Great Britain.
[11] Parker, D.B. (1982). "Learning Logic", Invention Report 581-64, Stanford University.
[12] Rasband, S.N. (1990). "Chaotic Dynamics of Non-Linear Systems", Wiley Interscience.
[13] Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986). "Learning Internal Representations by Error Propagation", in "Parallel Distributed Processing: Explorations in the Microstructures of Cognition", MIT Press, Cambridge, USA, Vol. 1, pp. 318-362.
[14] Sugeno, M. and Kang, G.T. (1988). Structure Identification of Fuzzy Model, Fuzzy Sets and Systems, Vol. 28, pp. 15-33.
[15] Takagi, T. and Sugeno, M. (1985). Fuzzy Identification of Systems and its Applications to Modeling and Control, IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, pp. 116-132.
[16] Wagenknecht, M. and Hartmann, K. (1988). "Application of Fuzzy Sets of Type 2 to the Solution of Fuzzy Equations Systems", Fuzzy Sets and Systems, Vol. 25, pp. 183-190.
[17] Yager, R. R. (1980). "Fuzzy Subsets of Type II in Decisions", Journal of Cybernetics, Vol. 10, pp. 137-159.
[18] Zadeh, L. A. (1971). Similarity Relations and Fuzzy Ordering, Information Sciences, vol. 3, pp. 177-200.
[19] Zadeh, L. A. (1973). Outline of a New Approach to the Analysis of Complex Systems and Decision Processes, IEEE Transactions on Systems, Man, and Cybernetics, vol. 3, pp. 28-44.
[20] Zadeh, L. A. (1975). "The Concept of a Linguistic Variable and its Application to Approximate Reasoning", Information Sciences, 8, pp. 43-80.
A FUZZY HEURISTIC MULTI-ATTRIBUTE CONJUNCTIVE APPROACH FOR ERP SOFTWARE SELECTION
CENGIZ KAHRAMAN¹, GULCIN BUYUKOZKAN², DA RUAN³
¹Istanbul Technical University, Department of Industrial Engineering, Macka 34367, Istanbul, Turkey
²Galatasaray University, Department of Industrial Engineering, Ortakoy 34357, Istanbul, Turkey
³Belgian Nuclear Research Centre (SCK-CEN), Boeretang 200, B-2400 Mol, Belgium
In recognition of the importance of ERP software selection and of the sizable risk that organizations take when they decide to buy this type of technology, this study proposes a systematic decision process for selecting a suitable ERP package. We are interested in two stages of decision-making: the identification of serious software candidates and the choice of the most appropriate ERP software package. To achieve this, we propose an integrated approach based on fuzzy logic, heuristic multi-attribute utility and conjunctive methods. Finally, a case study is given to demonstrate the potential of the methodology.
1. Introduction
Under the pressure to deal proactively with a radically changing external environment, many firms have changed their information system strategies by adopting application software packages rather than in-house development [1]. An application package such as an enterprise resource planning (ERP) system provides reduced cost, rapid implementation, and high system quality [2]. The growing number and quality of ERP software packages, the cost of the package (costs equaling several thousands, hundreds of thousands, and even millions of dollars) [3], the set-up, running and other related costs, and the fact that the complexity of ERP packages and the profusion of alternatives require expertise for their evaluation, make the selection of an appropriate ERP package a vital issue to practitioners [4]. In addition, since an ERP system, by its very nature, will impose its own logic on a company's strategy, organization, and culture, it is imperative that the ERP selection decision be conducted with great care. A group approach to the ERP software selection decision offers many benefits, including improved overall decision quality and decision-making effectiveness. Clearly, ERP software selection is not a well-defined or structured decision problem. The presence of multiple criteria and the involvement of multiple decision makers (DMs) will expand decisions from one to several dimensions, thus increasing the
* Corresponding author: e-mail: [email protected]
complexity of the solution process. For this reason, in this study, we propose to use both heuristic multi-attribute utility and conjunctive methods, combined with fuzzy set theory, in a two-phase ERP software selection process. The application of the suggested methodology is also explained through a case study.

2. Suggested framework for ERP software selection
The evaluation procedure begins with identifying the software selection criteria. From a candidate list, the DMs have to narrow the field to four to six serious candidates. This can be accomplished by a preliminary analysis of the strengths and weaknesses of each supplier and the "goodness of fit" of the software to company needs, by applying heuristic rules. The measurement of performance corresponding to each criterion is conducted in the setting of fuzzy set theory. Finally, in the second phase, we apply the fuzzy conjunctive method to obtain the final ranking results and select the most appropriate ERP software package. The evaluation algorithm may be summarized as follows:
Step 1. Identify interest groups of people in the decision environment.
Step 2. Identify attributes and establish a universe of discourse for each attribute.
Step 3. List all alternatives.
Step 4. Interact with the DMs to identify their heuristics and, according to them, construct heuristic decision rules for each group.
Step 5. List the selected most appropriate alternatives and measure them along the attributes.
Step 6. Calculate the utilities of the alternatives for each group and determine the alternatives to keep for the following steps.
Step 7. Interact with the DMs to identify the cutoff vector.
Step 8. Determine the possibility and necessity measures for the identified alternatives using a fuzzy conjunctive method.
Step 9. Rank the alternatives from the best to the worst and select the best ERP package for the company.
The DMs are actively involved in the decision process, and the decision rules are constructed through serious discussions between the DMs and the analysts.

3. Evaluation methods of ERP software selection
3.1. Heuristic Multi Attribute Utility Function (MAUF) approach
The authors of [5, 6] argued that the multiple attribute utility function (MAUF) could not be practically obtained by the combination of single attribute utility functions, because of the dependency among attributes. Therefore a heuristic approach is needed to define the MAUF [7]. A heuristic method is an algorithm that gives only an approximate solution to a given problem. Heuristic methods are commonly used because they are much faster than exact algorithms. Since decision data may be numerically and/or linguistically expressed, fuzzy set
theory must be incorporated in the heuristic approach. In this case, the utility function is represented in the "IF ... THEN" decision rule format.
3.2. Fuzzy conjunctive approach
Dubois et al. [8] proposed the fuzzy version of the conjunctive method. They pointed out that when the data in a decision matrix and the DM's standard levels are fuzzy, the matching between these two fuzzy data becomes vague and, naturally, a matter of degree. The degree of matching is measured by the following membership function:

μ_{P|Q}(α) = sup_{x: μ_P(x) = α} π_Q(x),  for all α

where π_Q(x) represents the degree of possibility that x is the (unique) value which describes an object modeled by Q, and μ_P(x) is the degree of compatibility between the value x and the meaning of P. Thus, μ_{P|Q}(α) denotes the degree of compatibility of Q with respect to P. Dubois et al. [8] defined two scalar indices to approximate the μ_{P|Q}(α) measure so that compatibility between fuzzy sets can be estimated: the possibility of matching and the necessity of matching, for single and multiple attributes. In this study, we use the measures of matching for multiple attributes (as in most real-world problems) by applying the following equations:

Π(A°, A_i) = min_{j=1,...,n} Π(x_j°, x_ij)  : the possibility measure    (1)
N(A°, A_i) = min_{j=1,...,n} N(x_j°, x_ij)  : the necessity measure     (2)

where A° = (x_1°, ..., x_n°), A_i = (x_i1, ..., x_in), and x_j° and x_ij are defined on the same domain U. The vector A° is the cutoff vector specified by the DM, while A_i, i = 1, ..., m, is the vector that contains the performance scores of the ith alternative under all attributes. The other computations are given in [8].
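On a discretized common domain, the attribute-level possibility and necessity of matching can be computed directly from the standard definitions Π(P; Q) = sup_x min(μ_P(x), π_Q(x)) and N(P; Q) = inf_x max(μ_P(x), 1 − π_Q(x)); the domain grid and the two triangular profiles below are illustrative only:

    import numpy as np

    def possibility(mu_p, pi_q):
        """Pi(P; Q) = sup_x min(mu_P(x), pi_Q(x))."""
        return float(np.max(np.minimum(mu_p, pi_q)))

    def necessity(mu_p, pi_q):
        """N(P; Q) = inf_x max(mu_P(x), 1 - pi_Q(x))."""
        return float(np.min(np.maximum(mu_p, 1.0 - pi_q)))

    def tri(x, a, b, c):
        """Triangular membership function Tr(a, b, c), with a < b < c."""
        return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

    x = np.linspace(0, 10, 1001)
    standard = tri(x, 6, 8, 10)    # e.g. the DM's standard level for one attribute
    rating   = tri(x, 5, 7, 9)     # e.g. an alternative's fuzzy score
    print(possibility(standard, rating), necessity(standard, rating))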
4. Application of the proposed framework
We consider a medium-sized company that wishes to replace its legacy manufacturing planning system with an ERP product. Each of the candidate ERP systems has different strengths and weaknesses. A diverse cross-functional team consisting of five members has been organized to determine the performance aspects. We use five critical performance aspects that are widely proposed in the literature and in practice [3]: price, capability, reliability, credibility and service support. We assume that the opinions of the five team members have been obtained in the form of linguistic variables for each performance aspect, as shown in Table 1. Each of these aspects is treated as a fuzzy set, bounded to a predetermined interval and characterized by a single possibility distribution. Figure 1 gives the membership functions for the attribute 'price'. If the range for
the other attributes is the interval [0, 10], Figure 2 illustrates one set of assignments that can be made.

Table 1. The conversion scales
Performance aspect | Linguistic values
Price | Very expensive (VE), Expensive (E), Rather cheap (RC), Cheap (C), Very cheap (VC)
Capability | Incapable (I), Little capable (LC), Rather capable (RC), Capable (C), Very capable (VC)
Reliability | Unreliable (U), Little reliable (LR), Rather reliable (RR), Reliable (R), Very reliable (VR)
Credibility | Incredible (I), Little credible (LC), Rather credible (RC), Credible (C), Very credible (VC)
Service support | Very low (VL), Low (L), Medium (M), High (H), Very high (VH)

Figure 1. Membership functions for the attribute 'price' (the x-axis spans prices from 150 to 650)
We established the universes of discourse for all attributes and their utility. Through careful discussion with the experts, we identified the heuristics and ideal solutions and obtained the decision rules. We listed all the alternatives and, using the identified decision rules, obtained the utility of each alternative as given in Table 2. Three alternatives (1, 4 and 9), which have the highest utility values, are identified for a more detailed analysis.
Figure 2. Membership functions for the linguistic variables (defined over the interval [0, 10])
Table 2. The linguistic evaluation with respect to each alternative ERP software
(columns: Price; Capability; Reliability; Credibility; Service support; UTILITY)

Alt. 1: Very expensive; Very capable; Very reliable; Very credible; Very high; VERY GOOD
Alt. 2: Rather cheap; Rather capable; Rather reliable; Credible; High; GOOD
Alt. 3: Very cheap; Little capable; Little reliable; Little credible; Very high; MEDIUM POOR
Alt. 4: Expensive; Capable; Very reliable; Credible; Very high; VERY GOOD
Alt. 5: Rather cheap; Rather capable; Reliable; Rather credible; High; GOOD
Alt. 6: Cheap; Rather capable; Rather reliable; Rather credible; Low; MEDIUM
Alt. 7: Very expensive; Rather capable; Rather reliable; Rather credible; Medium; MEDIUM
Alt. 8: Cheap; Capable; Reliable; Little credible; High; MEDIUM GOOD
Alt. 9: Expensive; Capable; Reliable; Very credible; High; VERY GOOD
Alt. 10: Rather cheap; Little capable; Little reliable; Credible; Low; MEDIUM GOOD
Let us assume that the cutoff vector is given as A° = {(x_1°, 0.25), (x_2°, 0.15), (x_3°, 0.20), (x_4°, 0.15), (x_5°, 0.25)}, where the numbers 0.25, 0.15, 0.20, 0.15, and 0.25 are the weights associated with each attribute, and the x_j°, j = 1, 2, 3, 4, 5, are summarized as: x_1°: rather cheap; x_2°: capable; x_3°: very reliable; x_4°: credible; x_5°: very high. For the first alternative (i = 1), the values of Π(x_j°, x_1j) are calculated as 0, 0.4, 1.0, 0.4, 1.0 and the values of N(x_j°, x_1j) are calculated as 0, 0, 0.5, 0, 0.5 for j = 1, ..., 5. Applying Eqs. (1) and (2), we compute the possibility and necessity measures for the three identified alternatives as in Table 3. The rank order from the best to the worst is obtained as {4, (1, 9)}.

Table 3. Possibility and necessity measures
Alternative i:   1     4     9
Π(A°, A_i):      0.75  0.75  0.75
N(A°, A_i):      0.75  0.85  0.75
524 Table 4.Modified possibility and necessity measures
I
Altemativei
I I
1
maran0.20 and
I I
4
II
9
I
0.50 0.40 withthree traning
Table 11 comparative pnn meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 5. Conclusion W e proposed a n integrated approach to the ERP software selection problem. W e used fuzzy heuristics to eliminate the worst alternatives among all at the first stage and then used a fuzzy conjunctive method to select the best among the remaining alternatives. Changing the weights of the attributes, it is shown that the rank order of the alternatives may change. These methods are very useful in comparing alternatives when there is incomplete information about them.
References 1. K.C. Laudon, and J.P. Laudon, Management Information Systems: Organization and Technology, Prentice-Hall, Englewood Cliffs, NJ, (1996). 2. H.C. Lucas, E.J. Walton, and M.J. Ginzberg, Implementing packaged software, MIS Quarterly, 537-549, (1988).
3. J. Verville, and A. Halingten, A six-stage model of the buying process for ERP software, Industrial Marketing Management, 32(7), 585-594, (2003).
4. C-C. Wei, and M-J.J., Wang, A comprehensive framework for selecting an ERP system, International Journal of Project Management, 22(2), 161-169, (2004).
5. J. Efstathiou and V. Rajkovic, Multiattribute decision-making using a fuzzy heuristic approach, IEEE Trans. on Systems, Man, and Cybernetics, Vol. SMC-9, 326-333, (1979).
6. J. Efstathiou, The incorporation of objective and subjective aspects into the assessment of information systems, In: The Information Systems Environment, Lucas, Land, Lincoln, and Supper (eds.), North-Holland, 187-197, (1980).
7. S.-J. Chen and C.-L. Hwang, Fuzzy multiple attribute decision-making: methods and applications, Springer-Verlag, Berlin, (1992).
8. D. Dubois, H. Prade, and C. Testemale, Weighted fuzzy pattern matching, Fuzzy Sets and Systems, 28(3), 313-331, (1988).
ESTIMATION OF EASE ALLOWANCE OF A GARMENT USING FUZZY LOGIC
Y. CHEN, X. ZENG, M. HAPPIETTE, P. BRUNIAUX
Ecole Nationale Supérieure des Arts & Industries Textiles, 9 rue de l'Ermitage, Roubaix 59100, France
R. NG, W. YU
The Hong Kong Polytechnic University, Hong Kong, China
The ease allowance is an important criterion in garment sales. It is often taken into account in the process of construction of garment patterns. However, the existing pattern generation methods cannot provide a suitable estimation of ease allowance, which is strongly related to the wearer's body shape and movements and to the fabrics used. They can only produce 2D patterns for a fixed standard value of ease allowance. In this paper, we propose a new method of estimating the ease allowance of a garment using fuzzy logic and sensory evaluation. Based on these values of ease allowance, we develop a new method of automatic pattern generation, making it possible to improve the wearer's fitting perception of a garment. The effectiveness of our method has been validated in the design of trousers of jean type. It can also be applied to the design of other types of garment.
1. Introduction
A garment is assembled from different cut fabric elements fitting human bodies. Each of these cut fabric elements is reproduced according to a pattern made on paper or card, which constitutes a rigid 2D geometric surface. For example, a classical trouser is composed of cut fabrics corresponding to four patterns: front left pattern, back left pattern, front right pattern and back right pattern. A pattern contains some reference lines characterized by dominant points which can be modified. Of all the classical methods of garment design, the draping method is used in high-level garment design [1]. Using this method, pattern makers drape the fabric directly on the mannequin, fold and pin the fabric onto the mannequin, and trace out the fabric patterns. This method leads to the direct creation of clothing with high accuracy, but it needs a very long trying time and sophisticated techniques related to the personalized experience of operators. Therefore, it cannot be applied in massive garment production. The direct drafting method is quicker and more systematic, but often less precise [2]. It is generally applied in the classical garment industry. Using this method, pattern makers directly draw patterns on paper using a pattern construction procedure implemented in a garment
CAD system. This construction procedure does not determine the amount of ease allowance, but instead generates flat patterns for any given value of ease allowance. In practice, it is necessary to find a compromise between these two garment construction methods, so that their complementarity can be taken into account in the design of new products. To each individual corresponds a pattern whose parameters should include his body size and the amount of ease allowance of the garment. In fact, most fabrics are extensible and cannot be deformed arbitrarily. Moreover, the amount of ease allowance of a garment, defined as the difference in space between the garment and the body, can be taken into account in the pattern by increasing the area along its outline. In practice, there exist three types of ease allowance: (1) standard ease, (2) dynamic ease and (3) fabric ease. Standard ease allowance is the difference between the maximal and minimal perimeters of the wearer's body. It is obtained from a standard human body shape for the gesture of standing or sitting still. This amount can be easily calculated using a classical drafting method [2], [3]. Dynamic ease allowance provides sufficient space for wearers having non-standard body shapes (fat, thin, big hip, strong leg, ...) and for their movements (walking, jumping, running, etc.). Fabric ease allowance takes into account the influence of the mechanical properties of the fabrics of the garment. It is a very important concept for garment fitting. Existing automatic systems of pattern generation and garment CAD systems cannot determine suitable amounts of ease allowance, because only standard ease allowance is taken into account. In this case, 2D patterns are generated according to predefined standard values of ease allowance for any body shape and any type of fabric. In this paper, we propose a new method for improving the system of garment pattern generation by defining the concept of fuzzy ease allowance, capable of taking into account two aspects: standard ease and dynamic ease.
2. Construction of the Fuzzy Logic Controller
This method permits the generation of new values of ease allowance using a Fuzzy Logic Controller (FLC), adapted to the body measurements and movements of each individual. The corresponding scheme is given in Figure 1. For simplicity, only trousers of jean type related to comfort at the hip position are studied in this paper, and the influence of fabric ease, related to the physical properties of the garment, and of other body positions is not taken into account.
Figure 1. Scheme of the estimation of ease allowance: body measurements → Fuzzy Logic Controller → standard ease + dynamic ease → garment pattern generation → evaluation
The FLC used for generating fuzzy ease allowance includes an interface of fuzzification, a base of fuzzy rules, an inference mechanism and an interface of defuzzification. It permits the production of the fuzzy ease allowance at the hip position, i.e. the combination of the dynamic ease and the standard ease, from a number of relevant measures of wearers' body shapes and the comfort sensation of wearers. The amount of fuzzy ease allowance will be further used for generating more suitable patterns. The construction of this FLC is based on a learning base built from an adjustable garment sample, used to generate different trouser sizes, and a group of representative evaluators or wearers (sensory panel). Precisely, the procedure for the construction of this learning base is as follows:
Step 1: Selecting or producing a special sampling jean whose key positions can be adjusted in order to generate different values of ease allowance. This sample can be used to simulate jeans of different sizes and different styles. In our project, the sample is adjusted to have three sizes: normal size, loose fit and tight fit, and the corresponding ease allowance at the hip position can vary from -1 to 8. These values are taken as output learning data of the FLC.
Step 2: Selecting a group of n evaluators having different body shapes.
Step 3: Sensory evaluation: evaluating the sample at different adjustable sizes by each evaluator according to a questionnaire defined a priori. The perception of wearers related to jean comfort at different body positions and for different movements can be taken into account in the replies to the questionnaire. In our project, we calculate for each evaluator the minimal value of the evaluation scores with respect to all body positions and all movements, and take the minimal values for all evaluators as input learning data of the FLC.
Step 4: Objective evaluation: measuring the body shapes of the evaluators at a number of dominant geometric positions. These values are also taken as input learning data of the FLC.
This procedure permits us to obtain input/output data for different sizes of the learning garment sample. In this FLC, the inputs include 3 variables. The first and second input variables, measuring the wearer's body shape and related to standard and
dynamic ease allowance, are waist girth (x_1) and waist-to-hip (x_2) respectively. Their linguistic values are {very small (VS), small (S), normal (N), big (B), very big (VB)}. The third input variable (x_3) measures the comfort sensation of wearers, and its linguistic values, obtained from sensory evaluation of wearers, are {very uncomfortable (VUC), uncomfortable (UC), normal (N), comfortable (C), very comfortable (VC)}. All these measures are obtained from Step 3 and Step 4 of the previous learning procedure. The output of the FLC is the estimated fuzzy ease allowance, denoted by y. Its values are real numbers varying between -1 and 8. The corresponding learning input/output data, measured and evaluated on the garment sample, are denoted by {(x_11, x_12, x_13; y_1), ..., (x_n1, x_n2, x_n3; y_n)}. The Sugeno method is used for defuzzification [4]. The fuzzy rules are extracted from these input/output learning data. For each input variable x_i (i = 1, 2, 3), the parameters of its membership functions are obtained using the fuzzy c-means clustering method [5]. This method permits the classification of the learning data {x_1i, ..., x_ni} into 5 classes, corresponding to the five fuzzy values of x_i. For each learning datum x_ki, we obtain the membership degrees for these five fuzzy values: μ_1(x_ki), ..., μ_5(x_ki). Assuming that the corresponding membership functions take a triangular shape characterized by Tr(a_1i, b_1i, c_1i), ..., Tr(a_5i, b_5i, c_5i), the 15 parameters a_1i, ..., c_5i are obtained by minimizing the squared error between the triangular membership functions and the membership degrees produced by fuzzy c-means over all learning data.
Figure 2. The membership functions for x_1 (waist girth)
An example of the membership functions optimized by the fuzzy c-means method is given in Figure 2. In practice, the fuzzy values obtained by this method lead to more precise results than uniformly partitioned fuzzy values, because each fuzzy value generally corresponds to one aggregation of learning data.
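The exact form of the fitting criterion is garbled in the source; a minimal sketch of this fitting step, assuming a least-squares criterion and a generic optimizer (the function names are ours):

    import numpy as np
    from scipy.optimize import minimize

    def tri(x, a, b, c):
        """Triangular membership Tr(a, b, c) evaluated at x."""
        return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                     (c - x) / (c - b + 1e-12)), 0.0)

    def fit_triangle(xk, mu_l):
        """Fit Tr(a, b, c) to the membership degrees mu_l(xk) by least squares."""
        def loss(p):
            a, b, c = np.sort(p)            # enforce a <= b <= c
            return np.sum((tri(xk, a, b, c) - mu_l) ** 2)
        x0 = np.percentile(xk, [10, 50, 90])
        return np.sort(minimize(loss, x0, method="Nelder-Mead").x)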
3. Fuzzy Rules Extraction from Learning Data
The fuzzy rules of the FLC for estimation of ease allowance are extracted from the learning data {(x_11, x_12, x_13; y_1), ..., (x_n1, x_n2, x_n3; y_n)} using the method of antecedent validity adaptation (AVA) [6]. The principle of this method is essentially a process by which the antecedent validity of each datum, with respect to fuzzy values and fuzzy rules, is evaluated in order to adjust the output consequent. Compared with other fuzzy rule extraction methods, the AVA method can effectively resolve the conflicts between different rules and then decrease the information lost by selecting only the most influential rules. In our project, the basic idea of applying the AVA method is briefly presented as follows. According to Section 2, the input variable x_i (i ∈ {1, 2, 3}) is partitioned into 5 fuzzy values: FV_i = {VS, S, N, B, VB}. For each learning datum (x_k1, x_k2, x_k3; y_k), we set up the following fuzzy rules by combining all fuzzy values of these three input variables:

Rule j: IF (x_1 is A_1^j) AND (x_2 is A_2^j) AND (x_3 is A_3^j) THEN (y is y_k), with A_i^j ∈ FV_i,

together with the validity degree of the rule, D(rule j). Given a predefined threshold ω, the rule j is removed from the rule base if the following condition holds: D(rule j) < ω.
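A compact sketch of this generate-and-filter step (the min conjunction for the antecedent validity and the data structures are our assumptions; AVA's consequent adjustment and conflict resolution are not shown):

    from itertools import product

    def extract_rules(data, memberships, omega=0.01):
        """data: list of (x1, x2, x3, y) samples.
        memberships[i]: dict mapping each fuzzy value of x_i to its membership function.
        Returns surviving rules as (A1, A2, A3, y, D)."""
        rules = []
        for x1, x2, x3, y in data:
            for a1, a2, a3 in product(memberships[0], memberships[1], memberships[2]):
                d = min(memberships[0][a1](x1),
                        memberships[1][a2](x2),
                        memberships[2][a3](x3))
                if d >= omega:              # rules with D < omega are removed
                    rules.append((a1, a2, a3, y, d))
        return rules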
4. Results and discussion
To test the effectiveness of the FLC, we carry out the following experiments. Of the n learning data (n = 20) evaluated and measured on the garment sample, we use n-1 data for the learning of the FLC, i.e. the construction of membership functions and the extraction of fuzzy rules, and the remaining datum for comparing the difference between the estimated ease allowance generated by the fuzzy model (y_m) and the real value of the ease, y. Next, this procedure is repeated by taking another n-1 data for learning. Finally, we obtain the results for 20 permutations (see Figure 3). For ω = 0.01, the averaged error between y_m and y over all permutations is 0.7854. Figure 3 shows that the estimated ease allowance generated from the fuzzy model can generally track the evolution of the real ease. y_m varies more smoothly than y because there exists an averaging effect in the output of the FLC. The difference between y_m and y is bigger for isolated test data, because their behaviors cannot be taken into account in the learning data.
Figure 3. Comparison between the estimated ease allowance and its real value
Moreover, we obtain 49 fuzzy rules with validity degree D ≥ ω. Two examples of these rules are:

IF waist girth is big AND waist to hip is small AND comfort value is very small THEN ease allowance is 1.5 (D = 0.92)
IF waist girth is small AND waist to hip is normal AND comfort value is normal THEN ease allowance is 3 (D = 0.91)
5. Conclusion
The proposed method combines the experimental data measured on wearers' body shapes and the evaluators' sensory perception of garment samples in the construction of an FLC for estimating fuzzy ease allowance at the hip position. This method can be easily applied to the other key body positions, such as the waist and the knee. A suitable aggregation of the values of ease allowance at all positions can effectively improve the quality of garment pattern design.
References
1. C.A. Crawford, The Art of Fashion Draping, 2nd Edition, Fairchild Publications, New York (1996).
2. W. Aldrich, Metric Pattern Cutting for Men's Wear, 3rd Edition, Blackwell Science Ltd, Cambridge (1997).
3. R. Ng, Computer Modeling for Garment Pattern Design, Ph.D. Thesis, The Hong Kong Polytechnic University (1998).
4. T. Takagi and M. Sugeno, Fuzzy identification of systems and its application to modelling control, IEEE Trans. on Systems, Man and Cybernetics, 15, pp. 116-132 (1985).
5. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press (1981).
6. P.T. Chan and A.B. Rad, Antecedent validity adaptation principle for fuzzy systems tuning, Fuzzy Sets and Systems, 131, pp. 153-163 (2002).
MULTI-ATTRIBUTE COMPARISON OF QUALITY CONSULTANTS IN TURKEY USING FUZZY AHP
UFUK CEBECI
Department of Industrial Engineering, Istanbul Technical University, Macka, Istanbul 34361, Turkey
Quality consultants are used in ISO 9000 implementation projects, especially by small and medium-sized enterprises. Clients do not always appreciate differences between quality consultants. The aim of this paper is to provide an analytical tool to select the best quality consultant providing the most customer satisfaction. The clients of three Turkish quality consultancy firms were interviewed, and the most important criteria taken into account by the clients while they were selecting their consultancy firms were determined by a designed questionnaire. The fuzzy analytic hierarchy process was used to compare these consultancy firms. The means of the triangular fuzzy numbers produced by the customers and experts for each comparison were successfully used in the pairwise comparison matrices.
1. Introduction
Total quality management (TQM) and ISO 9000 are the most popular topics in the quality consultancy area. A quality consultant might cause too much documentation and unnecessary costs, with no demonstrable productivity gain. Quality consultants are used in ISO 9000 implementation projects, especially by small and medium-sized enterprises. Clients do not always appreciate differences between quality consultants. Taylor [1] stated that 87 per cent of the respondents in his survey used consultants for their ISO 9000 projects. The aim of this paper is to provide an analytical tool to select the best quality consultant providing the most customer satisfaction. Cebeci and Kahraman [2] compare some catering firms using four attributes and fuzzy AHP. Beattie [3] studied the benefits of ISO 9000 implementation in Australian companies. Beskese et al. [4] investigate the current situation of the implementation of TQM and ISO 9000 among Turkish companies. Badri [5] identifies five sets of quality measures by using the results of previous studies of service quality attributes. These indicators or measures are then accurately and consistently weighted through the analytic hierarchy process (AHP). The paper proposes a decision aid that allows the weighting (prioritizing) of a firm's unique service quality measures, considers real-world resource limitations (i.e., budget, hours, labor, etc.), and selects the optimal set of service quality control instruments. The paper addresses two
important issues: how to incorporate and decide upon quality control measures in a service industry, and how to incorporate the AHP into the model. The AHP is one of the most extensively used multi-criteria decision-making methods. One of the main advantages of this method is the relative ease with which it handles multiple criteria. In addition, AHP is easy to understand, and it can effectively handle both qualitative and quantitative data. The organization of this paper is as follows. First, the criteria used by the customers when selecting their quality consultancy firms are explained. Then fuzzy sets and fuzzy numbers are introduced, because our comparison method, fuzzy AHP, involves fuzzy numbers and their fuzzy algebraic operations. Finally, the literature review regarding fuzzy AHP and the formulation of fuzzy AHP are given. A comparison among three Turkish quality consultancy firms is made by using fuzzy AHP.
2. Quality consultancy in Turkey and selection criteria of customers
The total number of quality consultancy firms in Turkey is estimated at 2,000. Three consultancy firms, Verim (V), Degisim (D), and Focus (F), were compared to select the best consultancy firm among the three. A questionnaire was given to the customers of these consultancy firms: 16 customers of Verim, 12 customers of Degisim, and 10 customers of Focus replied to the questionnaires. The customers were asked what their attributes were when selecting their consultancy firms, and what preferences they had when making pairwise comparisons among these attributes. The attributes were determined as:
A. Ability to perform the promised service
B. Fulfillment of commitments within the established time limits
C. Co-ordination between different company departments
D. Ability to inspire trust and confidence
E. Willingness to help clients
F. Enthusiasm and involvement in the project
G. Communication and interpersonal skills of the consultant
H. Professional knowledge
I. Expertise of consultants in the client's sector
J. Problem solving ability
K. To keep the customer informed
Some questionnaires aiming at determining the degrees of preference with the help of pairwise comparisons among the attributes were prepared. The questionnaires facilitate the answering of pairwise comparison questions. The five experts (members of the Management Consultancy Association of Turkey) compared the three firms with respect to each sub-attribute.
The meanings of the attributes were explained in detail to both the customers of the consultancy firms and the five experts, so that everyone would understand the same thing when reading the questionnaire.

3. Fuzzy sets and fuzzy numbers
To deal with the vagueness of human thought, Zadeh [6] first introduced fuzzy set theory, which was oriented to the rationality of uncertainty due to imprecision or vagueness. A major contribution of fuzzy set theory is its capability of representing vague data. The theory also allows mathematical operators and programming to be applied to the fuzzy domain. A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function, which assigns to each object a grade of membership ranging between zero and one. Zimmermann [7] gives the algebraic operations with Triangular Fuzzy Numbers (TFNs). Many ranking methods for fuzzy numbers have been developed in the literature; they do not necessarily give the same rank. The algebraic operations with fuzzy numbers can be found in Kahraman [8] and Kahraman et al. [9].
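As a minimal sketch of these operations (the (l, m, u) triple representation is standard; the product and reciprocal are the usual approximations for positive TFNs, and the function names are ours):

    def tfn_add(a, b):
        """Extended addition of TFNs represented as (l, m, u)."""
        return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

    def tfn_mul(a, b):
        """Approximate extended product of positive TFNs."""
        return (a[0] * b[0], a[1] * b[1], a[2] * b[2])

    def tfn_inv(a):
        """Approximate reciprocal of a positive TFN."""
        return (1.0 / a[2], 1.0 / a[1], 1.0 / a[0])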
4. Multi Criteria Comparison of Consultancy Firms in Turkey Using Fuzzy AHP
The outlines of the extent analysis method on fuzzy AHP and related references can be found in Kahraman et al. [10]. A big Turkish chemical company wants to contract with a consultancy firm. The alternative Turkish consultancy firms are Verim (V), Degisim (D), and Focus (F). The goal is to select the best consultancy firm among the alternatives. The decision-making group consists of three members. The matrix of paired comparisons for the attributes is given in Table 1.
Table 1. The matrix of paired comparisons for attributes
The matrix of paired comparisons for alternatives is given in Table 2.
Table 2. The matrix of paired comparisons for alternatives. Columns give the normalized comparison values for Verim, Degisim and Focus; the last two columns give the row total and the row average, the latter being the priority weight.

Ability to perform the promised service (A)
         Verim   Degisim  Focus    Total   Average
Verim    0.111   0.201    0.048    0.360   0.120
Degisim  0.333   0.602    0.714    1.650   0.550
Focus    0.556   0.199    0.238    0.992   0.331
Total    1.000   1.002    1.000    3.002   1.000

Fulfillment of commitments within the established time limits (B)
Verim    0.091   0.067    0.143    0.301   0.100
Degisim  0.636   0.467    0.429    1.533   0.511
Focus    0.273   0.467    0.429    1.169   0.390
Total    1.000   1.001    1.001    3.003   1.001

Co-ordination between different company departments (C)
Verim    0.654   0.693    0.556    1.902   0.634
Degisim  0.216   0.231    0.333    0.780   0.260
Focus    0.131   0.076    0.111    0.318   0.106
Total    1.000   1.000    1.000    3.000   1.000

Ability to inspire trust and confidence (D)
Verim    0.200   0.199    0.200    0.599   0.200
Degisim  0.600   0.602    0.600    1.802   0.601
Focus    0.200   0.199    0.200    0.599   0.200
Total    1.000   1.000    1.000    3.000   1.000

Willingness to help clients (E)
Verim    0.429   0.273    0.467    1.169   0.390
Degisim  0.142   0.091    0.065    0.298   0.099
Focus    0.429   0.636    0.467    1.533   0.511
Total    1.000   1.000    1.000    3.000   1.000

Enthusiasm and involvement in the project (F)
Verim    0.122   0.095    0.556    0.773   0.258
Degisim  0.854   0.680    0.333    1.867   0.622
Focus    0.024   0.224    0.111    0.360   0.120
Total    1.000   1.000    1.000    3.000   1.000

Communication and interpersonal skills of the consultant (G)
Verim    0.429   0.455    0.333    1.217   0.406
Degisim  0.429   0.455    0.556    1.439   0.480
Focus    0.142   0.091    0.111    0.344   0.115
Total    1.000   1.000    1.000    3.000   1.000

Professional knowledge (H)
Verim    0.091   0.104    0.052    0.248   0.083
Degisim  0.636   0.746    0.790    2.173   0.724
Focus    0.273   0.149    0.158    0.580   0.193
Total    1.000   1.000    1.000    3.000   1.000

Expertise of consultants in the client's sector (I)
Verim    0.654   0.556    0.693    1.902   0.634
Degisim  0.131   0.111    0.076    0.318   0.106
Focus    0.216   0.333    0.231    0.780   0.260
Total    1.000   1.000    1.000    3.000   1.000

Problem solving ability (J)
Verim    0.091   0.104    0.052    0.248   0.083
Degisim  0.636   0.746    0.790    2.173   0.724
Focus    0.273   0.149    0.158    0.580   0.193
Total    1.000   1.000    1.000    3.000   1.000

To keep the customer informed (K)
Verim    0.680   0.778    0.600    2.058   0.686
Degisim  0.095   0.111    0.200    0.406   0.135
Focus    0.224   0.111    0.200    0.536   0.179
Total    1.000   1.000    1.000    3.000   1.000
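The Average column in each panel of Table 2 follows the standard AHP approximation: normalize each column of the pairwise comparison matrix so that it sums to one, then average across each row. The following minimal Python sketch reproduces the weights for attribute A from the normalized columns listed above; the array layout and variable names are illustrative, not from the paper:

```python
import numpy as np

# Column-normalized comparison values for attribute A, taken from Table 2.
norm_a = np.array([[0.111, 0.201, 0.048],   # Verim
                   [0.333, 0.602, 0.714],   # Degisim
                   [0.556, 0.199, 0.238]])  # Focus

weights = norm_a.mean(axis=1)  # row averages -> priority weights
print(weights.round(3))        # [0.12, 0.55, 0.331], matching the table
```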
A summary of the priority weights, labeled as attribute weights, is given in Table 3.
Table 3. Summary of priority weights labeled as attribute weights
5. Conclusion
Decisions are made today in increasingly complex environments. In more and more cases the use of experts in various fields is necessary, different value systems have to be taken into account, and so on. In many such decision-making settings the theory of fuzzy decision-making can be of use, and fuzzy group decision-making can overcome this difficulty. In general, many concepts, tools and techniques of artificial intelligence, in particular in the field of knowledge representation and reasoning, can be used to improve human consistency and the implementability of numerous models and tools in broadly perceived decision-making and operations research. In this paper, consultancy firms were compared using fuzzy AHP. Humans are often uncertain in assigning evaluation scores in crisp AHP; fuzzy AHP can accommodate this uncertainty. There are many other methods that could be used to compare consulting firms, such as the multi-attribute evaluation methods ELECTRE, DEA, and TOPSIS, which have recently been extended for use in a fuzzy environment.
References
1. A.T. Taylor, International Journal of Quality and Reliability Management 12(4), 40 (1995).
2. U. Cebeci, C. Kahraman, Proceedings of ICFSSCIM, 315 (2002).
3. K. R. Beattie, A. S. Sohal, Total Quality Management 10(1), 95 (1999).
4. A. Beskese, U. Cebeci, The TQM Magazine 13(1), 69 (2001).
5. M.A. Badri, International Journal of Production Economics 72, 27 (2001).
6. L. Zadeh, Information and Control 8, 338 (1965).
7. H. J. Zimmermann, Fuzzy Sets and Its Applications, Kluwer, 1994.
8. C. Kahraman, in: Da Ruan, Janusz Kacprzyk, Mario Fedrizzi (Eds.), Soft Computing for Risk Evaluation and Management: Applications in Technology, Environment and Finance, Physica-Verlag, 375 (2001).
9. C. Kahraman, D. Ruan, E. Tolga, Information Sciences 42(1-4), 57 (2002).
10. C. Kahraman, U. Cebeci, International Journal of Production Economics 87(2), 171 (2004).
NARMAX-MODEL-BASED TIME SERIES PREDICTION: FEEDFORWARD AND RECURRENT FUZZY NEURAL NETWORK APPROACHES
Y. GAO, M.J. ER AND J. DU School of Electrical and Electronic Engineering Nanyang Technological University, Singapore Email: [email protected]
The nonlinear autoregressive moving average with exogenous inputs (NARMAX) model provides a powerful representation for time series analysis, modeling and prediction due to its capability of accommodating the dynamic, complex and nonlinear nature of real-world time series prediction problems. This paper focuses on the modeling and prediction of NARMAX-model-based time series using the fuzzy neural network (FNN) methodology. Both feedforward and recurrent FNN approaches are proposed. Experiments and comparative studies demonstrate that the proposed FNN approaches can effectively learn complex temporal sequences in an adaptive way.
1. Introduction
Time series prediction is an important practical problem with a variety of applications in business and economic planning, inventory and production control, weather forecasting, signal processing, and many other fields. In the last decade, neural networks (NNs) have been extensively applied to nonlinear time series tasks. This is due to their capability of handling nonlinear functional dependencies between past time series values and estimates of the values to be forecast. More recently, fuzzy logic has been incorporated with neural models for time series prediction [1,2,3,4]. These approaches are generally known as fuzzy neural networks (FNNs) or NN-based fuzzy inference systems (FISs). FNNs possess both the advantages of FISs, such as human-like thinking and ease of incorporating expert knowledge, and of NNs, such as learning abilities, optimization abilities and connectionist structures. In this paper, NARMAX time series models are investigated using FNN approaches in both feedforward and recurrent model representations. A sequential and hybrid (supervised/unsupervised) learning algorithm, namely the generalized fuzzy neural network (G-FNN) learning algorithm, is employed to form the FNN prediction model. Various comparative studies show that the proposed approaches are superior to many existing methods.
2. NARMAX Model and Optimal Predictors
2.1. General NARMAX(n_y, n_e, n_x) Model
The statistical approach for forecasting begins with the selection of mathematical models to predict the value of an observation y_t using previous observations. A very general class of such models is the nonlinear autoregressive moving average with exogenous inputs (NARMAX) models, given by

y_t = F[y_{t-1}, ..., y_{t-n_y}, e_{t-1}, ..., e_{t-n_e}, x_{t-1}, ..., x_{t-n_x}] + e_t    (1)
where y, e and x are the output, noise and external input of the system model respectively, n_y, n_e and n_x are the maximum lags in the output, noise and input respectively, and F is an unknown smooth nonlinear function.
2.2. Optimal Predictors
Optimum prediction theory revolves around minimizing the mean squared error (MSE). Given the infinite past, and provided the conditional mean exists, the optimal predictor ŷ_t is the conditional mean E[y_t | y_{t-1}, y_{t-2}, ...]. Assuming that e_t in (1) is zero-mean, independent and identically distributed, independent of past y and x, and has finite variance σ², the optimal predictor for the NARX or NARMAX model can be approximated as

ŷ_t = E[y_t | y_{t-1}, ..., y_{t-n_y}]
    = F[y_{t-1}, ..., y_{t-n_y}, x_{t-1}, ..., x_{t-n_x}]    (NARX model)    (2)

or

ŷ_t = F[y_{t-1}, ..., y_{t-n_y}, ê_{t-1}, ..., ê_{t-n_e}, x_{t-1}, ..., x_{t-n_x}]    (NARMAX model)    (3)

where ê = y - ŷ, and the optimal predictors (2) and (3) have MSE σ².
3. Feedforward and Recurrent G-FNN Predictors
3.1. FNN Predictors
In this paper, FNN predictors are proposed to emulate the optimal predictors in (2) and (3). In other words, the FNN is used to approximate the function F. The functionality of the G-FNN is given by

ŷ_t = F_FNN[z_t] = Σ_{j=1}^{n_r} φ_j(z) w_j = Σ_{j=1}^{n_r} exp[-(z - c_j)^T Σ_j (z - c_j)] w_j    (4)

where z = [z_1 ... z_{n_i}]^T is the input vector, c_j = [c_{1j} ... c_{n_i j}]^T and Σ_j = diag(1/σ_{1j}², ..., 1/σ_{n_i j}²) are the center vector and the width matrix of the Gaussian membership function φ_j respectively, w_j = k_{0j} + k_{1j} z_1 + ... + k_{n_i j} z_{n_i} is the TSK-type weight, the k_{ij} are real-valued parameters, and n_i and n_r are the numbers of inputs and rules in the FNN respectively.
Eq. (4) can be represented in matrix form as

ŷ_t = Θ_t^T w_t    (5)

where the regression vector is Θ = [φ_1  φ_1 z_1 ... φ_1 z_{n_i}  ...  φ_{n_r}  φ_{n_r} z_1 ... φ_{n_r} z_{n_i}]^T and the weight vector is w = [k_{01} k_{11} ... k_{n_i 1}  ...  k_{0 n_r} k_{1 n_r} ... k_{n_i n_r}]^T.
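To make Eqs. (4)-(5) concrete, the following Python sketch evaluates a G-FNN output for one input vector. It is an illustrative implementation under the notation above, not code from the paper; the function name and array shapes are assumptions:

```python
import numpy as np

def gfnn_output(z, centers, sigmas, k):
    """Evaluate Eq. (4): y_hat = sum_j phi_j(z) * w_j, with Gaussian firing
    strengths phi_j(z) = exp(-sum_i (z_i - c_ij)^2 / sigma_ij^2) and TSK-type
    weights w_j = k_0j + k_1j z_1 + ... + k_nij z_ni.
    z: (n_i,), centers/sigmas: (n_r, n_i), k: (n_r, n_i + 1)."""
    phi = np.exp(-(((z - centers) / sigmas) ** 2).sum(axis=1))
    w = k[:, 0] + k[:, 1:] @ z
    return float(phi @ w)
```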
With a proper choice of the input vector z, G-FNNs can be used to emulate NARMAX models for time series prediction. A G-FNN is a nonlinear approximation to the function F that is equivalent to the optimal predictors in (2) and (3), as follows:

(i) Feedforward G-FNN (NARX):
ŷ_t = F_FNN{[y_{t-1} ... y_{t-n_y}  x_{t-1} ... x_{t-n_x}]^T}    (6)

(ii) Recurrent G-FNN I (NARX):
ŷ_t = F_FNN{[ŷ_{t-1} ... ŷ_{t-n_y}  x_{t-1} ... x_{t-n_x}]^T}    (7)

(iii) Recurrent G-FNN II (NARMAX):
ŷ_t = F_FNN{[y_{t-1} ... y_{t-n_y}  ê_{t-1} ... ê_{t-n_e}  x_{t-1} ... x_{t-n_x}]^T}    (8)
3.2. G-FNN Learning Algorithm
In this paper, the G-FNN learning algorithm is used to establish the FNN prediction model from prior time series values. It provides an efficient way of constructing the prediction model online, combining structure and parameter learning simultaneously. Structure learning includes determining the proper number of rules n_r. Parameter learning corresponds to premise and consequent parameter learning of the FNN: premise parameters comprise the membership function parameters c_j and Σ_j, and the consequent parameter is the weight w_j of the FNN (refer to the Appendices).
4. Experiments
4.1. NARX(2,1) Dynamic Process
The following dynamic NARX(2,1) process is generated:

y_t = y_{t-1} y_{t-2} (y_{t-1} + 2.5) / (1 + y_{t-1}² + y_{t-2}²) + x_{t-1} + e_t    (9)
where the input has the form x_t = sin(2πt/25). The FNN prediction models are estimated using the first 200 observations of the time series generated from (9). The G-FNN predictors are then tested on the following 200 observations from the model. To demonstrate the superior performance of the G-FNN predictors, comparisons with three other fuzzy neural methods with dynamic topology, i.e. RBF-AFS [1], FNNS [2] and D-FNN [4], are shown in Table 1. In this comparison, noise-free signals and fitting models with n_y = 2 and n_x = 1 are used for all the methods. It can be seen that the G-FNN provides better generalization as well as greater parsimony. The proposed G-FNN predictors are fast in learning speed because no iterative learning loops are needed. Generating fewer rules and parameters leads to better computational efficiency of the G-FNN learning algorithm.

Table 1. Performance comparison on the NARX(2,1) process.
Model                n_r   n_p   Testing MSE   Training Method
RBF-AFS              35    280   1.9e-2        iterative loops
FNNS                 22    84    5.0e-3        300 iterative loops
D-FNN                6     60    8.0e-4        one pass
Feedforward G-FNN    5     48    1.1e-4        one pass
Recurrent G-FNN I    5     46    3.2e-4        one pass
Recurrent G-FNN II   4     52    4.5e-4        one pass
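For reproducibility, here is a minimal sketch generating the benchmark series of Eq. (9); the noise option, seed and function name are assumptions (the comparison above uses noise-free signals):

```python
import numpy as np

def narx_series(n=400, noise_std=0.0, seed=0):
    """Simulate Eq. (9) with input x_t = sin(2*pi*t/25)."""
    rng = np.random.default_rng(seed)
    x = np.sin(2 * np.pi * np.arange(n) / 25)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = (y[t-1] * y[t-2] * (y[t-1] + 2.5)
                / (1.0 + y[t-1]**2 + y[t-2]**2)
                + x[t-1] + rng.normal(0.0, noise_std))
    return x, y

x, y = narx_series()  # first 200 samples for training, next 200 for testing
```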
4.2. Chaotic Mackey-Glass Time Series
The chaotic Mackey-Glass time series is a benchmark problem that has been considered by a number of researchers [1,3]. The time series is generated from the following equation:

dy(t)/dt = b y(t-τ) / (1 + y^10(t-τ)) - a y(t)    (10)

where τ > 17 gives chaotic behavior; higher values of τ yield higher-dimensional chaos. For ease of comparison, the parameters are selected as a = 0.1, b = 0.2 and τ = 17. The fitting model of (10) is chosen to be

y_t = F[y_{t-p}, y_{t-p-Δt}, y_{t-p-2Δt}, y_{t-p-3Δt}]    (11)
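A sketch of the series generation under Eq. (10); the Euler integration scheme and unit step size are assumptions, since the paper only states the parameter values and initial conditions:

```python
import numpy as np

def mackey_glass(n=1200, a=0.1, b=0.2, tau=17, y0=1.2):
    """Integrate Eq. (10) with a simple Euler scheme (dt = 1)."""
    y = np.zeros(n)
    y[0] = y0
    for t in range(n - 1):
        y_lag = y[t - tau] if t >= tau else 0.0   # y(t) = 0 for t < 0
        y[t + 1] = y[t] + b * y_lag / (1.0 + y_lag**10) - a * y[t]
    return y
```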
For simulation purposes, it is assumed that y_t = 0 for t < 0 and y_0 = 1.2. In this experiment, the following values are chosen: p = Δt = 6 and 118 ≤ t ≤ 1140. The first 500 input/output data pairs generated from (10) are used for training the FNN, while the following 500 data pairs are used for validating the identified model. Using the G-FNN learning algorithm, a total of 8 fuzzy rules are generated for the G-FNN predictor during training. Figure 1 shows that the actual and predicted values are essentially the same; their differences can only be seen on a finer scale.
Figure 1. Prediction results: (a) Mackey-Glass time series from t = 118 to 1140 and six-step ahead prediction; (b) Prediction error.
5. Conclusions
In this paper, one feedforward and two recurrent G-FNN predictors are proposed, tested and compared. The proposed G-FNN predictors provide a sequential and hybrid learning method for model structure determination and parameter identification, which greatly improves predictive
performance. Experiments and comparative studies demonstrate superior performance of the proposed approaches. Appendix A.
Two Criteria of Rule Generation
For each training data pair [z_t, y_t], t = 1 ... n_d, where y_t is the desired output (supervised teaching signal) and n_d is the total number of training data, the system error is defined as e_t = ||y_t - ŷ_t||. If e_t is bigger than a designed threshold K_e, a new fuzzy rule should be considered. At sample time t, the regularized Mahalanobis distance is calculated as md_j = sqrt([z_t - c_j]^T Σ_j [z_t - c_j]), j = 1 ... n_r. The accommodation factor is defined as d_t = min_j md_j. If d_t is bigger than K_d = sqrt(ln(1/ε)), a new rule should be considered, because the existing fuzzy system does not satisfy ε-completeness [5]. Otherwise, the new input data can be represented by the nearest existing rule.

Pruning of Rules
The Error Reduction Ratio (ERR) concept is adopted here for rule pruning. At sample time t, we have from (5) y = Θw + e, where y = [y_1 y_2 ... y_t]^T ∈ R^t is the teaching signal, w ∈ R^ν is the real-valued weight vector, Θ = [θ_1 ... θ_ν] ∈ R^{t×ν} is the regressor, transformed as Θ = QR by QR decomposition, e = [e_1 e_2 ... e_t]^T ∈ R^t is the system error vector, assumed to be uncorrelated with the regressor Θ, and ν = n_r(n_i + 1). The ERR due to q_i can be defined as err_i = (q_i^T y)² / (q_i^T q_i · y^T y). The total ERR Terr_j, j = 1 ... n_r, corresponding to the j-th rule is defined as Terr_j = sqrt(err_j^T err_j / (n_i + 1)), where err_j collects the ERRs of the (n_i + 1) regressor terms of the j-th rule. If Terr_j is smaller than a designed threshold 0 < K_err < 1, the j-th fuzzy rule should be deleted, and vice versa.

Determination of Premise Parameters
Premise parameters, i.e. the Gaussian membership functions of the FNN, are allocated to satisfy the ε-completeness of the fuzzy rules. In the case e_t > K_e and d_t > K_d, we compute the Euclidean distance ed_{ij} = ||z_i - b_{ij}|| between z_i and each boundary point b_{ij} ∈ {c_{i1}, c_{i2}, ..., c_{iN_i}, z_{i,min}, z_{i,max}}. Next, we find j_n = arg min_j ed_{ij}. If ed_{ij_n} is less than a threshold K_mf (a dissimilarity ratio of neighboring membership functions), we choose

c_{i(n_r+1)} = b_{ij_n},    σ_{i(n_r+1)} = σ_{ij_n}    (A.1)

Otherwise, we choose c_{i(n_r+1)} = z_i and

σ_{i(n_r+1)} = max(|c_{i(n_r+1)} - c_{i(n_r+1)}^-|, |c_{i(n_r+1)} - c_{i(n_r+1)}^+|) / sqrt(ln(1/ε))    (A.3)

where c^- and c^+ denote the nearest neighboring centers on either side. In the case e_t > K_e but d_t ≤ K_d, the ellipsoidal field needs to be decreased to obtain a better local approximation. A simple method to reduce the Gaussian width is

σ_ij^new = K_σ × σ_ij^old    (A.4)

where K_σ is a reduction factor which depends on the sensitivity of the input variables. In the remaining case, the system has good generalization and nothing needs to be done except adjusting the weights.

Determination of Consequent Parameters
TSK-type consequent parameters are determined using the Linear Least Squares (LLS) method as w = Θ† y, where Θ† is the pseudoinverse of Θ.
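Following the ERR reconstruction above, here is a compact sketch of the pruning computation. Names and shapes are assumptions; numpy's reduced QR gives orthonormal columns (so q_i^T q_i = 1), but the general formula is kept:

```python
import numpy as np

def total_err_per_rule(Theta, y, n_i):
    """Compute Terr_j for each rule from y = Theta w + e.
    Theta: (t, n_r*(n_i+1)); consecutive blocks of n_i+1 columns
    belong to the same rule."""
    Q, _ = np.linalg.qr(Theta)                              # Theta = Q R
    err = (Q.T @ y) ** 2 / ((Q ** 2).sum(axis=0) * (y @ y))
    err_rules = err.reshape(-1, n_i + 1)                    # group ERRs by rule
    return np.sqrt((err_rules ** 2).sum(axis=1) / (n_i + 1))

# Rules whose Terr_j falls below the threshold K_err would be deleted.
```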
References
1. K. B. Cho and B. H. Wang, Fuzzy Sets and Systems, 83, 325 (1996).
2. C. T. Chao, Y. J. Chen and C. C. Teng, IEEE Trans. Systems, Man and Cybernetics, 26, 344 (1996).
3. S. Chen, C. F. N. Cowan and P. M. Grant, IEEE Trans. Neural Networks, 2, 302 (1991).
4. S. Wu and M. J. Er, IEEE Trans. Systems, Man and Cybernetics, Part B, 30, 358 (2000).
5. L. X. Wang, A Course in Fuzzy Systems and Control, New Jersey: Prentice Hall (1997).
6. A. Jazwinski, Stochastic Processes and Filtering Theory, New York: Academic Press (1970).
IMPLEMENTATION OF ON-LINE MONITORING PROGRAMS AT NUCLEAR POWER PLANTS J. WESLEY HINES Nuclear Engineering Department, The University of Tennessee, Knoxville, Tennessee 37996-2300 EDDIE DAVIS
Edan Engineering Corporation, 900 Washington St., Suite 830, Vancouver, Washington 98660
The investigation and application of on-line monitoring programs has been ongoing in the U.S. nuclear industry and research community for over two decades. To date, only limited pilot installations have been demonstrated, and the original objectives have changed significantly. Much of the early work centered on safety-critical sensor calibration monitoring and calibration reduction. The current focus is on both sensor and equipment monitoring. This paper presents the major lessons learned that contributed to the lengthy development process, including model development and implementation issues, and the results of a recently completed cost benefit analysis.
1. Introduction and Background
For the past two decades, Nuclear Power Plants (NPPs) have attempted to move towards condition-based maintenance philosophies using new technologies developed to ascertain the condition of plant equipment. Specifically, techniques have been developed to monitor the condition of sensors and their associated instrument chains. Historically, periodic manual calibrations have been used to assure that sensors are operating correctly. This technique is not optimal in that sensor conditions are only checked periodically; faulty sensors can therefore continue to operate for periods up to the calibration frequency. Faulty sensors can cause poor economic performance and unsafe conditions. Periodic techniques also cause the unnecessary calibration of instruments that are not faulted, which can result in damaged equipment, plant downtime, and improper calibration under non-service conditions. Early pioneers in the use of advanced information processing techniques for instrument condition monitoring included researchers at the University of Tennessee (UT) and Argonne National Laboratory (ANL): UT developed neural-network-based systems while ANL developed the Multivariate State Estimation Technique (MSET) [1]. The EPRI Instrument Monitoring and Calibration (IMC) Users Group formed in 2000 with the objective of demonstrating OLM technology in operating nuclear power plants for a variety of systems and applications. The On-Line Monitoring Implementation Users Group formed in mid 2001 to demonstrate OLM in multiple applications at many nuclear power plants, with a four-year time frame. Current U.S. nuclear plant participants include Limerick, Salem, Sequoyah, TMI, VC Summer, and Sizewell B
using a system produced by Expert Microsystems Inc. (expmicrosys.com), and Harris and Palo Verde, which are using a system developed by Smartsignal Inc. (smartsignal.com). Each of these plants is currently using OLM technology to monitor the calibration of process instrumentation. In addition to monitoring implementation, the systems have an inherent dual purpose of monitoring the condition of equipment, which is expected to improve plant performance and reliability. The major European participant in this area is the Halden Research Project where Dr. Paolo Fantoni and his multi-national research team have developed a system termed Plant Evaluation and Analysis by Neural Operators (PEANO) [2] and applied it to the monitoring of nuclear power plant sensors.
2. On-Line Monitoring Techniques
The OLM systems use historical plant data to develop empirical models that capture the relationships between correlated plant variables. These models are then used to verify that the relationships have not changed. A change can occur due to sensor drift, equipment faults, or operational error. Numerous data-based technologies have been used by the major researchers in the field. Three technologies using different data-based prediction methods have emerged and been used in the electric power industry: a kernel-based method (MSET), a neural-network-based method (PEANO [2]), and non-linear partial least squares (NLPLS) [3]. These methods are described and compared in Hines [4]. The major lessons learned in applying empirical modeling strategies are that the methods should produce accurate results, produce repeatable and robust results, have an analytical method to estimate the uncertainty of the predictions, and be easily trained and easily retrained for new or expanded operating conditions.
2.1 Accurate Results
Early applications of autoassociative techniques, such as MSET, were publicized to perform well with virtually no engineering judgment necessary. One item of interest is the choice of inputs for a model. Early application limits were said to be around 100 inputs per model, with no need to choose and subgroup correlated variables. However, experience has shown that models should be constructed with groups of highly correlated sensors, resulting in models commonly containing fewer than 30 signals [5]. It has been shown that adding irrelevant signals to a model increases the prediction variance, while not including a relevant variable biases the estimate [6].
2.2 Repeatable and Robust Results
When empirical modeling techniques are applied to data sets that are collinear (highly correlated), ill-conditioning can result in highly
accurate performance on the training data but highly variable, inaccurate results on unseen data. Robust models perform well on data that have incorrect inputs, as expected in noisy environments or when a sensor input is faulted. Regularization techniques can be applied to make the predictions repeatable, robust, and of lower variability. A summary of the methods is given in Gribok [7], and regularization methods have been applied to many of the systems currently in use.
2.3 Uncertainty Analysis
The most basic requirement outlined in the NRC safety evaluation [8] is an analysis of the uncertainty in the empirical estimates. Argonne National Laboratory has performed Monte Carlo based simulations to estimate the uncertainty of MSET-based estimations [9]. These techniques produce average results for a particular model trained with a particular data set. Researchers at The University of Tennessee have developed analytical techniques to estimate prediction intervals for all of the major techniques (MSET, AANN, PEANO, and NLPLS). The analytical results were verified using Monte Carlo based simulations and provide the desired 95% coverage [6]. Each of the techniques performs well, some better than others, on various data sets.
2.4 Ease of Training and Retraining
As will be shown in Section 3, it is virtually impossible for the original training data to cover the entire range of operation. The operating conditions may change over time, and the models may need to be retrained to incorporate the new data. MSET-based methods are not trained but are non-parametric modeling techniques; new data vectors can simply be added to the prototype data matrix. Artificial Neural Networks require fairly long training times. Other parametric techniques, such as Non-Linear Partial Least Squares, can be trained much faster. Recently, the PEANO system has incorporated an NLPLS algorithm that performs with accuracy equal to that of the original AANN algorithm and can be trained in minutes rather than days [10].
3. OLM Plant Implementation
Several lessons have been learned from EPRI's three years of OLM implementation and installation. The major areas include data acquisition and quality, model development, and results interpretation.
3.1 Data Acquisition and Quality In order to build a robust model for OLM, one must first collect data covering all the operating conditions in which the system is expected to operate and for which signal validation is desired. This data is historical
data that has been collected and stored, and it may not represent the plant state due to several anomalies that commonly occur. These include interpolation errors, random data errors, missing data, loss of significant figures, stuck data, and others. Data should always be visually inspected and corrected or deleted before use.
3.1.1 Interpolation Errors
The first problem usually encountered in using historical data for model training is that it is usually not actual data but data resulting from the compression routines normally implemented in data archival programs. For example, the PI Data Historian from OSI Software creates a data archive that is a time-series database. However, not all of the data is stored at each collection time: only data values that have changed by more than a tolerance are stored, along with their time stamps. This method requires much less storage but results in a loss of data fidelity. When data is extracted from the historian, data values between logged data points are calculated through simple linear interpolation. The resulting data appears to be a saw-tooth time series, and the correlations between sensors may be severely changed. Data collected for model training should be actual data, and tolerances should be set as small as possible or not used.
3.1.2 Data Quality Issues
Several data quality issues are common. These cases include:
- lost or missing data,
- single or multiple outliers in one sensor or several,
- stuck data, in which the data value does not update,
- random data values,
- unreasonable data values,
- loss of significant digits.
Most of these data problems can be visually identified or can be detected by a data clean-up utility. These utilities remove bad data or replace it with the most probable data value using some algorithm. It is most common to delete all bad data observations from the training data set. Most OLM software systems include automated tools for data cleanup; these tools easily identify extreme outlying data but are typically insensitive to data errors that occur within the expected region of operation. The addition of bad data points in a training set can invalidate a model.
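As an illustration of automated screening for the error types listed above, here is a minimal sketch; the thresholds, names and specific checks are illustrative assumptions, not EPRI or vendor tooling:

```python
import numpy as np

def flag_bad_samples(x, stuck_len=30, z_limit=6.0):
    """Flag stuck data (value not updating) and gross outliers in one channel."""
    x = np.asarray(x, dtype=float)
    bad = np.zeros(x.size, dtype=bool)
    # Gross outliers: robust z-score against the median.
    mad = np.median(np.abs(x - np.median(x))) or 1e-9
    bad |= np.abs(x - np.median(x)) / (1.4826 * mad) > z_limit
    # Stuck data: identical value repeated for stuck_len samples.
    run = 0
    for i in range(1, x.size):
        run = run + 1 if x[i] == x[i - 1] else 0
        if run >= stuck_len:
            bad[i - run:i + 1] = True
    return bad
```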
3.2 Model Development
Model development is not just a simple "click and go" exercise, as once claimed. There are several decisions that need to be made, including:
- defining models and selecting relevant inputs,
- selecting relevant operating regions,
- selecting relevant training data.
The model must be trained with data covering all operating regions in which it is expected to operate. These operating regions can vary significantly between nuclear plants, since regions are defined by system structure, sensor values, and operating procedures. One example of a system structure change is the periodic usage of standby pumps or the cycled usage of redundant pumps. A model must be trained for each operating condition for the system to work properly, but excessive training on unusual conditions may degrade the performance on the most common operating conditions. Therefore, some plant line-ups may never be included in the training set. Operating conditions also change due to cyclical changes such as seasonal variations. If a model is trained during mild summers and monitoring then occurs in a hotter summer with higher cooling water temperatures, the model will not perform correctly. In this case, data from the more severe operating conditions must be added to the training data.
3.3 Results Interpretation
Once a model is trained and put into operation, the predictions must be evaluated to determine whether the system is operating correctly, a sensor is drifting, an operating condition has changed, or an equipment failure has occurred. The choice among these can be made using logic, and this logic has been programmed into expert-system-type advisors with some success [11]. The logical rules operate on the residuals, which are the differences between the predictions and the observations. Under normal conditions, the residuals should be small random values. If only one residual grows, the hypothesis is that a sensor has degraded or failed. If several residuals significantly differ from zero, the operating state has probably changed or an equipment failure has occurred. More in-depth knowledge and engineering judgment must be used to ascertain which has occurred; a toy version of this residual logic is sketched below.
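A minimal sketch of the residual logic just described; the channel names, thresholds and return strings are illustrative assumptions:

```python
def diagnose(residuals, thresholds):
    """residuals: {channel: prediction - observation};
    thresholds: {channel: limit}. One large residual suggests a sensor
    fault; several suggest a state change or equipment fault."""
    flagged = [ch for ch, r in residuals.items() if abs(r) > thresholds[ch]]
    if not flagged:
        return "normal"
    if len(flagged) == 1:
        return f"suspected sensor drift or failure: {flagged[0]}"
    return f"probable operating-state change or equipment fault: {flagged}"
```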
4. Conclusions
The development and application of On-Line Monitoring systems has occurred over the past 20 years. During that time, much has been learned about improving the modeling techniques, implementing the systems at plant sites, evaluating the results, and establishing the economic basis for such installations. The original objective of extending Technical Specification sensor calibrations to meet extended fuel cycles has changed to monitoring both safety- and non-safety-related signals, performance, and equipment. A recently completed Cost Benefit Analysis [12] shows a nominal six-year payback for a 600-sensor installation, and shows that the real basis may lie in the more difficult-to-quantify benefits of efficiency improvement and equipment monitoring. As plants fully field these technologies, the efforts and experiences of plant personnel, researchers, and EPRI project managers will prove invaluable.
References
1. Singer, R.M., K.C. Gross, J.P. Herzog, R.W. King and S.W. Wegerich (1996), "Model-Based Nuclear Power Plant Monitoring and Fault Detection: Theoretical Foundations", Proc. 9th Intl. Conf. on Intelligent Systems Applications to Power Systems, Seoul, Korea.
2. Fantoni, P., S. Figedy, A. Racz (1998), "A Neuro-Fuzzy Model Applied to Full Range Signal Validation of PWR Nuclear Power Plant Data", FLINS-98, Antwerpen, Belgium.
3. Rasmussen, B., J.W. Hines, and R.E. Uhrig (2000), "Nonlinear Partial Least Squares Modeling for Instrument Surveillance and Calibration Verification", Proc. Maintenance and Reliability Conference, Knoxville, TN.
4. Hines, J.W. and B. Rasmussen (2000), "On-Line Sensor Calibration Verification: A Survey", 14th International Congress and Exhibition on Condition Monitoring and Diagnostic Engineering Management, Manchester, England, September 2000.
5. EPRI (2002), Plant Systems Modeling Guidelines to Implement On-Line Monitoring, EPRI, Palo Alto, CA: 1003661.
6. Rasmussen, B. (2003), "Prediction Interval Estimation Techniques for Empirical Modeling Strategies and their Applications to Signal Validation Tasks", Ph.D. dissertation, Nuclear Engineering Department, The University of Tennessee, Knoxville.
7. Gribok, A.V., J.W. Hines, A. Urmanov, and R.E. Uhrig, "Regularization of Ill-Posed Surveillance and Diagnostic Measurements", Power Plant Surveillance and Diagnostics, eds. Da Ruan and P. Fantoni, Springer, 2002.
8. NRC Letter dated June 22, 2000, Safety Evaluation related to Topical Report (TR) 104965 "On-Line Monitoring of Instrument Channel Performance".
9. Zavaljevski, N., A. Miron, C. Yu, and E. Davis (2003), "Uncertainty Analysis for the Multivariate State Estimation Technique (MSET) Based on Latin Hypercube Sampling and Wavelet De-Noising", Transactions of the American Nuclear Society, New Orleans, LA, November 16-20.
10. Fantoni, Paolo F., Mario Hoffmann, Brandon Rasmussen, Wesley Hines, and Andreas Kirschner (2002), "The use of non linear partial least square methods for on-line process monitoring as an alternative to artificial neural networks", 5th International Conference on Fuzzy Logic and Intelligent Technologies in Nuclear Science (FLINS), Gent, Belgium, Sept. 16-18.
11. Wegerich, S., R. Singer, J. Herzog, and A. Wilks (2001), "Challenges Facing Equipment Condition Monitoring Systems", Proc. Maintenance and Reliability Conference, Gatlinburg, TN.
12. EPRI (2003), On-Line Monitoring Cost Benefit Guide, Final Report, EPRI, Palo Alto, CA: 1006777.
PREDICTION INTERVAL ESTIMATION TECHNIQUES FOR EMPIRICAL MODELING STRATEGIES AND THEIR APPLICATIONS TO SIGNAL VALIDATION TASKS
BRANDON RASMUSSEN AND J. WESLEY HINES
The University of Tennessee, Knoxville, TN 37771
Empirical modeling techniques have been applied to on-line process monitoring to detect equipment and instrumentation degradations. However, few applications provide prediction uncertainty estimates, which provide a measure of confidence in the resulting decisions. This paper presents the development of analytical prediction interval estimation methods for three common non-linear empirical modeling strategies: artificial neural networks (ANN), neural network partial least squares (NNPLS), and local polynomial regression (LPR). The techniques are applied to nuclear power plant operational data for sensor calibration monitoring and verified via bootstrap simulation studies.
1. Introduction
Empirical modeling techniques are being used for on-line monitoring of process equipment and instrumentation in the nuclear power industry [1]. The original objective was to reduce the calibration interval of safety-critical sensors by moving to a condition-based approach through determining their status. As stated in EPRI's On-line Monitoring of Instrument Channel Performance [2]: "On-line monitoring is the assessment of channel performance and calibration while the channel is operating". The modeling strategies applied to signal validation tasks in this work are artificial neural networks (ANN), neural network partial least squares (NNPLS), and local polynomial regression (LPR). These three modeling paradigms have been the most commonly reported for applications to signal validation tasks in on-line monitoring. The focus of this work is to provide point-wise prediction intervals that contain the measured responses at a specified significance level, namely 95%. One of the functions of an on-line monitoring system is to report when an empirical model's estimations significantly deviate from the measured values of the monitored process. While the ability to detect these significant deviations has been proven, the quantification of the uncertainty associated with the empirical model estimates is rarely addressed. To verify that the observed deviations are significant, in that they exceed all observed effects of modeling uncertainty, prediction interval estimation techniques need to be developed, proven, and incorporated into existing and future software for on-line monitoring applications.
2. Empirical Modeling Techniques This section provides a brief overview of the three empirical modeling systems under investigation. The three systems were selected for study because they are currently implemented in on-line power plant monitoring systems.
2.1 Artificial Neural Networks
Artificial neural network (ANN) models, inspired by biological neurons, contain layers of simple computing nodes that operate as nonlinear summing devices. These nodes are highly interconnected with weighted connection lines, and these weights are adjusted when training data are presented to the ANN during the training process. The training process often identifies underlying relationships between environmental variables that were previously unknown. Successfully trained ANNs can perform a variety of tasks, the most common of which are prediction of an output value, classification, function approximation, and pattern recognition. Neural networks have been applied to signal validation in the power industry [3,4].
2.2 Neural Network Partial Least Squares
First introduced by H. Wold in the field of econometrics [5], PLS has become an important technique in many areas, including psychology, economics, chemical engineering, medicine, pharmaceutical science, and process modeling. In attempts to enhance the technique by providing non-linear modeling capabilities, single-hidden-layer feedforward neural networks (NNs) have been applied in the field of chemical engineering [6]. These methods have been under study at the University of Tennessee, for the purposes of signal validation in large-scale processes, since late 1999 [7]. This method will be referred to as neural network partial least squares (NNPLS). An NNPLS signal validation system has been implemented, on a trial basis, at the 9th unit of Tennessee Valley Authority's Kingston fossil plant in Harriman, Tennessee, USA [8, 9].
2.3 Local Polynomial Regression
Local polynomial regression (LPR) models are often referred to as lazy learning methods. Lazy learning comprises a set of methods in which data processing is deferred until a prediction at a query point needs to be made. These methods are also referred to as memory-based methods, due to the approach of storing the training data and recalling relevant training data when a query is made. A good review of lazy learning methods, focusing on locally weighted regression, is presented by Atkeson et al. [10]. A training data set is comprised of a set of input vectors and a corresponding set of output values. A query point is an input vector for which an output is to be determined. Relevant data is identified by the use of a distance function, where maximum relevance occurs when a query point matches a point in the
training set; relevance diminishes from this maximum as the distance between the query point and the training points increases. Nonparametric regression using data in a neighborhood of the present query point is generally referred to as a local model. The neighborhood size is controlled by a bandwidth parameter, which is implemented as a kernel width. Local models attempt to fit the data in a region surrounding the query point with a d-th degree polynomial. This paper presents results using local linear regression (LLR), in which d = 1.
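A compact sketch of kernel-weighted local linear regression as described above; the Gaussian kernel choice, names and shapes are assumptions, since the paper does not specify its kernel form:

```python
import numpy as np

def llr_predict(X, y, x_query, bandwidth):
    """Local linear regression (d = 1): weight the training points by
    distance to the query, fit a weighted linear model, and evaluate it
    at the query point. X: (n, p), y: (n,), x_query: (p,)."""
    dist = np.linalg.norm(X - x_query, axis=1)
    w = np.exp(-(dist / bandwidth) ** 2)          # kernel weights
    A = np.hstack([np.ones((len(X), 1)), X])      # intercept + linear terms
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return float(np.concatenate(([1.0], x_query)) @ beta)
```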
3. Uncertainty and its Estimation
This section presents a brief discussion of empirical uncertainty and its sources. Next, it provides a brief description of analytical methods for prediction interval estimation.
3.1 Sources of Uncertainty
There are several sources of prediction uncertainty related to the modeling process. They include the selected predictor variables, the amount and selection of data used to develop the model, the model structure including complexity, and the noise in the predictors and response. Uncertainty can be divided into bias and variance. The selection of a training set is prone to sampling variation because there is variability in the random sampling from the entire population of data. Since each possible training set will produce a different model, there is a distribution of predictions for a given observation. The issues relative to fluctuations in the response for a given observation are reduced as the training data set size increases, i.e. more training data lowers the uncertainty of the estimates, assuming the data are representative of the process. Model misspecification occurs when a given model is incorrect, and a bias is introduced due to the improper model, e.g. fitting non-linear data to a linear model will result in a biased model. Model misspecification may occur for the ANN models and the NNPLS models, though given the proper number of free parameters both techniques are proven to perform adequately, i.e. with minimal bias. Misspecification can also occur for a local polynomial model through the combined selection of the bandwidth parameter and the degree of the locally fitted polynomial. Model misspecification is made worse by an improper choice of model complexity. Incorrect model complexity increases model uncertainty. A model without the required flexibility will bias the solution while an overly complex model tends to fit the noise in the training data and has an increased variance. For an ANN, the complexity is determined by the number of hidden neurons; for LPR, the complexity is controlled by the polynomial order and the bandwidth; and for a NLPLS model, the complexity is determined by the number of latent variables included in the model. The selected set of predictor variables influences the model uncertainty. If the predictor variable set does not contain the necessary information to
accurately model the desired response, a bias results. If the predictor variable set contains variables that are unrelated to the desired response, an increased solution variance results. Lastly, noise in the input and output data is a potential source of uncertainty. Each of the analytical approaches to prediction interval estimation presented herein considers only the noise in the dependent, or response, variable. Alternate theories based on the errors-in-variables model are available for including the noise in the predictor variables when developing prediction intervals; however, they require knowledge of the noise level present, which is generally unknown.
3.2 Prediction Interval Estimation for Empirical Techniques
The derivation of prediction intervals for these three techniques is too involved for this paper but can be found in Rasmussen [11]. The methods for NNPLS and ANN follow Chryssolouris et al. [12] and result in prediction intervals of the form

ŷ_0 ± t_{n-p}^{α/2} · s · sqrt(1 + f_0^T (F^T F)^{-1} f_0)

where F is the Jacobian matrix computed using the training data, f_0 is the Jacobian computed for a new observation x_0, s is the estimated noise standard deviation, and the interval is centered on the corresponding prediction ŷ_0. Prediction intervals for the non-parametric techniques follow the bias-variance decomposition of the MSE and can also be found in Rasmussen [11].
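A sketch of the interval computation under the formula above; the function name and arguments are assumptions, and `scipy.stats.t.ppf` supplies the t-quantile:

```python
import numpy as np
from scipy import stats

def pi_half_width(F, f0, residuals, n_params, alpha=0.05):
    """Half-width of y0_hat +/- t * s * sqrt(1 + f0'(F'F)^-1 f0)."""
    n = F.shape[0]
    s = np.sqrt(residuals @ residuals / (n - n_params))  # noise std estimate
    core = f0 @ np.linalg.solve(F.T @ F, f0)
    t = stats.t.ppf(1.0 - alpha / 2.0, df=n - n_params)
    return t * s * np.sqrt(1.0 + core)
```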
4. Results
The prediction interval estimation techniques were applied to several data sets. This paper presents the results for prediction of a nuclear power plant's first-stage turbine pressure. The turbine pressure data set contains 5 predictor variables: 3 steam generator steam pressures, a turbine first-stage pressure, and the unit gross generation. The response variable is also a turbine first-stage pressure channel, though not the same channel as the one included in the predictor variable set. The data was provided by the Electric Power Research Institute (EPRI) and is from an operating U.S. nuclear power plant, sampled at one-minute intervals. The training data set contains good data, and the test data set contains a known drifting sensor. The data was standardized to zero mean and unit variance for model training and testing. For each of the modeling techniques, several architectures were developed and analyzed; the architecture with the best accuracy also had the smallest prediction interval. The number of latent factors was varied for the NNPLS model, the number of hidden neurons for the ANN model, and the kernel width for the LLR model.
4.1 Neural Network Partial Least Squares Results
An NNPLS model with two latent factors was found to be optimal. The prediction results on the test data set are shown in Figure 1 below. This figure shows the prediction for each sample (plotted as a circle) along with its associated prediction interval. The prediction intervals contain the measured values until sample number 500, when the sensor drifts to a value lower than those predicted. When the prediction interval contains the actual value we call this coverage, and the coverage was 96% for the good portion of the test data. This agrees with the expected 95% level.
Bootstrap simulation was used to validate the analytical PI results, yielding a 98% coverage value. Thus, the bootstrap estimates slightly overestimate the true uncertainty for the NNPLS models in this case, while the analytic prediction intervals perform as expected and provide sufficient coverage in all cases.
4.2 Artificial Neural Network Results
The optimal neural network architecture contained two hidden neurons. The prediction results on the test data set are shown in Figure 2 below. The first observation is that the magnitudes of the prediction intervals are much lower than those of the NNPLS models. The average coverage values are above the expected value of 0.95. The test data predictions indicate the known drift, which becomes significant around sample 500. For the ANN models, the analytic and bootstrap intervals for the test data were very similar for all evaluated architectures. The coverage values for both approaches to interval estimation remained at or above the expected level for all evaluated architectures.
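The coverage figures quoted throughout are simply the fraction of test samples whose measured value falls inside the interval; a minimal check (names are illustrative):

```python
import numpy as np

def coverage(y_measured, y_pred, half_width):
    """Fraction of samples with |measured - predicted| <= interval half-width."""
    return float((np.abs(y_measured - y_pred) <= half_width).mean())
```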
4.3 Local Linear Regression Results
The optimal local linear regression architecture was determined to have a kernel width of 0.75. The prediction results on the test data set are shown in Figure 3 below. The coverage values of the prediction intervals with respect to the fault-free data were at or above the expected value. Again, the drift can be easily identified.
4.4 Results Summary
All three techniques provided accurate prediction interval estimates, as demonstrated by coverage values near 95%; however, the techniques did not produce equal accuracies. Table 1 presents a quantitative comparison of the three techniques.
Table 1. Quantitative comparison of the three techniques.
Model Type   Mean Absolute Error (PSIA)   Prediction Interval (PSIA)   Drift Estimate μ (PSIA)   σ (PSIA)
NNPLS        1.3                          4.1                          -10.5                     0.54
ANN          0.5                          1.2                          -8.9                      0.19
LLR          0.4                          1.3                          -9.1                      0.09
The NNPLS models performed poorly for this data set when compared to the other methods, because the turbine pressure data set contained only mild correlations. Previous work documents the need for high correlations among the predictor variables for successful implementation of the NNPLS architecture [Rasmussen 2002]. However, the resultant prediction intervals reflected the poor performance, and the coverage values were consistently at or above the expected 95% level. The ANN models for this data set performed well: the prediction intervals provided the appropriate coverage for the test data, the average errors were minimal, and the predictions identified the drift. The LLR models also performed well for this case. The prediction intervals provided the appropriate coverage for the majority of the different bandwidth models evaluated. The errors with respect to the test data were slightly lower than those observed for the ANN models, and the drift in the test data response was clearly identifiable in all cases.
5. Conclusions
Methods were developed and applied for the estimation of prediction intervals for three commonly used empirical models. The analytical algorithms were successfully applied to actual nuclear power plant data. The analytic prediction interval estimation techniques were shown to consistently provide the expected level of coverage for the empirical models. The methods were also able to detect and quantify a drifting sensor. The prediction interval estimation methods were also applied to several other data sets, in which similar results were found. However, the NNPLS algorithm performed very well on data sets with higher correlations. This agrees with the well-published observation that no model performs best on all data sets.
References
1. Davis, E., Shankar, R. (2003), "Results of the EPRI/Utility On-Line Monitoring Implementation Program", Transactions of the American Nuclear Society 2003 Winter Meeting, New Orleans, LA.
2. Davis, E., D. Funk, D. Hooten, and R. Rusaw (1998), "On-Line Monitoring of Instrument Channel Performance", EPRI TR-104965.
3. Fantoni, P.F., and A. Mazzola (1996), "A pattern recognition-artificial neural networks based model for signal validation in nuclear power plants", Annals of Nuclear Energy, 23, no. 13, 1069-1076.
4. Hines, J.W., R. Uhrig, C. Black, and X. Xu (1997), "An evaluation of instrument calibration monitoring using artificial neural networks", Proceedings of the American Nuclear Society, Albuquerque, NM, November 16-20.
5. Wold, H. (1966), "Nonlinear Estimation by Iterative Least Squares Procedures", F. N. David, Ed., John Wiley, New York.
6. Qin, S. Joe, and T.J. McAvoy (1992), "Nonlinear PLS modeling using neural networks", Comp. Chem. Engng., 16, no. 4, 379-391.
7. Rasmussen, B., J.W. Hines, and R.E. Uhrig (2000), "A Novel Approach to Process Modeling for Instrument Surveillance and Calibration Verification", The Third American Nuclear Society International Topical Meeting on Nuclear Plant Instrumentation and Control and Human-Machine Interface Technologies, Washington DC, November 13-17.
8. Rasmussen, B. (2002), "Neural Network Partial Least Squares for Instrument Surveillance and Calibration Verification", MS Thesis, Nuclear Engineering Department, The University of Tennessee, Knoxville.
9. Hines, J.W., B. Rasmussen and R.E. Uhrig (2002), "An On-line Sensor Calibration Monitoring System", International Journal of COMADEM, Birmingham, United Kingdom.
10. Atkeson, C., Moore, A. & Schaal, S. (1997), "Locally weighted learning", Artif. Intel. Rev., 11, 76-113.
11. Rasmussen, B. (2003), "Prediction Interval Estimation Techniques for Empirical Modeling Strategies and their Applications to Signal Validation Tasks", Ph.D. dissertation, Nuclear Engineering Department, The University of Tennessee, Knoxville.
12. Chryssolouris, George, Moshin Lee, and Alvin Ramsey (1996), "Confidence interval prediction for neural network models", IEEE Transactions on Neural Networks, 7, no. 1, 229-232.
NUCLEAR POWER PLANT MONITORING WITH MLP AND RBF NETWORK
KUNIHIKO NABESHIMA
Research Group for Advanced Reactor System, Japan Atomic Energy Research Institute, Tokai-mura, Ibaraki-ken, 319-1195, Japan
EMINE AYAZ, SERHAT SEKER, BURAK BARUTCU, ERDINC TURKCAN
Electrical Engineering Department, Istanbul Technical University, 34469 Maslak, Istanbul, Turkey
KAZUHIKO KUDO
Department of Quantum Physics and Nuclear Engineering, Kyushu University, 6-10-1 Hakozaki, Higashiku, Fukuoka, 812-8581, Japan
A monitoring system with an MLP and an RBF network has been developed for NPPs. The MLP is used to detect the symptoms of anomalies and the RBF network is used to identify the abnormal events. Off-line test results using a PWR simulator show that the monitoring system can successfully detect and diagnose small anomalies earlier than the conventional alarm system.
1. Introduction
The main purpose of Nuclear Power Plant (NPP) monitoring is to diagnose the current status of an operating plant using process signals in real time. It is especially important for plant safety in aged NPPs to detect the symptoms of small anomalies at an early stage. Therefore, we have developed the Artificial Neural Network On-line Monitoring Aids (ANNOMA) system for the Borssele NPP in the Netherlands [1]. In the system, a Multi-Layer Perceptron (MLP) in auto-associative mode can model and predict the plant dynamics by training on normal operational data only. The basic principle of the anomaly detection is to monitor the deviation between process signals measured from the actual plant and the corresponding values predicted by the MLP [2]. An expert system is used to diagnose the plant status from the measured signals, the outputs of the MLP, and the alarm information from the conventional alarm system. The test results showed that the monitoring system with the MLP and an expert system could detect and diagnose small anomalies earlier than the conventional alarm system. However, the description of rules in the expert system becomes complicated if many kinds of anomaly cases are assumed. In this study, a Radial Basis Function (RBF) network [3] is tested in place of the expert system because it has the
advantage of fast learning and easy adaptation to changes of the network construction, that is, of the input or output signals.
2. Plant Monitoring System
Figure 1 shows an overview of the monitoring system. The monitoring system receives the digitized plant signals every two seconds from the data acquisition system. Of these, the most significant plant signals are selected as the inputs of the MLP: neutron flux, flow rate, pressure, temperature, electric power, etc. Here, an on-line Pressurized Water Reactor (PWR) training simulator is utilized for evaluating the performance of the RBF network as a diagnosis method, because it is difficult to collect many kinds of anomaly data from actual power plants. The PWR simulator is manufactured on the basis of an existing 822 MWe power plant, Surry-1 in the U.S.A. It can take into consideration 49 abnormal events of the major systems, including failures of pumps, valves, controllers, pipes, etc. The conventional alarm system is attached to the panel of the plant simulator.
Figure 1. Overview of ANNOMA system with MLP and RBF network.
2.1. MLP for Anomaly Detection
The MLP has three layers: an input, one hidden, and an output layer. The numbers of input and output nodes are both 22, as shown in Table 1.
Table 1. Monitoring signals and their maximum errors in learning.
Signal (excerpt of the 22 channels)    Maximum learning error
Hot-leg temperature                    0.19781 [°C]
Steam pressure (loop-B)                0.07070 [kgf/cm²]
VCT level                              0.38583 [%]
Turbine impulse pressure               0.13519 [kgf/cm²]
Steam pressure (loop-C)                0.09953 [kgf/cm²]
Average neutron flux                   2.82109 [%]
Generated electric power               2.31500 [MWe]
In the auto-associative network, the output signals are supposed to be the same as the input signals at the same time step. The number of hidden nodes is selected as 20. The backpropagation algorithm is used for learning, and the sigmoid function is selected as the transfer function. The patterns for initial learning were obtained during normal start-up and steady-state operations. Table 1 shows the maximum learning errors after 1,000 repetitions per pattern, which are small enough for modeling.
2.2. RBF Network for Plant Diagnosis
In this study, the RBF network is utilized instead of an expert system, to develop a more efficient diagnosis system. The input layer of the RBF network receives the abnormal extent of the 22 signals calculated from the MLP, as shown in Table 2. The abnormal extent is defined as follows: 1 for ε/ε_max > 1.25; 0.5 for 1.0 < ε/ε_max ≤ 1.25; 0 for |ε/ε_max| ≤ 1.0; -0.5 for -1.25 ≤ ε/ε_max < -1.0; and -1 for ε/ε_max < -1.25. Here, ε is the deviation between the measured signal and the value predicted by the MLP, and ε_max is the maximum error calculated in the initial learning. Each element in the hidden layer corresponds to one of the sixteen abnormal cases simulated in this application or to the normal case. The transfer function of a radial basis neuron is a Gaussian of the distance to the stored pattern. The output (competitive) layer picks the maximum of these probabilities, producing 1 for that case and 0 for the others; a sketch of the abnormal-extent quantization follows Table 2.
Table 2. Anomaly extent of 22 channels in abnormal events.
(The 17 x 22 matrix of anomaly extents, with entries in {-1, -0.5, 0, 0.5, 1}, is not legible in this copy. Its rows are the abnormal events: 1 Partial Loss of FW, 2 Leakage of ASDV, 3 Small RCS Leak, 4 SG-C Tube Rupture, 5 VCT Level Control Fails Low, 6 VCT Level Control Fails High, 7 SG Level Control Fails High, 8 SG Level Control Fails Low, 9 PRZ Spray Valve Fails Open, 10 PRZ Spray Valve Fails Close, 11 Backup PRZ Heater Fails On, 12 Temperature Failure High in Cold-Leg, 13 One of TGV Fails Open, 14 TGV Fails Closed, 15 Dropped Control Rod, 16 Partial Loss of CW, 17 No Anomaly.)
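A minimal sketch of the five-level quantization defined above; the channel deviations and per-channel learning errors in the example are hypothetical.

```python
import numpy as np

def abnormal_extent(eps, eps_max):
    """Quantize per-channel deviations into the five-level abnormal extent.

    eps     : deviation between measured signal and MLP prediction
    eps_max : per-channel maximum error from initial learning (Table 1)
    """
    r = np.asarray(eps, float) / np.asarray(eps_max, float)
    ext = np.zeros_like(r)
    ext[(r >= 1.0) & (r <= 1.25)] = 0.5
    ext[r > 1.25] = 1.0
    ext[(r <= -1.0) & (r >= -1.25)] = -0.5
    ext[r < -1.25] = -1.0
    return ext  # RBF-network input vector, one value per channel

# Example: channels slightly above, strongly above and strongly below eps_max.
print(abnormal_extent([0.1, 0.3, -0.5], [0.09, 0.2, 0.3]))  # [0.5, 1.0, -1.0]
```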
2.3. Monitoring Results
After the learning, the RBF network can easily identify the events because the distributions of the abnormal extents depend on the type of anomaly. Figure 2 shows the identification results of five abnormal events at steady state. The RBF network immediately identified cases 1, 2 and 5, although the network missed them after a certain period, when the anomalies were well developed. On the other hand, the abnormal events in cases 3 and 16 needed 16 seconds to be identified. Figure 3 shows the identification results for the same types of abnormal events occurring at 80% electric power during a 2.0%/min power decrease operation. The RBF network could identify the anomalies as at steady-state operation. Furthermore, it was clear that the diagnosis performance of the RBF network was almost the same even if the development of the anomalies was different.

Figure 2. Identification results at steady state.

3. Conclusion
In the monitoring system, the MLP can detect symptoms of anomalies, and the RBF network can successfully identify the type of abnormal events during steady-state or transient operation by using the outputs of the MLP.
References
1. K. Nabeshima, et al., Proc. ICANN/ICONIP 2003, 406 (2003).
2. K. Nabeshima, et al., Math. Comput. Simul. 60, 233 (2002).
3. P. D. Wasserman, Advanced Methods in Neural Computing, Van Nostrand Reinhold, New York (1993).
TECHNICAL ASPECTS OF THE IMPLEMENTATION OF THE FUZZY DECISION AID SYSTEM DEVELOPED FOR THE BELGIAN RADIOACTIVE WASTE MANAGEMENT AGENCY P.L. KUNSCH 14 Avenue des Arts BE-1210 BRUSSELS [email protected]
A. FIORDALISO AND PH. FORTEMPS
Faculté Polytechnique de Mons, 9 Rue de Houdain, BE-7000 MONS
This paper describes the technical aspects of the implementation of a Fuzzy Decision Aid System developed by the Mathematics Department of the Polytechnic Faculty of Mons for ONDRAF/NIRAS, the Belgian Radioactive Waste Management Agency. The proposed system is intended to deal with the economic uncertainties inherent to radioactive waste management, and especially to High-Level-Waste repository projects.
1. Introduction

The economic calculus of radioactive-waste-management projects is particularly difficult due to the many types of uncertainties that still exist regarding the final design, the eventual costs and the realisation schedule. The general approach mainly relies on fuzzy rule-based systems to infer contingency or margin factors in such uncertain contexts. We refer to earlier papers ([1], [2], and [3]) for details about the underlying fuzzy-logic approach. We mainly focus in the present paper on the description of the Decision Aid System from a guidance point of view. Nevertheless, the paper describes all the technical points that are not covered by those papers, mainly with respect to including correlations between factors and the schedule aspects. We describe the application of the methodology to High-Level-Waste (HLW) repository projects.
2. Quick overview of the fuzzy approach
A HLW repository project can be divided into five main phases (preliminary planning, construction of basic structures, disposal of Medium Level Waste, disposal of HLW, and closure of the site), comprising a total of about 50 elementary tasks. The methodology for calculating contingency factors for each task has been developed in the first paper [1]. The fuzzy logic procedure is applied to each task to estimate individual contingency factors related to the project (Pfactor) and to the technology
(Tfactor), according to the advancement levels Plevel or Tlevel. All the individual costs including contingency factors are eventually summed up to produce the global cost estimate. Two Fuzzy Inference Systems (FIS) have been proposed to infer the contingency factors (Tfactor and Pfactor) based on the Plevel and Tlevel estimates. They use the Goguen implication to find the conclusions of the rules [1]. These systems can be seen as the translation, or extension, of semantic rules coming from EPRI (Electric Power Research Institute), commonly used to roughly estimate the economic risks of nuclear long-term projects. In [2] the methodology has been further developed. The point was at this stage to remove the difficulty faced by experts when providing Plevel and Tlevel estimates for a given task. A two-step approach is proposed:
a) Firstly, opinions of experts are aggregated about a meaningful variable (called a proxy), strongly correlated with Plevel or Tlevel, in order to obtain crisp proxy values, for example the technological Research and Development (R&D) total budget for obtaining Tlevel;
b) Secondly, the individual Plevels and Tlevels are derived from those crisp proxy values using an adequate fuzzy system.
From these two steps the contingency factors are obtained for each individual task using the fuzzy Tsystem or Psystem. Once all Pfactors and Tfactors attached to the elementary tasks composing the project are available in this way, it is quite straightforward to compute the global cost envelope, with or without discounting, according to the chosen discount rate. The whole process from expert data acquisition to contingency factor computation is illustrated for a specific Tfactor in Figure 1.
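A minimal sketch of this two-step chain, with piecewise-linear stand-ins for the actual fuzzy systems and hypothetical expert credibilities and curve shapes:

```python
import numpy as np

def consensus(opinions, credibilities):
    """Credibility-weighted aggregation of expert proxy estimates."""
    w = np.asarray(credibilities, float)
    return float(np.dot(w, opinions) / w.sum())

def fuzzy_map(x, xs, ys):
    """Piecewise-linear stand-in for the fuzzy systems proxy->Tlevel->Tfactor."""
    return float(np.interp(x, xs, ys))

# Hypothetical shapes: a larger remaining R&D budget means a lower advancement
# Tlevel, and a lower Tlevel means a larger contingency Tfactor.
rd = consensus([5000.0, 7000.0], credibilities=[0.48, 0.52])  # consensus proxy
tlevel = fuzzy_map(rd, [0.0, 10000.0], [1.0, 0.2])
tfactor = fuzzy_map(tlevel, [0.0, 1.0], [0.8, 0.1])
print(round(rd), round(tlevel, 2), round(tfactor, 2))
```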
3. Calibration
For calibrating all contingency margins in the model, the opinions of two default artificial experts (expert1 and expert2) have been encoded in the software. One expert produces pessimistic estimates of the proxies; the other one is rather more optimistic. This encoding is made by using, at first, quite rough estimates of the Tfactors and Pfactors. The calibration procedure sets the credibility of expert1 and expert2 such that the produced Tfactors or Pfactors are as close as possible to the targets given by the rough estimates. Note that these two artificial experts only provide a baseline to work with. The proposed values can be modified later at the users' convenience. Of course, per task one or several experts can be added (or removed). Working with default experts offers the advantage of having an immediately operational system. To start with, the system only replicates the rough estimates. As real expert information is encoded, the system will reflect more and more the financial risk reality.
Figure 1. From proxy R&D to contingency Tfactor (arbitrary values): the opinions of expert1 and expert2 about R&D are aggregated into a consensus (R&D = 6024), mapped by a fuzzy system to Tlevel = 0.52, and then by the Tsystem to Tfactor = 0.36.
4. Schedule aspects
A first discussion of the scheduling aspects has been made in [3]. By contrast with the approach proposed in this reference, it has been found more appropriate to handle separately two well-contrasted schedules, considering a fast pace and a slow pace for realising all the tasks. Because some key dates are very much dependent on political decisions, for example the beginning date of operations, a full fuzzy treatment would considerably expand the support of some membership functions. The consequence is that the financial assessments may not be conservative. The propagation of uncertainties through the schedule is another rather complex problem, which has required additional development regarding the existence of correlations between some project P-factors. When the costs of tasks are fully correlated, the corresponding costs are combined additively. When the costs of tasks are uncorrelated, the corresponding costs are combined like the standard deviations of independent random variables, using the square root of the sum of the squares. It also has to be discussed how the possibility distributions for Pfactors and Tfactors can be exploited for providing usable results. Alpha-cuts are used for giving an optimistic minimum mode and a pessimistic maximum mode for all contingency factors.
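The two combination rules can be sketched as follows; the task contingency costs in the example are hypothetical.

```python
import math

def combine_contingencies(costs, correlated):
    """Combine task contingency costs.

    Fully correlated tasks add linearly; uncorrelated tasks combine like the
    standard deviations of independent random variables (root-sum-square).
    """
    if correlated:
        return sum(costs)
    return math.sqrt(sum(c * c for c in costs))

tasks = [12.0, 8.0, 5.0]  # hypothetical contingency costs
print(combine_contingencies(tasks, correlated=True))   # 25.0
print(combine_contingencies(tasks, correlated=False))  # ~15.26
```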
5. Data acquisition and Graphical user interface

Today the data acquisition module is not yet integrated into the software. An example of the form to be filled in by an expert for calculating the proxy R&D is shown in Figure 2. The Graphical User Interface (GUI) which has been developed is able to perform the following operations:
- Edit and modify the fuzzy systems that map Plevel on Pfactor (Psystem) and Tlevel on Tfactor (Tsystem), using the Margins Fuzzy Systems panel.
- Edit, modify, add, and remove expert advices about proxies, using the Design Proxy panel. An example is shown in Figure 3.
- Build the slow and fast pace schedules using the Calendar panel, and calculate the risk curves for the costs. An extract of a possible Calendar panel is shown in Figure 4. This panel also provides the total undiscounted and discounted costs.
P-risk curves for the total discounted costs are introduced. They are shown at the bottom of the figure. These curves are defined as the minimum and maximum values in the alpha-cuts of the distribution of the total P- or T-factors for the global project. Three versions are considered, corresponding to the minimum P-risk curve on the left (the global P-factor is not integrated), the medium P-risk curve (half the global P-factor is integrated), and the maximum P-risk curve (the global P-factor is integrated).
Fig. 2. An example of the interface for the experts inputting their opinion regarding the proxy R&D (expert id, task id, and R&D budget estimates under the fast pace and slow pace schedules).
Fig. 3. An example of the Design Proxy Panel for the determination of the proxy R&D budget.
6. Further work
Rather immediate developments concern the improvement of data acquisition for the various inputs from the experts. It is planned to introduce random elements to allow for Monte-Carlo simulations (task durations, nominal costs, etc.). Some aspects have been discussed in [3]. The possibility should also be given to add, delete, or manipulate the tasks, for example in the case when a task is split into several intermediary subtasks. The possibility of developing a fuzzy PERT [4] to represent critical paths and uncertainties in the task durations is also currently being investigated.
Figure 4. An example of the Calendar Panel with the three P-risk curves.
References
1. P.L. Kunsch, A. Fiordaliso, Ph. Fortemps, A fuzzy inference system for the economic calculus in radioactive waste management, in: Da Ruan (ed.), Fuzzy Systems and Soft Computing in Nuclear Engineering, Physica-Verlag, Heidelberg, Germany, 153-171 (2000).
2. P.L. Kunsch, Ph. Fortemps, A fuzzy decision support system for the economic calculus in radioactive waste management, Information Sciences, FLINS 2000 Conference Special Issue 143(1-4), 103-116 (2002).
3. P.L. Kunsch, A. Fiordaliso and P. Fortemps, Fuzzy-Logic Supported Evaluation of the Disposal Costs and Tariffs of High-level Radioactive Waste, in: Da Ruan, P. D'hondt, E.E. Kerre (Eds.), Computational Intelligent Systems for Applied Research, Proceedings of the 5th International FLINS Conference, Gent, Belgium, Sept. 16-18, 2002, 520-527 (2002).
4. F.A. Lootsma, Stochastic and fuzzy PERT, European Journal of Operational Research 43, 174-183 (1989).
REACTOR COOLANT LEAK DETECTION SYSTEM

IVAN PETRUZELA
Data Systems and Solutions, IBC-Pobrezni 3, Prague 8, Czech Republic, 186 00
SDU is a software system that detects leaks of the reactor coolant under quasi-stationary processes. The SDU system has been installed at the Dukovany Nuclear Power Plant. The reactor coolant leak is calculated from instantaneous data in the plant information system by means of a balance model for defined RCS volumes. The system performs an automatic classification of the operation conditions. SDU Functional Tests were carried out at Unit 2 of the Dukovany Nuclear Power Plant in 2003. Three types of leaks were used, and the SDU output information was evaluated with good results.
1. Method of the leak detection by mass balance calculations
1.1. Coolant leakage from the reactor coolant system

The ability to prevent coolant leakage from the reactor coolant system (RCS) is among the most important safety aspects of a nuclear power plant [1]. The plant operator has to demonstrate that, even in case of problems with the integrity of the pipeline systems, there is an option to shut the reactor down safely. The required solution is the implementation of a detection system diagnosing the failure on time. The supervising bodies request the implementation of independent systems for the monitoring of the coolant leak from the RCS. Those are mostly modifications of the following methods:
- Air activity monitoring in the hermetic space
- Acoustic leak detection
- Mass balance calculations
The systems based on the first two methods are frequently used. They are in most cases financially demanding and, therefore, the acoustic detection system tends to be replaced by monitoring of selected nodes by means of TV cameras. The least financially demanding are the methods of leak detection by means of mass balance. Such a system calculates, based on measured values, the variation of the coolant mass in partial RCS volumes. However, these data are burdened with errors and, prior to their further processing, the uncertainties of the measuring chains and the inaccuracies in determining the process parameters (e.g. the geometric size of the RCS parts) have to be corrected.
SDU, the leak detection system, works by means of defined relations between measured and calculated values. The system calculates, based on measured values, the variation of coolant mass in partial RCS volumes. If such a calculated variation does not relate to controlled makeup or letdown of the coolant, then the change is identified as a coolant leak. SDU provides operator support for coolant leak detection from the reactor coolant system of the nuclear power plant and is an integrated part of the EDU (Dukovany NPP) information system.
1.2. SDU Principle

In stable operation, the nuclear power plant operator is able to identify a coolant leak from the changes of tank levels in the RCS. His ability to distinguish and detect a leak of 1 t/h by means of the levels is about 60 minutes from the leak start. In operation under variable load, mutual level variations occur more frequently, and the detection by the operator of small leaks from the RCS becomes more uncertain. The SDU diagnostic system performs automatic classification of the following basic conditions:
- Normal operation
- Operation with a coolant leak in the RCS
- Operation with a coolant leak in a separated RCS volume
The input process data of the system are data from the process information system. The system output is information on the size and location of the coolant leakage. The reason for a deviation from normal operation is a coolant leak into some of the related systems. The complete volume of the reactor coolant system is subdivided into process nodes. Their arrangement corresponds to the EDU process equipment. The system is subdivided into 23 subsystems by means of partitioning:
- Deaerator and regeneration heat exchanger TK10
- Makeup pumps and recirculation
- Deaerator and regeneration heat exchanger TK50
- System of organized leakages TY
- TE10 line
- Outlet manifold TK10
- Glands
- Coolant cleaning plant TC10
- Reactor
- Pressurizer
- 1st - 6th loops
Forced flow occurs between water volumes with different parameters. The system conditions are determined from measured locations in the plant information system. From known geometrical parameters and from the status of the valves, we can calculate the values of flow rates, pressures and temperatures even in unmeasured locations of the system. Then, for each individual subsystem, we can perform the volume calculation (Eq. 1)

$$V_i(t) = V_{0i} + k_i S_i l_i(t) \qquad (1)$$

the density calculation (Eq. 2)

$$\rho_i(t) = \rho\big(p_i(t), T_i(t)\big) \qquad (2)$$

and the mass calculation (Eq. 3)

$$M_i(t) = \rho_i(t)\, V_i(t) \qquad (3)$$

If the total mass drops, a leak must have occurred. We can calculate (Eq. 4) the total coolant mass leaking (UNIK) from the RCS. A negative value means a mass reduction, a positive value means a mass increase.

$$UNIK(t) = \big(M(t) - M(t - \Delta t)\big)/\Delta t - F_{in}(t) + F_{out}(t) \qquad (4)$$

where
UNIK(t) is the total coolant mass leaking from the RCS [kg h^-1]
M(t) is the coolant mass value in the RCS at time t [kg]
Delta t is the time interval [h]
F_in(t) is the sum of RCS coolant makeup flow rates [kg h^-1]
F_out(t) is the sum of RCS coolant letdown flow rates [kg h^-1]
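A minimal sketch of Eq. (4); the mass and flow figures in the example are hypothetical.

```python
def unik(mass_now, mass_prev, dt_h, makeup_flows, letdown_flows):
    """Total coolant mass leaking from the RCS, Eq. (4).

    A negative value indicates a mass reduction, i.e. a leak.
    """
    dm_dt = (mass_now - mass_prev) / dt_h  # [kg/h]
    return dm_dt - sum(makeup_flows) + sum(letdown_flows)

# Hypothetical 6-minute balance window: 20 kg lost with balanced makeup/letdown.
print(unik(221_180.0, 221_200.0, 0.1,
           makeup_flows=[3_000.0],
           letdown_flows=[3_000.0]))  # -200.0 kg/h, i.e. a ~0.2 t/h leak
```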
1.3. Coolant Leak Calculation Algorithm
The basis of the system is formed by the balance equations for the partial volumes of the RCS [2]. Both measured and calculated thermodynamic quantities enter them. The main attribute vector is created by the imbalance of the balance equations (leaks) of the individual volumes, which are calculated over different time intervals so that both a sufficient system sensitivity (from 100 kg/hour within 40 minutes of its occurrence) and a sufficient identification speed (within 1 minute of the occurrence of leaks larger than 100 t/h) are provided.
The leak magnitude is calculated for each of the 23 subsystems and, in addition to that, for separated and non-separated systems. For each system, 6 leak types are calculated and, as the resulting value, the valid leak is selected based on expert-defined relations. For a unit in unsteady-state conditions, the alarms of the overall leaks are blocked. The steady-state conditions are:
- the pressure variation in the main steam manifold must be lower than 0.2 MPa/min,
- the reactor power variation must be lower than 3% RTP/min,
- the variation of the average temperatures in the hot and cold legs of the RCS loops must be lower than 50 deg C/h.
The primary coolant leakage is calculated by means of a balance model in defined RCS volumes. The balance model is a physical model corresponding to separate parts of the reactor coolant system of the NPP. The subdivision of the total volume into partial volumes is performed, for which the relevant physical equations are developed. The links among the modules are expressed by initial and boundary conditions. The modules can be interconnected into mutually separable groups. The systems and the boundaries between them are defined based on the operation procedure. The separation of the part with a localized leak must occur in the calculation. The calculation nucleus of the SDU is made as a data source. The application is programmed as a COM client of the RECEIVE COM server. The user interface for the SDU outputs display is built into the existing ISSI application, with 2 new display units used for this purpose.
2. SDU Functional Tests Results
In 2003, SDU Functional Tests were carried out at Unit 2 of EDU. Three types of leaks were used, and the SDU output information was evaluated [3]. The reactor coolant leakages No. 1 and No. 2 (about 100 and 200 kg/h) were imitated in the reactor coolant sampling system by its alignment directly into the special sewage system. A real leak of the reactor coolant with a magnitude of several t/h was simulated by the level letdown in TK10B01 by about 10 cm by means of the TE10(50)D01 pump into the tank of the dirty condensate. The results are shown in the following two figures.
571
221400
2212W
-3
221000
220800
220600
220400
2a200
220000
830
SO0
9.W
IOW
1030
11.00
11:s
12:W
12.30
13:OO
13.30
time
1-
Mans of d a n l m RCS
I
Figure 1. The course of the overall RCS mass. Test of SDU small leakages Unit 2 EDU 25/07/2003
Figure 2. The course of the overall RCS mass with SDU leak identification.
The SDU code responded to all 3 types of small leaks. For the first two of them, calculation No. 1 is used.
The SDU output does not oscillate and, after about 43 minutes, the detection of the first leak of 100 kg/h occurred. The increased leak of 200 kg/h was detected at 11:14, that is, within 44 minutes of the leak start.
3. Conclusion
SDU is a software system that detects leaks of the reactor coolant under quasi-stationary processes. The system performs an automatic classification of the following basic conditions:
- Normal operation
- Operation with a coolant leak in the RCS
- Operation with a coolant leak in a separated RCS volume
The reactor coolant leak is calculated from instantaneous data in the LAN by means of a balance model for defined RCS volumes. The SDU system has been installed at the user's workstation of the LAN at a Dukovany NPP unit. The user's interface for displaying the SDU outputs is built into the existing ISSI application, where two new displays have been developed for this purpose. The further effort will be based on the application of fuzzy logic, which will provide for a more precise localization of the leak. The processing of accompanying attributes in the interconnected systems will be fundamental for that.
References
1. I. Petruzela, Fuzzy system of automatic failure classification, 6, FLINS (2000).
2. M. Soukupova, P. Sury, SW SDU description, 65, I&C Energo report (2003).
3. I. Petruzela, J. Marek, Test SDU (unit 2) evaluation, 15, I&C Energo report (2003).
A FUZZY CLUSTERING APPROACH FOR TRANSIENTS CLASSIFICATION
E. ZIO, P. BARALDI
Department of Nuclear Engineering, Polytechnic of Milan, Via Ponzio 34/3, 20133 Milano, Italy
E-mail: [email protected]
In this paper, we look into the issue of clustering of signal transients for the reliable monitoring and timely diagnosing of nuclear components and systems, and show that the choice of the metrics upon which the clustering is based is critical for obtaining geometric clusters as close as possible to the real physical classes. The a priori known information regarding the true classes is exploited to select, by means of an evolutionary algorithm, an optimal metrics. In case the classification thereby obtained is still unsatisfactory, an iterative procedure is used to split the less compact physical classes into further subclasses.
1. Introduction
The early identification of the causes of the onset of a meaningful departure from steady-state behaviour is an essential step for the operation and control and for the accident management of nuclear power plants. The basis for the identification of a change in the system is that different system faults and anomalies lead to different patterns of evolution of the involved process variables. In this work, we present an approach to transient identification based on pattern classification by fuzzy clustering. The task of pattern classification may be viewed as a problem of partitioning objects (hereafter also called data, or patterns) into classes. To tackle the classification problem with a supervised technique we must have available a set $X$ of $n$ "training" data, in an $h$-dimensional space, grouped into $c$ subsets representing the classes $\Gamma_i = \{\bar{x}_i^{(l)}\}$, $i = 1, 2, \dots, c$, $l = 1, 2, \dots, N_i$, with $\sum_{i=1}^{c} N_i = n$ and $\bigcup_{i=1}^{c} \Gamma_i = \Gamma$. In many practical instances, however, the membership of an object to a class is not binary (Yes/No), since many data points may share characteristics common to several classes. The implication is that the representation of the data structure can be more accurately handled by fuzzy sets. A fuzzy
partition $\Gamma$ into $c$ subsets $\Gamma_i$ is then characterized by a set of $c$ membership functions $\{\mu_1(\bar{x}), \dots, \mu_c(\bar{x})\}$, where $\mu_i(\bar{x})$ denotes the membership of $\bar{x}$ to class $i$. In this work, we investigate the feasibility of building a fuzzy classifier by means of an evolutionary procedure applied to the well known Fuzzy C Means (FCM) algorithm [1]. More precisely, the evolutionary algorithm searches for the optimal metrics to be used by the FCM so as to achieve clusters as close as possible to the real physical classes. To enhance the performance of the algorithm when dealing with classes of data characterized by little compactness, we introduce an iterative procedure which amounts to splitting the less compact physical classes into subclasses, in a supervised manner aimed at achieving a clustering of the data which more closely adheres to the known physical classes. The approach is applied to the case of classification of simulated transients of a U-tube steam generator.
2. An Evolutionary Fuzzy C Means classifying algorithm

2.1. Introduction
The approach to the classification problem offered by the FCM algorithm can be said to be unsupervised, since the algorithm makes no use of the a priori known information on the true physical classes of the training data. Indeed, the clustering performed by the FCM algorithm is based only on the geometric grouping of the data and, hence, depends on the metrics chosen. Then, the geometric clusters obtained by the FCM algorithm do not necessarily yield the actual physical classes. In this respect, here we investigate whether the search for physical classes of objects, performed as a search for geometric clusters in the features space, can be improved by choosing an appropriate metrics. A genetic algorithm, in which the only reproductive operations allowed are mutations, is employed to identify a metrics such that the geometric clusters found by the FCM algorithm are as close as possible to the a priori known classes of data. Denoting by $\underline{M}$ the positive definite matrix of dimensions $h \times h$ which defines the metrics through its associated geometric distance function, the algorithm can be summarized as follows: (1) Initialize the metrics to the Euclidean metrics, i.e. $\underline{M} = \underline{I}$. (2) Perform the FCM partitioning of the $n$ training data into $c$ clusters $\Gamma^* = \{\Gamma_1^*, \dots, \Gamma_c^*\}$, based on the metrics $\underline{M}$ and using a "supervised" initial partition which sets the initial cluster assignments coincident with the a priori known classes. (3) Compute the distance $D(\Gamma, \Gamma^*)$ between the a priori known physical classes and the geometric FCM-obtained clusters as a function of the memberships [6], where $0 \le \mu_i(\bar{x}_k) \le 1$ is the a priori known (possibly fuzzy) membership of the $k$-th pattern to the $i$-th class and $0 \le \mu_i^*(\bar{x}_k) \le 1$ is the fuzzy membership computed by the FCM algorithm. (4) If $\Gamma^*$ is close to $\Gamma$, i.e. $D(\Gamma, \Gamma^*)$ is smaller than a predefined threshold $\epsilon$, stop. (5) Otherwise, update $\underline{M}$ by exploiting its unique decomposition into Cholesky factors [6], $\underline{M} = \underline{G}^T \underline{G}$, where $\underline{G}$ is a lower triangular matrix with positive entries on the main diagonal. More precisely, at iteration $t+1$, the entries $g_{ij}$ of the Cholesky factor $\underline{G}$ are updated as follows:

$$g_{ij}(t+1) = g_{ij}(t) + N_{ij}(0, \delta(t)), \quad i \neq j \qquad (2)$$

$$g_{ii}(t+1) = \max\big(\epsilon_g,\; g_{ii}(t) + N_{ii}(0, \delta(t))\big) \qquad (3)$$

where $\epsilon_g$ is a small positive constant, $\delta(t) = a D(\Gamma, \Gamma^*(t))$, $a$ is a parameter that controls the size of the random step of modification of the Cholesky factor entries $g_{ij}$, $N_{ij}(0, \delta)$ denotes a Gaussian noise with mean 0 and standard deviation $\delta$, and Eq. (3) ensures that all entries on the main diagonal of the matrix $\underline{G}(t+1)$ are positive numbers. (6) Return to step 2.
2.2. Application of the evolutionary FCM classifier to nuclear transients data

We consider the problem of classifying four kinds of transients occurring in the steam generator (SG) of a Pressurized Water Reactor (PWR) on the basis of the available temperature, pressure and flow measurements. Among the existing configurations, we consider here the well-known standard recirculation U-tubes type. A detailed model of the physical functioning of this component is given in Ref. 3. In our analysis, we have assumed that the pressurizer imposes a constant primary system pressure of 154.4 bar. Among the possible transient-initiating perturbations (hereafter called forcing functions) the ones here considered are: the inlet water temperature and the inlet water mass flow on the primary side, the feed water temperature and the feed water mass flow on the secondary side. These four forcing functions may vary as a consequence of an operator action or because of plant anomalies or faults. The transient responses of the SG were obtained with the code UTSG provided by the NEA data bank of Paris. The code has been developed by the "Gesellschaft für Reaktorsicherheit (GRS)" in Garching bei München,
and it properly accounts for the feedback due to the heat removal system. It is part of the code ALMOD-2, which simulates the non-linear behavior of a PWR [2]. The UTSG code was used to generate 529 transients, each one 50 s long, obtained by randomly varying the kind and intensity of the forcing function, according to a sigmoidal shape which develops over 8 s. The intervals of variation of the forcing functions considered in this work for the generation of the transients of interest are those of Ref. 4. Regarding the monitored signals, they have been selected taking into account their physical measurability. Out of the 15 signals provided by the UTSG code, we considered: primary outlet water temperature and mass flow; secondary inlet water temperature and mass flow; secondary system pressure; secondary water level; total steam mass flow; generator power. The generic measurement vector is composed of the values of these 8 measurable signals taken at the same instant. Along the 50 s-long transients, we have considered 10 sampling instants, taken every 4 s, from 7 s to 43 s. For improved classification performance, the dimensionality of the measurement vector has been reduced to four features by means of a nonlinear principal component analysis performed by an autoassociative artificial neural network [4]. The application of the evolutionary FCM classifier to all the available 3500 data leads to completely unsatisfactory results. For example, with a degree of confidence $\gamma = 0.9$ (lower limit on the values of membership of a data point to a given class), only 35.59% of the transients are correctly assigned to the right class (23.60% with the standard, Euclidean-based FCM algorithm), whereas 29.00% (12.43% with the standard, Euclidean-based FCM algorithm) are assigned to the wrong class and 35.41% (69.97% with the standard, Euclidean-based FCM algorithm) are not assigned to any class. Evidently, the four features which are taken to physically characterize the data give an insufficient description of the four classes to which the data belong. The reason for these unsatisfactory results is the little compactness of the a priori known physical classes in the features space. Physically, this is due to the fact that the forcing functions causing the transients may vary in two directions, above or below the nominal values, leading to different and possibly opposite consequences on the behavior of the plant signals in the features space.
3. A supervised classifying algorithm based on the splitting of the less compact classes

3.1. Introduction
To improve the previous results, we investigate a new approach based on the iterative splitting of the most dispersed classes into two more compact subclasses. The method proceeds as follows. (1) The least compact physical class $\Gamma_i$ is identified as the one with the highest value of the quantity $\sigma_i$:

$$\sigma_i = \frac{\sigma_i^f}{N_i^f} \qquad (4)$$

where $\sigma_i^f = \sum_{k=1}^{n} \mu_i(\bar{x}_k)\, d_{\underline{M}}^2(\bar{x}_k, \bar{v}_i)$ can be taken as representing the (fuzzy) variation, or dispersion, of the data in class $i$ and $N_i^f = \sum_{k=1}^{n} \mu_i(\bar{x}_k)$ is the fuzzy number of samples in class $i$. (2) Then, the standard, Euclidean-based FCM algorithm is applied to the data belonging to such class in order to split it into two fuzzy sub-classes, $\Gamma_{i1}$ and $\Gamma_{i2}$, such that $\Gamma_i = \Gamma_{i1} \cup \Gamma_{i2}$, where the union of the two subclasses $\Gamma_{i1}, \Gamma_{i2}$ may be defined as:

$$\mu_{\Gamma_{i1} \cup \Gamma_{i2}}(\bar{x}_k) = \min\{1,\; \mu_{\Gamma_{i1}}(\bar{x}_k) + \mu_{\Gamma_{i2}}(\bar{x}_k)\} \qquad (5)$$

Note that in this way we obtain two fuzzy subclasses from one crisp class, and the a priori known crisp partition $\Gamma = \{\Gamma_1, \Gamma_2, \dots, \Gamma_c\}$ is substituted by the fuzzy partition $\Gamma(C) = \{\Gamma_1, \Gamma_2, \dots, \Gamma_{i1}, \Gamma_{i2}, \dots, \Gamma_c\}$, where for clarity we have made explicit the number of classes $C = c + 1$ as argument of the partition. This new partition is also a priori known, with the data of subclasses $\Gamma_{i1}$ and $\Gamma_{i2}$ actually belonging to the same physical class $\Gamma_i$. (3) The FCM evolutionary algorithm described in Section 2 is used to partition the data into $c + 1$ subclusters, i.e. to determine the partition $\Gamma^*(C) = \{\Gamma_1^*, \Gamma_2^*, \dots, \Gamma_{i1}^*, \Gamma_{i2}^*, \dots, \Gamma_c^*\}$. (4) However, for our classification purposes we are interested in a partition $\Gamma^*$ "as close as possible" to $\Gamma$, not in the obtained, enlarged partition $\Gamma^*(C)$, so that in the end we have to merge the obtained subclusters $\Gamma_{i1}^*$ and $\Gamma_{i2}^*$ by means of the union operator (Eq. 5), i.e. $\Gamma_i^* = \Gamma_{i1}^* \cup \Gamma_{i2}^*$. (5) At this point, the value of the partition distance $D(\Gamma, \Gamma^*)$ can be computed: if it exceeds the pre-defined convergence threshold $\epsilon$, we repeat the procedure from step 1; otherwise we stop.
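The selection of the least compact class can be sketched as follows, here with the Euclidean metrics, crisp memberships and hypothetical two-dimensional data; the actual bisection of the selected class by a standard FCM run is indicated in the closing comment.

```python
import numpy as np

rng = np.random.default_rng(2)

def fuzzy_dispersion(X, mu_i, v_i):
    """sigma_i of Eq. (4): fuzzy within-class variation over fuzzy sample count."""
    d2 = ((X - v_i) ** 2).sum(axis=1)  # squared distances to the class centroid
    return (mu_i * d2).sum() / mu_i.sum()

def least_compact_class(X, memberships, centroids):
    sigmas = [fuzzy_dispersion(X, mu, v) for mu, v in zip(memberships, centroids)]
    return int(np.argmax(sigmas))

# Two hypothetical crisp classes; class 1 is deliberately more dispersed.
X = np.vstack([rng.normal(0, 0.2, (50, 2)), rng.normal(3, 1.5, (50, 2))])
mu = [np.r_[np.ones(50), np.zeros(50)], np.r_[np.zeros(50), np.ones(50)]]
v = [X[:50].mean(0), X[50:].mean(0)]

print("class to split:", least_compact_class(X, mu, v))  # expected: 1
# The selected class would then be bisected by a standard Euclidean FCM run
# into two fuzzy subclasses, merged again after clustering via Eq. (5).
```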
3.2. Application of the developed algorithm to the nuclear transients data

The iterative algorithm has been applied to the steam generator transients data of Section 2.2. A set of 2800 transients has been used for training the
algorithm, with the remaining 700 patterns left aside to test its performance. The minimum value of $D(\Gamma, \Gamma^*)$ is reached after three splitting iterations, i.e. with $C = 7$ subclasses. There is a significant improvement in the classification of the 700 test transients as compared to the plain evolutionary FCM classifier. For example, the iterative splitting algorithm correctly classifies, with a degree of confidence of 0.85, 70.0% of the transients (37.4% with the evolutionary FCM), misclassifies 7.7% (31.0% with the evolutionary FCM) and does not assign to any class 22.3% of the patterns (31.6% with the evolutionary FCM).

4. Conclusions
The reliable and timely classification of signal transients is a matter of paramount importance for the safe operation of nuclear power plants and components. In the present work we have embraced the fuzzy clustering approach to the classification problem and exploited an evolutionary algorithm for detecting the optimal Mahalanobis metrics to be used within a generalized Fuzzy C Means clustering method. To overcome the limitations of the approach with respect to the analysis of dispersed classes we have introduced an innovative algorithm which iteratively splits the less compact classes into more compact subclasses. At each iteration, the extended physical partition is treated by the evolutionary FCM algorithm to identify the corresponding fuzzy clusters. The complete algorithm has been applied to simulated nuclear transients data related to the operation of a steam generator of a pressurized water reactor.
References
1. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press (1981).
2. A. Höld, Nuclear Engineering and Design, 47:1-23 (1978).
3. M. Marseguerra, M.E. Ricotti, and E. Zio, Nucl. Sci. and Eng., 124 (1996).
4. M. Marseguerra, E. Zio, and F. Marcucci, Continuous monitoring and calibration of process sensors by autoassociative artificial neural network, Nuclear Technology (2004). Submitted for publication.
5. X.L. Xie and G. Beni, IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(8):841-847 (1991).
6. B. Yuan and G. Klir, Data driven identification of key variables, in: Da Ruan (ed.), Intelligent Hybrid Systems: Fuzzy Logic, Neural Networks, and Genetic Algorithms, pages 161-187, Kluwer Academic Publishers (1997).
FRAME STRUCTURE OF CHINA'S NUCLEAR EMERGENCY DECISION-MAKING OPERATION SYSTEM (COSY) AND EVALUATING SUBSYSTEM FUZZY DECISION-MAKING METHOD

NENGZHI LU
Computer Science Department of Shanghai Maritime University, Shanghai, 200135

JIALI FENG
Computer Science Department of Shanghai Maritime University, Shanghai, 200135
YONGXING ZHANG
Institute of Atomic Energy of China, Beijing, 100432

In accordance with the operating conditions of "RODOS" (Real-time Online Decision Support System), which was developed by the European Community, this research work has combined itself with the actual conditions existing in China and made an exploratory discussion upon the construction of COSY and the application of fuzzy decision-making theory and methods in early nuclear emergency decisions.
1. Introduction
The nuclear emergency decision supporting system is a comprehensive system relating to such fields as society, nuclear engineering, politics, economy, ecological protection and radiation protection. Meanwhile, in the nuclear emergency decision-making system there are many uncertain factors, such as social and public psychology factors, which directly affect the behaviors of the decision-makers. Therefore, the development of such a system is of great difficulty. In accordance with our country's current situation, this research work discusses the construction of COSY, makes exploratory discussions upon the nuclear emergency computer-assisted decision-making model, and utilizes the method of fuzzy mathematics to quantize fuzzily the uncertain factors relating to the nuclear emergency decision-making course.
2. COSY overall structure
COSY, one of the five main subsystems of CRODOS, is mainly used for monitoring and managing the whole CRODOS system, including the interdependence of all CRODOS programming blocks, data management, model/method storage and recall, knowledge acquisition and utilization, rules inference, problem resolving, geographical information analysis, results display and interactive control of system operation, etc. The COSY overall frame structure is shown in Fig. 1.
Fig. 1. COSY overall frame structure (decision maker, dialogue system, COSY, external programs).
Table 1. The grades of attribute factor weight given by experts. The seven attribute factors (columns) are: economic costs; stipulated standards; influence on human health (expected maximum exposure to the individual, avoidable individual doses, avoidable collective doses); social factor; psychology of the public. Each row combines a protective measure (shelter, withdrawal, administration of iodine tablets) with the position of the doses relative to the stipulated standards (up, approaches, below).

| measure / doses | economic costs | stipulated standards | expected max. individual exposure | avoidable individual doses | avoidable collective doses | social factor | psychology of public |
| shelter / up | 0.4 | 0.8 | 0.6 | 1 | 0.7 | 0.4 | 0.5 |
| shelter / approaches | 0.5 | 0.7 | 0.5 | 1 | 0.6 | 0.5 | 0.7 |
| shelter / below | 0.6 | 0.5 | 0.2 | 1 | 0.5 | 0.7 | 0.9 |
| withdrawal / up | 0.6 | 0.8 | 1 | 0.8 | 0.7 | 0.4 | 0.5 |
| withdrawal / approaches | 0.8 | 0.7 | 0.6 | 1 | 0.6 | 0.5 | 0.7 |
| withdrawal / below | 0.9 | 0.5 | 0.3 | 1 | 0.5 | 0.8 | 0.9 |
| iodine tablets / up | 0.4 | 0.3 | 0.8 | 0.7 | 0.9 | 0.7 | 0.2 |
| iodine tablets / approaches | 0.4 | 0.7 | 0.5 | 0.9 | 0.6 | 0.4 | 0.6 |
| iodine tablets / below | 0.5 | 0.5 | 0.2 | 1 | 0.5 | 0.6 | 0.8 |
Note: under identical measures, as the same factor registers an increase (or decrease) by degrees, a bigger value means a bigger weighting.

Table 2. The experts give the grades of the social factor in the decision scheme.

| influencing factor | W_i | ranges | R_i |
| number of withdrawing persons | 0.26 | >10^4 / 2x10^3-10^4 / 5x10^2-2x10^3 / 1-500 / 0 | 0 / 0.25 / 0.5 / 0.75 / 1 |
| sheltering area | 0.18 | >20 km / 10-20 km / 3-10 km / 1-3 km / 0 | 0 / 0.25 / 0.5 / 0.75 / 1 |
| injured in traffic accidents (number of persons) | 0.16 | >20 / 10-20 / 3-10 / 1-3 / 0 | 0 / 0.25 / 0.50 / 0.75 / 1 |
| avoidable influence upon human health (number of instances) | 0.21 | >20 / 10-20 / 3-10 / 1-3 / 0 | 1 / 0.75 / 0.50 / 0.25 / 0 |
| nuclear safety culture in the local area (support headcount percentage) | 0.19 | 90-100 / 70-90 / 50-70 / 30-50 / 10-30 / <10 | 1 / 0.8 / 0.6 / 0.4 / 0.2 / 0 |
3. COSY information structure
The information structure of the five bases (text base, database, model base, method base and knowledge base of COSY) has been expressed in a unified manner as: R(A1, A2, ..., Am) [1]. Such a unified information structure in the entire system, by use of predicate logic and the relation schema, allows the unified use of a relational management system to manage and operate the bases.
4. Determination of the membership function in nuclear emergency fuzzy decision-making
After conducting a profound study, we directly marked scores in terms of 7 uncertain factors, and then determined the membership function by use of the fuzzy statistical method. The score-marking table is shown in Table 1.
5. Fuzzy factorial quantization

In the course of nuclear emergency decision-making, there exist uncertain fuzzy factors, which must be quantized by the method of inviting experts to conduct value assignment directly upon these factors, as shown in Table 2. The values of all the influencing factors are then summed up; this course is the quantization of the factor.
6. Evaluation of fuzzy decision-making
It is essential to consider these factors in a comprehensive way before a satisfactory decision-making strategy can be chosen. For this reason, the fuzzy decision-making evaluation model has adopted the weighted average model $M(\cdot, +)$: $B = A \circ R$, where $A$ is the weighting vector (membership function) and $R = (R_{ij})$ is the fuzzy relation, valued in $[0, 1]$, whose entries are the attribute factor grades. In accordance with the calculation result, the optimal strategy $b_j = \max\{b_1, b_2, \dots, b_m\}$ can be reached, upon which discussions are held and experts' opinions are inquired.
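A minimal sketch of the weighted average evaluation, using the W_i weights of Table 2 and hypothetical strategy grades for the relation R:

```python
import numpy as np

# Weighting vector A over the social influencing factors (Table 2, W_i column).
A = np.array([0.26, 0.18, 0.16, 0.21, 0.19])

# Fuzzy relation R: one row per factor, one column per candidate strategy;
# entries are graded R values in [0, 1] (the strategies here are hypothetical).
R = np.array([[0.75, 0.25, 0.50],
              [0.50, 0.75, 0.25],
              [0.75, 0.50, 0.50],
              [0.25, 0.75, 1.00],
              [0.60, 0.80, 0.40]])

B = A @ R                 # weighted average model M(., +): B = A o R
best = int(np.argmax(B))  # b_j = max(b_1, ..., b_m)
print("strategy grades:", np.round(B, 3), "-> optimal strategy:", best)
```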
7. Concluding remarks
Utilizing the relation schema to unify the information structure of COSY, and applying fuzzy decision-making theory in the field of nuclear emergency decision-making, will have a positive influence upon boosting China's research into the field of nuclear emergency.

Reference
1. Qingda Yao, Nengzhi Lu, New Generation Decision Support System, Guangdong Science & Technology Press, January 1993.
SURVEILLANCE TESTS OPTIMIZATION OF A PWR AUXILIARY FEEDWATER SYSTEM BY CONSTRAINED GENETIC PROGRAMMING

RAFAEL P. BAPTISTA(a), ROBERTO SCHIRRU(a), CLAUDIO M. N. A. PEREIRA(a,b), CELSO M. F. LAPA(b), ROBERTO P. DOMINGOS(a)

(a) Programa de Engenharia Nuclear, Universidade Federal do Rio de Janeiro, COPPE/UFRJ, Cidade Universitária, Centro de Tecnologia, Ilha do Fundão, block G, room 206, PO Box 68.509, 21945-970, Rio de Janeiro, Brazil
(b) Instituto de Engenharia Nuclear, Comissão Nacional de Energia Nuclear, CNEN/DIRE, Ilha do Fundão s/n, PO Box 68.550, 21945-970, Rio de Janeiro, Brazil
In this work, a Constrained Genetic Programming (CGP) based approach is used to automatically generate optimum surveillance test policies for a PWR Auxiliary Feedwater System (AFWS). The main idea is to produce surveillance test schedules that maximize the system reliability. The AFWS reliability model is the same as that used in previous works, where the use of a Genetic Algorithm (GA) was proposed to solve the same problem. The results demonstrated several advantages of the CGP-based approach over the GA-based model.
1. Introduction

Nuclear Power Plant (NPP) safety systems often operate in a standby mode, ready to start under demand. However, while in standby, unrevealed failures of their components may turn the system unavailable when requested. In order to minimize this shortcoming, components of standby systems are submitted to surveillance tests, in order to reveal failures before real demands. Standard surveillance test policies (STP) are often based on periodical tests. However, previous works [1,2] have demonstrated that genetic algorithms (GA) [3,4] are able to produce better STPs using a model of non-periodic intervals between test interventions. In this work, using the probabilistic model proposed by Lapa et al. [5], we have developed a constrained genetic programming (CGP) based model for automatically planning an optimum STP, considering the same objectives and constraints used by Lapa et al. [1]. The results have demonstrated not only the feasibility of the use of a CGP-based model in STP optimization, but also indicate several advantages over the GA-based approach.
2. The Genetic Programming

Genetic Programming (GP) [6,7] is an evolutionary computation technique which evolves a group of programs. The main idea is to teach computers to program themselves, that is, to produce a program that satisfies a given behavior [7]. The search mechanism of the GP can be described as a "create-test-modify" cycle, similar to the way humans develop their programs [8]. Unlike in classical GAs, individuals in the GP are not fixed-length binary strings, but computer programs, usually represented in tree format (see Fig. 1).
Figure 1: Example of solution candidate in the GP.
The solution of a problem consists of finding the best program in a space of programs composed of functions, variables and constants adapted to the domain of the problem.
3. System Description

The Auxiliary Feed-Water System (AFWS) comprises two subsystems. One of them uses a single turbine-driven pump (TDP) to supply both steam generators (SGs). The other one has two electrical motor-driven pumps (MDP1 and MDP2), and each one supplies one of the SGs. Under normal conditions, all pumps use water from the Auxiliary Feed Water Tank (AFWT). Figure 2 shows the AFWS block diagram considering the components involved in the surveillance tests policy. The probabilistic analysis was developed considering that the AFWS failure event was the insufficient water supply to both SGs. The adopted test and maintenance outage times for each AFWS component are similar to those published by Harunuzzaman et al. [9] considering similar components.
Figure 2: Simplified scheme of the AFWS.
4. Genetic modeling
Modeling a problem into a CGP consists of defining: i) a group of terminals (variables and constants) and a group of functions; ii) the constraints of the problem; iii) the structure that encodes solution candidates; and iv) an objective function. The group of functions is comprised of: function RAZZ, which implements the probabilistic model to calculate the fitness; function C, which represents the schedule of one component; function t, which represents a scheduled time for a surveillance test; terminal function T, which is similar to function t but works as a terminal (it represents the end of the scheduling); and terminals N1, N2 and N3, randomly generated numbers used to compose a scheduled time ranging between 0 and 480. The constraints used to preserve valid structures are: i) only function RAZZ is allowed in the root; ii) function RAZZ allows only functions C as arguments; iii) functions C allow only functions t or T as arguments; iv) function t has 4 branches: branch[0]: function t or T, branch[1]: terminal N3, branch[2]: terminal N2 and branch[3]: terminal N1; v) functions T have 3 branches: branch[0]: terminal N3, branch[1]: terminal N2 and branch[2]: terminal N1. An example of a solution candidate can be seen in Figure 3 and its decoded schedule in Table 1.
Figure 3: Graphical representation of a solution candidate.
Table 1. Decoded schedule for the example in Figure 3.

| Component | Surveillance test schedule (days of the test interventions) |
| ... | {25, 140, 380} |
| C3 | {50, 150, 225} |
| C4 | {100, 270} |
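A minimal sketch of how such a constrained individual decodes into per-component schedules. The composition of a scheduled day from the terminals N1, N2 and N3 (here N3*100 + N2*10 + N1, clipped to the 480-day horizon) and the component labels are hypothetical stand-ins for the actual encoding.

```python
# Each C-branch is a chain of t-nodes ending in a T-node; every node carries
# the three random terminals (N3, N2, N1) that compose one scheduled day.
def day(n3, n2, n1):
    return min(480, n3 * 100 + n2 * 10 + n1)  # assumed composition rule

def decode_component(chain):
    """chain: list of (N3, N2, N1) tuples; the last tuple plays the role of T."""
    return sorted(day(*node) for node in chain)

individual = {            # RAZZ root with three C-function arguments
    "C2": [(0, 2, 5), (1, 4, 0), (3, 8, 0)],
    "C3": [(0, 5, 0), (1, 5, 0), (2, 2, 5)],
    "C4": [(1, 0, 0), (2, 7, 0)],
}
schedule = {c: decode_component(ch) for c, ch in individual.items()}
print(schedule)  # {'C2': [25, 140, 380], 'C3': [50, 150, 225], 'C4': [100, 270]}
```

The printed schedule reproduces the day values of Table 1.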
The objective function, Fit (see Equation 1), is the average system unavailability considering the surveillance test policy, and it is calculated according to the probabilistic model previously described by Lapa et al. [5].

$$\mathrm{Fit} = \frac{1}{T_m} \int_{0}^{T_m} \bar{A}(t)_{sys}\, dt \qquad (1)$$

where $\bar{A}(t)_{sys}$ is the system's instantaneous unavailability and $T_m$ is the mission time. A more detailed description of the reliability model can be found in Lapa et al. [5].
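A minimal sketch of Eq. (1) for a single standby component in place of the full system model, assuming an exponential standby failure law and illustrative failure-rate and outage figures rather than the AFWS probabilistic data.

```python
import numpy as np

def avg_unavailability(schedule, lam=1e-4, outage_h=4.0, horizon_days=480):
    """Time-averaged unavailability of one standby component (Eq. 1 sketch).

    Between tests the unavailability grows as 1 - exp(-lam * t_since_test);
    during each test the component is down for outage_h hours. Both lam and
    outage_h are assumed, illustrative values.
    """
    t = np.linspace(0.0, horizon_days * 24.0, 200_000)  # mission time [h]
    tests_h = np.array(sorted(schedule), float) * 24.0
    last_test = np.zeros_like(t)
    for th in tests_h:
        last_test[t >= th] = th                          # most recent test time
    q = 1.0 - np.exp(-lam * (t - last_test))             # standby unavailability
    for th in tests_h:
        q[(t >= th) & (t < th + outage_h)] = 1.0         # unavailable while tested
    return q.mean()                                      # (1/Tm) * integral of A(t)

print(avg_unavailability([25, 140, 380]))
```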
5. Experiments and Results
The developed system (SisGP) was based on the CGP tool lil-gp, which was interfaced to a reliability calculation model based on the probabilistic safety analysis (PSA) of the AFWS. Table 2 shows the minimum average unavailability values obtained in 3 experiments using SisGP, as well as those obtained by the GA-based methodology and the standard policy.
Table 2. Results and comparisons.

| Policy | Average unavailability |
| SisGP1 | 5.2401 x 10^-3 |
| SisGP2 | 5.2153 x 10^-3 |
| SisGP3 | 5.1713 x 10^-3 |
| GA | 5.2492 x 10^-3 |
| Standard | 5.9592 x 10^-3 |
As already observed by Lapa et al. [1], we see in Table 2 that the standard policy does not appear as the best alternative to minimize the system unavailability under the previously stated suppositions. The result obtained by the GA [1] is quite good compared to that obtained by the standard policy. Using the CGP, we could reach unavailability levels slightly lower than the one found by the GA. The improvements in the unavailability levels are not very expressive in this application, but we should keep in mind that nuclear systems have inherently high reliability/availability (due to redundancies, factory requirements, etc.). In other industrial systems, where the acceptable medium availability is lower, the impact should be more expressive. Table 3 shows the schedule for SisGP3.
Table 3. SisGP3 schedule (only partially legible).

| Component | Surveillance test schedule (days of the interventions) |
| V1 | {20, 32, 42, 54, 64, 74, 98, 106, 120, 130, 142, 158, 170, 180, 192, 202, 220, 234, 244, 260, 274, 284, 300, 312, 324, 334, 348, 354, 366, 376, 388, 400, ...} |
| ... | {..., 242, 254, 266, 282, 296, 306, 322, 336, 344, 352, 364, 372, 384, 392, 398, 406, 420, 426, 434, 442, 448, 460, 468} |
| MDP2 | {36, 62, 110, 134, 182, 230, 266, 306, 336, 352, 372, 392, 420, 442, 460} |
As already observed by Lapa et al. [1], the test scheduling proposed by the GA reveals several intelligent features. Such features were also observed in the CGP's results. For example: valves, for which the outage times are 4 times shorter than the pump outages, tend to undergo more tests. The TDP, to which a high outage time is associated in case of corrective maintenance, presents few test interventions. The CGP could learn that components in series, such as the supercomponent formed by V1 and MDP1 (the same occurs with V4 and MDP2), should better stop together, and that components in parallel should alternate interventions (for example, MDP1 should not stop together with MDP2 or TDP). It must be emphasized that such knowledge was never given to the CGP system.
6. Conclusions

The proposed methodology has demonstrated to be efficient and robust in finding good surveillance test policies for the AFWS. The results ratify previous experience, which used the GA as the optimization technique. In this work, the importance of using evolutionary computation in complex multi-modal optimization, as demonstrated in a previous work [1], is once again ratified. The results were slightly better than those obtained by the GA-based model; moreover, we should highlight several advantages of the CGP model. Due to its fixed-length structure, the GA-based method finds difficulties when dealing with greater systems or more fine-grained discretization. Unlike the GA, the CGP varying-length structure allows a better fitting to the problem
solution space, providing more flexibility with low computational overhead and allowing the application to greater and more complex problems. Future works in this research line should investigate the application of a niched and parallel CGP.
Acknowledgments

Rafael P. Baptista is supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). Cláudio M.N.A. Pereira is supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). Celso M.F. Lapa is supported by Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ).
References
1. C.M.F. Lapa, C.M.N.A. Pereira and P.F. Frutuoso e Melo, International Journal of Intelligent Systems 17(8), 813-831 (2000).
2. C.M.N.A. Pereira and C.M.F. Lapa, Annals of Nuclear Energy 30, 1665-1675 (2003).
3. J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press (1975).
4. D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley (1989).
5. C.M.F. Lapa, C.M.N.A. Pereira and A.C.A. Mol, Nuclear Engineering & Design 196, 95-107 (2000).
6. J.R. Koza, Proceedings of the 11th International Joint Conference on Artificial Intelligence (IJCAI-89), 768-774 (1989).
7. J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press (1992).
8. T. Yu, Proceedings of the 1999 Congress on Evolutionary Computation, pp. 652-659, IEEE Press (1999).
9. M. Harunuzzaman and T. Aldemir, Nuclear Technology 113, 354-367 (1996).
NEW PROPOSAL OF REACTIVITY COEFFICIENT ESTIMATION METHOD USING A GRAY-BOX MODEL IN NUCLEAR POWER PLANTS

MICHITSUGU MORI, YUICHI KAGAMI
R&D Center, Tokyo Electric Power Company, 4-1, Egasaki-cho, Tsurumi-ku, Yokohama, 230-8510, Japan
SHIGERU KANEMOTO*, TETSUO TAMAOKI, MITSUHIRO ENOMOTO, SHINICHIRO KAWAMURA
Power and Industrial Systems R&D Center, Toshiba Corporation, 8 Shinsugita-cho, Isogo-ku, Yokohama, 235-8523, Japan
A new method for estimating reactivity parameters, such as moderator temperature coefficient (MTC) and void reactivity coefficient (VRC), is proposed using steady-state noise data. In order to solve the ill-posed problem of reactivity parameter estimation, a concept of a gray box model is newly introduced. The gray box model includes a first principle based model and a black-box fitting model. The former model acts as a priori knowledge based constraints in a parameter estimation problem. After establishing the gray box and noise source models, the maximum likelihood estimation method based on Kalman filter is applied. Furthermore, it is shown that the frequency domain approach of the gray box model is useful in the case of VRC estimation. The effectiveness of the proposed algorithms is shown through numerical simulation and actual plant data analysis.
1.
Introduction
Monitoring of reactivity parameters, such as the moderator temperature coefficient (MTC) or the void reactivity coefficient (VRC), will be of prime importance for the future high-burnup and Mixed Oxide (MOX) fuel assemblies. Noise analysis is an interesting technique for estimating these parameters, since it enables them to be monitored without disturbing normal operations of the reactor. Hence, there have been many attempts to develop estimation algorithms for the MTC or VRC [1-7]. However, it has also been pointed out that there are problems in that the estimated parameters are biased from the true values. Generally speaking, these parameter estimation problems are typical ill-posed problems, since the parameters should be estimated from a noisy and insufficient number of sensor signals. Hence, the estimation algorithm should be designed carefully by utilizing all knowledge about the problem.
* E-mail: [email protected]
The present paper proposes a new algorithm to solve the above ill-posed parameter estimation problem from noisy observations. The algorithm is based on the following three ideas [1,2]:
1. A gray box model, which combines the first principle model (white-box model) and a black box model (fitting model), is utilized in the parameter estimation algorithm. The first principle model plays the role of constraints in the ill-posed parameter estimation problem.
2. System and observation noise sources are assumed in the gray box model to reconstruct the noisy behavior of the observation signals. The distinction between system and observation noise contributes to a rational design of the statistical parameter estimation criteria.
3. Parameter estimation criteria are assumed in both the time and frequency domains, depending on the problem features. The maximum likelihood estimation, which is the most rational algorithm in the statistical parameter estimation area, can be applied in the time domain approach. On the other hand, the frequency domain approach has the benefit of easily including heuristic knowledge in the criteria.
Concretely, the one-point reactor kinetics equation for the neutron dynamics is combined with an autoregressive fitting model for the temperature or core flow dynamics. Also, it is assumed that the system noise sources are generated by temperature, void and core flow fluctuations. Once the system and noise dynamics model is assumed, the standard maximum likelihood parameter estimation algorithm based on the Kalman filter can be used. Here, the gray box model is recognized as a priori knowledge based constraints, as mentioned before. In the present paper, the MTC will be estimated by this approach. It will be shown that this maximum likelihood based approach gives a good estimation result compared with the conventional method. On the other hand, the VRC should be estimated by another approach based on frequency domain estimation criteria. The VRC estimation problem is more complex than the MTC one, since the key state parameter of void fraction cannot be directly observed. This means that the estimation results will be more sensitive to the modeling inaccuracies of the system dynamics and noise sources. In order to overcome these inaccuracies, a heuristic estimation criterion defined on the frequency domain transfer function and coherence function is introduced in the present paper. The two kinds of statistical functions are necessary to estimate the dynamics parameters and noise source magnitudes in the gray box model. Although the individual models and algorithms of the present paper are not new ones, their appropriate combination to address the essential feature of the
problem will give better results than the conventional methods. The present paper shows these results through numerical simulation and actual plant data analysis.
2. Gray Box Model Based Parameter Estimation Method
2.1. Estimation Method of Moderator Temperature Coefficient

The present MTC estimation method is based on the following point kinetics reactor model:

$$\frac{dn}{dt} = \frac{\rho-\beta}{\Lambda}\,n + \lambda c + v_n, \qquad \frac{dc}{dt} = \frac{\beta}{\Lambda}\,n - \lambda c + v_c, \qquad \frac{dT}{dt} = a\,(u + v_T), \qquad \rho = K_T T + \rho_0 \qquad (1)$$
Here, three state variables are assumed: neutron flux, precursor density and moderator temperature. The $u(t)$ and $a$ are an input signal and a gain parameter controlling the temperature behavior. The gain parameter is obtained by fitting the observation data. The other symbols are given as known design parameters, which play the role of constraints for signal behavior prediction. In the conventional MTC estimation, the reactivity $\rho(t)$ is first calculated by the inverse kinetics equation of Eq. (1); the coefficient $K_T$ is then estimated by linear least squares fitting between the temperature and reactivity signals. The proposed method, however, estimates the coefficient $K_T$ without using the reactivity, by applying a Kalman filter. For this purpose, Eq. (1) is first converted into a linear, discrete form as follows:

$$X_n = F X_{n-1} + G u_n + Q_n, \qquad Y_n = H X_n + R_n, \qquad (2)$$
$$X = [n, c, T]^T, \qquad F = e^{A\Delta t}, \qquad G = A^{-1}\left(e^{A\Delta t} - I\right)B$$
Here, the prompt jump approximation is also used. In Eq. (2), the system noise $Q_n$ and the observation noise $R_n$ are defined. The maximum likelihood estimation is based on the probability density function of the prediction residuals of the observation data $Y_n$ [8]. By applying the standard Kalman filter algorithm, the maximum likelihood can be calculated as follows:
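A sketch of this likelihood in the standard innovation form (the usual linear-Gaussian result; the exact weighting used in the paper is an assumption here):

$$\log L(K_T) = -\frac{1}{2}\sum_{n=1}^{N}\left[\log\det S_n + e_n^{T} S_n^{-1} e_n\right] + \text{const.} \qquad (3)$$

where $e_n = Y_n - H\hat{X}_{n|n-1}$ is the one-step prediction residual and $S_n$ its covariance, both produced by the Kalman filter recursion.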
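A minimal Python sketch of this likelihood-maximization loop (the model construction, noise covariances and the scalar search over $K_T$ are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kalman_loglik(K_T, Y, u, build_model, Q, R):
    """Innovation-form log-likelihood of the data for a given K_T."""
    F, G, H = build_model(K_T)          # discretized gray box model, Eq. (2)
    nx = F.shape[0]
    x = np.zeros(nx)                    # state estimate
    P = np.eye(nx)                      # state covariance
    ll = 0.0
    for y_n, u_n in zip(Y, u):
        # prediction step
        x = F @ x + G * u_n
        P = F @ P @ F.T + Q
        # innovation and its covariance
        e = y_n - H @ x
        S = H @ P @ H.T + R
        ll += -0.5 * (np.log(np.linalg.det(S)) + e @ np.linalg.solve(S, e))
        # update step
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ e
        P = (np.eye(nx) - K @ H) @ P
    return ll

def estimate_mtc(Y, u, build_model, Q, R, bounds=(-1e-3, 0.0)):
    """Maximize the likelihood over the scalar coefficient K_T."""
    res = minimize_scalar(lambda k: -kalman_loglik(k, Y, u, build_model, Q, R),
                          bounds=bounds, method="bounded")
    return res.x
```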
In the proposed algorithm, the coefficient $K_T$ is obtained by maximizing the likelihood through iterated calculation of Eq. (3), according to standard nonlinear programming methodology. It should be noted that the present method does not directly use the reactivity. In the conventional estimation method, which numerically calculates the reactivity, the observation noise included in the neutron signal measurements, which can originally be assumed to be white Gaussian noise, is changed into non-white noise through the inverse kinetics calculation. The estimation based on the calculated reactivity therefore diverges from the optimum. The most important benefit of the present method is that it avoids this numerical processing of the observation noise, assuring the validity of the statistical optimization procedure.

2.2. Estimation of Void Reactivity Coefficients

The VRC is defined as the sensitivity coefficient of reactivity to void fraction. The VRC estimation problem is more complex than the MTC one, since the key parameter, the void fraction, cannot be directly measured and the estimation model becomes complex. The dynamics model used in the present estimation is shown in Fig. 1, where two observation signals, neutron flux and recirculation core flow rate, are shown together with the system and observation noise $Q$ and $R$. Here, the model is expressed in the frequency domain. The core flow rate is taken as the input of the system and the neutron flux as the output.
Figure 1. Block diagram of BWR core dynamics in frequency domain
This model can be expressed by the following time domain linear model.
$$\dot{X} = A X + B U + Q, \qquad Y = H X + R \qquad (4)$$

(The sparse system matrices $A$ and $B$ contain the kinetics parameters and the void and flow dynamics parameters.)
The above-mentioned maximum likelihood estimation method can be applied by using Eq. (4). However, in the case of VRC estimation, the frequency domain approach is more appropriate, in order to introduce heuristic knowledge into the estimation criteria. The main reason for the necessity of heuristic knowledge is the complexity of the system noise sources in BWR plants. According to current knowledge of the noise sources in BWR plants, there are three kinds of dominant noise sources: the random void generation noise, the forced recirculation core flow noise and the fuel channel inlet flow noise, which are shown in Fig. 1. Although these noise sources cannot be observed directly, we can infer them from various kinds of statistical analysis, such as coherence analysis or multivariable autoregressive modeling analysis [9]. The existence of these noise sources is a common feature of various types of commercial BWR plants; however, their magnitudes differ depending on the plant type. Hence, the estimation of these noise source magnitudes is also important, in addition to the VRC estimation. In order to estimate these two kinds of parameters at once, the present VRC estimation method introduces the following multi-function based criteria.
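A plausible form of this criterion (a sketch only; the weights $w_k$ and the squared-error norm are assumptions, not the paper's exact definition):

$$J = \sum_{f}\left[w_1\big(AP_m(f)-AP(f)\big)^2 + w_2\big(CF_m(f)-CF(f)\big)^2 + w_3\big(G_m(f)-G(f)\big)^2\right] \qquad (5)$$

where the subscript $m$ denotes quantities calculated from the measured signals and the unsubscripted quantities those predicted by the gray box model.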
Here, $AP(f)$, $CF(f)$ and $G(f)$ denote the auto-power spectral density (APSD), the coherence function (CF) and the transfer function (TF) gain. They are calculated from the cross power spectral density expressed by the system matrices $A$ and $B$ in Eq. (4), the APSD of the input core flow signal $P_W(f)$, and the system and observation noise APSDs $P_Q$ and $P_R$, as follows:
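In the standard form for a linear system driven by an input with spectrum $P_W(f)$ (a sketch under the usual assumptions of mutually uncorrelated noise sources, with $G(j2\pi f) = H(j2\pi f I - A)^{-1}B$):

$$AP(f) = |G(j2\pi f)|^2 P_W(f) + P_Q(f) + P_R(f), \qquad CF(f) = \frac{|G(j2\pi f)|^2\,P_W(f)}{AP(f)}$$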
The coherence function term in the criteria is necessary for estimating the noise source magnitudes, and the transfer function term is necessary for the VRC estimation [2]. The APSD term contributes to both estimations. It is shown in the following analysis that this heuristic definition is superior to the simple likelihood criterion in the time domain; an example of this comparison is given in our previous work [2].

3. Estimation Results
3.1. Estimation Results of Moderator Temperature Coefficient

The quality of the parameter estimation should be evaluated from the viewpoints of bias and scattering of the estimated values. The scattering of the estimated values is important from the practical viewpoint, since it affects the measurement time required to store sufficient observation data. In order to clarify the performance of the present method, a Monte Carlo simulation was made. A number of different random time series of neutron and temperature signals were generated numerically and used for the parameter estimation. In Fig. 2(a), an example of a time series waveform is shown. Fifty independent estimations were performed; they are shown in Fig. 2(b) for both the present method and the conventional fitting method. It is clear that the present method is superior to the conventional one, with smaller scattering of the estimated values. The reason for the good performance of the present method was given in the previous section. The smaller scattering is expected to shorten the measurement time required for MTC estimation.
(a) Simulated neutron and temperature signals
(b) Estimated moderator temperature coefficients
Figure 2. Moderator temperature estimation results
3.2. Estimation Results of Void Reactivity Coefficient

The VRC is estimated by minimizing the frequency domain criterion of Eq. (5). The APSD, CF and TF are calculated both from the measured neutron and core flow fluctuation signals and from the gray box model defined in Eq. (4). Then the VRC, $K_v$, and the void and channel inlet flow noise source magnitudes, $\sigma_V$ and $\sigma_L$, are estimated by a non-linear optimization algorithm. Typical results are shown in Fig. 3. Here, the dotted and solid lines show the APSD, CF and TF calculated from the measurement data of a BWR-4 type plant and those predicted from the gray box model, respectively. The APSD of the core flow signal is used as the input of the model. The good accordance of both results suggests the validity of both the gray box model and the estimated parameters. It should be noted that the accordance of the CF reflects an appropriate noise source estimation, and that of the TF the VRC estimation. In Fig. 4, the VRC estimation results over a one-year fuel cycle of operation are shown. They are also compared with the VRC calculated by the core performance analysis code. The tendency of the VRC to increase gradually with fuel burnup indicates that the present estimation results are plausible. In the present estimation algorithm, the VRC can also be converted into the reactor stability index, the decay ratio, by using the eigenvalues of the gray box model. The results are shown in Fig. 4(b), together with the results of the conventional stability identification method using autoregressive modeling of neutron noise signals. The results coincide in their average trends, but the scattering of the gray box model estimates is smaller than that of the conventional ones. These results clearly show the effectiveness of the present method.
Figure 3. Calculated frequency characteristics using the gray box model, based on the identified noise sources $\sigma_V$ (0.0165) and $\sigma_L$ (0.7302) (solid line: gray box model; dotted line: measurement)
(Panel labels: estimation results by APRM noise; estimation results by gray box model. Horizontal axes: date, relative scale.)
Figure 4. Estimated and calculated results of the void reactivity coefficient during a one-year fuel cycle (solid line: gray box model estimation; dotted line: core performance code calculation)
4. Summary
We have shown the usefulness of the new reactivity parameter estimation method based on the concept of the gray box model, both in the time and frequency domains. The gray box model acts as an effective set of constraints in the ill-posed MTC and VRC parameter estimation problems and thereby improves the estimation accuracy.
References
1. M. Mori, S. Kanemoto, et al., J. of Atomic Energy Society of Japan, Vol. 3, No. 1, p. 1 (2004). (in Japanese)
2. M. Mori, S. Kanemoto, et al., J. of Atomic Energy Society of Japan, Vol. 3, No. 1, p. 11 (2004). (in Japanese)
3. T. Tamaoki, et al., Proc. of Fall Meeting of Japan Atomic Energy Society, H32 (2001). (in Japanese)
4. Y. Shimazu, J. Nucl. Sci. Technol., Vol. 32, No. 7, p. 622 (1995).
5. C. Demaziere, et al., Prog. Nucl. Energy, Vol. 43, No. 1-4, p. 57 (2003).
6. T. Anderson, et al., Prog. Nucl. Energy, Vol. 43, No. 1-4, p. 35 (2003).
7. M. Mori, S. Kanemoto, et al., Prog. Nucl. Energy, Vol. 43, No. 1-4, p. 43 (2003).
8. K. Akaike, G. Kitagawa, "Practice of Time Series Analysis 2", Asakura-Shoten (1995). (in Japanese)
9. S. Kanemoto, et al., J. of Nuclear Science and Technology, Vol. 20, No. 1, pp. 13-24 (1983).
HTGR HELIUM TURBINE CONCEPTUAL DESIGN BY GENETIC/GRADIENT OPTIMIZATION

LONG YANG
Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing 100084, China
SUYUAN YU, GUOJUN YANG, ZHIYONG HUANG
Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing 100084, China
The helium turbine is the key component of the power conversion unit for the High Temperature Gas-cooled Reactor (HTGR) with a direct gas cycle. Gas turbine design is currently a multidisciplinary process in which the relationships between constraints, objective functions and variables are very complicated. Due to the ever-increasing complexity of the process, it is very difficult for the designer to foresee the consequences of changing certain parts. Furthermore, since helium is very different from air in its thermal properties, it is uncertain whether the operating experience of air turbines can be used in the conceptual design of a helium turbine. It is therefore difficult to produce an optimal design with the general method of adaptation to a baseline. In this paper, the genetic/gradient method is applied to the conceptual design of a helium turbine. The design space is reduced by a genetic algorithm, and the optimal solution is then located by the gradient method. Results show that the convergence speed is high and that an optimal design can be obtained by this method. The experience from air turbine conceptual design can be used in helium turbine conceptual design without evident error.
1. Introduction

The traditional design process of a gas turbine can be divided into four parts: conceptual design, schematic design, technical design and detailed design. The aim of conceptual design is to produce a semi-optimal scheme on the basis of design experience, and to define the geometrical properties and parameters, such as the stage number. The component performance, structure, reliability and material properties are handled in the following three parts. The conceptual design step is very important in the whole process: if any mistake in the conceptual design is discovered later, all the work must be repeated. Because the selection of design parameters in traditional conceptual design is based on experience, the usability of that experience becomes essential [1]. The helium turbine with direct cycle is regarded as the most promising cycle model for a high temperature gas-cooled reactor, but a helium turbine has not yet been reasonably verified anywhere in the world, and there is no experience of helium turbine conceptual design for reference. Helium differs greatly from air in thermal
properties. So a question arises for turbine design using the traditional method: can the design experience of air turbines be correctly used in a helium turbine conceptual design process? In this paper, a genetic/gradient algorithm is employed as a new method of helium turbine conceptual design, and an attempt is made to answer this question. As stated before, the conceptual design is just the first step of the turbine design process. In the traditional process, a designer may find that some parameter selected according to the baseline does not meet the needs of the resulting design, requiring him to go back to the first step, change the baseline and repeat the process. With the genetic/gradient method, such repetition can be avoided, because factors from the later steps can be included in the conceptual design as constraints. Figure 1 shows the difference between the two processes.
Figure 1. Difference between the traditional and genetic/gradient design processes
2. Design method

The HTGR is regarded as a typical fourth-generation advanced reactor, and the helium turbine cycle is considered the probable circulation mode for the HTGR worldwide. A helium turbine for an HTGR with a thermal power of 200 MW is designed using the genetic/gradient method. This turbine has a constant inner
radius. The parameters needed for the conceptual design of this helium turbine are shown in Table 1.
Table 1. Parameters needed for the conceptual design of the helium turbine.

As mentioned above, turbine design is a multidisciplinary process, including thermal dynamics, fluid dynamics, structural analysis and strength analysis as well as other aspects. In the traditional design process, conceptual design focuses on thermal dynamics, and the other parts are mainly considered in the subsequent design steps. In conceptual design using the genetic/gradient method, the parts other than thermal dynamics are treated as constraints on the optimization problem. These constraints include many elements, such as the outline size, blade and plate strength, and fluid dynamics items like the Mach number at the inlet and outlet of every stage and the flow angle. The design space is reduced by these constraints, and the optimization process is kept inside a region in which the rules considered as constraints are followed. Figure 2 shows how the design space is reduced by the constraints [1].
Figure 2. Reduction of design space by constraints (bounds of the constraints delimit the feasible design space)
There are many different chromosome coding methods for genetic algorithms. For simple genetic algorithms, binary coding and float coding are usually used. Programming and computation indicate that there is no evident difference between the two coding modes for this problem, so the more convenient one, float coding, is employed. The total efficiency of the helium turbine is defined as the fitness function, and the efficiency is evaluated by the computation step of the traditional conceptual design method. The optimization variables include the turbine inner radius and the blade length and reaction of each stage; for an eight-stage turbine, there are 17 variables. The population size is 50, which means that 50 chromosomes are involved in every generation. The proportion of mutated variables is 0.02 and the probability of crossover is 0.1. The maximum number of generations is set to 200 [3]. A minimal sketch of such an algorithm is given below.
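The following Python sketch illustrates a float-coded genetic algorithm with these parameters (the fitness function, variable bounds, selection and crossover details are illustrative assumptions; the actual evaluation would call the traditional conceptual-design computation):

```python
import numpy as np

rng = np.random.default_rng(0)

POP_SIZE, N_VARS, N_GEN = 50, 17, 200   # population, variables, generations
P_MUT, P_CROSS = 0.02, 0.1              # mutation / crossover probabilities

def fitness(x):
    """Placeholder: should return the turbine total efficiency for a
    candidate (inner radius, 8 blade lengths, 8 stage reactions)."""
    return -np.sum((x - 0.5) ** 2)      # dummy objective for illustration

def evolve(lower, upper):
    pop = rng.uniform(lower, upper, size=(POP_SIZE, N_VARS))
    for _ in range(N_GEN):
        fit = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argmax(fit)].copy()        # elitism
        # roulette-style selection on shifted fitness
        w = fit - fit.min() + 1e-9
        pop = pop[rng.choice(POP_SIZE, size=POP_SIZE, p=w / w.sum())]
        # arithmetic crossover between consecutive pairs
        for i in range(0, POP_SIZE - 1, 2):
            if rng.random() < P_CROSS:
                alpha = rng.random()
                a, b = pop[i].copy(), pop[i + 1].copy()
                pop[i] = alpha * a + (1 - alpha) * b
                pop[i + 1] = alpha * b + (1 - alpha) * a
        # per-variable mutation: redraw the gene uniformly in its range
        mask = rng.random(pop.shape) < P_MUT
        pop[mask] = rng.uniform(np.broadcast_to(lower, pop.shape)[mask],
                                np.broadcast_to(upper, pop.shape)[mask])
        pop[0] = elite
    return pop[np.argmax([fitness(ind) for ind in pop])]
```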
The evolution of the best chromosome's fitness with generation is shown in Figure 3. The fitness grows rapidly from 0.5 to 0.9 within the first 20 generations and becomes almost stable after that.
Figure 3. Dependence of fitness on generation (vertical axis: fitness; horizontal axis: generation)
3. Results and discussion
The values of the genes of one chromosome in the 14th generation are presented in Table 2. These values are very close to the conceptual design values that the authors obtained by the traditional design method. The result shows that experience of air turbine conceptual design can at least partly be used in helium turbine conceptual design and does not introduce evident error. In addition, the values of some parameters are still selected according to experience, such as the blade clearance loss ratio and the blade speed coefficient [4].

Table 2. Comparison of gene values for one chromosome in the 14th generation of the genetic/gradient design method and the traditional design method.

Traditional design method:      inner radius 900 mm; blade lengths of stages 1-8 (mm): 60, 66, 72, 78, 84, 90, 96, 102; stage reactions in the range 0.163-0.180; total efficiency 0.903.
Genetic/gradient design method: inner radius 860 mm; blade lengths of stages 1-8 close to the traditional values; stage reactions in the range 0.181-0.190; total efficiency 0.898.
4. Conclusion
Compared with the traditional design method, the genetic/gradient optimization method obtains better values in a relatively short time. If enough constraints are included in the design process and no further improvement can be reached after gradient searches from different starting points, the final result can be considered an optimal design. The research also indicates that some of the experience from air turbine conceptual design may be used in helium turbine conceptual design without evident error.
Acknowledgments Financial support by the State High Technology Research and Development Program of China (Contract No.2001AA511010) is gratefully acknowledged.
References
1. A.H.W. Bos, Engineering Applications of Artificial Intelligence, 11, 377 (1998).
2. D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley Publishing Company (1989).
3. Li Minqiang, Kou Jisong, Lin Dan, Li Shuquan, Fundamental Theory and Utility of Genetic Algorithm, Science Press, 163 (2002).
4. Wang Zhongqi, Qin Ren, Theory of Turbomachine, Machine Industry Press (1988).
CONTINUOUS AND DISCRETE METHODS FOR THE AGGREGATION AND THE COG STAGES IN THE POWER CONTROL OF A TRIGA REACTOR

JORGE S. BENÍTEZ-READ, MAGNOLIA NÁJERA-HERNÁNDEZ, BENJAMÍN PÉREZ-CLAVEL
Instituto Nacional de Investigaciones Nucleares, Instituto Tecnológico de Toluca, Universidad Autónoma del Estado de México, Municipio de Ocoyoacac, Edo. de México 52045, México
This paper presents the main characteristics of two different methods for realizing the aggregation and center of gravity stages of a fuzzy controller under development for integration, as an alternative power control algorithm, into the control console of the TRIGA Mark III reactor of the Mexican Nuclear Center. In one case, an innovative method determines, in every control cycle, the group of lines that define the aggregated fuzzy set of the rule base in the continuous domain of the output variable; the center of gravity of this aggregated set is then obtained analytically. In the other case, the method used to determine both the aggregated set and its center of gravity is based on discretization of the universe of discourse. These methods were simulated for the ascent and regulation of neutron power, and comparison results of their performances are presented.
1. Introduction

The development of control algorithms for the nuclear research reactor of the Mexican National Institute for Nuclear Research (ININ) has focused on obtaining a controlled ascent of power and its regulation over long periods of time, maintaining the reactor period within the safety limits. A current project at ININ, in collaboration with the Nuclear Research Center of Belgium, has the objective of designing a neutron power fuzzy controller and integrating it into the reactor control console for real-time validation purposes. To this end, different Mamdani-type fuzzy algorithms have been simulated [1] in which the aggregation and defuzzification stages utilize a discrete universe of discourse of the controller output variable. In this paper, another method is used to carry out these two stages, in which a continuous domain of the output variable is used. The continuous and the discrete fuzzy controllers are simulated using a simple model
Work partially supported by grant 33797-A of the Mexican National Council for Science and Technology and grants 628.02-P and 634.03-P of the Council of the National System of Technological Education.
of the TRIGA reactor, and comparison results of their performances are presented.

2. Development of the Fuzzy Control Algorithm

Five fuzzy sets are used for each of the two input variables (the percentage of power error "ne" and the reactor period "T"), and four for the output variable (external reactivity slope "m_ext"). A description of these sets can be found in [2]. The set shapes are either triangular or trapezoidal. In each control cycle, the crisp value of each input is used to determine the membership degree of that input in its associated fuzzy sets. The membership degrees obtained in the fuzzification stage are used in the next stage (rule evaluation) to determine the activation of each control rule and to combine all the activations obtained for each of the four rule base consequents. The rules to control the ascent of power in the reactor were obtained from simulations of the point kinetics equations of the reactor subject to different insertions of external reactivity [1]. The rule evaluation stage generates two vectors, one whose entries define the output fuzzy sets that were activated, and the other containing the activation values of each set defined in the first vector [3].
Continuous Fuzzy Controller: To improve the accuracy of the controller output, a method was designed to perform an analytic aggregation of the activated output fuzzy sets, considering a continuous domain of the output variable. In each control cycle, this method determines parameters such as the domain of overlapping, the intersection points, and the activation values of consecutive activated sets. The analytic definition of the group of lines that compose the aggregated output fuzzy set is carried out in two main steps: (i) definition of the first line or lines of the aggregated set by analyzing the first activated set, and (ii) definition of the subsequent lines considering the other sets. Figure 1 shows a typical line of the aggregated set formed between the set NS, considered as the previous activated set, and ZE, considered as the current activated set.
Figure 1. Line of the aggregated set formed between the NS and ZE activated sets.
The center of gravity method is used to determine the crisp value of the controller output, $m\_ext$. In each control cycle, the aggregation stage generates a matrix whose number of rows, $M$, corresponds to the number of lines that compose the aggregated fuzzy set. The entries of the $i$-th row define the line $y_i = m_i x + b_i$, $x_{ant\text{-}i} \le x \le x_{act\text{-}i}$. The pair of integrals associated with each line, needed to determine the center of gravity, is computed analytically, and the exact value of the controller output is determined by the formula

$$m\_ext = \frac{\sum_{i=1}^{M} num_i}{\sum_{i=1}^{M} den_i},$$

where $num_i$ is defined as

$$num_i = \frac{m_i}{3}\left(x_{act\text{-}i}^3 - x_{ant\text{-}i}^3\right) + \frac{b_i}{2}\left(x_{act\text{-}i}^2 - x_{ant\text{-}i}^2\right)$$

and $den_i$ as

$$den_i = \frac{m_i}{2}\left(x_{act\text{-}i}^2 - x_{ant\text{-}i}^2\right) + b_i\left(x_{act\text{-}i} - x_{ant\text{-}i}\right).$$

The external reactivity to be inserted in the reactor is determined using the controller output $m\_ext$.
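A minimal Python sketch of this analytic centroid computation (the line-matrix layout follows the description above; names are illustrative):

```python
def continuous_cog(lines):
    """Centroid of a piecewise-linear aggregated fuzzy set.

    `lines` is a list of (m, b, x_ant, x_act) tuples, one per segment
    y = m*x + b valid on [x_ant, x_act].
    """
    num = 0.0
    den = 0.0
    for m, b, x_ant, x_act in lines:
        # integral of x*y(x) dx over the segment
        num += m * (x_act**3 - x_ant**3) / 3.0 + b * (x_act**2 - x_ant**2) / 2.0
        # integral of y(x) dx over the segment (area)
        den += m * (x_act**2 - x_ant**2) / 2.0 + b * (x_act - x_ant)
    return num / den

# example: a symmetric triangle on [0, 2] peaking at x = 1 -> centroid 1.0
print(continuous_cog([(1.0, 0.0, 0.0, 1.0), (-1.0, 2.0, 1.0, 2.0)]))
```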
Discrete Fuzzy Controller: This controller contains the same stages as the continuous fuzzy controller. The main difference is that the aggregation and defuzzification processes are carried out simultaneously over a discrete universe of discourse of the controller output variable. The crisp value of the output is obtained with the expression

$$m\_ext = \frac{\sum_i x_i\,\mu_A(x_i)}{\sum_i \mu_A(x_i)},$$

where $x_i$ is the value of the output variable at the $i$-th discrete position along its universe of discourse and $\mu_A(x_i)$ is the membership degree of this $i$-th value in the aggregated set.
3. Simulation Results and Conclusions

The two fuzzy control algorithms were simulated, using Matlab, to bring the power of the TRIGA Mark III reactor from its source level (50 W) to a stable level of 1 MW. For comparison purposes, the following figures of merit were defined: number of instructions, time required to attain a certain power level, and the power and reactor period time responses. Figure 2 compares the total number of instructions: the total number of instructions carried out by the fuzzy controller with the continuous method represents 3.7% of the total number required with the discrete method. Likewise, the number of instructions needed to perform only the four fuzzy control stages with the continuous method represents 2.9% of the instructions required by the discrete method. The curves shown in Figure 3 correspond to the power
responses using both controllers. The safety requirement of maintaining the reactor period above the 3-second limit is attained at all times.
Figure 2. Total number of instructions (continuous: 34,158,100; discrete: 905,617,200)
The power responses do not present overshoots when the reactor approaches the desired full power level. These results show the feasibility of using knowledge-based techniques for the control of neutron power in research reactors.
Figure 3. Reactor power response: (1) continuous method, (2) discrete method.
References
1. Benítez-Read, J.S. and Vélez-Díaz, D., Neutron Power Control in a Research Reactor Using a Fuzzy Rule Based System, Soft Computing with Industrial Applications, Vol. 5, pp. 53-58 (1996).
2. Benítez-Read, J.S. and Vélez-Díaz, D., Controlling Neutron Power of a TRIGA Mark III Research Nuclear Reactor with Fuzzy Adaptation of the Set of Output Membership Functions, Fuzzy Systems and Soft Computing in Nuclear Engineering, pp. 83-114, Physica-Verlag (2000).
3. Nájera Hernández, M., B.Sc. Thesis, Diseño de una agregación analítica de reglas de un controlador difuso y su aplicación en el modelo de un reactor nuclear de investigación (2003).
A NICHING METHOD WITH FUZZY CLUSTERING APPLIED TO A GENETIC ALGORITHM FOR A NUCLEAR REACTOR CORE DESIGN OPTIMIZATION

WAGNER F. SACCO
PEN/COPPE, Universidade Federal do Rio de Janeiro, Ilha do Fundão s/n, PO Box 68509, Rio de Janeiro, RJ, 21945-970, Brazil

CLAUDIO MARCIO DO NASCIMENTO ABREU PEREIRA
DIRE/IEN, Comissão Nacional de Energia Nuclear, Ilha do Fundão s/n, PO Box 68509, Rio de Janeiro, RJ, 21945-970, Brazil
PEN/COPPE, Universidade Federal do Rio de Janeiro, Ilha do Fundão s/n, PO Box 68509, Rio de Janeiro, RJ, 21945-970, Brazil

ROBERTO SCHIRRU
PEN/COPPE, Universidade Federal do Rio de Janeiro, Ilha do Fundão s/n, PO Box 68509, Rio de Janeiro, RJ, 21945-970, Brazil
The nuclear core design optimization problem consists in adjusting several reactor parameters in order to minimize the average peak factor in a three-enrichment zone reactor, considering restrictions on the average thermal flux, criticality and submoderation. We solve this problem with a Niching Genetic Algorithm (NGA) with fuzzy clustering and compare its performance to the Standard Genetic Algorithm (SGA). After several experiments, we observed that the former technique performs better than the latter due to a greater exploration of the search space.
1. Introduction Traditional GAS’ have shown to be inadequate to complex multimodal function optimization, as they rapidly push an artificial population toward convergence. Niching Methods’ allow genetic algorithms to maintain a population of diverse individuals, locating multiple, optimal solutions within a single population. This work presents a niching method with fuzzy clustering applied to a nuclear core design optimization problem3. First, there is an overview of genetic algorithms with niching methods. Then, the new niching method is introduced. Next, the problem to be optimized and the system implementation are exposed. Afterwards, our results are shown and compared to previous efforts. Finally, the conclusive remarks are made. 607
2. An Overview of Genetic Algorithms with Niching Methods

Genetic algorithms have proven to be efficient in a great variety of domains, but the population of candidate solutions converges to a single optimum, in a phenomenon known as genetic drift [4]. Niching methods allow the GA to identify, along with the global optimum, the local optima of a multimodal domain. Various niche formation methods have been proposed [5,6,7,8], generally based on sharing, which is the most widely disseminated paradigm and serves as a reference when considering these methods. Yin and Germay [6] applied a clustering algorithm to the GA population prior to sharing, with relative success. Based on this and on the results of Sareni and Krähenbühl [9], which showed clearing's superiority over sharing, Sacco et al. [10] decided to cluster the GA individuals before submitting them to clearing. While in standard clearing an individual dominates those within a radius σ, in clearing associated with clustering, dominance within a cluster was proposed. Instead of estimating a threshold radius, it is necessary to estimate the number of clusters or classes, an easier task for real-world problems. As the assignment of an individual to a cluster in real-world functions carries a high degree of uncertainty, the use of fuzzy logic [11] to group the GA individuals into clusters appeared as a natural step. Thus, the clustering method applied was Fuzzy Clustering Means (FCM) [12], because it does not require, at each iteration, the total allocation of an individual to a certain cluster or class. This algorithm borrows from fuzzy logic the concept of membership. A minimal sketch of the resulting fuzzy clearing step is given below.
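A minimal Python sketch of this clearing-within-clusters idea (the fuzzy c-means implementation and the hard assignment by maximum membership are illustrative assumptions):

```python
import numpy as np

def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Basic fuzzy c-means: returns the membership matrix U (n x c)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(n_clusters), size=len(X))
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
    return U

def fuzzy_clearing(pop, fitness, n_clusters):
    """Keep only the best individual of each (hard-assigned) cluster;
    the other individuals have their fitness cleared to zero."""
    U = fuzzy_cmeans(pop, n_clusters)
    labels = U.argmax(axis=1)           # hard assignment by max membership
    cleared = np.zeros_like(fitness)
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        winner = members[np.argmax(fitness[members])]
        cleared[winner] = fitness[winner]
    return cleared
```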
3. Problem Description

The nuclear reactor core optimization problem was first attacked by Pereira et al. using the SGA [3]. In order to make this work self-contained, we briefly describe it here. Consider a cylindrical 3-enrichment-zone PWR, with a typical cell composed of moderator (light water), cladding and fuel. The design parameters that may be changed in the optimization process, as well as their variation ranges, are given in [3]. The objective of the optimization problem is to minimize the average peak factor, $f_p$, of the proposed reactor, considering that the reactor must be critical ($k_{eff} = 1.0 \pm 1\%$) and sub-moderated, providing a given average flux $\phi_0$.
4. Method Application

We used the GENESIS GA [13] with an attached module containing FCM and fuzzy clearing. This GA uses double-point crossover [4], stochastic universal sampling as the selection scheme [14], and elitism [4]. We adopted the usual genetic parameters [4].
To perform the reactor physics evaluations we used the HAMMER code [15]. The fitness function was developed in such a way that, if all constraints are satisfied, it takes the value of the average peak factor, $f_p$; otherwise, it is penalized proportionally to the discrepancy in the constraint. In order to evaluate the proposed NGA, we performed several experiments comparing it with the SGA. We used the same population size (100 individuals) and genetic parameters for both GAs. Our experiments were performed on a 1.6 GHz Pentium IV PC using ten different random seeds, so that the results would be unbiased.

5. Results
Table 1, below, shows the results obtained by the GA with fuzzy clearing (NGA), with 10 and 20 classes selected for FCM, in comparison with those obtained by the SGA, for 100 and 500 generations.

Table 1. Comparison between the NGA and the SGA for 100 and 500 generations (average peak factor).

Experiment   SGA(100)  NGA-10(100)  NGA-20(100)  SGA(500)  NGA-10(500)  NGA-20(500)
#1           1.3201    1.2924       1.3499       1.3185    1.2916       1.2953
#2           1.3126    1.3097       1.3166       1.3116    1.3069       1.2956
#3           1.3300    1.3224       1.3541       1.3203    1.3003       1.2969
#4           1.3294    1.3164       1.2992       1.3294    1.2874       1.2989
#5           1.3596    1.3118       1.3134       1.3595    1.2956       1.2804
#6           1.3562    1.3327       1.3079       1.3562    1.3014       1.3055
#7           1.3372    1.3275       1.3355       1.3372    1.3190       1.3049
#8           1.3523    1.3078       1.3129       1.3523    1.3075       1.3115
#9           1.3614    1.3186       1.3275       1.3614    1.2974       1.3100
#10          1.3467    1.3275       1.3824       1.3467    1.3071       1.3172
Average      1.3405    1.3167       1.3299       1.3393    1.3014       1.3016
From the table above, it can be noticed that:
- The NGA outperformed the SGA in all the experiments.
- The NGA evolves toward a better solution along the generations.
- The NGA is quite robust to the different numbers of classes.
The configurations obtained were quite different, but it was observed that fuzzy clearing could deal better with the compromise between the main objective (average peak factor) and the constraints.
6. Conclusions

With this work, we show that niching GAs can be successfully applied to real-world problems, especially to a challenging one like the nuclear reactor core design optimization problem. Our results are due to fuzzy clearing, which joins the best of two worlds: an efficient niching method and a clustering technique that does not depend on previous knowledge of the search space. Niching methods promote a thorough exploration of the search space, making them extremely well suited to highly multimodal domains.
References
1. J.H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press (1975).
2. S.W. Mahfoud, Niching Methods for Genetic Algorithms, PhD Thesis, University of Illinois at Urbana-Champaign (1995).
3. C.M.N.A. Pereira, R. Schirru and A.S. Martinez, Annals of Nuclear Energy, 26, 173 (1999).
4. D.E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley (1989).
5. D.E. Goldberg and J. Richardson, In: Proceedings of the Second International Conference on Genetic Algorithms, 41 (1987).
6. X. Yin and N. Germay, In: Proceedings of the International Conference on Artificial Neural Nets and Genetic Algorithms, 450 (1993).
7. M.R. Anderberg, Cluster Analysis for Applications, Academic Press (1975).
8. A. Pétrowski, In: Proceedings of the 1996 IEEE International Conference on Evolutionary Computation, 798 (1996).
9. B. Sareni and L. Krähenbühl, IEEE Trans. on Evolutionary Computation, 2, 97 (1998).
10. W.F. Sacco, M.D. Machado, C.M.N.A. Pereira and R. Schirru, Annals of Nuclear Energy, 31, 55 (2004).
11. L.A. Zadeh, Information and Control, 8, 338 (1965).
12. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press (1981).
13. J.J. Grefenstette, A User's Guide to GENESIS version 5.0 (1990).
14. J.E. Baker, In: Proceedings of the International Conference on Genetic Algorithms and their Application, 14 (1987).
15. J.E. Suich and H.C. Honeck, The HAMMER System: Heterogeneous Analysis by Multigroup Methods of Exponentials and Reactors, Savannah River Laboratory (1967).
ESTIMATION OF BREAK LOCATION AND SIZE FOR LOSS OF COOLANT ACCIDENTS USING NEURAL NETWORKS

MAN GYUN NA, SUN HO SHIN, DONG WON JUNG, AND SOONG PYUNG KIM
Department of Nuclear Engineering, Chosun University, Gwangju 501-759, Korea

JI HWAN JEONG
Dept. of Environmental System, Cheonan College of Foreign Studies, Cheonan, Korea
BYUNG CHUL LEE
Future & Challenges, Inc., Seoul, Korea

In this work, a probabilistic neural network (PNN), which has been applied successfully to classification problems, is used to identify the break locations of loss of coolant accidents (LOCA), such as hot-leg, cold-leg and steam generator tubes. Also, a fuzzy neural network (FNN) is designed to estimate the break size. The inputs to the PNN and the FNN are time-integrated values obtained by integrating measurement signals during a short time interval after reactor scram. An automatic structure constructor for the fuzzy neural network automatically selects the input variables from the time-integrated values of many measured signals, and optimizes the number of rules and their related parameters. It is verified that the proposed algorithm identifies the break locations of LOCAs very well and also estimates their break size accurately.
1. Introduction
In case a loss of coolant accident (LOCA) happens in a nuclear power plant, it is important for operators and technical staff to find out where the break location is and how large the break size is, by observing the initial short-time trends of the major parameters, in order to effectively carry out LOCA accident management strategies. However, it is very difficult for nuclear power plant operators to figure out the break location of a LOCA and its break size by staring at the temporal trends of important parameters after the LOCA. The present work aims to identify the break locations of LOCAs, such as hot-leg, cold-leg, and steam generator tubes, by applying a probabilistic neural network (PNN). The inputs to the PNN are the initial time-integrated values of the measured signals after reactor scram. Another objective of this work is to estimate the break size of a LOCA by using a fuzzy neural network (FNN), whose inputs are the time-integrated values of important measured signals during a short time interval after reactor scram.
2. Identification of break location
PNN [1] is a general technique that is widely applied to pattern classification. Therefore, in this work, PNN is used as a non-linear pattern classifier that identifies the major LOCA break locations by using the very short time integration of some selected signals immediately after reactor scram. PNN is a neural network implementation of a Bayesian classifier. PNN operates by defining a probability density function (PDF) for each data class based on the training data set and a smoothing parameter (σ). The PDF defines the boundaries of each data class, while the smoothing parameter determines the amount of interpolation that occurs between adjacent kernels. The advantage of probabilistic neural networks is that their training is easy and instantaneous. A minimal sketch of such a classifier is given below.
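A minimal Python sketch of a PNN classifier with Gaussian kernels (the Parzen-window form is the standard one [1]; the interface and the equal-prior assumption are illustrative):

```python
import numpy as np

def pnn_classify(x, X_train, y_train, sigma=0.1):
    """Assign x to the class whose Parzen-window density is largest.

    X_train: (N, d) training patterns, y_train: (N,) class labels.
    """
    # Gaussian kernel value for every training pattern
    d2 = np.sum((X_train - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    classes = np.unique(y_train)
    # average kernel response per class (equal priors assumed)
    scores = [k[y_train == c].mean() for c in classes]
    return classes[int(np.argmax(scores))]
```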
3. Estimation of break size

In this work, an FNN is designed to provide plant operators with valuable information on the break size of LOCAs, so that they can perform LOCA accident management successfully. The FNN uses the neuronal improvements of fuzzy systems as well as the fuzzification of neural network systems, aiming at exploiting the complementary nature of the two approaches. A Takagi-Sugeno type fuzzy inference system [2] is used, where the $i$-th rule can be described as follows:

If $x_1$ is $A_{i1}$ AND $\ldots$ AND $x_m$ is $A_{im}$, then $\hat{y}^i$ is $f^i(x_1, \ldots, x_m)$, (1)

where

$$f^i(x_1, \ldots, x_m) = \sum_{j=1}^{m} q_{ij} x_j + r_i. \qquad (2)$$

The output of a fuzzy inference system with $n$ fuzzy rules is a weighted sum of the consequents of all the fuzzy rules. Therefore, the estimated signal from the fuzzy inference system is given by

$$\hat{y} = \frac{\sum_{i=1}^{n} w^i f^i(x_1, \ldots, x_m)}{\sum_{i=1}^{n} w^i}, \qquad (3)$$

where

$$w^i = \prod_{j=1}^{m} \mu_{A_{ij}}(x_j).$$
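A minimal Python sketch of this Takagi-Sugeno inference (Gaussian membership functions are an illustrative choice; the actual membership shapes are set by the structure constructor):

```python
import numpy as np

def ts_fis(x, centers, widths, Q, r):
    """First-order Takagi-Sugeno fuzzy inference.

    x: (m,) input; centers, widths: (n, m) Gaussian membership parameters;
    Q: (n, m) consequent coefficients; r: (n,) consequent biases.
    """
    # rule firing strengths: product of per-input membership degrees
    mu = np.exp(-((x - centers) / widths) ** 2)   # (n, m)
    w = mu.prod(axis=1)                           # (n,)
    # linear consequents f_i(x) = Q_i . x + r_i
    f = Q @ x + r                                 # (n,)
    return np.sum(w * f) / np.sum(w)
```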
An automatic structure constructor selects the input variables, determines the number of rules, and optimizes the antecedent parameters related to the membership functions and the consequent parameters related to each rule output. This automatic structure constructor is described in the literature [3].

4. Verification of the proposed algorithm
To verify the proposed algorithm, it is necessary to acquire the data needed to train the neural networks from numerical simulations, since few accident data exist. The data were acquired by simulating postulated LOCAs for the Advanced Power Reactor 1400 (APR1400) [4] using the MAAP4 code [5]. A total of 120 computer simulations were conducted, comprising 40 cold-leg LOCAs, 40 hot-leg LOCAs, and 40 SGTRs. The input variables to the PNN are the time-integrated values of 13 simulated sensor signals, of the form

$$\bar{x} = \int_{t_s}^{t_s + \Delta t} x(t)\,dt, \qquad (4)$$

where $t_s$ is the scram time and $\Delta t$ is the integration time span.

4.1. Identification of break location
To verify the break location identification by the PNN, the 120 simulations are divided into training data and test data. The training data are used to train the neural network, and the test data are used to verify it independently. The 105 training simulations comprise 35 hot-leg LOCAs, 35 cold-leg LOCAs, and 35 SGTRs; the 15 test simulations consist of 5 hot-leg LOCAs, 5 cold-leg LOCAs, and 5 SGTRs. The integrating time span in Eq. (4) is 60 sec, which means that the PNN uses the time-integrated signals of a 60 sec interval immediately after reactor scram. Since the PNN was very insensitive to the integrating time span, the time span was determined by considering decision speed and the possibility of decision faults. The PNN was trained to categorize the hot-leg LOCA, the cold-leg LOCA, and the SGTR as 1, 2, and 3, respectively. Table 1 shows the final classification of the break locations. It is shown that the PNN can identify break locations accurately.
Table 1. Identification of break locations. For each break location the table lists the break size (m²), the scram time (sec) and the classified location, for 35 training and 5 test cases each: hot-leg LOCAs (e.g., 5.0671e-4 m² at 583.0 s) are classified as 1, cold-leg LOCAs (e.g., 4.4970e-4 m² at 917.9 s) as 2, and SGTRs (e.g., scram at 45.79 s) as 3, in all cases.
4.2. Estimation of break size

The integrating time span in Eq. (4) of 60 sec is also used to estimate the break size of a LOCA. This means that the FNN can estimate the LOCA break size by using the initial time-integrated values of the measured signals over one minute immediately after reactor scram. The integrating time was selected through several numerical simulations of the proposed algorithm so that the estimation errors are minimized. Table 2 shows the inputs selected automatically by the proposed automatic structure constructor and the number of rules it optimized. In addition, Table 2 shows the maximum estimation error, the root mean square error and the 2-sigma error.
Table 2. Selected input variables, the number of optimized rules, and the estimation errors (%).

               Selected inputs                                   Rules | Training: Max / RMS / 2-sigma | Test: Max / RMS / 2-sigma
Hot-leg LOCA   Containment temp., sump water level                 3   | 0.8833 / 0.4045 / 0.8207      | 1.0144 / 0.6219 / 1.0086
Cold-leg LOCA  Containment temp., containment pres.                2   | 4.0593 / 1.6359 / 3.3178      | 2.3899 / 1.7369 / 3.8518
SGTR           Pressurizer water level, broken side S/G level      3   | 3.5248 / 1.7384 / 3.5272      | 3.7482 / 2.7193 / 6.0616
Figure 1 shows the target and estimated break size for hot-leg LOCAs using the two input variables of containment temperature and sump water level (refer to Table 2). The number of fuzzy rules is three. All relative errors lie within a 1% bound.
Fig. 1. Estimated break size due to hot-leg LOCA (target vs. estimate; horizontal axis: break size (m²))
Fig. 2. Estimated break size due to cold-leg LOCA (target vs. estimate; horizontal axis: break size (m²))
Fig. 3. Estimated break size due to SGTR (target vs. estimate; horizontal axis: break size (m²))
Figure 2 shows the target and estimated break size for cold-leg LOCAs using the two input variables of containment temperature and containment pressure. The number of fuzzy rules is two. All relative errors lie within a bound of about 4%. Figure 3 shows the target and estimated break size for SGTR accidents using the two input variables of pressurizer water level and broken-side S/G water level. The number of fuzzy rules is three. All relative errors lie within a bound of about 4%.
5. Conclusions
A PNN has been designed to identify the break locations of LOCA accidents by using the short-time (60 sec) integrated values of 13 measured signals. Also, FNNs have been designed to estimate the break size by using the short-time (60 sec) integrated values of 2 measured signals after reactor scram. It was shown that the proposed probabilistic neural network can accurately identify the break locations among three kinds of categorized events: hot-leg LOCA, cold-leg LOCA, and SGTR. Also, the proposed fuzzy neural network can accurately estimate the break size.

Acknowledgement
This work has been supported in part by EESRI (01-local-01), which is funded by MOCIE (Ministry of Commerce, Industry and Energy).

References
1. D.F. Specht, Neural Networks, 3, 109 (1990).
2. T. Takagi and M. Sugeno, IEEE Trans. System, Man, Cybern., 1, 116 (1985).
3. M.G. Na, Y.R. Sim, K.H. Park, S.M. Lee, D.W. Jung, S.H. Shin, B.R. Upadhyaya, K. Zhao, and B. Lu, IEEE Trans. Nucl. Sci., 50, 241, April 2003.
4. Korea Hydro & Nuclear Power Company, APR1400 SSAR, June 2002.
5. R.E. Henry, et al., MAAP4 - Modular Accident Analysis Program for LWR Power Plants, User's Manual, Fauske and Associates, Inc., vols. 1-4 (1990).
CHARACTERISATION OF SIZE, SHAPE AND MOTION BEHAVIOUR OF INSULATION PARTICLES IN COOLANT FLOW USING IMAGE PROCESSING METHODS*

ANDRÉ SEELIGER
Institute of Process Technology, Automation and Measuring Technology at University of Applied Sciences Zittau/Goerlitz, D-02763 Zittau, Theodor-Koerner-Allee 16, Germany
Tel: +49-(0)3583-61-1547, Fax: +49-(0)3583-61-1288

RAINER HAMPEL
Institute of Process Technology, Automation and Measuring Technology at University of Applied Sciences Zittau/Goerlitz, D-02763 Zittau, Theodor-Koerner-Allee 16, Germany

SOEREN ALT
Institute of Process Technology, Automation and Measuring Technology at University of Applied Sciences Zittau/Goerlitz, D-02763 Zittau, Theodor-Koerner-Allee 16, Germany
The investigation of insulation particle genesis and transport is gaining importance in reactor safety research for Pressurised Water Reactors (PWR) and Boiling Water Reactors (BWR). All types of loss-of-coolant accidents (LOCA), as well as the short and long term behaviour of emergency core coolant systems, were considered in the analysis. A central goal of these investigations is the development of 3-D models simulating two-phase flow consisting of water and insulation particles in large geometries. The background of the experimental investigations consists of the following parts: generation of a wide data base, and development and validation of CFD (computational fluid dynamics) models for the description of insulation particle transport phenomena in flows.
These analyses will be carried out for various geometric and fluidic boundary conditions, covering sedimentation, resuspension, agglomeration, clogging and the increase of differential pressure at hold-up devices. Three plexiglass test facilities were built for the exploration of the individual phenomena mentioned. In particular, modern flow measurement and digital image processing technologies were applied.
* The investigations described in this paper have been financed by the German Federal Ministry of Economy and Labour under grant no. 150 1270.
1. Test facilities

Blast experiments were carried out at the rig "Fragmentation" to simulate LOCA and to fragment different insulation materials under realistic accident conditions, e.g. with saturated steam up to 7 MPa (BWR-LOCA). The facility was also designed for experiments with saturated water up to 11 MPa (PWR-LOCA). As a result of these experiments, fragmented insulation materials were produced. The debris of each experiment was stored in aqueous solution for use at the various separate-effect acrylic glass test facilities. The rigs named "Column" and "Ring Channel" are conceived for 2-dimensional observations; the test rig "Tank" allows monitoring in three dimensions. An overview of the arrangement of the facilities, the water supply and waste water disposal, and the auxiliary components is shown in Fig. 1. It is possible to feed the facilities with potable water, deionised water or a mixture of both. Degassing of the water can be realised with the electric heaters in the storage water tank. The acrylic rigs work under atmospheric pressure conditions. The temperature can vary between 20 °C and 80 °C.
(Legend: CP1, CP2: centrifugal pumps; SP3: sewage pump; TP1: circulation pump.)
Figure 1. Scheme of the water system for three acrylic glass facilities Column, Ring Channel and Tank
619
2.
Instrumentation
The generation of a qualified data base for CFD-model development needs detailed data of water basis flow. Also extensive data of the relative movements and significant characteristics of the insulation particles in the flow are necessary. Methods of digital image processing were applied for measuring of particle geometries and particle movements. The used camera system is shown in Figure 2, which consists of two highspeed cameras. Every camera allows to take pictures with a resolution up to 1280x1024 pixel. Because of random programmability of window size, position (region of interest) and clock frequency, the resolution and the frame rate can be adapted to any specific necessity. Both cameras can be triggered simultaneously.
inpd
Figure 2. Configuration of the image processing system.
3.
Image Preprocessing
The resulting sequence of images were preprocessed for improving attributes like contrast and brightness. Radial distortion effects were compensated, which were caused by optical components of the camera system. For t h s operation a preceding camera calibration is required, which enables to determine a subset of camera parameters like orientation and relative position of the camera, distortion coefficient, center of radial distortion and focal length. The background subtraction makes it possible to compensate inhomogeneity caused by the lighting of the facility. A pristine image of the background must be available for application of this method. This image will be generated by calculating a maximum value image using the complete image sequence.
Two conditions have to be fulfilled for a correct result: The objects of interest always have to occur in low gray values on a bright background composed of h g h gray values. Every pixel in the sequence must appear at least once as an element of the background. After applying this algorithm the particles in the sequence show up on a homogeneous and bright background. 4.
Image Processing Methods
The objects of interest and the background were segmented for every image of the sequence by the region growing algorithm. This algorithm represents a region based segmentation method and starts with an initial partition of pixels. Neighboring points are aggregated into the same region if the difference of their feature vectors lies in a given interval. The algorithm terminates if every pixel in the image had its attribution to a particle region or the background [ 11. Finally the unique particle regions of an image were specified. The following primary significant characteristics of the particle objects can be calculated: 2-D geometries: contour length surface area 0 position in x and y directions of the world coordinate system 3-D geometries: 0 volume position in x, y and z directions of the world coordinate system Camera parameters were taken into calculation for obtaining the full-scale data. Several shape factors of the insulation particles were compiled like circularity, convexity, compactness and bulkiness [2]. The shape can also be described using geometrical objects e.g. largest inner circle, smallest surrounding circle and equivalent ellipse (Fig. 3).
Figure 3. Graphical representation of a subset of shape factors (inner circle, outer circle, equivalent ellipse, maximal distance between contour points).
621
This lund of attributes give information about the form of every single particle. For example, the value of circularity indicates the similarity of the particle region with a circle and the convexity points to the existence of any indentations or holes in the region. 5.
Particle Tracking
A new software algorithm was developed for tracking particles from image to image. It relies on a basic principle: the region of a particle object in image n overlaps the corresponding region in image n+1 when both images are laid one upon the other (Fig. 4).
Figure 4. Overlapping particle regions

By evaluating these overlapping regions it is possible to recognize an object in consecutive images and finally to trace all detected particles over the whole sequence. During this process, several attributes of the objects are constantly compared in order to validate and adjust the performed assignments. The algorithm was also extended for tracking smaller particles, where normally no distinct overlapping region can be detected. The velocity of the observed objects can then be determined using the recording time of the sequence. It is also possible to detect a whole range of specific events such as particle collision, overlap, break-up and coalescence. A sketch of the core overlap test is given below.
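A minimal sketch of the overlap test between labelled masks of two consecutive frames (the label images and the greedy largest-overlap matching are illustrative assumptions):

```python
import numpy as np

def match_by_overlap(labels_n, labels_n1):
    """Match particle labels of frame n to frame n+1 by region overlap.

    labels_n, labels_n1: integer label images (0 = background).
    Returns a dict {label_in_n: label_in_n1}.
    """
    matches = {}
    for lab in np.unique(labels_n):
        if lab == 0:
            continue
        # labels of frame n+1 seen under the region of `lab`
        overlap = labels_n1[labels_n == lab]
        overlap = overlap[overlap != 0]
        if overlap.size:
            # assign to the label sharing the largest area
            vals, counts = np.unique(overlap, return_counts=True)
            matches[lab] = int(vals[np.argmax(counts)])
    return matches
```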
6. Preliminary Results

Statistical analyses permit a classification of flow-relevant parameters into clusters which depend on typical optical parameters, geometrical dimensions and shape factors of different particle sizes. It is also possible to formulate general rules concerning the behaviour of insulation particles under various boundary conditions.
7. Outlook
The data basis for the CFD code will be extended beyond the 2D flow by additional data concerning the behaviour of insulation particles in a 3D flow field under consideration of turbulence. Corresponding experiments will be realised at the test facility "Tank", which permits the observation of the influence of waterfall effects on a two-phase mixture of insulation particles and water. The developed CFD-code models will be validated by supplemental experiments on large-scale integral test facilities.
Acknowledgments

The investigations described in this paper have been financed by the German Federal Ministry of Economy and Labour under grant no. 150 1270.
References
1. A. Rosenfeld and A.C. Kak, Digital Image Processing, Academic Press, San Diego (1982).
2. B. Jähne, Digitale Bildverarbeitung, Springer Verlag, Berlin (2002).
CONTROL OF CHAOTIC SYSTEMS USING FUZZY CLUSTERING IDENTIFICATION AND SLIDING MODE CONTROL

HASSAN SALARIEH*
Center of Excellence in Design, Robotics and Automation (CEDRA), Department of Mechanical Engineering, Sharif University of Technology, Tehran, Iran

ARIA ALASTY
Center of Excellence in Design, Robotics and Automation (CEDRA), Department of Mechanical Engineering, Sharif University of Technology, Tehran, Iran
In this paper we use a combination of fuzzy clustering estimation and the sliding mode method to control a chaotic system whose mathematical model is unknown. First, the model of the chaotic system is identified without using any input noise signal; here the recurrent property of chaotic behavior is exploited for estimating the model. After estimating the fuzzy model of the chaos, control and on-line identification of the input-related section are applied. In this step, we estimate the system model in normal form, so that the dynamic equations can be used in sliding mode control. Finally, the proposed technique is applied to the Lorenz system as an example of a chaotic system. The simulation results verify the effectiveness of this approach in controlling an unknown chaotic system.
1. Introduction
Chaotic phenomena have been observed in numerous fields of science, such as physics, chemistry, biology and ecology. An interesting subject in chaos theory is the elimination of chaotic behavior by means of control systems. In 1990 an analytical method was introduced for stabilizing an unstable trajectory in a chaotic attractor, the OGY method [1]. Extended and modified versions of this method were introduced during the following years [2, 3]. Also in this period the Pyragas method, based on delayed feedback control, was introduced [4]. These methods use the recurrent property of chaotic trajectories in strange attractors. In recent years different nonlinear control techniques have been used for chaos control. Some of them are: controlling the chaos in the Lorenz system using feedback linearization [5], controlling the chaos by variable structure systems [6,
E-mail: [email protected]
8], and feedback linearization in discrete-time systems [9, 10]. Also, fuzzy methods and neural networks have been applied to chaos control [11-13]. In conventional nonlinear control methods, references [5] to [13], the chaotic systems are controlled without considering the special qualities of chaos. So, in these methods, the energy of the control signals is high. All of the methods mentioned, except those in [11] and [12], depend on the mathematical model of the dynamic system. Usually, having an exact mathematical model is impossible. With the method studied in this paper, we can control a chaotic dynamic system without having an analytical crisp mathematical model. In addition, qualities such as the recurrent behavior of chaotic trajectories are used for identifying the chaotic system, so it does not need an external noise signal for the identification procedure. Therefore, it is possible to identify the system while it works. Due to the use of the clustering method, on-line updating of the estimated model is possible. Hence the system identification becomes more accurate as time goes on and the number of saved data points increases. In this method the model updating and control are synchronous. Because of the use of sliding mode control, the robustness of the system against external disturbances and uncertainties is guaranteed.

2. Clustering System Identification
The basic idea is to group the input-output data pairs into clusters according to the distribution of the input data points, and use one rule for one cluster. Finally, an optimal fuzzy system, which can estimate the whole system with an appropriate accuracy, is designed. One of the simplest clustering methods is the nearest neighborhood algorithm. We choose the first datum as the center of the first cluster. If the distance of a datum to the closest cluster center is less than a prespecified value, we put this datum into that cluster; otherwise, we set this datum as a new cluster center [14]. For identification of f(x), the estimate of f(x) is obtained in this form:

$$\hat{f}_k(x) = \frac{\sum_{l=1}^{M} A^l \exp\left(-\|x - x_c^l\|^2 / \sigma^2\right)}{\sum_{l=1}^{M} B^l \exp\left(-\|x - x_c^l\|^2 / \sigma^2\right)} \tag{1}$$

where k, M are the number of training data and clusters, x_c^l is the cluster center, σ is a design parameter, and A^l, B^l are obtained by the clustering algorithm.
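The estimator (1) is simple enough to sketch directly. The following Python fragment is an illustration assuming the Gaussian-weighted form above, with the cluster radius r and the width σ as the design parameters:

```python
import numpy as np

def train_clusters(X, y, r):
    """Nearest neighborhood clustering [14]: a datum joins the closest
    cluster if its distance is below r, otherwise it opens a new cluster.
    A[l] accumulates the outputs of cluster l, B[l] counts its members."""
    centers, A, B = [X[0]], [y[0]], [1.0]
    for x, target in zip(X[1:], y[1:]):
        d = [np.linalg.norm(x - c) for c in centers]
        i = int(np.argmin(d))
        if d[i] < r:
            A[i] += target
            B[i] += 1.0
        else:
            centers.append(x); A.append(target); B.append(1.0)
    return np.array(centers), np.array(A), np.array(B)

def f_hat(x, centers, A, B, sigma):
    """Fuzzy estimate (1): Gaussian-weighted sum of cluster outputs."""
    w = np.exp(-np.linalg.norm(x - centers, axis=1) ** 2 / sigma ** 2)
    return np.dot(A, w) / np.dot(B, w)
```

Because new data points can be folded into `A`, `B` and `centers` one at a time, the estimate can be updated on line while the system runs, as the paper requires.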
3. Sliding Mode Control
Consider a nonlinear multi-input system of the form

$$x_i^{(n_i)} = f_i(x) + \sum_{j=1}^{m} b_{ij}(x)\, u_j, \qquad i, j = 1, \ldots, m, \qquad \sum_{i=1}^{m} n_i = n \ (\text{order of the system}) \tag{2}$$
where x is the state vector and u is the control input. Using the sliding mode method, the control signal is

$$u = \hat{B}^{-1}\left(x_r^{(n-1)} - \hat{f} - k\,\mathrm{sgn}(s)\right) \tag{5}$$

f̂ and k are vectors whose components must satisfy the conditions

$$(1 - \alpha_{ii})\, k_i \ge F_i + \sum_{j=1,\, j \ne i}^{m} \alpha_{ij}\, k_j + \eta_i, \qquad i = 1, \ldots, m \tag{7}$$

The x̃_i are the components of x̃ = x − x_d, x_d is the desired path, and η_i is an arbitrary positive number. In relation (5) the sign function can be substituted by a saturation function.
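For the single-input case used later in the paper, the control law (5) with the saturation substitution reduces to a few lines. A sketch, assuming the estimate f̂, the reference trajectory and the gains are already available:

```python
import numpy as np

def sat(z):
    """Saturation function used in place of sgn to avoid chattering."""
    return np.clip(z, -1.0, 1.0)

def sliding_control(x, x_d, xdot_d, f_hat_val, b_hat, lam, k, phi):
    """Single-input sliding mode control:
    s = lam*(x - x_d), x_r = xdot_d - lam*(x - x_d),
    u = (x_r - f_hat - k*sat(s/phi)) / b_hat."""
    s = lam * (x - x_d)
    x_r = xdot_d - lam * (x - x_d)
    return (x_r - f_hat_val - k * sat(s / phi)) / b_hat
```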
4. Control of Chaotic Systems
In our approach it is supposed that the chaotic system model is in the form of relation (8), in such a way that if u = 0 the system shows chaotic behavior:

$$\dot{x} = h(t, x, u) \tag{8}$$

h(t, x, u) is sufficiently smooth and bounded with respect to time, uniformly in (x, u). It is required to identify the system model in the normal form of (9) using the clustering method:

$$\dot{x} = f(t, x) + g(t, x)\, u \tag{9}$$

where the function f(t, x) is an estimation of h(t, x, 0). Having u, measuring x, and calculating f(t, x) and ẋ, the function g(t, x) is determined:

$$g(t, x) = \left(\dot{x} - f(t, x)\right) / u \tag{10}$$

The control algorithm is performed in two steps. At first the function f(t, x) is estimated. For the estimation of f(t, x), we set the input u = 0 and excite the system only with initial conditions (zero input test). In regular systems the data logged from this test contain only a few dynamic modes and are not rich enough to give a reliable estimation of the system. However, when the system is in its chaotic mode, the data logged from the zero input test contain a wide band of frequencies and represent almost all the dynamic modes of the system; due to the recurrent characteristics of chaotic trajectories, the data logged from a zero input test cover almost all regions of the strange attractor. This set of data, logged from the zero input test, is used in the fuzzy clustering
algorithm. The degree of accuracy in the identification of f(t, x) depends on the design parameters introduced in the clustering algorithm, such as σ, and on the number of data points obtained from the zero input test, i.e. k. In this step the identification process is continued until the relation (11) is established:

$$\left\| f_k(t, x_i) - h(t, x_i, 0) \right\| < \delta, \qquad k + 1 \le i \le k + N \tag{11}$$
where k is the number of data used in the training of the fuzzy clusters, and N is an arbitrary large number of data points used in the verification of the estimated model. δ is the permitted error of estimation. The relation (11) can be satisfied because of the recurrent property of chaotic trajectories. The second step is the identification of g(t, x) and control at the same time. To identify g(t, x) in relation (9), we may use relation (10). Set:

$$\eta(t, x, u) = h(t, x, u) - f(t, x), \qquad g(t, x) = \eta(t, x, u) / u \tag{12}$$

η(t, x, u) and u are continuous functions. Due to the boundedness of the chaotic attractor and the upper and lower limits of the control signal, η(t, x, u) and u lie in a bounded set. Therefore we can normalize the numerator and denominator of g(t, x) in relation (12) for the fuzzy clustering process. At the starting moment we must set u to an arbitrary initial value; then, using sliding mode, we can write:

$$u = \hat{g}_k(t, x)^{-1}\left(x_r - f(t, x) - k\,\mathrm{sat}(s/\phi)\right) \tag{13}$$

$$s = \lambda (x - x_d), \qquad x_r = \dot{x}_d - \lambda (x - x_d), \qquad k = \eta + \varepsilon \tag{14}$$
where ĝ_k(t, x) is the result of the clustering identification of g(t, x) when the k-th input-output data point is used. ε > 0, φ > 0, and η > 0 can be chosen arbitrarily, but if φ becomes large then the steady-state error becomes large.

5. Case Study and Simulation
The Lorenz system is considered as a case study; its dynamic equations are:

$$\dot{x} = \sigma (y - x), \qquad \dot{y} = R x - y - x z, \qquad \dot{z} = -\beta z + x y \tag{15}$$

The behavior of the above system is chaotic for σ = 10, β = 8/3, R = 28. We assume that the dynamic system is unknown. It is supposed that all of the state variables are measurable. Moreover, the deviation ΔR around R = 28 is used as the control parameter. In this example the state variable y is selected as the system output. The algorithm is simulated for two different desired trajectories: a) y_d = 5, and b) y_d = 5 + 5 sin(4t). In both cases the two other states, x and z, are not controlled; however, if we control the state y along a desired path, the two other states do not show chaotic behavior. Thus, the sliding mode method only controls the state y. Since the final goal is controlling the system partially and only on the state variable y, the relations (13)-(15) are rewritten as:

$$\dot{x} = f_1(x, y, z), \qquad \dot{y} = f_2(x, y, z) + g(x, y, z)\, u, \qquad \dot{z} = f_3(x, y, z) \tag{16}$$
$$u = \hat{g}_k(x, y, z)^{-1}\left(y_r - f_2 - k\,\mathrm{sat}(s/\phi)\right), \qquad s = \lambda (y - y_d), \qquad y_r = \dot{y}_d - \lambda (y - y_d) \tag{17}$$

Also we set k = 40, λ = 1 and φ = 0.01, and σ = 1 and r = 5 in the identification algorithm. Figs. 1 and 2 show the time responses of the identification and control system. In both cases, up to t = 30 sec the controller has not been switched on. Fig. 3 shows the values of the control parameter versus time.
Figure 1. Behavior of the Lorenz system, y_d = 5. Controller off for t < 30, on for t > 30 sec.

Figure 2. Behavior of the Lorenz system, y_d = 5 + 5 sin(4t). Control off for t < 30, control on for t > 30.

Figure 3. Variations of R. Upper figure: y_d = 5; lower figure: y_d = 5 + 5 sin(4t).
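A sketch of the zero input test on the Lorenz system (15), which supplies the training pairs (x, ẋ) for the clustering identifier without any external excitation; the Runge-Kutta step and the horizon are arbitrary choices for illustration:

```python
import numpy as np

def lorenz(state, sigma=10.0, beta=8.0 / 3.0, R=28.0, u=0.0):
    """Right-hand side of (15); the control u enters as the deviation of R."""
    x, y, z = state
    return np.array([sigma * (y - x), (R + u) * x - y - x * z, -beta * z + x * y])

def zero_input_test(x0, dt=0.01, steps=4000):
    """Integrate with u = 0 (4th-order Runge-Kutta) and log (state, derivative)
    pairs; the recurrent chaotic trajectory covers the strange attractor."""
    state, data = np.array(x0, float), []
    for _ in range(steps):
        k1 = lorenz(state); k2 = lorenz(state + 0.5 * dt * k1)
        k3 = lorenz(state + 0.5 * dt * k2); k4 = lorenz(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        data.append((state.copy(), lorenz(state)))
    return data

training_data = zero_input_test([1.0, 1.0, 1.0])
```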
6. Conclusion

Using the method explained in this paper, it is possible to control chaotic systems without having any exact, crisp mathematical model of the dynamic equations. In this method we obtain a fuzzy model of the system by means of fuzzy clustering. Because the recurrent property of chaotic trajectories is used, it is not necessary to exert any external noise signal for the identification of the chaotic part of the system. Besides, we can estimate the input-related part of the system while the control signal is applied, using the on-line updating capability of the identifier. Because of the sliding mode control, the closed loop system is robust against the uncertainties of estimation. The power of the control signal is not minimal, as a result of the natural properties of nonlinear control methods. The only requirement of this method is the ability to save a relatively large number of data points during identification. The results of a case study on the Lorenz system and the simulations verify the effectiveness of the proposed approach.
References
1. E. Ott, C. Grebogi and J. A. Yorke, Phys. Rev. Lett. 64, 1196 (1990).
2. T. Shinbrot, E. Ott, C. Grebogi and J. Yorke, Phys. Rev. Lett. 65, 3215 (1990).
3. N. J. Mehta and R. M. Henderson, Phys. Rev. A44, pp. 4861-65 (1991).
4. F. T. Arecchi, S. Boccaletti, M. Ciofini and R. Meucci, Int. J. Bif. & Chaos 8, n.8, 1643 (1998).
5. C. C. Hwang, R. F. Fung, J. Y. Hsieh and W. J. Li, Int. J. of Eng. Sc. 37, 1893 (1999).
6. Xinghuo Yu, Chaos, Solitons & Fractals 8, n.9, 1577 (1997).
7. K. Konishi, M. Hirai and H. Kokame, Phys. Rev. Lett. A245, 511 (1998).
8. H. H. Tsai, C. C. Fuh and C. N. Chang, Chaos, Solitons & Fractals 14, 627 (2002).
9. C. C. Fuh and H. H. Tsai, Chaos, Solitons & Fractals 13, 285 (2002).
10. Y. M. Liaw and P. C. Tung, Phys. Rev. Lett. A211, 350 (1996).
11. O. Calvo and J. H. E. Cartwright, Int. J. of Bif. & Chaos 8, 1743 (1998).
12. X. Guan and C. Chen, Fuzzy Sets & Sys. 139, 81 (2003).
13. M. Ramesh and S. Narayanan, Chaos, Solitons & Fractals 12, 2395 (2001).
14. Li-Xin Wang, A Course in Fuzzy Systems and Control, Prentice-Hall International, Inc. (1997).
SECURE COMMUNICATION BASED ON CHAOTIC SYNCHRONIZATION

PING LI, ZHONG LI AND WOLFGANG A. HALANG
Faculty of Electrical Engineering, FernUniversität in Hagen, 58084 Hagen, Germany
E-mail: {ping.li, zhong.li, wolfgang.halang}@fernuni-hagen.de

GUANRONG CHEN
Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, P. R. China
E-mail: [email protected]

Chaos-based secure communication has been an attractive topic recently. In this paper, the techniques of secure communication based on chaotic synchronization are reviewed, including the schemes of modulating the message signal and the chaotic signal, the approaches of synchronization between the chaotic transmitter and the chaotic receiver, and the security analysis of the communication. A discussion of future studies is also carried out.
1 Introduction
Over the past few years, the application of chaotic systems to secure communication has attracted much attention. There are several reasons for this. Firstly, since chaotic signals are typically broadband, noiselike and difficult to predict, they are candidates for the carriers in communication by masking the message signal. Secondly, Pecora and Carroll have demonstrated that it is possible to synchronize two chaotic systems by coupling. Since then, chaotic-synchronization-based secure communication has been arousing more and more interest. Finally, it has been shown that there is a close relation between chaotic systems and cryptographic algorithms, and chaotic systems have potential applications to cryptographic algorithms. In secure communication based on chaotic synchronization, the message signal is hidden in a chaotic signal by some modulation approach, and the combined signal is transmitted to the receiver. By synchronizing the transmitter and the receiver and then regenerating the chaotic signal at the receiver, the message signal can be extracted from the transmitted signal. As to chaotic encryption, S.J. Li has presented a detailed review of chaotic encryption in his dissertation [1]. In this paper, an overview of current secure communication based on chaotic synchronization will be given, which includes the schemes of modulating the message signal and the chaotic signal, and the security analysis of the communication. In addition, measures for improving the security and future research directions will be suggested.
2 Modulation schemes of secure communication based on chaotic synchronization
Different implementations of hiding the message signal in the chaotic signal are reviewed in the following.

(1) Chaotic masking
Chaotic masking is to mask a message signal simply by adding it to a chaotic signal. At the receiver, the message signal can be extracted by subtracting the regenerated chaotic signal, which is obtained by synchronization between the response system and the original transmitted signal [2]. Since the additive message signal has the effect of a perturbation injected by the transmitted signal into the receiver, synchronization is possible only when the power level of the message signal is sufficiently small. Therefore, the power of the original message signal is limited. Additionally, this method is very sensitive to channel noise and parameter mismatches between the transmitter and receiver systems. Consequently, the quality of the recovered signal and the degree of security of the communication are low.

(2) Chaos Shift Keying (CSK) or Chaotic Switching (CS)
This scheme is used in the case of a binary message signal, where the message signal is used to switch the transmitted signal between two similar chaotic systems with the same structure but different parameters, which encode bit 0 and bit 1 of the message signal, respectively. By synchronizing the chaotic attractor in the transmitter and the corresponding one in the receiver, the binary message signal is recovered. Recently, two distinct chaotic attractors have been used to represent one message bit. The CSK scheme is very robust to noise and parameter mismatch.

(3) Chaotic Parameter Modulation (CPM)
The message signal is used to modulate a parameter of the transmitter. The receiver uses auto-synchronization to recover the message signal by reproducing the modulation. A variation of CPM was presented in [4], where the digital signal can be highly masked via a multi-step parameter modulation combined with alternative driving of different transmitter variables. This method makes it impossible for an intruder to reconstruct the chaotic attractor to extract the message signal [4].

(4) Chaos Modulation (CM)
The message signal hidden in the chaotic signal of the transmitter is injected into the dynamics of the transmitter and the receiver, and drives the
transmitter and the receiver in exactly the same way. Consequently, chaos modulation has been shown to be able to provide perfect synchronization between the transmitter and the receiver [2]. Recently, CM has been combined with chaotic masking [6] or CSK [7] to develop the original schemes further.

(5) Inverse Systems (IS)
The output of the transmitter is employed to drive the receiver. Since the receiver is the inverse of the transmitter, the output of the receiver is the message signal. The IS method can be applied to both analog and digital systems; here, only analog ones are introduced. In the IS scheme, it is important to design the inverse system of the transmitter. Recently, several demodulation procedures have been proposed to approximate the exact inverse of the transmitter [8].
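To make the masking scheme concrete, the following sketch masks a small message with a Lorenz carrier and recovers it through a driven response subsystem in the style of Pecora and Carroll; the parameter values, message amplitude and Euler integration are illustrative assumptions only:

```python
import numpy as np

sigma, R, b, dt = 10.0, 28.0, 8.0 / 3.0, 1e-3
steps = 20000
m = 0.01 * np.sin(2 * np.pi * 5 * dt * np.arange(steps))  # small message

x, y, z = 1.0, 1.0, 1.0          # transmitter state
xr, yr, zr = -5.0, -5.0, 10.0    # receiver state, different initial conditions
recovered = np.empty(steps)

for n in range(steps):
    s = x + m[n]                                   # masked transmitted signal
    # transmitter: Lorenz system, Euler step
    x, y, z = (x + dt * sigma * (y - x),
               y + dt * (R * x - y - x * z),
               z + dt * (x * y - b * z))
    # receiver: response subsystem driven by s regenerates the carrier
    xr, yr, zr = (xr + dt * sigma * (yr - xr),
                  yr + dt * (R * s - yr - s * zr),
                  zr + dt * (s * yr - b * zr))
    recovered[n] = s - xr                          # message estimate
```

After the synchronization transient dies out, `recovered` approximates the message, which also illustrates why the message power must stay small relative to the carrier.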
3 Security Analysis of Chaotic Secure Communication
So far, there are many schemes for chaotic secure communication; however, many of them have been proven weak under various attack methods.

(1) Constructing the Dynamics of the Transmitter
The message signal can be recovered by constructing the dynamics of the transmitter via nonlinear dynamic (NLD) forecasting and return maps. NLD forecasting is used to extract the message signal in chaotic masking and CSK [9], as well as in secure communication systems based on high-dimensional chaotic systems [10]. By partially reconstructing the dynamics of the chaotic system via return maps [5], the message signal of chaotic masking and CSK is extracted. Moreover, it has been shown that a chaotic secure communication scheme has a higher security level if its return map is more complicated or the change of the return map is more irregular [11].

(2) Spectral Analysis
In 1995, T. Yang et al. unmasked the CSK scheme via the average frequency difference between two attractors [12], by using a spectrogram to detect their difference [13] and by combining the spectrogram and two single-layer neural networks [14]. Rather recently, G. Alvarez recovered the message signal by a power analysis attack [15].

(3) Constructing an Intruder's Receiver
When the sensitivity to the parameter values is very low, the message signal can be recovered by constructing an intruder's receiver with parameter values considerably different from the original ones. Message signals have been decrypted by constructing a receiver via parameter estimation and generalized synchronization [15].

(4) Conventional Cryptanalysis Methods
Conventional cryptanalysis methods, such as those based on key space analysis, brute force attacks, statistical analysis and plaintext attacks, are used in the security analysis of chaotic secure communication. As we see, the security analysis of communication is flexible, like cryptanalysis. One should use suitable methods in terms of the special characteristics of the secure communication itself.

4 Approaches for Improving Security
In view of the weaknesses in the security of chaotic synchronization communication, various measures have been developed to resist attacks.

(1) Application of High-dimensional Chaotic Systems
The low dimensionality of the attractor is a weakness for security because of its rather simple geometric structure, while high-dimensional chaotic systems, which have the characteristics of increased randomness and unpredictability, can make it more difficult to describe the attractor's structure and imitate the key parameters. Therefore, to improve the security of communication, high-dimensional chaotic systems are preferred. A high-dimensional chaotic system was implemented by using standard low-dimensional systems with well-known dynamics as building blocks [17]. Only if the building blocks in the transmitter and the receiver are synchronized can the message signal be recovered.

(2) Using Discrete Chaotic Systems
Chaotic communication using analog systems, especially those based on chaotic masking, has a serious weakness, since the recovery of the message signal depends overwhelmingly on the synchronization error, which can be exploited by an intruder to break the secure communication. In addition, digital systems make a perfect match between the receiver and the transmitter possible. Therefore, using discrete chaotic systems is an alternative way to improve the security of communication [6]. Since real transmission lines introduce an unknown amount of attenuation and delay in the transmitted signal, as well as distortion and noise, it is difficult to achieve synchronization between the transmitter and the receiver, and secure communication fails. Therefore, research toward secure communication based on discrete chaotic systems is attracting more and more interest.

(3) Combination of Chaotic Synchronization and Conventional Encryption
The message signal is encrypted by a multishift cipher before being modulated with the chaotic signal of the transmitter in different ways and then transmitted to the receiver. It has been shown that this method is much more sensitive to recovering errors and modeling errors; thus, the level of security is
enhanced. Also, the message signal can be XOR-ed with a chaotic signal, which is generated by truncating the output signal of the transmitter before transmission [20]. Since the above transmitted signals are encrypted before being transmitted, they contain very little information about the dynamics of the transmitter and cannot be used to reconstruct the transmitter by an intruder. As a result, the security of communication is improved. In summary, to improve the security of communication means to make the process of masking the message signal with the chaotic signal as difficult as possible. First, one can apply relatively more complicated modulation schemes such as CM or CPM instead of chaotic masking, or develop the current modulation schemes to a higher security level. Second, one can use a complex chaotic signal as the carrier, for instance a high-dimensional chaotic one. Third, one can encrypt the message signal before modulation, where many good encryption schemes can be selected. It is noted that while the security degree is increased via some approaches, the cost of secure communication may rise and other performance measures of the communication, such as real-time behavior, may become worse. Therefore, it is important for a designer to find a compromise.

5 Conclusion
In this paper, we reviewed the modulation schemes of secure communication based on chaotic synchronization. Then, the methods for the security analysis of such communication systems and measures to improve their security were discussed. So far, the security of most communication systems is unsatisfactory and requires much more development before they can be used practically. Therefore, designing chaos-synchronization communication schemes with higher security and robustness to noisy perturbation should be a focus of future study.
References
1. S.J. Li. Analyses and New Designs of Digital Chaotic Ciphers. PhD thesis, Xi'an Jiaotong University, 2003.
2. H.G. Schuster (ed). Handbook of Chaos Control. Wiley-VCH, 1999.
3. Y. Chu and S. Chang. Dynamical cryptography based on synchronised chaotic systems. Electron. Lett., 35(12):974-975, 1999.
4. P. Palaniyandi and M. Lakshmanan. Secure digital signal transmission by multistep parameter modulation and alternative driving of transmitter variables. Int. J. Bifurcation and Chaos, 11(7):2031-2036, 2001.
5. G. Perez and H.A. Cerdeira. Extracting messages masked by chaos. Phys. Rev. Lett., 74(11):1970-1973, 1995.
6. M. Feki et al. Secure digital communication using discrete-time chaos synchronization. Chaos, Solitons and Fractals, 18:881-890, 2003.
7. K. Murali. Digital signal transmission with cascaded heterogeneous chaotic systems. Phys. Rev. E, 63, 2000. 016217.
8. J. A. Ramirez, H. Puebla, and J. Solis-Daun. An inverse system approach for chaotic communication. Int. J. Bifurcation and Chaos, 11(5):1411-1422, 2001.
9. K. Short. Unmasking a modulated chaotic communications scheme. Int. J. Bifurcation and Chaos, 6(2):367-375, 1996.
10. K. M. Short and A.T. Parker. Unmasking a hyper-chaotic communication scheme. Phys. Rev. E, 58:1159-1162, 1998.
11. T. Yang et al. Cryptanalyzing chaotic secure communication using return maps. Physics Letters A, 25(6):495-510, 1998.
12. T. Yang. Recovery of digital signals from chaotic switching. Int. J. Circuit Theory Application, 23(6):611-615, 1995.
13. T. Yang, L.B. Yang, and C.M. Yang. Breaking chaotic secure communication using a spectrogram. Physics Letters A, 247(1-2):105-111, 1998.
14. T. Yang, L.B. Yang, and C.M. Yang. Application of neural networks to unmasking chaotic secure communication. Physica D, 124(1-3):248-257, 1998.
15. G. Alvarez, F. Montoya, M. Romera, and G. Pastor. Breaking parameter modulated chaotic secure communication system. arXiv: nlin.CD/0311041 v1, 20 Nov 2003.
16. G. Alvarez, F. Montoya, G. Pastor, and M. Romera. Breaking a secure communication scheme based on the phase synchronization of chaotic systems. arXiv: nlin.CD/0311040 v1, 20 Nov 2003.
17. H. Puebla and J. A. Ramirez. More secure communication using chained chaotic oscillators. Phys. Lett. A, 283:96-108, 2001.
18. G. Grassi and S. Mascolo. Synchronizing hyperchaotic systems by observer design. IEEE Trans. Circuits Syst. II, 46:478-483, 1999.
19. T. Yang. A survey of chaotic secure communication systems. Int. J. Computational Cognition, 2(2):81-130, 2004.
20. Y. Zhang, G.H. Du, Y.M. Hua, and J.J. Jiang. Digital speech communication by truncated chaotic synchronization. Int. J. Bifurcation and Chaos, 13(3):691-701, 2003.
21. M. Itoh, C.W. Wu, and L.O. Chua. Communication systems via chaotic signals from a reconstruction viewpoint. Int. J. Bifurcation and Chaos, 7(2):275-286, 1997.
TRANSITION BETWEEN FUZZY AND CHAOTIC SYSTEMS*
ZHONG LI, PING LI AND WOLFGANG A. HALANG
Faculty of Electrical Engineering, FernUniversität Hagen, 58084, Germany
E-mail: {ping.li, zhong.li, wolfgang.halang}@fernuni-hagen.de
This paper presents an overview of recent studies on the interaction between fuzzy logic and chaos theory. On the one hand, it is shown that chaotic systems can be transformed to either model-free fuzzy models or model-based fuzzy models, which means that fuzzy systems can also be chaotic. On the other hand, it is further shown that fuzzy systems can be made chaotic (or chaotified) with some simple and implementable controllers, and the existing chaotification approaches are mathematically rigorous in the sense of some commonly used mathematical criteria for chaos such as those defined by Devaney and by Li-Yorke.
1. Introduction

Although the relation between fuzzy logic and chaos theory is not completely understood at the moment, the study of their interactions has been carried out for more than a decade, at least from the following aspects: fuzzy control of chaos, adaptive fuzzy systems from chaotic time series, theoretical relations between fuzzy logic and chaos theory, fuzzy modeling of chaotic systems with assigned properties [1, 2], chaotifying Takagi-Sugeno (TS) fuzzy models [3, 4, 5, 6], and fuzzy-chaos-based cryptography. Fuzzy logic was originally introduced by Lotfi Zadeh in 1965 in his seminal paper "Fuzzy sets" [7], and the first evidence of physical chaos was Edward Lorenz's discovery in 1963 [8], although the study of chaos can be traced back to philosophical pondering hundreds of years ago and to the work of the French mathematician Jules Henri Poincaré at the turn of the last century. Fuzzy set theory resembles human reasoning, using approximate information and inaccurate data to generate decisions under uncertain environments. It is designed to mathematically represent uncertainty and vagueness and to provide formalized tools for dealing with imprecision in
‘This work was supported by Alexander von Humboldt Foundation.
real-world problems. On the other hand, chaos theory is a qualitative study of unstable aperiodic behavior in deterministic nonlinear dynamical systems. Research reveals that it is due to the drastically evolving and changing chaotic dynamics that the human brain can process massive information instantly. "The controlled chaos of the brain is more than an accidental by-product of the brain complexity, including its myriad connections; rather, it may be the chief property that makes the brain different from an artificial-intelligence machine" [9]. Therefore, it is believed that both fuzzy logic and chaos theory are related to human reasoning and information processing. Based on the above-mentioned observations, the study of the interactions between fuzzy logic and chaos theory should provide a new and promising, although challenging, approach for theoretical research and simulational study of human intelligence. In order to better understand this research subject, in this paper the studies on the transition between fuzzy and chaotic systems will be briefly reviewed, including fuzzy modeling of chaotic systems and chaotifying fuzzy systems. In addition, other research on fuzzy-chaos-based applications will also be commented on.

2. Fuzzy Modeling of Chaotic Systems

In this section, only the model-based approach is considered for fuzzy modeling of chaotic systems. Here, TS fuzzy modeling of the Lorenz system is carried out to show this approach. The Lorenz equations are as follows:

$$\frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = r x - y - x z, \qquad \frac{dz}{dt} = x y - b z \tag{1}$$

where σ, r, b > 0 are parameters (σ is the Prandtl number, r is the Rayleigh number, and b is a scaling constant). The nominal values of (σ, r, b) are (10, 28, 8/3) for chaos to emerge. The system (1) has two nonlinear quadratic terms, xy and xz. Therefore, this system can be divided into a linear system with a nonlinear part as follows:

$$\frac{d}{dt}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -\sigma & \sigma & 0 \\ r & -1 & 0 \\ 0 & 0 & -b \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} 0 \\ -xz \\ xy \end{bmatrix} \tag{2}$$
To construct a TS fuzzy model for the Lorenz system, the nonlinear terms xz and xy must be expressed as a weighted sum of linear functions. For this purpose, we first need the following corollary.

Corollary 2.1. Assume x ∈ [M1, M2]. The nonlinear term f(x, y) = x·y can be represented by a weighted sum of linear functions of the form

$$f(x, y) = \Gamma_1(x)\, M_1\, y + \Gamma_2(x)\, M_2\, y,$$

where

$$\Gamma_1(x) = \frac{M_2 - x}{M_2 - M_1}, \qquad \Gamma_2(x) = \frac{x - M_1}{M_2 - M_1},$$

and Γ1(x) + Γ2(x) = 1.
Now we can construct an exact TS fuzzy model, which is not an approximation, of system (1). Using Corollary 2.1, system (2) can be expressed as follows:

Plant Rules:
Rule 1: IF x(t) is about M1 THEN dx(t)/dt = A1 x(t)
Rule 2: IF x(t) is about M2 THEN dx(t)/dt = A2 x(t)

where

$$A_1 = \begin{bmatrix} -\sigma & \sigma & 0 \\ r & -1 & -M_1 \\ 0 & M_1 & -b \end{bmatrix}, \qquad A_2 = \begin{bmatrix} -\sigma & \sigma & 0 \\ r & -1 & -M_2 \\ 0 & M_2 & -b \end{bmatrix},$$
rl = - X + Mz Mz
- MI
,
2 - MI rz= Mz - Mi ’
where rz is positive semi-definite for all z E [ M I ,M z ] . We emphasize that the TS fuzzy model of the Lorenz system, shown in Fig. 1, is not an approximation of the original system, but is a perfect fuzzy model since the defuzzified output of the TS fuzzy model is identical to that of the original chaotic Lorenz system.
Figure 1. TS fuzzy model of the Lorenz system
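The exactness claim is easy to verify numerically: the defuzzified output Γ1·A1 x + Γ2·A2 x reproduces the original right-hand side at every x with x ∈ [M1, M2]. A sketch, with the bounds M1 = -30, M2 = 30 chosen arbitrarily:

```python
import numpy as np

sigma, r, b = 10.0, 28.0, 8.0 / 3.0
M1, M2 = -30.0, 30.0

def A(M):
    """Rule consequent matrix: x in the bilinear terms replaced by M."""
    return np.array([[-sigma, sigma, 0.0],
                     [r, -1.0, -M],
                     [0.0, M, -b]])

def lorenz_rhs(v):
    x, y, z = v
    return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

def ts_rhs(v):
    x = v[0]
    g1 = (M2 - x) / (M2 - M1)   # membership Gamma_1
    g2 = (x - M1) / (M2 - M1)   # membership Gamma_2
    return g1 * A(M1) @ v + g2 * A(M2) @ v

v = np.array([5.0, -3.0, 20.0])
assert np.allclose(lorenz_rhs(v), ts_rhs(v))  # identical, not approximate
```

The identity holds because g1·M1 + g2·M2 = x for every x in [M1, M2], which is exactly Corollary 2.1.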
3. Generating Chaos From Fuzzy Systems
In contrast to the main stream of ordering or suppressing chaos, the opposite direction, namely making a nonchaotic dynamical system chaotic or retaining the existing chaos of a chaotic system, known as "chaotification" (or sometimes "anticontrol of chaos"), has attracted continuous attention from the engineering and physics communities in recent years. There are many practical reasons for chaos generation; for instance, chaos has an impact on some novel time- and/or energy-critical applications. Specific examples include high-performance circuits and devices (e.g., delta-sigma modulators and power converters), liquid mixing, chemical reactions, biological systems (e.g., the human brain, heart, and perceptual processes), secure information processing, and critical decision-making in political, economic and military events. Some systematic and rigorous approaches have been developed to chaotify general discrete-time and continuous-time systems, which inspires us to extend these technologies to fuzzy systems [3, 4, 5].

An example is included here for illustration. First, consider a nonchaotic discrete-time TS fuzzy model, given by

Rule 1: IF x(t) is F1 THEN x(t+1) = G1 x(t) + u(t),
Rule 2: IF x(t) is F2 THEN x(t+1) = G2 x(t) + u(t),

where G1 = 0.3, G2 = -0.3, x(t) ∈ [-d, d] and d > 0, with the membership functions

$$F_1(x(t)) = \frac{d + x(t)}{2d}, \qquad F_2(x(t)) = \frac{d - x(t)}{2d}.$$
The controlled TS fuzzy system is described as follows:

$$x(t+1) = \sum_{i=1}^{2} F_i(x(t))\, G_i\, x(t) + u(t),$$

where the controller is taken as a sinusoidal function of the state, without going into the details for simplicity. In the simulation, the magnitude of the control input is arbitrarily chosen to be σ = 0.1. Thus ||u(t)||∞ ≤ σ, and β can also be regarded as a control parameter. Without control, the TS fuzzy model is stable. When β = 1.3, the phase portrait diagram is shown in Fig. 2. These numerical results verify the theoretical analysis and the design of the proposed chaos generator.

Remarks: For continuous-time TS fuzzy systems, two approaches have been used for chaotification. One is to discretize them first; then the above method can be applied [5]. The other is very general, designing a time-delayed feedback controller [6].
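A sketch of this chaotification step, under the scalar reading of the example above and an assumed sinusoidal feedback u(t) = σ sin(βx(t)/σ) in the spirit of [3, 4]; both the consequents and the exact controller form are illustrative assumptions here, not the precise design of those papers:

```python
import numpy as np

d, sigma, beta = 1.0, 0.1, 1.3   # state bound and controller parameters

def step(x, u):
    """Defuzzified TS dynamics with the scalar consequents G1 = 0.3,
    G2 = -0.3 and the triangular memberships F1, F2 given above."""
    F1 = (d + x) / (2 * d)
    F2 = (d - x) / (2 * d)
    return (F1 * 0.3 + F2 * (-0.3)) * x + u

x, orbit = 0.1, []
for t in range(5000):
    u = sigma * np.sin(beta * x / sigma)  # assumed sinusoidal chaotifier
    x = step(x, u)
    orbit.append(x)
# with u = 0 the orbit decays to 0; the bounded, rapidly oscillating
# sinusoid keeps it wandering irregularly inside [-d, d]
```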
4. Conclusion
In this paper, current studies on the interactions between fuzzy logic and chaos theory have been briefly reviewed with focus on fuzzy modeling of chaotic systems and chaotification of fuzzy systems. With the further understanding of their relations, combining fuzzy and chaos control methods will provide promising means for applications. Therefore, more efforts are needed to further explore the interactions between fuzzy logic and chaos theory.
Figure 2. Phase portrait with some structure
References
1. S. Baglio, L. Fortuna and G. Manganaro, "Design of fuzzy iterators to generate chaotic time series with assigned Lyapunov exponent," Electronics Letters, Vol.32, No.4, 1996, pp.292-293.
2. M. Porto and P. Amato, "A fuzzy approach for modeling chaotic dynamics with assigned properties," The 9th IEEE Int. Conf. on Fuzzy Systems, FUZZ-IEEE 2000, Vol.1, pp.435-440.
3. Z. Li, J. B. Park, G. Chen, Y. H. Joo and Y. H. Choi, "Generating chaos via feedback control from a stable TS fuzzy system through a sinusoidal nonlinearity," Int. J. Bifur. Chaos, Vol.12, No.10, 2002, pp.2283-2291.
4. Z. Li, J. B. Park, Y. H. Joo, G. Chen and Y. H. Choi, "Anticontrol of chaos for discrete TS fuzzy systems," IEEE Trans. Circ. Syst.-I, Vol.49, No.2, 2002, pp.249-253.
5. Z. Li, J. B. Park, and Y. H. Joo, "Chaotifying continuous-time TS fuzzy systems via discretization," IEEE Trans. Circ. Syst.-I, Vol.48, No.10, 2001, pp.1237-1243.
6. Z. Li, W. Halang, G. Chen and L. F. Tian, "Chaotifying a continuous-time TS fuzzy system via time-delay feedback," J. of Dynamics of Continuous, Discrete and Impulsive Systems, Series B, Vol.10, No.6, 2003, pp.813-832.
7. L. Zadeh, "Fuzzy sets," Inf. Control, Vol.8, 1965, pp.338-353.
8. E.N. Lorenz, "Deterministic Nonperiodic Flow," J. Atmos. Sci., Vol.20, 1963, pp.130-141.
9. W.J. Freeman, "The physiology of perception," Scientific American, Feb. 1991, pp.78-85.
A NOVEL CHAOS-BASED VIDEO ENCRYPTION ALGORITHM *
HUAN JIAN, YAOBIN MAO AND ZHIQUAN WANG
Department of Automation, Nanjing University of Sci. & Tech., No.200 Xiaolingwei Str., Nanjing 210094, P. R. China
E-mail: [email protected]

ZHONG LI AND PING LI
Faculty of Electrical and Computer Engineering, FernUniversität Hagen, 58084, Germany
E-mail: [email protected]
The proposed chaos-based encryption algorithm first employs a sawtooth-like chaotic map t o generate a pseudo-random bit sequence, then uses it t o determine the types of encrypting operations performed on video coding streams. Four kinds of selective encrypting operations are introduced to efficiently scramble and shuffle all of the DC coefficients, part of the AC coefficients of I blocks as well as Motion Vectors (MVs) through XOR, XNOR and replacement operations. With slight computational overhead and tiny data dilation, the encryption algorithm is amenable to the H.263 video-conference coding standard. Finally, the feasibility and the security of the proposed algorithm are demonstrated by experiments carried out on several segments of H.263 coded video streams.
1. Introduction
The wide deployment of video services such as VoD (Video on Demand), video conferencing and video surveillance has dramatically increased the research interest in multimedia security in the last ten years. To protect video contents, cryptology, which appears to be an effective way to ensure information security, has been employed in many practical applications. However, due to the intrinsic properties of videos, such as bulk capacity and high redundancy, compression of video data is unavoidable. The extra operations of encryption aggravate the cost of video coding, which makes real-time
64 1
642
video application difficult. In this regards, it is argued that the traditional encryption algorithms like DES, IDEA, and RSA, which have been originally developed for text data, are not suitable for secure real-time video
application^^^^^. A recent major trend in multimedia encryption is to minimize the computational requirements by “selective encryption” that only those intelligible important parts of content are subject to encryptionl0>l2.Many h he me^^^^^^,^^^^^ complied with this principle have been brought out, to sum up, the core idea of the proposed selective algorithms for video encryption is to shuffle or scramble only a portion of the compressed bitstream so that the resulting output is un-decodable in the absence of correct keys, or, even it is illegally breakable, the visual quality of the decoded video is still unacceptable for practical use. Since most commonly used video compression standards such as MPEG, H.261 and H.263 utilize transform coding, for instance block-based DCT, to reduce spacial redundance and inter-picture prediction to eliminate temporal redundancy, the main energy of a video is concentrated in a few DC (Direct Current) and AC (Alternating Current) coefficients in most of the intra-coded frames (I frames), meanwhile, the inter-frame information is expressed as several motion vectors (MVs). After I-frame extraction, the residual information of a video segment is contained in two other kinds of frames, forward predictive coded frames (P frames) and hi-directional predictive frames (B frames) respectively. Selective encryption can be imposed on entire I frames or portion of I, P and B frames as well as MVs. Different schemes employs different element selection strategies and different cryptological algorithms. Some earlier algorithms involve encryption of I framesg. However, Agi and Gong’ revealed that great portions of the videos can be reconstructed by un-encrypted I blocks in P and B frames, therefore sole encryption of I frames may not be sufficient. Tanglo has suggested a cipher scheme that encrypts different levels of selected video streams. According to different security requirements, stream headers, I frames, I blocks in P and B frames and some other types of frames are subject to encryption respectively. Other efficient algorithms that alter sign bits of motion vectors and DCT coefficients have been proposed by Shi and Bhargava7, which significantly reduced the calculation overhead spent on encryption. Recently, Zeng and Lei l2 has proposed a scramble technique in frequency domain that divides the transform coefficients into blocks/segments and performs encryptions through part or all of following three operations: selective bit scrambling, block shuffling and block rotation of the transform coefficients and motion vectors.
In parallel with the development of selective encryption, another important technology, chaos-based cryptology, has emerged and flourished. Owing to the prominent properties of chaos, like sensitivity to parameters and initial values, ergodicity and a broad-band spectrum, chaotic pseudo-random sequences have several advantageous features such as ease of generation, sensitive dependence on seeds, and non-periodicity, which especially suit the application of cryptology. There have been some attempts at incorporating chaos in data encryption, and even video encryption. Sobhy and Shehata [8] have described an algorithm that uses the chaotic Lorenz system to accomplish all-purpose encryption. According to their experiments, the time required for encrypting an image of 93 kilobytes is about 20 seconds, which is quite unacceptable for a real-time application. A chaos-based image encryption scheme suggested by Yen and Guo [11] is to change the motion vectors by XOR and XNOR operations with two separately generated chaotic sequences controlled by two predefined keys. However, since the scheme doesn't mention any operations on I blocks, the improvement of the security of video encryption is questionable. Inspired by selective encryption and chaos-based encryption, in this paper we present a novel chaotic video encryption algorithm. A sawtooth-like chaotic map is first used to generate a pseudo-random bit sequence (PRBS); then, according to the PRBS, the DC coefficients, some AC coefficients of I blocks and the motion vectors are encrypted. The details of the algorithm as well as some experimental results will be elaborated in the paper. The rest of the paper is organized as follows. In section 2, details of the proposed algorithm are described. Section 3 exhibits the experimental results. Finally, section 4 concludes the whole paper.
2. Chaos-based video encryption algorithm

2.1. The generation of the chaotic pseudo-random bit sequence
To simplify the software implementation and speed up the encryption, a sawtooth-like map [3] (shown in formula (1)) is used to generate the chaotic PRBS (CPRBS):

$$x_{n+1} = c\, x_n \bmod 1 \tag{1}$$
It can be demonstrated that the above map has many good properties which fully meet our CPRBS generator requirements.
Firstly, we inspect the Lyapunov Exponent (LE) of map (1). According to the definition of the Lyapunov Exponent of a one-dimensional map, the LE of the map, λ, can be calculated as below:

$$\lambda = \lim_{T \to \infty} \frac{1}{T} \sum_{n=0}^{T-1} \ln \left| \frac{dx_{n+1}}{dx_n} \right| = \ln c \tag{2}$$
So when c > 1, the map is chaotic. Furthermore, if we restrict the parameter c to an integer set and let c ≥ 2, the distribution of the sequence generated by map (1) is even [6]. That means that, if we want to generate an evenly distributed pseudo-random bit sequence by bi-polarizing the chaotic sequence produced by the sawtooth-like map, the threshold should be set to 0.5. By using the above chaotic map, a CPRNG is easy to obtain. Suppose that after n iterations the map produces a value x_n. The n-th bit b_n of the sequence is then determined by the following coin tossing formula:

$$b_n = \begin{cases} 0 & \text{if } x_n < 0.5 \\ 1 & \text{if } x_n \ge 0.5 \end{cases} \tag{3}$$
Thus, one obtains a bit sequence {b_1, b_2, ..., b_n, ...}. Owing to the intrinsic properties of chaos, like ergodicity and mixing, the CPRNG has many good features: unique dependence of the sequence on the seed, equiprobable occurrence of "0" and "1," and asymptotic statistical independence of bits, etc.
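A direct sketch of the CPRBS generator defined by (1) and (3); the choice c = 3 and the seed are arbitrary (an odd c also avoids the rapid loss of mantissa bits that c = 2 would cause in floating-point arithmetic):

```python
def cprbs(seed, c=3, nbits=20000):
    """Chaotic pseudo-random bit sequence from the sawtooth-like map
    x_{n+1} = c * x_n mod 1 (integer c >= 2), thresholded at 0.5."""
    x, bits = seed, []
    for _ in range(nbits):
        x = (c * x) % 1.0
        bits.append(1 if x >= 0.5 else 0)
    return bits

bits = cprbs(seed=0.3141592653589793)
```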
2.2. Video encryption scheme

The main idea of the proposed algorithm is to encrypt, through the chaotic sequence, the quantized DC coefficients and parts of the AC coefficients of I blocks as well as the MVs. To simplify the description of our algorithm, we take H.263-compliant video as an example and first define some notations. Generally speaking, each I block in H.263 video streams consists of 64 components: one DC coefficient and 63 AC ones. The DC coefficient is denoted as d, and all other AC coefficients are denoted as a_ij, where i, j ∈ [0, 8), i + j ≠ 0, i, j ∈ Z. The AC coefficients are then further divided into 3 groups: the diagonal set I_d = {a_ij | i = j, i, j ∈ Z}, the upper triangular set I_u = {a_ij | i < j, 0 ≤ i, j < 8, i, j ∈ Z}, and the lower triangular set I_l = {a_ij | i > j, 0 ≤ i, j < 8, i, j ∈ Z}. The horizontal parts of the MVs are denoted as X, while the vertical parts are denoted as Y. In order to efficiently shuffle and scramble video streams, four kinds of operations are introduced, which are enumerated as follows:
Op 1: Cyclically left-shift d by 1 bit; XOR each nonzero AC coefficient in I_u with the corresponding element in I_l, and replace the elements of I_l with the results; replace Y with the non-zero outputs of X XOR-ed with Y.
Op 2: Cyclically left-shift d by 2 bits; XOR each nonzero AC coefficient in I_l with the corresponding element in I_u, and replace the elements of I_u with the results; replace X with the non-zero outputs of Y XOR-ed with X.
Op 3: Cyclically right-shift d by 2 bits; XNOR each nonzero AC coefficient in I_l with the corresponding element in I_u, and replace the elements of I_u with the results; replace X with the non-zero outputs of Y XNOR-ed with X.
Op 4: Cyclically right-shift d by 1 bit; XNOR (not-exclusive-OR) each nonzero AC coefficient in I_u with the corresponding element in I_l, and replace the elements of I_l with the results; replace Y with the non-zero outputs of X XNOR-ed with Y.
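The following sketch shows how two-bit indices drawn from the CPRBS select among the four operations; only the DC-coefficient part is shown, with an assumed 8-bit representation (the AC and MV parts follow the same XOR/XNOR pattern):

```python
def rol(v, n, width=8):
    """Cyclic left shift of an unsigned width-bit value."""
    n %= width
    return ((v << n) | (v >> (width - n))) & ((1 << width) - 1)

def encrypt_dc(dc, op):
    """Op 1..4 act on the DC coefficient by cyclic shifts
    (left 1, left 2, right 2, right 1); 8 bits assumed for illustration."""
    shift = {1: 1, 2: 2, 3: -2, 4: -1}[op]
    return rol(dc, shift if shift > 0 else 8 + shift)

def ops_from_bits(bits):
    """Each two CPRBS bits form an integer in 0..3 indexing Op 1..4."""
    return [2 * b1 + b0 + 1 for b1, b0 in zip(bits[::2], bits[1::2])]
```

Because the cyclic shifts and XOR/XNOR operations are involutive up to the known key stream, decryption simply replays the same operation sequence in reverse.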
3. Experimental results In this section we present the testing result of the chaotic pseudo-random bit sequence and illustrate the experiments on the proposed video encryption algorithm.
646
Figure 1.
The block diagram of proposed chaos-based video encryption algorithm
3.1. FIPS 140-2 test

To test the performance of the CPRBS, a practical and widely accepted specification, FIPS 140-2, issued by the National Institute of Standards and Technology (NIST) in the United States [4], is employed. The specification consists of 4 subjects with a total of 16 items; more precisely, a single stream of 20,000 consecutive bits is subjected to the following 4 tests: mono-bit test, poker test, runs test and long run test. Table 1 lists one test result of a CPRBS, showing the good performance of the generated bit sequence.
Table 1. FIPS 140-2 test result of the CPRBS.

test item   monobit   poker     run test (r=1, r=2, r=3, r=4, r=5, r>=6)   long run
result      9,949     18.9504   2,385  1,278  606  298  174  164           0
3.2. Encryption results

Experiments of encrypting video sequences have been performed, which shows the effectiveness of the proposed algorithm. Fig. 2 exhibits one set of typical results.

Figure 2. Experiments on the Foreman.qcif sequence: the upper left is the original image of the 1st frame, the upper right is the corresponding encrypted image, the bottom left is the original image of the 86th frame and the bottom right is the corresponding encrypted image.

3.3. Encryption speed

The cost of time for encrypting H.263 video streams has been tested. In this experiment, three segments of videos are subject to test, each of which
contains 300 frames of 176 x 144 images. The testing results are shown in Table 2. From the experimental results, we can find that the computational overhead introduced by the encryption processing is slight and the encryption processing doesn't interfere with the motion features of the videos.

Table 2. Testing results of encryption speed.

video sequence     without encryption   with encryption   overhead
Foreman.qcif       32.27                32.84             1.77%
Claire.qcif        25.63                25.86             0.90%
Bridge-far.qcif    24.03                                  1.05%
3.4. Data dilation

The same three video segments are also used for the test of data dilation. Experimental results show that the proposed encryption algorithm introduces only slight data inflation. Table 3 exhibits the results.
Table 3. Testing results of data dilation.

video sequence     number of frames   un-encrypted video (KB)   encrypted video (KB)   data dilation (%)
Foreman.qcif       300                118                       119                    0.85%
Claire.qcif        300                22.4                      23                     2.67%
Bridge-far.qcif    300                6.93                      6.95                   0.29%
4. Conclusion

A new chaos-based selective encryption algorithm that caters to video coding has been presented. The algorithm employs a sawtooth-like chaotic map to generate a pseudo-random bit sequence, which is further used to control the process of shuffling and scrambling the DC and AC coefficients and the MVs in H.263 video streams. Experimental results show that the algorithm is feasible and introduces only a modest overhead in video encoding. More security issues and further experimental results are still being processed and will be reported in a forthcoming paper.
References
1. I. Agi and L. Gong, in Proceedings of the Internet Society Symposium on Network and Distributed System Security, 137 (1996).
2. A.M. Alattar et al., in Proceedings of the 1999 International Conference on Image Processing (ICIP'99), 4, 256 (1999).
3. G. Chen and D. Lai, International Journal of Bifurcation and Chaos, 8, 1585 (1998).
4. NIST, Federal Information Processing Standards Publication FIPS PUB 140-2: Security Requirements for Cryptographic Modules, 2001.
5. Y.B. Mao and G.R. Chen, in Handbook of Computational Geometry for Pattern Recognition, Computer Vision, Neurocomputing and Robotics, ed. Eduardo Bayro-Corrochano (Springer-Verlag, New York, in press).
6. Y.B. Mao and W.B. Liu, A Chaotic Stream Encryption Scheme Accommodable to FPGA Implementation, Technical Report, (2004).
7. C. Shi and B. Bhargava, in Proceedings 17th IEEE Symposium on Reliable Distributed Systems '98, 381 (1998).
8. M.I. Sobhy and A.R. Shehata, in Proceedings of the IEEE Acoustics, Speech and Signal Conference '01, 997 (2001).
9. G.A. Spanos and T.B. Maples, in 4th International Conference on Computer Communications and Networks (ICCCN '95), 2 (1995).
10. L. Tang, in Proceedings of the Fourth ACM International Multimedia Conference '96, 137 (1996).
11. J.C. Yen and J.I. Guo, in Proceedings 2000 IEEE International Symposium on Circuits and Systems, ISCAS 2000, 4, 49 (2000).
12. W. Zeng and S. Lei, IEEE Trans. on Multimedia, 5, 118 (2003).
A THEORY OF FUZZY CHAOS FOR THE SIMULATION AND CONTROL OF NON-LINEAR DYNAMICAL SYSTEMS

OSCAR CASTILLO AND PATRICIA MELIN
Computer Science Dept., Tijuana Institute of Technology, Tijuana, B.C., Mexico

We describe in this paper a new theory of chaos using fuzzy logic techniques. Chaotic behavior in non-linear dynamical systems is very difficult to detect and control. Part of the problem is that mathematical results for chaos are difficult to use in many cases, and even if one could use them there is an underlying uncertainty in the accuracy of the numerical simulations of the dynamical systems. For this reason, we can model the uncertainty of detecting the range of values where chaos occurs using fuzzy set theory. Using fuzzy sets, we can build a theory of Fuzzy Chaos, where we can use fuzzy sets to describe the behaviors of a system. We illustrate our approach with two cases: Chua's circuit and Duffing's oscillator.
1. Introduction

In the traditional mathematical theory of chaos, "chaotic behavior" is defined as sensitive dependence on initial conditions [6, 9]. This sensitive dependence is measured with mathematical concepts like the Lyapunov exponents or the fractal dimension of the dynamical system [16]. However, in numerical simulations we usually have uncertainty related to numerical errors in the methods and also in the initial values. For this reason, it is very difficult to identify real chaotic behavior. The approach presented in this paper is to relax the traditional mathematical definition of "chaos" by using the theory of fuzzy logic [21], in this way obtaining a new, more realistic definition of chaotic behavior. Our fuzzy chaos definition is a weaker definition of chaos, because we do not impose strict conditions on the accuracy of the numerical values for this behavior to occur. Also, it is an easier definition to apply to real-world problems because in many cases we only have relative empirical evidence of chaos, which of course means that we have uncertainty about the identification of this behavior [1, 2, 3].

2. Towards a New Theory of Fuzzy Chaos
For a given dynamical system expressed as a non-linear differential equation:

$$dy/dt = f(t, y), \qquad y(0) = y_0 \tag{1}$$

or as a non-linear difference equation:

$$y_{t+1} = f(y_t, \theta), \qquad y(0) = y_0 \tag{2}$$

we can have many different types of dynamic behavior, depending on the parameter values and also on the analytical properties of the function f. Also, there exists a fundamental difference between Equations (1) and (2), namely, that differential equations can only exhibit chaos when they are at least three-dimensional [9, 20]. However, difference equations can exhibit chaos even in the one-dimensional case.
In particular, we can have "chaotic behavior", defined formally as sensitive dependence on initial conditions, in many real dynamical systems. However, in numerical simulations we usually have uncertainty related to numerical errors in the methods and also in the initial values. For this reason, it is very difficult to identify real chaotic behavior precisely [7, 9, 13]. We can relax the traditional mathematical definition of "chaos" by using the theory of fuzzy logic [12, 23], in this way obtaining a new, more realistic definition of chaotic behavior. We assume that we have a dynamical system on the real line given as:

$$y_t = f(y_{t-1}, \theta) \tag{3}$$

In this case, we can associate chaotic behavior with the number of period doublings (or bifurcations) that occur when the parameter θ is varied. According to this fact, we can state the following definition:

Definition 1 (chaotic behavior according to period doublings). A one-dimensional dynamical system shows fuzzy chaos when the number of period doublings is considered to be large:
IF the number of period doublings is large THEN behavior is fuzzy chaos.

Definition 2 (fuzzy chaos by the fractal dimension). A one-dimensional dynamical system shows fuzzy chaos when the value of the fractal dimension is large (close to a numeric value of 2 for the plane):
IF the fractal dimension is large THEN behavior is fuzzy chaos.
Also, the value of the dimension has to be calculated from the time series using the box counting algorithm [4, 5, 16].
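A sketch of the box counting estimate used in Definition 2 for a planar trajectory, together with a membership function for "large"; the box sizes and the membership breakpoints are arbitrary illustrative choices:

```python
import numpy as np

def box_counting_dimension(points, sizes=(0.1, 0.05, 0.025, 0.0125)):
    """Estimate the fractal dimension of a set of 2-D points by counting
    occupied boxes N(eps) and fitting log N(eps) ~ -D log eps."""
    pts = (points - points.min(axis=0)) / np.ptp(points, axis=0).max()
    counts = []
    for eps in sizes:
        boxes = {tuple(np.floor(p / eps).astype(int)) for p in pts}
        counts.append(len(boxes))
    D, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return D

# fuzzy rule: IF the fractal dimension is large THEN behavior is fuzzy chaos
def mu_large(D):
    """Membership of 'large' for the plane, ramping up toward 2."""
    return float(np.clip((D - 1.0) / 1.0, 0.0, 1.0))
```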
3. Controlling Chaos

More than two decades of intensive studies on non-linear dynamics have posed the question of the practical applications of chaos [13]. One of the possible answers is to control chaotic behavior in such a way as to make it predictable. Indeed, nowadays the idea of controlling chaos is an appealing one [7, 14]. The methods described in this section are illustrated by the example of Chua's circuit [8]. Chua's circuit contains three linear energy storage elements (an inductor and two capacitors), a linear resistor, and a single non-linear resistor N_R, namely Chua's diode, with a three-segment piecewise linear v-i characteristic defined by

$$f(v_{C1}) = m_0 v_{C1} + \tfrac{1}{2}(m_1 - m_0)\left(|v_{C1} + 1| - |v_{C1} - 1|\right) \tag{4}$$

where the slopes in the inner and outer regions are m_1 and m_0, respectively. In this case the state equations for the dynamics of Chua's circuit are as follows:

$$C_1 \frac{dv_{C1}}{dt} = G(v_{C2} - v_{C1}) - f(v_{C1}), \qquad C_2 \frac{dv_{C2}}{dt} = G(v_{C1} - v_{C2}) - i_L, \qquad L \frac{di_L}{dt} = v_{C2} \tag{5}$$

where G = 1/R. It is well known that for R = 1.64 kΩ, C1 = 10 nF, C2 = 99.34 nF, m1 = -0.76 mS, m0 = -0.41 mS, and L = 18.46 mH, Chua's circuit operates on the chaotic double-scroll Chua's attractor. We show in the following figures the simulation of Chua's circuit for the initial conditions (-3, -3, -10). Figure 1 shows the plot of the variable v_C1 in time; in this figure we can appreciate the erratic behavior of this variable. In Figure 2 we show a bi-dimensional view of the double-scroll Chua's attractor. The chaotic dynamics of Chua's circuit have been widely investigated [15]. Experiments with this circuit are very easy to perform, even for non-specialists [8, 13].
Figure 1. Plot of variable V_C1 of Chua's circuit.

Figure 2. Bi-dimensional view of Chua's attractor.
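A sketch reproducing this simulation from equations (4) and (5); the sign conventions follow the equations above, and the initial inductor current is assumed to be in mA, since the text does not state its unit:

```python
import numpy as np
from scipy.integrate import solve_ivp

# circuit parameters from the text (conductances in S, capacitances in F)
R, C1, C2, L = 1.64e3, 10e-9, 99.34e-9, 18.46e-3
m0, m1 = -0.41e-3, -0.76e-3
G = 1.0 / R

def f_NR(v):
    """Three-segment piecewise linear characteristic of Chua's diode, eq. (4)."""
    return m0 * v + 0.5 * (m1 - m0) * (abs(v + 1.0) - abs(v - 1.0))

def chua(t, s):
    """State equations (5)."""
    v1, v2, iL = s
    return [(G * (v2 - v1) - f_NR(v1)) / C1,
            (G * (v1 - v2) - iL) / C2,
            v2 / L]

sol = solve_ivp(chua, [0.0, 0.05], [-3.0, -3.0, -10e-3],  # iL assumed in mA
                max_step=1e-6)
# plotting sol.y[0] against sol.y[1] traces the double-scroll attractor
```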
To explain the role of dynamical absorbers in controlling chaotic behavior, let us consider the Duffing oscillator coupled with an additional linear system:

$$X'' + aX' + bX + cX^3 + d(X - Y) = B_0 + B_1 \cos \omega t \tag{6a}$$

$$Y'' + e(Y - X) = 0 \tag{6b}$$

where a, b, c, d, e, B_0, B_1, and ω are constants. Here d and e are the characteristic parameters of the absorber, and we take e as the control parameter. It is well known that the Duffing oscillator shows chaotic behavior in certain parameter regions. We show in Figure 3 a two-dimensional view of the chaotic behavior of the Duffing oscillator. In Figure 4 we show a plot of
variable X across time [0, 350]. Let us consider the parameters of Equation (6) to be fixed at the values a = 0.077, b = 0, c = 1.0, B0 = 0.045, B1 = 0.16, ω = 1.0; then we can find [7] that we have chaos for e ∈ [0, 0.10], and we can control this chaos by increasing e above 0.10.
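A sketch of the coupled system (6) at the parameter values just quoted; the absorber coupling d is not specified in the text, so the value below is a placeholder. Sweeping e past 0.10 shows the transition out of chaos:

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c = 0.077, 0.0, 1.0
B0, B1, w = 0.045, 0.16, 1.0
d = 0.1  # absorber coupling; not given in the text, arbitrary placeholder

def duffing_absorber(t, s, e):
    """Equations (6a)-(6b) as a first-order system."""
    X, Xdot, Y, Ydot = s
    return [Xdot,
            B0 + B1 * np.cos(w * t) - a * Xdot - b * X - c * X**3 - d * (X - Y),
            Ydot,
            -e * (Y - X)]

for e in (0.05, 0.15):  # inside and above the chaotic range [0, 0.10]
    sol = solve_ivp(duffing_absorber, [0.0, 350.0], [0.1, 0.0, 0.0, 0.0],
                    args=(e,), max_step=0.05)
    # inspect sol.y[0] (variable X): erratic for small e, regular for larger e
```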
I I
Y
I. 110
I
0
I D
Y
ffi" I
E q
Y
a 110"
~p
1
0
5
0
0
5
?
Y
Figure 3 Chaotic behavior in Duffing's oscillator. I lm
.I
5
Y
la tlo n
0
f D
L-0
5 0
1 0 0
1 5 0 tlm
Y
ffln I E 4
.
Y
1 llo n
pL
2 0 0
2 5 0
300
3
0
Figure 4. Plot of variable x across time for Duffing's oscillator.
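A short simulation sketch for Eq. (6) is given below in Python. The parameter values are those quoted in the text; the absorber stiffness d is not stated there, so the value used here is an assumption made only to keep the sketch self-contained.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c, d = 0.077, 0.0, 1.0, 0.1   # d = 0.1 is our assumption
B0, B1, w = 0.045, 0.16, 1.0
e = 0.05  # e in [0, 0.10] gives chaos; raising e above 0.10 controls it

def duffing_absorber(t, s):
    x, xd, y, yd = s
    return [xd,
            # Eq. (6a) solved for x''
            B0 + B1 * np.cos(w * t) - a * xd - b * x - c * x**3 - d * (x - y),
            yd,
            -e * (y - x)]   # Eq. (6b)

sol = solve_ivp(duffing_absorber, (0.0, 350.0), [0.0, 0.0, 0.0, 0.0],
                t_eval=np.linspace(0.0, 350.0, 7001), rtol=1e-9)
# sol.y[0] against sol.y[1] reproduces a phase portrait like Figure 3;
# sol.y[0] against sol.t reproduces a time plot like Figure 4.
```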
4. Controlling Chaotic Behavior using the Concept of Fuzzy Chaos
In any of the above-mentioned methods for controlling chaos, a specific parameter is used to change the dynamics of the system from chaotic to stable behavior. For example, for the specific case of Duffing's oscillator, the parameter "e" of Equation (6) can be used for controlling the chaotic behavior of the oscillator. However, the crisp interval [0, 0.10] for parameter "e" in which chaotic behavior occurs is not really an accurate range of values. Of course, for e = 0 we can expect chaotic behavior, but as "e" increases in value real chaotic behavior is more difficult to find. At the crisp boundary of e = 0.10 things are more dramatic: one can find either cyclic stable behavior or unstable behavior, depending on the conditions of the experiment or simulation. For this reason, it is more appropriate to use the proposed concept of "fuzzy chaos", which allows us to model the uncertainty in identifying this chaotic behavior. In this case, a membership function can be defined to represent this uncertainty in
finding chaotic behavior, and this is also really helpful in controlling chaotic behavior, as we can take action even before completely chaotic behavior is present. For the case of the Duffing oscillator we can define fuzzy rules for identifying specific dynamic behaviors. For example, chaotic behavior can be given by the following rule: IF e is Small THEN behavior is fuzzy chaos. In the above fuzzy rule, the linguistic term "small" has to be defined by an appropriate membership function. Other similar rules can be established for identifying different dynamic behaviors of the system. One obvious advantage of this approach is that we are able to have relative evidence of chaotic behavior before there is complete instability. As a consequence, we can take action in controlling this chaotic behavior sooner than with traditional methods. A sample fuzzy rule for controlling chaos is as follows: IF behavior is fuzzy chaos THEN increase is small positive. This fuzzy rule simply states that when fuzzy chaos is present we must slightly increase the value of "e". Of course, linguistic terms like "small positive" need to be defined properly. We show in Table 1 the comparison between the methods for controlling chaos for the two dynamic systems considered in this paper. In Table 1 we show the efficiency and accuracy of controlling chaotic behavior for the two cases described before. We considered a sample of 200 different experimental conditions for both dynamical systems, and compared the relative number of times that a particular method was able to really control chaotic behavior. The implementation of the fuzzy chaos approach for behavior identification was done in MATLAB [17]. We show in Figure 5 the membership functions of the linguistic variable corresponding to the parameter "e", in which there are three linguistic values. Table 1. Comparison between the methods for controlling chaos.
                                   Chua's circuit   Duffing's oscillator
Traditional chaos definition (%)        98.50              96.00
New fuzzy chaos definition (%)          99.50              98.50

Figure 5. Membership functions for parameter "e".
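The identification and control rules above can be prototyped directly. The following Python sketch is only an illustration: the triangular shape and the breakpoints of the three linguistic values of "e" are our assumptions, since the exact membership functions of Figure 5 are not recoverable from the text.

```python
def trimf(x, a, b, c):
    # Triangular membership function with feet a, c and peak b
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def mu_small(e):  return trimf(e, -0.1, 0.0, 0.12)
def mu_medium(e): return trimf(e, 0.05, 0.15, 0.25)
def mu_large(e):  return trimf(e, 0.18, 0.3, 0.42)

def fuzzy_chaos_degree(e):
    # IF e is Small THEN behavior is fuzzy chaos
    return mu_small(e)

def control_step(e, gain=0.01):
    # IF behavior is fuzzy chaos THEN increase of "e" is small positive
    return e + gain * fuzzy_chaos_degree(e)

e = 0.02
for _ in range(100):   # gradually raise "e" out of the chaotic range
    e = control_step(e)
```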
5. Conclusions
We have presented in this paper a new theory of fuzzy chaos for non-linear dynamical systems. We can apply this theory to behavior identification. We also presented in this paper a new method for controlling non-linear dynamical systems. This method is based on a hybrid fuzzy-chaos approach to achieve the control of a particular dynamical system given its mathematical model.
References

1. Abraham, E. & Firth, W. J. (1984). "Multiparameter Universal Route to Chaos in a Fabry-Perot Resonator", Optical Bistability, Vol. 2, pp. 119-126.
2. Castillo, O. & Melin, P. (1995). "An Intelligent System for the Simulation of Non-Linear Dynamical Economical Systems", Journal of Mathematical Modelling and Simulation in Systems Analysis, Gordon and Breach Publishers, Vol. 18-19, pp. 767-770.
3. Castillo, O. & Melin, P. (1996). "Automated Mathematical Modelling and Simulation of Dynamical Engineering Systems using Artificial Intelligence Techniques", Proceedings CESA'96, Gerf EC Lille, pp. 682-687.
4. Castillo, O. & Melin, P. (1997). "Mathematical Modelling and Simulation of Robotic Dynamic Systems using Fuzzy Logic Techniques and Fractal Theory", Proceedings of IMACS World Congress'97, Wissenschaft & Technik Verlag, Vol. 5, pp. 343-348.
5. Castillo, O. & Melin, P. (1998). "Modelling, Simulation and Behavior Identification of Non-Linear Dynamical Systems with a New Fuzzy-Fractal-Genetic Approach", Proceedings of IPMU'98, EDK Publishers, Vol. 1, pp. 467-474.
6. Castillo, O. & Melin, P. (1999). "A General Method for Automated Simulation of Non-Linear Dynamical Systems using a New Fuzzy-Fractal-Genetic Approach", Proceedings CEC'99, IEEE Press, Vol. 3, pp. 2333-2340.
7. Castillo, O. & Melin, P. (2001). Soft Computing for Control of Non-linear Dynamical Systems, Springer-Verlag, Heidelberg, Germany.
8. Chua, L. O. (1993). "Global unfolding of Chua's circuit", IEICE Transactions Fund., pp. 704-734.
9. Devaney, R. (1989). An Introduction to Chaotic Dynamical Systems, Addison Wesley Publishing.
10. Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley Publishing.
11. Grebogi, C., Ott, E. & Yorke, J. A. (1987). "Chaos, Strange Attractors, and Fractal Basin Boundaries in Nonlinear Dynamics", Science, Vol. 238, pp. 632-637.
12. Jang, J.-S. R., Sun, C.-T. & Mizutani, E. (1997). Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice-Hall.
13. Kapitaniak, T. (1996). Controlling Chaos: Theoretical and Practical Methods in Non-Linear Dynamics, Academic Press.
14. Kocarev, L. & Kapitaniak, T. (1995). "On an equivalence of chaotic attractors", Journal of Physics A, Vol. 28, pp. 249-254.
15. Madan, R. (1993). Chua's Circuit: Paradigm for Chaos, World Scientific, Singapore.
16. Mandelbrot, B. (1987). The Fractal Geometry of Nature, W. H. Freeman and Company.
17. Nakamura, S. (1997). Numerical Analysis and Graphic Visualization with MATLAB, Prentice Hall.
18. Ott, E., Grebogi, C. & Yorke, J. A. (1990). "Controlling Chaos", Physical Review Letters, Vol. 64, pp. 1196-1199.
19. Pyragas, K. (1992). "Continuous control of chaos by self-controlling feedback", Physics Letters A, pp. 421-428.
20. Rasband, S. N. (1990). Chaotic Dynamics of Non-Linear Systems, Wiley Interscience.
21. Zadeh, L. A. (1975). "The Concept of a Linguistic Variable and its Application to Approximate Reasoning", Information Sciences, 8, 43-80.
HARDWARE IMPLEMENTATION OF AN IMPROVED SYMMETRY FEATURE POINT EXTRACTION ALGORITHM
D. POPESCU
Computer Science Department, University POLITEHNICA of Bucharest, 303 Splaiul Independentei, sec. 6, Bucharest, ROMANIA
E-mail: [email protected]
J. ZHANG
Computer Science Department, University of Hamburg, 30 Vogt-Koelln Street, 22527 Hamburg, Germany
E-mail: [email protected]
This paper presents a hardware implementation for an improved symmetry feature point extraction algorithm used in a range estimation task, for an autonomous mobile robot application. The algorithm that computes the symmetry points is an improved version of the Symmetry Feature Point Extraction algorithm (SFP) presented in [1]. With this algorithm the symmetry feature points are obtained directly from the cyclic image acquired by the camera. All experiments are realised with the resources presented in [1] and a hardware development board with a Xilinx Spartan2E FPGA circuit.
1. Introduction

The development of a robust fuzzy expert system supposes a robust feature extraction phase. This task is realised using an enhanced algorithm based on the detection of the symmetry points in a digital image, as presented in [1]. All the experiments shown in this paper use the resources presented in [4]. For the hardware implementation, a board with a Xilinx Spartan2E FPGA circuit was used.
2. Improved Symmetry Feature Point Extraction Algorithm

Since the symmetry feature point is a global feature, each object can be reduced to only one point. This represents a great advantage in applications like range estimation, because the object classification becomes, in fact, a point classification according to its properties (position, colour, symmetry values). The symmetry algorithm presented in detail in [1] implies two major steps: horizontal symmetry computation and vertical symmetry computation. Some improvements can be realised, and these improvements are:
(1) For the horizontal symmetry computation the equation can be rewritten in terms of a partial sum s_k (Eq. (3)), and s_k can take a form expressed through the pixel differences (Eq. (4)). Having in view the equations Eq. (3) and Eq. (4), the following recurrence equations can be established:

s_1(k + 1) = c^2 (p_{i-k} - p_{i+k})^2 + s_1(k)   (5)

s_j(k + 1) = c^2 (p_{i-k} - p_{i+k})^2 + s_j(k)   (6)
(2) For the vertical symmetry computation the recurrence equations are established in a similar manner.

From the experiments it was observed that for the SFP algorithm the processor time necessary to compute all symmetry feature points for one figure is 50,480,000. In the case of the improved SFP algorithm (using Eq. (5) and Eq. (6)) the processor time is dramatically reduced to 330,000. The software program ran on a computer with a Pentium IV processor and 512 MB of RAM. In Figure 1(b), an example of a cyclic image acquired by the camera is presented. Figure 1(a) introduces the notations needed for understanding how the symmetry points are obtained.

Figure 1. (a) Useful notations; (b) a cyclic test image.

The improved SFP algorithm is:
(1) Start to compute the vertical symmetry for all pixels which belong to the radius VS that ends in the point P0.
(2) Compute the local maximum for all pixels and mark all the found pixels white.
(3) Rotate the VS radius by an angle α, in the trigonometric sense.
(4) The new pixel coordinates are obtained with the following equations:

x_i = x_{i,ant} cos(α) + y_{i,ant} sin(α)   (7)

y_i = y_{i,ant} cos(α) - x_{i,ant} sin(α)   (8)

(5) Repeat the above steps until VS becomes S1. It is already known that if we can determine one sector of a circle, we can generate the whole circle; this fact is very helpful because it will be necessary to compute the pixel coordinates only for S1.
(6) Establish a circle with the radius OS.
(7) Test if all P0 pixels, which belong to the circle established in step 6, are vertical symmetry points or if they have vertical symmetry neighbours.
(8) If the test fails, mark all these pixels black. Decrease the radius and repeat steps 7 and 8.
Every computed symmetry point, p, has an 8-dimensional feature vector like in [1].
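A compact software model of the two computational kernels above may help to see why the improvement pays off. The following Python sketch (the function names and the row-based interface are ours, not from the original text) implements the recurrence of Eq. (5)/(6) and the rotation of Eq. (7)/(8).

```python
import numpy as np

def symmetry_sums(p, i, K, c=1.0):
    """Accumulate the symmetry measure around pixel index i of row p with
    the recurrence of Eq. (5)/(6): each step reuses the previous sum and
    adds a single squared pixel difference, instead of recomputing the
    whole sum for every k as the original SFP algorithm does."""
    s = np.zeros(K + 1)
    for k in range(1, K + 1):
        s[k] = c**2 * (p[i - k] - p[i + k])**2 + s[k - 1]
    return s

def rotate(x_ant, y_ant, alpha):
    # Eq. (7)/(8): rotate the previous pixel coordinates by the angle alpha
    x = x_ant * np.cos(alpha) + y_ant * np.sin(alpha)
    y = y_ant * np.cos(alpha) - x_ant * np.sin(alpha)
    return x, y
```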
3. Hardware Implementation

The hardware implementation is realised using a DIGILAB2E board which has an FPGA circuit (Spartan2E PQ208) and a clock frequency fixed at 50 MHz. Having in view the huge data computation, some memory modules need to be added to the initial configuration of the board. The data that come from the Nikon camera require implementing the usual arithmetic operations (addition, difference, multiplication and division) in floating point, like in [2] and [3]. The improved SFP architecture was made using parallel and pipeline processing techniques. The total computation time for the whole improved SFP algorithm in the worst case is 3,688,257 clock cycles. Having in view that the board clock has a frequency of 50 MHz, the obtained time for running the improved SFP algorithm is 73,765,140 ns, which means approximately 74 ms. In the best case, when the pipeline is full, the computational time is 2072 ns. So we can conclude that this implementation works in real time. In order to improve the total computation time it is necessary to increase the clock frequency and to choose memory modules with a very short access time. The improved SFP algorithm presented in this paper includes all the improvements that can be made to the classical SFP algorithm.
References

1. D. Popescu, J. Zhang and K. Huebner, Real-time intelligent vision sensor for robot navigation using symmetry features, Computational Intelligent Systems for Applied Research, Proceedings of the 5th International FLINS Conference, 421 (2002).
2. M. M. Mano, Computer System Architecture, Prentice-Hall, 303 (2000).
3. D. Popescu, Pipeline Implementation of the Floating Point Operations using FPGA circuits, Technical Report No. 2104-03, University POLITEHNICA of Bucharest, (2003).
4. D. Popescu and J. Zhang, Fuzzy Expert System based on symmetry features for range estimations, 6th International Conference on Climbing and Walking Robots, 1007 (2003).
DESIGN OF A FUZZY MODEL-BASED CONTROLLER FOR A DRUM BOILER-TURBINE SYSTEM

AHCENE HABBI, MIMOUN ZELMAT
Laboratoire d'Automatique Appliquée, F.H.C., Université de Boumerdès, 35000 Boumerdès, Algeria
Phone/Fax: +213-24-816905
E-mail: [email protected]
This paper addresses the design of a fuzzy control system for a drum boiler-turbine-generator unit. In the design procedure, a dynamic fuzzy augmented system is suggested for the nonlinear boiler-turbine plant to deal with its non-minimum phase behavior. The fuzzy control system is synthesized from a local concept viewpoint on the basis of optimal control theory. The good performance of the designed fuzzy control system is shown by simulations in various conditions.
1. Introduction

Due to the deregulation of the energy market, power companies are under ever increasing pressure to improve the efficiency of their industrial equipment. For instance, a combined cycle power plant used for cogeneration of electric and thermal energy may need to provide a large amount of steam on demand while at the same time maintaining balance in power generation. The drum boiler-turbine-generator (BTG) unit is a critical part of the power plant. It is difficult to change production configurations in a BTG unit because of the resulting major disturbances in energy and material balance. When the process loses this balance, it becomes much more difficult to control due to the changes in process dynamics. Because of the complicated dynamics of the boiler-turbine system, many modeling efforts have been made. Models that are suitable for control design have been investigated in many papers [1-4]. In our recently published paper [3], we developed a dynamic fuzzy model for a 160 MW drum boiler-turbine system. We demonstrated that the proposed fuzzy model captures well the key dynamical properties of the physical plant over a wide operating range, and is suitable for model-based control design. The control of the boiler-turbine system is still of substantial interest. One of the key difficulties is the control of the water level in the drum boiler. Water level dynamics are non-minimum phase, because of the shrink-and-swell effect, and vary very much with the load. The boiler-turbine control system, which controls the electrical output, the drum steam pressure and the drum water level, is necessary for stable load following, the safety of the power plant and fuel saving. Without good control of these variables, flexible production of energy to meet demands will be difficult.
Figure 1. Schematic of the drum boiler-turbine system.
On the issue of singularly perturbed model based control, Patre et al. developed, in [8], a periodic output feedback controller for a steam power plant. However, the proposed controller does not have much practical implication, since the design procedure is achieved on the basis of simplified dynamical equations of the plant. In [5], Kwon et al. investigated the use of robust control theory for designing a multivariable LQG controller for the boiler-turbine system. The basic limitation is that for a nonlinear plant like the boiler-turbine system the control scheme is very complex, since gain-scheduling techniques are used. The goal of this work is to propose a scheme for designing a fuzzy model based controller for the nonlinear boiler-turbine system. Figure 1 shows the principal components of the steam power plant. First, a fuzzy augmented system is suggested to deal with the non-minimum phase behavior of the plant and to meet a desired loop shape. Then, a quadratic optimization problem is solved for each local fuzzy augmented system and a global fuzzy controller is deduced for the global fuzzy system. Finally, a fuzzy estimator is built upon classical estimation theory using a local concept approach.
2. Drum Boiler-turbine System Representation
2.1. The Boiler-turbine Dynamic Fuzzy Model
The dynamic fuzzy model of the nonlinear plant is represented by a set of IF-THEN logical rules as follows:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
    δẋ(t) = A_i δx(t) + B_i δu(t)
    δy(t) = C_i δx(t) + D_i δu(t),   i = 1, ..., r.   (1)

where F^i, M^i are fuzzy term sets of the ith plant rule associated with the variation of the drum steam pressure δx_1(t) and the variation of the density of fluid in the
system δx_2(t), respectively. These state variables are defined as fuzzy variables in the fuzzy system. The local state-space parameters (A_i, B_i, C_i, D_i) in the consequent part describe the local dynamics of the physical plant at a specified operating point (x_i, u_i).

2.2. The Fuzzy Augmented System
The redistribution of steam and water in the system causes the shrink-and-swell effect, which causes the non-minimum phase behavior of the water level dynamics. In order to deal with these inherent properties, we suggest introducing dynamics augmentation for each fuzzy subsystem. It plays the role of making the singular values of the augmented subsystem as close as possible in a specified frequency range. The fuzzy augmented system can be expressed by the following IF-THEN rules:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
    ẋ_a(t) = A_ai x_a(t) + B_ai u_a(t)
    δy(t) = C_ai x_a(t),   i = 1, ..., r.   (2)

where x_a = [δx; δu], A_ai = [A_i B_i; 0 0], B_ai = [0; I], C_ai = [C_i D_i], and u_a satisfies the following equation:

δu̇(t) = H u_a(t).   (3)

where H is obtained by using the pseudo-diagonalization method and a column scaling diagonal matrix [9]. The introduction of this constant matrix into the state equations plays the role of achieving weak interactions between the system variables and forcing the local dynamics to meet the desired specifications in the control design procedure.
3. Drum Boiler-turbine Fuzzy Control System Design

The boiler-turbine control system is required to have good command tracking, disturbance rejection and robustness to parameter variations. In order to achieve these objectives, in this paper we use the LQG method with the concept of multivariable loop shaping. The design procedure is achieved from a local viewpoint, i.e. each fuzzy subsystem is forced to behave according to the desired specifications in the designed local control loop. Since the local fuzzy system is linear, its quadratic optimization problem is the same as the general
linear quadratic issue [10]. Therefore, solving the optimal control problem for the fuzzy augmented system of Eq. (2) gives the following fuzzy control law:

u_a(t) = - Σ_{i=1}^{r} μ_i R^{-1} B_ai^T P_i x_a(t),   i = 1, ..., r.   (4)

where μ_i denotes the normalized firing strength of the ith rule, and P_i is the symmetric positive definite solution to the control algebraic Riccati equation:

A_ai^T P_i + P_i A_ai + Q - P_i B_ai R^{-1} B_ai^T P_i = 0,   i = 1, ..., r.   (5)
where Q and R are weighting matrices chosen via several simulations to satisfy the design specifications with limitations on the control inputs. The resulting optimal feedback fuzzy system with augmented dynamics is described by:

ẋ_a(t) = Σ_{i=1}^{r} Σ_{j=1}^{r} μ_i μ_j [A_ai - B_ai R^{-1} B_aj^T P_j] x_a(t).   (6)
In practice, it is necessary to design a fuzzy estimator in order to implement the fuzzy controller given by Eq. (4). The idea is that for each local dynamics, an estimator gain is determined on the basis of the loop transfer recovery (LTR) method. The resulting global estimator is a 'fuzzy blending' of the local estimators and can be expressed by a set of IF-THEN logical rules as follows:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
    x̂̇_a(t) = A_ai x̂_a(t) + B_ai u_a(t) + G_i [δy(t) - δŷ(t)]
    δŷ(t) = C_ai x̂_a(t),   i = 1, ..., r.   (7)
where G_i (i = 1, ..., r) are estimation error matrices. For the determination of the estimation gains, covariance matrices Q_f and R_f are chosen so that the loop transfer recovery can be approximately achieved for each local dynamics. For the obtained covariance matrices, the following estimator algebraic Riccati equation is solved for the positive definite matrix Γ_i:

Γ_i A_ai^T + A_ai Γ_i + Q_f - Γ_i C_ai^T R_f^{-1} C_ai Γ_i = 0,   i = 1, ..., r.   (8)

and the estimator gains are given by:

G_i = Γ_i C_ai^T R_f^{-1},   i = 1, ..., r.   (9)
As can be noticed, the fuzzy controller and the fuzzy estimator are independently designed according to the Separation Property [6].
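For each local model, Eqs. (5), (8) and (9) are standard algebraic Riccati problems, so the local gains can be computed with off-the-shelf solvers. The following Python sketch (the function names and the data layout are ours; it assumes each local model is supplied as a triple (A_ai, B_ai, C_ai)) shows one way to obtain the local feedback and estimator gains and to blend them as in Eq. (4).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def local_lqg_gains(local_models, Q, R, Qf, Rf):
    """Solve the control ARE of Eq. (5) and the filter ARE of Eq. (8) for
    every local augmented model, returning (K_i, G_i) pairs."""
    gains = []
    for A, B, C in local_models:
        P = solve_continuous_are(A, B, Q, R)            # Eq. (5)
        K = np.linalg.solve(R, B.T @ P)                 # gain used in Eq. (4)
        Gamma = solve_continuous_are(A.T, C.T, Qf, Rf)  # Eq. (8), dual form
        G = Gamma @ C.T @ np.linalg.inv(Rf)             # Eq. (9)
        gains.append((K, G))
    return gains

def fuzzy_control(gains, mu, x_hat):
    # Eq. (4): u_a = -sum_i mu_i K_i x_hat, mu = normalized firing strengths
    return -sum(m * K for m, (K, _) in zip(mu, gains)) @ x_hat
```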
Figure 2. Disturbance rejection with the designed fuzzy control system: (a) input disturbance rejection; (b) output disturbance rejection.
4. Simulation Results
To validate the design objectives, simulations are performed under various situations. In the simulations, we consider the rejection of disturbances at the input and output variables of the boiler-turbine system, and the tracking property of the fuzzy control system. Figure 2-a shows the responses of the boiler-turbine output variables to a step increase in fuel flow rate. The effect of this input disturbance is small and vanishes for t > 50 s. In addition, the output disturbance is well attenuated and its effect becomes insignificant after 100 s, as shown in Figure 2-b. In all cases, the electrical output reaches its nominal state most rapidly and the drum water level most slowly. The ability of the designed fuzzy control system to track reference commands is well achieved, as depicted in Figure 3. The boiler-turbine output variables follow the reference input well and the steady state error becomes zero in all cases.
Figure 3. Tracking commands with the designed fuzzy control system.

5. Conclusion

In this paper, a fuzzy control system with a fuzzy controller and a fuzzy estimator is designed for a fossil-fuelled nonlinear boiler-turbine system. The design procedure is achieved from a local viewpoint using a nonlinear dynamic fuzzy model. A fuzzy augmented system is suggested to deal with the non-minimum phase behavior of the nonlinear plant. Using the local concept approach, a nonlinear fuzzy control law is derived to control the multivariable boiler-turbine system and a nonlinear fuzzy estimator is designed to implement the fuzzy controller. Simulation results show that the synthesis methodology, which independently designs the fuzzy controller and the fuzzy estimator with the separation property, is very effective for the control of the strongly interacting variables of the nonlinear boiler-turbine system.

References
1. Åström K. J. and Bell R. D. Drum-boiler dynamics. Automatica, 36, (2000).
2. Bell R. D. and Åström K. J. Dynamic models for boiler-turbine-alternator units. Report of Lund Institute of Technology, (1987).
3. Habbi A. and Zelmat M. A dynamic fuzzy model for a drum boiler-turbine system. Automatica, 39(7), (2003).
4. Flynn M. E. and O'Malley M. J. A drum boiler model for long term power system dynamic simulation. IEEE Trans. Power Syst., 14(1), (1999).
5. Kwon W. H., Kim S. W. and Park P. G. On the multivariable robust control of a boiler-turbine system. Proc. IFAC Pow. Syst. Pow. Plant Cont., (1989).
6. Ma X. J., Sun Z. Q. and He Y. Y. Analysis and design of fuzzy controller and fuzzy observer. IEEE Trans. Fuzz. Syst., 6(1), (1998).
7. Mayne D. Q., Rawlings J. B., Rao C. V. and Scokaert P. O. M. Constrained model predictive control. Automatica, 36(6), (2000).
8. Patre B. M., Bandyopadhyay B. and Werner H. Periodic output feedback control for singularly perturbed discrete model of steam power system. IEE Proc. Cont. Theo. Applic., 146(3), (1999).
9. Rosenbrock H. Computer-aided control system design. Academic Press Inc., (1974).
10. Wu S. J. and Lin C. T. Optimal fuzzy controller design: Local concept approach. IEEE Trans. Fuzz. Syst., 8(2), (2000).
UNIVERSAL TAKAGI-SUGENO FUZZY CONTROLLER CORE IMPLEMENTED IN A PLD DEVICE
D. OSELI, M. MRAZ, N. ZIMIC
Laboratory for computer structures and systems, Faculty of Computer and Information Science, University of Ljubljana, Trzaska 25, SI-1000 Ljubljana, SLOVENIA
E-mail: [email protected]

Fuzzy logic controllers are nowadays mostly implemented as software code and run on conventional microprocessors. If there is a need for high speed processing, the controller must be implemented in hardware. One of the solutions is implementing a fuzzy logic controller in a programmable logic device. Taking into consideration some initial limitations, a universal fuzzy controller core can be constructed. Such a controller can be quickly adapted to various system transfer functions, even while the controller is operating. This paper outlines some important design issues that we came across while constructing such fuzzy controller cores.
1. Introduction

In the last decade an enormous theoretical as well as practical progress has been made in the field of fuzzy logic. One of the highlights is the fuzzy controller [1], [6], [10], [11]. Such controllers are very robust, capable of approximating most system transfer functions [7], [8], [9], and are simple to develop. The most common method of implementation of the fuzzy controller is a software coded controller running on a general purpose microprocessor. Implementation of fuzzy controllers in target systems where very short controller response times are required (less than 1 μs) is more complicated. In such cases the use of a general purpose microprocessor which executes the fuzzy controller code does not solve the problem, as a large amount of code needs to be processed every controller cycle. Due to the outlined problems, special hardware solutions for fuzzy processing are required. Methods of fuzzy processing can be categorized as follows:
(1) General purpose microprocessors [1],
(2) Microprocessors with additional fuzzy instructions [2],
(3) Fuzzy co-processors [3],
(4) Programmable logic implementations [4], [5], [13],
(5) ROM based implementations.

For real-time and short response time systems or applications, only hardware solutions such as programmable logic devices or ROM based implementations can work satisfactorily. This paper outlines the hardware implementation of the fuzzy logic controller - implementation in a programmable logic device. The main characteristic of such an implementation is highly parallel processing of information in the programmable logic device. This is the opposite of conventional micro- or co-processor based implementations, where code is processed sequentially in many system clock cycles. The review of practical implementations and published papers researching this area shows that the most common approach to the hardware implementation of the fuzzy controller is the use of pre-calculated data. Computationally demanding processing (e.g. the degree of membership of an input variable) of the fuzzy controller is pre-calculated in advance and accessed on a look-up basis during operation of the controller. In such cases only the computationally less demanding processing is implemented in the programmable logic device as active logic, with pre-calculated data stored in memory.

2. Towards universal fuzzy controller core

The architecture of a hardware implementation of a universal fuzzy controller core must be designed in a way that allows very fast processing of various controller configurations. The drawback of the universality is that the footprint of such a controller is larger: more logic gates are required and also more system clock cycles are needed for a single controller cycle. The benefit is that the fuzzy controller is capable of handling a variety of different configurations. To implement a universal fuzzy controller that could be configured for any type of application, thus operating with any translating function, one would need a nearly infinitely large PLD device. Thus some initial limitations are required. The first one is the type of the fuzzy controller. We have chosen a Takagi-Sugeno [6]. Its main advantage over the Mamdani type is the defuzzification method. The weighted average equation (Eq. (1)) can be efficiently implemented in hardware, while Mamdani methods like COG are far more complex. Another advantage is the ability to optimize the translating function of the fuzzy controller using the ANFIS tool [12]. ANFIS is available in the Fuzzy Logic package in the Matlab tool. Takagi-Sugeno fuzzy controllers are also proven to approximate any system transfer function, as
presented in [7], [8] and [9].
Eq. (1) presents a general form of the weighted average equation, where all n fuzzy rules are evaluated:

y = (Σ_{i=1}^{n} w_i · y_i) / (Σ_{i=1}^{n} w_i),   w_i = μ_{i1}(x_1) ∧ ... ∧ μ_{im}(x_m)   (1)

As presented, the weights for the output linear functions are minimums of degrees of membership; the wedge operator denotes a min function. Refer to [6] for a detailed explanation of Eq. (1). Further, we have chosen three trapezoidal membership functions per input and decided for two-input, single-output fuzzy controllers. Trapezoidal membership functions are best suited for the hardware implementation due to their linearity. Such a two-input fuzzy controller with three membership functions per input can have up to nine fuzzy rules if using the full set of rules, where each combination of active membership functions defines a new rule. The controller complexity was further simplified using fixed positions of the membership functions. In our prototype we have decided to implement equally spaced membership functions with overlapping of only neighbouring membership functions. In such cases, the sum of degrees of membership of an input variable to the input membership functions is always 1 (see Figure 1).
Figure 1. Fuzzification and defuzzification example with three membership functions, three rules and three output linear functions.
Figure 1 presents a fuzzification and defuzzification example of a single-input, single-output fuzzy controller. The input value 89 is first fuzzified; as a result we get three degrees of membership (one is 0, as the membership function Large is not activated). Further, we have three examples of linear output equations y_1, y_2 and y_3. Combining the input value and the degrees of membership, the controller output value is calculated.
3. Using ANFIS for controller development

There are two possible ways to construct a fuzzy controller. The first is manual and the second is automatic construction of an optimized controller using the ANFIS tool [12]. Manual construction of the fuzzy controller is suitable for systems where the translating function is not known precisely. Controllers for such systems are constructed on the basis of the desired behaviour of the controller and the controller developer's knowledge. For systems where the translating function is known in precise or at least approximate form, the ANFIS tool can do all the hard work for the developer. In our case, the translating functions of the tested controllers were known, so the controllers were optimized using the ANFIS tool. The optimization procedure is the following:

(1) Create an initial empty FIS (Fuzzy Inference System - the Matlab variable that completely defines the fuzzy controller) structure,
(2) Set the desired number of inputs, outputs, number, type and position of membership functions, and create the full set of fuzzy rules,
(3) Create a file with pairs of input values and desired controller output values, using as many as possible,
(4) Use the ANFIS tool on the FIS structure and set the desired maximum error. Run until the error is low enough.
The result of such optimization is a FIS structure with the parameters of the output linear equations that define the behaviour of the controller. For a two-input controller with three membership functions per input and a full set of fuzzy rules, there are nine output linear equations of the form of Eq. (2).
y_i = p_0^i + p_1^i * x_1 + p_2^i * x_2   (2)
Each equation of the form Eq. (2) has three variable parameters p_0^i, p_1^i and p_2^i. For the described controller there is a total of 27 parameters. These are the parameters that completely define the behaviour of the controller. The programming of the fuzzy controller core thus consists only of transferring these parameters to the controller. The controller is then ready to operate with the desired transfer function.
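To make the data path concrete, the following Python fragment is a software reference of the controller just described: two inputs, three equally spaced trapezoidal membership functions per input, nine rules with min conjunction, and the weighted average of Eq. (1) over the linear consequents of Eq. (2). The membership-function breakpoints are placeholders of our own choosing; in the real core they are fixed by design, and the 27 parameters p come from ANFIS.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    # Trapezoidal membership function with corner points a <= b <= c <= d
    return float(np.clip(min((x - a) / (b - a), 1.0, (d - x) / (d - c)), 0.0, 1.0))

# Hypothetical equally spaced membership functions on the domain [0, 255]
MFS = [(-128, -64, 32, 96), (32, 96, 160, 224), (160, 224, 288, 352)]

def ts_controller(x1, x2, p):
    """p: list of 9 triples (p0, p1, p2), one per rule, as produced by ANFIS."""
    num = den = 0.0
    for i, m1 in enumerate(MFS):
        for j, m2 in enumerate(MFS):
            w = min(trapmf(x1, *m1), trapmf(x2, *m2))    # wedge = min, Eq. (1)
            p0, p1_, p2_ = p[3 * i + j]
            num += w * (p0 + p1_ * x1 + p2_ * x2)        # Eq. (2)
            den += w
    return num / den if den else 0.0
```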
4. VHDL model and results
The VHDL model of the presented fuzzy controller consists of two processes. One process is responsible for uploading the controller with the variable parameters. The other is the actual fuzzy controller process, responsible for fuzzification, inference and defuzzification. This process is constructed as a state machine with four states. So far we have managed to optimize the code so that only 11 system clock cycles are required for a single controller cycle. As mentioned, the only variable parameters of the fuzzy controller are those of the linear output equations. For the presented controller there is a total of 27 parameters. Compared to any other implementation of the fuzzy controller, this is the smallest amount of data required to completely configure the controller. Another advantage of the presented concept is the reconfigurability of the controller while it is running. It is possible to upload new parameters to the controller without stopping it. This is suitable for systems that require constant control.
Figure 2. Typical error distribution of a two input hardware based fuzzy controller.
The target programmable logic device for our prototype was a Cypress Delta 100k equivalent logic gates device. Approximately 80% of all logic gates were used. The system clock frequency was 11 MHz, so we measured 1 μs intervals between new controller output values. Due to the relatively small device size, some design restrictions additionally appeared (e.g. predefined positions of the input membership functions). The fuzzy controller core was tested with several different fuzzy controller configurations.
5. Conclusion
As presented in Figure 2, the typical error of the 8-bit architecture of the hardware based fuzzy controller is up to 3%. We must outline that this error is generated by the integer arithmetic in the PLD device. Using 12- or 16-bit arithmetic would considerably improve the precision of the controller. In our future research work, we intend to use larger programmable logic devices, where more complex fuzzy controllers could be implemented. We see the future of such fuzzy controller cores in hybrid, single chip devices, where a conventional microprocessor core and a fuzzy controller core are integrated in parallel. Some aspects of such a design have already appeared, as presented in [14].
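The word-length effect can be illustrated numerically. The sketch below is our own illustration, not the authors' test bench: it models only the rounding of a single value onto an n-bit integer grid, whereas the 3% figure reported above reflects the accumulation of such rounding errors across the whole arithmetic chain, so the per-step error shown here is correspondingly smaller.

```python
import numpy as np

def quantize(v, bits, v_max=1.0):
    # Map a value in [0, v_max] onto an unsigned n-bit grid and back
    scale = (1 << bits) - 1
    return np.round(np.clip(v, 0.0, v_max) / v_max * scale) / scale * v_max

x = np.linspace(0.0, 1.0, 10001)
for bits in (8, 12, 16):
    err = np.max(np.abs(x - quantize(x, bits)))
    print(f"{bits}-bit worst-case rounding error: {100.0 * err:.4f} %")
```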
References

1. M. Mraz, Mathematics and Computers in Simulation, Elsevier, No. 56, 259 (2001).
2. V. Salapura, IEEE Transactions on Fuzzy Systems, Vol. 8, No. 6, 781 (2000).
3. H. Eichfeld, T. Kunemund, M. Menke, IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 460 (1996).
4. M. A. Manzoul, D. Jayabharathi, IEEE Transactions on Systems, Man and Cybernetics, Vol. 25, No. 1, 213 (1995).
5. H. Surmann, A. Ungering, K. Goser, International Workshop on Field-Programmable Logic and Applications, Austria, 124 (1992).
6. T. Takagi, M. Sugeno, IEEE Transactions on Systems, Man and Cybernetics, Vol. 15, No. 1, 116 (1985).
7. J. L. Castro, IEEE Transactions on Systems, Man and Cybernetics, Vol. 25, No. 4, 629 (1995).
8. H. Ying, IEEE Transactions on Fuzzy Systems, Vol. 6, No. 4, 582 (1998).
9. H. O. Wang, J. Li, D. Niemann, K. Tanaka, Proceedings 9th IEEE Int. Conf. on Fuzzy Systems, (2000).
10. R. Babuska, H. B. Verbruggen, Control Engineering Practice, Elsevier, Vol. 4, No. 11, 1593 (1996).
11. M. J. Patyra, J. L. Grantner, K. Koster, IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 439 (1996).
12. J.-S. R. Jang, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 23, No. 3, 665 (1993).
13. E. Lago, M. A. Hinojosa, C. J. Jimenez, A. Barriga, S. S. Solano, XII Conference on Design of Circuits and Integrated Systems DCIS 97, Spain, 715 (1997).
14. D. Andrews, D. Niehaus, P. Ashenden, IEEE Computer, Vol. 37, No. 1, 118 (2004).
PARALLEL PIPELINE FLOATING-POINT FUZZY PROCESSOR
N. POPESCU
Computer Science Department, University POLITEHNICA of Bucharest, 303 Splaiul Independentei, sec. 6, Bucharest, ROMANIA
E-mail: [email protected]
J. ZHANG
Computer Science Department, University of Hamburg, 30 Vogt-Koelln Street, 22527 Hamburg, Germany
E-mail: [email protected]

This paper presents a hardware implementation for a Parallel Pipeline Floating-Point Fuzzy Processor (PPFF) which uses Gaussian membership functions in order to represent the membership functions. The processor inputs are the parameters from a Sugeno fuzzy system. These parameters are developed in an off-line phase and are supposed to be known. The processor inputs are represented in single-precision floating-point and all operations are realised in pipeline. The hardware implementation is realised using a H.O.T. II Development System.
1. Introduction

Fuzzy logic has a great advantage: any real system can be defined and modelled by it, even if the degree of uncertainty is large. All the fuzzy processors designed so far use trapezoidal membership functions, in contrast to the implementation presented in the next sections, which uses Gaussian membership functions. Additionally, in all fuzzy processors designed until now [1], [2], the number of membership functions and the number of inputs are kept low. These limitations are imposed by the fact that the number of rules in a fuzzy system is given by the number of fuzzy sets to the power of the number of input variables. If the number of fuzzy rules and the number of inputs are higher, then the resulting number of fuzzy rules which must be generated by the hardware device becomes too large.
In order to avoid this limitation, the developed PPFF processor applies the parameters from a Sugeno fuzzy system to its inputs. The parameters are determined in an off-line phase, or are taken from a hardware device which implements a Sugeno fuzzy system determination method like in [4]. These parameters are very small (1.30e-12) and, in order to work with them, a single-precision floating-point representation is absolutely necessary. Another limitation of the classical fuzzy processor consists in the fact that the inputs are considered integer. This is a limitation in robotic applications, because it is necessary to work with numbers in floating point in order to obtain good results. To eliminate this limitation, the PPFF processor can work with data represented in single-precision floating-point.
2. The Main Features of a PPFF Processor

The experiments take into consideration a Sugeno fuzzy system with 8 inputs, 21 fuzzy rules in the database and 21 membership functions for each input. The main features of the PPFF processor are summarised below:

- the number of inputs is at least 7;
- each input is given in a signed representation using 32 bits;
- each output has a signed representation using 32 bits;
- 21 Gaussian membership functions are used for each of the input variable fuzzy sets;
- 21 crisp membership functions called O_i are described for the output variable Y;
- the overlapping of the membership functions is not restricted;
- the defuzzification method is a zero-order Sugeno;
- T-norm conjunction is implemented by a minimum.
In this implementation, the processing rate is highly dependent on the fuzzy expert system. This dependency implies the use of a memory module which is large enough for future applications. Having in view that the expert system cannot work in natural environments, statistically speaking, the maximum number of fuzzy rules can be considered 100. All the possible fuzzy rules are stored in this module, called "Fuzzy Rule Memory (FRM)". Taking into account that every membership function has a Gaussian form, all the fuzzy rules are active. For this reason it is not necessary to develop a module to establish which fuzzy rule is active. The hardware implementation is realised for a H.O.T. II Development System in which the clock frequency has a maximum value of 100 MHz. This value for the clock frequency implies a clock period of 10 ns. The PPFF processor has been divided into several pipeline stages.

3. Fuzzification Process
The chosen shape for any membership function is the Gaussian one. The easiest description of the Gaussian shape is given in Eq. (1):

μ(x) = [1 + ((x - a)/b)^2]^{-1}   (1)

In this equation, it is important to specify that:
- x represents the crisp input to the PPFF processor;
- a and b represent the Gaussian parameters. These parameters are used to describe any Gaussian membership function and they are generated in the off-line phase. They are unchangeable and they are stored in a special memory module.

A number of 294 Gaussian parameters will be stored. A floating point representation is used for every Gaussian parameter.
4. Inference Process The inference process implies the determination of some variables like the degree of truth and the firing level. The intersection between the value of the crisp input and the fuzzy set of the linguistic term from the premise is denoted by p and represents the degree of matching. This degree of matching is usually called "degree of truth". The determination of the p value is an important step, this value representing the grade of membership of an input variable to a given fuzzy set. The p value is always between 0 and 1. To determine this value is enough to substitute the z variable with a crisp input of the PPFF processor in the equation 1. Of course the values for the Gaussian variables are a priori known. The process t o find the p ( z ) value must be realised for every input. After all the p values are computed, another variable called firing level and denoted with CY is computed. The value for LY is calculated taking the minimum value from all the p variables. For every fuzzy rule from the
674
fuzzy rule database, an CY value is obtained. The total number of a values is 21 and, together with the crisp values of output, they are involved in the defuzzification process. The crisp value for the output is denoted zi. 5. Defuzzification Process
The mathematical equation which defines the zero-order Sugeno defuzzification method is presented in equation Eq. (2).
The defuzzifier block (zero-order Sugeno) performs a floating point multiplication in the first step, two floating point additions in the second step and a floating point division, in the third step. Once all the rules have been processed, the data stored into the two adders of the defuzzifier go to the divider circuit to compute the crisp output value. This floating point division is computed in parallel to the pipeline stages while the system begins to process a new data set. The floating point division operations are realised in conformity with [ 31. The data flow through the PPFF processor is presented in Figure 1.
6. Performance of the PPFF Processor The performance of the PPFF processor is analysed having in view the computational time required for all the pipeline stages. All the arithmetic operations are carried out in floating point in conformity with [ 3 1. The computational time is denoted with (CT) and, considering the above specifications, the pipeline stages description and the computation for the CT are given below:
(1) In this stage the computation of the p value starts. The first operation is a floating point difference between the crisp input and the first Gaussian parameter denoted a in the equation Eq. (1). The PPFF processor has a number of seven inputs, so all these inputs start to be processed in parallel like in Figure 1. It can be concluded that in this stage CT = 40ns. (2) The second stage is dedicated to realise a floating point division between the result obtained in the first step and the second Gaussian
675
I
Figure 1. The data flow in the PPFF processor.
parameter, denoted by b. To realise this stage a computational time of 30 ns is required; the value of CT becomes 70 ns.
(3) The result obtained in the second stage is raised to the power of two. A simple way to realise this is to multiply the result from the second stage by itself. The value of CT becomes 100 ns.
(4) The fourth pipeline stage implements a sum between the constant value "1" and the result obtained in stage 3. Having in view that the floating point addition requires a computational time of 40 ns, the new value of the CT is now 140 ns.
(5) The result of stage 4 must be raised to the power of -1. The CT value is changed to 170 ns. From Figure 1 it can be observed that, now, all the μ values are computed.
(6) In this stage, the minimum value among all the computed μ values is determined. It can be observed that the circuit which realises the minimum operation is a combinatorial circuit. At the end of this pipeline stage the value α for the first fuzzy rule is calculated. This stage represents the end of the inference phase.
(7) With this stage the defuzzification phase starts. After this stage the new value of CT is 210 ns.
(8) This pipeline stage realises a parallel floating-point addition. In this stage the dividend and the divisor of Eq. (2) are computed. Having in view that these two floating point operations are made in parallel, the new value of CT is 250 ns.
(9) Finally, this last pipeline stage computes the z_0 value. This computation implies a floating point division, so that the final value of CT is 280 ns.

7. Conclusions
In conclusion, the PPFF processor proposed in this paper has the capability to work in real time. Having in view that it works with parameters from a Sugeno fuzzy system, that the data values are expressed in floating-point and that the defuzzification method implemented is zero-order Sugeno, the obtained results are very precise. The same results will be obtained if the trapezoidal membership function is used.

References

1. A. Kandel and G. Langholz, Fuzzy Hardware - Architectures and Applications, Kluwer Academic Publishers, 181 (1998).
2. A. Gabrielli, E. Gandolfi, E. Masetti and M. R. Roch, VLSI design and realisation of a 4 input high speed fuzzy processor, 6th IEEE International Conference on Fuzzy Systems, 779 (1997).
3. D. Popescu, Pipeline Implementation of the Floating Point Operations using FPGA circuits, Technical Report No. 2104-03, University POLITEHNICA of Bucharest, (2003).
4. H. Molina, O. Arellano, A. Reyes, L. M. Flores, J. A. Moreno, F. Gomez, CMOS ANFIS Neurofuzzy System Prototype, Instrumentation and Development, 5(2) (2001).
DEALING WITH DYNAMIC ASPECTS OF OPERATORS' PERFORMANCE

GUEORGUI PETKOV
Department of Thermal and Nuclear Power Engineering, Technical University of Sofia, 8 Kliment Ohridski Street, Room 2356, Sofia, 1797, Bulgaria

The paper presents the issues of dealing with dynamic aspects of operators' performance. They are evaluated on the basis of a macroscopic "second-by-second" context model of cognition and communication processes. The human action context is represented as a field of interaction between human, machine, technology and organization. They form context together, but the field dynamically influences their configuration. Some useful achievements of mathematical psychology are applied to context quantification and to the dynamic reconfiguration of cognition and communication process modeling. A simplified probabilistic approach to dynamic quantification of operators' performance is demonstrated by the Performance Evaluation of Teamwork method.
1. Introduction

The human-machine system (HMS) behavior is formed and influenced by the dynamic interactions of processes in hardware, software and liveware. The representations of the hardware and software performances are quantities that can be observed and measured. They can be explicitly determined and defined by matrices, whose eigenvectors form a Hilbert space. On the other hand, the dynamic aspects of liveware are a product of perturbations in physical and mental processes. They are not easily observed and measured, but require advanced and sophisticated analyzing tools. Fuzzy logic and intelligent tools can be applied, but their use should be preceded by appropriate theoretical and practical insights for dealing with dynamic aspects of operators' performance. The paper presents the capacity of the Performance Evaluation of Teamwork (PET) method for simplified probabilistic treatment and quantification of operators' performance.
2. Dynamics of Context
2.1. Background

The integration of human and machine has many dimensions and projections. When considering the dynamics of human performance reliability, the first dimension is time. This temporal approach, e.g. time reliability curves (TRC), is usually complemented by procedural, influential or contextual approaches to avoid "bareness in modeling". The TRC is "virtually impervious to context" [1].
The influence of situational, task, human and plant factors is indicated by performance shaping factors (PSF) [2]. The "categories of human behavior according to basically different ways of representing the constraints in the behavior of a deterministic environment or system" (skill-, rule-, and knowledge-based levels) [3] are taken into account by adjusting the input parameters of the probability distribution. This "improved" TRC version, called the human cognitive reliability (HCR) correlation, though being justifiably criticized, remains one of the most commonly employed human reliability analysis (HRA) methods. However, from the psychologists' point of view, this method is very ambiguous and with "less-than-adequate psychological realism" regarding its results. Hollnagel points out that "any description of human actions must recognize that they occur in context" and must account for "how the context influences actions" [4]. Consequently, the questions of primary importance for dynamic aspects of human performance are: "what is human action context", "how the context is represented and changed" and "how the context influences the human performance".
2.2. What Is Human Action Context?

The human action context "consists of the ideas, situation, events, or information that relate to it and make it possible to understand it fully" [5]. According to this definition, at least four general answers to that question are possible: the term "context" is used for a particular state of mind of a cognitive system, for the situation itself (the state of the universe), for a particular description of the event sequence, or for all information about situation, event, state of mind and environment. A dynamic theory of context considers it as a dynamic fuzzy set of entities that influence human cognitive behavior on a particular occasion [6]. Usually psychologists prefer to represent context as "a state of the mind" - "the term context refers to a set of internal or mental representations and operations rather than a set of environmental elements". The reason for that is the fact that the context is specific ("only very few things in the universe do influence context"), individual ("different people will be influenced by different elements of the same environment") and internally represented ("all entities in the environment which do influence human behavior are internally represented"). However, the context is not only internally represented. It is externally influenced and formed by the environment as well. No one doubts that the context elements are perceived and memorized from previous individual and generalized experience of people. A human tries to match all relevant elements by the reasoning mechanism, to take into account
their current state and to predict them in the future. These elements are associated in the human memory as a framework of the environment of specific human activity. This framework does not always consist of real objects, but of ones that are subjectively (individually and specifically) perceived and memorized as a real field of interaction with this environment. Consequently, the context could theoretically be defined as a common state of universe, mind and situation (in their relation). In other words, the context should contain all information about environment and human in this situation (objective and subjective information). A practical problem of human reliability is how to take into account the dependence between operators and environment and to quantify the human error probability (HEP). As is well known from the natural sciences, the quantitative approach to macroscopic systems is based on the calculation of the number of accessible states [7]. Practically, context may be regarded as a statistical measure of the degree of HMS state randomness, defined by the number of accessible states taking place in the systems' ensemble. According to the PET inductive approach and by analogy with Shannon's source-coding theorem, all N independent identically distributed context elements (each with information S) can be compressed into more than N×S bits with negligible loss of information, as N→∞; conversely, if they are compressed into fewer than N×S bits, there is a dramatic fall-off of information. Since it is impossible to describe the whole process in detail and all HMS accessible states, it is evident that these steps add immeasurably to knowledge of the actual context.
2.3. How The Context Is Represented And Changed?

On the one hand, a context description of a given situation has to reflect, dynamically, all specific information on the mind and environment before and after the initiating event. On the other hand, the description of the ensemble of HMS and context elements must be sufficiently general for the human, technology and organization of the specific control area. Consequently, the use of several levels of context elaboration is imposed. The PET method uses a macroscopic description as its first-level approximation. Regardless of the place, moment and agent, the performed human erroneous action (HEA) can be divided into three basic types that determine the reliability of human performance: violation (V), cognition error (CE) and execution error (EE). Based on these concepts, a "second-by-second" macroscopic quantification procedure for the contexts of individual cognition and team communication processes is constructed. Technologically recognised and associatively relevant context factors and conditions (CFC), such as goals,
transfers, safety functions, trends of parameters, scenario events and human actions are taken into account as context elements. The main principles of the dynamic description of context are the following:

- context is a common state of universe, mind and situation;
- context can be described for an isolated system, which is an unavoidable approximation;
- context consists of associatively relevant elements, which on the macroscopic level should also be technologically recognized CFC;
- context can be quantified by counting the mental or physical accessible states of the HMS and the human-human system (HHS);
- context quantification is provided not for a separate action point probability, but for the probability of any human action in a time interval after the initiating event.

The procedure includes the consecutive application of the Combinatorial Context Model (CCM) and the Violation of Objective Kerbs (VOK) method for context quantification, and the use of their results for obtaining the individual and team probability of cognitive error by the PET method.

Combinatorial context model
The CCM is based on the concept of "human performance shifts", i.e. on the assumption that the context rate in any situation is proportional to the deviation (gradient) of the operator's mental model subjective (perceived and memorized) image (φ_sn) of past and future from the objective (reasoned) one (φ_on). Context quantification consists in counting identical contexts (bit states). The probability is obtained as the ratio of the number of combinations resulting in the same context to the total number of all context combinations.

Violation of objective kerbs
Violation of objective kerbs

To trace the context image in time, it is necessary to know how the context elements (CFC) change: $|\phi_{on}(t) - \phi_{sn}(t)|$. For the cognitive process, $\phi_{on}(t) = \phi_{on}(t_0) = \phi_{on} = \mathrm{const}$, and $\phi_{sn}(t)$ changes from the minimum $\phi_{sn}(t_0)$ to the objectively expected value $\phi_{sn}(RT) = \phi_{on}$, where RT (response time) is the interval between the initiation of the mental process (cognition or communication) and the response action. The general case, when $\phi_{on}(t) \neq \mathrm{const}$ and $\phi_{sn}(t) \neq \mathrm{const}$, should also be considered. If the objective image changes from $\phi_{1on}(t)$ to $\phi_{2on}(t)$ because of any cause or reason, then this is a VOK: the cognitive process is violated and the operator is motivated to achieve another objective image. For the communication process, if $\phi_{1sn}(t) > \phi_{2sn}(t)$, then $\phi_{1sn}(t) = \mathrm{const}$ and $\phi_{2sn}(t)$ changes from the minimum $\phi_{2sn}(t_0)$ to the objectively expected value $\phi_{2sn}(CT) = \phi_{1sn}$, where CT is the communication time. This means that the objectively expected value is replaced by the subjective knowledge of the team partner, and vice versa.
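A minimal numeric sketch of this tracing idea, under stated assumptions (a linear ramp of the subjective image toward the objective one over RT, and hypothetical values throughout), shows how a VOK reopens the gap |phi_on(t) - phi_sn(t)| that cognition had been closing:

def phi_sn(t, phi_init, phi_target, rt):
    # Subjective image ramping linearly from phi_init at t=0 to the
    # objective value phi_target at t=RT (the linear ramp is assumed).
    return phi_target if t >= rt else phi_init + (phi_target - phi_init) * t / rt

RT = 5.0
phi_on_before, phi_on_after = 1.0, 0.4   # objective image before/after a VOK at t=3

for t in range(7):
    phi_on = phi_on_before if t < 3 else phi_on_after  # VOK switches the objective image
    gap = abs(phi_on - phi_sn(t, 0.0, phi_on_before, RT))
    print(f"t={t}: |phi_on(t) - phi_sn(t)| = {gap:.2f}")

Before the VOK the gap shrinks as phi_sn(t) approaches the objective image; after the switch the operator is still converging on the old image, so the gap grows again.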
Human erroneous actions definitions

On the basis of Reason's [8] qualitative definitions, the CCM and VOK quantitative definitions of HEA are formulated:
- HEA (HEA = CE ∪ EE) are 'all those occasions in which a planned sequence of mental or physical activities fails to achieve its intended outcome'. A PET cognitive error is probable when $\phi_{sn}(t) \neq \phi_{on}(t)$, $n = 1 \ldots N$, i.e. when the difference between the objective and subjective images of the human action context is not zero, where zero context means $|\phi_{on}(t) - \phi_{sn}(t)| \to 0/\min$.
- Violation: 'aberrant action' (literally 'straying from the path'...); a PET violation occurs when the objective image of factor n is changed from $\phi_{1on}(t)$ to $\phi_{2on}(t)$ for any reason, where n is the number of the CFC.

Context quantification formulae

The procedure of the CCM and VOK in the PET method uses a simple context quantification formula that is the ratio of the system's accessible states at the current moment t to those at the initial moment $t_0$ (the beginning of the cognition or communication process):
where the indices are: $n = 0, 1, 2, \ldots, N$ (N is the number of CFC), and $i$ is the number of transitions between equilibrium states.
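The published formula is not reconstructed here; the sketch below implements only the simplest reading of the sentence above, the ratio of accessible-state counts at t and t0, with hypothetical counts and without the full indexing over n and i.

def context_ratio(states_t, states_t0):
    # Ratio of the number of accessible HMS states at the current moment t
    # to the number at the initial moment t0 (simplest reading only; the
    # PET formula also carries indices n over CFC and i over transitions).
    return states_t / states_t0

# e.g. cognition narrows 48 initially accessible states down to 12:
print(context_ratio(12, 48))   # 0.25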
3. How The Context Influences Human Performance?

The most trivial approach to describing the decision-making process is to use a homeomorphic graph architecture, where sub-processes are presented as nodes. The empirical data obtained by mathematical psychology show that the time required for the accomplishment of mental processes with selective influence is negligibly short compared to the time in the arcs representing non-selective influence. That is why the PET approach represents the HMS and HHS as systems based entirely on holographic behavior, without any communication among separate sub-processes (e.g. observation, identification, etc.). Therefore, the sub-processes of the cognition/decision-making process in the model are considered to
be independent and quasi-constant values with a stochastic nature (presented as nodes). The context impact on mental processes can rather be compared to the "conductor in an electromagnetic field" phenomenon (a "decision sequence" in a "context" field): this field generates decisions instead of electricity. The communication between mental processes is described by "modulating context control links". This means that these contextual links measure the probability (degree) of connectivity between the processes and accomplish the control of the whole mental process (holographic-like behavior). By this means, we first take into account the non-selective influence and then analyze the process in depth, considering the selective influence. The process of decision-making is regarded as consisting of quasi-processes. The following modeling framework for the decision-making phase of a human action, where the appropriate contextual links are applied, can be outlined (a minimal sketch of this node/link framework follows the list):
1. Configuration (graph architecture and its organization in time) of the individual cognitive process; the PET method uses the architecture of a stepladder model as the reliability model of individual cognition.
2. Configuration of the group communication process (communication is a superstructure of individual cognition); a simple geometrical model is used as the reliability model by the PET method.
3. Configuration of the leadership process should be applied as well (it is not available in the PET method yet).
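A small Python sketch of this node/link framework, under stated assumptions: the stepladder sub-processes are independent stochastic nodes, the modulating context control links carry the probability of connectivity between consecutive nodes, and with independence the whole mental process stays connected with the product of the link probabilities; all node names and values are hypothetical.

# Stepladder sub-processes as nodes (names assumed for illustration).
nodes = ["observation", "identification", "interpretation", "planning", "execution"]

# Modulating context control links: probability (degree) of connectivity
# between consecutive sub-processes (values assumed for illustration).
link_probability = {
    ("observation", "identification"): 0.95,
    ("identification", "interpretation"): 0.90,
    ("interpretation", "planning"): 0.85,
    ("planning", "execution"): 0.92,
}

# Assuming independent nodes, the whole process remains connected with the
# product of the contextual link probabilities.
p = 1.0
for pair in zip(nodes, nodes[1:]):
    p *= link_probability[pair]
print(f"P(process remains connected) = {p:.3f}")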
4. Issues
The assessment of operators' performance should be based on a dynamic context description and on dynamic reliability models of the human-machine and human-human systems. The practicability and efficiency of PET were illustrated by retrospective HRA of well-known accidents and by a quantitative assessment of operators' performance on the basis of thermo-hydraulic calculations and full-scope simulator data [9].
References
1. Ed Dougherty, Reliability Engineering and System Safety 41, 27 (1993).
2. A. D. Swain and H. E. Guttman, NUREG/CR-1278, chapter 3 (1983).
3. J. Rasmussen, IEEE Transactions on Systems, Man, and Cybernetics SMC-13 (3), 258 (1983).
4. E. Hollnagel, CREAM, Elsevier Science Ltd. (1998).
5. J. Sinclair et al., Dictionary, William Collins Sons & Co Ltd., 305 (1998).
6. B. Kokinov, IJCAI-95 Workshop on Modeling Context in Knowledge Representation and Reasoning, LAFORIA 95/11 (1995).
7. G. Petkov, P. Antao and C. Guedes Soares, ESREL'2001, 3, 1851 (2001).
8. J. Reason, Reliability Engineering and System Safety 46, 297 (1994).
9. G. Petkov, V. Todorov, T. Takov, K. Stoychev, V. Vladimirov, I. Chukov and V. Petrov, JRC ESReDA 25th Seminar, 29211027 (2003).