p ϕ, CΓ=p ϕ, etc. Thus, using Kap ϕ, EΓp ϕ and CΓp ϕ, we can express a variety of probabilistic knowledge properties.

2.2 Semantics of RATPK
We now describe the semantics of RATPK.
1098
Z. Cao
Definition 5. A model S of RATPK is a concurrent game structure [2] S = (Σ, Q, Π, π, e, d, δ, ∼a, Pa, where a ∈ Σ), in which:

(1) Σ is a finite set of agents; in the following, without loss of generality, we usually assume Σ = {1, ..., k}.
(2) Q is a finite, nonempty set whose elements are called possible worlds or states.
(3) Π is a finite set of propositions.
(4) π is a map Q → 2Π, where Π is the set of atomic formulas, assigning to each state the atomic formulas true at it.
(5) e is an environment V → 2Q, where V is a set of proposition variables.
(6) For each player a ∈ Σ = {1, ..., k} and each state q ∈ Q, d gives a natural number da(q) ≥ 1 of moves available at state q to player a. We identify the moves of player a at state q with the numbers 1, ..., da(q). For each state q ∈ Q, a move vector at q is a tuple (j1, ..., jk) such that 1 ≤ ja ≤ da(q) for each player a. Given a state q ∈ Q, we write D(q) for the set {1, ..., d1(q)} × ... × {1, ..., dk(q)} of move vectors. The function D is called the move function.
(7) For each state q ∈ Q and each move vector (j1, ..., jk) ∈ D(q), δ(q, j1, ..., jk) is the state that results from state q if every player a ∈ Σ = {1, ..., k} chooses move ja. The function δ is called the transition function.
(8) ∼a is an accessibility relation on Q, which is an equivalence relation.
(9) Pa is a probability function Q × ℘(Q) → [0, 1] such that, for every a and every state s, Pa(s, {s′ | la(s) = la(s′)}) = 1.

The definition of a computation of a concurrent game structure is similar to the case of Kripke structures. In order to give the semantics of RATPK, we need to define strategies of a concurrent game structure.

Strategies and their outcomes. Intuitively, a strategy is an abstract model of an agent's decision-making process; a strategy may be thought of as a kind of plan for an agent. By following a strategy, an agent can bring about certain states of affairs.
Formally, a strategy fa for an agent a ∈ Σ is a total function fa that maps every nonempty finite state sequence λ ∈ Q+ to a natural number such that if the last state of λ is q, then fa(λ) ≤ da(q). Thus, the strategy fa determines, for every finite prefix λ of a computation, a move fa(λ) for player a. Given a set Γ ⊆ Σ of agents and an indexed set of strategies FΓ = {fa | a ∈ Γ}, one for each agent a ∈ Γ, we define out(q, FΓ) to be the set of possible outcomes that may occur if every agent a ∈ Γ follows the corresponding strategy fa, starting when the system is in state q ∈ Q. That is, the set out(q, FΓ) contains all possible q-computations that the agents in Γ can "enforce" by cooperating and following the strategies in FΓ. Note that the "grand coalition" of all agents in the system can cooperate to uniquely determine the future state of the system, so out(q, FΣ) is a singleton. Similarly, the set out(q, F∅) is the set of all possible q-computations of the system. We can now turn to the semantics of RATPK.

Definition 6. Semantics of RATPK.

[[Γ ϕ]]S = {q | there exists a set FΓ of strategies, one for each player in Γ, such that for all computations λ ∈ out(q, FΓ), we have λ[1] ∈ [[ϕ]]S}
Verifying Real-Time Temporal, Cooperation and Epistemic Properties
1099
[[Γ []ϕ]]S = {q | there exists a set FΓ of strategies, one for each player in Γ, such that for all computations λ ∈ out(q, FΓ) and all positions i ≥ 0, we have λ[i] ∈ [[ϕ]]S}

[[Γ ϕU ψ]]S = {q | there exists a set FΓ of strategies, one for each player in Γ, such that for all computations λ ∈ out(q, FΓ), there exists a position i ≥ 0 such that λ[i] ∈ [[ψ]]S and for all positions 0 ≤ j < i, we have λ[j] ∈ [[ϕ]]S}

[[Γ [][i,j] ϕ]]S = {q | there exists a set FΓ of strategies, one for each player in Γ, such that for all computations λ ∈ out(q, FΓ) and all positions i ≤ m ≤ j, we have λ[m] ∈ [[ϕ]]S}

[[Γ ϕU[i,j] ψ]]S = {q | there exists a set FΓ of strategies, one for each player in Γ, such that for all computations λ ∈ out(q, FΓ), there exists a position i ≤ m ≤ j such that λ[m] ∈ [[ψ]]S and for all positions 0 ≤ k < m, we have λ[k] ∈ [[ϕ]]S}

[[Ka ϕ]]S = {q | for all r ∈ ∼a(q), r ∈ [[ϕ]]S}, where ∼a(q) = {q′ | (q, q′) ∈ ∼a}

[[EΓ ϕ]]S = {q | for all r ∈ ∼EΓ(q), r ∈ [[ϕ]]S}, where ∼EΓ(q) = {q′ | (q, q′) ∈ ∼EΓ} and ∼EΓ = ∪a∈Γ ∼a

[[CΓ ϕ]]S = {q | for all r ∈ ∼CΓ(q), r ∈ [[ϕ]]S}, where ∼CΓ(q) = {q′ | (q, q′) ∈ ∼CΓ} and ∼CΓ denotes the transitive closure of ∼EΓ

[[Kap ϕ]]S = {q | Pa(q, ∼a(q) ∩ [[ϕ]]S) ≥ p}, where ∼a(q) = {r | (q, r) ∈ ∼a}

[[EΓp ϕ]]S = ∩a∈Γ [[Kap ϕ]]S

[[CΓp ϕ]]S = ∩k≥1 [[(FΓp)k ϕ]]S, where [[(FΓp)0 ϕ]]S = Q and [[(FΓp)k+1 ϕ]]S = [[EΓp(ϕ ∧ (FΓp)k ϕ)]]S

Intuitively, Γ ϕ means that group Γ can cooperate to ensure ϕ at the next step; Γ []ϕ means that group Γ can cooperate to ensure that ϕ always holds; Γ ϕU ψ means that group Γ can cooperate to ensure ϕ until ψ holds; Γ [][i,j] ϕ means that group Γ can cooperate to ensure that ϕ always holds within the interval [i, j]; and Γ ϕU[i,j] ψ means that group Γ can cooperate to ensure ϕ until ψ holds within the interval [i, j].
For example, a RATPK formula Γ1 ϕ ∧ Γ2 [][i,j] ψ holds at a state exactly when the coalition Γ1 has a strategy to ensure that proposition ϕ holds at the immediate successor state, and coalition Γ2 has a strategy to ensure that proposition ψ holds at the current and all future states between time i and j.
3
Model Checking for RATPK
In this section we give a model checking algorithm for RATPK. The model checking problem for RATPK asks, given a model S and a RATPK formula ϕ, for the set of states in Q that satisfy ϕ. In the following, we denote the desired set of states by Eval(ϕ).

for each ϕ′ in Sub(ϕ) do
  case ϕ′ = Γ θ:
    Eval(ϕ′) := CoPre(Γ, Eval(θ))
  case ϕ′ = Γ []θ:
    Eval(ϕ′) := Eval(true)
    ρ1 := Eval(θ)
    repeat
      Eval(ϕ′) := Eval(ϕ′) ∩ ρ1
      ρ1 := CoPre(Γ, Eval(ϕ′)) ∩ Eval(θ)
    until ρ1 = Eval(ϕ′)
  case ϕ′ = Γ θ1 U θ2:
    Eval(ϕ′) := Eval(false)
    ρ1 := Eval(θ1)
    ρ2 := Eval(θ2)
    repeat
      Eval(ϕ′) := Eval(ϕ′) ∪ ρ2
      ρ2 := CoPre(Γ, Eval(ϕ′)) ∩ ρ1
    until ρ2 ⊆ Eval(ϕ′)
  case ϕ′ = Γ [][i,j] θ:
    k := j
    Eval(ϕ′) := Eval(true)
    while k ≠ 0 do
      k := k − 1
      if k ≥ i then Eval(ϕ′) := CoPre(Γ, Eval(ϕ′)) ∩ Eval(θ)
      else Eval(ϕ′) := CoPre(Γ, Eval(ϕ′))
    end while
  case ϕ′ = Γ θ1 U[i,j] θ2:
    k := j
    Eval(ϕ′) := Eval(false)
    while k ≠ 0 do
      k := k − 1
      Eval(ϕ′) := CoPre(Γ, Eval(ϕ′) ∪ Eval(θ2)) ∩ Eval(θ1)
    end while
  case ϕ′ = Ka θ:
    Eval(ϕ′) := {q | Img(q, ∼a) ⊆ Eval(θ)}
  case ϕ′ = EΓ θ:
    Eval(ϕ′) := ∩a∈Γ Eval(Ka θ)
  case ϕ′ = CΓ θ:
    Eval(ϕ′) := Eval(true)
    repeat
      ρ := Eval(ϕ′)
      Eval(ϕ′) := ∩a∈Γ {q | Img(q, ∼a) ⊆ Eval(θ) ∩ ρ}
    until ρ = Eval(ϕ′)
  case ϕ′ = Kap θ:
    Eval(ϕ′) := {q | Pa(q, Img(q, ∼a) ∩ Eval(θ)) ≥ p}
  case ϕ′ = EΓp θ:
    Eval(ϕ′) := ∩a∈Γ Eval(Kap θ)
  case ϕ′ = CΓp θ:
    Eval(ϕ′) := Eval(true)
    repeat
      ρ := Eval(ϕ′)
      Eval(ϕ′) := ∩a∈Γ {q | Pa(q, Img(q, ∼a) ∩ Eval(θ) ∩ ρ) ≥ p}
    until ρ = Eval(ϕ′)
end case
return Eval(ϕ)
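The two unbounded fixpoint loops above can be rendered directly in Python over an explicit state space. This is a sketch with our own names; for the usage example, `copre` is a toy single-controller pre-image rather than the full coalition operator:

```python
def eval_always(states, copre, gamma, eval_theta):
    """Greatest fixpoint for the case phi' = Gamma [] theta."""
    result = set(states)                 # Eval(true)
    rho1 = set(eval_theta)
    while True:
        result &= rho1
        rho1 = copre(gamma, result) & set(eval_theta)
        if rho1 == result:               # until rho1 = Eval(phi')
            return result

def eval_until(copre, gamma, eval_theta1, eval_theta2):
    """Least fixpoint for the case phi' = Gamma theta1 U theta2."""
    result = set()                       # Eval(false)
    rho1, rho2 = set(eval_theta1), set(eval_theta2)
    while True:
        result |= rho2
        rho2 = copre(gamma, result) & rho1
        if rho2 <= result:               # nothing new can be added
            return result
```

For instance, on a three-state structure where a single agent picks the successor, `eval_always` keeps exactly the states from which θ can be maintained forever, and `eval_until` collects the states from which θ2 can be reached through θ1-states.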
The algorithm uses the following primitive operations:

(1) The function Sub, when given a formula ϕ, returns a queue of syntactic subformulas of ϕ such that if ϕ1 is a subformula of ϕ and ϕ2 is a subformula of ϕ1, then ϕ2 precedes ϕ1 in the queue Sub(ϕ).
(2) The function Reg, when given a proposition p ∈ Π, returns the set of states in Q that satisfy p.
(3) The function CoPre. When given a set Γ ⊆ Σ of players and a set ρ ⊆ Q of states, the function CoPre returns the set of states q such that from q, the players in Γ can cooperate and enforce the next state to lie in ρ. Formally, CoPre(Γ, ρ) contains state q ∈ Q if for every player a ∈ Γ there exists a move ja ∈ {1, ..., da(q)} such that for all players b ∈ Σ − Γ and all moves jb ∈ {1, ..., db(q)}, we have δ(q, j1, ..., jk) ∈ ρ.
(4) The function Img : Q × 2Q×Q → 2Q, which takes as input a state q and a binary relation R ⊆ Q × Q, and returns the set of states that are accessible from q via R. That is, Img(q, R) = {q′ | qRq′}.
(5) Union, intersection, difference, and inclusion tests for state sets.

Note also that we write Eval(true) for the set Q of all states, and Eval(false) for the empty set of states. Partial correctness of the algorithm can be proved by induction on the structure of the input formula ϕ. Termination is guaranteed since the state space Q is finite.

Proposition 1. The algorithm given above terminates and is correct, i.e., it returns the set of states in which the input formula is satisfied.

The cases where ϕ = Ka θ, ϕ = EΓ θ, ϕ = CΓ θ, ϕ = Kap θ, ϕ = EΓp θ and ϕ = CΓp θ simply involve computing the Img function at most a polynomial number of times, each computation requiring time at most O(|Q|2). Furthermore, real-time CTL model checking can be done in polynomial time. Hence the above algorithm for RATPK requires at most polynomial time.

Proposition 2. The algorithm given above runs in time at most polynomial in |Q|.
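For concreteness, the CoPre primitive of clause (3) can be sketched by direct enumeration over move vectors. This is an illustrative Python rendering with our own names (`d`, `delta`); the paper treats CoPre abstractly:

```python
from itertools import product

def copre(gamma, rho, agents, d, delta, states):
    """States q from which coalition gamma can force the next state into rho.

    d[a](q) is the number of moves of player a at q; delta(q, vector) is the
    transition function on ordered move vectors (one move per agent).
    """
    result = set()
    others = [b for b in agents if b not in gamma]
    for q in states:
        # try every joint move of the coalition...
        for gc in product(*(range(1, d[a](q) + 1) for a in gamma)):
            fixed = dict(zip(gamma, gc))
            # ...and check it against every counter-move of the other players
            ok = True
            for oc in product(*(range(1, d[b](q) + 1) for b in others)):
                move = dict(fixed)
                move.update(zip(others, oc))
                vector = tuple(move[a] for a in agents)
                if delta(q, vector) not in rho:
                    ok = False
                    break
            if ok:
                result.add(q)
                break
    return result
```

For example, with two players who each pick one of two moves and a transition that reaches state 1 exactly when the moves agree, the grand coalition can force {1} but a single player cannot.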
A well-known efficient model checking technique is symbolic model checking [22], which uses ordered binary-decision diagrams (OBDDs) to represent Kripke structures. Roughly speaking, if each state is a valuation for a set X of Boolean variables, then a state set ρ can be encoded by a Boolean expression ρ(X) over the variables in X. For Kripke structures that arise from descriptions of closed systems with Boolean state variables, the symbolic operations necessary for CTL model checking have standard implementations. In this case, a transition relation R on states can be encoded by a Boolean expression R(X, X′) over X and X′, where X′ is a copy of X that represents the values of the state variables after a transition. Then, the pre-image of ρ under R, i.e., the set of states that have R-successors in ρ, can be computed as ∃X′ (R(X, X′) ∧ ρ(X′)). Based on this observation, symbolic model checkers for CTL, such as SMV [6], typically use OBDDs to represent Boolean expressions, and implement the Boolean and pre-image operations on state sets by manipulating OBDDs.
To apply symbolic techniques to our model checking algorithm, we mainly need to give a symbolic implementation of the computation of Eval(Ka θ), Eval(EΓ θ), Eval(CΓ θ), Eval(Kap θ), Eval(EΓp θ) and Eval(CΓp θ). The computation of Eval(Ka θ) can be done using standard symbolic techniques: given an equivalence relation ∼a and a set ρ of states, suppose that ∼a(X, X′) is a Boolean expression that encodes the equivalence relation ∼a and ρ(X′) is a Boolean expression that encodes the set ρ of states; then {q | Img(q, ∼a) ⊆ ρ} can be computed as ∀X′ (∼a(X, X′) → ρ(X′)), equivalently ¬∃X′ (∼a(X, X′) ∧ ¬ρ(X′)). The computation of Eval(EΓ θ), Eval(CΓ θ), Eval(Kap θ), Eval(EΓp θ) and Eval(CΓp θ) can be done similarly.
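As a set-based stand-in for the OBDD computation (illustrative only; real symbolic checkers operate on Boolean expressions, and the names below are ours), the pre-image and the evaluation of Ka look like this:

```python
def pre_image(relation, rho):
    """Exists X'. R(X, X') and rho(X'): states with some successor in rho."""
    return {x for (x, x_next) in relation if x_next in rho}

def eval_knows(states, sim_a, rho):
    """[[Ka theta]] = {q | Img(q, ~a) subseteq rho}, computed as the
    complement of the pre-image (under ~a) of the complement of rho."""
    return states - pre_image(sim_a, states - rho)
```

The double-complement trick mirrors the logical identity above: a state fails Ka θ exactly when it has an epistemic alternative outside ρ.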
4
A Case Study
In this section we study an example of how RATPK can be used to represent and verify properties in multi-agent systems. The system we consider is a train controller (adapted from [1]). The system consists of three agents: two trains and a controller (see Figure 1). The trains, one Eastbound, the other Westbound, occupy a circular track; a train takes one hour to pass through the circular track. At one point, both tracks need to pass through a narrow tunnel. There is no room for both trains to be in the tunnel at the same time, so the trains must avoid this situation. Traffic lights, which can be either red or green, are placed on both sides of the tunnel. Both trains are equipped with a signaller, which they use to send a signal when they approach the tunnel; a train will enter the tunnel between 300 and 500 seconds after this event. The controller can receive signals from both trains, and controls the colour of the traffic lights within 50 seconds. The task of the controller is to ensure that the trains are never both in the tunnel at the same time. The trains follow the traffic light signals diligently, i.e., they stop on red. In the following, we use in tunnela to represent that agent a is in the tunnel.
Fig. 1. The local transition structures for the two trains and the controller
Firstly, consider the property that "when one train is in the tunnel, it knows that the other train is not in the tunnel":

∅[](in tunnela → Ka ¬in tunnelb) (a ≠ b ∈ {TrainE, TrainW})

We now consider the formula expressing the fact that "it is always common knowledge that the grand coalition of all agents can cooperate to get train a into the tunnel within one hour":

∅[]CΣ Σ<>[0,1hour] in tunnela (a ∈ {TrainE, TrainW}),

where we abbreviate Γ<>[i,j] ϕ as ¬Γ[][i,j] ¬ϕ. We can verify these formulas using our model checking algorithm for RATPK, and the results show that the above properties hold in the system. In some cases, the communication is unreliable, and a signal sent by a train may not be received by the train controller. Therefore the train controller may have probabilistic knowledge such as "the probability of 'TrainE has sent a signal' is no less than 0.6." Such properties can also be expressed in RATPK and verified by our model checking algorithm.
5
Conclusions
Recently, there has been growing interest in logics for representing and reasoning about temporal and epistemic properties in multi-agent systems [4,11,16,17,20,21,24]. In this paper, we presented a real-time temporal probabilistic knowledge logic, RATPK, which is a succinct and powerful language for expressing complex properties. In [14], Halpern and Moses also presented and studied some real-time knowledge modalities, such as ε-common knowledge, eventual common knowledge, and timestamped common knowledge CTG. All these modalities can be expressed in RATPK; for example, timestamped common knowledge satisfies CTG ⇔ [][T,T] CG. Moreover, we studied the approach to model checking RATPK. We also expect to apply the RATPK logic and its model checking algorithm to verify the correctness of real-time protocol systems.
References

1. R. Alur, L. de Alfaro, T. A. Henzinger, S. C. Krishnan, F. Y. C. Mang, S. Qadeer, S. K. Rajamani, and S. Tasiran. MOCHA user manual. University of California, Berkeley, 2000.
2. R. Alur and T. A. Henzinger. Alternating-time temporal logic. Journal of the ACM, 49(5): 672-713, 2002.
3. A. Arnold and D. Niwinski. Rudiments of µ-calculus. Studies in Logic, Vol. 146, North-Holland, 2001.
4. M. Bourahla and M. Benmohamed. Model checking multi-agent systems. Informatica, 29: 189-197, 2005.
5. J. Bradfield and C. Stirling. Modal logics and mu-calculi: an introduction. In Handbook of Process Algebra, Chapter 4. Elsevier Science B.V., 2001.
6. E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. The MIT Press, 1999.
7. Z. Cao and C. Shi. Probabilistic belief logic and its probabilistic Aumann semantics. J. Comput. Sci. Technol., 18(5): 571-579, 2003.
8. H. van Ditmarsch, W. van der Hoek, and B. P. Kooi. Dynamic epistemic logic with assignment. In AAMAS-05, ACM Press, New York, vol. 1, 141-148, 2005.
9. E. A. Emerson, C. S. Jutla, and A. P. Sistla. On model checking for fragments of the µ-calculus. In CAV-93, LNCS 697, 385-396, 1993.
10. N. de C. Ferreira, M. Fisher, and W. van der Hoek. Practical reasoning for uncertain agents. In Proc. JELIA-04, LNAI 3229, pp. 82-94, 2004.
11. N. de C. Ferreira, M. Fisher, and W. van der Hoek. Logical implementation of uncertain agents. In Proc. EPIA-05, LNAI 3808, pp. 536-547, 2005.
12. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. Cambridge, Massachusetts: The MIT Press, 1995.
13. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Common knowledge revisited. Annals of Pure and Applied Logic, 96: 89-105, 1999.
14. J. Y. Halpern and Y. Moses. Knowledge and common knowledge in a distributed environment. Journal of the ACM, 37(3): 549-587, 1990.
15. W. van der Hoek. Some considerations on the logic PFD: a logic combining modality and probability. Journal of Applied Non-Classical Logics, 7(3): 287-307, 1997.
16. W. van der Hoek and M. Wooldridge. Model checking knowledge and time. In Proceedings of SPIN 2002, LNCS 2318, 95-111, 2002.
17. W. van der Hoek and M. Wooldridge. Cooperation, knowledge, and time: alternating-time temporal epistemic logic and its applications. Studia Logica, 75: 125-157, 2003.
18. M. Jurdzinski. Deciding the winner in parity games is in UP ∩ co-UP. Information Processing Letters, 68: 119-134, 1998.
19. M. Jurdzinski, M. Paterson, and U. Zwick. A deterministic subexponential algorithm for solving parity games. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA 2006, January 2006.
20. M. Kacprzak, A. Lomuscio, and W. Penczek. Verification of multiagent systems via unbounded model checking. In Proceedings of the 3rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS-04), 2004.
21. B. P. Kooi. Probabilistic dynamic epistemic logic. Journal of Logic, Language and Information, 12: 381-408, 2003.
22. K. L. McMillan. Symbolic Model Checking: An Approach to the State Explosion Problem. Kluwer Academic, 1993.
23. I. Walukiewicz. Completeness of Kozen's axiomatisation of the propositional µ-calculus. Information and Computation, 157: 142-182, 2000.
24. M. Wooldridge, M. Fisher, M. Huget, and S. Parsons. Model checking multiagent systems with MABLE. In Proceedings of the First International Conference on Autonomous Agents and Multiagent Systems (AAMAS-02), 2002.
Regulating Social Exchanges Between Personality-Based Non-transparent Agents

G.P. Dimuro, A.C.R. Costa, L.V. Gonçalves, and A. Hübner

Escola de Informática, PPGINF, Universidade Católica de Pelotas, 96010-000 Pelotas, Brazil
{liz, rocha, llvarga, hubner}@ucpel.tche.br

Abstract. This paper extends the scope of the model of regulation of social exchanges based on the concept of a supervisor of social equilibrium. We allow the supervisor to interact with personality-based agents that control the supervisor's access to their internal states, behaving either as transparent agents (agents that allow full external access to their internal states) or as non-transparent agents (agents that restrict such external access). The agents may have different personality traits, which induce different attitudes towards both the regulation mechanism and the possible profits of social exchanges. Also, these personality traits influence the agents' evaluation of their current status. To be able to reason about the social exchanges among personality-based non-transparent agents, the equilibrium supervisor models the system as a Hidden Markov Model.
1
Introduction
Social control is a powerful notion for explaining the self-regulation of a society, and the various possibilities for its realization have been considered both in natural and in artificial societies [1,2]. Social rules may be enforced by authorities that have the capacity to force the agents of the society to follow such rules, or they may be internalized by the agents, so that agents follow such rules because they have been incorporated into the agents' behaviors. The centralized social exchange control mechanism presented in [3,4] is based on Piaget's theory of social exchange values [5].1 It is performed by an equilibrium supervisor, whose duty is: (i) to determine, at each time, the target equilibrium point for the system; (ii) to decide which actions it should recommend agents to perform in order to lead the system towards that equilibrium point; (iii) to maintain the system in equilibrium until another equilibrium point is required. For that, the supervisor builds on Qualitative Interval Markov Decision Processes (QI-MDP) [3,4], MDPs [11] based on Interval Mathematics [12]. In this paper, aiming to advance the development of a future model of decentralized social control, we extend the centralized model presented in [3,4] to
1
This work has been partially supported by CNPq and FAPERGS. A discussion about related work on value-based approaches was presented in [6]. Values have been used in the MAS area through value- and market-oriented decision making, and value-based social theory [7,8,9]. However, other work based on social exchange values appeared only in the application to the modeling of partner selection [10].
A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1105–1115, 2006.
© Springer-Verlag Berlin Heidelberg 2006
1106
G.P. Dimuro et al.
consider a personality-based agent society. We allow the agents to have different personality traits, which induce different attitudes towards the regulation mechanism (blind obedience, eventual obedience etc.) and towards the possible profits of social exchanges (egoism, altruism etc.). Also, these personality traits influence the agents' evaluation of their current status (realism, over- or under-evaluation). So, the agents may or may not follow the given recommendations, thus creating a probabilistic social environment from the point of view of social control. We observe that the study of personality-based multiagent interactions can be traced back at least to [13,14,15], where the advantages and possible applications of the approach were extensively discussed. In those works, personality traits were mapped into goals and practical reasoning rules (an internal point of view). Modeling personality traits from an external (the supervisor's) point of view, through state-transition matrices as we do here, seems to be new. The agents are able to control the supervisor's access to their internal states, behaving either as transparent agents (that allow full external access to their internal states) or as non-transparent agents (that restrict such external access). When the agents are transparent, the supervisor has full knowledge of their personality traits and has access to all current balances of values, and so it is able to choose, at each step, an adequate recommendation for each agent [6]. In this paper, we focus on the supervisor dealing only with non-transparent agents. The supervisor has no direct access to their balances of material exchange values, and must thus rely on observations of what the agents report to each other about their balances of virtual exchange values, considering also that the agents are influenced by their personalities in the subjective evaluation of their current status.
We also assume that, due to the non-transparency, the supervisor has no direct knowledge of the agents' personality traits. To solve the problems of determining the most probable current state of the system, recognizing agents' personalities and learning new personality traits, the supervisor uses a mechanism based on Hidden Markov Models (HMM) [16]. The paper is structured as follows. In Sect. 2, we review the modelling of social exchanges. Section 3 presents the proposed regulation mechanism. Section 4 introduces the exchanges between personality-based agents. The HMM is introduced in Sect. 5, and simulation results are presented in Sect. 6. Section 7 concludes the paper.
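For reference, the core HMM computation the supervisor relies on, finding the most probable hidden state sequence from an observation sequence, is the standard Viterbi algorithm [16]. Below is a generic sketch; the state and observation names, and all probabilities used in the example, are illustrative, not taken from the paper:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable sequence of hidden states for the observations `obs`."""
    # column for the first observation: start probability times emission
    col = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        nxt = {}
        for s in states:
            # best predecessor r for landing in s while emitting o
            prob, path = max(
                (col[r][0] * trans_p[r][s] * emit_p[s][o], col[r][1])
                for r in states
            )
            nxt[s] = (prob, path + [s])
        col = nxt
    best_prob, best_path = max(col.values())
    return best_path
```

With hidden balance classes such as {E-, E0, E+} and observed reports {D, N, C}, decoding a report sequence yields the supervisor's best guess at the sequence of material balance classes.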
2
The Modelling of Social Exchanges
The evaluation of an exchange by an agent is done on the basis of a scale of exchange values [5], which are of a qualitative nature: subjective values like those everyone uses to judge one's daily exchanges (good, bad etc.). In order to capture the qualitative nature of Piaget's concept of a scale of exchange values [5], techniques from Interval Mathematics [12] are used, representing any value as an interval X = [x1, x2], with −L ≤ x1 ≤ x ≤ x2 ≤ L, x1, x2, L ∈ R.2
2
According to Piaget [5], the subjective nature of social exchange values prevents approaching social exchanges with methods normally used in Economics, since there the affective personality traits of the social agents are often abstracted away to allow economic behaviors to be captured in their rational constitution.
Regulating Social Exchanges
1107
A social exchange between two agents, α and β, is performed involving two types of stages. In stages of type Iαβ, the agent α realizes a service for the agent β. The exchange values involved in this stage are the following: rIαβ (the value of the investment made by α for the realization of a service for β, which is always negative); sIβα (the value of β's satisfaction due to receiving the service done by α); tIβα (the value of β's debt, the debt it acquired towards α for its satisfaction with the service done by α); and vIαβ (the value of the credit that α acquires from β for having realized the service). In stages of type IIαβ, the agent α asks for payment for the service previously done for β, and the values related to this exchange have analogous meanings. rIαβ, sIβα, rIIβα and sIIαβ are called material values; tIβα, vIαβ, tIIβα and vIIαβ are the virtual values. The order in which the exchange stages may occur is not necessarily Iαβ − IIαβ. We observe that the values are undefined if either no service is done in a stage of type I, or no credit is charged in a stage of type II. Also, it is not possible for α to realize a service for β and, at the same time, to charge him a credit. A social exchange process is a sequence of stages of type Iαβ and/or IIαβ. The material results, according to the points of view of α and β, are given by the sum of the well-defined material values involved in the process, and are denoted, respectively, by mαβ and mβα. The virtual results vαβ and vβα are defined analogously. A social exchange process is said to be in material equilibrium if mαβ and mβα are around a reference value s ∈ R. Observe that, in any exchange stage, either α or β has to perform a service, so decreasing its material results.
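As a toy illustration of this bookkeeping (our simplification: plain interval sums, ignoring the undefinedness conditions above), the material results and the equilibrium test could look like:

```python
def iv_add(a, b):
    """Interval addition: [a1, a2] + [b1, b2] = [a1 + b1, a2 + b2]."""
    return (a[0] + b[0], a[1] + b[1])

def material_balance(values):
    """Sum the (defined) interval-valued material values r, s, ... of a process."""
    total = (0.0, 0.0)
    for v in values:
        total = iv_add(total, v)
    return total

def in_equilibrium(balance, s=0.0, eps=1.0):
    """The balance counts as 'around' the reference value s within eps."""
    return s - eps <= balance[0] and balance[1] <= s + eps
```

For instance, an investment interval [-2, -1] followed by a satisfaction interval [1, 3] gives the balance [-1, 2], which is in equilibrium around s = 0 for a tolerance of 2 but not for a tolerance of 1.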
3
The Social Exchange Regulation Mechanism
Figure 1 shows the architecture of our social exchange regulation mechanism, which extends the one proposed in [4] with a learning module based on HMM [16]. The equilibrium supervisor, at each time, uses an Evaluation Module to analyze the conditions and constraints imposed by the system's external and internal environments (not shown in the figure), determining the target equilibrium point. To regulate transparent agents, the supervisor uses a Balance Module (Σ) to calculate their balances of material results of the performed exchanges. To regulate non-transparent agents, the supervisor uses an observation module (Obs.) to access the virtual values (debts and credits) that they report, and the HMM module to recognize and maintain an adequate model of the personality traits of such agents, generating plausible balances of their material exchange values. Taking both the directly observed and the indirectly calculated material results, together with the current target equilibrium point, the supervisor uses the module that implements a personality-based QI-MDP to decide on recommendations of exchanges for the two agents, in order to keep the material results of exchanges in equilibrium. It also takes into account the virtual results of the exchanges in order to decide which type of exchange stage it should suggest. The states of a QI-MDP [4] are pairs (Eα,β, Eβ,α) of classes of material results (investments and satisfactions) of exchanges between agents α and β, from the
Fig. 1. The regulation mechanism for personality-based social exchanges
point of view of α and β, respectively.3 Es = {Es−, Es0, Es+} is the set of the supervisor's representations of the classes of unfavorable (Es−), equilibrated (Es0) and favorable (Es+) material results of exchanges, related to a target equilibrium point s. (E0α,β, E0β,α) is the terminal state, reached when the system is in equilibrium. The actions of the QI-MDP model are state transitions of the form (Eiα,β, Ejβ,α) → (Ei′α,β, Ej′β,α), with i, i′, j, j′ ∈ {−, 0, +}, which may be of the following types: a compensation action, which directs the agents' exchanges to the equilibrium point; a go-forward action, which directs them to increasing material results; a go-backward action, which directs them to decreasing material results. The supervisor has to find, for the current state (Eiα,β, Ejβ,α), the action that may achieve the terminal state (E0α,β, E0β,α). The choice of actions is constrained by the rules of the social exchanges, and some transitions are forbidden (e.g., both agents increasing their results simultaneously), so in some cases the supervisor has to find alternative paths in order to lead the system to the equilibrium. An action generates an optimal exchange recommendation, consisting of a partially defined exchange stage that the agents are suggested to perform. Also, by the analysis of the agents' virtual results (debts and credits), the supervisor recommends a specific type of exchange stage (I or II).
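The following sketch caricatures one step of this decision: classify a balance into a class, reject transitions where both agents would increase their results simultaneously, and detour when the direct move to equilibrium is forbidden. It is our simplified reading, not the actual QI-MDP policy of [3,4]:

```python
ORDER = {'-': 0, '0': 1, '+': 2}

def classify(balance, s=0.0, eps=1.0):
    """Map a material balance to its class relative to the target point s."""
    if balance < s - eps:
        return '-'
    if balance > s + eps:
        return '+'
    return '0'

def forbidden(src, dst):
    """Both agents increasing their material results simultaneously."""
    return ORDER[dst[0]] > ORDER[src[0]] and ORDER[dst[1]] > ORDER[src[1]]

def next_step(src, goal=('0', '0')):
    """One recommended transition toward the goal, detouring if forbidden."""
    if src == goal:
        return None                 # terminal state: already in equilibrium
    if not forbidden(src, goal):
        return goal                 # direct compensation action
    # alternative path: move only the first agent toward equilibrium
    return (goal[0], src[1])
```

For example, from ('-', '-') the direct move to ('0', '0') is forbidden, so the sketch detours through ('0', '-').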
4
Social Exchanges Between Personality-Based Agents
We define different levels of obedience to the supervisor that the agents may present: (i) Blind Obedience (the agent always follows the recommendations); (ii) Eventual Obedience (the agent may not follow a recommendation, according to a certain probability); and (iii) Full Disregard of Recommendations (the agent always decides on its own, disregarding what was recommended).
3
In this paper, we considered just a sample of classes of material results. See [3,4] for the whole family of classes of a QI-MDP, and the procedure for determining them.
The agents may have different social attitudes that give rise to a state-transition function, which specifies, for each obedience level, and given the current state and recommendation, a probability distribution Π(Es) over the set of states Es that the interacting agents will try to achieve next, depending on their personality traits. In the following, we illustrate some of those personality traits (Table 1): (i) Egoism (the agent mostly seeks its own benefit, with a high probability of accepting exchanges that represent transitions toward states where it has favorable results); (ii) Altruism (the agent mostly seeks the benefit of the other, with a high probability of accepting exchanges that represent transitions toward states where the other agent has favorable results); (iii) Fanaticism (the agent has a very high probability of enforcing exchanges that lead it to the equilibrium, avoiding other kinds of transitions); (iv) Tolerance (the agent has a high probability of enforcing exchanges that lead it to the equilibrium if its material results are far from that state, but it accepts other kinds of transitions).

Table 1. A pattern of probability distribution Π(Es) for individual agent transitions

Egoist agents:
Π(Es)   E0    E+          E−
E0      low   very high   very low
E+      low   very high   very low
E−      low   very high   very low

Altruist agents:
Π(Es)   E0    E+          E−
E0      low   very low    very high
E+      low   very low    very high
E−      low   very low    very high

Fanatic agents:
Π(Es)   E0          E+         E−
E0      very high   very low   very low
E+      very high   very low   very low
E−      very high   very low   very low

Tolerant agents:
Π(Es)   E0     E+    E−
E0      high   low   low
E+      high   low   low
E−      high   low   low
Table 2 shows parts of sample state-transition functions F for systems composed of (a) two tolerant agents and (b) two egoist agents that always disregard the supervisor's recommendations. The mark X indicates that the transition is forbidden according to the social exchange rules (both agents cannot increase their material results simultaneously, as explained in Sect. 3). System (b) presents an absorbent state, (E−, E−), meaning that the system is not able to leave that state once it reaches it4, and so may never achieve the target equilibrium point. We remark that even if the agents present a certain level of obedience, there may be a great deal of uncertainty about the effects of the supervisor's recommendations. For example, considering an obedience level of 50%, the state-transition functions shown in Table 2 become the ones shown in Table 3. Since the supervisor has no access to the current state (the material results of exchanges), it has to rely on observations of the agents' evaluations of their virtual results (debts (D), credits (C) or null results (N)). Due to their personality traits, they may present different attitudes concerning such evaluations (Table 4): (i) Realism (the agent has a very high probability to make realistic evaluations); (ii) Over-evaluation (the agent has a very high probability to report that
4
Footnote 4: The probability 100% in the last line of Table 2(b) just means that the agents refuse to exchange, remaining in the same state (E−, E−).
1110
G.P. Dimuro et al.
Table 2. Parts of state-transition functions F for pairs of agents that always disregard recommendations

(a) (tolerant, tolerant) agents

F (%)    (E0,E0)  (E0,E+)  (E0,E-)  (E+,E0)  (E+,E+)  (E+,E-)  (E-,E0)  (E-,E+)  (E-,E-)
(E0,E0)  63.90    X        13.70    X        X        2.90     13.70    2.90     2.90
(E+,E-)  49.20    10.50    10.50    10.50    2.20     2.20     10.50    2.20     2.20
(E-,E-)  X        X        37.85    X        X        8.10     37.85    8.10     8.10

(b) (egoist, egoist) agents

F (%)    (E0,E0)  (E0,E+)  (E0,E-)  (E+,E0)  (E+,E+)  (E+,E-)  (E-,E0)  (E-,E+)  (E-,E-)
(E0,E-)  X        X        0.00     X        X        0.00     15.00    85.00    0.00
(E+,E+)  2.20     12.00    0.70     12.00    64.10    4.00     0.70     4.00     0.30
(E+,E-)  2.20     12.80    0.00     12.00    68.00    0.00     0.70     4.30     0.00
(E-,E-)  X        X        0.00     X        X        0.00     0.00     0.00     100.00
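The absorbing state visible in Table 2(b) can be detected mechanically: a state is absorbing when all of its outgoing probability mass sits on the self-loop. A minimal sketch, assuming transition rows are stored as dictionaries (forbidden X transitions simply absent); the two-row fragment below is a hypothetical illustration, not the full 9-state function:

```python
def absorbing_states(F):
    """Return states whose transition row puts all mass on the self-loop.

    F maps each state to a dict {successor: probability}; forbidden
    transitions (the X marks in Table 2) are simply absent from the row.
    """
    return [s for s, row in F.items()
            if row.get(s, 0.0) == 1.0
            and all(p == 0.0 for t, p in row.items() if t != s)]

# Hypothetical fragment mirroring Table 2(b): (E-,E-) only loops onto itself.
F = {
    ("E0", "E-"): {("E-", "E0"): 0.15, ("E-", "E+"): 0.85},
    ("E-", "E-"): {("E-", "E-"): 1.0},
}
print(absorbing_states(F))  # -> [('E-', 'E-')]
```

Such a check lets the supervisor flag at specification time that a pair of egoist agents may never reach the equilibrium point.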
Table 3. Parts of state-transition functions F for pairs of agents with 50% of obedience

(a) (tolerant, tolerant) agents

F (%)    (E0,E0)  (E0,E+)  (E0,E-)  (E+,E0)  (E+,E+)  (E+,E-)  (E-,E0)  (E-,E+)  (E-,E-)
(E0,E0)  81.95    X        6.85     X        X        1.45     6.85     1.45     1.45
(E+,E-)  74.60    5.25     5.25     5.25     1.10     1.10     5.25     1.10     1.10
(E-,E-)  X        X        18.92    X        X        29.05    18.92    29.05    4.06

(b) (egoist, egoist) agents

F (%)    (E0,E0)  (E0,E+)  (E0,E-)  (E+,E0)  (E+,E+)  (E+,E-)  (E-,E0)  (E-,E+)  (E-,E-)
(E0,E-)  X        X        0.00     X        X        25.00    7.50     67.50    0.00
(E+,E+)  51.10    6.00     0.35     6.00     32.05    2.00     0.35     2.00     0.15
(E+,E-)  51.10    6.40     0.00     6.00     34.00    0.00     0.35     2.15     0.00
(E-,E-)  X        X        0.00     X        X        25.00    0.00     25.00    50.00
Table 4. A pattern of probability distribution Π(O) over the set of observations O = {N, D, C} of agents' evaluations of their virtual results, in each state

Realistic agents:
Π(O)  D          N          C
E0    very low   very high  very low
E+    very high  very low   very low
E-    very low   very low   very high

Over-evaluator agents:
Π(O)  D         N         C
E0    very low  low       very high
E+    low       medium    high
E-    very low  very low  very high

Under-evaluator agents:
Π(O)  D          N         C
E0    very high  low       very low
E+    very high  very low  very low
E-    high       medium    low
Table 5. Part of an observation function G for (under-eval., over-eval.) agents

G (%)    (N,N)  (N,D)  (N,C)  (D,N)  (D,D)  (D,C)  (C,N)  (C,D)  (C,C)
(E0,E0)  4      0      16     16     0      64     0      0      0
(E0,E-)  0      0      20     0      0      80     0      0      0
(E+,E-)  0      0      0      0      0      100    0      0      0
(E-,E-)  0      0      30     0      0      50     0      0      20
it has credits); (iii) Under-evaluation (the agent has a very high probability to report that it has debts). Table 5 shows part of a sample observation function G that gives a probability distribution of observations of evaluations of virtual results for a pair of (under-evaluator, over-evaluator) agents, in each state.
5 Reasoning About Exchanges

To be able to reason about exchanges between pairs of non-transparent personality-based agents, the supervisor models the system as a Hidden Markov Model (HMM) [16].
Regulating Social Exchanges
1111
Definition 1. A Hidden Markov Model (HMM) for exchanges between non-transparent personality-based agents is a tuple ⟨Es, O, π, F, G⟩, where: (i) the set Es of states is given by the pairs of classes of material results, where s is the equilibrium point: Es = {(E0,E0), (E0,E+), (E0,E−), (E+,E0), (E+,E+), (E+,E−), (E−,E0), (E−,E+), (E−,E−)}; (ii) the set O of observations is given by the possible pairs of agents' evaluations of virtual results: O = {(N,N), (N,D), (N,C), (D,N), (D,D), (D,C), (C,N), (C,D), (C,C)}; (iii) π is the initial probability distribution over the set of states Es; (iv) F : Es → Π(Es) is the state-transition function, which gives, for each state, a probability distribution over the set of states Es; (v) G : Es → Π(O) is the observation function, which gives, for each state, a probability distribution over the set of observations O.

This model allows the supervisor to perform the following tasks:

Task 1: to find the probability of a sequence of agents' evaluations of virtual results, using a forward-backward algorithm [16];
Task 2: to find the most probable sequence of states associated with a sequence of agents' evaluations of virtual results, using the Viterbi algorithm [16];
Task 3: to maintain an adequate model of the personality traits of the agents, given their observable behaviors: the supervisor adjusts the parameters of its current model to the probability of occurrence of a frequent sequence of observations (via the Baum-Welch algorithm [16]), in order to compare the resulting model with the known models and to classify it.

Notice that, whenever a new non-transparent agent joins the society, the supervisor assumes the position of an observer, building HMMs in order to obtain an adequate model of the personality traits of such an agent and to find the most probable state of the system at a given instant. After that, it is able to start making recommendations.
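Definition 1 maps directly onto data: the nine states, the nine observations, and three stochastic components (π, F, G) in which every distribution must sum to 1. A minimal sketch with uniform placeholder distributions (the real parameters come from Tables 2 to 5, which are not reproduced here):

```python
import itertools

# State set Es: pairs of material-result classes (Definition 1(i)).
CLASSES = ["E0", "E+", "E-"]
STATES = list(itertools.product(CLASSES, CLASSES))  # 9 states
# Observation set O: pairs of virtual-result evaluations (Definition 1(ii)).
EVALS = ["N", "D", "C"]
OBS = list(itertools.product(EVALS, EVALS))  # 9 observations

def uniform(support):
    """Uniform probability distribution over a finite support."""
    return {x: 1.0 / len(support) for x in support}

# Placeholder HMM: uniform pi, F and G stand in for the published tables.
hmm = {
    "pi": uniform(STATES),                     # initial distribution
    "F": {s: uniform(STATES) for s in STATES}, # state-transition function
    "G": {s: uniform(OBS) for s in STATES},    # observation function
}

def is_valid(hmm, tol=1e-9):
    """Every distribution in the model must sum to 1."""
    dists = [hmm["pi"], *hmm["F"].values(), *hmm["G"].values()]
    return all(abs(sum(d.values()) - 1.0) < tol for d in dists)

print(len(STATES), len(OBS), is_valid(hmm))  # -> 9 9 True
```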
We assume that obtaining the model of an agent's personality traits is independent of the agent's degree of obedience. Of course, discovering the degree of obedience of an agent is a trivial task.
6 Simulation Results

Some simulation results were chosen for discussion, considering the supervisor's tasks detailed in Sect. 5. For that, adaptations of the well-known forward-backward (task 1), Viterbi (task 2) and Baum-Welch (task 3) algorithms (see [16]) were incorporated into the supervisor behaviour (Fig. 1, HMM module).

6.1 Simulation of Tasks 1 and 2
The methodology used for the analysis of the performance of the supervisor in tasks 1 and 2 considered: (i) test-situations with two agents, combining all different personality traits; (ii) a uniform initial probability distribution π over the set of states; (iii) the computation of the probabilities of occurrence of all sequences of two consecutive observations (agents' evaluations); (iv) the computation of the most probable sequence of states that generates each observation sequence.
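Steps (iii) and (iv) are instances of the standard forward and Viterbi recursions over the HMM of Definition 1. A compact sketch, assuming a dictionary-of-distributions representation; the two-state parameters below are purely illustrative toy values, not taken from the paper:

```python
def forward_prob(pi, F, G, obs):
    """Task 1: probability of an observation sequence (forward algorithm)."""
    alpha = {s: pi[s] * G[s].get(obs[0], 0.0) for s in pi}
    for o in obs[1:]:
        alpha = {t: sum(alpha[s] * F[s].get(t, 0.0) for s in alpha)
                    * G[t].get(o, 0.0)
                 for t in pi}
    return sum(alpha.values())

def viterbi(pi, F, G, obs):
    """Task 2: most probable state sequence for an observation sequence."""
    trellis = {s: (pi[s] * G[s].get(obs[0], 0.0), [s]) for s in pi}
    for o in obs[1:]:
        trellis = {
            t: max(((p * F[s].get(t, 0.0) * G[t].get(o, 0.0), path + [t])
                    for s, (p, path) in trellis.items()),
                   key=lambda x: x[0])
            for t in pi
        }
    return max(trellis.values(), key=lambda x: x[0])[1]

# Toy two-state model: "eq" (at equilibrium) vs "off" (away from it).
pi = {"eq": 0.5, "off": 0.5}
F = {"eq": {"eq": 0.8, "off": 0.2}, "off": {"eq": 0.5, "off": 0.5}}
G = {"eq": {"N": 0.9, "D": 0.1}, "off": {"N": 0.2, "D": 0.8}}
print(viterbi(pi, F, G, ["D", "N"]))  # -> ['off', 'eq']
```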
Table 6(a) presents some peculiar results obtained for a pair of tolerant/realist agents. As expected, the simulations showed that the observations reflected the actual state transitions. The most probable sequences of observations were those ending in null virtual results, associated with transitions toward the equilibrium (e.g., obs. 1, 2, 3). The state transitions that did not faithfully reflect the observations were those that took the place of transitions that are forbidden according to the social rules of the modelling. For example, the transition found for observation 4 (which presented the lowest occurrence probability among sequences ending in null results) was found in place of (E0, E−) → (E0, E0), since the latter is a forbidden transition. Observations with very low probability were associated, in general, with transitions that went away from the equilibrium (e.g., obs. 5).

Table 6(b) shows some selected results for a pair of (tolerant/under-evaluator, tolerant/over-evaluator) agents. As expected, the transitions did not always reflect the observations (e.g., obs. 1, 2). Nonetheless, the overall set of simulations showed that almost 70% of the observations ending in null results coincided with transitions ending in the equilibrium. However, those observations presented very low probability (e.g., obs. 1 and 3, the latter having the lowest occurrence probability, since it reflected an adequate transition, which was not expected for non-realist agents). Observation 4 presented the highest occurrence probability, and its associated transition toward the equilibrium point was the most expected one for a pair of tolerant agents. There was always a high probability that the agents evaluated their virtual results as (D, C) whenever they were in the equilibrium state, as expected.
In general, sequences of observations containing the results (D, C) were the most probable; on the contrary, sequences of observations containing (C, D) had almost no probability of occurrence (e.g., obs. 5).

Table 6(c) shows some results for a pair of (egoist/realist, altruist/realist) agents. The observations ending in null virtual results presented very low probabilities, although, in general, they reflected the actual transitions. Observation 2 was the most probable sequence ending in null virtual results. Observation 3 represents the particular case discussed before, in which the corresponding faithful transition was not allowed, and, therefore, the actual transition did not reflect the observation. The most probable observations were those associated with transitions that moved the agents away from the equilibrium (e.g., obs. 4), or those associated with transitions that maintained benefits for the egoist agent and losses for the altruist agent (e.g., obs. 5).

Table 6(d) shows some results for a pair of (egoist/under-evaluator, altruist/over-evaluator) agents. The sequences of observations ending in null virtual results presented very low probability (e.g., obs. 1, 2 and 3), but they are still significant because they coincided with transitions ending in the equilibrium. The other sequences of observations that did not end in null results, but corresponded to transitions that led to the equilibrium (e.g., obs. 4), presented no probability of occurrence. Notice the very high probability of observation 5, which reflected exactly the combination of those extreme personalities.
Table 6. Simulation results for pairs of agents

(a) (tolerant/realist, tolerant/realist)
N  Observation  Probab.  State Transition
1  (N,N)-(N,N)  3.6%     (E0,E0) → (E0,E0)
2  (D,D)-(N,N)  3.4%     (E+,E+) → (E0,E0)
3  (D,N)-(N,N)  3.3%     (E+,E0) → (E0,E0)
4  (N,C)-(N,N)  1.5%     (E0,E0) → (E0,E0)
5  (D,N)-(D,D)  0.3%     (E+,E0) → (E+,E+)

(b) (tolerant/under-eval., tolerant/over-eval.)
N  Observation  Probab.  State Transition
1  (N,N)-(N,N)  0.084%   (E-,E+) → (E0,E0)
2  (D,C)-(N,N)  1.902%   (E-,E-) → (E-,E0)
3  (C,D)-(N,N)  0.014%   (E-,E+) → (E0,E0)
4  (D,C)-(D,C)  35.28%   (E+,E-) → (E0,E0)
5  (C,D)-(C,D)  0.0004%  (E-,E+) → (E-,E+)

(c) (egoist/realist, altruist/realist)
N  Observation  Probab.  State Transition
1  (N,N)-(N,N)  0.36%    (E0,E0) → (E0,E0)
2  (D,D)-(N,N)  0.42%    (E+,E+) → (E0,E0)
3  (C,N)-(N,N)  0.27%    (E-,E0) → (E0,E-)
4  (N,N)-(D,C)  5.07%    (E0,E0) → (E+,E-)
5  (C,D)-(D,C)  5.35%    (E-,E+) → (E+,E-)

(d) (egoist/under-eval., altruist/over-eval.)
N  Observation  Probab.  State Transition
1  (N,N)-(N,N)  0.002%   (E-,E+) → (E0,E0)
2  (D,C)-(N,N)  0.070%   (E+,E-) → (E0,E0)
3  (D,N)-(N,N)  0.016%   (E+,E+) → (E0,E0)
4  (C,D)-(C,D)  0.000%   (E-,E+) → (E0,E0)
5  (D,C)-(D,C)  53.71%   (E+,E-) → (E+,E-)
6.2 Simulation of Task 3

The methodology used for the analysis of the performance of the equilibrium supervisor in task 3 considered the following steps: (i) given a frequently noticed sequence of observations of evaluations of virtual results, the HMM is adjusted by generating new parameters (initial distribution, transition and emission matrices) for the probability of such observations; (ii) the new HMM is compared with the models known by the supervisor, stored in a library: the difference between the new HMM and each of such models is evaluated using the infinity norm ||X − Y||∞, where X is any parameter of a reference HMM, Y is the respective parameter of the new HMM, and ||A||∞ = max_i Σ_{j=1..n} |A_ij| is the maximum absolute row sum norm of a matrix A; (iii) the new HMM is then classified as either describing a new model of personality traits or being of one of the kinds of models maintained in the library, according to a given admissible error. To adjust the parameters of a given model, we used the Baum-Welch algorithm [16] (which, we noticed, happened to preserve the compliance of the transition matrices with the exchange rules).

Table 7 shows the analysis done by the supervisor when observing the interactions between five non-transparent agents and the other personality-based agents. The results were obtained by comparing adjusted HMMs (for probabilities of observations) with the other models of pairs of agents, considering a maximum error of 0.7. For simplicity, only realist agents were considered.

Table 7. Recognition of new personality traits (T = tolerance, E = egoism, A = altruism)

N  Observation              Prob. (%)  Least Error (model)  Personality Trait
1  (D,D)-(N,N)-(N,N)        80         0.6 (T,T)            tolerance
2  (D,D)-(N,N)-(N,N)        100        1.0 (T,T)            new classification
3  (D,N)-(D,D)-(D,D)        60         0.6 (E,E)            egoism
4  (N,N)-(C,C)-(C,C)        40         0.6 (A,A)            altruism
5  (D,C)-(C,N)-(D,C)-(C,N)  50         1.2 (T,T)            new classification

For the observation in line 1 (probability of 80%, in interactions with tolerant agents), the least error between the new model and all other models resulted in
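The comparison in step (ii) and the classification in step (iii) reduce to a few lines: take the maximum absolute row sum of the difference of each parameter matrix and test the least error against the admissible threshold (0.7 above). A sketch with hypothetical 2x2 matrix fragments (the real transition matrices are 9x9):

```python
def inf_norm(A):
    """Maximum absolute row sum norm of a matrix (list of rows)."""
    return max(sum(abs(x) for x in row) for row in A)

def matrix_diff(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def classify(new_model, library, max_error=0.7):
    """Return the best-matching library label, or None for a new trait."""
    errors = {label: inf_norm(matrix_diff(new_model, ref))
              for label, ref in library.items()}
    label, err = min(errors.items(), key=lambda kv: kv[1])
    return label if err <= max_error else None

# Hypothetical fragments standing in for known pair-of-agents models.
library = {"(T,T)": [[0.8, 0.2], [0.5, 0.5]],
           "(E,E)": [[0.1, 0.9], [0.0, 1.0]]}
new = [[0.7, 0.3], [0.4, 0.6]]
print(classify(new, library))  # -> (T,T)
```

When the least error exceeds `max_error`, the function returns `None`, mirroring the "new classification" entries of Table 7.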
its compatibility with a model of tolerant agents. So, the supervisor classified the non-transparent agent as tolerant. For the observation in line 5 (probability of 50%, in interactions with tolerant agents), the least error found was larger than the admissible error, and the supervisor therefore concluded that the agent had a new personality trait. Line 2 shows the dependence of the results on the probability of the observation: if the probability in line 1 had been 100%, the supervisor would have concluded that the agent presented a new personality trait.
7 Conclusion
The paper advances the idea of modelling agents' personality traits in social exchange regulation mechanisms. It extends the centralized regulation mechanism based on the concept of equilibrium supervisor by introducing the possibility that personality-based agents control the supervisor's access to their internal states, behaving either as transparent agents (agents that allow full external access to their internal states) or as non-transparent agents (agents that restrict such external access).

We studied three sample sets of personality traits: (i) blind obedience, eventual obedience and full disregard of recommendations (related to the levels of adherence to the regulation mechanism); (ii) fanaticism, tolerance, egoism and altruism (in connection with preferences about balances of material results); and (iii) realism, over- and under-evaluation (in connection with the agents' tendencies in the evaluation of their own status).

The main focus was on dealing with non-transparent agents, where the supervisor has to make use of an observation module, implemented as an HMM, to be able to recognize and maintain an adequate model of the personality traits of such agents. This may be important for open agent societies, for example in applications for the Internet. Also, the consideration of the agent (non)transparency feature seems to be new to the issue of social control systems.

To analyze the efficiency of the supervisor observation module, we performed simulations whose results showed that the approach is viable and applicable. The simulations also hinted at the possibility of establishing sociological properties of the proposed HMM, like the property that the adjustment of a given model by the Baum-Welch procedure preserves the constraints imposed by the rules that regulate the value-exchange processes.
Future work is concerned with the internalization of the supervisor into the agents themselves, going toward the idea of self-regulation of exchange processes, not only distributing the decision process [17], but also considering incomplete information about the balances of material results of the exchanges between non-transparent agents, in the form of a personality-based qualitative interval Partially Observable Markov Decision Process (POMDP) [18,19].
References

1. Castelfranchi, C.: Engineering social order. In Omicini, A., Tolksdorf, R., Zambonelli, F., eds.: Engineer. Societ. in Agents World. Springer, Berlin (2000) 1–18
2. Homans, G.C.: The Human Group. Harcourt, Brace & World, New York (1950)
3. Dimuro, G.P., Costa, A.C.R., Palazzo, L.A.M.: Systems of exchange values as tools for multi-agent organizations. Journal of the Brazilian Computer Society 11 (2005) 31–50 (Special Issue on Agents' Organizations)
4. Dimuro, G.P., Costa, A.C.R.: Exchange values and self-regulation of exchanges in multi-agent systems: the provisory, centralized model. In Brueckner, S., Serugendo, G.M., Hales, D., Zambonelli, F., eds.: Proc. Work. on Engineering Self-Organizing Applic., Utrecht, 2005. Number 3910 in LNAI, Berlin, Springer (2006) 75–89
5. Piaget, J.: Sociological Studies. Routledge, London (1995)
6. Dimuro, G.P., Costa, A.C.R., Gonçalves, L.V., Hübner, A.: Centralized regulation of social exchanges between personality-based agents. In: Proc. of the Work. on Coordination, Organization, Institutions and Norms in Agent Systems, COIN@ECAI'06, Riva del Garda, 2006
7. Antunes, L., Coelho, H.: Decisions based upon multiple values: the BVG agent architecture. In Barahona, P., Alferes, J.J., eds.: Proc. of IX Portug. Conf. on Artificial Intelligence, Évora. Number 1695 in LNCS, Berlin (1999) 297–311
8. Miceli, M., Castelfranchi, C.: The role of evaluation in cognition and social interaction. In Dautenhahn, K., ed.: Human cognition and agent technology. John Benjamins, Amsterdam (2000) 225–262
9. Walsh, W.E., Wellman, M.P.: A market protocol for distributed task allocation. In: Proc. III Intl. Conf. on Multiagent Systems, Paris (1998) 325–332
10. Rodrigues, M.R., Luck, M.: Analysing partner selection through exchange values. In Antunes, L., Sichman, J., eds.: Proc. of VI Work. on Agent Based Simulations, MABS'05, Utrecht, 2005. Number 3891 in LNAI, Berlin, Springer (2006) 24–40
11. Puterman, M.L.: Markov Decision Processes. Wiley, New York (1994)
12. Moore, R.E.: Methods and Applic. of Interval Analysis. SIAM, Philadelphia (1979)
13. Carbonell, J.G.: Towards a process model of human personality traits. Artificial Intelligence 15 (1980) 49–74
14. Castelfranchi, C., Rosis, F., Falcone, R., Pizzutilo, S.: A testbed for investigating personality-based multiagent cooperation. In: Proc. of the Symp. on Logical Approaches to Agent Modeling and Design, Aix-en-Provence (1997)
15. Castelfranchi, C., Rosis, F., Falcone, R., Pizzutilo, S.: Personality traits and social attitudes in multiagent cooperation. Applied Artif. Intelligence 12 (1998) 649–675
16. Rabiner, L.R.: A tutorial on Hidden Markov Models and selected applications in speech recognition. Proc. of the IEEE 77 (1989) 257–286
17. Boutilier, C.: Multiagent systems: challenges and opportunities for decision theoretic planning. Artificial Intelligence Magazine 20 (1999) 35–43
18. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artificial Intelligence 101 (1998) 99–134
19. Nair, R., Tambe, M., Yokoo, M., Pynadath, D., Marsella, S.: Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings. In: Proc. 18th Intl. Joint Conf. on Artificial Intelligence, IJCAI'03, Acapulco (2003) 705–711
Using MAS Technologies for Intelligent Organizations: A Report of Bottom-Up Results

Armando Robles¹,², Pablo Noriega¹, Michael Luck², and Francisco J. Cantú³

¹ IIIA - Artificial Intelligence Research Institute, Bellaterra, Barcelona, Spain
² University of Southampton, Electronics and Computer Science, Southampton, United Kingdom
³ ITESM Campus Monterrey, Research and Graduate Studies Office, Monterrey, N.L., México
{arobles, pablo}@iiia.csic.es, [email protected], [email protected]
Abstract. This paper is a proof-of-concept report on a bottom-up approach to a conceptual and engineering framework to enable Intelligent Organizations using MAS technology. We discuss our experience of implementing different types of server agents and a rudimentary organization engine for two industrial-scale information systems now in operation. These server agents govern knowledge repositories and user interactions according to workflow scripts that are interpreted by the organization engine. These results show how we have implemented the bottom layer of the proposed framework architecture. They also allow us to discuss how we intend to extend the current organization engine to deal with institutional aspects of an organization other than workflows.
1 Introduction

This paper reports results on two particular aspects of our progress towards a framework to support knowledge-intensive organizations: the design of server domain agents and the implementation of an organization engine. We are proposing a framework for the design of systems enabled by electronic institutions that drive the operation of actual corporate information systems. This is an innovative approach to information systems design, since we propose ways of stating how an organization is supposed to operate (its institutional prescription) and of having that prescription control the information system that handles the day-to-day operation of the organization (the enactment of the organization).

We are not restricting our proposal to any particular domain of application, but we do have in mind organizations that are self-contained (i.e. with a boundary that separates the organization from its environment) and have a stable character (i.e. whose mode of operation does not change very rapidly). We should also make clear that our proposal is not intended for organizational design; what we are proposing is a framework for the design and deployment of agent-based systems that support already designed organizations. Finally, we should point out that we are designing a framework to be applied to new information systems, but as this paper demonstrates, we find it is also applicable, with some reservations, to the conversion of traditional legacy information systems.

A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1116–1127, 2006.
© Springer-Verlag Berlin Heidelberg 2006
In our framework we propose a conceptual architecture and the tools to build corporate information systems. The framework we propose is built around the notion of electronic institution (EI) [2] and uses agent-based technologies intensively. Instead of using the notion of electronic institutions to represent and harness only static procedural features, as is currently the case, we propose to extend the notion of electronic institution to conveniently capture more flexible procedural features. In order to also capture other non-procedural institutional features of an organization, we use the extended notion of electronic institution and develop different sorts of agents and agent architectures: server agents, organization agents and user agents. In addition to those extensions, we are also involved in the consequent extension of available tools in order to handle the added expressiveness and functionality.

In previous papers we have outlined the framework [10], discussed its components from a top-down perspective [11] and reported the first implementation experiences [13]. In this paper we recount our experience with the agentification of two existing corporate information systems of the type we want to be able to capture with our framework, and discuss how we plan to extend that experience towards the intended framework. The experience of agentifying these industrial-scale systems had two main outcomes: a family of actual server agents that deal with the knowledge repositories and user interfaces of the two application domains, and a rough version of the organization engine that we propose for the full-blown framework.

The paper is organized as follows: after a quick review of relevant background, we present our basic proposal in Section 3, and in Section 4 what we have accomplished in the bottom-up agentification process. We then discuss why and how we intend to evolve a workflow engine into an organizational engine in Section 5.
Finally, in Section 6 we present ongoing work and conclusions.
2 Background

2.1 Organizations

We think of an organization, a firm, as a self-contained entity where a group of individuals pursue their collective or shared goals by interacting in accordance with some shared conventions and using their available resources as best they can [9,7,1]. This characterization focuses on the social processes and purpose that give substance and personality to a real entity, and naturally allows us to consider people, processes, information and other resources as part of the organization. We choose to use this particular notion in our discourse because, at least for the moment, we do not want to commit to other organization-defining criteria like sustainability, fitness in its environment, or status and substitutability of personnel. We want to focus further on what have been called knowledge-intensive or intelligent organizations, whose distinguishing feature is the explicit recognition of their corporate knowledge and know-how as an asset [6].

The everyday operation of an organization consists of many activities that are somewhat structured and that involve personnel, clients and resources of different sorts. It is usual for organizations to manage and keep track of those activities through on-line information systems that are usually called corporate information systems (CIS). We will assume that intelligent organizations have CIS, and we will further assume that corporate knowledge and know-how may be contained in the CIS.
Hotels, hospitals and other types of organizations have conventions that structure or institutionalize their activity in consistent ways, so that employees and clients have some certainty about what is expected of them or what to expect from each other. These conventions are usually also a convenient way of establishing procedures that save coordination and learning efforts and pinpoint issues where decision-making is regularly needed. These institutional conventions usually take the form of organizational roles, social structures, canonical documents, standard procedures, rules of conduct, guidelines, policies and records; that is, habits and objects that participants adhere to in a more or less strict way (cf. e.g. [14]). Our aim is to design a framework that is fit to capture such institutional aspects of an intelligent organization and make them operational as part of its CIS.

2.2 Electronic Institutions

We adopt the concept of electronic institution, EI, as defined in the IIIA and specified through the following components: a dialogical framework, which defines ontology, social structure and language conventions, and a deontological component that establishes the pragmatics of admissible illocutionary actions and manages the obligations established within the institution [8]. EI is currently operationalized as EI0 [2]. In particular, its deontological component is specified with two constructs that we will refer to in the rest of the paper. First, a performative structure that includes a network of scenes linked by transitions. Scenes are role-based interaction protocols specified as finite state machines, with arcs labelled by illocutions and nodes corresponding to institutional states. Transitions describe the role-flow policies between scenes. Second, a set of rules of behavior that establish role-based conventions regulating commitments. These are expressed as pre- and postconditions of the illocutions admissible by the performative structure.
There is a set of tools (EIDE) [2] that implements EI0 electronic institutions. It includes a specification language (ISLANDER) that generates an executable EI, and middleware (AMELI) that activates a run-time EI to be enacted by actual agents. We want to take advantage of these developments to capture the institutional aspects of an organization and be able to incorporate these aspects as part of a CIS. More precisely, we will use EI notions to represent stable institutional activities, roles, procedures and standard documents. We will also take advantage of EIs as coordination artifacts to organize corporate interactions according to the (institutional) conventions of the organization. Finally, we will use an extended version of EI0 in order to specify and implement an organization engine that enacts the institutional conventions of an organization by driving the operation of the components of its CIS.
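A scene in the sense above is essentially a finite state machine whose arcs are labelled by illocutions: an utterance is admitted only if the protocol has an arc for it in the current institutional state. A minimal sketch of that idea (the state names, roles and illocutions below are hypothetical, not taken from any ISLANDER specification):

```python
class Scene:
    """Role-based interaction protocol as a finite state machine:
    nodes are institutional states, arcs are labelled by illocutions."""

    def __init__(self, initial, final, arcs):
        self.state = initial
        self.final = final   # set of terminal states
        self.arcs = arcs     # (state, illocution) -> next state

    def say(self, illocution):
        """Admit an illocution only if the protocol allows it here."""
        key = (self.state, illocution)
        if key not in self.arcs:
            raise ValueError(f"{illocution!r} not admissible in {self.state}")
        self.state = self.arcs[key]
        return self.state

    def done(self):
        return self.state in self.final

# Hypothetical admission scene: a guest requests a room, the clerk answers.
scene = Scene("w0", {"w2"}, {
    ("w0", "request(guest, clerk, room)"): "w1",
    ("w1", "accept(clerk, guest)"): "w2",
    ("w1", "refuse(clerk, guest)"): "w2",
})
scene.say("request(guest, clerk, room)")
scene.say("accept(clerk, guest)")
print(scene.done())  # -> True
```

Rules of behavior would then attach pre- and postconditions to each admissible illocution, which this sketch omits.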
3 A Proposal for EI-Enabled Organizations Our aim is to design a conceptual framework to deal with the design and construction of corporate information systems. Since we intend to make such framework applicable for knowledge-intensive CIS and we find that the notion of electronic institution is well adapted to this purpose, we are calling it a framework for EI-enabled organizations. We are proceeding in the following manner:
– Agentify the components of standard CIS with three types of agents owned and controlled by the organization: server agents, user agents, and staff agents.
– Encapsulate institutional knowledge as (a) agentified knowledge repositories of different types (business rules, workflow scripts), (b) decision-making capabilities, guidelines or policies that are modelled as staff agents, and (c) the choice of functions that are delegated to staff agents.
– Extend the notion of electronic institution to properly describe and implement the significant institutional aspects of an organization.
– Build an operative environment where a prescriptive description of an organization governs and interacts with the CIS that handles the day-to-day operation of the organization.

Figure 1 (left) outlines the functional architecture of the framework. That functional architecture has been motivated and described elsewhere ([10], [11]); however, for the purpose of this paper we may say that the top layer is a prescriptive definition of the organization that the bottom layer eventually grounds on the components (users, transactions, programs, data) of the business domain. The core is an organization engine built around an EI that implements and enforces the institutional conventions through a middleware that maps them into the agentified CIS in the bottom layer.
Fig. 1. Architectural diagram of the proposed framework (left) and the implemented workflow engine for the outpatient care system, involving User, Business rule and Database server agents (right)
– The electronic institution layer implements the normative specification using as input the performative scripts produced by the EI specification language. The runtime functionalities of this layer are similar to those of AMELI [3,2] since it runs in close interaction with the organization middleware and it guarantees that all interactions comply with the institutional conventions. – The organization middleware layer contains a grounding language interpreter and uses it to pass grounding language commands from the run-time EI to the CIS components and map actions in the CIS with state transitions in the run-time EI.
Thus, the grounding language is used to specify the sequencing of instantiation of performative scripts, as well as agent behaviour, in order to manage interactions with the actual CIS elements: users, interfaces and knowledge repositories. The basic functions of this middleware layer are:
• to log users into the organization, controlling user roles, agent resources and security issues;
• to monitor user interaction;
• to execute the grounding language interpreter;
• to implement interaction devices (footnote 1); and
• to control the actual mappings between the grounding language interpreter and domain entities.
4 A Bottom-Up Approach

4.1 The Agentified CIS

We have approached the design of our framework from both ends. The top-down approach is centered on the theoretical and computational extensions of the EI0 developments (cf. [11]). The bottom-up approach that we explore in this paper has consisted in the agentification of two CIS in actual operation, and the design and implementation of a rudimentary organization engine that is currently a workflow engine proficient enough to drive the MAS-ified operation of those two CIS.

The systems that we have been working with are integral vertical industry systems. One system implements the full operation of large hotels: reservations, room assignment, restaurant services, accounting and billing, personnel, etc. The other implements the full operation of a hospital: outpatient care, nurse protocols, pharmacy, inventory control, electronic medical records and so on. They have been developed by Grupo TCA and have been in evolving operation for almost 20 years (footnote 2).

Over the last 5 years, TCA has been modifying its hotel information system to facilitate its agentification. It is now a consolidated set of business rules available to a middleware workflow engine that reads workflow scripts and delegates concrete tasks and procedures to participating user and server agents. This modestly MAS-ified CIS, whose architecture is reported in [13], is already operational in 15 hotels. In the health care domain, TCA has MAS-ified the outpatient care subsystem [12] as a first step for the agentification of their integral hospital information system. These two MAS-ified CIS show how we intend to put our proposal to work and, as we report below, the experience brought to light many issues that should be taken into account for the development of our framework.

Footnote 1: For those domain entities that need to be in contact with external agents we have developed a special type of server agent that we call an interaction device. These devices implement interfacing capabilities between external users and other domain elements, e.g. form handling, data base calls, business rule triggering.

Footnote 2: TCA is a medium-size privately owned information systems company that has been active in the design and development of integral information systems for the Latin American market since 1982.
Using MAS Technologies for Intelligent Organizations
1121
4.2 The Workflow Engine

The workflow engine (WF-engine) is currently operational and implements a restricted version of the main component of the organization middleware: the organization engine. Once initiated, the WF-engine reads a workflow script from a repository and interprets the commands contained in it. The commands are executed in the sequence dictated by the workflow conditional directives, and each command triggers the inter-operation of server agents that control domain components —data bases, business rules, forms— and their interaction with human users. Figure 1 (right) illustrates how the workflow engine supervises the agents that handle specialized domain components, such as databases or business rule repositories: a specialized business rule server agent (Bag) fetches, from a central repository, business rules that use data provided by another specialized database server agent (Dag), to provide input to a user agent (Uag) that displays it in a user form.

Each workflow specification is stored in a repository as a workflow script. Since each domain component is represented in the environment by a specialized server agent, we have implemented commands for sending requests to the corresponding server agents for their execution of business rules, data base accesses, report definitions, and end-user interactions. Each task specified in a protocol is implemented as one of the following domain actions:

– a business rule, which could be as simple as a single computation or as complex as a complete computer program;
– a data base access to add, delete, modify or retrieve information from a data base;
– a user interaction through a specialized form; or
– a reference to another workflow script.

We have built an interpreter that takes a workflow script and produces a set of actions. This implementation involves the activation of server and user agents of different types, the sequencing of their actions, and the loading and passing of parameters during those actions.
The interpreter uses the following commands:

– read the workflow specification script,
– initialize variables,
– load defaults for variables and data, and
– execute workflow commands.
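The read-initialize-execute cycle above can be sketched as follows. This is a hypothetical illustration, not TCA's actual implementation; the class and agent names are ours:

```python
# Sketch of a read-initialize-execute interpreter cycle. All names
# (WorkflowInterpreter, the stub agents) are hypothetical illustrations.
class WorkflowInterpreter:
    def __init__(self, agents):
        self.agents = agents          # agent name -> handler callable
        self.variables = {}           # workflow-scoped variables and data

    def load_defaults(self, defaults):
        self.variables.update(defaults)

    def run(self, script):
        """Execute a parsed script: a list of (agent, action, args) commands."""
        for agent, action, args in script:
            result = self.agents[agent](action, args, self.variables)
            if result is not None:    # agents may update the shared context
                self.variables.update(result)

# Usage: stub server agents standing in for Uag, Bag and Dag.
log = []
def user_agent(action, args, ctx): log.append(("Uag", action))
def br_agent(action, args, ctx):   log.append(("Bag", action)); return {"checked": True}
def db_agent(action, args, ctx):   log.append(("Dag", action))

wf = WorkflowInterpreter({"UserAgent": user_agent,
                          "BRServerAgent": br_agent,
                          "DBServerAgent": db_agent})
wf.run([("UserAgent", "DefineGrid", ["grid01"]),
        ("BRServerAgent", "ConsistencyCheck", []),
        ("DBServerAgent", "New", ["Suppliers"])])
```

The point of the sketch is the dispatch structure: every command is routed to the server agent that represents the corresponding domain component.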
Initially, the workflow interpreter reads the main workflow script and starts executing the specified commands, controlling and sequencing the interaction between the intervening agents, as well as loading and executing other workflow scripts specified in the main workflow. Here is a workflow script segment used in the Back Office module of the Hotel Information System to implement the task of adding a supplier.³ The script specifies the coordination of interactions between database, business-rule and user agents (who use specialized forms).

³ The script and business rules are taken from TCA's hot500.wf and hot500.br repositories.
1122
A. Robles et al.
Procedure AddSupplier
begin
   InitializeVariables;
   Interact(UserAgent(DefineGrid(grid01)));
   Interact(UserAgent(InputFields(grid01, Supplier)));
   Interact(BRServerAgent(ConsistencyCheck));
   Interact(DBServerAgent(Suppliers, New));
end
WF-Engine Functional Features.

Agent mediated interactions. The WF-engine acts as a link between the user interface and the data base and business rule repositories, but all interactions are mediated by ad-hoc server agents.

Specialized server agents for domain components. The main function of the specialized server agents is to act as facilitators of business domain components (including business rule repositories and data bases) for all user agents that may be logged in at several client sessions. The user interface is mediated by a user agent that is regarded as a client of the business rule and data base server agents.

Persistent communication. Once the interaction between a user agent and a server agent is established, the infrastructure ensures that the communication between the two agents persists until one of them decides to terminate it.

Business rule triggering. As shown in the previous examples, workflow scripts are currently not much more than sequences of conditional clauses that invoke, through specialized agents, the activation of specific business rules. Business rules are special-purpose programs, stored in a repository that may be accessed by a business rule agent that is able to trigger rules and use the results of such triggerings. Business rule agents (BRagents) react to workflow transitions by requesting business rule inputs either from database server agents, which query a data base, or from user agents, which read input from a user form. With those inputs, the BRagent triggers a rule whose result is passed to a data base server agent or a user agent or, more frequently, is used by the BRagent to change the workflow state.

WF-Engine Programming Functionalities.

Context of interaction. The system programmer is responsible for maintaining the context of all agent interactions because, as agent interactions evolve, they modify the context of the world, updating data and status variables as required.

Precedence of execution.
During workflow execution, event value verification takes precedence over sequential process execution; that is, in the middle of a conditional execution, it is possible to break the sequential flow and skip directly to the first command of another conditional clause.

Workflow scope of execution. Once a flat form or grid is addressed, all subsequent workflow commands are made in the scope of that specific flat form or grid, until another one is addressed.
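The precedence behaviour just described can be sketched as follows. This is an illustration under our own assumptions (names and event representation are hypothetical), not the actual WF-engine:

```python
# Illustrative sketch of event value verification taking precedence over
# sequential execution: before each sequential step, a pending event
# diverts the flow to the first command of its conditional clause.
def run_workflow(main_seq, handlers, context):
    """main_seq: list of commands; handlers: {event: list of commands}."""
    trace = []
    pending = list(main_seq)
    while pending:
        event = context.get("event")
        if event is not None and event in handlers:
            pending = list(handlers[event]) + pending  # skip to that clause
            context["event"] = None                    # the event is consumed
        trace.append(pending.pop(0))
    return trace

# A hypothetical run: a "low_stock" event interrupts the main sequence.
trace = run_workflow(["ShowForm", "SaveRecord"],
                     {"low_stock": ["NotifyPurchasing"]},
                     {"event": "low_stock"})
```

Here the handler command runs before the interrupted sequence resumes, mirroring the break-and-skip behaviour of the engine.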
Scope of variables. Global variables are available in the scope of the workflow definition; that is, in the workflow specification the programmer can test the value of variables defined as global by any server agent. It is the programmer's responsibility to define and maintain the proper scope for the required variables.

WF-Engine Limitations. The WF-engine has no control over what is said between agents. Because of the way workflow scripts are currently implemented, it deals only with specific conditional commands that test for contextual changes, represented by changes in data and status variables. This is an important limitation whenever we want to deal with complex interactions, because we are forced to "hardwire" the control code for the execution of alternative procedures in the workflow script or in the business rules it involves. In the WF-engine implemented in this experiment we designed specific commands that deal with the transfer of data between the workflow engine and user or server agents. While it is natural to transfer information as data, the transfer of control data that may alter or modify agent behavior is undesirable; due to the limited expressiveness of workflow scripts, however, we had to implement it in the WF-engine. We have used working memory to pass control data, but this use entails the messy problem of dealing with global variables, thus imposing severe restrictions on agent autonomy. In the implementations described here we only use reactive agents. Such a primitive implementation is enough for the current needs, but we may readily change their specification to involve more sophisticated behavior, to take advantage of more sophisticated organization engines.
5 From WF-Engine to O-Engine

Lessons Learned. Our experience of MAS-ifying two CIS has brought to light many issues pertinent to an organizational engine. The main lessons are:

Complexity trade-off. Considering the agentification of systems with equivalent functionalities, our experience with MAS-ification shows that when business rules capture much of the discretionary (rational) behaviour of agents, simple procedural rules suffice to implement the procedures in which those business rules are involved. Conversely, as business rules become simpler, the procedural requirements become more involved, the need for discretionary agent behaviour increases, and the need for handling agent commitments arises. The more "atomic" the business rules are, the more complexity is needed to handle them, both in the flow of control and in agent behavior.

Agent commitments. These two experiments have also shown that if we do not have a structural mechanism to control the commitments generated through agent interactions, we need to hard-wire the required program logic to keep track of pending commitments inside each agent, as part of the workflow, or inside some business rules. Assume that agent a performs an action x at time t and establishes a commitment to do action y at time, say, t + 3. If action x is implemented as a business rule, then we must have a mechanism to send a return value to the BRagent, or some way to set a variable in some kind of working memory.
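The bookkeeping that such a structural mechanism would need can be sketched as follows. The ledger is a hypothetical illustration of ours, not part of the WF-engine:

```python
# Hypothetical sketch of structural commitment bookkeeping: a ledger
# records that action x at time t obliges agent a to do y at t + 3,
# instead of hard-wiring that logic into agents or business rules.
from dataclasses import dataclass, field

@dataclass
class Commitment:
    agent: str
    action: str
    due: int            # the time by which the action must be performed

@dataclass
class CommitmentLedger:
    pending: list = field(default_factory=list)

    def assume(self, agent, action, due):
        self.pending.append(Commitment(agent, action, due))

    def due_now(self, now):
        """Return (and drop) the commitments an engine should enforce now."""
        due = [c for c in self.pending if c.due <= now]
        self.pending = [c for c in self.pending if c.due > now]
        return due

# The example from the text: x fires at t = 1, committing a to y at t + 3.
ledger = CommitmentLedger()
ledger.assume("a", "y", due=1 + 3)
```

An organization engine holding such a ledger could enforce pending commitments centrally, which is the alternative to returning values to the BRagent or to working-memory variables.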
Viable approach. We have described how we MAS-ified two CIS. In the process, we have outlined the construction of the required server and user agents, developed the required business rules, and specified the workflow needed for the appropriate sequencing of execution among the intervening agents. In this sense we have been able to implement two CIS that correspond roughly to the type of EI-enabled CIS we want to build with our framework. Even though the WF-engine is an embryonic version of an organizational engine and workflow scripts are clumsy parodies of EI performative scripts, we have shown that specialized server agents, knowledge repositories and display devices may be driven by a prescriptive specification, and some of the intended benefits are already available even in these rudimentary examples:

– We found considerable savings in software-development time and effort by avoiding duplicate code through business rule server agents and business rule repositories, since the same agent scheme can exploit similar repositories.
– We ensured problem separation at system design time, allowing domain experts to define the appropriate workflow and leaving to the engineer the task of engineering server agent behaviour. By having business rules managed by server agents, the problem is reduced to implementing some control over these agents.
– Separating workflow and business rule definitions from business rule and workflow control begets a considerable simplification of the system upgrading process. This simplification allows us a glimpse at the possibility of having dynamic behavior in the CIS prescription.

Additional Functionality for the O-Engine. In our framework, we want to be able to prescribe what the valid interactions among agents are. We have decided that the only valid mechanism for agent interaction —to communicate requests, assertions, results— should be illocutions.
Hence, instead of using working memory, we need a proper grounding language and a mechanism to control agent illocutions and the ensuing commitments over time. This suggests the use of production rules and an inference mechanism that will be used to define and operate the institutional conventions of performative scripts and also to load the knowledge bases of staff agents. We need to design a proper grounding language to map the sequencing and instantiation of performative scripts and server agent illocutions in order to manage interactions with the domain components.

How to Define and Implement the O-Engine. In order to address the issues mentioned in this section, we need to evolve the definition of a workflow engine into a more sophisticated organization engine that handles performative scripts —which capture more information than workflows— illocutionary interactions, and dynamic agent commitments. We will implement this required functionality by extending the concept of Electronic Institution. In fact, each performative script is built as an electronic institution, and an extension of the current machinery for transitions is used to intertwine the scripts. We will also need to extend the expressiveness of ISLANDER by having sets of modal formulae (norms) as a way of specifying performative scripts [4,5]. The grounding language will be a gradual extension —as we increase the functionality and autonomy of
server agents— of the primitive commands that we now use to load WF scripts and to sequence the interaction of intervening agents and their calls to business rules and databases, all of which are currently hidden in the WF interpreter. Once a performative script is modelled and specified (using an extension of IIIA's ISLANDER tool), it is saved in a performative script repository. The organizational engine reads and instantiates each performative script as needed. The current EI0 operationalization of EI [2] will be taken as the starting point for these purposes, but we are extending it to obtain a better representation of organizational activities, the extended functionality, and a leaner execution.
6 Final Remarks

Recapitulation. In [10] we took a top-down approach to the definition of a framework for enacting intelligent organizations. We proposed having a prescriptive specification that drives the day-to-day operation of the organization's information system with an organizational engine based on electronic institutions. In this paper we reported our experiences with a bottom-up approach in which we tested, and proved adequate, a rudimentary version of the proposed framework. We also discussed how we expect to attain the convergence of the top-down and bottom-up approaches by, on one hand, transforming the WF-engine that is now functional in two industrial-scale CIS into an organization engine that may deal with more elaborate organizational issues and, on the other hand, implementing the extensions of EI0 that the organizational engine entails.

Programme. Our intention is to be able to build and support large information systems that are effective and flexible. What we are doing is devising ways of stating how an organization should work and, in fact, making sure that the prescribed interactions are isomorphic with the actions that happen in the information system. We realize that there is a tension between the detailed specification of procedures and the need for continuous updating of the system; since we know that the ideal functioning will keep changing, we want the actual operation to change as well. In order to achieve this flexibility we are following three paths:

– Making the information system agent-pervasive. This way we make sure that all interactions in the CIS become, in fact, illocutions in the organizational engine, and then we may profit from all the advantages that electronic institutions bring to expressing complex interaction protocols and enforcing them.
– Simultaneously, we are going for pluggable components —performative scripts, business rule and knowledge repositories, server agents, user agents— that are easy to specify, assemble, tune and update, so that we can quickly deploy interaction protocols that are stable, and update those protocols parsimoniously.
– We count on staff agents that are reliable and disciplined (since they are part of the organization) and, because they may have better decision-making capabilities and because we can localize their knowledge, we can build into them the flexibility needed to accommodate less frequent interactions or atypical situations (and thus simplify interaction protocols) and also to accommodate more volatile conventions (and thus save us from more frequent updates).
We entertain the expectation that we will be able to incorporate autonomic features into our systems.

Next steps. An outline. In the top-down strategy we are (a) looking into the formal and conceptual extensions of EI0 so that we may handle complex performative structures and assemble them from simpler performative scripts; (b) devising ways of expressing deontological conventions declaratively, so that we may specify performative scripts declaratively and logically enforce them; (c) defining the guidelines for a grounding language that translates EI-manageable illocutions into CIS component actions.

In the bottom-up approach we will (a) start enriching server agents so they can interact with EI0 performative structures, with "more atomic" business rules, and with the other application domain entities; (b) develop user agents and interaction devices further, so that we have better access and control for external users of the system; (c) start implementing actual performative scripts, staff agents and appropriate business rules, on one side, and a grounding language to handle their interactions, on the other; (d) extend the current WF-engine to handle (c).

On the implementation front we foresee (a) a prototype organization engine, built on top of EIDE, to handle the bottom-up developments; (b) an extension of the ISLANDER tool (ISLAplus) to handle the new expressiveness of the organizational engine; (c) a leaner version of EIDE that instantiates an ISLAplus specification into an organization engine and enacts it on a CIS.
Acknowledgments This research is partially funded by the Spanish Ministry of Education and Science (MEC) through the Web-i-2 project (TIC-2003-08763-C02-00) and by private funds of the TCA Research Group.
References

1. Howard E. Aldrich, editor. Organizations and Environments. Prentice Hall, 1979.
2. Josep Lluis Arcos, Marc Esteva, Pablo Noriega, Juan A. Rodríguez-Aguilar, and Carles Sierra. Environment engineering for multiagent systems. Engineering Applications of Artificial Intelligence, (submitted), October 2004.
3. Marc Esteva, Juan A. Rodríguez-Aguilar, Bruno Rosell, and Josep Lluis Arcos. AMELI: An agent-based middleware for electronic institutions. In Third International Joint Conference on Autonomous Agents and Multi-agent Systems (AAMAS'04), pages 236–243, New York, USA, July 19–23, 2004.
4. Andres Garcia-Camino, Pablo Noriega, and Juan Antonio Rodríguez-Aguilar. Implementing norms in electronic institutions. In Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, 2005.
5. Andres Garcia-Camino, Juan Antonio Rodríguez-Aguilar, Carles Sierra, and Wamberto Vasconcelos. A Distributed Architecture for Norm-Aware Agent Societies. In Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, Declarative Agent Languages and Technologies workshop (DALT'05), 2005. (forthcoming).
6. Jay Liebowitz and Tom Beckman. Knowledge Organizations. Saint Lucie Press, Washington, DC, 1998.
7. James G. March and Herbert A. Simon. Organizations. John Wiley and Sons, New York, USA, 1958.
8. Pablo Noriega. Agent Mediated Auctions: the Fishmarket Metaphor. PhD thesis, Universitat Autònoma de Barcelona (UAB), Bellaterra, Catalonia, Spain, 1997. Published by the Institut d'Investigació en Intel·ligència Artificial, Monografies de l'IIIA Vol. 8, 1999.
9. Douglas C. North. Institutions, Institutional Change and Economic Performance. Cambridge University Press, New York, NY, USA, 1990.
10. Armando Robles and Pablo Noriega. A Framework for building EI-enabled Intelligent Organizations using MAS technology. In M.P. Gleizes, G. Kaminka, A. Nowé, S. Ossowski, K. Tuyls, and K. Verbeeck, editors, Proceedings of the Third European Conference in Multi Agent Systems (EUMAS05), pages 344–354, Brussels, Belgium, December 2005. Koninklijke Vlaamse Academie Van Belgie Voor Wetenschappen en Kunsten.
11. Armando Robles, Pablo Noriega, Francisco Cantú, and Rubén Morales. Enabling Intelligent Organizations: An Electronic Institutions Approach for Controlling and Executing Problem Solving Methods. In Alexander Gelbukh, Álvaro Albornoz, and Hugo Terashima-Marín, editors, Advances in Artificial Intelligence: 4th Mexican International Conference on Artificial Intelligence, pages 275–286, Monterrey, NL, Mexico, November 2005. Springer-Verlag.
12. Armando Robles, Pablo Noriega, Michael Luck, and Francisco Cantú. Multi Agent approach for the representation and execution of Medical Protocols. In Fourth Workshop on Agents Applied in Healthcare (ECAI 2006), Riva del Garda, Italy, August 2006.
13. Armando Robles, Pablo Noriega, Marco Robles, Hector Hernandez, Victor Soto, and Edgar Gutierrez. A Hotel Information System implementation using MAS technology. In Industry Track Proceedings, Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), pages 1542–1548, Hakodate, Hokkaido, Japan, May 2006.
14. Pamela Tolbert and Lynn Zucker. Chapter The Institutionalization of Institutional Theory, pages 175–190.
Modeling and Simulation of Mobile Agents Systems Using a Multi-level Net Formalism

Marina Flores-Badillo, Mayra Padilla-Duarte, and Ernesto López-Mellado

CINVESTAV Unidad Guadalajara, Av. Científica 1145, Col. El Bajío, 45010 Zapopan, Jal., México
Abstract. The paper proposes a modeling methodology allowing the specification of multi mobile agent systems using nLNS, a multi-level Petri net based formalism. The prey-predator problem is addressed, and a modular and hierarchical model for this case study is developed. An overview of an nLNS simulator is presented through the prey-predator problem.
1 Introduction

Nowadays Multi-Agent Systems (MAS) are a distributed computing paradigm that is attracting the attention of many researchers in AI applications. Petri nets and their extensions have been widely used for modeling, validating and implementing large and complex software systems. In the field of MAS, high-level Petri nets have been well adopted for modeling agents or parts of them, because these formalisms allow representing complex behavior in a clear and compact manner. In [5] and [6] Valk's Elementary Object System [8] has been extended into a less restrictive definition of a three-level net formalism for the modeling of mobile physical agents; later, in [7], that definition was extended to a multi-level Petri net system, the nLNS formalism. In this paper the nLNS formalism is used to show that it is possible to model the well-known Prey-Predator problem in a modular and hierarchical way.

This paper is organized as follows. Section 2 presents a version of the Prey-Predator problem and describes the nLNS formalism. Section 3 presents a methodology for building modular and hierarchical models. Finally, Section 4 gives an overview of a software tool for simulating nLNS models.
2 Background

In this section we describe the version of the prey-predator problem considered here and present an overview of the nLNS formalism.

2.1 The Prey-Predator Problem

The original version of the Prey-Predator pursuit problem was introduced by Benda et al. [1] and consisted of four agents (predators) trying to capture an agent (prey) by surrounding it from four directions on a grid world (Fig. 1). This problem (and several

A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1128–1138, 2006. © Springer-Verlag Berlin Heidelberg 2006
variations) has been studied by several researchers. Kam-Chuen et al. [2] used a genetic algorithm to evolve multi-agent languages for agents; they show that the resulting behavior of the communicating multi-agent system is equivalent to that of a Mealy finite state machine whose states are determined by the concatenation of the strings in the agents' communication language. Haynes and Sen [3] used genetic programming to evolve predator strategies and showed that a linear prey (which picks a random direction and continues in that direction for the rest of the trial) was impossible to capture. In [4] Chainbi et al. used CoOperative Objects to show how this language may be used for the design and modeling of the problem.
Fig. 1. The prey-predator problem: the prey is captured by predators (capture situation)
The version of the Prey-Predator problem considered here is specified below:

1. There is only one prey and one predator.
2. The world is an n×n grid.
3. Both prey and predator agents are allowed to move in only four orthogonal directions: Up, Right, Left, Down.
4. Prey and predator are allowed to choose when to move.
5. The predator can perceive the prey only if it is in its perception scope (Fig. 2); prey and predator perceive each other at the same time.
6. When a prey has perceived a predator, it chooses a direction at random, excluding the direction of the predator's position.
7. When a predator has perceived a prey, it moves in the direction of the prey.
8. Attack condition: a prey is attacked when its position is occupied by a predator (Fig. 3).
9. The prey dies once it has received three attacks (the prey is allowed to recover its strengths).

2.2 A Multi-level Net Formalism

This section includes only an overview of nLNS; a more precise definition is detailed in [7]. An nLNS model consists mainly of an arbitrary number of nets organized in n levels according to a hierarchy; n depends on the degree of abstraction that is desired in the model. A net may handle, as tokens, nets of deeper levels and symbols; the nets of level n permit only symbols as tokens, similarly to CPN. Interactions among nets are declared through symbolic labeling of transitions.
Fig. 2. The prey-predator problem: a) a predator and its perception scope, b) the prey is on the predator's perception scope
Fig. 3. The prey-predator problem: attack condition
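The grid rules and the perception and attack conditions of Sect. 2.1 can be made concrete with a minimal sketch. The function names, the coordinate convention and the Chebyshev-distance perception scope are our assumptions; the paper specifies the scope only through Fig. 2:

```python
# A minimal sketch of rules 2, 3, 5 and 8. The grid size n, the coordinate
# convention and the Chebyshev-distance scope are assumptions of ours.
MOVES = {"Up": (0, -1), "Down": (0, 1), "Left": (-1, 0), "Right": (1, 0)}

def step(pos, direction, n):
    """Move one cell orthogonally, staying inside the n-by-n grid."""
    dx, dy = MOVES[direction]
    x, y = pos[0] + dx, pos[1] + dy
    return (x, y) if 0 <= x < n and 0 <= y < n else pos

def perceives(predator, prey, scope=1):
    """Mutual perception when the prey is within the perception scope."""
    return max(abs(predator[0] - prey[0]), abs(predator[1] - prey[1])) <= scope

def attacked(predator, prey):
    """Attack condition: the prey's position is occupied by the predator."""
    return predator == prey
```

Note that perception is symmetric by construction, matching rule 5 (prey and predator perceive each other at the same time).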
Figure 4 sketches pieces of the components of an nLNS. Level 1 is represented by the net NET1; level 2 by the nets NET2,1 and NET2,2; the nets NET3,1, NET3,2, NET3,3, and NET3,4 compose level 3; and the nets NET4,1, NET4,2, NET4,3 form level 4. A net of level i is a tuple NETi = (typeneti, μi), where typeneti is composed of a PN structure, the arc weights (π((p, t), lab) or π((t, p), lab)) expressed as multisets of variables, and a transition labeling declaring the net interaction; μi is the marking function. An nLNS model, called a net system, is an n-tuple NS = (NET1, NET2, …, NETn), where NET1 is the highest level net, and NETi = {NETi,1, NETi,2, ..., NETi,r} is a set of r nets of level i.

The components of a model may interact among themselves through synchronization of transitions. The synchronization mechanism is included in the enabling and firing rules of the transitions; it establishes that two or more transitions labeled with the same symbol must be synchronized. A label may have the attributes ≡, ↓, ↑, which express local, inner, and external synchronization, respectively. A transition t of a net NETi of level i is enabled with respect to a label lab if:

1. There exists a binding bt that assigns tokens to the set of variables appearing in all π((p, t), lab).
2. For every p ∈ •t, the marking μi(p) covers the multiset obtained by applying bt to π((p, t), lab).
3. If lab has the attribute ↓, the enabling of the transitions labeled with lab↑ belonging to the lower level nets residing in •t is also required. These transitions fire simultaneously with t, and the lower level nets and symbols declared by π((p, t), lab) are removed from the input places while those declared by π((t, p), lab) are added to the output places.
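As an informal illustration of the enabling test, one can treat each input arc weight π((p, t), lab) as a multiset of variables and the binding bt as a variable-to-token map. This is a simplification of ours that ignores synchronization attributes, not the formal nLNS rule:

```python
# Informal illustration of the enabling test for a label: each input arc
# weight is a multiset of variables, the binding maps variables to tokens,
# and the marking of every input place must cover the bound multiset.
from collections import Counter

def enabled(pre_arcs, marking, binding):
    """pre_arcs: {place: [variables]}; marking: {place: Counter of tokens}."""
    for place, variables in pre_arcs.items():
        needed = Counter(binding[v] for v in variables)  # b_t applied to the arc
        for token, count in needed.items():
            if marking[place][token] < count:            # marking must cover it
                return False
    return True

# A level 2 net token "net_a" in place p1 enables t for the binding v -> net_a.
marking = {"p1": Counter({"net_a": 1, "s": 2})}
```

Tokens here stand for both symbols and nets of deeper levels, which is the essential feature of nLNS markings.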
Fig. 4. Piece of 4-LNS
3 Modeling the Prey-Predator Problem

3.1 General Strategy

The use of nLNS induces a modular and hierarchical modeling methodology that allows describing separately the behavior of all the involved components and then integrating such models into a global one through transition synchronization. For our version of the Prey-Predator problem we consider that three levels are enough for the modeling: the first level structures the environment through which the prey and predator agents move, the second level represents the behavior of the prey and predator agents, and the third level describes a specific item of the prey agent.
3.2 Model of the Environment

To simplify the model we consider a 3×3 grid world; however, this structure can be generalized by adding more nodes to the net. Figure 5 shows the structure of the environment, a net of level 1 in which places represent the regions of the grid, connected to each other through transitions. The arcs represent the directions in which each agent is allowed to move (right, left, up, down). The initial marking of the system has a net of level 2 that represents the prey agent, and another net of level 2 that represents the predator agent.
Fig. 5. Environment model
Since the predator and the prey have the same detection scope, both of them use the transitions marked with the word detect. There is one of these transitions for each place that belongs to the detection scope. We use the transitions marked with MovTo to represent the allowed directions in which both prey and predator may move. For example, the predator is allowed to move, from its original position, in the right, left, up and down directions. The prey is allowed to move, from its original position, in the left, down, and up directions. In the second level nets we can observe that once detection has happened, the prey will move in one of the allowed directions, except in the direction of the predator. The predator will move in the prey's direction if the transition marked hungry has occurred; otherwise it will ignore the prey's position. If both predator and prey are in the same place, the predator will attack (attack) the prey until the prey runs away or dies.

3.3 Model of the Predator

The predator agent is modeled as a level 2 net (Fig. 6). This net describes the general behavior of a predator. The initial marking states that the predator is in a
satiety state; thus, if it detects a prey with the detect transition, it will ignore the prey's position and can move in the four allowed directions. The labels on the transition marked with detect have external synchronization with the environment. The transition marked with hungry is used to change the behavior of the predator: once it has fired, when a detect transition fires the predator will ignore the allowed directions and move to the position occupied by the prey (MoveDirect, whose label has external synchronization with the environment). The predator can also move freely (MovFree) in search of a prey (if it has not detected one yet). When a predator is in the same position as a prey, it will attack the prey (the transition marked with attack, which has external synchronization with the environment). If the prey runs away, the predator will return to the search. If the prey dies (it received three attacks), the predator will try to find another prey. If we wanted to modify the problem to include more than one predator, we would need more levels for modeling agent cooperation and communication protocols.
Fig. 6. General Model of the Predator
3.4 Model of the Prey

Figure 7 shows the general behavior of a prey agent. The initial marking of a prey is the place marked with wander. The prey is allowed to move in four orthogonal directions (Up, Down, Right, Left) using the labels on the transitions marked with MoveTo, which are synchronized with the level 1 net describing the environment. If a prey detects a predator (detect), it will try to escape from the predator, moving in the allowed directions except the predator's direction. The prey only moves if it has strengths; for that we use a place marked with strengths, which represents the maximal number of times that a predator can attack the prey before killing it. When this place loses all its tokens, the prey dies (die); however, the prey has the opportunity to recover its strengths when it is in the state represented by the place marked wander. If the prey still has strengths, it will be able to run away from a predator attack. A net of level 3 helps the prey to "remember" the last visited position, to avoid going back in that direction.
1134
M. Flores-Badillo, M. Padilla-Duarte, and E. López-Mellado
Fig. 7. General Model of the Prey
3.4.1 Level 3 Net

This net controls the movement of a prey (Fig. 8). When a prey detects a predator, the transition marked detect fires; then a symbol is added to the place marked perceived. This symbol represents the current direction of the predator: if the predator is on the right, this place holds an R. Once this net holds that symbol, it enables only the labels representing the directions the prey may use to try to run away from the predator, excluding the movement to the right (in this example), since a predator is known to be in that direction and the normal behavior of a prey is to run away from the predator. All the movements'
Fig. 8. Level 3 Net. Prey’s Internal Net
labels have external synchronization with the prey net. When the prey has chosen a direction, the place marked run holds a symbol representing the last direction. If the prey came from the left, this place holds an L, and the enabling rules work in the same way as for the previous place.
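As a sketch of the idea, the enabling logic of this level-3 net can be mimicked in a few lines of Python. The function and direction names below are ours, not the tool's; this is an illustration of the rule, not the nLNS implementation:

```python
# Hypothetical sketch of the level-3 net's enabling logic: the "perceived"
# place blocks the direction where the predator was seen, and the "run"
# place blocks the direction the prey came from, so the prey never walks
# into the predator and never backtracks.
DIRECTIONS = {"Up", "Down", "Left", "Right"}

def enabled_moves(perceived, came_from):
    """Movement labels the level-3 net would enable for the prey.

    perceived: direction of the detected predator (e.g. "Right"), or None.
    came_from: direction the prey arrived from (e.g. "Left"), or None.
    """
    blocked = {d for d in (perceived, came_from) if d is not None}
    return DIRECTIONS - blocked

# A prey that sees a predator to the right and came from the left
# may only move up or down.
print(sorted(enabled_moves("Right", "Left")))  # ['Down', 'Up']
```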
4 Model Simulation

The simulation of the Prey-Predator problem has been performed through the execution of the 3-level net model described above. This task has been possible with the help of a software tool that allows the visual editing and interactive execution of n-level net models expressed in nLNS. The tool provides facilities for the interactive execution of the model: for a current marking, the system indicates which transitions are enabled and with respect to which labels; the user then selects which transition to fire. Below we include several views of the edited model. Every net is built in a single window and can be saved and updated for model adjustments. In Fig. 9 we can see that the allowed directions for the prey and the predator are enabled with the initial marking; the perception label is enabled as well (PreyLeft_PredRight, MovPredUp, MovPredDown, MovPredLeft, MovPredRight, MovPreyUp, MovPreyDown). In Fig. 10 we can observe the transitions enabled with the initial marking (PreyLeft_PredRight, MovPreyUp, MovPreyDown). In Fig. 11 we can see the transitions enabled with the initial marking (PreyLeft_PredRight, MovPredUp, MovPredDown, MovPredLeft, MovPredRight, hungry). In Fig. 12 we have fired the transitions with the labels hungry, then PreyLeft_PredRight, then MovPreyUp, and finally MovPredLeft; the agents can perceive each other again (PreyUp_PredDown). As shown in Fig. 13, we have fired the transition with the label PreyUp_PredDown. On the second place of the net we have the color D, which represents the predator's direction; so the prey can only move to the right (MovPreyRight). In Fig. 14 we have
Fig. 9. Simulator screen (environment net)
Fig. 10. Simulator screen (prey net)
Fig. 11. Simulator screen (predator net)
Fig. 12. Simulator screen (environment net 2)
Fig. 13. Simulator screen (Level 3 net)
Fig. 14. Simulator screen (Level 3 net 2)
Fig. 15. Simulator screen (predator net)
Fig. 16. Simulator screen (prey net)
Modeling and Simulation of Mobile Agents Systems
1137
fired the transition with the label MovPreyRight. On the third place we have the color L, which means that the prey came from the left; the prey will not go back in that direction. Only the labels MovPreyDown and MovPreyRight are enabled. In Fig. 15 the prey and the predator are in the same place, so the predator has attacked the prey and can attack again. In Fig. 16 the predator has attacked the prey 3 times; the prey has no strength left, so it dies. In Fig. 17, after the prey has died, the predator restarts the search for another prey; it cannot perceive the dead prey.
Fig. 17. Simulator Screen (Environment net)
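The simulator's basic execution cycle described above — compute the transitions enabled by the current marking, let the user choose one, fire it — can be sketched with a minimal place/transition interpreter. This is hypothetical Python, not the nLNS tool, and the toy net fragment is illustrative only:

```python
# Minimal place/transition interpreter sketching the enable-then-fire
# cycle of the simulator (hypothetical; nLNS labels and colors omitted).
def enabled(transitions, marking):
    """A transition is enabled when every input place holds enough tokens."""
    return [t for t, (pre, post) in transitions.items()
            if all(marking.get(p, 0) >= n for p, n in pre.items())]

def fire(transitions, marking, t):
    """Fire transition t: consume input tokens, produce output tokens."""
    pre, post = transitions[t]
    for p, n in pre.items():
        marking[p] -= n
    for p, n in post.items():
        marking[p] = marking.get(p, 0) + n

# Toy fragment: a prey in "wander" may either move or detect a predator.
transitions = {
    "MovPreyUp": ({"wander": 1}, {"wander": 1}),
    "detect":    ({"wander": 1}, {"escape": 1}),
}
marking = {"wander": 1}
print(enabled(transitions, marking))   # ['MovPreyUp', 'detect']
fire(transitions, marking, "detect")
print(enabled(transitions, marking))   # []
```

In the actual tool the user picks the transition to fire interactively; here the choice is hard-coded for brevity.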
5 Implementation Issues

Prototypes of multiple mobile agent systems have been developed from nLNS specifications; they have been coded in Java using the facilities of JADE, according to the methodology presented in [7]. The software has been distributed and tested on several personal computers networked in a LAN.
6 Conclusions

Modeling and simulation can be used in the earliest stages of the development life cycle of multiple mobile agent systems. This allows validating the functional requirements, avoiding backtracking in software design and implementation.
The use of nLNS yields modular and hierarchical models whose components are nets with, in most cases, a simple structure. The nets are thus first conceived separately and then integrated within the pertinent models and interrelated through transition synchronization. The Prey-Predator problem presented herein illustrates several key issues inherent to interactive mobile agent systems. Current research addresses the automated generation of Java code from specifications in nLNS using the tool described herein.
References

[1] M. Benda, V. Jagannathan, and R. Dodhiawalla. "On Optimal Cooperation of Knowledge Sources". Technical Report BCS-G2010-28, Boeing AI Center, 1985.
[2] Kam-Chuen Jim and C. Lee Giles. "Talking Helps: Evolving Communicating Agents for the Prey-Predator Pursuit Problem". Artificial Life 6(3): 237-254, 2000.
[3] Thomas Haynes and Sandip Sen. "Evolving behavioral strategies in predators and prey". IJCAI-95 Workshop on Adaptation and Learning in Multiagent Systems, 1995.
[4] W. Chainbi, C. Hanachi, and C. Sibertin-Blanc. "The multiagent prey-predator problem: A Petri net solution". Proceedings of the IMACS-IEEE/SMC Conference on Computational Engineering in Systems Applications (CESA'96), pp. 692-697, Lille, France, July 1996.
[5] E. López and H. Almeyda. "A three-level net formalism for the modeling of multiple mobile robot systems". Proc. IEEE Int. Conf. on Systems, Man, and Cybernetics, Oct. 5-8, 2003, pp. 2733-2738.
[6] N. I. Villanueva. "Multilevel modeling of batch processes". MSc. Thesis, Cinvestav-IPN, Guadalajara, Jalisco, México, Dec. 2003.
[7] R. Sánchez Herrera and E. López Mellado. "Modular and hierarchical modeling of interactive mobile agents". IEEE International Conference on Systems, Man and Cybernetics, Oct. 2004, pp. 1740-1745, vol. 2.
[8] R. Valk. "Petri nets as token objects: an introduction to elementary object nets". Int. Conf. on Application and Theory of Petri Nets, 1998, pp. 1-25, Springer-Verlag.
Using AI Techniques for Fault Localization in Component-Oriented Software Systems

Jörg Weber and Franz Wotawa

Institute for Software Technology, Technische Universität Graz, Austria
{jweber, wotawa}@ist.tugraz.at
Abstract. In this paper we introduce a technique for runtime fault detection and localization in component-oriented software systems. Our approach allows for the definition of arbitrary properties at the component level. By monitoring the software system at runtime we can detect violations of these properties and, most notably, also locate possible causes for specific property violation(s). Relying on the model-based diagnosis paradigm, our fault localization technique is able to deal with intermittent fault symptoms and it allows for measurement selection. Finally, we discuss results obtained from our most recent case studies.
1 Introduction
Several research areas are engaged in the improvement of software reliability during the development phase, for example research on testing, debugging, or formal verification techniques like model checking. Unfortunately, although substantial progress has been made in these fields, we have to accept the fact that it is not possible to eliminate all faults in complex systems at development time. Thus, it is highly desirable to augment complex software systems with autonomous runtime fault detection and localization capabilities, especially in systems which require high reliability. The goal of our work is to detect and, in particular, to locate faults at runtime without any human intervention. Though there are numerous techniques like runtime verification which aim at the detection of faults, there is only little work which deals with fault localization at runtime. However, it is necessary to locate faults in order to be able to automatically perform repair actions. Possible repair actions are, for example, the restart of software components or switching to redundant components. In this paper we propose a technique for runtime diagnosis in component-oriented software systems. We define components as independent computational modules which have no shared memory and which communicate among each other by the use of events, which can contain arbitrary attributes. We suppose that the interactions are asynchronous. We assume that, as is often the case in practice, no formalized knowledge about the application domain exists. Moreover, we require the runtime diagnosis to
This research has been funded in part by the Austrian Science Fund (FWF) under grant P17963-N04. Authors are listed in alphabetical order.
A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1139–1149, 2006. c Springer-Verlag Berlin Heidelberg 2006
impose low runtime overhead in terms of computational power and memory consumption. Ideally, augmenting the software with fault detection and localization functionality necessitates no change to the software. Moreover, to avoid damage which could be caused by a faulty system, we have to achieve quick detection, localization, and repair. Another difficulty is the fact that fault symptoms are often intermittent. Our approach allows one to assign user-defined properties to components and connections. The target system is continuously monitored by rules, i.e., pieces of software which detect property violations. The fact that the modeler can implement and integrate arbitrary rules provides sufficient flexibility to cope with today's software complexity. In practice, properties and rules will often embody elementary insights into the software behavior rather than complete specifications. The reason is that, due to the complexity of software systems, often no formalized knowledge of the behavior exists and the informal specifications are coarse and incomplete. Our model is similar to that proposed in [1], which was previously used for the diagnosis of hardware designs [2], but we argue that the approach in [1] is too abstract to capture the complex behavior of large software components. In order to enable fault localization, complex dependencies between properties can be defined. When a violation occurs, we locate the fault by employing the model-based diagnosis (MBD) paradigm [3,4]. In terms of the classification introduced in [5], we propose a state-based diagnosis approach with temporal behavior abstraction. We provide a formalization of the architecture of a component-based software system and of the property dependencies, and we outline an algorithm for computing the logical model. Moreover, we present a runtime fault detection and localization algorithm which allows for measurement selection.
Furthermore, we give examples related to the control software of autonomous robots and discuss the results of case studies. The concrete models which we created for this system mainly aim at the diagnosis of severe faults like software crashes and deadlocks.
2 Introduction to the Model Framework
Figure 1 illustrates a fragment of a control system for an autonomous soccer robot as our running example. This architectural view comprises basically independent components which communicate by asynchronous events. The connections between the components depict the data flows. The Vision component periodically produces events containing position measurements of perceived objects. The Odometry periodically sends odometry data to the WorldModel (WM). The WM uses probability-based methods for tracking object positions. For each event arriving at one of its inputs, it creates an updated world model containing estimated object positions. The Kicker component periodically creates events indicating whether or not the robot owns the ball. The Planner maintains a knowledge base (KB), which includes a qualitative representation of the world model, and chooses abstract actions based on this knowledge. The content of the KB is periodically sent to a monitoring application.
[Figure 1 (architectural view): Vision and Odometry feed the WM (WorldModel) via the connections OM and MD; the WM feeds the Planner via WS; the Kicker feeds the Planner via HB; the Planner outputs PS. Connections: OM = Object Measurement, WS = World State, MD = Motion Delta, PS = Planner State, HB = HasBall.]

Fig. 1. Architectural view on the software system of our example
We propose a behavior model of software components which abstracts from both the values (content) of events and the temporal behavior. Our model assigns sets of properties, i.e. constraints, to all components and connections. Properties may capture temporal constraints, constraints on the values of events, or a combination of both. At runtime, the system is continuously monitored by observers. Observers comprise rules, i.e., pieces of software which monitor the system and detect property violations. In the logical model the properties are represented by property constants. We use the proposition ok(x, pr, s), which states that the property pr holds for the component or connection x during a certain period of time, namely the time between two discrete observation snapshots s_{i-1} and s_i. While the system is continuously monitored by the rules, the diagnosis itself is based on multiple snapshots which are obtained by polling the states of the rules (violated or not violated) at discrete time points. Each observation belongs to a certain snapshot, and we use the variable s as a placeholder for a specific snapshot. The diagnosis accounts for the observations of all snapshots. This approach to MBD is called multiple-snapshot diagnosis or state-based diagnosis [5]. For the sake of brevity we will often use the notation ok(x, pr) instead of ok(x, pr, s). An example for a component-related property is np, expressing that the number of processes (threads) of a correctly working component c must exceed a certain threshold. In our running example, pe denotes that events must occur periodically on a connection, and cons_e denotes that the value of events on a certain connection must not contradict the events on connection e. Figure 2 depicts the properties which we assign to the connections. For example, the rules for ok(WS, cons_OM) check if the computed world models on connection WS correspond to the object position measurements on connection OM.
Ideally, an observer would embody a complete specification of the tracking algorithm used in the WM component. In practice, however, often only incomplete and coarse specifications of the complex software behavior are available. Therefore, the observers rely on simple insights which require little expert knowledge. The rules of the observer for ok(WS, cons_OM) could check whether all environment objects which are perceived by the Vision are also part of the computed world models, disregarding the actual positions of the objects. Our experience
has shown that such abstractions often suffice to detect and locate severe faults like software crashes or deadlocks. In order to enable the localization of faults after the detection of property violation(s), we define functional dependencies between properties on outputs and inputs, as shown in Fig. 3. For example, ¬AB(Planner) → ok(PS, pe) indicates that the Planner is supposed to produce events on connection PS periodically, regardless of the inputs to the Planner. AB denotes abnormality of a component. ¬AB(Planner) ∧ ok(WS, cons_OM) → ok(PS, cons_OM) states that, if the world states on WS are consistent with the object measurements on OM, then the same must hold for the planner states on PS; i.e., ok(PS, cons_OM) depends on ok(WS, cons_OM). Note that a property on an output connection may also depend on multiple input properties. This is the case for the property ok(WS, pe).
[Figure 2 (properties assigned to connections): ok(OM, pe) on connection OM; ok(MD, pe) on MD; ok(WS, pe) and ok(WS, cons_OM) on WS; ok(HB, pe) on HB; ok(PS, pe), ok(PS, cons_OM), and ok(PS, cons_HB) on PS. WM dependences: ok(WS, pe) depends on ok(OM, pe) and ok(MD, pe); ok(WS, cons_OM) depends on ok(OM, pe).]

Fig. 2. The model assigns properties to connections
¬AB(Vision) → ok(OM, pe)
¬AB(Odometry) → ok(MD, pe)
¬AB(WM) ∧ ok(OM, pe) ∧ ok(MD, pe) → ok(WS, pe)
¬AB(WM) ∧ ok(OM, pe) → ok(WS, cons_OM)
¬AB(Kicker) → ok(HB, pe)
¬AB(Planner) → ok(PS, pe)
¬AB(Planner) ∧ ok(WS, cons_OM) → ok(PS, cons_OM)
¬AB(Planner) → ok(PS, cons_HB)
for each component c: ¬AB(c) → ok(c, np)

Fig. 3. The dependencies between properties in our example
To illustrate our basic approach we outline a simple scenario by locating the cause for an observed malfunctioning. We assume a fault in the WM causing the world state WS and, as a consequence, the planner state PS to become inconsistent with the object position measurements OM. As a result, the observer for ok(PS, cons_OM, s) detects a violation, i.e. ¬ok(PS, cons_OM, s_0) is an observation for snapshot 0. All other observers are initially disabled, i.e. they do not provide any observations. Based on this observation, we can compute diagnosis candidates by employing the MBD approach [3,4] for this observation snapshot. By computing all (subset-minimal) diagnoses, we obtain three single-fault diagnosis candidates, namely {AB(Vision)}, {AB(WM)}, and {AB(Planner)}. After activating observers for the output connections of these candidates, we obtain the second observation snapshot ok(OM, pe, s_1), ok(WS, pe, s_1), ¬ok(WS, cons_OM, s_1), ok(PS, pe, s_1), ¬ok(PS, cons_OM, s_1), and ok(PS, cons_HB, s_1). This leads to the single diagnosis {AB(WM)}.
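The candidate computation in this scenario can be reproduced with a small brute-force, consistency-based diagnoser. The following Python sketch is our own encoding (the paper uses Reiter's hitting-set algorithm rather than enumeration): it forward-chains the Horn clauses of Fig. 3 and returns the subset-minimal AB-sets consistent with the observations. Snapshot indices are left implicit, since each snapshot is checked separately:

```python
from itertools import combinations

# Horn rules from Fig. 3: (component, antecedent ok-atoms, consequent ok-atom)
RULES = [
    ("Vision",   [], "ok(OM,pe)"),
    ("Odometry", [], "ok(MD,pe)"),
    ("WM",       ["ok(OM,pe)", "ok(MD,pe)"], "ok(WS,pe)"),
    ("WM",       ["ok(OM,pe)"],              "ok(WS,cons_OM)"),
    ("Kicker",   [], "ok(HB,pe)"),
    ("Planner",  [], "ok(PS,pe)"),
    ("Planner",  ["ok(WS,cons_OM)"],         "ok(PS,cons_OM)"),
    ("Planner",  [], "ok(PS,cons_HB)"),
]
COMPONENTS = ["Vision", "Odometry", "WM", "Kicker", "Planner"]

def consistent(abnormal, pos_obs, neg_obs):
    """Forward-chain the ok-atoms derivable when the components in
    `abnormal` are AB; inconsistent if a derived atom was observed false."""
    derived = set(pos_obs)
    changed = True
    while changed:
        changed = False
        for comp, body, head in RULES:
            if comp not in abnormal and head not in derived \
               and all(a in derived for a in body):
                derived.add(head)
                changed = True
    return derived.isdisjoint(neg_obs)

def minimal_diagnoses(snapshots):
    """All subset-minimal AB-sets consistent with every snapshot."""
    diagnoses = []
    for k in range(len(COMPONENTS) + 1):
        for delta in combinations(COMPONENTS, k):
            if any(set(d) <= set(delta) for d in diagnoses):
                continue  # a subset is already a diagnosis: not minimal
            if all(consistent(set(delta), pos, neg) for pos, neg in snapshots):
                diagnoses.append(delta)
    return diagnoses

# Snapshot 0: only the observer for ok(PS, cons_OM) reports a violation.
s0 = (set(), {"ok(PS,cons_OM)"})
print(minimal_diagnoses([s0]))
# [('Vision',), ('WM',), ('Planner',)]

# Snapshot 1: the additional observers narrow it down to the WM.
s1 = ({"ok(OM,pe)", "ok(WS,pe)", "ok(PS,pe)", "ok(PS,cons_HB)"},
      {"ok(WS,cons_OM)", "ok(PS,cons_OM)"})
print(minimal_diagnoses([s0, s1]))
# [('WM',)]
```

The enumeration is exponential in general; it is only feasible here because the example has five components, which is exactly why the paper relies on Reiter's algorithm and small diagnosis cardinalities.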
3 Formalizing the Model Framework
In Definition 1 we introduce a model which captures the architecture of a component-oriented software system and the dependencies between properties.

Definition 1 (SAM). A software architecture model (SAM) is a tuple (COMP, CONN, Φ, ϕ, out, in) with:

– a set of components COMP
– a set of connections CONN
– a (finite) set of properties Φ
– a function ϕ : COMP ∪ CONN → 2^Φ, assigning properties to a given component or connection
– a function out : COMP → 2^CONN, returning the output connections for a given component
– the (partial) function in : COMP × CONN × Φ → 2^(CONN × Φ), which expresses the functional dependencies between the inputs and outputs of a given component c. For all output connections e ∈ out(c) and for each property pr ∈ ϕ(e), it returns a set of tuples (e', pr'), where e' is an input connection of c and pr' ∈ ϕ(e') is a property assigned to e'.
This definition allows for the specification of a set of properties Φ for a specific software system. We introduce the function ϕ in order to assign properties to components and connections. The function in returns, for each property pr of an output connection, the set PR of input properties on which pr depends. For example, the part of the SAM which relates to the WM component and its output connection WS is defined as follows (Fig. 3):

ϕ(WM) = {np}, ϕ(WS) = {pe, cons_OM, eo}
out(WM) = {WS}
in(WM, WS, pe) = {(OM, pe), (MD, pe)}
in(WM, WS, cons_OM) = {(OM, pe)}

The logical model is computed by Alg. 1. Based on a SAM, it generates the logical system description SD.

Algorithm 1. The algorithm for computing the logical model
Input: The SAM. Output: The system description SD.
computeModel(COMP, CONN, Φ, ϕ, out, in)
(1) SD := {}.
(2) For all c ∈ COMP:
(3)   For all pr ∈ ϕ(c): add ¬AB(c) → ok(c, pr, s) to SD.
(4)   For all e ∈ out(c), for all pr ∈ ϕ(e): add
        ¬AB(c) ∧ ⋀_{(e', pr') ∈ in(c, e, pr)} ok(e', pr', s) → ok(e, pr, s)
      to SD.
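A possible rendering of Algorithm 1 in Python, with the SAM given as plain dictionaries. The data layout and the clause string syntax are our assumptions for illustration, not part of the paper:

```python
# Hypothetical rendering of Algorithm 1: generate the clauses of SD as
# strings from a SAM given as plain dictionaries.
def compute_model(comps, phi, out, dep):
    """comps: components; phi: properties per component/connection;
    out: output connections per component; dep: the function `in` as a
    dict (comp, conn, prop) -> list of (conn', prop') pairs."""
    sd = []
    for c in comps:
        for pr in phi.get(c, []):                      # line (3)
            sd.append(f"~AB({c}) -> ok({c},{pr},s)")
        for e in out.get(c, []):                       # line (4)
            for pr in phi.get(e, []):
                body = " & ".join(f"ok({e2},{p2},s)"
                                  for e2, p2 in dep.get((c, e, pr), []))
                ante = f"~AB({c})" + (f" & {body}" if body else "")
                sd.append(f"{ante} -> ok({e},{pr},s)")
    return sd

# The WM fragment from the running example (the `eo` property omitted):
phi = {"WM": ["np"], "WS": ["pe", "cons_OM"]}
out = {"WM": ["WS"]}
dep = {("WM", "WS", "pe"): [("OM", "pe"), ("MD", "pe")],
       ("WM", "WS", "cons_OM"): [("OM", "pe")]}
for clause in compute_model(["WM"], phi, out, dep):
    print(clause)
# ~AB(WM) -> ok(WM,np,s)
# ~AB(WM) & ok(OM,pe,s) & ok(MD,pe,s) -> ok(WS,pe,s)
# ~AB(WM) & ok(OM,pe,s) -> ok(WS,cons_OM,s)
```

Note that every generated clause has exactly one positive literal, which is why the model is Horn and admits the efficient consistency checks discussed below.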
Note that the universal quantification implicitly applies to the variable s. It denotes a discrete snapshot of the system behavior. Each observation (¬)ok(x, pr, s_i) relates to a certain snapshot s_i, where i is the snapshot index. A diagnosis is a solution for all snapshots. The temporal ordering of the different snapshots is not taken into account. It is also important that, provided the number of snapshots is finite, the logical model computed by this algorithm can easily be transformed to propositional Horn clauses, and thus the model is amenable to efficient logical reasoning. The size of the model in terms of the number of literals depends on the number of components, the maximum fan-in and fan-out of components (i.e., the maximum number of input and output connections, respectively), the maximum number of properties assigned to each component and connection, and the number of snapshots which compose the observations of the final diagnosis.

Theorem 1. The number of literals in SD after the transformation to Horn clauses is O(n_s · n_c · (m · f)^2), where n_s is the (maximum) number of snapshots, n_c = |COMP|, m is the maximum fan-in and fan-out, and f is the maximum number of properties for each component and connection (i.e., |ϕ(x)| ≤ f for all x ∈ COMP ∪ CONN).

Note that the set PR returned by the function in has size m · f in the worst case. Assuming that n_s is very small, which is the case in practice, the number of literals is of order O(n_c · (m · f)^2).
4 Runtime Monitoring and Fault Localization
The runtime diagnosis system consists of two modules, the diagnosis module (DM) and the observation module (OM). These modules are executed concurrently. While the DM performs runtime fault detection and localization at the logical level, the OM continuously monitors the software system and provides the abstract observations which are used by the DM. Let us consider the OM first. It basically consists of observers. Each observer comprises a set of rules which specify the desired behavior of a certain part of the software system. A rule is a piece of software which continuously monitors that part. The execution of the rules is concurrent and unsynchronized. When a rule detects a violation of its specification, it switches from state not violated to the state violated. To each observer a set of atomic sentences is assigned which represent the logical observations. Furthermore, an observer may be enabled or disabled. A disabled observer does not provide any observations, but it may be enabled in the course of the fault localization. Note that it is often desired to initially disable those observers which otherwise would cause unnecessary runtime overhead. Definition 2 (Observation Module OM ). The OM is a tuple (OS, OSe ), where OS is the set of all available observers and OSe ⊆ OS the set of those observers which are currently enabled.
Definition 3 (Observer). An observer os ∈ OS is a tuple (R, Ω) with:
1. a set of rules R. For a rule r ∈ R, the boolean function violated(r) returns true if a violation of its specification has been detected.
2. a set of atomic sentences Ω. Each atom ω ∈ Ω has the form ok(x, pr, s), where x ∈ COMP ∪ CONN, pr ∈ ϕ(x), and s is a variable denoting an observation snapshot (see Definition 1).

An observer detects a misbehavior if one or more of its rules are violated. Let υ(OS_e) denote the set of observers which have detected a misbehavior, i.e. υ(OS_e) = {(R, Ω) ∈ OS_e | violated(r) = true for some r ∈ R}. Then the total set of observations of a certain snapshot s_i is computed as shown in Alg. 2.

Algorithm 2. The algorithm for computing the set of observations
Input: The set of enabled observers and a constant denoting the current snapshot.
Output: The set OBS, which comprises ground literals.
computeOBS(OS_e, s_i)
(1) OBS := {}.
(2) For all os ∈ OS_e, os = (R, Ω):
(3)   If os ∈ υ(OS_e): add ¬ω to OBS for all ω ∈ Ω
(4)   else: add ω to OBS for all ω ∈ Ω.
(5) For all atoms α ∈ OBS: substitute s_i for the variable s.
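Algorithm 2 translates almost directly into Python. The observer representation below (a violated flag plus atom templates) is our simplification for illustration:

```python
# Sketch of Algorithm 2: an observer whose rules report a violation
# contributes the negations of its atoms; every other enabled observer
# contributes the atoms themselves. The snapshot constant is then
# substituted for the variable s.
def compute_obs(enabled_observers, snapshot):
    """enabled_observers: list of (violated, atoms) pairs, where atoms
    are templates such as 'ok(PS,cons_OM,{s})'; snapshot: e.g. 's0'."""
    obs = set()
    for violated, atoms in enabled_observers:
        for a in atoms:
            lit = a.format(s=snapshot)          # substitute s_i for s
            obs.add("~" + lit if violated else lit)
    return obs

observers = [(True,  ["ok(PS,cons_OM,{s})"]),   # this observer fired
             (False, ["ok(PS,pe,{s})"])]        # this one did not
print(sorted(compute_obs(observers, "s0")))
# ['ok(PS,pe,s0)', '~ok(PS,cons_OM,s0)']
```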
Algorithm 3 presents the algorithm which is executed by the diagnosis module DM. The inputs to the algorithm are the logical system description SD, which is returned by the computeModel algorithm (Alg. 1), and an observation module OM = (OS, OS_e).

Algorithm 3. The runtime diagnosis algorithm
Input: The logical system description and the observation module.
performRuntimeDiagnosis(SD, OM)
(1) Do forever:
(2)   Query the observers, i.e. compute the set υ(OS_e).
(3)   If υ(OS_e) ≠ {}:
(4)     Set i := 0, OBS := {}, finished := false; i is the snapshot index.
(5)     While not finished:
(6)       Wait for the symptom collection period δ_c.
(7)       Recompute υ(OS_e).
(8)       OBS := OBS ∪ OBS_i, where OBS_i := computeOBS(OS_e, s_i).
(9)       Reset all rules to not violated.
(10)      Compute D := {Δ | Δ is a min. diagnosis of (SD, COMP, OBS)}.
(11)      If |D| = 1 or the set OS_s := ms(SD, OBS, OS, OS_e) is empty: start repair, set finished := true.
(12)      Otherwise: set i := i + 1, enable the observers in OS_s, and set OS_e := OS_e ∪ OS_s.
The algorithm periodically determines whether a misbehavior is detected by an observer. In this case, it waits for a certain period of time (line 6). This
gives the observers the opportunity to detect additional symptoms, as it may take some time until faults manifest themselves in the observed system behavior. Thereafter, the diagnoses are computed (line 10) using Reiter's Hitting Set algorithm [3]. Note that the violated rules are reset to not violated after computing the logical observations (line 9). Therefore, an observer which detects a misbehavior in snapshot s_j may report a correct behavior in s_{j+1}. This is necessary for the localization of multiple faults in the presence of intermittent symptoms. When we find several diagnoses (lines 11 and 12), it is desirable to enable additional observers in OS \ OS_e. We assume the function ms(SD, OBS, OS, OS_e) to perform a measurement selection, i.e. it returns a set of observers OS_s (OS_s ⊆ OS \ OS_e) whose observations could lead to a refinement of the diagnoses. We do not describe the function ms in this paper; in [4] a strategy based on Shannon entropy to determine the optimal next measurement is discussed. The fault localization is finished when either a unique diagnosis is found or the diagnoses cannot be further refined by enabling more observers (line 11). We finally discuss the computational complexity of the diagnosis computation. For most practical purposes it will be sufficient to search only for subset-minimal diagnoses, which can be obtained by using Reiter's algorithm. Furthermore, by transforming the model to Horn clauses, we can perform consistency checks in time of the same order as the number of literals (see Theorem 1) [6]. The number of required calls to the theorem prover (TP) is of order O(2^{n_c}) with n_c = |COMP|. However, if the maximum cardinality k of the diagnoses is much smaller than the number of components, which is usually the case in practice, then the number of required TP calls is approximately O(n_c^k). For example, k = 2 if there are only single- and dual-fault diagnoses.

Theorem 2. Assuming that k ≪ n_c and that n_s is very small, the overall complexity of the diagnosis computation is approximately O(n_c^{k+1} · (m · f)^2) (see Theorem 1).
5 Case Studies and Discussion
We implemented the proposed diagnosis system and conducted a series of experiments using the control software of a mobile autonomous soccer robot. The implemented measurement selection process may enable multiple observers at the same time in order to reduce the time required for fault localization. The components of the control system are executed in separate applications which interact among each other using CORBA communication mechanisms. The software runs on a Pentium 4 CPU with a clock rate of 2 GHz. The model of the software system comprises 13 components and 14 connections. We introduced 13 different properties. 7 different types of rules were implemented, and the observation module used 21 instances of these rule types. For the specification of the system behavior we used simple rules which embody elementary insights into the software behavior. For example, we specified the minimum number of processes spawned by certain applications. Furthermore,
we identified patterns in the communication among components. A simple insight is the fact that components of a robot control system often produce new events either periodically or as a response to a received event. Other examples are rules which express that the output of a component must change when the input changes, or specifications capturing the observation that the values of certain events must vary continuously. We simulated software failures by killing single processes in 10 different applications and by injecting deadlocks in these applications. We investigated whether the faults can be detected and located when the outputs of these components are observed. In 19 out of 20 experiments, the fault was detected and located within less than 3 seconds; in only one case it was not possible to detect the fault. Note that we set the symptom collection period δ_c to 1 second (see Alg. 3, line 6), and the fault localization incorporated no more than 2 observation snapshots. Due to the small number of components and connections, the computation of the diagnoses required only a few milliseconds. Furthermore, the overhead (in terms of CPU load and memory usage) caused by the runtime monitoring was negligible, in particular because calls to the diagnosis engine are only necessary after an observer has detected a misbehavior. In addition, we conducted 6 non-trivial case studies in order to investigate more complex scenarios. We injected deadlocks in different applications. We assumed that significant connections are either unobservable or should be observed only on demand, i.e. in the course of the fault localization, because otherwise the runtime overhead would be unacceptable. In 4 scenarios we injected single faults, while in the other cases 2 faults occurred in different components almost at the same time. Moreover, in 2 scenarios the symptoms were intermittent and disappeared during the fault localization.
In all of the 6 case studies, the faults could be correctly detected and located. In two cases, the fault was immediately detected and then located within 2 seconds. In one case the fault was detected after about 5 seconds, and the localization took 2 more seconds. However, in three case studies the simple rules detected the faults only in certain situations, e.g. when the physical environment was in a certain state. We gained several insights from our experiments. In general, state-based diagnosis appears to be an appropriate approach for fault localization in a robot control system as a particular example for component-oriented software. We were able to identify simple patterns in the interaction among the components, and by using rules which embody such patterns it was possible to create appropriate models which abstract from the dynamic software behavior. Furthermore, the approach proved to be feasible in practice since the overhead caused by the runtime monitoring is low. The main problem is the fact that simple rules are often too coarse to express the software behavior. Such rules may detect faults only in certain situations. Therefore, it may happen that faults are either not detected or that they are detected too late, which could cause damage due to the misbehavior of the software system. Hence, it is desirable to use more complex rules. However, appropriate
rules are hard to find since, in practice, often no detailed specification of the software exists. Furthermore, the runtime overhead would increase significantly. The usage of simple rules also has the effect that more connections must be permanently observed than would be the case if more complex rules were used. For example, in the control system we used in our experiments, we had to observe more than half of the connections permanently in order to be able to detect severe faults like deadlocks in most of the components.
6 Related Research and Conclusion
There is little work which deals with model-based runtime diagnosis of software systems. In [7] an approach for model-based monitoring of component-based software systems is described. The external behavior of components is expressed by Petri nets. In contrast to our work, the fault detection relies on the alarm-raising capabilities of the components themselves and on temporal constraints. In the area of fault localization in Web Services, the authors of [8] propose a modelling approach which is similar to ours. Both approaches use grey-box models of components, i.e., the dependencies between the inputs and outputs of components are modelled. However, their work assumes that each message (event) on a component output can be directly related to a certain input event, i.e., each output is a response which can be related to a specific incoming request. As we cannot make this assumption, we abstract over a series of events within a certain period of time. Another approach to modelling the behavior of software is presented in [9]. In order to deal with the complexity of software, the authors propose to use probabilistic, hierarchical, constraint-based automata (PHCA). However, they model the software in order to detect faults in hardware. Software bugs are not considered in this work. In the field of autonomic computing, there are model-based approaches which aim at the creation of self-healing and self-adaptive systems. The authors of [10] propose to maintain architecture models at runtime for problem diagnosis and repair. Similar to our work, they assign properties to components and connectors. However, this work does not employ fault localization mechanisms. Rapide [11] is an architecture description language (ADL) which allows for the definition of formal constraints at the architecture level. The constraints define legal and illegal patterns of event-based communication. Rapide's ability to formalize properties could be utilized for runtime fault detection.
However, Rapide does not provide any means for fault localization. This paper presents a MBD approach for fault detection and, in particular, fault localization in component-oriented software systems at runtime. Our model allows one to introduce arbitrary properties and to assign them to components and connections. The fault detection is performed by rules, i.e. pieces of software which continuously monitor the software system. The fault localization utilizes dependencies between properties. We provide algorithms for the generation of the logical model and for the runtime diagnosis. Finally, we discuss case studies which
Using AI Techniques for Fault Localization
demonstrate that our approach is frequently able to quickly detect and locate faults. The main problem is the fact that simple rules may often be insufficient in practice. We intend to evaluate our approach in other application domains as well. Moreover, our future research will deal with autonomous repair of software systems at runtime.
References

1. Steinbauer, G., Wotawa, F.: Detecting and locating faults in the control software of autonomous mobile robots. In: Proceedings of the 19th International Joint Conference on AI (IJCAI-05), Edinburgh, UK (2005) 1742–1743
2. Friedrich, G., Stumptner, M., Wotawa, F.: Model-based diagnosis of hardware designs. Artificial Intelligence 111 (1999) 3–39
3. Reiter, R.: A theory of diagnosis from first principles. Artificial Intelligence 32 (1987) 57–95
4. de Kleer, J., Williams, B.C.: Diagnosing multiple faults. Artificial Intelligence 32 (1987) 97–130
5. Brusoni, V., Console, L., Terenziani, P., Dupré, D.T.: A spectrum of definitions for temporal model-based diagnosis. Artificial Intelligence 102 (1998) 39–79
6. Minoux, M.: LTUR: A Simplified Linear-time Unit Resolution Algorithm for Horn Formulae and Computer Implementation. Information Processing Letters 29 (1988) 1–12
7. Grosclaude, I.: Model-based monitoring of software components. In: Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press (2004) 1025–1026. Poster.
8. Ardissono, L., Console, L., Goy, A., Petrone, G., Picardi, C., Segnan, M., Dupré, D.T.: Cooperative Model-Based Diagnosis of Web Services. In: Proceedings of the 16th International Workshop on Principles of Diagnosis. DX Workshop Series (2005) 125–132
9. Mikaelian, T., Williams, B.C.: Diagnosing complex systems with software-extended behavior using constraint optimization. In: Proceedings of the 16th International Workshop on Principles of Diagnosis. DX Workshop Series (2005) 125–132
10. Garlan, D., Schmerl, B.: Model-based adaptation for self-healing systems. In: WOSS '02: Proceedings of the First Workshop on Self-Healing Systems, New York, NY, USA, ACM Press (2002) 27–32
11. Luckham, D., et al.: Specification and analysis of system architecture using RAPIDE. IEEE Transactions on Software Engineering 21 (1995) 336–355
Exploring Unknown Environments with Randomized Strategies

Judith Espinoza, Abraham Sánchez, and Maria Osorio

Computer Science Department, University of Puebla, 14 Sur and San Claudio, CP 72570 Puebla, Pue., México
[email protected], {asanchez, aosorio}@cs.buap.mx
Abstract. We present a method for sensor-based exploration of unknown environments by mobile robots. This method proceeds by building a data structure called SRT (Sensor-based Random Tree). The SRT represents a roadmap of the explored area with an associated safe region, and estimates the free space as perceived by the robot during the exploration. The original work proposed in [1] presents two techniques: SRT-Ball and SRT-Star. In this paper, we propose an alternative strategy called SRT-Radial that deals with non-holonomic constraints using two alternative planners named SRT Extensive and SRT Goal. We present experimental results to show the performance of the SRT-Radial and both planners. Keywords: Sensor-based nonholonomic motion planning, SRT method, randomized strategies.
1 Introduction
Building maps of unknown environments is one of the fundamental problems in mobile robotics. As a robot explores an unknown environment, it incrementally builds a map consisting of the locations of objects or landmarks. Many practical robot applications require navigation in structured but unknown environments. Search and rescue missions, surveillance and monitoring tasks, and urban warfare scenarios are all examples of domains where autonomous robot applications would be highly desirable. Exploration is the task of guiding a vehicle in such a way that it covers the environment with its sensors. We define exploration to be the act of moving through an unknown environment while building a map that can be used for subsequent navigation. A good exploration strategy is one that generates a complete or nearly complete map in a reasonable amount of time. Considerable work has been done in the simulation of explorations, but these simulations often view the world as a set of floor plans. The blueprint view of a typical office building presents a structure that seems simple and straightforward (rectangular offices, square conference rooms, straight hallways, and right angles everywhere), but the reality is often quite different. A real mobile robot may have to navigate through rooms cluttered with furniture, where the walls may be hidden behind desks and bookshelves. The central question in exploration is: given what one knows about the world, where should one move to get as much new information as possible? Initially, one only knows the information gathered from the starting position, but wants to build a map that describes as much of the world as possible, as quickly as possible. As a step toward a solution to this open problem, we present a method for sensor-based exploration of unknown environments by non-holonomic mobile robots. The paper is organized as follows. Section II briefly presents the RRT approach. Section III gives an overview of the SRT method. Section IV explains the details of the proposed perception strategy, SRT-Radial. Section V analyzes the performance of the two proposed planners, SRT Extensive and SRT Goal. Finally, the conclusions and future work are presented in Section VI.

A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1150–1159, 2006. © Springer-Verlag Berlin Heidelberg 2006
2 RRT Planning
While not as popular as heuristic methods, non-reactive planning methods for interleaved planning and execution have been developed, with some promising results. Among these are agent-centered A∗ search methods [2] and the D∗ variant of A∗ search [3]. Nevertheless, using these planners requires discretization or tiling of the world in order to operate in continuous domains. This leads to a tradeoff between a higher resolution, with its higher memory and time requirements, and a lower resolution, with non-optimality due to discretization. On the other hand, the RRT (Rapidly-exploring Random Tree) approach should provide a good complement for very simple control heuristics, and take much of the complexity out of composing them to form navigation systems. Specifically, local minima can be reduced substantially through lookahead, and rare cases need not be enumerated since the planner has a nonzero probability of finding a solution on its own through search. Furthermore, an RRT system can be fast enough to satisfy the tight timing requirements needed for fast navigation. The RRT approach, introduced in [4], has become the most popular single-query motion planner in recent years. RRT-based algorithms were first developed for non-holonomic and kinodynamic planning problems [7], where the space to be explored is the state space (i.e., a generalization of configuration space (CS) involving time). However, tailored algorithms for problems without differential constraints (i.e., which can be formulated in CS) have also been developed based on the RRT approach [5], [6]. RRT-based algorithms combine a construction and a connection phase. For building a tree, a configuration q is randomly sampled and the nearest node in the tree (given a distance metric in CS) is expanded toward q. In the basic RRT algorithm (which we refer to as RRT-Extend), a single expansion step of fixed distance is performed.
In a more greedy variant, RRT-Connect [5], the expansion step is iterated while keeping feasibility constraints (e.g., no collision exists). As explained in the referred articles, the probability that a node is selected for expansion is proportional to the area of its Voronoï
region. This biases the exploration toward unexplored portions of the space. The approach can be used for unidirectional or bidirectional exploration. The basic construction algorithm is given in Figure 1. A simple iteration is performed in which each step attempts to extend the RRT by adding a new vertex that is biased by a randomly-selected configuration. The EXTEND function selects the vertex already in the RRT nearest to the given sample configuration, x. The function NEW STATE makes a motion toward x with some fixed incremental distance ε, and tests for collision. This can be performed quickly ("almost constant time") using incremental distance computation algorithms. Three situations can occur: Reached, in which x is directly added to the RRT because the RRT already contains a vertex within ε of x; Advanced, in which a new vertex xnew ≠ x is added to the RRT; Trapped, in which the proposed new vertex is rejected because it does not lie in Xfree. We can obtain different alternatives for the RRT-based planners [6]. The recommended choice depends on several factors, such as whether differential constraints exist, the type of collision detection algorithm, or the efficiency of nearest neighbor computations.

BUILD RRT(xinit)
 1  T.init(xinit);
 2  for k = 1 to K
 3      xrand ← RANDOM STATE();
 4      EXTEND(T, xrand);
 5  Return T;

EXTEND(T, x)
 1  xnear ← NEAREST NEIGHBOR(x, T);
 2  if NEW STATE(x, xnear, xnew, unew) then
 3      T.add.vertex(xnew);
 4      T.add.edge(xnear, xnew, unew);
 5      if xnew = x then
 6          Return Reached;
 7      else
 8          Return Advanced;
 9  Return Trapped;

Fig. 1. The basic RRT construction algorithm
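As an illustration of Fig. 1, the following is a minimal Python sketch of the basic RRT (RRT-Extend) for a holonomic point robot. The unit-square state space, the step size EPS, and the obstacle-free world are assumptions made for the example; since there are no obstacles, the Trapped outcome never occurs here.

```python
import math
import random

EPS = 0.05  # fixed incremental distance (an assumption for this example)

def nearest(tree, x):
    """NEAREST NEIGHBOR under the Euclidean metric."""
    return min(tree, key=lambda v: math.dist(v, x))

def extend(tree, edges, x):
    """One EXTEND step; with no obstacles, Trapped can never occur."""
    x_near = nearest(tree, x)
    d = math.dist(x_near, x)
    if d <= EPS:                       # Reached: x itself joins the tree
        x_new = x
    else:                              # Advanced: step of length EPS toward x
        x_new = (x_near[0] + EPS * (x[0] - x_near[0]) / d,
                 x_near[1] + EPS * (x[1] - x_near[1]) / d)
    tree.append(x_new)
    edges.append((x_near, x_new))
    return "Reached" if x_new == x else "Advanced"

def build_rrt(x_init, k):
    """BUILD RRT: k iterations of sampling and extending."""
    tree, edges = [x_init], []
    for _ in range(k):
        x_rand = (random.random(), random.random())   # RANDOM STATE
        extend(tree, edges, x_rand)
    return tree, edges
```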
3 The SRT Method
Oriolo et al. described in [1] an exploration method based on the random generation of robot configurations within the local safe area detected by the sensors. A data structure called Sensor-based Random Tree (SRT) is created, which represents a roadmap of the explored area with an associated Safe Region (SR). Each node of the SRT consists of a free configuration with the associated Local Safe Region (LSR) as reconstructed by the perception system; the SR is the union
of all the LSRs. The LSR is an estimate of the free space surrounding the robot at a given configuration; in general, its shape will depend on the sensor characteristics but may also reflect different attitudes towards perception. We will present two exploration strategies obtained by instantiating the general method with different perception techniques. The authors presented two techniques in their work. The first, where the LSR is a ball, realizes a conservative attitude particularly suited to noisy or low-resolution sensors, and results in an exploration strategy called SRT-Ball. The second technique is confident, and the corresponding strategy is called SRT-Star; in this case, the shape of the LSR resembles a star. The two strategies were compared by simulations as well as by experiments. The method was presented under the assumption of perfect localization provided by some other module. The algorithm implementing the SRT method can be described as follows.

BUILD SRT(qinit, Kmax, Imax, α, dmin)
 1  qact ← qinit;
 2  for k = 1 to Kmax
 3      S ← PERCEPTION(qact);
 4      ADD(T, (qact, S));
 5      i ← 0;
 6      loop
 7          θrand ← RANDOM DIR;
 8          r ← RAY(S, θrand);
 9          qcand ← DISPLACE(qact, θrand, α · r);
10          i ← i + 1;
11      until (VALID(qcand, dmin, T) or i = Imax)
12      if VALID(qcand, dmin, T) then
13          MOVE TO(qcand);
14          qact ← qcand;
15      else
16          MOVE TO(qact.parent);
17          qact ← qact.parent;
18  Return T;

This method is general for sensor-based exploration of unknown environments by a mobile robot. The method proceeds by building a data structure called SRT through random generation of configurations. The SRT represents a roadmap of the explored area with an associated Safe Region, an estimate of the free space as perceived by the robot during the exploration.
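The exploration loop above can be sketched in Python as follows. This is an illustrative reduction, not the authors' implementation: PERCEPTION is stubbed to return a fixed sensor radius (an obstacle-free world, so the LSR is a ball), DISPLACE is inlined as a polar step, and the backward moves follow stored parent pointers. The parameter defaults are arbitrary choices for the example.

```python
import math
import random

SENSOR_RANGE = 1.0   # assumed maximum sensor range

def perception(q_act):
    """Stub PERCEPTION: in an obstacle-free world the LSR is a disc,
    so RAY(S, theta) is the same in every direction."""
    return SENSOR_RANGE

def valid(q_cand, d_min, tree):
    """VALID: the candidate must keep a minimum distance from every
    configuration already in the tree."""
    return all(math.dist(q_cand, q) >= d_min for q in tree)

def build_srt(q_init, k_max=50, i_max=10, alpha=0.8, d_min=0.2):
    q_act, tree, parent = q_init, [q_init], {q_init: None}
    for _ in range(k_max):
        r = perception(q_act)                       # S <- PERCEPTION(q_act)
        for _ in range(i_max):
            theta = random.uniform(0, 2 * math.pi)  # RANDOM DIR
            step = alpha * r                        # alpha * RAY(S, theta)
            q_cand = (q_act[0] + step * math.cos(theta),
                      q_act[1] + step * math.sin(theta))
            if valid(q_cand, d_min, tree):
                tree.append(q_cand)
                parent[q_cand] = q_act
                q_act = q_cand                      # MOVE TO(q_cand)
                break
        else:                                       # no valid candidate found
            if parent[q_act] is None:               # back at q_init: give up
                break
            q_act = parent[q_act]                   # MOVE TO(q_act.parent)
    return tree
```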
4 Exploration with SRT-Radial
As mentioned before, the shape of the local safe region S reflects the sensors' characteristics and the perception technique adopted. Moreover, the exploration strategy will be strongly affected by the shape of S. The authors in [1] presented
a method called SRT-Star, which involves a perception strategy that fully uses the information reported by the sensor system and exploits the information provided by the sensors in all directions. In SRT-Star, S is a star-shaped region formed by the union of several 'cones', each with a different radius, as in Figure 2. The radius of cone i is the minimum of the distance from the robot to the closest obstacle and the maximum measurable range of the sensors. Therefore, to be able to compute r, the function RAY must first identify the cone corresponding to θrand. While the conservative perception of SRT-Ball ignores the directional information provided by most sensory systems, SRT-Star can exploit it. In contrast, under the variant implemented in this work and in the absence of obstacles, S has the ideal shape of a circle, which makes the identification of the cone unnecessary. This variant is called "SRT-Radial" [8] because, once the exploration direction θrand has been generated, the function RAY draws a ray from the current location towards the edge of S, and the portion included within S corresponds to the radius in the direction θrand, as can be seen in Figure 2. Therefore, in the presence of obstacles, the shape of S is deformed, and for different exploration directions the radii lengths vary. To allow a performance comparison among the three exploration strategies, we have run the same simulations under the assumption that a ring of range finders is available. The same parameter values have been used.
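For a polygonal safe region, the RAY step of SRT-Radial can be realized by intersecting a ray from the current configuration with each boundary edge of S. The following sketch is generic 2D ray casting, not the authors' code, and assumes the origin lies inside the polygon (as the current configuration always does for an LSR).

```python
import math

def ray(poly, origin, theta):
    """Distance from `origin` (assumed inside `poly`) to the polygon
    boundary along direction `theta` -- the RAY(S, theta_rand) step of
    SRT-Radial for a polygonal safe region S."""
    ox, oy = origin
    dx, dy = math.cos(theta), math.sin(theta)
    hits = []
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1              # edge direction
        denom = dx * ey - dy * ex
        if abs(denom) < 1e-12:                 # ray parallel to this edge
            continue
        # Solve origin + t*(dx,dy) = (x1,y1) + s*(ex,ey) for t and s.
        t = ((x1 - ox) * ey - (y1 - oy) * ex) / denom
        s = ((x1 - ox) * dy - (y1 - oy) * dx) / denom
        if t > 0.0 and 0.0 <= s <= 1.0:
            hits.append(t)
    return min(hits) if hits else 0.0
```

In an obstacle-free LSR the returned radius equals the sensor range in every direction; clipping obstacle polygons out of S (e.g. with the GPC difference operation mentioned in Section 5) makes the radii direction-dependent, as Figure 2 illustrates.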
Fig. 2. Left, safe local region S obtained with the SRT-Star perception strategy. Notice that the extension of S in some cones is limited by the sensors' range. In the middle, different radii obtained in the safe local region S with the SRT-Radial perception strategy. Right, safe local region S obtained with the SRT-Ball strategy.
5 Experimental Results
In order to illustrate the behavior of the SRT-Radial exploration strategy, we present two planners, SRT Extensive and SRT Goal. The planners were implemented in Visual C++ 6.0, taking advantage of the structure of the MSL¹ library and its graphical interface, which makes it easy to select the algorithms, visualize the working environment, and animate the obtained path. The library
¹ http://msl.cs.uiuc.edu/msl/
GPC² developed by Alan Murta was used to simulate the sensor's perception system. The modifications made to the SRT method concern mainly the final phase of the algorithm and the type of mobile robot considered. To perform the simulations, perfect localization and the availability of sonar rings (or a rotating laser range finder) mounted on the robot are assumed. In general, the system can easily be extended to support any type and number of sensors. In the first planner, SRT Extensive, a mobile robot that can move in any direction (a holonomic robot), as in the original SRT method, is considered. The SRT Extensive planner finishes successfully when the automatic backward process goes back to the initial configuration, i.e., to the robot's departure point. In this case, the algorithm has exhausted all the available options according to the random selection of the exploration direction. The planner obtains the corresponding roadmap after exploring a great percentage of the possible directions in the environment. The algorithm finishes with "failure" after a maximum number of iterations. In the second planner, a hybrid motion planning problem is solved, i.e., we combine the exploration task with the search for an objective from the starting position: the Start-Goal problem. The SRT Goal planner explores the environment and finishes successfully when the goal lies within the local safe region associated with the current configuration, where the sensor is scanning. If the goal configuration is not found, it performs the backward movement process until it reaches the initial configuration. Therefore, in SRT Goal, the main task is to find the fixed objective, leaving the exhaustive exploration of the environment as a secondary concern. In SRT Goal, the exploring robot is not omnidirectional, and it has a constraint on the steering angle, |φ| ≤ φmax < π/2.
The SRT Goal planner was also applied to a motion planning problem, taking into account all the considerations mentioned before. The objective is the following: we suppose that we have two robots; the first robot can be omnidirectional or have a simple non-holonomic constraint, as mentioned in the previous paragraph. This robot has the task of exploring the environment and obtaining a safe region that contains the start and goal positions. The second robot is non-holonomic, specifically a car-like robot, and will move along a collision-free path within the safe region. A local planner computes the path between the start and the goal configurations with an adapted RRTExtExt method that can be executed within the safe region in order to avoid the process of collision detection with the obstacles. The RRTExtExt planner was chosen because it can easily handle the non-holonomic constraints of car-like robots and it is experimentally faster than the basic RRTs [6]. In the simulation process, the robot along with the sensor system moves in a 2D world, where the obstacles are static; the only movable object is the robot. The robot's geometric description, the workspace, and the obstacles are described with polygons. In the same way, the sensor's perception zone and the safe region are modeled with polygons. This representation facilitates the use of the GPC library for the simulation of the perception algorithm. If S is the zone that the sensor
² http://www.cs.man.ac.uk/~toby/alan/software/
can perceive in the absence of obstacles and SR the perceived zone, the SR area is obtained using the difference operation of GPC between S and the polygons that represent the obstacles. The SRT Extensive algorithm was tested in environments with different values for Kmax, Imax, α, and dmin. A series of experiments revealed that the algorithm efficiently explores environments almost in their totality. Table 1 summarizes the results obtained with respect to the number of nodes of the SRT and the running time. The running time provided by the experiments corresponds to the total time of exploration, including the perception time of the sensor. Figures 3 and 4 show the SRT obtained in two environments. The CPU times and the number of nodes change according to the chosen algorithm, the random selection of the exploration direction, and the start and goal positions of the robot in the environment, marked in the figures with a small triangle.

Table 1. Results of the SRT Extensive method

              Environment 1   Environment 2
Nodes (min)   92              98
Nodes (max)   111             154
Time (min)    132.59 sec      133.76 sec
Time (max)    200.40 sec      193.86 sec
Fig. 3. SRT and explored region for the environment 1. a) Time = 13.16 sec, nodes = 13. b) Time = 30.53 sec, nodes = 24. c) Time = 138.81 sec, nodes = 83.
One can observe how the robot explores the environment almost completely, fulfilling the assigned task, both in a complex environment full of obstacles and in a simple environment containing narrow passages. The advantage of the SRT-Radial perception strategy can be seen in these simulations, because it takes advantage of the information reported by the sensors in all directions to generate and validate candidate configurations through reduced spaces. Because of the random nature of the algorithm when it selects the exploration direction, it can leave small zones of the environment unexplored.
Fig. 4. SRT and explored region for the environment 2. a) Another interesting environment. b) Time = 25.34 sec, nodes = 19. c) Time = 79.71 sec, nodes = 41.
Fig. 5. a) Safe region and the security band. b) The RRT obtained with the RRTExtExt planner in 5.20 sec and 593 nodes. c) The path found for a forward car-like robot.
The SRT Goal algorithm finishes when the goal configuration is within the safe region of the current configuration, or when it returns to the initial configuration. When the SRT Goal algorithm has computed the safe region that contains the start and goal positions, a second, car-like robot has the option of executing new motion planning tasks locally with other RRT planners. The safe region guarantees that the robot will be able to move freely inside that area, since it is free of obstacles and collision checking by the RRT planners is unnecessary. However, we must not forget the geometry of the robot when it executes movements near the border between the safe region and the unknown space, because there is always the possibility of finding an obstacle that can collide with the robot. Therefore, it is necessary to build a security band on the contour of the safe region to protect the robot from possible collisions and to assure its mobility. Figures 5 and 6 show the security band, the computed RRT, and the path found for some mobile robots with different constraints. After many experiments with both planners, we note that the SRT method does not make a distinction between obstacles and unexplored areas. In
Fig. 6. a) Safe region and the security band. b) The RRT obtained with the RRTExtExt planner in 13.49 sec and 840 nodes. c) The path found for a smoothing car-like robot.
fact, the boundary of the Local Safe Region may indifferently describe the sensor's range limit or an object's profile. This means that during the exploration phase, the robot may approach areas which appear to be occluded. An important difference between SRT and other methods is the way in which the environment is represented. The free space estimated during the exploration is simply the union of the LSRs associated with the tree's nodes. However, relatively simple post-processing operations would allow the method to compute a global description of the Safe Region, which is very useful for navigation tasks.
6 Conclusions and Future Work
We have presented an interesting extension of the SRT method for sensor-based exploration of unknown environments by a mobile robot. The method builds a data structure through random generation of configurations. The SRT represents a roadmap of the explored area with an associated Safe Region, an estimate of the free space as perceived by the robot during the exploration. By instantiating the general method with different perception techniques, we can obtain different strategies. In particular, the SRT-Radial strategy proposed in this paper takes advantage of the information reported by the sensors in all directions to generate and validate candidate configurations through reduced spaces. SRT is a significant step forward with the potential for making motion planning common on real robots, since RRT is relatively easy to extend to environments with moving obstacles, higher-dimensional state spaces, and kinematic constraints. If we compare SRT with the RRT approach, the SRT is a tree with edges of variable length, depending on the radius r of the local safe region in the random direction θrand. During the exploration, the robot will take longer steps in regions scarcely populated by obstacles and smaller steps in cluttered regions. Since the tree in the SRT method is expanded along directions originating from qact, the method is inherently depth-first. The SRT approach retains some of the most important features of RRT; it is particularly suited for high-dimensional configuration spaces.
In the past, several strategies for exploration have been developed. One group of approaches deals with the problem of simultaneous localization and mapping, an aspect that we do not address in this paper. A mobile robot using the SRT exploration has two advantages over other systems developed. First, it can explore environments containing both open and cluttered spaces. Second, it can explore environments where walls and obstacles are in arbitrary orientations. In a later work, we will approach the problem of exploring an unknown environment with a car-like robot with sensors, i.e., to explore the environment and to plan a path in a single stage with the same robot. The integration of a localization module into the exploration process based on SLAM techniques is currently under way.
References

1. Oriolo, G., Vendittelli, M., Freda, L., Troso, G.: The SRT method: Randomized strategies for exploration. IEEE Int. Conf. on Robotics and Automation (2004) 4688–4694
2. Koenig, S.: Agent-centered search: Situated search with small look-ahead. Proc. of the Thirteenth National Conference on Artificial Intelligence, AAAI Press (1996)
3. Stentz, A.: The D* algorithm for real-time planning of optimal traverses. Technical Report CMU-RI-TR-94-37, Carnegie Mellon University Robotics Institute (1994)
4. LaValle, S.M.: Rapidly-exploring random trees: A new tool for path planning. TR 98-11, Computer Science Dept., Iowa State University (1998)
5. Kuffner, J.J., LaValle, S.M.: RRT-Connect: An efficient approach to single-query path planning. IEEE Int. Conf. on Robotics and Automation (2000) 995–1001
6. LaValle, S.M., Kuffner, J.J.: Rapidly-exploring random trees: Progress and prospects. Workshop on Algorithmic Foundations of Robotics (2000)
7. LaValle, S.M., Kuffner, J.J.: Randomized kinodynamic planning. International Journal of Robotics Research, Vol. 20, No. 5 (2001) 378–400
8. Espinoza León, J.: Estrategias para la exploración de ambientes desconocidos en robótica móvil. Master Thesis, FCC-BUAP (in Spanish) (2006)
Integration of Evolution with a Robot Action Selection Model

Fernando Montes-González¹, José Santos Reyes², and Homero Ríos Figueroa¹

¹ Facultad de Física e Inteligencia Artificial, Universidad Veracruzana, Sebastián Camacho No. 5, Xalapa, Ver., México
{fmontes, hrios}@uv.mx
² Departamento de Computación, Facultad de Informática, Universidade da Coruña, Campus de Elviña, 15071 A Coruña
[email protected]
Abstract. The development of an effective central model of action selection has already been reviewed in previous work. The central model has been set to resolve a foraging task with the use of heterogeneous behavioral modules. In contrast to collecting/depositing modules that have been hand-coded, modules related to exploring follow an evolutionary approach. However, in this paper we focus on the use of genetic algorithms for evolving the weights related to calculating the urgency for a behavior to be selected. Therefore, we aim to reduce the number of decisions made by a human designer when developing the neural substratum of a central selection mechanism.
A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1160–1170, 2006. © Springer-Verlag Berlin Heidelberg 2006

1 Introduction

The problem of action selection is concerned with making the right decision at the right time [1]. Entities that make the right decisions at the right time have more opportunity to survive than others that cannot make a good guess at what a right decision implies. However, it is not evident for a human designer to know exactly what a right decision implies, or to have an idea of what a right decision should look like. In that sense, ethologists have observed animals in situ to study their social habits and foraging behaviors in order to build models that capture changes in behavior and which add together to make complex behaviors. As a consequence, some researchers in robotics have taken a close look at how ethologists have come to terms with the problem of developing models for explaining the decision process in animals. One common approach in robotics consists of modeling a complex behavior pattern as a single behavior that has to cope with the mishaps of the task to be solved. On the other hand, another approach foresees the various situations that the robot has to solve and models simple behaviors that, when assembled together, show complex patterns (process fusion). It should be noticed that both approaches have been successfully used to model foraging behavior in robots. For instance, the work of Nolfi proposes the evolution of can-collection behavior as a complete solution [2]. However, as the author points out [3], in order to know whether the use of a genetic algorithm is feasible, we should answer the questions: what are we evolving? And how do we evolve it? Furthermore, Seth [4] remarks on the potential problems of the artificial separation in the design of behavioral modules and the process fusion of these behaviors. Some of these problems are related to the concurrence of behaviors in the process fusion and the artificial separation between the behavioral description and the mechanistic implementation level. On the other hand, the work of Kim & Cho incrementally evolved complex behaviors [5]. These authors discuss the difficulty of evolving complex behaviors in a holistic way, and offer a solution that combines several basic behaviors with action selection. In addition, the work of Yamauchi & Beer [6] also points out the use of a modular and incremental network architecture instead of a single network for solving an entire task. As a result, we conclude that building animal robots (animats) meets specific needs that roboticists have to fulfill if the resolution of a task is to exhibit a complex behavior pattern. Nevertheless, the rationale for preferring one approach to another sometimes follows the developmental background of the solution being proposed. Because we are looking at biologically plausible models of central selection that have been built incrementally, we are looking at the integration of perception and (possibly redundant) reactive behavioral modules for the right selection at the behavioral process fusion. Moreover, we are interested in further developing the model of central selection; as a consequence, in our architecture, we have to incrementally include behavioral modules from simpler to complex, and selection mechanisms from engineered to biologically plausible. In this paper, we focus on both the design and the evolution of the action selection mechanism, and of the behaviors to be selected, as a hybrid solution to the problem of action selection. Hence, we employ an evolutionary approach to optimize, at incremental stages, both behavior and the selection mechanism.
Evolution is ruled out for the design of those sequential behaviors that are composed of a series of motor commands always executed in the same order. To facilitate the integration of evolution with action selection, a modular and extensible model of action selection is required. For our experiments we use a model of central action selection with sensor fusion (CASSF), the evolution of its internal selection parameters, and the use of evolution in the development of the behavior patterns related to the exploration of the surrounding area. A brief background on genetic algorithms is first given in section 2. In section 3 we explain the use of evolution in the design of the neural behaviors cylinder-seek, wall-seek and wall-follow; additionally, we introduce the behaviors cylinder-pickup and cylinder-deposit. These five behaviors are used together to solve the foraging task set for a Khepera robot. The selection of a behavior is carried out by a central selection model, CASSF, presented in section 4. Next, in section 5 we present the results of the integration of genetic algorithms with the selection mechanism and the neural behaviors. Finally, in section 6 we provide a general discussion highlighting the importance of these experiments.
2 Predominance in Robot Behavior

Several evolutionary methods have been considered by the robotics community for producing either the controllers or the behaviors required for robots to accomplish a task and survive in real environments. These methods include Genetic Algorithms
1162
F. Montes-González, J. Santos Reyes, and H. Ríos Figueroa
(GAs), Evolutionary Strategies, Genetic and Evolutionary Programming, and Coevolution; in this study we have used GAs [7]. Examples of the use of evolutionary methods in the modeling of elementary behavior controllers can be found in the existing literature [4, 6, 8]. A common approach relies on the combination of neural networks with genetic algorithms [9]. In this approach, once the topology of the neural network has been decided, a right choice of the neural weights has to be made in order to control the robot. Different selections of the weights of the neural controller produce different individuals, whose resulting performance ranges from clumsy to clever behavior. If we were to evaluate all possible individuals according to their choice of weights, a convoluted fitness landscape would be obtained. As a consequence, a gradient ascent has to be followed in order to find the optimal performance amongst all individuals. Therefore, at every step of the artificial evolution, the selection of individuals better adapted to solving a particular task predominates over the less adapted, and eventually the fittest emerge. However, a clear definition of what makes the best individuals has to be provided, in order to let the fitness function and the evolutionary operators come to terms.
Fig. 1. The new offspring is generated from the genotype of previous robot controllers
Fig. 2. The Khepera robot set in the middle of a squared arena with simulated food (wooden-cylinders)
At the beginning of the evolutionary process the initial population of neural controllers is generated at random; next, their fitness is evaluated, and then the GA operators are applied. Selection chooses the fittest individuals to breed the next offspring using crossover and mutation. Tournament selection is an instance of selection that breeds new individuals from the winners of local competitions. Often selection is complemented by agamic reproduction, which inserts the fittest individuals intact into the next generation, guaranteeing that the best solution found so far is not lost (elitism). Crossover is an operator that takes two individual encodings and swaps their contents around one or several random points along the chromosome. Mutation is a probabilistic flip of some of the chromosome bits of the new individuals (in general, a random change of the alleles of the genes). The latter operator explores new genetic material, so better individuals may be spawned. Whether for the neural controllers of basic behaviors or for the selection mechanism, the individual genotypes directly encode the weights (real coding) of a particular neural module. Direct encoding is the method most used in evolutionary robotics experiments [9].
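The generation step described above — elitism, tournament selection, single-point crossover and per-gene mutation over real-coded weight vectors — can be sketched as follows. This is a minimal sketch: the population size, tournament size and fitness function below are placeholders, not the paper's exact settings.

```python
import random

def evolve_one_generation(population, fitness, p_cross=0.5, p_mut=0.01):
    """One GA generation: elitism, tournament selection, crossover, mutation.

    population: list of weight vectors (real coding); fitness: genome -> float.
    """
    scored = sorted(population, key=fitness, reverse=True)
    offspring = scored[:2]                      # elitism: keep the two best intact
    while len(offspring) < len(population):
        # tournament selection: each parent is the winner of a local competition
        p1 = max(random.sample(population, 2), key=fitness)
        p2 = max(random.sample(population, 2), key=fitness)
        child = list(p1)
        if random.random() < p_cross:           # single random crossover point
            cut = random.randrange(1, len(p1))
            child = list(p1[:cut]) + list(p2[cut:])
        for i in range(len(child)):             # mutation: random change of an allele
            if random.random() < p_mut:
                child[i] = random.uniform(-1.0, 1.0)
        offspring.append(child)
    return offspring
```

Repeating this step until the fitness stabilizes yields the gradient-ascent search over the weight landscape described above.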
The performance of a neural controller depends on its resolution of the proposed task in a limited period of time in the test environment. The production of new offspring (Figure 1) is halted when a satisfactory fitness level is obtained. Evaluating robot fitness is very time-consuming; therefore, in the majority of cases a robot simulator is preferred to evaluate candidate controllers, which can finally be transferred to the real robot. The use of a hybrid approach combining both simulation and real robots [3, 10], together with the use of noisy sensors and actuators in simulation, minimizes the “reality gap” between behaviors in simulation and on real robots [9]. For an example of a behavior entirely evolved on a physical robot, refer to the work of Floreano & Mondada [8].
Fig. 3. The behaviors for the selection mechanism were developed using the Webots Simulator
Fig. 4. The behaviors wall-seek, wall-follow and cylinder-seek share the same Neural Network substrate
3 The Development of Behavioral Modules

The use of commercial robots and simulators has become common practice among researchers focusing on the development of control algorithms for these particular robots. The Khepera robot (Figure 2) is a small robot [11] that has been commonly used in evolutionary robotics. The detection of objects is possible through eight infrared sensors placed around its body. Although many simulators for the Khepera are offered as freeware, a commercial simulator may be preferred for evolutionary experiments. The development of foraging experiments in simulation often requires the control of a functional gripper. For instance, Webots is a 3D robot simulator [12] that fully supports the control of both the simulated and the real gripper turret attachment (Figure 3). In this work we use a global foraging behavior, which requires the robot to take cylinders from the center to the corners of an arena. Different basic behaviors are used: the three behaviors for traveling around the arena share the same neural topology, while the two behaviors for handling a cylinder were programmed as algorithmic routines. For the behaviors sharing the same neural substrate we employed a simple neural network architecture: a fully connected feedforward multilayer perceptron with no recurrent connections (Figure 4). Afferents are sent from the 6 neurons in the input layer to the 5 neurons in the middle layer; in turn, these middle neurons send projections to the 2 neurons in the output layer. The infrared sensor values of the Khepera range from 0 to 1023; the higher the value, the
closer the object. The readings of the six frontal sensors are binarized with a collision threshold th_c = 750. The use of binary inputs for the neural network facilitates the transfer of the controller to the robot, by adapting the threshold to the readings of the real sensors. The output of the neural network is scaled to ±20 for the DC motors. The genetic algorithm employs a direct encoding of the genotype as a vector c of 40 weights. Random initial values are generated for this vector, −1 < c_i < 1, and n = 100 neural controllers form the initial population G0. The two best individuals of a generation are copied as a form of elitism. Tournament selection, for each of the (n/2)−1 local competitions, produces two parents that breed a new individual using a single random crossover point with a probability of 0.5. The new offspring is affected by a mutation probability of 0.01. Individuals in the new offspring are evaluated for about 25 seconds in the robot simulator. Once the behavior has been properly evolved, the controller is transferred to the robot for a final adjustment of the collision threshold. The behavior for finding a wall (wall-seek) can be seen as a form of obstacle avoidance, because the arena has to be explored without bumping into an obstacle. The selection mechanism decides to stop this behavior when a wall has been reached. Sensor readings from walls and corners differ from the readings of infrared sensors close to a cylinder. The fitness formula for the obstacle-avoidance behavior in wall-seek (adapted from [13]) was

f_c1 = Σ_{i=0}^{3000} abs(ls_i) · (1 − ds_i) · (1 − max_ir_i)    (1)
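Equation (1) can be read as a per-iteration product of speed, straightness, and obstacle clearance, summed over a run. A minimal sketch of the computation follows; the sensor log format and the normalization of speeds and infrared readings to [−1, 1] and [0, 1] are illustrative assumptions, not the paper's exact scaling.

```python
def wall_seek_fitness(log):
    """Fitness of eq. (1): reward fast, straight motion away from obstacles.

    log: iterable of (left_speed, right_speed, max_ir) tuples, with wheel
    speeds normalized to [-1, 1] and max_ir (highest infrared reading)
    normalized to [0, 1] -- illustrative normalizations.
    """
    fitness = 0.0
    for left, right, max_ir in log:
        ls = abs(left + right) / 2.0   # linear speed (absolute value of the sum)
        ds = abs(left - right) / 2.0   # differential (angular) speed
        fitness += ls * (1.0 - ds) * (1.0 - max_ir)
    return fitness
```

Each factor independently zeroes the reward: spinning in place (ds = 1), standing still (ls = 0), or touching an obstacle (max_ir = 1) all contribute nothing.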
where, for iteration i, ls is the linear speed of both wheels (the absolute value of the sum of the left and right speeds), ds is the differential speed of both wheels (a measure of the angular speed), and max_ir is the highest infrared sensor value. A fitness formula like this rewards the fastest individuals, those that travel in a straight line while avoiding obstacles. On the other hand, the behavior of running parallel to a wall resembles a kind of obstacle avoidance, because the robot has to avoid obstacles, if possible, while running in a straight line close to a wall until a corner is found. The selection mechanism chooses to stop this behavior when a corner has been found. The fitness formula (adapted from [14]) employed for the behavior wall-follow was

f_c2 = f_c1 · (tgh)²    (2)
In this formula the tendency to remain close to walls (tgh), or thigmotaxis, is calculated as the fraction of the total time an individual spends close to a wall. Therefore, a fitness formula such as this selects the individuals that travel near the wall with an obstacle-avoidance behavior. The cylinder-seek behavior is a form of obstacle avoidance with a cylinder sniffer-detector. The robot body has to be positioned right in front of the cylinder if the collection of the cylinder is to occur. The cylinder-locating behavior shares the same architecture as the behaviors previously described, and its fitness formula was

f_c3 = f_c1 + K1 · cnear + K2 · cfront    (3)
A formula such as this selects individuals that avoid obstacles and reach cylinders at different orientations (cnear), and that are capable of orienting the robot body in a position where the gripper can be lowered and the cylinder collected (cfront). The constants K1 and K2, K1 < K2, are used to reward the right positioning of the robot in front of a cylinder. Due to the sequential nature of the cylinder-pickup and cylinder-deposit behaviors, a more pragmatic approach was employed: these behaviors were programmed as algorithmic routines with a fixed number of repetitions for clearing the space for lowering the arm, opening the claw, and moving the arm upwards and downwards. Once all the mentioned behaviors were evolved and designed, they were transferred to the Khepera robot.
Fig. 5. In the CASSF model, perceptual variables (e_i) form the input to the decision neural network. The output selection of the highest salience (s_i) is gated to the motors of the Khepera. Notice the busy-status signal (c_1) from behavior B1 to the output neuron.
4 Central Action Selection and Genetic Algorithms

In previous work, an effective model of central action selection was used to integrate the perceptions from the robot body sensors with the motor expression of the highest-bidding behavior [15]. The centralized model of action selection with sensor fusion (CASSF) builds a unified perception of the world at every step of the main control loop (Figure 5). The use of sensor fusion thus facilitates the integration of multiple non-homogeneous sensors into a single perception of the environment. The perceptual variables are used to calculate the urgency (salience) of a behavioral module to be executed. Furthermore, behavioral modules contribute to the calculation of their salience with a busy-status signal indicating a critical stage in which interruption should not occur. Therefore, the salience of a behavioral module is calculated by weighting the relevance of the information from the environment (in the form of perceptual variables) and the module's busy status. In turn, the behavior with the highest salience wins the competition and is expressed as motor commands sent directly to the motor wheels and the gripper. Next, we explain how the salience is computed using
hand-coded weights. Firstly, the perceptual variables wall_detector (e_w), gripper_sensor (e_g), cylinder_detector (e_c), and corner_detector (e_r) are coded from the typical readings of the different sensors. These perceptual variables form the context vector, constructed as e = [e_w, e_g, e_c, e_r], with e_w, e_g, e_c, e_r ∈ {1, 0}. Secondly, each of the five behaviors returns a current busy-status (c_i) indicating that ongoing activities should not be interrupted. Thirdly, the current busy-status vector is formed as c = [c_s, c_p, c_w, c_f, c_d], with c_s, c_p, c_w, c_f, c_d ∈ {1, 0}, for cylinder-seek, cylinder-pickup, wall-seek, wall-follow, and cylinder-deposit respectively. Finally, the salience (s_i) or urgency is calculated by adding the weighted busy-status (c_i · w_b) to the weighted context vector (e · [w_j^e]^T). Then, with w_b = 0.7, we have:

s_i = c_i · w_b + e · (w_j^e)^T    (4)

for

w_s^e = [0.0, −0.15, −0.15, 0.0]   (cylinder-seek)
w_p^e = [0.0, −0.15, 0.15, 0.0]    (cylinder-pickup)
w_w^e = [−0.15, 0.15, 0.0, 0.0]    (wall-seek)
w_f^e = [0.15, 0.15, 0.0, 0.0]     (wall-follow)
w_d^e = [0.15, 0.15, 0.0, 0.15]    (cylinder-deposit)
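The salience computation of eq. (4) followed by winner-takes-all selection can be sketched as follows; the weight values are the hand-coded ones listed above, while the example context and busy-status vectors are illustrative.

```python
# Hand-coded context weight vectors from eq. (4); rows follow the behavior order
# cylinder-seek, cylinder-pickup, wall-seek, wall-follow, cylinder-deposit.
WEIGHTS = [
    [0.0, -0.15, -0.15, 0.0],   # cylinder-seek
    [0.0, -0.15, 0.15, 0.0],    # cylinder-pickup
    [-0.15, 0.15, 0.0, 0.0],    # wall-seek
    [0.15, 0.15, 0.0, 0.0],     # wall-follow
    [0.15, 0.15, 0.0, 0.15],    # cylinder-deposit
]
W_B = 0.7  # busy-status weight

def select_behavior(context, busy):
    """Return (index, salience) of the winning behavior.

    context: [e_w, e_g, e_c, e_r] binary perceptual variables.
    busy:    [c_s, c_p, c_w, c_f, c_d] binary busy-status signals.
    """
    saliences = [
        busy[i] * W_B + sum(e * w for e, w in zip(context, WEIGHTS[i]))
        for i in range(len(WEIGHTS))
    ]
    winner = max(range(len(saliences)), key=saliences.__getitem__)
    return winner, saliences[winner]
```

Note that the busy-status weight (0.7) is large relative to the context weights, which strongly biases selection toward not interrupting a behavior that reports itself busy.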
The salience is calculated by the selection mechanism to choose the most relevant behaviors for the solution of the foraging task, which consists of taking cylinders from the center to the corners of an arena. The centralized model of selection implements winner-takes-all selection by allowing the most salient behavior to win the competition. The computation of the salience can be thought of as a decision neural network with an input layer of four neurons and an output layer of five neurons with the identity transfer function. The Khepera raw sensory information is fed into the neural network in the form of perceptual variables. Next, the input neurons distribute the variables to the output neurons. When its salience is above those of the other behaviors, the selected behavior sends a busy signal, copied to all five output neurons; since any of the five behavioral modules may be selected, the five behaviors add five more inputs to each output neuron. The behavioral modules themselves are defined as follows: cylinder-seek travels around the arena searching for food while avoiding obstacles; cylinder-pickup clears the space for grasping the cylinder; wall-seek locates a wall whilst avoiding obstacles; wall-follow travels next to a proximate wall; and cylinder-deposit lowers and opens an occupied gripper. In this paper we have used GAs to tune the weights of the decision network for these behaviors. Each of the five output neurons of the decision network weighs four perceptual variables plus five different busy signals. Therefore, the evolution of the decision network employs a direct encoding of the chromosome c of 45 weights. Random initial values are generated for the initial population G0 of n = 80 neural controllers. Elitism and tournament selection were used for the evolution of a behavior.
A new individual was spawned using a single random crossover point with a probability of 0.5. Individuals of the new offspring mutate with a probability of 0.01, and their fitness is evaluated for about 48 seconds.
The fitness formula for the weights of the decision network was

f_c4 = K1 · (f_c2 + f_wf + f_cf) + K2 · pkfactor + K3 · dpfactor    (5)
The weights of the decision network were evolved using, in the fitness formula (f_c4), the constants K1, K2 and K3 with K1
Fig. 6. Fitness is plotted across 25 generations. For each generation, the highest fitness of an individual was obtained as its maximum fitness over three trials under the same conditions, and the maximum fitness of all individuals was averaged as a measure of the population fitness. Individuals are rewarded more if they avoid obstacles, collect cylinders, and deposit cylinders close to the corners. The evolution is stopped when the maximum fitness stabilizes above a fitness value of 4000.
5 Experiments and Results

The foraging task was set in an arena with cylinders as simulated food. Communication from a computer host to the robot was provided through an RS-232 serial interface. In order to facilitate the use of the resistivity sensor in the gripper claw, the cylinders that simulated food were covered with foil paper. It should be noticed that in this paper a behavior is considered the joint product of an agent, an environment, and an observer. Therefore, a regular grasping-depositing pattern in the foraging task is the result of the selection of the behavioral types cylinder-seek, cylinder-pickup, wall-seek, wall-follow and cylinder-deposit, in that order. Commonly, this task consists of four grasping-depositing patterns of foiled cylinders; the ethogram in Figure 7 summarizes this task (with a time resolution in seconds). Collection patterns can be disrupted if, for example, the cylinder slips from the gripper or a corner is immediately found. Additionally, long search periods may occur if a cylinder is not located. Infrared sensors are noisy and present similar readings for different objects at different orientations. Similar sensor readings can be obtained when the robot is barely hitting a corner and when the robot is too close to a wall at a 45-degree angle. The latter explains the brief selection of the wall-seek behavior in the ethogram in Figure 7.
On the other hand, in Figure 8 we observe that the ethogram for the evolved decision network is formed by the behaviors cylinder-seek, cylinder-pickup, wall-seek and cylinder-deposit, with the wall-follow behavior never being selected. The use of a fitness function that shapes selection as a single pattern optimizes the selection of behavior in time and in the physical environment. Therefore, the execution of the wall-follow behavior is a feature that does not survive the process of evolution, even though the fitness function rewards those individuals that follow walls.
Fig. 7. Ethogram for a typical run of the hand-coded decision network. The behaviors are numbered as 1-cylinder-seek, 2-cylinder-pickup, 3-wall-seek, 4-wall-follow, 5-cylinder-deposit and 6-no action selected. Notice the regular patterns obtained for a collection of 4 cylinders with 3 cylinder deposits. The individual in this ethogram achieves 76% of the highest evolved fitness.
Fig. 8. Ethogram for a run of the evolved decision network. Behaviors are coded as in the previous ethogram. Here we observe that behavior patterns occur regularly; however, the behavior wall-follow is never selected. The number of collected cylinders is the same as the number of cylinders deposited (a total of 4). This individual achieves 99% of the highest evolved fitness.
Previously, we mentioned that a behavior should be considered the joint product of an agent, an environment, and an observer. However, there is an additional factor that should be taken into account: the fitness of the agent that is solving the foraging task, which ultimately alters the order of selection of the behaviors. In Figure 9, we observe that network-evolved individuals have the highest fitness (collecting 4 cylinders), although these individuals also have the worst fitness (collecting none of the cylinders). The evolved individuals excluded the selection of the wall-follow behavior. In contrast, hand-coded-network individuals have a similar collecting performance, mostly between 2 and 3 cylinders, all executing the five behaviors in the right order; however, these individuals fail to collect all 4 cylinders because of long cylinder-searching and wall-finding periods. The different selection of the five basic behaviors in hand-coded and evolved designs can be explained in terms of what Nolfi [16] establishes as the difference between the “distal” and “proximal” descriptions of behavior. The first comes from the observer's point of view, which inspires hand designs. A proximal description, on the other hand, is the agent's point of view from its sensory-motor systems, which ultimately defines the different reactions of the agent to particular sensory perceptions. An important consideration of Nolfi's work is that there
is no correspondence between the distal and proximal descriptions of behaviors. This should account for our explanation of why evolution finds different combinations for selection and the use of basic behaviors in order to obtain an improved overall foraging behavior.
Fig. 9. Plot of the decision fitness of 30 individuals. The secondary Y-axis shows the number of collected cylinders. We observe that although the evolved decision network presents the highest fitness, it also produces the worst individuals. In contrast, the hand-coded decision network holds a similar fitness for all its individuals.
6 Conclusions

The integration of evolution with a central action selection mechanism using non-homogeneous behaviors was carried out in this study. Additionally, behaviors related to exploring the arena were also evolved, whereas the sequential behaviors for handling the cylinders were programmed as algorithmic procedures. Exploration behavior was modeled as the first step, and the decision network as the second step, in the evolution of our model. Next, we compared the fitness of the evolved decision network with that of the hand-coded network using the same evolved and sequential behaviors. The evolved and the hand-coded decision networks present differences in their fitness and in their selection of behavior. The reason behind these differences lies in the distal and proximal descriptions of behavioral selection in our model. The proximal fitness function used for the evolved decision network evaluates a complex behavioral pattern as a single behavior, ruling out the selection of one of the behavioral types (wall-follow) modeled in the distal definition of the hand-coded decision network. As a result of our experiments, we conclude that, for any kind of behavioral module, it is the strength of its salience that finally determines its selection, and ultimately its fitness value. Finally, we propose that the integration of evolution and action selection partly remedies the artificial separation inherent in the distal description of behaviors. Furthermore, we believe that the use of redundant neural components sharing the same neural substratum, which may be built incrementally, should shed some light on the fabrication of biologically plausible models of action selection. However, the coevolution of the behaviors and the selection network has first to be explored.
Acknowledgment

This work has been sponsored by CONACyT-MEXICO grant SEP-2004-C01-45726.
References

1. P. Maes, How to do the right thing, Connection Science Journal, Vol. 1 (1989), no. 3, 291-323.
2. S. Nolfi, Evolving non-trivial behaviors on real robots: A garbage collection robot, Robotics and Autonomous Systems, vol. 22, 1997, 187-98.
3. S. Nolfi, D. Floreano, O. Miglino, F. Mondada, How to evolve autonomous robots: Different approaches in evolutionary robotics, Proceedings of the International Conference Artificial Life IV, Cambridge, MA: MIT Press, 1994.
4. A. K. Seth, Evolving action selection and selective attention without actions, attention, or selection, From Animals to Animats 5: Proceedings of the Fifth International Conference on the Simulation of Adaptive Behavior, Cambridge, MA: MIT Press, 1998, 139-47.
5. K.-J. Kim, S.-B. Cho, Robot action selection for higher behaviors with cam-brain modules, Proceedings of the 32nd International Symposium on Robotics (ISR), 2001.
6. B. Yamauchi, R. Beer, Integrating reactive, sequential, and learning behavior using dynamical neural networks, From Animals to Animats 3: Proceedings of the 3rd International Conference on Simulation of Adaptive Behavior, MIT Press/Bradford Books, 1994.
7. J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, 1992.
8. D. Floreano, F. Mondada, Evolution of homing navigation in a real mobile robot, IEEE Transactions on Systems, Man and Cybernetics 26 (1996), no. 3, 396-407.
9. S. Nolfi and D. Floreano, Evolutionary Robotics, The MIT Press, 2000.
10. J. Santos and R. Duro, Artificial Evolution and Autonomous Robotics (in Spanish), Ra-Ma Editorial, 2005.
11. F. Mondada, E. Franzi and P. Ienne, Mobile robot miniaturisation: A tool for investigation in control algorithms, Proceedings of the 3rd International Symposium on Experimental Robotics, Springer Verlag, 1993, 501-13.
12. Webots, http://www.cyberbotics.com, Commercial Mobile Robot Simulation Software, 2006.
13. D. Floreano and F. Mondada, Automatic creation of an autonomous agent: Genetic evolution of a neural-network driven robot, From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, MIT Press/Bradford Books, Cambridge, MA, 1994.
14. D. Bajaj and M. H. Ang Jr., An incremental approach in evolving robot behavior, The Sixth International Conference on Control, Automation, Robotics and Vision (ICARCV'2000), 2000.
15. F. M. Montes González, A. Marín Hernández and H. Ríos Figueroa, An effective robotic model of action selection, R. Marín et al. (Eds.): CAEPIA 2005, LNAI 4177 (2006), 123-32.
16. S. Nolfi, Using emergent modularity to develop control systems for mobile robots, Adaptive Behavior, Vol. 5 (1997), no. 3/4, 343-63.
A Hardware Architecture Designed to Implement the GFM Paradigm

Jérôme Leboeuf Pasquier and José Juan González Pérez

Departamento de Ingeniería de Proyectos, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Apdo. Postal 307, CP 45101, Zapopan, Jalisco, México
[email protected], [email protected]
Abstract. Growing Functional Modules (GFM) is a recently introduced paradigm conceived to automatically generate an adaptive controller, which consists of an architecture based on interconnected growing modules. When running, the controller builds its own representation of the environment through acting and sensing. Due to this deep-rooted interaction with the environment, robotics is the field of application par excellence. This paper describes a hardware architecture designed to satisfy the requirements of the GFM controller and presents the implementation of a simple mushroom-shaped robot.
1 Introduction

Growing Functional Modules (GFM), introduced in a previous paper [1], is a prospective paradigm founded on the epigenetic approach, first introduced in Developmental Psychology [2] and presently applied to autonomous robotics. This paradigm allows the design of an adaptive controller and its automatic generation, as described in the following section. An architecture based on interconnected modules is compiled to produce an executable file that constitutes the controller. Each module corresponds to an autonomous entity able to generate a specific and suitable list of commands to satisfy a set of input requests. Triggering the effective commands results from learning, which is obtained by comparing the input requests with their corresponding feedback. As a consequence, at each instant, the internal structure of the module, commonly built as a dynamic network of cells, is adapted to fit the correlation between the generated commands and the obtained feedback [3]. The adaptation of the internal structure determines the type of module: each type integrates a particular set of growing mechanisms according to the category of tasks it should solve. For example, first the RTR-Module and then its improved version, the RTR2-Module, were specialized in performing basic automatic control; their growing internal structures are described in [4] and [5] respectively. The graphic representation of the RTR2-Module presented in Figure 1 shows the input-output values along with the internal dynamic structure.

A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1171 – 1178, 2006. © Springer-Verlag Berlin Heidelberg 2006
Fig. 1. Illustration of the RTR2-Module: the feedback [θ, δθ] provided in response to a command ±δf aimed at satisfying a request θ0 drives the growth of the dynamic internal structure
According to the previous description, the consistency of the feedback is a key element in producing consistent learning. In particular, considering virtual applications like 3D simulation, only those offering high-quality rendering could be connected to a GFM controller. Evidently, the kind of feedback desired for a GFM controller may only be obtained by sensing the real world; therefore, robotics is an ideal field of application for the current paradigm. Concerning software, portability and “genericity” have been a matter of particular emphasis. For example, some potential improvements to the previously mentioned RTR-Module have been ignored to allow this module to serve a wider range of applications. Similarly, a common protocol has been established to connect every GFM system to its application. In particular, the choice of using an external system and its associated communication link, rather than an embedded one, is justified by the necessity of studying the behavior of the GFM controller. Thus, the main task of the embedded control card is restricted to a cycle that includes the following steps:

• waiting for a command from the GFM controller,
• executing it, which commonly consists of applying the corresponding shift to the designated actuator,
• reading all sensors and computing their values,
• sending back a formatted list of values to the controller.

Despite its apparent simplicity, implementing the requirements of the embedded controller in hardware presents several difficulties; they are described in the next sections along with their potential solutions.
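The four-step cycle above can be sketched as follows. This is a minimal host-side simulation of the card's main loop; the command format, the actuator/sensor callbacks, and the termination signal are illustrative assumptions, not the actual GFM protocol.

```python
def control_cycle(recv_command, apply_shift, read_sensors, send_values):
    """Illustrative structure of the embedded card's main loop.

    recv_command: () -> (actuator_id, shift), or None to stop (assumed format)
    apply_shift:  (actuator_id, shift) -> None
    read_sensors: () -> list of raw readings
    send_values:  (list of values) -> None
    """
    while True:
        command = recv_command()            # 1. wait for a command from the controller
        if command is None:                 # illustrative termination signal
            break
        apply_shift(*command)               # 2. apply the shift to the designated actuator
        raw = read_sensors()                # 3. read all sensors...
        values = [round(v, 2) for v in raw] #    ...and compute/format their values
        send_values(values)                 # 4. send back a formatted list of values
```

Keeping the card's job this narrow is what allows the learning itself to run on the external GFM controller, as discussed above.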
2 General Specifications

First of all, the choice of a control card to be embedded in several simple robots must satisfy some obvious specifications, including low energy consumption, reduced size and affordable cost. Besides, flexibility, reliability and universality are other significant criteria, as most users will be students. These prior considerations restrict the investigation to a small number of well-proven and accessible products. On the other hand, given that the GFM controller is deliberately not embedded, and considering the simplicity of the processing cycle described in the previous section, the power of the processor could be considered non-critical; nevertheless, GFM applications involve many communications with actuators and sensors and might ultimately include audio or video signal pre-processing. Of the two, the audio processing would by far use the most CPU cycles, as it will include at least one Fourier transform. An alternative consists of implementing this pre-processing on an extra card and thus reducing the requirements on the main control card. Next, some control applications, like the inverted pendulum presented in [4] and [5], involve real-time processing. In such circumstances, timing is determined by the application and not by the controller: i.e., after applying an input command to the corresponding actuator, the application waits for a predefined period of time before reading the sensors and sending their values back to the controller. Consequently, an additional criterion is the ability of the control card to manage real-time processing. Until recently, the only solution for managing all these constraints jointly was a Xilinx card [6], but its programming proves very difficult due to its dynamically reconfigurable architecture, programmed with the VHSIC Hardware Description Language (VHDL).
Furthermore, and in accordance with the protocol requirements described in the next section, the Xilinx card does not include pulse width modulation, I2C or even RS-232 hardware communication ports; finally, it does not incorporate analog-digital (A/D) converters. Recently, Microchip proposed a new control card called PicDem HPC, built around the PIC18F8722 microcontroller [7]; a potentially satisfactory solution in view of our requirements. Additionally, the availability of a C compiler [8] reduces development time and effort. In the following sections, the study and development of a solution based on the PicDem HPC control card is presented.
3 Handling Actuators

Despite the wide offering of actuators on the market, the electrical actuators commonly employed in GFM robotic applications may be classified into two categories: servomotors and direct-current (DC) motors. Servomotors are traditionally controlled by pulse width modulation (PWM), which consists in sending a pulse with a predefined frequency to the actuator; the length of the pulse indicates the desired position. The PIC microcontroller offers two possible implementations of PWM. The first one is resolved through hardware using the embedded PWM module, which implies configuring two specific registers: the period of the signal is set in PR2
while the pulse width is given by the T2CON value. Then, the instruction CCPxCON allows the selection of the desired DIO pin. Nonetheless, this microcontroller only allows configuring five such ports. Furthermore, the GFM paradigm requires moving all servomotors step by step, which makes recomputing the T2CON value each time a very expensive operation, given the potential number of servomotors involved. Consequently, a better solution is to implement the servomotors' control in software. A simple driver (see the code listing in figure 2) is programmed to emulate the PWM: this driver computes the next position by adding or subtracting a step, taking into account that moves produced by a GFM controller are always generated step by step. Thus, the pulse is generated with the BSF instruction, which activates a specific pin for a duration corresponding to the high-pulse length. Controlling all servos at the same time requires multiplexing this process; such a task is possible because the refresh frequency of the servos is much lower than the frequency of these pulses. As the applications use different kinds of servomotors, a library containing a specific driver for each one has been developed.

INIT    BSF   SERVO,0      ;Pin servo On
        CALL  MINIM_ON     ;Call minimum On
        CALL  DELAY_ON     ;Call function delay
        BCF   SERVO,0      ;Pin servo Off
        CALL  MINIM_OFF    ;Call minimum Off
        CALL  DELAY_OFF    ;Call function delay
        GOTO  INIT         ;End of cycle
Fig. 2. Code listing of the driver in charge of controlling the PWM port
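The multiplexing argument above can be made concrete with a small timing sketch. The following Python model is purely illustrative (it is not PIC firmware): the 20 ms refresh period and the 1.0-2.0 ms pulse range are typical hobby-servo values assumed here, not figures taken from the paper.

```python
REFRESH_MS = 20.0     # assumed servo refresh period
MIN_PULSE_MS = 1.0    # assumed high-pulse width for position 0 degrees
MAX_PULSE_MS = 2.0    # assumed high-pulse width for position 180 degrees

def pulse_width_ms(position_deg):
    """Map a servo position (0-180 degrees) to a high-pulse length."""
    return MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * position_deg / 180.0

def schedule_cycle(positions_deg):
    """Return (start_ms, width_ms) slots for one refresh cycle, sending
    each servo's pulse one after another: the multiplexing step, which
    works because the sum of all pulses fits inside one refresh period."""
    slots, t = [], 0.0
    for pos in positions_deg:
        w = pulse_width_ms(pos)
        slots.append((t, w))
        t += w
    if t > REFRESH_MS:
        raise ValueError("too many servos for one refresh period")
    return slots
```

With three servos at 0, 90 and 180 degrees, the three pulses (1.0, 1.5 and 2.0 ms) occupy only 4.5 ms of the 20 ms period, illustrating why several servos can share one software PWM loop.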
The second category, DC motors, cannot be directly controlled by the card because of the high current they require in input. Thus, two digital control pins are connected to an H-bridge that amplifies the voltage and current to satisfy the motor's requirements and also protects the microcontroller from current peaks. The H-bridge is a classical electronic circuit consisting of four transistors placed in a diamond configuration. The two digital pins indicate the desired direction of the motor; optionally, a third pin may be used to specify the velocity. This third pin acts as a potentiometer, controlling the voltage and current by means of a pulse width modulation signal sent to a fixed output pin.
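A minimal sketch of the H-bridge interface just described: two digital pins select the direction, and an optional third (PWM) pin sets the speed. The pin names and the truth table below are illustrative assumptions, not the card's actual wiring.

```python
def h_bridge_command(direction, speed=1.0):
    """Return (in1, in2, pwm_duty): levels for the two direction pins
    and a duty cycle in [0, 1] for the optional speed pin.
    The truth table (forward/reverse/brake) is a common H-bridge
    convention assumed here for illustration."""
    table = {"forward": (1, 0), "reverse": (0, 1), "brake": (0, 0)}
    if direction not in table:
        raise ValueError("unknown direction")
    if not 0.0 <= speed <= 1.0:
        raise ValueError("speed must be in [0, 1]")
    in1, in2 = table[direction]
    return in1, in2, speed
```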
4 Handling Sensors

Compared with traditional robots, epigenetic ones require more sensors, since feedback is essential to expand the growing internal structure of each module; this means higher requirements in terms of communication ports, either analog or digital. To connect basic digital sensors, digital input-output (DIO) pins are required; the proposed card offers seventy of them. Digital sensors, mainly contact and infrared switches, are directly connected to the DIO pins, while
the signal of analog sensors, including for example photocells, rotation sensors, and pressure or distance sensors, must first be digitalized through an analog-digital (A/D) converter. The traditional solution, an external A/D converter, offers a good resolution but extends the hardware and uses many pins of the control card. A better and more obvious solution consists of using one of the sixteen embedded A/D converters, which are configured through the three registers ADCONx and offer a sufficient precision of ten bits. The first register, ADCON0, indicates the input pin for the analog signal and also acts as a switch; the second register, ADCON1, specifies which pins are configured as analog; and finally, the third register, ADCON2, selects the conversion clock.
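As a side note on the ten-bit precision mentioned above: such a converter returns integer counts in [0, 1023], which map back to a voltage as sketched below. The 5 V reference is a hypothetical value chosen for illustration, not a figure given in the text.

```python
V_REF = 5.0      # assumed analog reference voltage
ADC_MAX = 1023   # 2**10 - 1 for a 10-bit converter

def counts_to_volts(counts):
    """Convert a raw 10-bit A/D reading to volts."""
    if not 0 <= counts <= ADC_MAX:
        raise ValueError("out of range for a 10-bit converter")
    return V_REF * counts / ADC_MAX
```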
5 Communication Requirements

Handling actuators and basic sensors does not fulfill all the requirements; in practice, faster and more sophisticated communication protocols must be considered. First, a high-speed port is necessary to communicate with the controller, considering that feedback may include video and audio signals. The control card offers two embedded RS-232 ports with a maximum speed of 115,200 bits per second. For a video signal, this implies using a low resolution or a slow frame rate. Moreover, during the tests, the maximum reliable communication speed turned out to be 57,600 bits per second. Consequently, the use of a single control card must be discarded when implementing complex robots with higher signal requirements.
Fig. 3. I2C connection diagram to communicate with the mentioned peripherals
Nevertheless, there is still the need to connect a compass, an ultrasonic sensor and, more recently, a camera by means of an I2C communication port, a standard designed by Philips Corporation® in 1980 [9]. The I2C protocol uses only two lines: the first transmits the clock signal (SCL) while the other carries bidirectional data (SDA); it allows connecting several masters with several slaves, the only restriction being that all masters must share the same clock signal. Three transmission speeds are available: standard mode (100 kbits per second), fast mode (400 kbits per second) and high-speed mode (3.4 Mbits per second). Opportunely, the control card includes an I2C port that allows connecting the previously mentioned peripherals; the resulting diagram is given in figure 3.
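For a rough feel of the three speed modes, each transferred byte costs about nine clock periods (eight data bits plus the acknowledge bit). The estimate below ignores start/stop conditions and addressing overhead, so it is an approximation, not an exact bus timing.

```python
# Nominal clock rates of the three I2C modes listed above.
I2C_MODES_HZ = {"standard": 100_000, "fast": 400_000, "high_speed": 3_400_000}

def transfer_time_s(n_bytes, mode="standard"):
    """Approximate time to move n_bytes of payload: 9 clocks per byte."""
    return 9 * n_bytes / I2C_MODES_HZ[mode]
```

For example, moving a 1000-byte frame in fast mode takes roughly 22.5 ms, which shows why a camera over I2C constrains resolution or frame rate.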
Fig. 4. View of the disassembled components for the mushroom shaped robot including actuators and sensors
6 Implementing a Mushroom Shaped Robot

To illustrate the proposed solution, this section describes the implementation of this architecture in the case of a mushroom shaped robot. A virtual version of this robot and its associated controller has been described in [10]. The real version differs mainly in the presence of an extra actuator on the foot of the robot and some extra sensors situated on the leg, intended to detect potential causes of damage. Figure 4 shows a view of all the components of the robot, including three servomotors, a DC motor and the following sensors: two pressure sensors, three rotation sensors, a photocell, two contact sensors, an infrared switch and a compass. The block diagram of the embedded control board is given in figure 5. First, a serial communication port allows the card to communicate via the standard protocol with the personal computer that hosts the GFM controller. Secondly, four actuators, three servomotors and one DC motor, communicate through, respectively, the PWM ports and two DIO pins. Then, the photocell, the rotation sensors and the pressure sensors are connected through six A/D converters. The contact and infrared switches
Fig. 5. Block diagram of the embedded control board corresponding to the mushroom shaped robot (the PIC18F8722 connects to the GFM controller via RS-232, to the compass via the I2C lines SDA/SCL, to the contact and infrared switches via DIO pins, to the rotation, photocell and pressure sensors via A/D converters, and to the servomotors and DC motor via PWM and DIO pins)
are directly sensed through three DIO pins. Finally, the compass uses the I2C communication port. The complete architecture may therefore be tested by means of a remote-control application executed on the external computer, before using the GFM controller.
7 Conclusions

The purpose of this paper is to describe a novel architecture developed to implement the embedded control board connected to an autonomous controller based on the Growing Functional Modules paradigm. The main challenge induced by this paradigm is the higher number of communication ports required by the more extensive feedback from the environment. The proposed solution using a PicDem HPC satisfies the initial requirements of a simple application, i.e., an application without audio-video signals but with a high number of digital inputs-outputs and analog inputs. As an illustration, this architecture has been applied to handle the actuators and sensors of robots like the mushroom shaped one described in the previous section, which has been successfully implemented. When facing robotic applications with a larger number of actuators and sensors, a more complex architecture involves the same card and drivers; nevertheless, several subsets of sensors and actuators are handled by subsystems hosted on more rudimentary cards powered by the same microcontroller. This solution and its application to a four-legged robot will be described in a forthcoming paper. In the case of robots requiring high-level signal processing such as voice and video, none of the previous architectures offers sufficient capacity, mainly because of the high communication rate with the controller but also because of the elevated processing requirements. Such requirements could be satisfied by using a small single-board computer, which we are presently investigating.
References

1. Leboeuf, J.: Growing Functional Modules, a Prospective Paradigm for Epigenetic Artificial Intelligence. In: Lecture Notes in Computer Science 3563, Larios, Ramos and Unger (eds.), Springer (2005) 465-471
2. Piaget, J.: Genetic Epistemology (Series of Lectures). Columbia University Press, New York (1970)
3. Leboeuf, J.: A Self-Developing Neural Network Designed for Discrete Event System Autonomous Control. In: Advances in Systems Engineering, Signal Processing and Communications, Mastorakis, N. (ed.), WSEAS Press (2002) 30-34
4. Leboeuf, J.: A Growing Functional Module Designed to Perform Basic Real Time Automatic Control. Publication pending in Lecture Notes in Computer Science, Springer (2006)
5. Leboeuf, J.: Improving the RTR Growing Functional Module. Publication pending in Proceedings of IECON, Paris (2006)
6. CoolRunner II CPLD Family. Xilinx internal documentation (2006)
7. Microchip: PIC18F8722 Family Data Sheet, 64/80-Pin, 1 Mbit, Enhanced Flash Microcontroller with 10-bit A/D and NanoWatt Technology (2004)
8. MPLAB C18 C Compiler User's Guide. Microchip Technology Inc. internal documentation (2005)
9. Philips Corporation: I2C Bus Specification, version 2.1, internal documentation (2000)
10. Leboeuf, J.: Applying the GFM Prospective Paradigm to the Autonomous and Adaptive Control of a Virtual Robot. In: Lecture Notes in Artificial Intelligence 3789, Springer (2005) 959-969
Fast Protein Structure Alignment Algorithm Based on Local Geometric Similarity

Chan-Yong Park1, Sung-Hee Park1, Dae-Hee Kim1, Soo-Jun Park1, Man-Kyu Sung1, Hong-Ro Lee2, Jung-Sub Shin2, and Chi-Jung Hwang2

1 Electronics and Telecommunications Research Institute, 161 Gajung, Yusung, Daejeon, Korea {cypark, sunghee, dhkim98, psj, mksung}@etri.re.kr
2 Dept. of Computer Science, Chung Nam University (CNU), Daejeon, Korea {hrlee, iplsub, cjhwang}@ipl.cnu.ac.kr
Abstract. This paper proposes a novel fast protein structure alignment algorithm and its application. Because it is known that the functions of a protein are derived from its structure, measuring the structural similarity between two proteins can be used to infer their functional closeness. In this paper, we propose a 3D chain code representation for fast measurement of the local geometric similarity of proteins and introduce a backtracking algorithm for efficiently joining similar local substructures. A 3D chain code, a sequence of the directional vectors between the atoms in a protein, captures the local shape of the protein. After constructing pairs of similar substructures based on local similarity, we perform the protein alignment by joining the similar substructure pairs through a backtracking algorithm. This method has particular advantages over previous approaches: our 3D chain code representation is more intuitive, and our experiments show that the backtracking algorithm is faster than dynamic programming in the general case. We have designed and implemented a protein structure alignment system based on our protein visualization software (MoleView). The experiments show rapid alignment with precise results.
A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1179-1189, 2006. © Springer-Verlag Berlin Heidelberg 2006

1 Introduction

Since it is known that the functions of a protein might be derived from its structure, functional closeness can be inferred by measuring the structural similarity between two proteins [1]. Therefore, fast structural comparison methods are crucial in dealing with the increasing amount of protein structural data. This paper proposes a fast and efficient method of protein structure alignment. Many structural alignment methods for proteins have been proposed [2, 3, 4, 5, 6] in recent years, among which distance matrices and vector representations are the most commonly used. Distance matrices, also known as distance plots or distance maps, contain all the pair-wise distances between alpha-carbon atoms, i.e. the Cα atoms of each residue [3]. This method has critical weak points in terms of its computational complexity and its sensitivity to errors in the global optimization of the alignment. Another research approach represents a protein structure as vectors of the protein's secondary
structural elements (SSEs; namely α-helices and β-strands). In this method, protein structures are simplified as vectors for efficient searching and substructure matching [5]. However, this approach suffers from the relatively low accuracy of SSE alignments and, in some cases, fails to produce an SSE alignment due to the lack of SSEs in the input structures. A major drawback of these approaches is that they need to perform an exhaustive sequential scan of a structure database to find structures similar to a target protein, which makes all previous methods infeasible for large structure databases such as the PDB [4]. This paper is organized as follows. In Section 2 a new alignment algorithm is proposed: we introduce the 3D chain code and apply a backtracking algorithm for protein alignment. In Section 3 we report alignment results in terms of RMSD and computation time.
2 The Proposed Protein Structure Alignment Algorithm

The algorithm comprises four steps (Figure 1). Step 1: We compute a 3D chain code for the 3-dimensional information of the protein. Since a protein chain resembles a thread, we regard it as one; we then convert the thread into a sequence of progressive direction vectors and use the angles of these vectors as local features. This method basically exploits the local similarity of the two proteins. Step 2: For local alignment of the two proteins, we compare each of the 3D chain code pairs. If two 3D
Fig. 1. Overall algorithm steps: generate the 3D chain codes of proteins A and B (Step 1); compare the chain codes and construct the similarity map (Step 2); find and merge SSPs (Step 3); join the SSPs into the final alignment (Step 4)
chain code pairs are similar, we plot a dot on the similarity map. After finishing the comparison of the two 3D chain code pairs, we obtain a set of similar substructure pairs (SSPs). Step 3: For fast calculation, we merge SSPs using secondary structure information of the proteins. Step 4: We apply a backtracking algorithm to join the SSPs, bridging the gaps between consecutive SSPs, each with its own score. In this section, we provide a detailed description of the new algorithm and its implementation.

2.1 3D Chain Code

The protein structure data are obtained from the Protein Data Bank [4]. For each residue of the protein we obtain the 3D coordinates of its Cα atom from the PDB file. As a result, each protein is represented by approximately equidistant sampling points in 3D space. To build the 3D chain code, we regard four consecutive Cα atoms as a set (Fig. 2) [18]. We compute a homogeneous coordinate transform that creates a new coordinate frame (u, v, n) from an up vector (Cα_i, Cα_{i+1}) and a directional vector (Cα_{i+1}, Cα_{i+2}) as the new coordinate axes. The homogeneous coordinate transform is:

T = \begin{pmatrix} R_{11} & R_{12} & R_{13} & 0 \\ R_{21} & R_{22} & R_{23} & 0 \\ R_{31} & R_{32} & R_{33} & 0 \\ T_1 & T_2 & T_3 & 1 \end{pmatrix}    (1)
The directional vector Dir = (R_{31}, R_{32}, R_{33}) is:

R_{31} = \frac{x_3 - x_2}{v}, \quad R_{32} = \frac{y_3 - y_2}{v}, \quad R_{33} = \frac{z_3 - z_2}{v}, \quad v = \sqrt{(x_3 - x_2)^2 + (y_3 - y_2)^2 + (z_3 - z_2)^2}    (2)
The up vector Up = (R_{21}, R_{22}, R_{23}) is:

Up = Up_w - (Up_w \cdot Dir)\, Dir, \quad Up_w = (x_1 - x_2,\; y_1 - y_2,\; z_1 - z_2)    (3)

The right vector R = (R_{11}, R_{12}, R_{13}) is:

R = Up \times Dir    (4)

The translation vector T = (T_1, T_2, T_3) is:

T = (-x_3,\; -y_3,\; -z_3)    (5)
The transform T is applied to Cα_{i+3} to obtain a transformed point Cα'_{i+3} = (x_t, y_t, z_t). Then, we convert Cα'_{i+3} to spherical coordinates. The conversion from Cartesian to spherical coordinates is as follows:
r = \sqrt{x_t^2 + y_t^2 + z_t^2}, \quad \theta = \tan^{-1}\!\left(\frac{y_t}{x_t}\right) \; (-\pi < \theta < \pi), \quad \phi = \cos^{-1}\!\left(\frac{z_t}{r}\right) = \cos^{-1}(z_t) \; (-\pi < \phi < \pi)    (6)
For protein structure matching, the Cα atoms along the backbone can be considered as equally spaced because of the consistency of chemical bond formation. Since the polygonal length between Cα atoms is the same, we set r = 1 in the spherical coordinates. Following these steps, the 3D chain code CC_A of protein A is created for a protein chain:

CC_A = \{\{\phi_1, \theta_1\}, \{\phi_2, \theta_2\}, \ldots, \{\phi_n, \theta_n\}\}    (7)
where n is the total number of amino acids of protein A minus 3.

Fig. 2. The 3D chain code (four consecutive Cα atoms, the local frame (u, v, n), and the spherical coordinates r, φ, θ of the fourth atom)
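Equations (1)-(7) can be sketched in runnable form as follows. This is our illustrative Python reading of the construction (with `atan2`/`acos` standing in for the inverse trigonometric functions), not the authors' implementation.

```python
import math

def chain_code_entry(p1, p2, p3, p4):
    """One (phi, theta) entry of the 3D chain code from four consecutive
    C-alpha positions: build the (u, v, n) frame at Ca(i+2), express
    Ca(i+3) in it, then convert to spherical angles with r = 1."""
    def sub(a, b):   return tuple(a[k] - b[k] for k in range(3))
    def dot(a, b):   return sum(a[k] * b[k] for k in range(3))
    def cross(a, b): return (a[1]*b[2] - a[2]*b[1],
                             a[2]*b[0] - a[0]*b[2],
                             a[0]*b[1] - a[1]*b[0])
    def unit(a):
        n = math.sqrt(dot(a, a))
        return tuple(x / n for x in a)

    dirv = unit(sub(p3, p2))                      # Dir, Eq. (2)
    upw = sub(p1, p2)                             # Up_w, Eq. (3)
    up = unit(tuple(upw[k] - dot(upw, dirv) * dirv[k] for k in range(3)))
    right = cross(up, dirv)                       # R = Up x Dir, Eq. (4)

    rel = sub(p4, p3)                             # translate by -Ca(i+2), Eq. (5)
    xt, yt, zt = dot(right, rel), dot(up, rel), dot(dirv, rel)
    r = math.sqrt(xt*xt + yt*yt + zt*zt)          # Eq. (6)
    theta = math.atan2(yt, xt)
    phi = math.acos(zt / r)
    return phi, theta
```

A straight continuation of the chain (Cα_{i+3} prolonging the Cα_{i+1}→Cα_{i+2} direction) yields φ = 0, which matches the intuition that φ measures the bend of the backbone.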
2.2 Finding the Similar Substructure Pair Set

Because the 3D chain code represents a relative direction over four atoms of a protein, we can compare the local similarity of two proteins by comparing their 3D chain codes. Given two proteins, we construct a similarity map, which represents how well the two proteins align together. The entry D(i, j) of the similarity map denotes the similarity between the 3D chain code values of the i-th residue of protein A ({\phi_i, \theta_i}) and the j-th residue of protein B ({\phi_j, \theta_j}), and is defined by the following equation:

D(i, j) = \sqrt{(\phi_i - \phi_j)^2 + (\theta_i - \theta_j)^2}    (8)
This measure is basically the Euclidean distance. After calculating D(i, j) for every i and j, we mark in the similarity map the entries whose value is below an angular threshold Td; we use Td = 10 degrees in our experiments. Figure 3 shows an example of a similarity map for the 3D chain codes of two particular proteins, 1HCL and 1JSU:A. Using this similarity map, our goal is to find all SSPs in the map. An SSP is represented as a diagonal line in the map. To find an SSP, we locate a first element D(i, j) with a value below Td and then examine the neighboring elements D(i+1, j+1) and D(i-1, j-1); the procedure is repeated while the next element stays below Td. This process can be viewed as finding diagonal lines in the similarity map. Each SSP found is denoted SSP_k^l(i, j) (Fig. 4):

SSP_k^l(i, j) = \{ \{\{\phi_i, \theta_i\}, \ldots, \{\phi_{i+l}, \theta_{i+l}\}\},\; \{\{\phi_j, \theta_j\}, \ldots, \{\phi_{j+l}, \theta_{j+l}\}\} \}    (9)

(k is the index of the SSP, l is its length, i is the index in protein A, j is the index in protein B).
Fig. 3. The similarity map of 1HCL and 1JSU:A
Fig. 4. The k-th SSP_k^l(i, j) in the similarity map (a diagonal run pairing residues i..i+l of protein A with residues j..j+l of protein B)
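The map construction of Eq. (8) and the diagonal search for SSPs can be sketched as follows; the (i, j, l) triple encoding and the set-based map are our illustrative choices, not the paper's data structures.

```python
import math

def similarity_map(cc_a, cc_b, td_deg=10.0):
    """Thresholded similarity map of Eq. (8): keep a dot at (i, j) when
    the Euclidean distance between the chain-code entries of residue i
    of protein A and residue j of protein B is below Td."""
    td = math.radians(td_deg)
    return {(i, j)
            for i, (pa, ta) in enumerate(cc_a)
            for j, (pb, tb) in enumerate(cc_b)
            if math.hypot(pa - pb, ta - tb) < td}

def find_ssps(dots):
    """Extract SSPs as maximal diagonal runs of dots; each SSP is
    returned as a triple (i, j, l) matching the notation of Eq. (9)."""
    ssps = []
    for i, j in sorted(dots):
        if (i - 1, j - 1) in dots:          # not the start of a diagonal
            continue
        l = 0
        while (i + l + 1, j + l + 1) in dots:
            l += 1
        ssps.append((i, j, l))
    return ssps
```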
2.3 Merging Similar Substructure Pairs

In the previous section we found many SSPs. Because the computation time of the protein alignment depends on the number of SSPs, we merge certain SSPs into a single SSP.
In the similarity map, we find rectangular regions composed of many SSPs. SSPs that share the same secondary structure cause these rectangular shapes (Figure 5). For example, if protein A and protein B contain the same α-helix, they have a similar geometric structure at each full turn of the helix; the same holds for β-strands. In these cases, we merge the SSPs belonging to the same secondary structure into a single SSP. The similarity map after merging is shown in Figure 6.

Fig. 5. The rectangular shape in the similarity map (protein A in green, protein B in red)
Fig. 6. The similarity map of 1HCL and 1JSU:A after merging SSPs
2.4 Joining Similar Substructure Pairs

In this section, we find the optimal set of SSPs, which describes a possible alignment of protein A with protein B. We apply a modified backtracking algorithm [15] for joining SSPs. The backtracking algorithm is a refinement of the brute-force approach, which systematically searches for a solution to a problem from among all the available
options. It does so by assuming that the solutions are represented by vectors (s1, ..., sm) of values and by traversing, in a depth-first manner, the domains of the vectors until the solutions are found. When invoked, the algorithm starts with an empty vector. At each stage it extends the partial vector with a new value. On reaching a partial vector (s1, ..., si) which, according to the promising function, cannot lead to a solution, the algorithm backtracks by removing the trailing value from the vector, and then proceeds by trying to extend the vector with alternative values. The traversal of the solution space can be represented by a depth-first traversal of a tree. We represent the SSPs as nodes (vi) of the state space tree. The simple pseudo code is shown in figure 7.

void backtrack(node v)   // A SSP is represented as a node
{
    if (promising(v)) {
        if (there is a solution at node v)
            write solution;
        else
            for (each child node u of node v)
                backtrack(u);
    }
}
Fig. 7. The backtracking algorithm
We use a connectivity value for each SSP as the promising function. If two consecutive SSPs (SSP_k and SSP_{k+1}) have a similar 3D rotation and translation, i.e. the RMSD after applying the parent's transform is below a threshold, the promising function returns true. The pseudo code is shown in figure 8.

bool promising(node v)
{
    Transform T = parentNode().SSP1.GetTransform();
    v.SSP2.apply(T);
    double f = RMSD(v.SSP1, v.SSP2);
    return (f > threshold) ? FALSE : TRUE;
}
Fig. 8. The promising function in the backtracking algorithm
The root node of the tree is the first SSP. After running this algorithm, many candidate solutions are established. We calculate the RMSD value of each solution offered and select the solution with the minimum RMSD value.
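The joining step of figures 7-8 can be sketched in runnable form. The real promising function applies the parent SSP's rigid transform and thresholds the RMSD; as a stand-in, this sketch calls two SSPs compatible when they are ordered along both chains and their diagonal offsets differ by at most `max_shift` — an illustrative simplification of the geometric test, not the paper's actual criterion.

```python
def join_ssps(ssps, max_shift=2):
    """ssps: list of (i, j, length) triples; depth-first backtracking
    over chains of compatible SSPs, keeping the chain that covers the
    most residues (a simplified stand-in for the minimum-RMSD choice)."""
    def promising(prev, nxt):
        (i1, j1, l1), (i2, j2, l2) = prev, nxt
        ordered = i2 > i1 + l1 and j2 > j1 + l1      # no overlap, both chains
        return ordered and abs((j2 - i2) - (j1 - i1)) <= max_shift

    best = []
    def backtrack(chain):
        nonlocal best
        if sum(l + 1 for (_, _, l) in chain) > sum(l + 1 for (_, _, l) in best):
            best = list(chain)                       # record current solution
        for s in ssps:
            if not chain or promising(chain[-1], s):
                chain.append(s)
                backtrack(chain)
                chain.pop()                          # backtrack step

    backtrack([])
    return best
```

Termination is guaranteed because each extension strictly increases the start index along protein A, so every chain is finite.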
3 Implementation and Results

We have tested our algorithm within the MoleView visualization tool (Figure 9). MoleView is a Win2000/XP-based protein structure visualization tool. MoleView was designed
to display and analyze the structural information contained in the Protein Data Bank (PDB), and can be run as a stand-alone application. MoleView is similar to programs such as VMD, MolMol, WebLab, Swiss-PdbViewer, MolScript, RasMol, Qmol [16] and Raster3D [17], but it is optimized for fast, high-quality rendering on current PC-installed video cards with an easy-to-use user interface.
Fig. 9. Screenshots of MoleView: (a) stick model; (b) secondary structure; (c) ball-and-stick model and secondary structure; (d) close-up of (c)
Our empirical study of the protein structure alignment system using the 3D chain code led to very encouraging results. Figure 10 shows the alignment result for proteins 1HCL_ and 1JSU_A. These proteins are cyclin-dependent protein kinases: the uncomplexed monomer (1HCL:_) in the open state and the complex with cyclin and p27 (1JSU:A) in the closed state. While the sequences of the uncomplexed and complexed states are almost identical, with 96.2% homology, there are significant conformational differences, found notably in the active site. The RMSD of the two proteins is 1.70 and the alignment time is 0.41 sec. Figure 11 shows the alignment result for proteins 1WAJ_ and 1NOY_A; these proteins are DNA polymerases. The residues matched are [7,31]-[63,78]-[87,102]-[104,119]-[128,253]-[260,297]-[310,372] of protein 1WAJ and [6,30]-[60,75]-[84,99]-[101,116]-[124,249]-[256,293]-[306,368] of 1NOY_A. The number of aligned residues is 261 and the RMSD is 2.67. The processing time for the alignment is 13.18 sec, which is very short: in CE [7], this alignment takes 298 seconds.
A further test concerns the protein kinases, for which over 30 structures are available in the PDB. The results of a search against the complete PDB, using the quaternary complex of the cAMP-dependent protein kinase in a closed conformation (1ATP:E) as a probe structure, are presented in Table 1. The average RMSD is 2.45 and the average alignment time is 0.54 sec.
Fig. 10. The alignment between 1HCL_ and 1JSU_A (image captured by MoleView). Number of aligned AA: 202; RMSD: 1.70; alignment time: 0.41 sec. Matched SSPs: [18,23]-[31,58]-[64,133]-[153,162]-[168,279] and [29,34]-[39,66]-[72,141]-[162,171]-[177,288]
Fig. 11. The alignment between 1WAJ_ and 1NOY_A (image captured by MoleView). Number of aligned AA: 261; RMSD: 2.67; alignment time: 13.18 sec. Matched SSPs: [6,30]-[60,75]-[84,99]-[101,116]-[124,249]-[256,293]-[306,368] and [7,31]-[63,78]-[87,102]-[104,119]-[128,253]-[260,297]-[310,372]
Table 1. Experimental results of protein alignment
No.  Chain 2  #Alignment  RMSD  Time (seconds)
1    1APM_E   336         0.54  0.91
2    1CDK_A   336         0.80  0.51
3    1YDR_E   324         0.66  0.48
4    1CTP_E   304         2.63  0.49
5    1PHK_    233         1.68  0.43
6    1KOA_    118         1.53  0.70
7    1KOB_A   166         2.95  0.54
8    1AD5_A   111         3.43  0.68
9    1CKI_A   120         2.61  0.48
10   1CSN_    142         2.77  0.46
11   1ERK_    116         2.99  0.57
12   1FIN_A   73          2.46  0.55
13   1GOL_    116         3.07  0.56
14   1JST_A   88          2.60  0.45
15   1IRK_    64          3.25  0.46
16   1FGK_A   44          3.16  0.55
17   1FMK_    97          2.37  0.67
18   1WFC_    101         3.44  0.54
19   1KNY_A   27          2.41  0.46
20   1TIG_    25          3.66  0.31
4 Discussion and Conclusion

This paper proposed a novel protein structure alignment method based on a 3D chain code of the protein chain direction vectors and a backtracking algorithm for joining SSPs. The 3D chain code represents the protein chain structure efficiently. The essential idea is to regard a protein chain as a thread; beginning with this idea, we built a 3D chain code for searching similar substructures. For joining SSPs, we use the backtracking algorithm. Other protein structure alignment systems use dynamic programming; in this setting, however, the backtracking algorithm is more intuitive and operates more efficiently. The methodology thus has particular merits: a 3D chain code that is more intuitive, and a backtracking algorithm that is generally faster than dynamic programming, so the alignment is very fast. In typical cases, the alignment time is about half a second and rarely exceeds 1.0 second. Consequently, because the proposed protein structure alignment system achieves fast alignment with relatively precise results, it can be used for pre-screening over huge protein databases.
References

[1] Bourne, P.E. and Weissig, H.: Structural Bioinformatics. Wiley-Liss (2003)
[2] Taylor, W. and Orengo, C.: "Protein structure alignment", Journal of Molecular Biology, Vol. 208 (1989), pp. 1-22.
[3] Holm, L. and Sander, C.: "Protein structure comparison by alignment of distance matrices", Journal of Molecular Biology, Vol. 233 (1993), pp. 123-138.
[4] Schwarzer, F. and Lotan, I.: "Approximation of protein structure for fast similarity measures", Proc. 7th Annual International Conference on Research in Computational Molecular Biology (RECOMB) (2003), pp. 267-276.
[5] Singh, A.P. and Brutlag, D.L.: "Hierarchical protein structure superposition using both secondary structure and atomic representation", Proc. Intelligent Systems for Molecular Biology (ISMB) (1997).
[6] Won, C.S., Park, D.K. and Park, S.J.: "Efficient use of MPEG-7 Edge Histogram Descriptor", ETRI Journal, Vol. 24, No. 1 (Feb. 2002), pp. 22-30.
[7] Shindyalov, I.N. and Bourne, P.E.: "Protein structure alignment by incremental combinatorial extension (CE) of the optimal path", Protein Engineering, 11 (1998), pp. 739-747.
[8] Databases and Tools for 3-D Protein Structure Comparison and Alignment Using the Combinatorial Extension (CE) Method (http://cl.sdsc.edu/ce.html).
[9] Park, C., et al.: "MoleView: a program for molecular visualization", Genome Informatics (2004), p167-1.
[10] Lamdan, Y. and Wolfson, H.J.: "Geometric hashing: a general and efficient model-based recognition scheme", Proc. 2nd International Conference on Computer Vision (ICCV), pp. 238-249, 1988.
[11] Leibowitz, N., Fligelman, Z.Y., Nussinov, R. and Wolfson, H.J.: "Multiple structural alignment and core detection by geometric hashing", Proc. 7th International Conference on Intelligent Systems for Molecular Biology (ISMB), pp. 169-177, 1999.
[12] Nussinov, R. and Wolfson, H.J.: "Efficient detection of three-dimensional structural motifs in biological macromolecules by computer vision techniques", Biophysics, 88: 10495-10499, 1991.
[13] Pennec, X. and Ayache, N.: "A geometric algorithm to find small but highly similar 3D substructures in proteins", Bioinformatics, 14(6): 516-522, 1998.
[14] Holm, L. and Sander, C.: "Protein structure comparison by alignment of distance matrices", Journal of Molecular Biology, 233(1): 123-138, 1993.
[15] Golomb, S. and Baumert, L.: "Backtrack programming", Journal of the ACM, 12: 516-524, 1965.
[16] Gans, J. and Shalloway, D.: "Qmol: a program for molecular visualization on Windows-based PCs", Journal of Molecular Graphics and Modelling, 19: 557-559, 2001.
[17] http://www.bmsc.washington.edu/raster3d/
[18] Bribiesca, E.: "A chain code for representing 3D curves", Pattern Recognition, 33: 755-765, 2000.
Robust EMG Pattern Recognition to Muscular Fatigue Effect for Human-Machine Interaction

Jae-Hoon Song1, Jin-Woo Jung2, and Zeungnam Bien3

1 Air Navigation and Traffic System Department, Korea Aerospace Research Institute, 45 Eoeun-dong, Yuseong-gu, Daejeon 305-333, Korea, [email protected]
2 Department of Computer Engineering, Dongguk University, 26 Pil-dong 3-ga, Jung-gu, Seoul 100-715, Korea, [email protected]
3 Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Korea, [email protected]
Abstract. The main goal of this paper is to design an electromyogram (EMG) pattern classifier which is robust to muscular fatigue effects for human-machine interaction. When a user operates a machine such as a PC or a powered wheelchair through an EMG-based interface, muscular fatigue is generated as the muscle contraction is sustained, and recognition rates are degraded by this fatigue. In this paper, an important observation is addressed: the variations of feature values due to muscular fatigue effects are consistent over the sustained duration of the contraction. From this observation, a robust pattern classifier is designed through an adaptation process applied to the hyperboxes of a Fuzzy Min-Max Neural Network. As a result, significantly improved performance is confirmed.
1 Introduction

As the number of elderly people rapidly increases, along with the number of people handicapped by a variety of accidents, the social demand for welfare and for the support of state-of-the-art technology is also increasing. In particular, the elderly or the handicapped may have serious problems doing certain daily tasks on their own, so assistive devices or systems that help them, or do the work for them while granting as much independence as possible, can greatly improve their quality of life. Among the variety of devices that assist ordinary activities, one useful method is the electromyogram (EMG)-based interface. EMG can be acquired from any available muscle regardless of disability level, so the design of an EMG-based interface can follow the concept of universal design. Another advantage of EMG is ease of use: an EMG-based interface is very intuitive since it reflects human intention of movement. A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1190 – 1199, 2006. © Springer-Verlag Berlin Heidelberg 2006
However, there is a significant weakness in EMG-based interfaces: the time-varying characteristics of EMG signals, caused mainly by the muscular fatigue effect. For example, when a user operates a powered wheelchair by his/her EMG, the user has to sustain a muscle contraction to control both the direction and the speed of the powered wheelchair. If the muscle contraction is sustained, muscular fatigue builds up with contraction time, and system performance is degraded by this fatigue effect. The main goal of this paper is to design an EMG pattern classifier that is robust to muscular fatigue effects for human-machine interaction. To this end, previous works on muscular fatigue effects and their problems are introduced in Section 2. Our approach to handling muscular fatigue effects is addressed in Section 3. Finally, experimental results are shown in Section 4.
2 Muscular Fatigue Effect in EMG Pattern Recognition

According to a dictionary, fatigue is defined as a feeling of extreme physical or mental tiredness [1]. Muscular fatigue is, therefore, fatigue due to sustained muscular contraction. Since a muscle movement is accompanied by a variety of neural transmissions, muscular fatigue involves complicated processes. Muscular fatigue is known to be divided into central fatigue and peripheral fatigue according to the sustained time of muscle contraction [2]. Central fatigue is a fatigue of neural transmission, such as a decrease of the motor unit (MU) firing rate [2]. Peripheral fatigue is related to biochemical metabolism, such as an accumulation of metabolic by-products in the active muscles and a deficiency of energy sources [2]. Central fatigue appears over a couple of hours to a few days, whereas peripheral fatigue appears within a few seconds to a few minutes [3]. Peripheral fatigue is therefore the more important factor for human-machine interaction (HMI), and we mainly consider it in this paper. Muscular fatigue, including peripheral fatigue, can be expressed by the relations of characteristic frequencies such as the median frequency (MDF) and the mean frequency (MNF) [4]. Both are defined by Eqs. (1) and (2), respectively, where P(ω) represents the power spectrum at frequency ω.
∫_0^{MDF} P(ω) dω = ∫_{MDF}^{∞} P(ω) dω = (1/2) ∫_0^{∞} P(ω) dω    (1)

MNF = ( ∫_0^{∞} ω P(ω) dω ) / ( ∫_0^{∞} P(ω) dω )    (2)
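As a concrete illustration, Eqs. (1) and (2) can be computed from a discrete power spectrum. The sketch below is a minimal NumPy version; the function name and the FFT-periodogram estimate of P(ω) are our assumptions, since the paper does not specify its spectral estimator.

```python
import numpy as np

def mdf_mnf(signal, fs):
    """Median frequency (MDF, Eq. (1)) and mean frequency (MNF, Eq. (2))
    of a signal segment, using an FFT periodogram as the power spectrum."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2        # P(w), one-sided
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)   # frequency axis (Hz)
    total = spectrum.sum()
    # MDF: frequency that splits the total spectral power into equal halves
    mdf = freqs[np.searchsorted(np.cumsum(spectrum), total / 2.0)]
    # MNF: power-weighted mean frequency
    mnf = (freqs * spectrum).sum() / total
    return mdf, mnf
```

For a pure 50 Hz sine sampled at 1 kHz, both values come out near 50 Hz.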
A representative previous work on muscular fatigue compensation for EMG-based interfaces is the research on a prosthetic hand named the 'Utah arm' [5]. There, muscular fatigue generated by repetitive muscle movements (radial flexion motion) was handled by a fatigue-compensating preprocessor. Winslow et al. [6] proposed a fatigue compensation method using artificial neural networks for functional electrical stimulation (FES), generating stimulations with suitable intensity at the proper time. However, all these previous works are based on a specific single muscle movement and do not consider various muscle movements together.
Moreover, EMG-based human-machine interaction generally requires more than two wrist movements to be handled together. For example, the required degrees of freedom in PC mouse control or powered wheelchair control number more than five: up, down, left, right, and click or stop (see Fig. 1). Fatigue levels also differ from motion to motion, even in the same muscle [7]. Hence, a new fatigue compensation method is needed that handles the fatigue effects of the various motions arising from human intention.
Fig. 1. 6 Basic Motions for Human-Machine Interaction
3 Robust EMG Pattern Recognition to Muscular Fatigue Effect

3.1 Adaptation Method to Muscular Fatigue Effect

Several assumptions are used for implementing the adaptation process of the EMG pattern recognizer to muscular fatigue effects. The first assumption is that there is one recognizer per user, which excludes the effect of individual differences. The second assumption is that the locations of the EMG electrodes are always the same, with four channels placed as in Fig. 2. The third assumption concerns the method of muscle contraction: sustained motions are considered instead of repetitive muscle movements, and the recovery process starts as soon as the motion ends. The fourth assumption is that the EMG signal can be regarded as quasi-stationary when it is segmented into short periods [8].

Fig. 2. Placement of surface electrodes for EMG acquisition

This last assumption can be verified with Fig. 3, which shows the variation of a feature value, the Difference Absolute Mean Value (DAMV, see Eq. (3)), during 60 s for each of the six basic motions including the reference motion. Each signal acquisition was repeated ten times through sustained contractions for each defined basic motion; fatigue effects between adjacent trials were excluded by assigning a three-minute rest.

DAMV [9] is the mean absolute value of the difference between adjacent samples, where N is the size of the time window used for computation:

DAMV = (1/(N−1)) Σ_{i=1}^{N−1} |x_{i+1} − x_i|    (3)
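Eq. (3) has a direct NumPy transcription (the function name is ours):

```python
import numpy as np

def damv(x):
    """Difference Absolute Mean Value (Eq. (3)): mean absolute difference
    between adjacent samples of a windowed signal of length N."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.diff(x)).mean()   # sum_{i=1}^{N-1} |x[i+1]-x[i]| / (N-1)
```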
[Fig. 3. Time-dependent feature variations. Time course of the extracted feature (DAMV), fitted by least squares, over 60 s for the six basic motions (initial, left, right, up, down, click): (a) DAMV at channel #3; (b) DAMV at channel #4.]
Fig. 3 shows that the trends of the feature variations are consistent. This observation held for four other volunteers as well. Consequently, the amount of feature variation from the initial value can be estimated from the contraction time, and the muscle contraction time can in turn be estimated as the lasting time of the human motion, based on the third assumption of continuous muscle contraction. As a result, the degradation in system performance caused by the muscular fatigue effect can be compensated for with the differential feature value (DAMV in Fig. 3) by estimating the muscle contraction time via the lasting time of the human motion.

3.2 Robust EMG Pattern Recognizer

The block diagram of the proposed robust EMG pattern recognizer is shown in Fig. 4. The upper part is a general pattern recognition scheme; the lower part is the additional adaptation process that makes the recognition robust to the muscular fatigue effect. Besides DAMV, three additional features [9] are used for pattern classification: Integral Absolute Value (IAV), Zero-Crossing (ZC), and Variance (VAR) (see Eqs. (4), (5), (6)). The Fuzzy Min-Max Neural Network (FMMNN) [10] is adopted as the pattern classification method because of its conspicuous on-line learning ability. FMMNN is a supervised learning classifier that uses fuzzy sets as pattern classes. Each fuzzy set is a union of fuzzy hyperboxes; a fuzzy hyperbox is an n-dimensional box defined by a min point and a max point with a corresponding membership function. The learning algorithm of FMMNN is the following three-step process [10]:
• Expansion: identify the hyperbox that can expand, and expand it. If no expandable hyperbox can be found, add a new hyperbox for that class.
• Overlap test: determine whether any overlap exists between hyperboxes from different classes.
• Contraction: if overlap between hyperboxes that represent different classes does exist, eliminate it by minimally adjusting each of the hyperboxes.

IAV = (1/N) Σ_{i=1}^{N} |x_i|    (4)

Fig. 4. Block diagram of proposed method
ZC = Σ_{i=1}^{N−1} sgn(−x_i x_{i+1}), where sgn(x) = 1 if x > 0 and 0 otherwise    (5)

VAR = (1/(N−1)) Σ_{i=1}^{N} (x_i − E{x})²    (6)
E{x} is the mean value of the given segment and N is the size of the time window used for computation. In Fig. 4, the adaptation process consists of three sub-parts: parameter estimation, start-time detection, and fatigue compensation.

Parameter Estimation: the two characteristic frequencies, MDF and MNF in Eqs. (1) and (2), are calculated from the given signal.

Start-Time Detection: a transition between any two basic motions defined in Fig. 1 naturally passes through the reference posture, since the reference posture is defined as the relaxation posture. Thus, the start-time of muscle contraction for another motion can be found by detecting the reference posture. From several experiments on the values of both MDF and MNF for various human motions, the rule described in Eq. (7) was found and used for detecting the start-time and initializing the time instant of a motion. If
MDF at channel #3 < 40 Hz and MDF at channel #4 < 40 Hz and MNF at channel #3 < 60 Hz and MNF at channel #4 < 60 Hz, then the start-time is detected.    (7)
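Rule (7) is a simple threshold test; a sketch follows (the function name is illustrative; the 40 Hz and 60 Hz thresholds are the paper's):

```python
def start_time_detected(mdf_ch3, mdf_ch4, mnf_ch3, mnf_ch4):
    """Start-time detection rule of Eq. (7): the reference (relaxation)
    posture is recognized when both channels show low characteristic
    frequencies, marking the start of the next motion."""
    return (mdf_ch3 < 40.0 and mdf_ch4 < 40.0 and
            mnf_ch3 < 60.0 and mnf_ch4 < 60.0)
```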
Fatigue Compensation: fatigue compensation is performed simply by using the graphs in Fig. 3 as a look-up table to find the amount of compensation at the detected time. Specifically, the proposed method adjusts the min-max values of the hyperboxes in the FMMNN according to the consistent feature variation in Fig. 3 every 2 seconds. After this adjustment, the min-max values of the hyperboxes are re-adjusted through the learning algorithm of the FMMNN (expansion, overlap test, and contraction); this re-adjustment is also done every 2 seconds, together with the first step. Here, 2 seconds is the minimum time period over which the muscular fatigue effect can be observed in the EMG signal.
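The adjustment step can be sketched as a shift of each hyperbox by the feature drift read from the Fig. 3 look-up curves. The additive-drift model, the array shapes, and the function name below are all assumptions; the paper only states that hyperbox min-max values are adjusted every 2 seconds.

```python
import numpy as np

def compensate_hyperboxes(v_min, v_max, drift):
    """Shift FMMNN hyperbox min/max points by the estimated feature
    drift for the current contraction time (assumed additive model)."""
    return v_min + drift, v_max + drift

# one hyperbox over two feature dimensions (e.g. DAMV at ch #3 and ch #4)
v_min = np.array([[0.5, 1.0]])
v_max = np.array([[0.8, 1.4]])
drift = np.array([-0.1, -0.2])   # estimated DAMV decrease after t seconds
new_min, new_max = compensate_hyperboxes(v_min, v_max, drift)
```

After this shift, the regular FMMNN expansion, overlap-test, and contraction steps re-adjust the boxes.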
4 Experimental Results

4.1 Experimental Configuration

The proposed robust EMG pattern recognizer was applied to controlling a mouse in a human-computer interaction environment. Five non-handicapped men with no prior knowledge of the experiments volunteered. The task was to perform six motions: reference, up, down, left, right, and click. EMG signals were acquired with the 4-channel EMG signal acquisition module shown in Fig. 5, which was specially designed for low noise, high gain, ease of use, and small size. The sampling rate was 1 kHz and the size of the time window for
analysis was 128 ms. The same experiments were performed for the five subjects, but the measured EMG signals differed from subject to subject due to their physiological characteristics and slightly different electrode locations.
Fig. 5. Developed EMG signal acquisition module
4.2 Experimental Results

IAV, ZC, VAR, and DAMV are extracted from the four-channel EMG signals. Fig. 6 shows the distribution of DAMV (ch. #3 and ch. #4) in a two-dimensional feature space.
Fig. 6. Feature distribution of EMG signal in two-dimensional space
As mentioned above, the trends of the feature variations are consistent. Even though the feature distributions vary with the sustained time of muscle contraction, the class boundaries of the FMMNN are correspondingly adjusted by the proposed fatigue compensation method. Fig. 7 shows that the hyperboxes of the proposed method reflect well the time-varying feature distributions due to muscular fatigue effects.
[Fig. 7. Comparison of class boundaries. Feature distribution of EMG signals (DAMV at ch #3 vs. DAMV at ch #4) for the six motions: (a) before and (b) after application of the proposed method.]

[Fig. 8. Fatigue compensated pattern classification rates for multiple users. Recognition rate (%) over 60 s for users #1-#5, comparing "fatigued" (FMMNN only) with "fatigue compensated" (proposed method).]
Fig. 9. Designed input pattern and the variation of characteristic frequency (MNF at CH #3)
Fig. 8 shows the pattern classification rates for all users. The designed input pattern, including all of the predefined basic motions, and a plot of the characteristic frequency MNF are shown in Fig. 9. System performance was greatly improved by the proposed method, FMMNN with fatigue compensation (solid line), compared with FMMNN only (dotted line). Bold points in Fig. 9 represent detected start-times of motions. When a muscle contraction is sustained, a clear shift of the characteristic frequency toward lower frequencies is observed.
5 Conclusion

A novel muscular fatigue compensation method for EMG-based human-computer interaction is proposed in this paper. It is based on the observation that the feature variations over the duration of a sustained muscle contraction are consistent. Starting from this observation, the proposed method adjusts the min-max values of the hyperboxes according to the contraction time, using the learning algorithm of the FMMNN. As a result, a significant improvement in system performance, measured by pattern classification rates, was confirmed. The proposed fatigue-robust EMG pattern recognizer can be applied to various EMG-based systems, especially for people with disabilities.
Acknowledgement This work was partially supported by the SRC/ERC program of MOST/KOSEF (grant #R111999-008).
References

1. Collins COBUILD Advanced Learner's English Dictionary, 4th ed., 2003.
2. Bigland-Ritchie, B., Jones, D.A., Hosking, G.P., Edwards, R.H.: Central and peripheral fatigue in sustained maximum voluntary contractions of human quadriceps muscle. Journal of Clinical Science and Molecular Medicine, Vol. 54 (6):609-614, June 1978.
3. Kiryu, T., Morishiata, M., Yamada, H., Okada, M.: A muscular fatigue index based on the relationships between superimposed M wave and preceding background activity. IEEE Transactions on Biomedical Engineering, Vol. 45 (10):1194-1204, Oct. 1998.
4. Bonato, P., Roy, S.H., Knaflitz, M., de Luca, C.J.: Time-frequency parameters of the surface myoelectric signal for assessing muscle fatigue during cyclic dynamic contractions. IEEE Transactions on Biomedical Engineering, Vol. 48 (7):745-753, July 2001.
5. Park, E., Meek, S.G.: Fatigue compensation of the electromyographic signal for prosthetic control and force estimation. IEEE Transactions on Biomedical Engineering, Vol. 40 (10):1019-1023, Oct. 1993.
6. Winslow, J., Jacobs, P.L., Tepavac, D.: Fatigue compensation during FES using surface EMG. Journal of Electromyography and Kinesiology, Vol. 13 (6):555-568, Mar. 2003.
7. Chen, J.-J.J., Yu, N.-Y.: The validity of stimulus-evoked EMG for studying muscle fatigue characteristics of paraplegic subjects during dynamic cycling movement. IEEE Transactions on Rehabilitation Engineering, Vol. 5 (2):170-178, Jun. 1997.
8. Knox, R.R., Brooks, D.H., Manolakos, E., Markogiannakis, S.: Time-series based features for EMG pattern recognition: Preliminary results. Bioengineering Conference, Proceedings of the IEEE Nineteenth Annual Northeast, March 18-19, 1993.
9. Boostani, R., Moradi, M.H.: Evaluation of the forearm EMG signal features for the control of a prosthetic hand. Journal of Physiological Measurement, Vol. 24:309-319, May 2003.
10. Simpson, P.: Fuzzy min-max neural networks - Part 1: Classification. IEEE Transactions on Neural Networks, Vol. 3:776-786, Sep. 1992.
Classification of Individual and Clustered Microcalcifications in Digital Mammograms Using Evolutionary Neural Networks

Rolando R. Hernández-Cisneros and Hugo Terashima-Marín

Center for Intelligent Systems, Tecnológico de Monterrey, Campus Monterrey, Ave. Eugenio Garza Sada 2501 Sur, Monterrey, Nuevo León 64849, Mexico, [email protected], [email protected]
Abstract. Breast cancer is one of the main causes of death in women, and early diagnosis is an important means to reduce the mortality rate. The presence of microcalcification clusters is a primary indicator of early stages of malignant types of breast cancer, and their detection is important to prevent the disease. This paper proposes a procedure for the classification of microcalcification clusters in mammograms using sequential difference-of-Gaussian (DoG) filters and three evolutionary artificial neural networks (EANNs) compared against a feedforward artificial neural network (ANN) trained with backpropagation. We found that using genetic algorithms (GAs) for finding the optimal weight set of an ANN, for finding an adequate initial weight set before starting a backpropagation training algorithm, and for designing its architecture and tuning its parameters results mainly in improvements in the overall accuracy, sensitivity, and specificity of an ANN, compared with other networks trained with simple backpropagation.
1 Introduction
Worldwide, breast cancer is the most common form of cancer in females and is, after lung cancer, the second most fatal cancer in women. Survival rates are higher when breast cancer is detected in its early stages. Mammography is one of the most common techniques for breast cancer diagnosis, and microcalcifications are one among several types of objects that can be detected in a mammogram. Microcalcifications are calcium accumulations typically 100 microns to several mm in diameter, and they are sometimes early indicators of the presence of breast cancer. Microcalcification clusters are groups of three or more microcalcifications that usually appear in areas smaller than 1 cm², and they have a high probability of becoming a malignant lesion. However, the predictive value of mammograms is relatively low compared to biopsy. The sensitivity may be improved by having each mammogram checked by two or more radiologists, but this makes the process inefficient. A viable alternative is replacing one of the radiologists by a computer system giving a second opinion [1]. A computer system intended for microcalcification detection in mammograms may be based on several methods, like wavelets, fractal models, support vector A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1200–1210, 2006. © Springer-Verlag Berlin Heidelberg 2006
machines, mathematical morphology, Bayesian image analysis models, higher-order statistics, fuzzy logic, etc. The method selected for this work was the difference of Gaussian filters (DoG). DoG filters are adequate for the noise-invariant and size-specific detection of spots, producing a DoG image which represents the microcalcifications once a thresholding operation is applied to it. We developed a procedure that applies a sequence of difference-of-Gaussian filters in order to maximize the number of detected probable microcalcifications (signals) in the mammogram, which are later classified to determine whether or not they are real microcalcifications. Finally, microcalcification clusters are identified and also classified into malignant and benign. Artificial neural networks (ANNs) have been successfully used for classification purposes in medical applications, including the classification of microcalcifications in digital mammograms. Unfortunately, for an ANN to be successful in a particular domain, its architecture, training algorithm, and the domain variables selected as inputs must be adequately chosen. Designing an ANN architecture is a trial-and-error process; several parameters must be tuned according to the training data when a training algorithm is chosen; and, finally, a classification problem may involve too many variables (features), most of them not relevant at all for the classification process itself. Genetic algorithms (GAs) may be used to address these problems, helping to obtain more accurate ANNs with better generalization abilities. GAs have been used for searching the optimal weight set of an ANN, for designing its architecture, and for finding its most adequate parameter set (number of neurons in the hidden layer(s), learning rate, etc.), among other tasks. Exhaustive reviews of evolutionary artificial neural networks (EANNs) have been presented by Yao [2] and by Balakrishnan and Honavar [3].
In this paper, we propose an automated procedure for feature extraction and training data set construction for training an ANN. We also describe the use of GAs for 1) finding the optimal weight set for an ANN, 2) finding an adequate initial weight set for an ANN before starting a backpropagation training algorithm and 3) designing the architecture and tuning some parameters of an ANN. All of these methods are applied to the classification of microcalcifications and microcalcification clusters in digital mammograms, expecting to improve the accuracy of an ordinary feedforward ANN performing this task. The rest of this document is organized as follows. In the second section, the proposed procedure along with its theoretical framework is discussed. The third section deals with the experiments and the main results of this work. Finally, in the fourth section, the conclusions are presented.
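As an illustration of the first approach (a GA searching an ANN's weight set directly), the toy sketch below evolves a real-valued weight vector against a caller-supplied fitness function. All operators and hyperparameters here are illustrative choices, not the ones used in the paper.

```python
import random

def evolve_weights(fitness, n_weights, pop_size=20, generations=30,
                   mut_rate=0.1, mut_sigma=0.3):
    """Minimal GA over a real-valued weight vector. `fitness` maps a
    weight vector to a score to maximize (e.g. classification accuracy
    of the ANN using those weights); all hyperparameters are illustrative."""
    pop = [[random.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_weights)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [w + random.gauss(0, mut_sigma) if random.random() < mut_rate
                     else w for w in child]       # Gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# toy usage: maximize -(sum of squares), optimum at the zero vector
best = evolve_weights(lambda w: -sum(x * x for x in w), n_weights=4)
```

In the paper's setting the fitness function would train or evaluate the fixed-architecture ANN with the candidate weights and return its accuracy.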
2 Methodology
The mammograms used in this project were provided by the Mammographic Image Analysis Society (MIAS) [4]. The MIAS database contains 322 images, with resolutions of 50 microns/pixel and 200 microns/pixel. In this work, the images
with a resolution of 200 microns/pixel were used. The data has been reviewed by a consultant radiologist and all the abnormalities have been identified and marked. The truth data consists of the location of each abnormality and the radius of a circle which encloses it. Of the whole database, only 25 images contain microcalcifications; among these, 13 cases are diagnosed as malignant and 12 as benign. Some related works have used this same database [5], [6], [7]. The general procedure receives a digital mammogram as input and consists of five stages: pre-processing, detection of potential microcalcifications (signals), classification of signals into real microcalcifications, detection of microcalcification clusters, and classification of microcalcification clusters into benign and malignant. The diagram of the proposed procedure is shown in Figure 1. As end products of this process, we obtain two ANNs for classifying microcalcifications and microcalcification clusters respectively, which in this case are products of the proposed evolutionary approaches.

2.1 Pre-processing
This stage aims to eliminate those elements in the images that could interfere with the identification of microcalcifications. A secondary goal is to reduce the work area to the relevant region that exactly contains the breast. The procedure receives the original images as input. First, a median filter is applied in order to eliminate background noise while keeping the significant features of the images. Next, a binary image is created from each filtered image, intended solely to help the automatic cropping procedure delete background marks and isolated regions, so that the image contains only the region of interest. The result of this stage is a smaller image with less noise.

2.2 Detection of Potential Microcalcifications (Signals)
The main objective of this stage is to detect the mass centers of the potential microcalcifications in the image (signals). The optimized difference of two gaussian filters (DoG) is used for enhancing those regions containing bright points. The resultant image after applying a DoG filter is globally binarized, using an empirically determined threshold. A region-labeling algorithm allows the identification of each one of the points (defined as high-contrast regions detected after the application of the DoG filters, which cannot be considered microcalcifications yet). Then, a segmentation algorithm extracts small 9x9 windows, containing the region of interest whose centroid corresponds to the centroid of each point. In order to detect the greater possible amount of points, six gaussian filters of sizes 5x5, 7x7, 9x9, 11x11, 13x13 and 15x15 are combined, two at a time, to construct 15 DoG filters that are applied sequentially. Each one of the 15 DoG filters was applied 51 times, varying the binarization threshold in the interval [0, 5] by increments of 0.1. The points obtained by applying each filter are added to the points obtained by the previous one, deleting the repeated points. The same procedure is repeated with the points obtained by the remaining DoG filters. All of these points are passed later to three selection procedures.
[Fig. 1. Diagram of the proposed procedure. Pre-processing (median filter, binarization, automatic cropping); detection of potential microcalcifications (DoG filters, global binarization, region labeling, point selection by minimum area, minimum gray level and minimum gradient); classification of signals and of microcalcification clusters, each via feature extraction, a genetic algorithm selecting feature subsets, and neural networks evaluated by overall accuracy, sensitivity and specificity.]
These three selection methods are applied in order to transform a point into a signal (potential microcalcification). The first method performs selection according to object area, choosing only the points with an area between a predefined minimum and maximum; for this work, a minimum area of 1 pixel (0.0314 mm²) and a maximum of 77 pixels (3.08 mm²) were considered. The second method performs selection according to the gray level of the points. Studying the mean gray levels of the pixels surrounding real identified microcalcifications, it was found that they have values in the interval [102, 237] with a mean of 164; for this study, we set the minimum gray level for points to be selected to 100. Finally, the third selection method uses the gray gradient (or absolute contrast, the difference between the mean gray level of the point and the mean gray level of the background). Again, studying the mean gray gradient of points surrounding real identified microcalcifications, it was found they have values in the interval [3, 56]
with a mean of 9.66. For this study, we set the minimum gray gradient for points to be selected to 3, the minimum value of the interval. The result of these three selection processes is a list of signals (potential microcalcifications) represented by their centroids.
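The spot enhancement at the heart of this stage is a difference of two Gaussian blurs. The sketch below is a minimal NumPy version; the sigma = size/6 parameterization of each kernel is our assumption, since the paper does not state how its 5x5 to 15x15 Gaussian filters were built.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Odd-sized 1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(size) - size // 2
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def dog_filter(image, size_small, size_large):
    """Difference of Gaussians: subtract a coarse blur from a fine blur,
    enhancing bright spots whose scale lies between the two kernels."""
    def blur(img, size):
        k = gaussian_kernel(size, size / 6.0)   # sigma heuristic (assumed)
        # separable convolution: filter rows, then columns
        tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
        return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)
    return blur(image, size_small) - blur(image, size_large)
```

Thresholding the resulting DoG image with the empirically determined threshold then yields the binary image from which points are labeled.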
2.3 Classification of Signals into Real Microcalcifications
The objective of this stage is to identify whether an obtained signal corresponds to an individual microcalcification or not. A set of features related to contrast and shape is extracted from each signal: 47 features in total, seven related to contrast, seven related to background contrast, three related to relative contrast, 20 related to shape, six related to the moments of the contour sequence, and the first four invariants proposed by Hu in a landmark paper [8]. There is no a priori criterion to determine which features should be used for classification purposes, so the features pass through two feature selection processes [9]: the first deletes features that are highly correlated with other features, and the second uses a derivation of the forward sequential search algorithm, a sub-optimal search algorithm that decides which feature to add according to the information gain it provides, finally yielding a subset of features that minimizes the error of the classifier (in this case, a conventional feedforward ANN). After these processes, only three features were selected and used for classification: absolute contrast (the difference between the mean gray levels of the signal and its background), the standard deviation of the gray levels of the pixels that form the signal, and the third moment of the contour sequence. Moments of a contour sequence are calculated using the signal centroid and the pixels in its perimeter, and are invariant to translation, rotation and scale transformations [10]. In order to process signals and accurately classify the real microcalcifications, we decided to use ANNs as classifiers. Because of the problems with ANNs already mentioned, we also decided to use GAs for evolving populations of ANNs, in three different ways, some of them suggested by Cantú-Paz and Kamath [11].
The first approach uses GAs for searching the optimal set of weights of the ANN; the GA searches only the weights, while the architecture is fixed prior to the experiment. The second approach is very similar, but instead of evaluating the network immediately after the initial weight set encoded in each chromosome of the GA is assigned, a backpropagation training starts from this initial weight set, hoping to reach an optimum quickly [12]. The last approach is not concerned with evolving weights; instead, a GA is used to evolve part of the architecture and other features of the ANN. The number of nodes in the hidden layer is a very important parameter, because too few or too many nodes can affect the learning and generalization capabilities of the ANN. In this case, each chromosome encodes the learning rate, lower and upper limits for the weights before starting the backpropagation training, and the number of nodes in the hidden layer.
Classification of Individual and Clustered Microcalcifications
1205
At the end of this stage, we obtain three ready-to-use ANNs, each one taken from the last generation of the GA used in each of the approaches. These ANNs have the best performance in terms of overall accuracy (fraction of correctly classified objects, including microcalcifications and other elements in the image that are not microcalcifications).
2.4 Detection of Microcalcification Clusters
During this stage, the microcalcification clusters are identified. The detection and subsequent consideration of every microcalcification cluster in the images may produce better results in a later classification process, as we showed in [13]. Because of this, an algorithm was developed for locating microcalcification cluster regions where the quantity of microcalcifications per cm² (the density) is higher. This algorithm keeps adding microcalcifications to their closest clusters, within a reasonable distance, until there are no microcalcifications left or the remaining ones are too distant to be considered part of a cluster. Every detected cluster is then labeled.
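The cluster-growing idea described above can be sketched as a greedy pass that joins each microcalcification to the nearest existing cluster whose centroid lies within a distance threshold, or seeds a new cluster otherwise (the threshold value and centroid-based distance are illustrative assumptions; the text does not fix them):

```python
import math

def cluster_microcalcifications(points, max_dist):
    """Greedy clustering: each point joins the nearest existing cluster
    whose centroid is within max_dist; otherwise it seeds a new cluster."""
    clusters = []   # each cluster is a list of (x, y) centroids
    for p in points:
        best, best_d = None, max_dist
        for c in clusters:
            cx = sum(q[0] for q in c) / len(c)
            cy = sum(q[1] for q in c) / len(c)
            d = math.hypot(p[0] - cx, p[1] - cy)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([p])
        else:
            best.append(p)
    return clusters

pts = [(0, 0), (1, 0), (0, 1), (50, 50), (51, 50)]
cl = cluster_microcalcifications(pts, max_dist=10)
print(len(cl))   # two well-separated groups
```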
2.5 Classification of Microcalcification Clusters into Benign and Malignant
This stage has the objective of classifying each cluster into one of two classes: benign or malignant. This information is provided by the MIAS database. From every microcalcification cluster detected in the mammograms in the previous stage, a cluster feature set is extracted. The feature set consists of 30 features: 14 related to the shape of the cluster, six related to the area of the microcalcifications included in the cluster, and ten related to the contrast of the microcalcifications in the cluster. The same two feature selection procedures mentioned earlier are also performed in this stage. Only three cluster features were selected for the classification process: the minimum diameter, minimum radius and mean radius of the clusters. The minimum diameter is the maximum distance that can exist between two microcalcifications within a cluster such that the line connecting them is perpendicular to the maximum diameter, defined as the maximum distance between two microcalcifications in a cluster. The minimum radius is the shortest of the radii connecting each microcalcification to the centroid of the cluster, and the mean radius is the mean of these radii. In order to process microcalcification clusters and accurately classify them as benign or malignant, we again decided to use ANNs as classifiers. We use GAs for evolving populations of ANNs, with the same three approaches we used before for classifying signals. The first approach uses a GA to search for the optimal set of weights of the ANN. The second approach uses a GA to define initial weight sets, from which a backpropagation training algorithm is started, in the hope of reaching an optimum quickly. The third approach uses a GA to evolve the architecture and other features of the ANN, as in the previous stage, when signals were classified. Again, each chromosome encodes the learning rate, the lower and upper limits for the weights before starting the backpropagation training, and the number of nodes in the hidden layer. For comparison, a conventional feedforward ANN is also used. At the end of this stage, we obtain three ready-to-use ANNs, each one taken from the last generation of the GA used in each of the approaches. These ANNs have the best performance in terms of overall accuracy (fraction of correctly classified clusters).
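The radius-based cluster features can be computed directly from the definitions above. A small sketch, with microcalcifications given as (x, y) centroids:

```python
import math

def cluster_radius_features(points):
    """Radii from each microcalcification centroid to the cluster
    centroid; returns the minimum radius and the mean radius as
    defined in the text."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    radii = [math.hypot(p[0] - cx, p[1] - cy) for p in points]
    return min(radii), sum(radii) / n

pts = [(0, 0), (2, 0), (0, 2), (2, 2)]
r_min, r_mean = cluster_radius_features(pts)
print(round(r_min, 4), round(r_mean, 4))   # all radii equal sqrt(2) here
```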
3 Experiments and Results

3.1 From Pre-processing to Feature Extraction
Only 22 images were finally used for this study. In the second phase, six Gaussian filters of sizes 5x5, 7x7, 9x9, 11x11, 13x13 and 15x15 were combined, two at a time, to construct 15 DoG filters that were applied sequentially. Each of the 15 DoG filters was applied 51 times, varying the binarization threshold in the interval [0, 5] by increments of 0.1. The points obtained by applying each filter were added to the points obtained by the previous one, deleting the repeated points. The same procedure was repeated with the points obtained by the remaining DoG filters. These points passed through the three selection methods for selecting signals (potential microcalcifications), according to region area, gray level and gray gradient. The result was a list of 1,242,179 signals (potential microcalcifications) represented by their centroids. The additional data included with the MIAS database define, with centroids and radii, the areas in the mammograms where microcalcifications are located. With these data and the support of expert radiologists, all the signals located in these 22 mammograms were preclassified into microcalcifications and not-microcalcifications. Of the 1,242,179 signals, only 4,612 (0.37%) were microcalcifications; the remaining 1,237,567 (99.63%) were not. Because of this imbalanced class distribution, an exploratory sampling was made. Several samplings with different proportions of each class were tested, and we finally decided to use a sample of 10,000 signals, including 2,500 real microcalcifications (25%). After the 47 microcalcification features were extracted from each signal, the feature selection processes reduced the relevant features to only three: absolute contrast, standard deviation of the gray level, and the third moment of the contour sequence.
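The construction of the 15 DoG filters from the six Gaussian sizes can be sketched as follows (the choice of sigma as a function of kernel size is an assumption; the paper does not state it):

```python
import math
from itertools import combinations

def gaussian_kernel(size, sigma=None):
    """size x size Gaussian kernel, normalized to sum 1
    (sigma defaulting to size/6 is an assumption)."""
    sigma = sigma or size / 6.0
    c = size // 2
    k = [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]

def dog_filter(size_small, size_big):
    """Difference of Gaussians: pad the smaller kernel into the bigger
    one and subtract, giving a band-pass (blob-enhancing) filter."""
    big = gaussian_kernel(size_big)
    small = gaussian_kernel(size_small)
    pad = (size_big - size_small) // 2
    out = [[-v for v in row] for row in big]
    for i in range(size_small):
        for j in range(size_small):
            out[i + pad][j + pad] += small[i][j]
    return out

sizes = [5, 7, 9, 11, 13, 15]
dogs = [dog_filter(a, b) for a, b in combinations(sizes, 2)]
print(len(dogs))   # 15 DoG filters, as in the text
```

Each DoG filter sums to (approximately) zero, so flat background regions are suppressed while blob-like structures near the filter's band are enhanced; thresholding the filtered output then yields candidate points.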
Finally, a transactional database was obtained, containing 10,000 signals (2,500 of them being real microcalcifications, randomly distributed) and three features describing each signal.
3.2 Classification of Signals into Microcalcifications
In the third stage, a conventional feedforward ANN and three evolutionary ANNs were developed for the classification of signals into real microcalcifications. The feedforward ANN had an architecture of three inputs, seven neurons in the hidden layer and one output. All the units had the sigmoid hyperbolic tangent function as the transfer function. The data (inputs and targets) were scaled into the range [-1, 1] and divided into ten non-overlapping splits, each one with 90% of the data for training and the remaining 10% for testing. A ten-fold cross-validation trial was performed; that is, the ANN was trained ten times, each time using a different split of the data, and the means and standard deviations of the overall performance, sensitivity and specificity were reported. These results are shown in Table 1 on the row "BP".

Table 1. Mean (%) and standard deviation of the sensitivity, specificity and overall accuracy of simple backpropagation and different evolutionary methods for the classification of signals into real microcalcifications

Method       Sensitivity        Specificity        Overall
             Mean    Std.Dev.   Mean    Std.Dev.   Mean    Std.Dev.
BP           75.68   0.044      81.36   0.010      80.51   0.013
WEIGHTS      72.44   0.027      84.32   0.013      82.37   0.011
WEIGHTS+BP   75.81   0.021      86.76   0.025      84.68   0.006
PARAMETERS   73.19   0.177      84.67   0.035      83.12   0.028
For the three EANNs used to evolve signal classifiers, all of the GAs used a population of 50 individuals. We used simple GAs, with gray encoding, stochastic universal sampling selection, double-point crossover, fitness-based reinsertion and a generation gap of 0.9. For all the GAs, the probability of crossover was 0.7 and the probability of mutation was 1/l, where l is the length of the chromosome. The initial population of each GA was always initialized uniformly at random. All the ANNs involved in the EANNs are feedforward networks with one hidden layer. All neurons have biases with a constant input of 1.0. The ANNs are fully connected, and the transfer function of every unit is the sigmoid hyperbolic tangent function. The data (inputs and targets) were normalized to the interval [-1, 1]. For the targets, a value of "-1" means "not-microcalcification" and a value of "1" means "microcalcification". When backpropagation was used, training stopped after a termination criterion of 20 epochs, also seeking individuals with fast convergence. For the first approach, where a GA was used to find the ANN weights, the population consisted of 50 individuals, each with a length of l = 720 bits, representing 36 weights (including biases) with a precision of 20 bits. There were two crossover points, and the mutation rate was 0.00139 (1/l). The GA ran for 50 generations. The results of this approach are shown in Table 1 on the row "WEIGHTS". In the second approach, where a backpropagation training algorithm is run using the weights represented by the individuals in the GA to initialize the ANN, the population also consisted of 50 individuals, each with a length of l = 720 bits, representing 36 weights (including biases) with a precision of 20 bits. There were two crossover points, and the mutation rate was 0.00139 (1/l). In this case, each ANN was briefly trained using 20 epochs of backpropagation, with a learning rate of 0.1.
The GA ran for 50 generations. The results of this approach are shown in Table 1 on the row “WEIGHTS+BP”.
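The gray encoding used by these GAs can be sketched with the standard binary-reflected Gray code. Its property that adjacent integers differ in exactly one bit makes single-bit mutations (applied with probability 1/l per bit) behave like small parameter steps rather than large jumps:

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Inverse of to_gray: recover the plain binary integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Round-trip check and the one-bit-difference property:
for i in range(16):
    assert from_gray(to_gray(i)) == i
print(bin(to_gray(7)), bin(to_gray(8)))   # codes of 7 and 8 differ in one bit
```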
R.R. Hernández-Cisneros and H. Terashima-Marín
Finally, in the third approach, where a GA was used to find the size of the hidden layer, the learning rate for the backpropagation algorithm and the range of the initial weights before training, the population consisted of 50 individuals, each with a length of l = 18 bits. The first four bits of the chromosome coded the learning rate in the range [0, 1], the next five bits coded the lower value for the initial weights in the range [-10, 0], the next five bits coded the upper value for the initial weights in the range [0, 10], and the last four bits coded the number of neurons in the hidden layer, in the range [1, 15] (if the value was 0, it was changed to 1). There was only one crossover point, and the mutation rate was 0.055555 (1/l). In this case, each ANN was built according to the parameters coded in the chromosome and trained briefly with 20 epochs of backpropagation, in order to favor ANNs that learn quickly. The results of this approach are also shown in Table 1, on the row "PARAMETERS". We performed several two-tailed Student's t-tests at a 5% level of significance in order to compare the mean of each method with the means of the others in terms of sensitivity, specificity and overall accuracy. We found that, for specificity and overall accuracy, the evolutionary methods are significantly better than the simple backpropagation method for the classification of individual microcalcifications. No difference was found in terms of sensitivity, except that simple backpropagation was significantly better than the method that evolves weights. We can also notice that, among the studied EANNs, the one that evolves a set of initial weights and is complemented with backpropagation training gives the best results. We found that, again in terms of specificity and overall accuracy, the method of weight evolution complemented with backpropagation is significantly the best of the methods we studied.
Nevertheless, in terms of sensitivity, this method is only significantly better than the method that evolves weights.
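The 18-bit chromosome of the third approach can be decoded as sketched below. The linear scaling of each bit field over its stated range is an assumption about the encoding; the field boundaries and ranges are those given in the text:

```python
def decode_params(bits):
    """Decode the 18-bit chromosome: 4 bits learning rate in [0, 1],
    5 bits lower weight limit in [-10, 0], 5 bits upper weight limit
    in [0, 10], 4 bits hidden-layer size in [1, 15] (0 mapped to 1)."""
    assert len(bits) == 18

    def field(lo, hi, a, b):
        v = int("".join(map(str, bits[lo:hi])), 2)
        return a + v / (2 ** (hi - lo) - 1) * (b - a)

    lr     = field(0, 4, 0.0, 1.0)
    w_low  = field(4, 9, -10.0, 0.0)
    w_high = field(9, 14, 0.0, 10.0)
    hidden = int(field(14, 18, 0, 15)) or 1   # value 0 is changed to 1
    return lr, w_low, w_high, hidden

bits = [1, 1, 1, 1,  0, 0, 0, 0, 0,  1, 1, 1, 1, 1,  0, 1, 1, 1]
print(decode_params(bits))
```

Each decoded tuple then configures one candidate ANN, which is trained for 20 backpropagation epochs before its fitness is measured.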
3.3 Microcalcification Cluster Detection and Classification
The process of cluster detection and the subsequent feature extraction phase generate another transactional database, this time containing the information of every microcalcification cluster detected in the images. A total of 40 clusters were detected in the 22 mammograms from the MIAS database used in this study. According to the MIAS additional data and the advice of expert radiologists, 10 clusters are benign and 30 are malignant. The number of features extracted from them is 30, but after the two feature selection processes already discussed, the number of features considered relevant was three: the minimum diameter, minimum radius and mean radius of the clusters. As in the signal classification stage, a conventional feedforward ANN and three evolutionary ANNs were developed for the classification of clusters into benign and malignant. The four algorithms used in this step are basically the same ones used before, except that they receive as input the transactional database containing features of microcalcification clusters instead of features of signals. Again, the means of the overall performance, sensitivity and specificity for each of these four approaches are reported in Table 2.

Table 2. Mean (%) and standard deviation of the sensitivity, specificity and overall accuracy of simple backpropagation and different evolutionary methods for the classification of microcalcification clusters

Method       Sensitivity        Specificity        Overall
             Mean    Std.Dev.   Mean    Std.Dev.   Mean    Std.Dev.
BP           55.97   0.072      86.80   0.032      76.75   0.032
WEIGHTS      72.00   0.059      92.09   0.038      86.35   0.031
WEIGHTS+BP   89.34   0.035      95.86   0.025      93.88   0.027
PARAMETERS   63.90   0.163      85.74   0.067      80.50   0.043

We also performed several two-tailed Student's t-tests at a 5% level of significance in order to compare the mean of each method for cluster classification with the means of the others in terms of sensitivity, specificity and overall accuracy. We found that the performance of the evolutionary methods is significantly better than that of the simple backpropagation method, except in one case. Again, the method that evolves initial weights, complemented with backpropagation, gives the best results.
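These comparisons can be reproduced from the reported means and standard deviations with a pooled two-sample t statistic (n = 10 folds per method). The input values below are illustrative: the tables' standard deviations appear to be fractions while the means are percentages, so we assume here that they are rescaled to percentage points; 2.101 is the two-tailed 5% critical value for 18 degrees of freedom:

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic for comparing two
    cross-validated means."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

# Overall accuracy, WEIGHTS+BP vs BP (Table 1); standard deviations
# rescaled from fractions to percentage points (an assumption)
t = two_sample_t(84.68, 0.60, 10, 80.51, 1.30, 10)
T_CRIT = 2.101   # two-tailed 5% critical value, 18 degrees of freedom
print(round(t, 2), abs(t) > T_CRIT)
```

A |t| far above the critical value, as here, is consistent with the reported significance of the WEIGHTS+BP method over simple backpropagation.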
4 Conclusions
Our experimentation suggests that evolutionary methods are significantly better than the simple backpropagation method for the classification of individual microcalcifications, in terms of specificity and overall accuracy. No difference was found in terms of sensitivity, except that simple backpropagation was significantly better than the method that only evolves weights. In the case of the classification of microcalcification clusters, we observed that the performance of evolutionary methods is significantly better than the performance of the simple backpropagation method, except in one case. Again, the method that evolves initial weights, complemented with backpropagation, is the one that gives the best results.
Acknowledgments. This research was supported by the Instituto Tecnológico y de Estudios Superiores de Monterrey (ITESM) under the Research Chair CAT-010 and the National Council of Science and Technology of Mexico (CONACYT) under grant 41515.
References

1. Thurfjell, E. L., Lernevall, K. A., Taube, A. A. S.: Benefit of independent double reading in a population-based mammography screening program. Radiology, 191 (1994) 241-244.
2. Yao, X.: Evolving artificial neural networks. In Proceedings of the IEEE, 87(9) (1999) 1423-1447.
3. Balakrishnan, K., Honavar, V.: Evolutionary design of neural architectures. A preliminary taxonomy and guide to literature. Technical Report CS TR 95-01, Department of Computer Sciences, Iowa State University (1995).
4. Suckling, J., Parker, J., Dance, D., Astley, S., Hutt, I., Boggis, C., Ricketts, I., Stamatakis, E., Cerneaz, N., Kok, S., Taylor, P., Betal, D., Savage, J.: The Mammographic Images Analysis Society digital mammogram database. Excerpta Medica International Congress Series, 1069 (1994) 375-378. http://www.wiau.man.ac.uk/services/MIAS/MIASweb.html
5. Gulsrud, T. O.: Analysis of mammographic microcalcifications using a computationally efficient filter bank. Technical Report, Department of Electrical and Computer Engineering, Stavanger University College (2001).
6. Hong, B.-W., Brady, M.: Segmentation of mammograms in topographic approach. In IEE International Conference on Visual Information Engineering, Guildford, UK (2003).
7. Li, S., Hara, T., Hatanaka, Y., Fujita, H., Endo, T., Iwase, T.: Performance evaluation of a CAD system for detecting masses on mammograms by using the MIAS database. Medical Imaging and Information Science, 18(3) (2001) 144-153.
8. Hu, M.-K.: Visual pattern recognition by moment invariants. IRE Trans. Information Theory, Vol. IT-8 (1962) 179-187.
9. Kozlov, A., Koller, D.: Nonuniform dynamic discretization in hybrid networks. In Proceedings of the 13th Annual Conference on Uncertainty in AI (UAI), Providence, Rhode Island, USA (2003) 314-325.
10. Gupta, L., Srinath, M. D.: Contour sequence moments for the classification of closed planar shapes. Pattern Recognition, 20(3) (1987) 267-272.
11. Cantú-Paz, E., Kamath, C.: Evolving neural networks for the classification of galaxies. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2002, San Francisco, CA, USA (2002) 1019-1026.
12. Skinner, A., Broughton, J. Q.: Neural networks in computational materials science: training algorithms. Modelling and Simulation in Materials Science and Engineering, 3 (1995) 371-390.
13. Oporto-Díaz, S., Hernández-Cisneros, R. R., Terashima-Marín, H.: Detection of microcalcification clusters in mammograms using a difference of optimized Gaussian filters. In Proceedings of the International Conference on Image Analysis and Recognition (ICIAR 2005), Toronto, Canada (2005) 998-1005.
Heart Cavity Detection in Ultrasound Images with SOM

Mary Carmen Jarur¹ and Marco Mora¹,²

¹ Department of Computer Science, Catholic University of Maule, Casilla 617, Talca, Chile, [email protected]
² IRIT-ENSEEIHT, 2 Rue Camichel, 31200 Toulouse, France, [email protected]
Abstract. Ultrasound images are characterized by a high level of speckle noise, causing undefined contours and difficulties during the segmentation process. This paper presents a novel method to detect heart cavities in ultrasound images. The method is based on a Self-Organizing Map and the use of the variance of images. A successful application of our approach to detecting heart cavities in real images is presented.¹
1 Introduction
Ultrasound heart images are characterized by a high level of speckle noise and low contrast, causing erroneous detection of cavities. Speckle is a multiplicative, locally correlated noise. Speckle-reducing filters originate mainly from the synthetic aperture radar community. The most widely used filters in this category, such as the filters of Lee [7], Frost [2], Kuan [6], and Gamma Map [10], are based on the coefficient of variation (CV). Currently, the detection of heart cavities involves two steps. The first corresponds to the filtering of the noise using anisotropic diffusion and the CV [15,12]. The second is the detection of the contours based on active contours [13,8]. In spite of the good results of this approach, both stages have a high complexity. For image segmentation, several schemes based on neural networks have been proposed. Supervised methods are reported in [9], and unsupervised segmentation techniques using the Self-Organizing Map (SOM) have been presented in [4,1,11]. In order to reduce the computational cost of the current techniques, this paper presents the first results of a novel approach to detect heart cavities using neural networks. Our method combines elements of filtering in images affected by speckle, such as the variance, with the advantages shown by SOM in image segmentation.
¹ The authors of this paper acknowledge the valuable contributions of Dr. Clovis Tauber and Dr. Hadj Batatia from the Polytechnical National Institute of Toulouse, France, during the development of this research.
A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1211–1219, 2006. © Springer-Verlag Berlin Heidelberg 2006
Matlab, the Image Processing Toolbox and the Neural Networks Toolbox were used as the platform to carry out most of the data processing work. The paper is organized as follows. The next section reviews the self-organizing map. Section 3 details our approach to heart cavity detection. The experimental results are shown in section 4. The paper is concluded in section 5.
2 The Self-Organizing Map
The Self-Organizing Map (SOM), proposed by Kohonen and also called the Kohonen network, has had a great deal of success in many applications such as dimensionality reduction, data processing and analysis, process monitoring, vector quantization, modeling of density functions, cluster analysis and, most relevantly for our work, image segmentation [5]. SOM neural networks can learn to detect regularities and correlations in their input and adapt their future responses to that input accordingly. The SOM network typically has two layers of nodes and performs a nonlinear projection of a multidimensional input space onto a discrete output space represented by a surface of neurons. The training process is composed of the following steps:

1. Initialize the weights randomly.
2. Feed input data to the network through the processing elements (nodes) in the input layer.
3. Calculate the similarity between the input data and the neurons' weights.
4. Determine the winning neuron, that is, the node with the minimum distance with respect to the input data.
5. Update the weights of the winning neuron and its neighborhood, adjusting them to be closer to the value of the input pattern.
6. If the maximum number of iterations has been reached, stop the learning process; otherwise, return to step 2.

For each input x_i, the Euclidean distance between the input data and the weights w_{i,j} of each neuron in the one-dimensional grid is computed (step 3) by:

d_j = Σ_{i=0}^{n−1} [x_i(t) − w_{i,j}(t)]²   (1)
The neuron having the least distance is designated the winner neuron (step 4). Finally, the weights of the winner neuron are updated (step 5) using the following expression:

w_{i,j}(t + 1) = w_{i,j}(t) + h_{ci} · [x_i(t) − w_{i,j}(t)]   (2)
The term h_{ci} refers to a neighborhood set N_c of array points around the winner node c, where h_{ci} = α(t) if i ∈ N_c and h_{ci} = 0 if i ∉ N_c, and α(t) is a monotonically decreasing function of time (0 < α(t) < 1). The number of iterations must be large enough to meet the statistical requirements, as well as proportional to the number of neurons; 500 iterations per map neuron are suggested [5].
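The training procedure of equations (1) and (2) can be sketched for one-dimensional inputs. This is a deliberate simplification: only the winner is updated (neighborhood set N_c = {c}) and the learning rate decays linearly, both assumptions not fixed by the text:

```python
def train_som_1d(data, weights, epochs=200, alpha0=0.8):
    """Winner-take-all SOM sketch for scalar inputs: the winner is the
    node with minimum squared distance (Eq. (1)), and only its weight
    is moved toward the input (Eq. (2) with N_c reduced to {c})."""
    w = list(weights)
    for t in range(epochs):
        alpha = alpha0 * (1.0 - t / epochs)   # monotonically decreasing
        for x in data:
            c = min(range(len(w)), key=lambda j: (x - w[j]) ** 2)  # winner
            w[c] += alpha * (x - w[c])        # pull winner toward input
    return w

# Three 1-D "zones" (cf. exterior / border / interior): one node per zone
data = [0.0, 0.1, 0.5, 0.6, 1.0, 0.9]
w = train_som_1d(data, weights=[0.2, 0.5, 0.8])
print([round(v, 2) for v in w])   # nodes settle near the three zones
```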
3 Proposed Approach to Heart Cavity Detection Using SOM
Detecting the contours of the heart cavities in ultrasound images is a complex task. The boundaries of the cavities are not well defined due to speckle and the low contrast of the images. In order to detect the cavities, our approach supposes the existence of three classes of zones in the image: the exterior of the cavity, the interior of the cavity, and the border between both zones. This supposition simplifies the problem, and it can be visually verified in the figures presented in this paper. Our proposal to detect heart cavities has three stages. The first calculates the variance of the pixels of the image. The second is the training of a SOM neural network using the variance-based image. The third is the classification of the image with the weights obtained in the training stage. The final stage detects the three types of zones previously mentioned.
3.1 Variance Processing
The image is characterized by the intensity I_{i,j} at position (i, j). The first step in processing the image is to calculate the variance image I_var(I). Each component of I_var(I) contains the variance of the corresponding pixel over a neighborhood of 3 by 3 pixels, as shown in figure 1. The local variance is calculated as:

I_var(I_{i,j}) = var(Nh_{i,j})   (3)

where Nh_{i,j} is the set of pixels that forms the 3-by-3 region centered at (i, j):

Nh_{i,j} = {I_{i−1,j−1}, I_{i−1,j}, ..., I_{i+1,j}, I_{i+1,j+1}}   (4)

Fig. 1. The neighborhood of pixel (i, j)

For pixels on the image limits, the neighborhood is composed only of the existing pixels. The local variance is mapped into the range [0, 1] by the expression:

I*_var(I_{i,j}) = 1 / (1 + I_var(I_{i,j}))   (5)

The I*_var(I) matrix is the output of this stage and the input of the SOM.
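Equations (3)-(5) can be sketched directly; border pixels use only the neighbors that exist, as stated above:

```python
def variance_image(img):
    """Per-pixel local variance over a 3x3 neighborhood (Eq. (3)),
    mapped into [0, 1] with Eq. (5)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            nh = [img[a][b]
                  for a in range(max(0, i - 1), min(h, i + 2))
                  for b in range(max(0, j - 1), min(w, j + 2))]
            mean = sum(nh) / len(nh)
            var = sum((v - mean) ** 2 for v in nh) / len(nh)
            out[i][j] = 1.0 / (1.0 + var)   # Eq. (5)
    return out

flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
print(variance_image(flat)[1][1])   # uniform region: variance 0, output 1.0
```

Note the mapping of equation (5) inverts the scale: smooth regions (low variance) map near 1, while noisy or edge regions map toward 0.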
3.2 Training of SOM and Classification of Images
The elements of the I*_var(I) matrix are the input data for the training of the SOM. The network has nine inputs, which correspond to a 3-by-3 neighborhood of each pixel of I*_var(I). The inputs are fully connected to the neurons of a one-dimensional (1-D) array of output nodes. The network has three outputs, classifying the image into exterior, interior and edge. Figure 2 shows the network structure.
Fig. 2. Structure of SOM used for cavity detection
The learning of the network was done using all the pixels in the image. After training, the network is able to characterize the initial image into three zones. The learning algorithm considered the following characteristics: a one-dimensional topology of 3 nodes, the Euclidean distance function (1), and a learning rate α = 0.8. The classification is made with the weights obtained from the training process. In order to classify the image, neighborhoods of 3 by 3 pixels are considered. Three images are obtained from the classification, each one representing the exterior, edge and interior classes, respectively. In the following section we present the results.
4 Results
In this section we present two types of results: cavity detection in individual images and in sequences of images. For the first, a SOM is trained with the image to be classified. For the second, we use a sequence of images from a heart movement video: a single image of the sequence is used to train the network, and the weights obtained in this training are used to segment all the images of the video.
4.1 Cavity Detection in Individual Images
The first result corresponds to an ultrasound intra-cavity image. Figure 3(a) shows the original image I and figure 3(b) presents I*_var(I), the variance-based image.
Fig. 3. Heart intra-cavity image and its variance-based image

For the second phase, I*_var(I) is presented to the SOM for training. Figure 4 shows the output of the SOM after training: figure 4(a) represents the interior class, figure 4(b) the edge class, and figure 4(c) the exterior class. We use the interior-class image in order to find the edges of the cavities. Figure 4(d) shows the edge detection using the gradient operator with a Sobel mask [3]. To improve the results, traditional image processing techniques such as erosion, smoothing by median filter, and dilation are applied [3]. The erosion, which eliminates the small bodies of the image, is shown in figure 5(a); the smoothing by median filter is shown in figure 5(b); and the dilation, which recovers the size of the objects, is shown in figure 5(c). The final edge is visualized in figure 5(d). Another result of our approach is shown in figure 6. Figure 6(a) shows an ultrasound image with two cavities, to which the process previously described is applied. Figure 6(b) shows the interior class given by the SOM, figure 6(c) shows the edges of the interior class obtained using the Sobel method, and figure 6(d) shows the edges of the interior class after the improvements. The results of cavity detection show that the network is able to identify the cavities efficiently.
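The morphological post-processing (erosion to remove small bodies, median smoothing, dilation to recover object size) can be sketched on binary images with a 3-by-3 structuring element:

```python
def _neighborhood(img, i, j):
    """3x3 neighborhood values, clipped at the image borders."""
    h, w = len(img), len(img[0])
    return [img[a][b]
            for a in range(max(0, i - 1), min(h, i + 2))
            for b in range(max(0, j - 1), min(w, j + 2))]

def erode(img):
    """Binary erosion: a pixel survives only if its whole
    neighborhood is set (removes small bodies)."""
    return [[1 if all(_neighborhood(img, i, j)) else 0
             for j in range(len(img[0]))] for i in range(len(img))]

def dilate(img):
    """Binary dilation: a pixel is set if any neighbor is set
    (recovers the size of the remaining objects)."""
    return [[1 if any(_neighborhood(img, i, j)) else 0
             for j in range(len(img[0]))] for i in range(len(img))]

def median3(img):
    """3x3 median smoothing."""
    return [[sorted(_neighborhood(img, i, j))[len(_neighborhood(img, i, j)) // 2]
             for j in range(len(img[0]))] for i in range(len(img))]

blob = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
eroded = erode(blob)
print(sum(map(sum, eroded)))   # only the blob's center survives erosion
restored = dilate(eroded)      # dilation recovers the 3x3 blob
```

Applied to the interior-class image, an erode-smooth-dilate pipeline removes isolated misclassified pixels while preserving the cavity's extent, after which the Sobel gradient yields the final edge.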
4.2 Cavity Detection in Sequences of Images
The images shown in this section correspond to a sequence that represents the movement of the heart. The training process is carried out with a single image of the sequence, and with the weights of this training all the images of the sequence are classified. Figures 7(a-d) correspond to the original images of the heart movement sequence, figures 7(e-h) are the variance-based images, figures 7(i-l) show the interior class given by the SOM, figures 7(m-p) correspond to the improvements of the previous images, and finally figures 7(q-t) show the final contour on the original images. The results show that, with the parameters obtained in the training using a single image, the neural network suitably classifies the rest of the images of the sequence. The generalization capability of neural networks increases the degree of autonomy of our approach by classifying patterns that do not belong to the training set.
Fig. 4. Decomposition of the variance image into three classes by SOM: (a) interior class, (b) edge class, (c) exterior class, (d) edges of the interior class
Fig. 5. Improvements in the cavity detection of the interior class image: (a) erosion, (b) smoothing, (c) dilation, (d) final edge
Fig. 6. Cavity detection with SOM for an image with two cavities: (a) original image, (b) interior class, (c) edges of the interior class, (d) edges after improvements
5 Conclusion
This paper has presented a novel approach for heart cavity detection in ultrasound images based on SOM. Our approach presents several advantages. The computational cost is low compared with techniques based on anisotropic diffusion and active contours. It is important to observe that the results in many cases show completely closed cavity edges, despite the high level of speckle contamination and the low contrast of the ultrasound images. Moreover, due to the unsupervised learning of SOM, our solution presents more autonomy than solutions based on supervised neural networks. Finally, due to the generalization capacity of the neural network, we can suitably classify an image sequence of similar characteristics by training the network with a single image of the sequence.
Fig. 7. Sequence of four images: (a-d) original images, (e-h) variance-based images, (i-l) interior class, (m-p) improved images, (q-t) final contours on the original images
The use of SOM provides a good tradeoff between segmentation quality and computational cost. Our work shows that SOM is a promising tool for heart cavity detection in ultrasound images.
References

1. Dong G. and Xie M., "Color Clustering and Learning for Image Segmentation Based on Neural Networks", IEEE Transactions on Neural Networks, vol. 16, no. 4, pp. 925-936, 2005.
2. Frost V., Stiles J., Shanmugan K. and Holtzman J., "A model for radar images and its application to adaptive digital filtering of multiplicative noise", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-4, pp. 157-166, 1982.
3. Gonzalez R., Woods R., "Digital Image Processing", Addison-Wesley, 1992.
4. Jiang Y., Chen K. and Zhou Z., "SOM Based Image Segmentation", Lecture Notes in Artificial Intelligence no. 2639, pp. 640-643, 2003.
5. Kohonen T., "Self-Organizing Maps", Berlin, Germany: Springer-Verlag, 2001.
6. Kuan D., Sawchuk A., Strand T. and Chavel P., "Adaptive restoration of images with speckle", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 35, pp. 373-383, 1987.
7. Lee J., "Digital image enhancement and noise filtering by using local statistics", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-2, pp. 165-168, 1980.
8. Levienaise-Obadia B. and Gee A., "Adaptive Segmentation of Ultrasound Images", in Electronic Proceedings of the Eighth British Machine Vision Conference (BMVC), 1997.
9. Littmann E., Ritter H., "Adaptive Color Segmentation: A Comparison of Neural Networks and Statistical Methods", IEEE Transactions on Neural Networks, vol. 8, no. 1, pp. 175-185, 1997.
10. Lopes A., Touzi R. and Nezry E., "Adaptive speckle filters and scene heterogeneity", IEEE Transactions on Geoscience and Remote Sensing, vol. 28, no. 6, pp. 992-1000, 1990.
11. Moreira J. and Costa L., "Neural-based color image segmentation and classification using self-organizing maps", in Proceedings of the IX Brazilian Symposium on Computer Vision and Image Processing (SIBGRAPI'96), pp. 47-54, 1996.
12. Tauber C., Batatia H. and Ayache A., "A Robust Speckle Reducing Anisotropic Diffusion", IEEE International Conference on Image Processing (ICIP), pp. 247-250, 2004.
13. Tauber C., Batatia H., Morin G. and Ayache A., "Robust B-Spline Snakes for Ultrasound Image Segmentation", in Proceedings of IEEE Computers in Cardiology, 2004.
14. Yu Y. and Acton S., "Edge detection in ultrasound imagery using the instantaneous coefficient of variation", IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1640-1655, 2004.
15. Yu Y. and Acton S., "Speckle Reducing Anisotropic Diffusion", IEEE Transactions on Image Processing, vol. 11, no. 11, 2002.
An Effective Method of Gait Stability Analysis Using Inertial Sensors

Sung Kyung Hong, Jinhyung Bae, Sug-Chon Lee, Jung-Yup Kim, and Kwon-Yong Lee

School of Mechanical and Aerospace Engineering, Sejong University, Seoul, 143-747, Korea
[email protected]
Abstract. This study develops an effective measurement instrument and analysis method for gait stability, focused in particular on the motion of the lower spine and pelvis during gait. Silicon micromechanical inertial instruments were developed, and body-attitude (pitch and roll) angles were estimated via closed-loop strapdown estimation filters, which improves the accuracy of the estimated attitude. It is also shown that spectral analysis utilizing the Fast Fourier Transform (FFT) provides an efficient analysis method, yielding quantitative diagnoses of gait stability. The results of experiments on various subjects suggest that the proposed system provides a simplified but efficient tool for evaluating both gait stability and the effects of rehabilitation treatments.
1 Introduction

Abnormal walking due to accident or disease limits physical activity. Abnormal gait has been assessed by gait analysis techniques in terms of time, distance, kinematic motion, joint loads, muscle forces, etc. [1]. Gait analysis provides useful information in several clinical applications, such as functional assessment after hip and knee arthroplasty, rehabilitation treatment using prostheses or assistive devices, and evaluation of the risk of falls in elderly persons suffering from arthritis. However, systematic gait analysis requires specially designed facilities, such as CCD cameras, a force plate, electromyography, and a data-handling station, as well as laboratory space with a specific pathway and well-trained technicians. In addition, analyzing the data takes a long time, and data from only a few steps, rather than long periods of walking, represent the gait performance. Because of these requirements, gait analysis has not been widely used outside of research within a laboratory.

Recent advances in the size and performance of micromechanical inertial instruments enable the development of a device that provides information about body motion to individuals who have gait problems. The usefulness of micromechanical inertial sensors has been shown in several applications [2-6]. However, in most of these studies, only the efficiency of the inertial sensors for clinical use has been

A. Gelbukh and C.A. Reyes-Garcia (Eds.): MICAI 2006, LNAI 4293, pp. 1220-1228, 2006. © Springer-Verlag Berlin Heidelberg 2006
emphasized, while the defects of inertial sensors (such as integration drift in the estimated orientation due to nondeterministic errors [7]) and methods of effective data analysis [8] have been overlooked.

In this research, a low-cost, accurate micromechanical inertial instrument and a simplified but effective method for analyzing gait stability are provided. Focusing our interest on lower-spine and pelvis motion during gait, the proposed instrument is placed on the spur of the S1 spine, as described in Ref. [6]. As shown in Figure 1, the proposed system consists of two parts: the micromechanical inertial instrument, incorporating inertial sensors and closed-loop strapdown attitude estimation filters, and the digital analysis unit, incorporating data acquisition and signal analysis functions in both the time and frequency domains, which provides quantitative diagnoses of gait stability. We performed experiments on 30 rheumatism patients, 3 patients with prosthetic limbs, and 10 healthy subjects to show the practical aspects of the proposed system. The results suggest that the proposed system can be a simplified and efficient tool for evaluating both gait stability and the effects of rehabilitation treatments.
Fig. 1. Proposed system configurations
2 Methods

2.1 Micromechanical Inertial Instrument

The micromechanical inertial instrument includes an inertial sensor unit (ADXRS300, KXM52), a microprocessor (ATmega8), an RF transmitter (Bluetooth), and a battery (Ni-MH) power module. The micromechanical sensors, described in Ref. [9], consist of a so-called "chip" or "die" constructed from treated silicon. The sensing element is a proof mass (inertial element) that is deflected from its equilibrium position by an angular velocity in the case of a gyroscope (gyro), and by a linear acceleration in the case of an accelerometer.
Requirements for motion sensor performance can be estimated from the performance needed to control postural stability. Further requirements involve sensor size (small enough for body mounting) and power consumption (small enough to be battery powered for at least 12 hours). The single and double inverted pendulum models of human standing yield estimated natural frequencies of 0.4 Hz and 0.5 Hz, respectively, while the natural frequency during running is about 5 Hz [10]. Thus, the required bandwidth for motion sensors providing estimates of body attitude would be approximately 10 Hz, twice the natural frequency during running. From human psychophysical experiments, the detection thresholds for linear acceleration and angular rate [11] are 0.05 g and 1 deg/s, respectively. The performance of the inertial sensors selected for this study is summarized in Table 1. To achieve better performance, temperature-dependent characteristics of the inertial sensors, such as scale factor (sensitivity) and noise (bias), should be compensated as described in Ref. [12]. The performance of the sensors after software temperature compensation is shown in brackets in Table 1.

The micromechanical inertial instruments are housed in a 58 x 58 x 30 mm3 package (Figure 2) fastened to the spur of the S1 spine, the segment of our interest. The sensitivity axes of the inertial sensors are aligned with the subject's frontal (φ, roll) and sagittal (θ, pitch) axes, respectively. When the switch is turned on, the attitude angles (φ and θ) from the estimation filter are transmitted to a PC at 50 Hz. A person can walk anywhere naturally, because the RF transmitter eliminates long cords from the inertial instrument to the data acquisition/signal analysis computer.
Fig. 2. Prototype of Micromechanical Inertial Instrument

Table 1. Inertial sensor specifications

Parameter           Gyroscope (ADXRS300)       Accelerometer (KXM52)
Resolution          0.1 deg/sec                1 mg
Bandwidth           40 Hz                      100 Hz
Bias Drift          50 deg/hr (20 deg/hr)      2 mg
Scale Factor Error  0.1% of FS (0.01% of FS)   0.1% of FS
2.2 Closed-Loop Strapdown Attitude Estimation Filter

Both the gyro and accelerometer signals contain information about the orientation of the sensor. The sensor orientation can be obtained by integrating the rate signals (p, q, and r) obtained from the gyros (Eq. 1). The rate signal from a gyro contains undesirable low-frequency drift or noise, which causes an error in the attitude estimate when it is integrated. To observe and compensate for gyro drift, a process called augmentation is used, whereby other system states are utilized to compensate gyro errors. The approach we used is the so-called accelerometer-aided mixing algorithm of SARS [7]. This scheme derives from the knowledge that accelerometers (f) measure not only linear acceleration but also the gravitational vector (g). This knowledge can be used to make an estimate of the attitude (Eq. 2) that is not very precise but does not suffer from integration drift. A 0.03 Hz lowpass filter, which rolls off as 1/frequency squared, is used to remove the part of the accelerometer signal due to linear and angular acceleration while keeping the part due to the gravity vector.

\begin{bmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} =
\begin{bmatrix}
1 & \sin\phi \tan\theta & \cos\phi \tan\theta \\
0 & \cos\phi & -\sin\phi \\
0 & \sin\phi / \cos\theta & \cos\phi / \cos\theta
\end{bmatrix}
\begin{bmatrix} p \\ q \\ r \end{bmatrix}    (1)

\theta = \sin^{-1}\left(\frac{f_x}{g}\right), \quad
\phi = \sin^{-1}\left(\frac{-f_y}{g\cos\theta}\right)    (2)
This scheme combines a set of 3-axis rate gyros (p, q, and r), which provide the required attitude information through integration, with a 2-axis accelerometer (fx and fy). The main idea is that properly combining (PI filtering) the gyro and accelerometer measurements makes precise attitude information available. The block diagram of the processing scheme used for the prototype device is shown in Figure 3. This combination of rate gyros with accelerometers gives considerably improved accuracy of the estimated attitude. Measurements showed that the estimated attitude angle has an rms error of about 1 deg over a 0- to 10-Hz bandwidth, without limits on operating time.

2.3 Spectral Analysis Utilizing the FFT

It is usually difficult and ambiguous to analyze the characteristics of motion and stability directly from raw signals measured in the time domain, because of their complex fluctuations and zero-mean random noise. We therefore extract quantitative features for identifying gait motion. The technique identifies the frequency components, and their magnitudes, of a signal buried in a noisy time-domain
signal. The discrete Fourier transform of the signal is computed with the Fast Fourier Transform (FFT) algorithm [13]. The Fourier transform of a time-domain signal x(t) is denoted X(ω) and is defined by

X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt    (3)

which transforms the signal x(t) from a function of time into a function of frequency ω. The power spectrum, a measure of the power at various frequencies, is also calculated. This power spectrum may be used as important data for the medical diagnosis of gait stability and the degree of gait training. These algorithms are processed in Matlab/Simulink on the digital analysis computer.
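The power-spectrum computation described above is straightforward to reproduce. The following sketch uses Python/NumPy rather than the authors' Matlab/Simulink implementation, with an illustrative test signal (a 1 Hz gait-like oscillation in noise); the signal parameters and function names are assumptions for demonstration only.

```python
import numpy as np

def power_spectrum(x, fs):
    # Discrete counterpart of Eq. 3 via the FFT; one-sided spectrum.
    n = len(x)
    X = np.fft.rfft(x)
    power = (np.abs(X) ** 2) / n           # power at each frequency bin
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, power

# Illustrative example: a 1 Hz gait-like oscillation sampled at 50 Hz
# (the instrument's transmission rate), buried in zero-mean noise.
fs = 50.0
t = np.arange(0, 20, 1 / fs)
x = 3.0 * np.sin(2 * np.pi * 1.0 * t) + np.random.default_rng(0).normal(0, 0.5, t.size)

freqs, power = power_spectrum(x, fs)
dominant = freqs[np.argmax(power[1:]) + 1]  # skip the DC bin
```

Despite the noise, the peak of the power spectrum recovers the 1 Hz component, which is exactly the kind of dominant-frequency feature tabulated later in Table 2.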
Fig. 3. Block diagram of attitude estimation procedure
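The closed-loop scheme of the block diagram can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the gains kp and ki, the 0.02 s sample period, and the function names are illustrative assumptions, and the yaw channel and the 0.03 Hz accelerometer prefilter are omitted.

```python
import math

def euler_rates(phi, theta, p, q, r):
    # Eq. 1, roll and pitch rows: body rates -> Euler angle rates.
    phi_dot = p + q * math.sin(phi) * math.tan(theta) + r * math.cos(phi) * math.tan(theta)
    theta_dot = q * math.cos(phi) - r * math.sin(phi)
    return phi_dot, theta_dot

def accel_attitude(fx, fy, theta, g=9.81):
    # Eq. 2: drift-free but noisy attitude from the gravity components.
    theta_acc = math.asin(max(-1.0, min(1.0, fx / g)))
    phi_acc = math.asin(max(-1.0, min(1.0, -fy / (g * math.cos(theta)))))
    return phi_acc, theta_acc

def pi_mixing_step(phi, theta, integ, gyro, accel, dt, kp=1.0, ki=0.1):
    # One update: integrate the gyro rates, then pull the estimate toward
    # the accelerometer-derived attitude through a PI correction term.
    p, q, r = gyro
    fx, fy = accel
    phi_dot, theta_dot = euler_rates(phi, theta, p, q, r)
    phi_acc, theta_acc = accel_attitude(fx, fy, theta)
    e_phi, e_theta = phi_acc - phi, theta_acc - theta
    i_phi = integ[0] + ki * e_phi * dt      # integral term absorbs gyro bias
    i_theta = integ[1] + ki * e_theta * dt
    phi = phi + (phi_dot + kp * e_phi + i_phi) * dt
    theta = theta + (theta_dot + kp * e_theta + i_theta) * dt
    return phi, theta, (i_phi, i_theta)
```

With zero body rates and a level accelerometer reading, an initially wrong roll estimate decays toward zero instead of drifting, which is the point of the augmentation.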
3 Experimental Results

In this study, 30 patients with rheumatism, aged 50-70 years, and 3 patients with prosthetic limbs, aged 30-40 years, were selected as subjects. The micromechanical inertial instrument was fixed near the spur of the S1 spine to measure pelvis fluctuation during walking (Figure 1). Subjects repeatedly walked down a 10 m runway at a self-determined pace. The gait motions of 10 young healthy subjects, aged 20-30 years, were also measured for comparison with the patients'. From the time-domain data (Figure 4), the qualitative features of gait motion in both the roll and pitch axes show that healthy subjects exhibited small, rhythmic fluctuations, while patients with rheumatism or a prosthetic limb showed relatively large, less rhythmic fluctuations. From the frequency-domain data (Figure 5), the quantitative features in both axes show that healthy subjects exhibited a single dominant frequency (1-2 Hz) with small magnitude (<10 dB), while patients with rheumatism or a prosthetic limb showed multiple dominant frequencies (1-5 Hz and 1-3 Hz, respectively) with large magnitude (>200 dB). The experimental results for the roll axis are summarized in Table 2.
Table 2. Experimental results for roll axis

Subjects         Fluctuation (time)   Rhythmic (time)   Dominant Freq.    Magnitude
Healthy          Small (<2 deg)       Quite             1 Hz              <10 dB
Rheumatism       Med. (<7 deg)        Less              1, 3, 5 Hz        <500 dB
Prosthetic limb  Big (<12 deg)        Less              1, 1.8, 2.5 Hz    <15000 dB
Figure 6 shows that the quantitative frequency-magnitude features of each subject can be categorized clearly. It should also be noted that these features show repeatability and consistency within each category of subjects. This demonstrates the practical aspect of the proposed system, which provides quantitative and reliable diagnoses of gait stability. To evaluate rehabilitation treatment effects, the rheumatism subjects were retested after 2 months of physical fitness training. Figure 7 shows the effectiveness of the training: the magnitude of the first dominant frequency is reduced (moving from the solid-line category to the dashed-line category). Hence, the proposed system can be used as an objective and quantitative tool for evaluating rehabilitation treatment effects.
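A categorization rule of this kind is easy to automate once a power spectrum is available. The sketch below is an illustration only: the gait band limits, the 30% relative floor for counting a peak, and the magnitude threshold are placeholders patterned on Table 2, not the paper's calibrated values.

```python
import numpy as np

def dominant_peaks(freqs, power, band=(0.5, 6.0), rel_floor=0.3):
    # Local maxima inside the gait band that reach at least rel_floor
    # of the strongest in-band peak.
    mask = (freqs >= band[0]) & (freqs <= band[1])
    f, p = freqs[mask], power[mask]
    peak = p.max()
    local = (p[1:-1] > p[:-2]) & (p[1:-1] > p[2:]) & (p[1:-1] >= rel_floor * peak)
    return f[1:-1][local], peak

def classify_gait(freqs, power, magnitude_thresh=10.0):
    # Table 2 pattern: one small dominant peak suggests a stable gait;
    # several strong peaks suggest an unstable one.
    peaks, peak_power = dominant_peaks(freqs, power)
    if len(peaks) == 1 and peak_power < magnitude_thresh:
        return "stable"
    return "unstable"
```

Applied to the spectra of Figure 5, such a rule would separate the single-peak healthy pattern from the multi-peak patient patterns automatically.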
Fig. 4. Gait motion and stability in time domain
Fig. 5. Gait motion and stability in frequency domain
Fig. 6. Categorization of features of gait
Fig. 7. Evaluation of rehabilitation treatments
4 Conclusion

A new, low-cost, accurate micromechanical inertial instrument and a simplified but effective method for analyzing gait stability have been proposed. By utilizing low-cost inertial sensors and a closed-loop strapdown attitude estimation filter, we can accurately measure the attitude fluctuations that characterize gait motion. Also, it is
shown that frequency analysis utilizing the Fast Fourier Transform (FFT) provides quantitative diagnoses of gait stability. We performed experiments on 30 rheumatism patients, 3 patients with prosthetic limbs, and 10 healthy subjects to show the practical aspects of the proposed system. The results suggest that the proposed system can be a simplified and efficient tool for evaluating both gait stability and rehabilitation treatment effects. Another benefit of the proposed system is that it is cordless, small, and light; thus it has various clinical applications, obtained by placing it on the segments of interest.
Acknowledgment

The authors gratefully acknowledge the help of Dr. H.-Y. Lee, Department of Nursing, Seoul National University, for assistance in subject testing and preliminary data processing. We also thank S-J Yoon, CEO of Motionspace Inc., for support in hardware design and manufacturing.
References

1. Tsuruoka, M., Shibasaki, R., Murai, S., Wada, T.: Bio-Feedback Control Analysis of Postural Stability using CCD Video Cameras and a Force-Plate Sensor Synchronized System. IEEE International Conference on Systems, Man, and Cybernetics (1998) 3200-3205
2. Conard, W., Marc, S., et al.: Balance Prosthesis Based on Micromechanical Sensors Using Vibrotactile Feedback of Tilt. IEEE Trans. on Biomedical Eng. Vol. 48, No. 10 (2001) 1153-1161
3. Griffin, B., Huber, B., Wallner, F., Fink, T.: A 'Sense of Balance' AHRS with Low-Cost Vibrating Gyroscopes for Medical Diagnostics. Symposium Gyro Technology, Stuttgart (1997)
4. Lee, C.-Y., Lee, J.-J.: Estimation of Walking Behavior Using Accelerometers in Gait Rehabilitation. International Journal of Human-Friendly Welfare Robotic Systems Vol. 3, No. 2 (2002) 32-36
5. Ochi, F., Abe, K., Ishigami, S., Otsu, K., Tomita, H.: Trunk Motion Analysis in Walking using Gyro Sensors. Proceedings 19th International Conference IEEE/EMBS, Chicago, IL, USA (1997)
6. Stanford, C.F., Francis, P.R., Chambers, H.G.: The Effects of Backpack Loads on Pelvis and Upper Body Kinematics of the Adolescent Female During Gait. Proceedings 7th Annual Meeting, Gait and Clinical Movement Analysis Society (GCMAS) (2002)
7. Hong, S.K.: Fuzzy Logic based Closed-Loop Strapdown Attitude System for Unmanned Aerial Vehicle. Sensors and Actuators A-Physical (2002)
8. Kitagawa, M., Gersch, W.: Smoothness Priors Analysis of Time Series. Lecture Notes in Statistics 116 (1996)
9. Kourepenis, A., Borenstein, J., et al.: Performance of MEMS Inertial Sensors. AIAA GNC Conference (1998)
10. Jones, G.M., Milsum, J.H.: Spatial and Dynamic Aspects of Visual Fixation. IEEE Trans. Bio-Med. Eng. Vol. BME-12 (1966) 54-62
11. Benson, A.J.: Effect of Spaceflight on Thresholds of Perception of Angular and Linear Motion. Arch Otorhinolaryngol Vol. 244(3) (1987) 147-154
12. Hong, S.K.: Compensation of Nonlinear Thermal Bias Drift of Resonant Rate Sensor (RRS) using Fuzzy Logic. Sensors and Actuators Vol. 78 (1999) 143-148
13. Duhamel, P., Vetterli, M.: Fast Fourier Transforms: A Tutorial Review and a State of the Art. Signal Processing Vol. 19 (1990) 259-299
Author Index
Abraham, Ajith 283 Acevedo-Mosqueda, María Elena 357 Acosta-Mesa, Héctor-Gabriel 494 Acuña, Gonzalo 305 Aguilar de L., Santos 922 Altan, Zeynep 868 An, Kun 316 Arango Isaza, Fernando 27 Arredondo Vidal, Tomas 101 Arroyo-Figueroa, Gustavo 522 Atkinson, John 985 Bae, Jinhyung 1220 Baek, Sunkyoung 828 Barrientos-Martínez, Rocío-Erandi 494 Batyrshin, Ildar 165 Bellec, Jacques-Henry 674 Bello, Rafael 176 Benítez, J.M. 562 Benítez-Guerrero, Edgard 684 Benítez-Pérez, Héctor 134 Bica, Francine 248 Bien, Zeungnam 745, 1190 Bolshakov, Igor A. 838 Brizuela, Carlos A. 404 Brna, Paul 208 Bustamante, Carlos 237 Calderón Martínez, José Antonio 146 Camarena-Ibarrola, Antonio 952 Cantú, Francisco J. 1116 Cao, Zining 1095 Cárdenas-Flores, Francisco 134 Cardeñosa, Jesús 932 Castro, Carlos 381 Castro, J.L. 562 Cervantes, Jair 572 Chae, Soo-Hoan 583 Chávez, Edgar 952 Chen, Mingang 505 Chen, Toly 483 Cheng, Chia-Ying 974 Chi, Su-Young 1067
Cho, Tae Ho 112 Choi, Byung-Jae 156 Choi, Mun-Kee 426 Choi, Yoon Ho 327 Coello Coello, Carlos 294 Crawford, Broderick 381 Cruz C., Irma Cristina 922 Cruz-Chávez, Marco Antonio 450 Cruz-Ramírez, Nicandro 494, 652 da Costa Bianchi, Reinaldo Augusto 704 De Baets, Bernard 176 de Souza Serapião, Adriane Beatriz 1037 Deng, Chao 641 Do, Jun-Hyeong 745 Dokur, Zümray 800 Dowe, David L. 593 Dudek, Gregory 715 Fan, Xianfeng 513 Fan, Xinghua 1017 Feng, Lei 612 Feng, Xiaoyi 726 Fernandes Martins, Murilo 704 Figueroa, Alejandro 985 Figueroa Nazuno, Jesús 1057 Filatov, Denis M. 165, 838 Flores, Juan J. 259 Flores-Badillo, Marina 1128 Flores-Pulido, Leticia 1075 Forcada, Mikel L. 844 Fraire H., Héctor J. 922 Frausto-Solís, Juan 450 Freund, Wolfgang 101 Gallardo, Carolina 932 Gao, Yingfan 1007 Garcia, Maria Matilde 176 García, Rodrigo 70 García-López, Daniel Alejandro 652 García-Nocetti, Fabian 134 Garcia-Perera, L. Paola 1085
Garrido, Leonardo 237 Garro, Beatriz A. 367 Garza Castañon, Luis Eduardo 810 Gelbukh, Alexander 27, 283, 855 Gervás, Pablo 70 Gonzalez, Alain César 820 González, Inés 472 González, Miguel A. 472 González B., Juan J. 922 González-Castolo, Juan Carlos 90 González Pérez, José Juan 1171 Grosan, Crina 283 Guo, Mao Zu 641 Gutiérrez, Everardo 404 Gutiérrez-Fragoso, Karina 652 Gutierrez-Tornes, Agustin 122 Guzman-Arenas, Adolfo 1 Han, Mi Young 532 Han, Weiguo 695 Hao, Jin-Kao 392 He, Fengling 963 He, Lianghua 734 He, Pilian 554 He, Yong 505, 612 Hernández-Cisneros, Rolando Rafael 1200 Hernández-López, Alma-Rosa 684 Hervás, Raquel 70 Hong, Sung Kyung 1220 Hou, Yuexian 554 Hsieh, Ji-Lung 974 Hu, Die 734 Hu, Jie 663 Huang, Chung-Yuan 974 Huang, Hong-Zhong 513 Huang, Lingxia 505 Huang, Min 505 Huang, Wei 338 Hübner, Alexandre 1105 Hwang, Chi-Jung 1179 Hwang, Gwangsu 1047 Hwang, Myunggwon 828, 1047 Ibargüengoytia, Pablo H. 218, 227 Iraola, Luis 932 Iscan, Zafer 800 Ishikawa, Tsutomu 49 Izquierdo-Beviá, Rubén 879
Jamett, Marcela 305 Jang, Hyoyoung 745 Jarur, Mary Carmen 1211 Jiang, Changjun 734 Jin, Peihua 505 Joo, Young Hoon 756 Judith, Espinoza 1150 Jung, Da Hun 461 Jung, Jin-Woo 745, 1190 Jung, Sung Hoon 745 Kats, Vladimir 439 Kechadi, M-Tahar 674 Khor, Kok-Chin 1027 Kim, Chul Woo 532 Kim, Dae-Hee 1179 Kim, Dong Seong 632 Kim, Jung-Yup 1220 Kim, Kap Hwan 461 Kim, Kyoung Joo 327 Kim, Min-Seok 1067 Kim, Pankoo 828, 1047 Kim, Soohyung 828 Kim, Sung-Ho 15 Kim, Sun-Jin 426 Kim, Wonpil 828 Kitano, Masaki 49 Kong, Hyunjang 828, 1047 Kozareva, Zornitsa 889, 900 Kwak, Keun-Chang 1067 Lai, Kin Keung 338 Leboeuf Pasquier, Jérôme 1171 Ledeneva, Yulia Nikolaevna 146 Leduc, Luis Adolfo 81 Lee, Byung Kwon 461 Lee, Chong Ho 272, 767 Lee, Hae Young 112 Lee, Hong-Ro 1179 Lee, Inbok 583 Lee, Kwon-Yong 1220 Lee, Sang Min 632 Lee, Sug-Chon 1220 Legrand, Steve 855 Levner, Eugene 439 Li, Xiaoou 572 Li, ZhanHuai 543 Li, Zhen 726 Liu, Yushu 1007
Author Index Lopez-Martin, Cuauhtemoc 122 L´ opez-Mellado, Ernesto 90, 1128 L´ opez-Y´ an ˜ ez, Itzam´ a 357 Lou, Chengfu 505 Luck, Michael 1116 Luna-Ram´ırez, Wulfrano Arturo 652 Lv, Baohua 726 Ma, Runbo 1007 Mantas, C.J. 562 Martinez-Barco, Patricio 911, 996 Mart´ınez-Mu˜ noz, Jorge 165 Mase, Shigeru 186, 197 Mei, JinFeng 943 Mej´ıa-Lavalle, Manuel 522 Mendez, Gerardo Maximiliano 81 Meng, Jiang 316 Mex-Perera, J. Carlos 622, 1085 Miao, Qiang 513 Mondrag´ on-Becerra, Rosibelda 652 Monfroy, Eric 381 Monroy, Ra´ ul 622, 789 Montes de Oca, Saul 810 Montes-Gonz´ alez, Fernando 1160 Montoyo, Andr´es 889, 900 Mora, Marco 1211 Morales, Eduardo F. 227 Morales, Rafael 208 Morales-Men´endez, Rub´en 810 Morell, Carlos 176 Moreno-Monteagudo, Lorenza 879 Mu˜ noz, C´esar 101 Murrieta-Cid, Rafael 789 Nava-Fern´ andez, Luis-Alonso Navarro, Borja 879 Navarro, Nicol´ as 101 Nguyen, Ha-Nam 532, 583 Nishita, Seikoh 49 Noh, Sun Young 756 Nolazco-Flores, Juan Arturo 622, 1085 Noriega, Pablo 1116
494
Ohn, Syng-Yup 532, 583 Ölmez, Tamer 800 Orhan, Zeynep 868 Oropeza Rodríguez, José Luis 1057 Ortiz-Hernández, Gustavo 652 Osorio, Maria 1150
Osorio, Mauricio 59 Ozkarahan, Irem 415
Padilla-Duarte, Mayra 1128 Palomar, Manuel 996 Park, Chan-Yong 1179 Park, Jin Bae 327, 756 Park, Jong Sou 632 Park, Soo-Jun 1179 Park, Sung-Hee 1179 Park, Young-Man 461 Park, Young-Mee 532 Pavešić, Nikola 38 Pazos Rangel, Rodolfo A. 922 P. Dimuro, Graçaliz 1105 Pelaquim Mendes, José Ricardo 1037 Peng, Tao 963 Peng, Yinghong 663 Pérez O., Joaquín 922 Pérez-Ortiz, Juan Antonio 844 Pérez y Pérez, Rafael 70 Piña-García, Carlos Adolfo 652 Pogrebnyak, Oleksiy 820 Polat, Övünç 348 Posadas, Román 622 Proskurowski, Andrzej 259 Puche, J.M. 562 Puşcaşu, Georgiana 911 Quiñones-Reyes, Pedro 134 Quirós, Fernando 101 Quixtiano-Xicohténcatl, Rocío 1075
Rabelo, Clarice 1037 Ramírez, Justino 778 Ramírez, Noel 294 Reyes, Alberto 218, 227 Reyes-Galaviz, Orion Fausto 1075 Reyes-García, Carlos Alberto 146, 1075 Ribarić, Slobodan 38 Ríos Figueroa, Homero 1160 Rivera, Mariano 778 Riveron, Edgardo Manuel Felipe 820, 1057 Rizzo Guilherme, Ivan 1037 Robles, Armando 1116 Rocha Costa, Antônio Carlos 1105 Rodriguez Vela, Camino 472 Rodriguez, Yanet 176
Rodriguez-Tello, Eduardo 392 Ryu, Kwang Ryel 461
Saarikoski, Harri M.T. 855 Sánchez, Abraham 1150 Sánchez-Martínez, Felipe 844 Sanchez-Torres, Brenda 1085 Santana-Quintero, Luis Vicente 294 Santos Reyes, José 1160 Saquete Boró, Estela 911 Selim, Hasan 415 Shaw, Shih-Lung 695 Sheremetov, Leonid 165 Shim, Jaehong 1047 Shin, Juhyun 828 Shin, Jung-Sub 1179 Sierra, María 472 Song, Dong Ho 583 Song, Jae-Hoon 1190 Sossa, Juan Humberto 367, 820 Soto, Rogelio 237 Suárez, Armando 879 Suárez Guerra, Sergio 1057 Sucar, Luis Enrique 227 Sug, Hyontai 604 Sun, Chuen-Tsai 974 Sung, Man-Kyu 1179 Taga, Nobuyuki 186, 197 Tan, Peter Jing 593 Terashima-Marín, Hugo 1200 Terol, Rafael M. 996 Ting, Choo-Yee 1027 Tonidandel, Flavio 704 Torres-Jimenez, Jose 392 Torres-Méndez, Luz Abril 715
Van Labeke, Nicolas 208 Varela, Ramiro 472 Vázquez, Roberto A. 367 Vázquez, Sonia 900 Verdin, Regina 248 V. Gonçalves, Lunciano 1105 Vicari, Rosa 248 Vu, Trung-Nghia 532 Wang, Jin 272, 767 Wang, Jinfeng 695 Wang, Shouyang 338 Weber, Jörg 1139 Wotawa, Franz 1139 Wu, Di 612 Xiao, ZhiJiao 943
Yan, GuangHui 543 Yáñez-Márquez, Cornelio 122, 357 Yang, Hongmin 554 Yazıcı, Gül 348 Yıldırım, Tülay 348 Yoo, Seog-Hwan 156 Yu, Ha-Jin 1067 Yu, Lean 338 Yu, Wen 572 Yuan, Liu 543
Zapata Jaramillo, Carlos Mario 27 Zepeda, Claudia 59 Zhang, Changli 963 Zhang, Jiling 726 Zuo, Wanli 963